On Sat, 2009-07-04 at 00:56 +0200, Dag Wieers wrote:
<snip>
> John,
>
> I agree for the website. But for the wiki, we need some fixes too and
> we're not going to replace the wiki by a 'Web App'.

OK, that wiki is a web application. I'm not really advocating replacing it.
It is dynamic and static in nature.

> So whatever we can do to improve the pages (as stated in those threads) we
> should do as soon as possible.
>
> I noticed a lot of other centos website are being created by people to
> fill in the void there is because our wiki simply sucks from a Google
> point of view:

I guess some people like to have their own sites and the freedom of not
being held down by agreeing to a license. Possibly some do not know they can
come to the wiki, write their content on it, and keep the page maintained
themselves, like a few others do. Perhaps some are a little selfish.

The wiki is not really advocated on the main centos.org site. Hopefully that
will change. There is a link to it on the main site, but it is bland: it says
"wiki" with no clue or indication of what the wiki contains.

Another problem is that 50% of the people searching do not know how to search
for something.

An extension to convert a page to a PDF, so the viewer could save it, would
be kind of nice.

<snip>
>
> The last ones are really to cry for because lots of these results do have
> a link to our wiki, but the wiki is simply missing from the first 100
> results...
>
> Even searching for:
>
> x200s site:wiki.centos.org
>
> results not in the page I was hoping for. As if the wiki (or at least
> big parts of it) is simply ignored by Google.

Strange. [1][2] Stranger still, Bing even pops up the wiki URL for "centos
wireless". Hit number two! What happened to Google? Bing has crawled the site
very recently.

> We have a robots.txt that has nothing really in it.
> From what I can find, an empty robots.txt should have the same effect as
> no robots.txt, but ours is not exactly empty in the true meaning of the
> word. So maybe we should empty it, or simply remove it altogether ?

Not quite. It also depends on the structure of the web page. For example,
index.html, or for that matter the very base page, can also carry follow or
nofollow in its robots meta tag, which tells the Google or Bing bots whether
to follow the links and crawl the site.

Screen scrape from the actual wiki base page, below. Since it says
index,follow, the page will be followed by any bot. This probably confuses
some people at first glance.

    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <meta name="robots" content="index,follow">

> PS The wiki does seem to be indexed, so I doubt it is the robots.txt

Yes, there is an XML indexing file, and I verified that back in the previous
discussion. I won't publicly throw out the URL to it: it could trigger
unforeseeable actions, mainly because the file is regenerated on every
request to it.

Something I would look into right now is the server logs, to see when Google
last crawled the wiki. I really think it has been a while since it has.

[1] <http://www.bing.com/search?q=site%3Awiki.centos.org+wireless&go=&form=QBRE&scope=web&qs=n>
[2] <http://www.google.com/webhp?hl=en&btnG=Google+Search#q=site%3Awiki.centos.org+wireless&hl=en&sa=2&fp=A3HD1AZvc28>

John
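
For anyone who wants to try the three checks discussed above (what a
robots.txt permits, what the robots meta tag says, and when a bot last hit
the server), here is a minimal sketch in Python using only the standard
library. It is an illustration under assumptions, not how the wiki is
actually set up: the function names are made up for this example, the
robots.txt text and the log lines in the demo are invented samples, and the
log parsing assumes the common Apache access-log timestamp format, which may
not match the real server.

```python
import re
from datetime import datetime
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def robots_allows(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse the text directly, no network
    return parser.can_fetch(agent, url)


class _MetaRobots(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def meta_robots(html):
    """Return the list of robots meta directives found in `html`."""
    parser = _MetaRobots()
    parser.feed(html)
    return parser.directives


# Assumption: timestamps look like [03/Jul/2009:10:15:32 +0200].
_TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def last_crawl(log_lines, agent="Googlebot"):
    """Return the latest timestamp among lines mentioning `agent`, or None."""
    hits = []
    for line in log_lines:
        match = _TIMESTAMP.search(line)
        if agent in line and match:
            hits.append(
                datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
            )
    return max(hits, default=None)


if __name__ == "__main__":
    page = "http://wiki.centos.org/HowTos"
    # An empty robots.txt versus one that blocks everything (sample text).
    print(robots_allows("", page))
    print(robots_allows("User-agent: *\nDisallow: /", page))
    # The meta tags quoted in this mail.
    print(meta_robots('<meta name="robots" content="index,follow">'))
    # Hypothetical log lines, invented for the demo.
    sample_log = [
        '192.0.2.1 - - [01/Jul/2009:08:00:00 +0200] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
        '192.0.2.2 - - [03/Jul/2009:10:15:32 +0200] "GET /HowTos HTTP/1.1" 200 900 "-" "Googlebot/2.1"',
    ]
    print(last_crawl(sample_log))
```

Matching the agent by substring is deliberately crude: user-agent strings can
be spoofed, so a hit from "Googlebot" in the logs is a hint rather than proof.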