[CentOS-docs] Improving the website and forums

On Sat, 2009-01-03 at 21:17 +0100, Ralph Angenendt wrote:
> JohnS wrote:
> > 
> > On Sat, 2009-01-03 at 19:41 +0100, Ralph Angenendt wrote:
> > > JohnS wrote:
> > > > Hey Dag, does the Wiki even have a Site Map? If not that can greatly
> > > > help with the google searches (has to be submitted to Google). 
> > > 
> > > http://wiki.centos.org/TitleIndex is there. No idea if that is what Google
> > > accepts as a sitemap.
> > 
> > No it has to be Submitted to Google.
> 
> Yeah, I know that :)
> 
> > http://wiki.centos.org/TitleIndex?action=titleindex This can used for
> > the site index What needs to be done is append www.wiki.centos.org to
> > each line.
> 
> That's bad. Append or Prepend? In what form? You seem to have a bit of
> knowledge, so let me bother you instead of searching myself >:)

OK, in the pure text form of the index, an entry such as "/TitleIndex"
would need the wiki's host name prepended (put in front of it), like so:
"http://wiki.centos.org/TitleIndex". But that's the hard, old way. Keep
reading; there's a better solution further down.
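
For what it's worth, the plain-text route can be sketched in a few lines
of Python. This is only a rough sketch: it assumes the title index has
been saved as one page name per line, and the function name
titles_to_urls and the base URL are illustrative assumptions, not
something MoinMoin or Google provides.

```python
from urllib.parse import quote


def titles_to_urls(title_index_text, base_url="http://wiki.centos.org/"):
    """Turn a plain-text title index (one page name per line) into full URLs.

    base_url is an assumption; adjust it to the real host name of the wiki.
    """
    urls = []
    for line in title_index_text.splitlines():
        name = line.strip()
        if name:  # skip blank lines
            # quote() percent-encodes unsafe characters but keeps "/" intact,
            # so sub-pages such as Parent/Child stay readable.
            urls.append(base_url + quote(name))
    return urls


sample = "FrontPage\nAdditionalResources/HardwareList\n"
for url in titles_to_urls(sample):
    print(url)
```

The result is one full URL per line, which is the shape a plain-text
sitemap file takes.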

> Regarding the rest of your mail - I have to chew and digest that first.

OK, maybe this will clear a lot up for you:

Google has a Python script that will generate the site index (sitemap)
for the wiki. What is also very important in doing so is including the
last-modified date for the pages in question.
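
To show what including the last-modified date means in practice, here
is a minimal sketch of an XML sitemap writer. It is not Google's
generator script; build_sitemap is a hypothetical helper, and the page
URLs and dates in the usage lines are placeholders, not real
modification times from the wiki.

```python
from datetime import date
from xml.sax.saxutils import escape


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) pairs.

    last_modified must be a datetime.date; it is written as YYYY-MM-DD.
    """
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url, modified in pages:
        lines.append("  <url>")
        lines.append("    <loc>%s</loc>" % escape(url))
        lines.append("    <lastmod>%s</lastmod>" % modified.isoformat())
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


# Placeholder data, for illustration only.
pages = [
    ("http://wiki.centos.org/FrontPage", date(2009, 1, 3)),
    ("http://wiki.centos.org/AdditionalResources/HardwareList", date(2008, 12, 20)),
]
print(build_sitemap(pages))
```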

Indexing the wiki is very simple. The forums, on the other hand, are
not so simple, because it will mostly have to be done via XML, using
Python's xml.sax library to escape the URLs, although it could probably
be done in text form as well. By just using indexing you should not
have to change the MoinMoin code base at all.
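
The escaping matters because forum URLs usually carry query strings
with "&" in them, which is not allowed bare inside XML. A small sketch
using xml.sax.saxutils.escape follows; the helper name loc_element and
the forum URL are made up for illustration, not the real forum
addresses.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; quotes are added here as well.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def loc_element(url):
    """Wrap a URL in a <loc> element with XML-special characters escaped."""
    return "<loc>%s</loc>" % escape(url, EXTRA_ENTITIES)


# Hypothetical forum URL with two query parameters.
print(loc_element("http://forum.example.org/viewtopic.php?topic=1&forum=2"))
```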

Also, the internal MoinMoin search could really be improved. There's
no sense in having separate Title and Text searches (poor design). It
has a little too much "LIKE%" in the search string and not enough
"LIKE", if you know what I mean. Both of those can be used in MSSQL and
MySQL; I do not know the Python syntax for it. Someone, or you
yourself, should be able to translate those. It pulls in everything
under the sun when it needs to be specific.
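
To illustrate the difference I mean between wildcard and plain LIKE
matching, here is a toy sketch using Python's built-in sqlite3 module.
This is not how MoinMoin searches internally; the table, the page
titles and the search helper are all assumptions made up for the
example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (title TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?)",
    [
        ("HardwareList",),
        ("AdditionalResources/HardwareList",),
        ("HardwareListOld",),
    ],
)


def search(pattern):
    """Return page titles matching a SQL LIKE pattern, sorted by title."""
    rows = conn.execute(
        "SELECT title FROM pages WHERE title LIKE ? ORDER BY title", (pattern,)
    )
    return [row[0] for row in rows]


broad = search("%HardwareList%")  # wildcards: matches anything containing the term
exact = search("HardwareList")    # no wildcards: matches only that title
print(broad)
print(exact)
```

The wildcard form returns all three titles, while the plain form
returns only the one page actually named HardwareList.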

Example: what Dag pointed to and is chewing on...

<title>AdditionalResources/HardwareList - CentOS Wiki</title> Wrong Way

Needs to Be:

<title>HardwareList - CentOS Wiki</title> Correct Way

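To do that, the code base still has to be modified no matter how you
look at it. The way around it? Create a sitemap and submit it to
Google. Then create a "robots.txt" that keeps crawlers off the pages
you do not want followed (strictly speaking "nofollow" is a robots meta
tag value; robots.txt itself uses "Disallow" rules), or that lets them
follow again on specified dates.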
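
As a rough sketch of the robots.txt side, the snippet below defines a
small file and checks it with Python's urllib.robotparser. The
disallowed path /Sandbox/ and the sitemap location
http://wiki.centos.org/sitemap.xml are assumptions for illustration;
the "Sitemap:" line is the sitemaps.org way of pointing crawlers at the
sitemap from robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; paths and sitemap URL are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Sandbox/
Sitemap: http://wiki.centos.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ordinary pages stay crawlable, the disallowed path does not.
allowed = parser.can_fetch("*", "http://wiki.centos.org/FrontPage")
blocked = parser.can_fetch("*", "http://wiki.centos.org/Sandbox/Test")
sitemaps = parser.site_maps()  # available in Python 3.8 and later

print(allowed, blocked, sitemaps)
```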

If there is a major update to the wiki, as in switching to a different
one, then keep in mind how it handles the meta name and title content on
page creation. That is very important in web design. Furthermore, you
may inspect larger corporate sites and see a meta name with id=121212.
That number in turn gets translated to an index that gets auto-generated
and is usually put into an SQL server of some type, then submitted to a
search engine. All of which can be done automatically. I suspect this
can be done by a cron job running the Google Sitemap Generator code once
daily.

In all honesty it should not matter how a project like CentOS caters
to its users as long as they're satisfied and the content is delivered.
What I mean is the project needs to take a step back and reconsider how
content is distributed, including things like the stability of the
platform that is delivering it. I could give a lot of options on what to
use. I suspect something more along the lines of "The CentOS Content
Delivery System" on the Java platform. I suspect the sources would not
be that hard to build from upstream. ???

References:
http://www.google.com/support/webmasters/bin/answer.py?answer=34654&topic=13452
http://www.google.com/support/webmasters/bin/answer.py?answer=34575
http://code.google.com/p/sitemap-generators/