A new categorization system for Wikka
[This page was written by a non-native English speaker.]
Weaknesses of the current categorization system
Our current system of categorization (as of version 1.1.6.3) is based on word search. For example, if we have a category named CategoryBook and want to list all pages related to it, the system searches the database for all pages containing the word CategoryBook. The main problems with this system are:
- Inefficiency: the search may take a relatively long time to complete. On a big database or an overloaded SQL server, it may take up to 10 seconds, or even longer.
- FullTextSearch is not fully reliable: if FullTextSearch is available, Wikka uses it as an optimization. But FullTextSearch cannot be trusted 100%: a page that really contains the term CategoryBook may not be returned by the query.
- Overlapping category names: if you name another category CategoryBookJournal, you would normally put the word CategoryBookJournal on each page related to that category. But since the word CategoryBook is contained in the word CategoryBookJournal, all pages related to CategoryBookJournal will also be listed as related to CategoryBook.
- Higher risk of miscategorization: the categorizing system searches the content of the entire page, not only its last sentences. A big page not related to the category CategoryBook may contain a word like càtégorybook (spelled differently), but MySQL, with its usual accent-insensitive collations, does not distinguish the letter a from à, so the query will return that page as well.
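The last two weaknesses can be reproduced outside the database. The following is a minimal Python sketch, not Wikka code: the `fold` helper only approximates an accent- and case-insensitive comparison, and the page names and bodies are made up for illustration.

```python
import unicodedata


def fold(text):
    """Lowercase and strip accents, roughly imitating an accent- and
    case-insensitive comparison (an approximation, not MySQL itself)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def naive_category_members(pages, category):
    """Return the names of all pages whose body contains the category word."""
    needle = fold(category)
    return sorted(name for name, body in pages.items() if needle in fold(body))


# Hypothetical wiki content: only MyBook really belongs to CategoryBook.
pages = {
    "MyBook": "A novel I like.\n----\nCategoryBook",
    "MyJournal": "Issue 12.\n----\nCategoryBookJournal",
    "Typography": "Accents matter: càtégorybook is not a category.",
    "HomePage": "Welcome.",
}

print(naive_category_members(pages, "CategoryBook"))
# prints ['MyBook', 'MyJournal', 'Typography']
```

MyJournal is listed because CategoryBook is a substring of CategoryBookJournal, and Typography is listed because the accented spelling folds to the same letters.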
Proposed new categorization system
The proposed new categorization system uses link tracking. If a page named MyBook is related to a category CategoryBook, it is natural for that page to contain a link to the category page CategoryBook. Fortunately, Wikka already tracks links between pages, so the link from MyBook to CategoryBook will be recorded in the table [table_prefix]links. To find which pages are related to the category CategoryBook, it is then sufficient to search the links table for pages linking to CategoryBook. That is to say, the pages related to a category CategoryBook are simply the backlinks of the page named CategoryBook.
In other words, the rules of the proposed new categorization system are:
- A page related to a category should link to that category, not merely mention it as is the case now.
- A page not related to a category should not link to that category. Thus, if you need to write the word CategoryBook in a page not related to that category, you have to enclose the word in double double-quotes in order to unlink it (""CategoryBook""). Under the current system, the corresponding rule is to insert a space anywhere in the word, as in Category Book.
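The backlink lookup behind these rules can be sketched with an in-memory SQLite database standing in for MySQL. This is an illustration under assumptions: the table name `wikka_links` (a `wikka_` prefix) and the column names `from_tag` and `to_tag` are assumed here, and the rows are made up.

```python
import sqlite3

# In-memory stand-in for the links table; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wikka_links (from_tag TEXT NOT NULL, to_tag TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO wikka_links (from_tag, to_tag) VALUES (?, ?)",
    [
        ("MyBook", "CategoryBook"),
        ("MyJournal", "CategoryBookJournal"),
        ("MyBook", "HomePage"),
    ],
)


def category_members(conn, category):
    """Pages related to a category are the backlinks of the category page."""
    rows = conn.execute(
        "SELECT DISTINCT from_tag FROM wikka_links "
        "WHERE to_tag = ? ORDER BY from_tag",
        (category,),
    )
    return [row[0] for row in rows]


print(category_members(conn, "CategoryBook"))
# prints ['MyBook']
```

Because the query compares whole page names instead of scanning page bodies, a page linking to CategoryBookJournal is not reported as a member of CategoryBook.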
Implementation problem
BUT there is an issue in the current link tracking system (as of version 1.1.6.3). For pages created by the installer, the corresponding entries in the links table are not created, and they will not be created until each page is modified. As a consequence, category pages will be empty after an initial install, and some pages will be missing from them after an upgrade.
A fix is planned for the installation/upgrade to 1.1.7: the links table will be rebuilt just after install.
http://wush.net/trac/wikka/changeset/179
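To make the idea of rebuilding the links table concrete, here is a rough Python sketch. It is not the code from the changeset above: the `wikka_links` table, its `from_tag` and `to_tag` columns, and the simplified CamelCase pattern used to detect links are all assumptions made for illustration.

```python
import re
import sqlite3

# Text between double double-quotes is escaped and must not produce links.
ESCAPED = re.compile(r'""(.*?)""', re.S)
# Simplified CamelCase pattern (an assumption, not Wikka's real link rule).
CAMEL = re.compile(r"\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\b")


def extract_links(body):
    """Return the sorted, de-duplicated CamelCase links found in a page body."""
    visible = ESCAPED.sub(" ", body)
    return sorted(set(CAMEL.findall(visible)))


def rebuild_links(conn, pages):
    """Empty the links table and refill it from the current page bodies."""
    conn.execute("DELETE FROM wikka_links")
    for tag, body in pages.items():
        conn.executemany(
            "INSERT INTO wikka_links (from_tag, to_tag) VALUES (?, ?)",
            [(tag, target) for target in extract_links(body)],
        )
    conn.commit()


# Hypothetical pages as an installer might create them, with no link rows yet.
links_db = sqlite3.connect(":memory:")
links_db.execute("CREATE TABLE wikka_links (from_tag TEXT NOT NULL, to_tag TEXT NOT NULL)")
default_pages = {
    "FormattingRules": 'Write ""CategoryBook"" to avoid a link.\n----\nCategoryWiki',
    "SandBox": "Test your formatting skills here.\n----\nCategoryWiki",
}
rebuild_links(links_db, default_pages)
print(links_db.execute("SELECT from_tag, to_tag FROM wikka_links ORDER BY from_tag").fetchall())
# prints [('FormattingRules', 'CategoryWiki'), ('SandBox', 'CategoryWiki')]
```

The escaped mention of CategoryBook produces no row, so after the rebuild only pages that really link to a category appear among its backlinks.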