Proposal for an Advanced Category System
How categories work now
Right now, the information about which categories a page belongs to is stored inside the page itself. This leads to some problems:
- Adding or removing a page to/from a category requires a page edit (minor problem)
- Mentioning a category in a page which is not part of that category is difficult (major problem)
- Moving a subcategory requires 2 page edits (minor problem)
- Renaming a category requires n page edits, where n is the number of pages in that category (major problem)
Possible solution
Move the category information out of the pages and store it in two separate tables: one for the categories and their hierarchy, and one for the relations between pages and categories.
Implementation
Relationship between categories
The categories can be modelled as a tree. (This also allows a non-hierarchical system: simply place all categories directly under the root node of the tree and hide the root node itself from being displayed.) Since category information is read most of the time and only seldom written, Nested Sets are a good way to map our category tree to the MySQL database. (See http://www.intelligententerprise.com/001020/celko.jhtml?_requestid=1266295 (English) or http://www.develnet.org/36.html (German) for details on how nested sets work.)
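As a rough illustration of the nested set idea, here is a minimal sketch in Python using an in-memory SQLite database instead of MySQL. The table name `categories`, the columns `lft` and `rgt`, and the sample tree are assumptions made for this example, not the actual Wikka schema. Each node stores a left and a right number, and a node's descendants are exactly the rows whose numbers fall between them, so reads need no recursion.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories ("
    " id INTEGER PRIMARY KEY, name TEXT NOT NULL,"
    " lft INTEGER NOT NULL, rgt INTEGER NOT NULL)"
)

# Sample tree: Root -> (Actions -> (Beta, Gold), Docs)
conn.executemany(
    "INSERT INTO categories (id, name, lft, rgt) VALUES (?, ?, ?, ?)",
    [
        (1, "Root", 1, 10),
        (2, "Actions", 2, 7),
        (3, "Beta", 3, 4),
        (4, "Gold", 5, 6),
        (5, "Docs", 8, 9),
    ],
)


def descendants(conn, category_id):
    """Return the names of all categories below category_id, in tree order."""
    rows = conn.execute(
        "SELECT child.name FROM categories AS child"
        " JOIN categories AS parent"
        "   ON child.lft > parent.lft AND child.rgt < parent.rgt"
        " WHERE parent.id = ? ORDER BY child.lft",
        (category_id,),
    )
    return [name for (name,) in rows]


def breadcrumb(conn, category_id):
    """Return the path from the root down to category_id (inclusive)."""
    rows = conn.execute(
        "SELECT parent.name FROM categories AS node"
        " JOIN categories AS parent"
        "   ON node.lft BETWEEN parent.lft AND parent.rgt"
        " WHERE node.id = ? ORDER BY parent.lft",
        (category_id,),
    )
    return [name for (name,) in rows]


print(descendants(conn, 2))  # all subcategories of "Actions"
print(breadcrumb(conn, 4))   # path from the root to "Gold"
```

Both lookups are a single query each, which is the read-heavy case the nested set layout is meant to favour.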
Relationship between categories and pages
One category can have 0 to n pages, so the simplest way is a table with two 'foreign keys', with each entry mapping a page(-version) to a category-id (so you can simply rename a category). One page can belong to 0 to n categories (explicit; I would prefer 1 to n categories, with the root category being implicit if no other category is specified for a page).
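The mapping table could look roughly like the following sketch, again in Python with SQLite standing in for MySQL. The names `page_categories`, `page_id`, `category_id` and `ROOT_ID` are hypothetical. Because pages refer to categories by id, renaming a category touches one row instead of n pages, and a page without any entry falls back to the implicit root category.

```python
import sqlite3

# Hypothetical schema; names are chosen for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE page_categories (
        page_id     INTEGER NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id),
        PRIMARY KEY (page_id, category_id)
    );
    INSERT INTO categories (id, name) VALUES (1, 'Root'), (2, 'CategoryWiki');
    INSERT INTO page_categories (page_id, category_id) VALUES (42, 2), (43, 2);
    """
)

ROOT_ID = 1  # assumed id of the root category


def categories_of(conn, page_id):
    """Return the category names of a page; the root is implicit if none are set."""
    rows = conn.execute(
        "SELECT c.name FROM page_categories AS pc"
        " JOIN categories AS c ON c.id = pc.category_id"
        " WHERE pc.page_id = ? ORDER BY c.name",
        (page_id,),
    ).fetchall()
    if not rows:
        rows = conn.execute(
            "SELECT name FROM categories WHERE id = ?", (ROOT_ID,)
        ).fetchall()
    return [name for (name,) in rows]


def rename_category(conn, category_id, new_name):
    """Rename a category with a single-row update; no page is edited."""
    conn.execute(
        "UPDATE categories SET name = ? WHERE id = ?", (new_name, category_id)
    )


rename_category(conn, 2, "CategoryWikka")
print(categories_of(conn, 42))  # page 42 now shows the new category name
print(categories_of(conn, 99))  # unassigned page falls back to the root
```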
Problems here
- Deleting/moving a category or a subtree produces high load on the database. For trees with fewer than 100 nodes on halfway decent servers it should take less than 1 second though, most probably less than 0.1 second, and I doubt anyone would want more than 100 categories
- a lot of Wikka code has to be modified to fit the new system (see below)
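To show where the write load comes from, here is a minimal sketch of deleting a subtree from a nested set table, written in Python with SQLite in place of MySQL. The schema and the function name `delete_subtree` are assumptions for this example. Removing the subtree itself is cheap; the cost lies in the two updates that close the gap, which touch every node to the right of the removed range.

```python
import sqlite3

# Hypothetical nested set table; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories ("
    " id INTEGER PRIMARY KEY, name TEXT NOT NULL,"
    " lft INTEGER NOT NULL, rgt INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO categories (id, name, lft, rgt) VALUES (?, ?, ?, ?)",
    [
        (1, "Root", 1, 10),
        (2, "Actions", 2, 7),
        (3, "Beta", 3, 4),
        (4, "Gold", 5, 6),
        (5, "Docs", 8, 9),
    ],
)
conn.commit()


def delete_subtree(conn, category_id):
    """Delete a category and everything below it; return the number of rows removed."""
    row = conn.execute(
        "SELECT lft, rgt FROM categories WHERE id = ?", (category_id,)
    ).fetchone()
    if row is None:
        return 0
    lft, rgt = row
    width = rgt - lft + 1  # size of the gap left behind in the numbering
    with conn:  # one transaction, so readers never see a half-renumbered tree
        removed = conn.execute(
            "DELETE FROM categories WHERE lft BETWEEN ? AND ?", (lft, rgt)
        ).rowcount
        # Close the gap: shift every boundary that lay to the right of the subtree.
        conn.execute(
            "UPDATE categories SET lft = lft - ? WHERE lft > ?", (width, rgt)
        )
        conn.execute(
            "UPDATE categories SET rgt = rgt - ? WHERE rgt > ?", (width, rgt)
        )
    return removed


print(delete_subtree(conn, 2))  # removes "Actions" together with its subcategories
print(conn.execute("SELECT name, lft, rgt FROM categories ORDER BY lft").fetchall())
```

Moving a subtree needs a similar renumbering on both the old and the new position, which is why writes are the expensive side of this layout.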
Which parts of Wikka need to be modified
- the edit handler needs to be adjusted so you can easily set/change the categories a page belongs to
- a new action to manage the category hierarchy is needed (I am working on this right now)
- the category action needs to be rewritten (most of the code can be taken from the management action though; I am even thinking of completely integrating the functionality there)
- header and/or footer actions should be adjusted to show the category of the current page (or, which I would like even better, show a breadcrumb navigation)
Extras possible with the new system
With the new system it would be possible to handle ACLs more easily (and add inheritance to the ACL system). In my development version I also store ACLs for each category (the fields can be NULL). My plan is as follows:
- From all categories a page belongs to, one (and only one) can be marked as the "main parent"
- If a page has no page-specific ACLs assigned it will inherit the ACLs of its main parent
- If the main parent has no ACLs assigned it will inherit them (recursively) from its parent
- If the root category has no ACLs assigned the default ACLs from wikka.config.php will be used.
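The lookup order described in this list can be sketched as follows in Python. The classes `Category` and `Page`, the function `effective_acls`, and all ACL values are hypothetical placeholders; `DEFAULT_ACLS` merely stands in for whatever wikka.config.php defines. The sketch walks up the parent chain with a loop rather than recursion, which gives the same result.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder for the defaults in wikka.config.php; the values are made up.
DEFAULT_ACLS = {"read": "*", "write": "+", "comment": "+"}


@dataclass
class Category:
    name: str
    parent: Optional["Category"] = None  # None marks the root category
    acls: Optional[dict] = None          # None mirrors a NULL ACL field


@dataclass
class Page:
    tag: str
    main_parent: Category                # the one category marked as "main parent"
    acls: Optional[dict] = None          # page-specific ACLs, if any


def effective_acls(page, default_acls):
    """Resolve ACLs: page first, then main parent and its ancestors, then defaults."""
    if page.acls is not None:
        return page.acls
    node = page.main_parent
    while node is not None:
        if node.acls is not None:
            return node.acls
        node = node.parent
    return default_acls


root = Category("Root")
actions = Category("Actions", parent=root, acls={"read": "*", "write": "admins"})
beta = Category("ActionsInBetaState", parent=actions)
page = Page("CategoryActionInfo", main_parent=beta)

# "ActionsInBetaState" has no ACLs of its own, so those of "Actions" apply.
print(effective_acls(page, DEFAULT_ACLS))
```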
When will this all be available
I hope to get this done within the next week, maybe two, but I cannot promise anything. I will update this page whenever I have news.
State of the code
Right now approximately 20% of the code is working.
CategoryUserContributions
- In "Relationship between categories and pages" where you say "each entry mapping a page(-version) to a category-id" I guess you mean a *page* (as identified by its tag), not a *page version* (identified by the record id), right?
- I'm also in favor of handling categories through a separate field in the edit form (I personally don't like the current idea of 'adding categories through a link', since this raises some of the problems you mention at the top of this page). A separate field for adding categories also looks like quite a promising case for implementing an ajax-like system, like those currently implemented at del.icio.us or flickr.
Looking forward to seeing the actual implementation...
For example, you have categories for actions in alpha, beta and 'gold' state.
When an action changes its state, the page will usually be edited and at the same time moved to a new category.
When you do a revert, though, the old version of course belongs to the old category, so the page would move back automatically.
I can even imagine an advanced category action which (optionally!) lists the newest version of pages belonging to a category, even if those versions are not the latest versions of the pages.
More explicit:
The page CategoryActionInfo is a member of the category ActionsInBetaState, and the ID of the current page version is 42. A lot of time has passed and the last bugs have been fixed, so the action enters gold state. The page is edited with the last bugfix (the new version has ID 76) and moved to the category ActionsInGoldState. On all category pages the category action is included with the parameter "listExPages" set to true. Now on ActionsInBetaState the page CategoryActionInfo is still listed (maybe in a different color or whatever), with the link pointing to the version with ID 42, while on ActionsInGoldState the same page is listed with the link pointing to the latest version (76).
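The scenario above can be sketched in Python with SQLite as follows. The tables `page_versions` and `version_categories`, the function `list_category` and the parameter name `list_ex_pages` are assumptions for this example, and the sketch assumes that a higher version ID always means a newer version.

```python
import sqlite3

# Hypothetical tables: categories are attached to page versions, not to tags.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE page_versions (
        id  INTEGER PRIMARY KEY,
        tag TEXT NOT NULL
    );
    CREATE TABLE version_categories (
        version_id INTEGER NOT NULL REFERENCES page_versions(id),
        category   TEXT NOT NULL
    );
    INSERT INTO page_versions (id, tag) VALUES
        (42, 'CategoryActionInfo'), (76, 'CategoryActionInfo');
    INSERT INTO version_categories (version_id, category) VALUES
        (42, 'ActionsInBetaState'), (76, 'ActionsInGoldState');
    """
)


def list_category(conn, category, list_ex_pages=False):
    """Return (tag, version_id, is_latest) tuples for the pages in a category.

    With list_ex_pages=True, pages whose newest version in this category
    is no longer their latest version are listed as well.
    """
    # Latest version ID of every page, regardless of category.
    latest = dict(conn.execute("SELECT tag, MAX(id) FROM page_versions GROUP BY tag"))
    # Newest version of every page that belongs to the requested category.
    rows = conn.execute(
        "SELECT v.tag, MAX(v.id) FROM page_versions AS v"
        " JOIN version_categories AS vc ON vc.version_id = v.id"
        " WHERE vc.category = ? GROUP BY v.tag ORDER BY v.tag",
        (category,),
    ).fetchall()
    result = [(tag, vid, vid == latest[tag]) for tag, vid in rows]
    if not list_ex_pages:
        result = [entry for entry in result if entry[2]]
    return result


print(list_category(conn, "ActionsInBetaState", list_ex_pages=True))
print(list_category(conn, "ActionsInGoldState", list_ex_pages=True))
```

The third element of each tuple tells the category action whether to render the entry as a former member, for example in a different color.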
One more reason that speaks for the id instead of the tag: the tag might change (there is no rename handler yet as far as I know, but I wouldn't be surprised if someone writes one some day).
Additional note: apart from a (page) rename handler I could also imagine a "category rename handler". That would be easy as long as categories (like pages) have an id and linking is done with that id rather than the category name.