Proposal for an Advanced Category System


How categories work now

Right now the information about which categories a page belongs to is stored inside the page itself.
This leads to some problems:
(There might be more, feel free to add them.)

Possible solution

Move the information about categories out of the pages and store it in two separate tables: one for the categories and their hierarchy, and one for the relations between pages and categories.

Implementation


Relationship between categories
The categories can be modelled as a tree. (This allows a non-hierarchical system too, by simply placing all categories directly under the root node of the tree and hiding the root node itself from being displayed.)
Since information about categories is read most of the time and only seldom written, Nested Sets are a good way to map our category tree to the MySQL database: reading a whole subtree takes a single query, at the price of more expensive inserts and moves. (See http://www.intelligententerprise.com/001020/celko.jhtml?_requestid=1266295 (English) or http://www.develnet.org/36.html (German) for details on how nested sets work.)
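
To make the nested set idea concrete, here is a minimal sketch. It is not the code from the development version: it uses Python with an in-memory SQLite database in place of MySQL, and the table name, column names (lft, rgt), category names and helper functions are all assumptions made for illustration. Each category stores a left and a right boundary, and a category lies below another one exactly when its boundaries fall inside the other's.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL UNIQUE,"
    " lft INTEGER NOT NULL,"
    " rgt INTEGER NOT NULL)"
)

# Example tree: Root(1,10) contains Docs(2,7) and Users(8,9);
# Docs contains Actions(3,4) and Handlers(5,6).
conn.executemany(
    "INSERT INTO categories (name, lft, rgt) VALUES (?, ?, ?)",
    [("Root", 1, 10), ("Docs", 2, 7), ("Actions", 3, 4),
     ("Handlers", 5, 6), ("Users", 8, 9)],
)


def descendants(conn, name):
    """Return all categories strictly below `name`, using one non-recursive query."""
    rows = conn.execute(
        "SELECT child.name FROM categories AS child"
        " JOIN categories AS parent"
        "   ON child.lft > parent.lft AND child.rgt < parent.rgt"
        " WHERE parent.name = ? ORDER BY child.lft",
        (name,),
    )
    return [row[0] for row in rows]


def ancestors(conn, name):
    """Return the path from the root down to the direct parent of `name`."""
    rows = conn.execute(
        "SELECT parent.name FROM categories AS node"
        " JOIN categories AS parent"
        "   ON parent.lft < node.lft AND parent.rgt > node.rgt"
        " WHERE node.name = ? ORDER BY parent.lft",
        (name,),
    )
    return [row[0] for row in rows]


def add_child(conn, parent_name, child_name):
    """Insert a new leaf as the last child of `parent_name`.

    This is the expensive write: every boundary at or to the right of the
    parent's right edge is shifted by 2 to make room for the new node.
    """
    (parent_rgt,) = conn.execute(
        "SELECT rgt FROM categories WHERE name = ?", (parent_name,)
    ).fetchone()
    conn.execute("UPDATE categories SET rgt = rgt + 2 WHERE rgt >= ?", (parent_rgt,))
    conn.execute("UPDATE categories SET lft = lft + 2 WHERE lft > ?", (parent_rgt,))
    conn.execute(
        "INSERT INTO categories (name, lft, rgt) VALUES (?, ?, ?)",
        (child_name, parent_rgt, parent_rgt + 1),
    )


print(descendants(conn, "Docs"))   # ['Actions', 'Handlers']
print(ancestors(conn, "Actions"))  # ['Root', 'Docs']
```

Reading a subtree or a path is a single SELECT, while add_child has to touch every row to the right of the insertion point. That trade-off is why the read-mostly nature of categories matters here.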

Relationship between categories and pages
One category can have 0 to n pages, so the simplest way is a table with two 'foreign keys', with each entry mapping a page(-version) to a category-id (so you can simply rename a category).
One page can belong to 0 to n categories (explicit; I would prefer 1 to n categories, with the root category being implicit if no other category is specified for a page).
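
The mapping table could look roughly like the following sketch. Again this uses Python with SQLite instead of MySQL, and the names page_categories, page_id, category_id, ROOT_ID, categories_of and rename_category are hypothetical, not taken from the development version. It shows the two points made above: a page without explicit categories falls back to the implicit root category, and renaming a category touches only the categories table because the mapping refers to IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE page_categories (
    page_id     INTEGER NOT NULL,  -- id of a page (version), not its tag
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (page_id, category_id)
);
""")

ROOT_ID = 1  # assumption: the root category always has id 1
conn.executemany(
    "INSERT INTO categories (id, name) VALUES (?, ?)",
    [(ROOT_ID, "Root"), (2, "ActionsInBetaState")],
)
conn.execute("INSERT INTO page_categories (page_id, category_id) VALUES (42, 2)")


def categories_of(conn, page_id):
    """Return the category names of a page, or the implicit root if none are set."""
    rows = conn.execute(
        "SELECT c.name FROM page_categories AS pc"
        " JOIN categories AS c ON c.id = pc.category_id"
        " WHERE pc.page_id = ? ORDER BY c.name",
        (page_id,),
    ).fetchall()
    if rows:
        return [row[0] for row in rows]
    (root_name,) = conn.execute(
        "SELECT name FROM categories WHERE id = ?", (ROOT_ID,)
    ).fetchone()
    return [root_name]


def rename_category(conn, category_id, new_name):
    """Rename a category; the mapping rows are untouched because they hold only ids."""
    conn.execute(
        "UPDATE categories SET name = ? WHERE id = ?", (new_name, category_id)
    )


print(categories_of(conn, 42))  # ['ActionsInBetaState']
print(categories_of(conn, 99))  # ['Root']
```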

Problems here

Which parts of wikka need to be modified

Extras possible with the new system
With the new system it would be possible to handle ACLs more easily (and add inheritance to the ACL system). In my development version for each category I store ACLs as well (the fields can be NULL). My plan is as follows:
  1. From all categories a page belongs to, one (and only one) can be marked as the "main parent"
  1. If a page has no page-specific ACLs assigned it will inherit the ACLs of its main parent
  1. If the main parent has no ACLs assigned it will inherit them (recursively) from its parent
  1. If the root category has no ACLs assigned the default ACLs from wikka.config.php will be used.
Now if you have a set of pages that should only be edited by certain users, for example the official documentation for your software, simply create a category, assign ACLs to that category and add the set of pages to it (the category management action will allow adding/removing a group of pages to/from a category in one, maybe two steps).
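
The four inheritance rules above can be sketched as a simple lookup that walks up the tree. This is an illustration only, written in Python rather than in Wikka's PHP; the Category and Page classes, the effective_acls function, the example names and the DEFAULT_ACLS values are all assumptions standing in for the real tables and for the defaults in wikka.config.php.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder values standing in for the defaults from wikka.config.php.
DEFAULT_ACLS = {"read": "*", "write": "+", "comment": "+"}


@dataclass
class Category:
    name: str
    parent: Optional["Category"] = None  # None marks the root category
    acls: Optional[dict] = None          # None mirrors a NULL ACL field


@dataclass
class Page:
    tag: str
    main_parent: Category                # the one category marked as "main parent"
    acls: Optional[dict] = None          # None means no page-specific ACLs


def effective_acls(page, defaults=DEFAULT_ACLS):
    """Resolve the ACLs of a page by following the inheritance chain."""
    # Page-specific ACLs always win.
    if page.acls is not None:
        return page.acls
    # Otherwise walk from the main parent up to the root category.
    category = page.main_parent
    while category is not None:
        if category.acls is not None:
            return category.acls
        category = category.parent
    # Not even the root category has ACLs: use the configured defaults.
    return defaults


root = Category("Root")
docs = Category("OfficialDocs", parent=root, acls={"read": "*", "write": "DocTeam"})
install = Category("InstallDocs", parent=docs)

print(effective_acls(Page("InstallGuide", main_parent=install)))
# {'read': '*', 'write': 'DocTeam'}
```

Because only the main parent is followed, a page that belongs to several categories still has exactly one inheritance chain, so the result is never ambiguous.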

When will this all be available
I hope to get this done within the next week, maybe two, but I cannot promise anything. I will update this page whenever I have news.

State of the code
Right now approximately 20% of the code is working.


CategoryUserContributions
Comments
Comment by DarTar
2005-06-24 11:58:06
TimoK, this proposal looks interesting. Some quick remarks:

- In "Relationship between categories and pages" where you say "each entry mapping a page(-version) to a category-id" I guess you mean a *page* (as identified by its tag), not a *page version* (identified by the record id), right?
- I'm also in favor of handling categories through a separate field in the edit form (I personally don't like the current idea of 'adding categories through a link', since this raises some of the problems you mention at the top of this page). A separate field for adding categories also looks like quite a promising case for implementing an ajax-like system, like those currently implemented at del.icio.us or flickr.

Looking forward to seeing the actual implementation...
Comment by TimoK
2005-06-24 13:14:44
DarTar: Actually I really mean page version. Of course when editing the new version will inherit the category from the last version, but it might make sense in some cases to assign different versions to different categories.

For example, you have categories for actions in alpha, beta and 'gold' state.
When an action changes its state, the page will usually be edited and at the same time moved to a new category.
When you do a revert, though, the old version of course belongs to the old category, so the page would then move back automatically.
I can even imagine an advanced category action which (optionally!) lists the newest version of pages belonging to a category, even if those versions are not the latest versions of the pages.

More explicit:
The page CategoryActionInfo is a member of the category ActionsInBetaState, and the ID of the current page version is 42. A lot of time has passed and the last bugs have been fixed, so the action enters gold state. The page is edited with the last bugfix (the new version has ID 76) and moved to the category ActionsInGoldState. On all category pages the category action is included with the parameter "listExPages" set to true. Now on ActionsInBetaState the page CategoryActionInfo is still listed (maybe in a different color or whatever), with the link pointing to the version with ID 42, while on ActionsInGoldState the same page is listed with the link pointing to the latest version (76).

One more reason that speaks for the id instead of the tag: the tag might change (there is no rename handler yet as far as I know, but I wouldn't be surprised if someone writes one some day).
Comment by JavaWoman
2005-06-24 23:08:48
Linking a category to a page version makes sense to me; that way *changes* to the category a page belongs to can be recorded in the history as well. A new page version would then inherit the current page category/categories unless the edit *also* changes what categories the page belongs to. IOW: that's exactly as I envisioned it as well. :)

Additional note: apart from a (page) rename handler I could also imagine a "category rename handler" - that would be easy as long as categories (like pages) have an id and linking is done with that id rather than the category name.
Comment by TimoK
2005-06-25 06:27:18
Renaming categories of course will be possible.