===Category System Overhaul===

The category system used in Wikka needs improvement, and I'm looking for ideas and comments. -- JsnX

==Current system:==
User adds a wikiword category tag to each page, such as CategoryDevelopmentArchitecture.
For more information, see WikiCategory.

Issues with current system:
- Not newbie friendly.
- Non-obvious, making it easy to forget to add a category tag.
- Wasteful of resources: category changes are stored as page updates, which adds database entries.
- Wasteful of time: category changes cause pages to show in RecentChanges, forcing people to review them to see what changed.

==Proposed system:==
User selects a category from a dropdown box during page editing.

- Create a new table named Categories.
- Create an action for adding items to, and removing items from, the category table.
- Add a field to the pages table named 'category'.
- Show a dropdown box during page editing that lets the user select a category. The value in the dropdown box is then saved in the 'category' field for that page.
- Create a new category action that will list the available categories.

Issues with this proposal:
- As proposed, this will only allow one category per page.
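
The proposal above can be sketched with an in-memory SQLite database. This is only an illustration under stated assumptions: the ##pages## table here is simplified to one row per page, and the column names and the helper functions ##dropdown_options## and ##set_category## are hypothetical, not existing Wikka code.

```python
import sqlite3

# Table and column names are illustrative; the proposal only names a
# "Categories" table and a 'category' field on the pages table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (name TEXT PRIMARY KEY);
    CREATE TABLE pages (
        tag      TEXT PRIMARY KEY,
        body     TEXT NOT NULL,
        category TEXT REFERENCES categories(name)  -- one value per page
    );
""")
conn.executemany("INSERT INTO categories VALUES (?)",
                 [("Development",), ("Documentation",)])
conn.execute("INSERT INTO pages VALUES (?, ?, ?)",
             ("CategorySystemOverhaul", "...", "Development"))


def dropdown_options(conn):
    """Values the edit form would offer in its dropdown box."""
    return [row[0] for row in conn.execute(
        "SELECT name FROM categories ORDER BY name")]


def set_category(conn, tag, category):
    """Saving the form overwrites the page's previous category."""
    conn.execute("UPDATE pages SET category = ? WHERE tag = ?",
                 (category, tag))


set_category(conn, "CategorySystemOverhaul", "Documentation")
```

Because ##category## is a single column, choosing a second category replaces the first, which is exactly the limitation listed above.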


''**Multiple categories per page**
You say: //this will only allow one category per page//
This is not necessary. If we want to allow more categories per page, we might think of storing //sets of values// (I recently had the very same problem with a user management system in which users can belong to different categories):
- Make the category field in ##wikka_pages## accept comma separated values (see MySQL's [[ SET type]]);
- On page editing/creation, instead of a dropdown box, display a list of check boxes.
The ##wikka_category## table might then contain two separate fields for each category:
1) a system-generated **unique-key** (TINYINT), invisible to the user and handled by the system for cross-referencing the ##wikka_pages## table (whose category field will then contain a simple comma separated list of numerical keys, like: "1, 5, 7")
2) a human-readable **category label** (VARCHAR), like "Documentation", or "Development".
Keeping the key vs. category label separate might also help cope with i18n issues
-- DarTar''
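
A minimal sketch of the key/label split described above, assuming the category field holds a string such as "1, 5, 7". The label table contents and the function names are made up for illustration.

```python
# Hypothetical contents of the wikka_category table: key -> label.
CATEGORY_LABELS = {1: "Documentation", 5: "Development", 7: "Users"}


def parse_category_field(field):
    """Turn a stored value such as "1, 5, 7" into a list of integer keys."""
    return [int(part) for part in field.split(",") if part.strip()]


def format_category_field(keys):
    """Serialize the ticked check boxes back into the stored form."""
    return ", ".join(str(key) for key in sorted(set(keys)))


def labels_for(field, labels=CATEGORY_LABELS):
    """Resolve stored keys to the human-readable labels shown to the user."""
    return [labels[key] for key in parse_category_field(field)]
```

One trade-off of this layout: listing all pages in a given category means searching inside the stored string rather than joining on a key.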

Yet another approach:
~1) **##categories##** table:
~~-parent (may be NULL for top-level category)
~~-maybe extra "administrative" fields such as timestamp
~This would take care of a categories hierarchy
~2) **##pagecats##** table:

~Now we can store many-to-many relationships between pages and categories; and everything is neatly normalized, too.
~When a page is created/edited a dropdown (rendering the hierarchy!) could allow choosing one or more categories; there should be an extra option to (instead) **create** a new category and assign it to an existing one. Edits should of course also allow **removing** a category from a page (while still allowing assigning or creating a new one). Note I'm using page_id, not page_name, in **##pagecats##** - on purpose, so the categorization belongs to the page version; a new page version should (initially) inherit all categories, of course.
~That might become a tad complicated for an edit dialog; so maybe "categorization" should be a separate dialog (handler). I'm sure there are nice examples around of dialogs for maintaining a hierarchy of categories. (Many CMSs have something like that; I know there are demos around.)
~Extra wrinkle: what to do with pages when a category is deleted? (Probably assign to the parent category.)
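
The normalized layout above might look roughly like the following SQLite sketch. Only ##parent## and ##page_id## come from the discussion; the other column names (##id##, ##name##, ##category_id##) and all function names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL UNIQUE,
        parent INTEGER REFERENCES categories(id)  -- NULL for top level
    );
    CREATE TABLE pagecats (
        page_id     INTEGER NOT NULL,  -- id of one page version
        category_id INTEGER NOT NULL REFERENCES categories(id),
        PRIMARY KEY (page_id, category_id)
    );
""")
conn.executemany("INSERT INTO categories VALUES (?, ?, ?)", [
    (1, "Development", None),
    (2, "Architecture", 1),
    (3, "Documentation", None),
])
conn.executemany("INSERT INTO pagecats VALUES (?, ?)",
                 [(10, 2), (10, 3), (11, 2)])


def categories_of(conn, page_id):
    """Names of all categories assigned to one page version."""
    rows = conn.execute(
        "SELECT c.name FROM pagecats pc "
        "JOIN categories c ON c.id = pc.category_id "
        "WHERE pc.page_id = ? ORDER BY c.name", (page_id,))
    return [row[0] for row in rows]


def inherit_categories(conn, old_page_id, new_page_id):
    """A new page version starts with the categories of the previous one."""
    conn.execute(
        "INSERT INTO pagecats SELECT ?, category_id FROM pagecats "
        "WHERE page_id = ?", (new_page_id, old_page_id))


def delete_category(conn, category_id):
    """Delete a category, moving its pages and subcategories to its parent."""
    (parent,) = conn.execute(
        "SELECT parent FROM categories WHERE id = ?",
        (category_id,)).fetchone()
    if parent is not None:
        # OR IGNORE skips pages that are already in the parent category.
        conn.execute("UPDATE OR IGNORE pagecats SET category_id = ? "
                     "WHERE category_id = ?", (parent, category_id))
    # Remove whatever is left (skipped rows, or all rows for a top level).
    conn.execute("DELETE FROM pagecats WHERE category_id = ?", (category_id,))
    conn.execute("UPDATE categories SET parent = ? WHERE parent = ?",
                 (parent, category_id))
    conn.execute("DELETE FROM categories WHERE id = ?", (category_id,))
```

Deleting a top-level category simply drops the categorization here; that is one possible answer to the "extra wrinkle", not the only one.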

JavaWoman, that sounds good. And a page showing the uncategorized wiki pages could be helpful :)

I am working on something very much like this, see AdvancedCategorySystem for details.
-- TimoK

''JW, I confess I was thinking of a set of //horizontal// categories, not of a //hierarchy// of categories. The two approaches have different kinds of use and different pros/cons. The horizontal approach is certainly much easier to implement/maintain, the vertical approach requires some extra work, but I think it is worth the effort. I welcome the idea of a dedicated handler, instead of cluttering the edit page. -- DarTar''

Kevin Yank has an example in his book which uses checkboxes rather than a dropdown menu to allow multiple categories to be chosen.

''Why a hierarchy?
The current system (admittedly not very user-friendly or at least not newbie-friendly) does allow a hierarchy of categories. I wouldn't like to lose that capability. The **##categories##** table as I proposed allows such a hierarchy without enforcing it: if the community "decides" they don't need a hierarchy they'd just not assign a new category to a parent - and that's that; it will be as flat as the users make it. (A maintenance handler //could// enforce a hierarchy, given such a data model, but I don't think that's necessary or even sensible.)

Another issue: conversion for an existing Wikka Wiki. A conversion utility (something definitely needed) should migrate all existing categories; and if that Wiki already has a hierarchy of categories, and the new system doesn't support that - what are you going to do? It's not easy to "flatten" an existing hierarchy into something still meaningful.

Definitely extra work, but are we in a hurry? We do have a system that actually works, even if it is hard to use. And I think categorization (and a user interface to support it) is important enough to give it careful thought.
-- JavaWoman''

Perhaps we should first define what features a category system could (or should) have:
- Pages should be able to belong to zero or more categories.
- Categories should be able to belong to zero or more categories.
- It should be easy to add a category to, or delete one from, a page.
- An admin should be able to rename or delete categories.
- There could be a page which lists all pages belonging to no category.
- An admin should be able to merge two categories, or divide a category into two or more.
- An action like "nocategory" which prevents adding a category to pages.
- An alphabetical index, like the one at the page index, would be useful for large categories.
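
Two of the features in this list, the page of uncategorized pages and the alphabetical index, can be sketched in a few lines. The data structures (a list of page names and a mapping from page name to category list) and the function names are assumptions chosen for the example, not Wikka's internals.

```python
from itertools import groupby


def uncategorized(pages, page_categories):
    """Pages that belong to no category, sorted by name."""
    return sorted(p for p in pages if not page_categories.get(p))


def alphabetical_index(pages):
    """Group page names under their upper-cased first letter."""
    ordered = sorted(pages, key=str.lower)
    return {letter: list(group)
            for letter, group in groupby(ordered, key=lambda p: p[0].upper())}
```

For example, with pages ##WikiCategory##, ##HomePage##, ##SandBox## and ##WikkaDevelopment##, the index would have the headings H, S and W.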

==no categories?==
The developer of Comawiki wants to introduce a system he calls "father-and-son-pages" into his wiki. That means that a page can have pages belonging to it. With this in mind, and after re-reading my list above and the things I planned for my event system, I came up with another approach:

We don't use extra pages for categories, but make it possible for pages to belong to other pages. This would require a table with ##page_id## (of the page) and ##belong_id## (the page to which this page "belongs") [needs better names, I know]. And the original page table needs a field with the number of pages belonging to a page.

Instead of a list of categories, we present a list of pages, in descending order of the number of pages they already have. Pages without pages attached should be shown separately.

example: Category Development would become a page like WikkaDevelopment and would be shown very high in the list, because it would have 47 pages belonging to it.
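
A rough sketch of that listing, assuming the ##page_id## / ##belong_id## pairs are available as a list of tuples. The function name is hypothetical, and the count is computed on the fly here instead of being stored in the page table.

```python
from collections import Counter


def pages_by_child_count(pages, belongs_to):
    """Split pages into (parents ordered by child count, pages without children).

    belongs_to is a list of (page, parent_page) pairs, standing in for the
    proposed page_id / belong_id table.
    """
    counts = Counter(parent for _, parent in belongs_to)
    with_children = sorted((p for p in pages if counts[p]),
                           key=lambda p: (-counts[p], p))
    without_children = sorted(p for p in pages if not counts[p])
    return with_children, without_children
```

Ties between parents with the same number of pages are broken alphabetically, which is a choice made for this example only.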

''Such a "father-and-son-pages" system would mean organising pages **as** a strict hierarchy (note that this is not the same as organising them **in** a strict hierarchy!). It's an interesting concept but doesn't allow any kind of classification: in fact, it's completely orthogonal to a categorizing system, which I definitely would not want to give up. One could have both, but I do want //at least// a good categorizing system. In fact, I think that's indispensable for a Wiki!'' --JavaWoman

Looks like I should think a little bit about this for myself ;-) --NilsLindenberg

How about instead of ##belong_id## use ##parent_id## --KickTheDonkey

==Idea for Category Support==
I moved my suggestion here; it is the better place for it. Thanks DarTar.
The category support is currently not perfect. Here is my suggestion:
Add the category to the URL in front of the wiki name:
mod_rewrite url: (I left out CamelCase here for easier formatting)
real url:
Whenever a page is called with a category, override the config's base_url with the URL including the category (plus a minus sign when using mod_rewrite). The browser will translate all URLs on this page to the URL including the current category.
Whenever you want to set a link to a different category, use InterWiki links - or something similar, to distinguish between category and InterWiki links.
Whenever a new page is created by clicking on a non-existing page, the new page will inherit the category from the linking page.
I think it is easy to implement - I have already done this partially on my page.
A page can only exist in one single category.
**Open Issues**
~& Added after reading the discussion above: If you change the category of the parent, all links to the children also change, so you have to change the category of all child pages, which can be done with a single SQL query. If you delete a parent, all children stay in the category. Hierarchical categories are possible by using more delimiters in the URL (e.g. "-").

== Yet more suggestions...==

Please, please, //please// don't change the current category implementation! It's exceedingly, wonderfully flexible and usable as it is—it just needs a few refinements. :)

A couple of suggestions (some of which have been raised before elsewhere, I'm sure):

1) There needs to be a way to link to a category page //without// having the page actually being put into the category. It should be a very simple addition (from the user side) to the regular category link.

2) There needs to be a way to tell the category page how to sort a page in the category list. For example, that way a page about a person (e.g. ""AlbertEinstein"") could sort correctly (under E instead of A).

One wiki that I think handles categories spectacularly well is MediaWiki. I don't like a lot of stuff about MW (I think it's pretty clunky after having tried it out), but I do (mostly) like [[ how categories are implemented]].

The things I like:

1) Placing a page in a category is as simple as ##""[[Category:Foo]]""##, and even if there isn't a page created for that category yet, if there are subcategories, it will still display the links to their pages.

2) Linking to the category without categorizing the page doing the linking only changes the link code by one character. ##""[[:Category:Foo]]""## will make the link to the category page, but will not categorize the page in that category.

3) All pages that use the ##Category:## prefix in the name are automatically listed on the ##Special:Categories## page, which is an automatic list of //all// categories in the wiki. Also, the category titles are displayed stripped of the ##Category:## that starts the page link so that they're easily readable.

4) The category pages display subcategories and other pages separately, with dividers for each letter of the alphabet (this would be a good option to make configurable, as some people may really like having the pages separate from subcategories and others might not, as well as the letter headers).
~&I made a small hack which divides between categories and pages (within the category). See PageAndCategoryDivisionInACategory. --NilsLindenberg

5) Sort keys are used to help correctly display pages on the category lists: ##""[[Category:Foo|Einstein, Albert]]""## tells the wiki to display the link ##""AlbertEinstein""## on the ##Category:Foo## page in alphabetical order by last name, first name.
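
The three link forms described in this list (plain, colon-prefixed, and with a sort key) can be told apart with one regular expression. This is a sketch of the syntax as described here, not MediaWiki's actual parser; the function names and the sample pages other than ""AlbertEinstein"" are illustrative.

```python
import re

# [[Category:Name]], [[:Category:Name]] or [[Category:Name|Sort key]]
CATEGORY_LINK = re.compile(r"\[\[(:?)Category:([^\]|]+)(?:\|([^\]]*))?\]\]")


def parse_category_links(page_name, text):
    """Return (memberships, plain_links) found in a page's text.

    memberships maps category name -> sort key (defaulting to the page name);
    plain_links lists categories that are only linked to, not joined.
    """
    memberships, plain_links = {}, []
    for colon, category, sort_key in CATEGORY_LINK.findall(text):
        category = category.strip()
        if colon:
            plain_links.append(category)
        else:
            memberships[category] = sort_key.strip() or page_name
    return memberships, plain_links


def sorted_members(members):
    """Order (page_name, sort_key) pairs by sort key, ignoring case."""
    return [name for name, key in sorted(members, key=lambda m: m[1].lower())]
```

With a sort key of "Einstein, Albert", the page ""AlbertEinstein"" is listed under E instead of A, as asked for in the suggestions above.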

The system Wikka uses right now is very flexible, at least on the user end, and I don't think it's hard to figure out at all. One of the main reasons I chose to use Wikka for my site was because the categories were flexible and you can have multiple categories per page, and they also weren't annoyingly anchored in the layouts, which is very important for a site trying to use categories as more than metadata about the pages.


~&While I like the idea of (essentially) keeping our current classification system, as long as we get rid of the very obvious drawback of automatic linking by so much as **mentioning** a category name, I agree we need some sort of //explicit// link to a category to place a page in that category. Given that would be "a special kind of link" it's natural to use our ""[[...]]"" notation for that. What I don't like though is adopting the colon which makes it look too much like "name spaces" used in other Wikis - and our categories really **are not** namespaces. So, something different (preferably also simpler than MediaWiki) would be called for. ---
~&Yesterday - as a result of some discussion with SteveB on [[TheLounge #wikka]] - I suddenly hit on the idea of using the same mechanism we already use for code blocks: while ""%%...%%"" just means "this is code", ""%%(php)...%%"" means "this is **php** code": a //special type// of code: using the ""(...)"" notation you add **properties** to an element. So when we need "a special type of link" to link a page to a category, we could use that as well. --- Applying this idea, ""[[(cat)CategoryDevelopmentArchitecture]]"" would then place a page in the CategoryDevelopmentArchitecture **//category//**, while merely mentioning CategoryDevelopmentArchitecture would create a normal link to the CategoryDevelopmentArchitecture **//page//** (where the category is described). And just like the ""(...)"" notation for code blocks is already extended to allow for a starting line number (a second property applicable to code blocks), we can easily extend it for links as well. For instance, if you'd want to link a page to a category (so the ""{{category}}"" action can find it) //without// creating a visible link, you could use something like ""[[(cat;hidden)CategoryDevelopmentArchitecture]]"". A few rather simple changes could enable this, I think. --JavaWoman
~ ---
~~& I like that idea, it seems very logical to me. (This would be a good thing to be able to set defaults for in the wikka config file.) --- ---
~~& Do you envision this as also enabling getting rid of the requirement of adding "Category" to the front of the page name? (Though, come to think of it, is there actually a requirement to do so, or is that just common usage? I can't say I've tried to make a page a category without prepending "category" on the page name.) I think that would help immensely by cleaning up the display of pages/categories on category pages by removing the clutter of repeating "category" on every link. --MovieLady

~ ---
~~& JavaWoman, might you point me in the right direction to modify my installation to accomplish this? I just discovered WikkaWiki yesterday, and love it, but I cannot make good use of it while adding a page to a category is accomplished in this way. I am not very proficient with PHP or SQL but if you have just a few hints as to what the "simple changes" might be, I may be able to figure it out. Thanks very much for any help you could give. --AndCod
~~~&AndCod, I just brainstormed a bit with NilsLindenberg about this on the #wikka channel; the conclusion is that it would not be very hard to accomplish if you //are// reasonably proficient with PHP and MySQL (and steal a bit of code now used in trunk for keeping track of links between pages, and leverage it to keep track of categories for pages) but even that idea isn't fully-formed in my mind and would need some further data design to keep the possibility of a hierarchy of categories. You'd definitely need a table to keep track of relationships between pages and categories, and likely one or two more tables for relationships between categories themselves. It would need additions as well as "relatively simple changes" (that was a bit optimistic maybe). Still quite doable I think - but only for someone who is reasonably proficient with PHP and MySQL as well as reasonably familiar with Wikka's code. Meanwhile, depending on your type of site, you might get away with setting up a system of categories in advance, combined with templates and cloning so people can easily create pages "in" a certain category. --JavaWoman
~~~~& JavaWoman, I have poked around this site a bit more and I found NilsLindenberg's PageAndCategoryDivisionInACategory, which I think will be suitable for me. Thanks for your reply on the issue. --AndCod
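
The ""[[(cat)...]]"" notation proposed in this thread could be recognized along the following lines. This is only a sketch of the idea, not code from Wikka: the regular expression, the function name and the sample page text are assumptions.

```python
import re

# [[(prop1;prop2)TargetPage]] : properties in parentheses, then the target.
PROPERTY_LINK = re.compile(r"\[\[\(([^)]*)\)\s*([^\]\s]+)\]\]")


def category_links(text):
    """Return (category_page, hidden) for every link carrying the 'cat' property."""
    found = []
    for props, target in PROPERTY_LINK.findall(text):
        properties = [p.strip() for p in props.split(";") if p.strip()]
        if "cat" in properties:
            found.append((target, "hidden" in properties))
    return found
```

A bare mention such as CategoryWiki is ignored by this function, so it would stay a normal link to the category page and would not categorize the page.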

==Faceted classification==
Just a day after I told someone there probably would not be any other possible approaches than already proposed on this page, I stumble over another interesting one: [[ FacetedClassification]]. There's even an (experimental) [[ Wiki engine that uses it]]. I'm not proposing we do this, actually (it's too different conceptually from what we already have here), but worth mentioning as an interesting approach to classification and navigation based on it. --JavaWoman

==Categories and FreeMind==

One of the next steps towards a better integration of FreeMind in Wikka could be the automatic generation of an XML tree of Wikka's Categories displayed as a map.
-- DarTar
~&Well, as long as that remains an alternative presentation... the maps are nice, but (as seen from here) too slow to be useful for daily usage. There are many ways to display a tree and HTML has excellent facilities for that (that are excellently accessible as well). --JavaWoman
