Revision [9132]

This is an old revision of ImprovedFormatter made by JavaWoman on 2005-06-11 16:34:36.

 

Improved Formatter


This is the development page for an improved version of "the Formatter", specifically, the code in ./formatters/wakka.php (as opposed to the AdvancedFormatter page which deals with "advanced" formatting in other ways as well, such as standardized code generation utilities).
 

Why?


While our current (version 1.1.6.0) Formatter is quite capable, it has some quirks and bugs, doesn't always generate valid XHTML (though it tries hard), and lacks a few features that would be nice to have in themselves or that would enable other nice things (such as page TOCs; see TableofcontentsAction). The improved version presented here tries to address some of these issues (with more likely to follow).


What?


Here is a short summary of what has changed (details below):

The code presented below is still considered a beta version and as such contains many lines of (commented-out) debug code. These will of course be removed before final release. Any reference to line numbers is (for now) to the new (beta) code since this is a complete drop-in replacement for the original file.


Closing open tags


The current version (Wikka 1.1.6.0) of the Formatter has a bit of code contributed by DotMG to close any left-open tags at the very end of a page. While that can solve some problems with rendering and including pages, the code did not close every kind of open tag. In particular, still-open lists and indents weren't handled at all (see "List parsing bug?" on WikkaBugs). Also, this code would directly echo its output instead of returning a string as the rest of the Formatter's main function does.

The new version addresses all of these problems.

Closing of indents and (open) lists was already happening when encountering a newline that doesn't start with a TAB or a ~, so this bit is separated out as a function:

    if (!function_exists('close_indents'))
    {
        function close_indents(&$indentClosers,&$oldIndentLevel,&$oldIndentLength,&$newIndentSpace)
        {
            $result='';

            // pop all pending closers off the stack, innermost first
            $c = count($indentClosers);
            for ($i = 0; $i < $c; $i++)
            {
                $result .= array_pop($indentClosers);
                $br = 0;    // note: $br is local to this function, so the caller's $br is not changed
            }
            // reset the indent state (passed by reference)
            $oldIndentLevel = 0;
            $oldIndentLength= 0;
            $newIndentSpace=array();

            return $result;
        }
    }

The section that handles newlines now only needs to call this function:

    $result .= close_indents($indentClosers,$oldIndentLevel,$oldIndentLength,$newIndentSpace);

    $result .= ($br) ? "<br />\n" : "\n";
    $br = 1;
    return $result;
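
To see what the helper does in isolation, here is a small standalone sketch. The function body follows the listing above (leaving out the $br line, which only sets a local variable); the sample closer strings and starting values are made up for illustration and are not taken from the Formatter.

```php
<?php
// Copy of the helper shown above, minus the local $br assignment.
function close_indents(&$indentClosers, &$oldIndentLevel, &$oldIndentLength, &$newIndentSpace)
{
    $result = '';
    $c = count($indentClosers);
    for ($i = 0; $i < $c; $i++)
    {
        // closers come off the stack innermost first
        $result .= array_pop($indentClosers);
    }
    $oldIndentLevel = 0;
    $oldIndentLength = 0;
    $newIndentSpace = array();
    return $result;
}

// Hypothetical state: an indent (div) that contains an open list
$indentClosers   = array('</div>', '</li></ul>');
$oldIndentLevel  = 2;
$oldIndentLength = 2;
$newIndentSpace  = array(1 => 1, 2 => 1);

$closed = close_indents($indentClosers, $oldIndentLevel, $oldIndentLength, $newIndentSpace);
echo $closed, "\n";   // </li></ul></div>
```

Because all four arguments are passed by reference, the caller's stack is empty and its indent counters are back at zero after the call.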


To close open tags at the end of the page, the new code now calls this function first, and then handles all other open tags, in an order chosen to at least minimize incorrect tag nesting (but see "Not a complete solution!" below):

    if ((!is_array($things)) && ($things == 'closetags'))
    {
        $result .= close_indents($indentClosers,$oldIndentLevel,$oldIndentLength,$newIndentSpace);

        if ($trigger_bold % 2) $result .= '</strong>';
        if ($trigger_italic % 2) $result .= '</em>';
        if ($trigger_keys % 2) $result .= '</kbd>';
        if ($trigger_monospace % 2) $result .= '</tt>';

        if ($trigger_underline % 2) $result .= '</span>';
        if ($trigger_notes % 2) $result .= '</span>';
        if ($trigger_strike % 2) $result .= '</span>';
        if ($trigger_inserted % 2) $result .= '</span>';
        if ($trigger_deleted % 2) $result .= '</span>';

        if ($trigger_center % 2) $result .= '</div>';
        if ($trigger_floatl % 2) $result .= '</div>';
        if ($trigger_floatr % 2) $result .= '</div>';                   # JW added
        for ($i = 1; $i<=5; $i ++)
        {
            if ($trigger_l[$i] % 2) $result .= ("</h$i>");
        }

        $trigger_bold = $trigger_italic = $trigger_keys = $trigger_monospace = 0;
        $trigger_underline = $trigger_notes = $trigger_strike = $trigger_inserted = $trigger_deleted = 0;
        $trigger_center = $trigger_floatl = $trigger_floatr = 0;
        $trigger_l = array(-1, 0, 0, 0, 0, 0);
        return $result;
    }
    else
    {
        $thing = $things[1];
    }
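
The `% 2` tests above only make sense if each trigger counter is incremented every time its markup is encountered, so that an odd value means "currently open". A minimal sketch of that idea follows; the toggle() helper is hypothetical and exists only for this illustration, and only two of the counters are shown.

```php
<?php
// Hypothetical helper: opening tag on odd counts, closing tag on even counts.
function toggle(&$counter, $tag)
{
    return (++$counter % 2) ? "<$tag>" : "</$tag>";
}

$trigger_bold   = 0;
$trigger_italic = 0;

$out  = toggle($trigger_bold, 'strong') . 'bold ';   // bold opened, never closed
$out .= toggle($trigger_italic, 'em') . 'both';      // italic opened ...
$out .= toggle($trigger_italic, 'em');               // ... and closed again

// End of page: close whatever still has an odd counter.
if ($trigger_bold % 2)   $out .= '</strong>';        // odd (1): tag is added
if ($trigger_italic % 2) $out .= '</em>';            // even (2): nothing added

echo $out, "\n";   // <strong>bold <em>both</em></strong>
```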


Not a complete solution!
A big problem remains, however: in order to produce valid (X)HTML, open tags cannot just be closed anywhere, because there are rules for which elements can contain which other elements. For instance, an inline element (like <em>) can never contain a block element (like a list). So if the inline element is left open (which happens if someone types // to start emphasized text but doesn't close it before starting an indent or list), closing the generated opening <em> tag at the end of the page may prevent display problems in some browsers, but the result is still not valid (X)HTML. This type of problem can only really be addressed with a completely different mechanism for a formatter. This should definitely be tackled at some time, but it is outside the scope of the current improvements, which are designed to work within the current Formatter's mechanism.
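
To make the nesting problem concrete, the sketch below scans a fragment of generated markup and reports block-level elements that were opened while an inline element was still open. The function name, the tag lists and the sample markup are all assumptions made for this illustration; nothing like this exists in the Formatter, and it is no substitute for a real validator.

```php
<?php
// Hypothetical checker: finds block-level tags opened inside a still-open inline tag.
function find_block_in_inline($html)
{
    $inline = array('em', 'strong', 'kbd', 'tt', 'span');
    $block  = array('ul', 'ol', 'div', 'p', 'h1', 'h2', 'h3', 'h4', 'h5');
    $open = array();        // stack of currently open inline tags
    $problems = array();

    preg_match_all('#<(/?)([a-z][a-z0-9]*)[^>]*>#i', $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m)
    {
        $closing = ($m[1] === '/');
        $name = strtolower($m[2]);
        if (in_array($name, $inline))
        {
            // assumes inline tags are properly nested among themselves
            if ($closing) array_pop($open); else $open[] = $name;
        }
        elseif (!$closing && in_array($name, $block) && count($open) > 0)
        {
            $problems[] = "<$name> inside <" . end($open) . ">";
        }
    }
    return $problems;
}

// An unclosed // followed by a list, with </em> appended at the end of the page
$bad  = '<em>emphasis never closed<ul><li>item</li></ul></em>';
// The same content with the emphasis closed before the list starts
$good = '<em>emphasis</em><ul><li>item</li></ul>';

print_r(find_block_in_inline($bad));
print_r(find_block_in_inline($good));
```

The first fragment is well-formed (every tag is closed) yet still invalid, which is exactly the situation that closing tags at the end of the page cannot repair.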

Escaping single ampersands

follows

Nesting floats

follows

Ids in embedded code

follows

Heading ids

Creating ids for headings is (you guessed it) the first (and necessary) piece of the puzzle to enable generating page TOCs (see TableofcontentsAction), but other bits will be needed for that as well, such as actually gathering the references to headings (and their levels), and the ability to link to page fragments (something our current core, see WikkaCore, does not support yet). So: we cannot generate TOCs yet, but we are getting there; the code is also designed so that it can be extended to generate TOCs not just for headings, but also for things like images, tables and code blocks.

A method for generating a TOC has not been decided yet (we may even provide alternatives), but one thing we certainly need is ids for headings (see TableofcontentsAction for more background on this); and even if we do not (yet) generate a TOC, being able to link to a page fragment (the obvious next step) will be useful in itself.

Some thought went into the method of generating the ids. Ideally they should be 'recognizable', so that creating a link to a page fragment with a heading will be easy, and they should be as 'constant' as possible, so that a link to a section remains a link to that section even if it is moved to a different position on the page or another section is inserted before it. This implies that any method that simply generates a sequential id will not fulfill our requirements. We also don't want to burden the writer with coming up with ids (or even needing to think about them): they should be able to just concentrate on the content. Instead, we use the following approach:


The result is an id that is almost always derived directly from the heading content, giving a high chance that it will remain constant even if the page content is re-arranged: thus it provides a reliable target for a link.
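
The steps of the approach are not reproduced here, so the following is only a sketch of one way to obtain ids with the properties described above; it is not the Formatter's actual algorithm. The function name make_heading_id(), the underscore separator, the fallback names and the collision suffix are all assumptions.

```php
<?php
// Hypothetical sketch: derive an id from the heading text, unique within the page.
function make_heading_id($heading, &$usedIds)
{
    // drop any markup, then reduce everything except letters and digits to underscores
    $id = preg_replace('/[^A-Za-z0-9]+/', '_', strip_tags($heading));
    $id = trim($id, '_');

    if ($id === '')
    {
        $id = 'heading';            // nothing usable in the heading text
    }
    elseif (!preg_match('/^[A-Za-z]/', $id))
    {
        $id = 'h_' . $id;           // ids have to start with a letter
    }

    // only when the id is already taken do we fall back to a numeric suffix
    $base = $id;
    $n = 1;
    while (in_array($id, $usedIds))
    {
        $id = $base . '_' . (++$n);
    }
    $usedIds[] = $id;
    return $id;
}

$used = array();
echo make_heading_id('Closing open tags', $used), "\n";   // Closing_open_tags
echo make_heading_id('<em>Why?</em>', $used), "\n";       // Why
echo make_heading_id('Why?', $used), "\n";                // Why_2
```

In this sketch a heading keeps its id when it is moved or when other headings are inserted before it; only a second heading with the same text gets a suffix, which matches the "almost always derived directly from the heading content" behaviour described above.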

The Code


Here's the code (all of it). This replaces the file ./formatters/wakka.php
follows