Revision [3367]

This is an old revision of WikkaGeshiIntegration made by NilsLindenberg on 2004-12-17 17:28:09.

 

Integration of GeSHi into Wikka

The intention is to bundle GeSHi with Wikka as of version 1.1.6.0; a beta implementation can already be seen here. While it "works", I'd like an integration that is both more rigorous and more flexible than what we have now. On this page I outline how I've done the integration in my own copy of Wikka 1.1.5.3.

Goals
My integration method derives from the following goals:

Implementation steps

Implementation of the integration consists of the following steps:
  1. latest GeSHi version (>= 1.0.4)
  2. extension of the configuration file
  3. adaptation of the routine in the wakka formatter that handles a code block
  4. method in wikka.php that forms the actual interface to GeSHi
  5. adaptation of the Format() method in wikka.php
  6. some additional rules in the wikka stylesheet
  7. TODO: modifications to let installer/updater take care of adding the necessary config values

1. Latest GeSHi version
Download the latest version from http://qbnz.com/highlighter/ and use this to replace the one now in Wikka 1.1.6.0beta. (However, see also WikkaCodeStructure!)

2. Extension of the configuration file
In order to accomplish the goal of automatic recognition of syntax highlighter files, we need to let the program look in the directory where these are stored, instead of hard-coding the (now) available languages. This means we need to define the path where these files are stored in the configuration file; for even more flexibility, we follow the same approach for the built-in Wikka code highlighters. And to allow a WikiAdmin to use an already-installed package, the path to the package itself needs to be defined as well.

Add the following to /wikka.config.php:
    // formatter and code hilighting paths
    'wikka_formatter' => 'formatters',  # (location of Wikka formatter - REQUIRED)
    'wikka_lang_path' => 'formatters',  # (location of Wikka code highlighters - REQUIRED)
    'geshi_path' => 'geshi',        # (location of GeSHi package)
    'geshi_lang_path' => 'geshi/geshi', # (location of GeSHi language hilighting files)

The paths should not end in a slash.
Note that these paths are relative - and only serve as an example; it's also possible to define absolute paths, which would be required anyway if elements of Wikka or GeSHi were to be located outside the webserver's docroot.
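
As a rough illustration of how these path settings are meant to be used, the sketch below checks whether a language file exists under a configured directory. The helper name highlighter_file() is hypothetical (it is not part of Wikka or GeSHi), and the restriction on characters in language names is an extra precaution assumed here, not something the integration described on this page does.

```php
<?php
/**
 * Hypothetical helper (not part of Wikka or GeSHi): return the full path of
 * the highlighter file for $language under $dir, or null if there is none.
 */
function highlighter_file($dir, $language)
{
    // Assumption: language names consist of letters, digits, '-' and '_' only;
    // this keeps user input such as "../x" out of the file path.
    if (!preg_match('/^[A-Za-z0-9_-]+\z/', $language)) {
        return null;
    }
    // Configured paths should not end in a slash; rtrim() tolerates one anyway.
    $file = rtrim($dir, '/') . '/' . $language . '.php';
    return file_exists($file) ? $file : null;
}

// Usage with a throwaway directory standing in for 'geshi_lang_path'.
$dir = sys_get_temp_dir() . '/hl_demo_' . uniqid();
mkdir($dir);
touch($dir . '/php.php');

var_dump(highlighter_file($dir, 'php') !== null);       // language file present
var_dump(highlighter_file($dir . '/', 'php') !== null); // trailing slash tolerated
var_dump(highlighter_file($dir, 'cobol'));              // no such language file
```

Because the lookup is a plain file_exists() test, dropping a new language file into the directory is enough to make that language available.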

In order to accomplish the goal of configurability without having to hack the wikka code, the following configuration parameters should also be added to /wikka.config.php:
    // code hilighting with GeSHi
    'geshi_header' => 'div',        # 'div' (default) or 'pre' to surround code block
    'geshi_line_numbers' => '1',        # disable line numbers (0), or enable normal (1) or fancy line numbers (2)
    'geshi_tab_width' => '4',       # set tab width
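
To make the meaning of these three settings concrete, here is a minimal sketch that reads them from a configuration array and falls back to defaults when a key is missing. The function geshi_settings() is a hypothetical helper for illustration only; the fallbacks follow what this page states (div header, line numbers off, GeSHi's tab width of 8).

```php
<?php
/**
 * Hypothetical helper (not part of Wikka): normalize the three GeSHi
 * settings, using defaults for anything that is missing or unrecognized.
 */
function geshi_settings($config)
{
    // Only 'pre' overrides the default 'div' wrapper.
    $header = (isset($config['geshi_header']) && 'pre' == $config['geshi_header']) ? 'pre' : 'div';

    // 0 = disabled, 1 = normal, 2 = fancy; anything else is treated as disabled.
    $lines = isset($config['geshi_line_numbers']) ? (int) $config['geshi_line_numbers'] : 0;
    if ($lines < 0 || $lines > 2) {
        $lines = 0;
    }

    // Fall back to a tab width of 8 when no width is configured.
    $tab = isset($config['geshi_tab_width']) ? (int) $config['geshi_tab_width'] : 8;

    return array('header' => $header, 'line_numbers' => $lines, 'tab_width' => $tab);
}

print_r(geshi_settings(array()));
print_r(geshi_settings(array(
    'geshi_header'       => 'pre',
    'geshi_line_numbers' => '2',
    'geshi_tab_width'    => '4',
)));
```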


3. Wakka code block formatter
This is where we do most of the work in order to make the GeSHi implementation as flexible as possible, and accomplish our goals of being able to "drop in" new language files, both for GeSHi and Wikka, as well as allow the end user to use line numbering if enabled by the WikiAdmin.

In /formatters/wakka.php replace this (in the 1.1.5.3 version!):
        // code text
        else if (preg_match("/^\%\%(.*)\%\%$/s", $thing, $matches))
        {
            // check if a language has been specified
            $code = $matches[1];
            $language = "";
            if (preg_match("/^\((.+?)\)(.*)$/s", $code, $matches))
            {
                list(, $language, $code) = $matches;
            }
            switch ($language)
            {
            case "php":
                $formatter = "php";
                break;
            case "ini":
                $formatter = "ini";
                break;
            case "email":
                $formatter = "email";
                break;
            default:
                $formatter = "code";
            }

            $output = "<div class=\"code\">\n";
            $output .= $wakka->Format(trim($code), $formatter);
            $output .= "</div>\n";

            return $output;
        }

with this:
        // code text
        else if (preg_match("/^\%\%(.*?)\%\%$/s", $thing, $matches))    # % is not a meta character: escaping with \ is not necessary
        {
            /*
             * Note: this routine is rewritten such that (new) language formatters
             * will automatically be found, whether they are GeSHi language config files
             * or "internal" Wikka formatters.
             * Path to GeSHi language files and Wikka formatters MUST be defined in config.
             * For line numbering (GeSHi only) a starting line can be specified after the language
             * code, separated by a ; e.g., (php;27).....
             * Specifying >= 1 turns on line numbering if this is enabled  in the configuration.
             */

            $code = $matches[1];
            $geshi_hi_path = isset($wakka->config['geshi_lang_path']) ? $wakka->config['geshi_lang_path'] : '/:/';
            $wikka_hi_path = isset($wakka->config['wikka_lang_path']) ? $wakka->config['wikka_lang_path'] : '/:/';
            // check if a language (and starting line) has been specified
            $language = '';
            $start = 0;     # default: no starting line specified
            if (preg_match("/^\((.+?)(;([0-9]+))??\)(.*)$/s", $code, $matches))
            {
                list(, $language, , $start, $code) = $matches;
            }
            // get rid of  newlines at start and end (and preceding/following whitespace)
            // Note: unlike trim(), this preserves any tabs at the start of the first "real" line
            $code = preg_replace('/^\s*\n+|\n+\s*$/','',$code);
            // check if we have a GeSHi hilighter for this language
            if (isset($wakka->config['geshi_path']) && file_exists($geshi_hi_path.'/'.$language.'.php'))
            {
                // use GeSHi for hilighting
                $output = $wakka->GeSHi_Highlight($code, $language, $start);
            }
            // check if we have an internal Wikka hilighter
            elseif (file_exists($wikka_hi_path.'/'.$language.'.php') && 'wakka' != $language)
            {
                // use internal Wikka hilighter
                $output = "<div class=\"code\">\n";
                $output .= $wakka->Format($code, $language);
                $output .= "</div>\n";
            }
            // no formatter found: default code block
            else
            {
                $output = "<div class=\"code\">\n";
                $output .= $wakka->Format($code, 'code');
                $output .= "</div>\n";
            }

            return $output;
        }
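
The parsing part of this routine can be tried out in isolation. The sketch below wraps the same two regular expressions in a hypothetical helper, parse_code_spec() (the name and the returned array layout are assumptions made for this example), to show what a block such as (php;27) is split into and how surrounding blank lines are stripped.

```php
<?php
/**
 * Hypothetical helper: split "(language;start)code" with the same regular
 * expressions as the formatter. Returns array(language, start, code).
 */
function parse_code_spec($code)
{
    $language = '';
    $start = 0;     // 0 means "no starting line given"
    if (preg_match('/^\((.+?)(;([0-9]+))??\)(.*)$/s', $code, $matches)) {
        $language = $matches[1];
        $start    = (int) $matches[3];  // empty string when omitted, cast to 0
        $code     = $matches[4];
    }
    // Strip blank lines at both ends, but keep tabs that indent the first real line.
    $code = preg_replace('/^\s*\n+|\n+\s*$/', '', $code);
    return array($language, $start, $code);
}

print_r(parse_code_spec("(php;27)\n\techo 1;\n"));  // language and starting line
print_r(parse_code_spec("(php)echo 1;"));           // language only
print_r(parse_code_spec("plain text"));             // no language specified
```

Note that the first call keeps the leading tab of the code line, which trim() would have removed.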


4. Wikka method to interface with GeSHi
As can be seen in the code above, we use a GeSHi_Highlight() method in order to let GeSHi do the actual highlighting work.

Insert the following method into /wikka.php (in the //MISC section):
    /**
     * Highlight a code block with GeSHi.
     *
     * The path to GeSHi and the GeSHi language files must be defined in the configuration.
     *
     * This implementation fits in with general Wikka behavior; e.g., we use classes and an external
     * stylesheet to render hilighting.
     *
     * Apart from this fixed general behavior, WikiAdmin can configure a few behaviors via the
     * configuration file:
     * geshi_header         - wrap code in div (default) or pre
     * geshi_line_numbers   - disable line numbering, or enable normal or fancy line numbering
     * geshi_tab_width      - override tab width (default is 8 but 4 is more commonly used in code)
     *
     * Limitation: while line numbering is supported, extra GeSHi styling for line numbers is not.
     * When line numbering is enabled, the end user can "turn it on" by specifying a starting line
     * number together with the language code in a code block, e.g., (php;260); this number is then
     * passed as the $start parameter for this method.
     *
     * @access  public
     * @uses    wakka::config
     * @uses    GeSHi
     * @todo    - support for GeSHi line number styles
     *      - enable error handling
     *
     * @param   string  $sourcecode required: source code to be highlighted
     * @param   string  $language   required: language spec to select highlighter
     * @param   integer $start      optional: start line number; if supplied and >= 1 line numbering
     *          will be turned on if it is enabled in the configuration.
     * @return  string  code block with syntax highlighting classes applied
     */

    function GeSHi_Highlight($sourcecode, $language, $start=0)
    {
        // create GeSHi object
        include_once($this->config['geshi_path'].'/geshi.php');
        $geshi =& new GeSHi($sourcecode, $language, $this->config['geshi_lang_path']);              # create object by reference

        $geshi->enable_classes();                               # use classes for hilighting (must be first after creating object)
        $geshi->set_overall_class('code');                      # enables using a single stylesheet for multiple code fragments

        // configure user-defined behavior
        $geshi->set_header_type(GESHI_HEADER_DIV);              # set default
        if (isset($this->config['geshi_header']))               # config override
        {
            if ('pre' == $this->config['geshi_header'])
            {
                $geshi->set_header_type(GESHI_HEADER_PRE);
            }
        }
        $geshi->enable_line_numbers(GESHI_NO_LINE_NUMBERS);     # set default
        if ($start > 0)                                         # line number > 0 _enables_ numbering
        {
            if (isset($this->config['geshi_line_numbers']))     # effect only if enabled in configuration
            {
                if ('1' == $this->config['geshi_line_numbers'])
                {
                    $geshi->enable_line_numbers(GESHI_NORMAL_LINE_NUMBERS);
                }
                elseif ('2' == $this->config['geshi_line_numbers'])
                {
                    $geshi->enable_line_numbers(GESHI_FANCY_LINE_NUMBERS);
                }
                if ($start > 1)
                {
                    $geshi->start_line_numbers_at($start);
                }
            }
        }
        if (isset($this->config['geshi_tab_width']))            # GeSHi override (default is 8)
        {
            $geshi->set_tab_width($this->config['geshi_tab_width']);
        }

        // parse and return highlighted code
        return $geshi->parse_code();
    }

Note how we use the configuration parameters here to determine GeSHi's behavior, and have also enabled the end user to turn on line numbering and set a starting line number. That's a few more of our goals accomplished.
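
The interplay between the end user's starting line and the WikiAdmin's configuration can be summarized in a small decision function. The following is only a sketch: line_number_mode() and its string return values are hypothetical stand-ins for the GeSHi constants used in the method above, so that it runs without GeSHi installed.

```php
<?php
/**
 * Hypothetical helper mirroring the line numbering decision in
 * GeSHi_Highlight(): returns 'none', 'normal' or 'fancy'.
 */
function line_number_mode($config, $start = 0)
{
    // Numbering is requested by the end user with a starting line >= 1 ...
    if ($start < 1 || !isset($config['geshi_line_numbers'])) {
        return 'none';
    }
    // ... and takes effect only if the WikiAdmin has enabled it.
    if ('1' == $config['geshi_line_numbers']) {
        return 'normal';
    }
    if ('2' == $config['geshi_line_numbers']) {
        return 'fancy';
    }
    return 'none';
}

var_dump(line_number_mode(array('geshi_line_numbers' => '1'), 27)); // enabled and requested
var_dump(line_number_mode(array('geshi_line_numbers' => '0'), 27)); // disabled by the WikiAdmin
var_dump(line_number_mode(array('geshi_line_numbers' => '2'), 0));  // not requested by the user
```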

5. Adaptation of the Format() method
Now that we have defined a path to the formatter (as well as the built-in language highlighter files), a small adaptation to the Format() method in /wikka.php is in order. (Though strictly speaking not required, it will enhance consistency.)

Replace this:
    function Format($text, $formatter = "wakka") { return $this->IncludeBuffered("formatters/".$formatter.".php", "<em>Formatter \"$formatter\" not found</em>", compact("text")); }

with this:
    function Format($text, $formatter="wakka") { return $this->IncludeBuffered($formatter.".php", "<em>Formatter \"$formatter\" not found</em>", compact("text"), $this->config['wikka_formatter']); }
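
To show what passing the directory separately amounts to, here is a simplified stand-in for a buffered include that takes the path as its own argument. It is not Wikka's actual IncludeBuffered() method; sketch_include_buffered(), the demo directory and the shout.php formatter are all made up for this example.

```php
<?php
/**
 * Simplified, hypothetical stand-in for a buffered include: run
 * $path/$filename with $vars in scope and return its output as a string,
 * or return $notfound when the file does not exist.
 */
function sketch_include_buffered($filename, $notfound, $vars, $path)
{
    $file = $path . '/' . $filename;
    if (!file_exists($file)) {
        return $notfound;
    }
    extract($vars, EXTR_SKIP);  // do not overwrite this function's own variables
    ob_start();                 // capture everything the included file prints
    include $file;
    return ob_get_clean();
}

// Demo: a throwaway "formatter" that upper-cases the text it is given.
$dir = sys_get_temp_dir() . '/fmt_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/shout.php', '<?php echo strtoupper($text);');

$text = 'hello';
echo sketch_include_buffered('shout.php', '<em>not found</em>', compact('text'), $dir), "\n";
echo sketch_include_buffered('missing.php', '<em>not found</em>', compact('text'), $dir), "\n";
```

With the directory coming from a parameter rather than a hard-coded "formatters/" prefix, moving the formatters elsewhere only requires a configuration change.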


6. Some additional rules in the stylesheet
The addition of line numbering, as well as the different ways to format a block of code we have now, require a few tweaks to the stylesheet so it's all rendered properly and consistently. Here they are:

The .code section is now extended as follows:
.code {
    color: black;
    background: #ffffee;
    border: 1px solid #888;
    font: 11px "Lucida Console", Monaco, monospace;
    width: 95%;
    margin: auto;
    padding: 3px;
    text-align: left;       /* override justify on body */
    overflow: auto;         /* allow scroll bar in case of long lines - goes together with white-space: nowrap! */
    white-space: nowrap;    /* prevent line wrapping */
}
.code pre {
    margin-top: 1em;
    margin-bottom: 1em;     /* prevent vertical scroll bar in case of overflow */
    font: 11px "Lucida Console", Monaco, monospace;
}

Note that I've added Lucida Console as a font: it's a very clear and readable font for code, so it will be used if available on the user's system. We also set some properties for <pre> within a code block (as generated by the internal formatters), so that the rendering of those code blocks is consistent with those generated by GeSHi.

For the GeSHi code rendering we then have:
/* syntax highlighting code - geshi */
.code ol {
    margin-top: 1em;
    margin-bottom: 1em;     /* prevent vertical scroll bar in case of overflow */
}
.code li {
    font: 11px "Lucida Console","Bitstream Vera Sans Mono","Courier New", monospace;
}
.code .br0  { color: #66cc66; }
.code .co1  { color: #808080; font-style: italic; }
.code .co2  { color: #808080; font-style: italic; }
.code .coMULTI  { color: #808080; font-style: italic; }
.code .es0  { color: #000099; font-weight: bold; }
.code .kw1  { color: #b1b100; }
.code .kw2  { color: #000000; font-weight: bold; }
.code .kw3  { color: #000066; }
.code .kw4  { color: #993333; }
.code .kw5  { color: #0000ff; }
.code .me0  { color: #006600; }
.code .nu0  { color: #cc66cc; }
.code .re0  { color: #0000ff; }
.code .re1  { color: #0000ff; }
.code .re2  { color: #0000ff; }
.code .re4  { color: #009999; }
.code .sc0  { color: #00bbdd; }
.code .sc1  { color: #ddbb00; }
.code .sc2  { color: #009900; }
.code .st0  { color: #ff0000; }

Most of this is just colors for various key code elements, but note the two selectors at the start: these are necessary to handle code rendering properly when line numbering is turned on.

Test!

Please test these modifications. Comments are of course welcome.

If your testing doesn't turn up any bugs (let me know!) I'd like to see this implementation added to the upcoming 1.1.6.0 release.

PHP allows (and encourages) single quotes around strings; in fact, that is more efficient since PHP won't try to interpret such strings. It works just fine on my local test server (Win2K/Apache), and I habitually use single quotes around strings on my server running FreeBSD/Apache (precisely because it's more efficient). I can't imagine why single quotes would not work. What platform (OS/webserver) are you testing on?
Did you also copy the style sheet portions? What browser are you using? (I tested with Moz 1.7 and IE6 on Win2K.)
I will look tomorrow; if it runs on your system, I've surely forgotten to copy a bit of text.
Yup, it's all running as shown on my test system (Win2K/Apache) -- JW
OK, it does work now, after I copied everything again. Only the problem of single, long lines of code remains. Thanks to Jason for giving me a place to test, you can see it here.

Final note
When copying code, please copy from the source version of this page so you get the proper formatting with tabs - copying from the (GeSHi) rendering leads to very sloppy code layout...


CategoryDevelopment