Acronym (or Abbreviation) Formatter
This is the development page for the Acronym (or Abbreviation) Formatter.
This modification allows Wikka to automatically parse known acronyms and render them as <acronym> elements with titles, for example:
CSS - FAQ - HTML
The list of acronyms can be set by the WikiAdmin in a configuration file: each time an acronym is found in the page source matching one of the entries of this file, it is automatically rendered with the appropriate markup and expanded description.
Features
Current version: 0.3 (improved regex pattern)
- customizable acronym definition file;
- formatter can be disabled from config file;
- configurable REGEX pattern;
- configurable output format (abbr or acronym);
To do
- important: fix conflicts with links, WikiNames and other elements containing sequences of uppercase letters that should not be rendered as acronyms;
- improve REGEX pattern;
- support CSS classes for different kinds of acronyms;
The code
Here's the list of files that you will have to create or modify (back up the original files before making any modification).
1. Modify ./formatters/wakka.php
original:
- // we're cutting the last <br />
- echo ($text);
- wakka2callback('closetags');
modified:
- // we're cutting the last <br />
- //render acronyms
- $text = $this->RenderAcronyms($text);
- echo ($text);
- wakka2callback('closetags');
2. Modify wikka.php
Add the following function in the engine, for instance immediately before the VARIABLES section:
original:
- // VARIABLES
modified:
- /**
- * Look up and return acronym definition from a configuration file.
- *
- * @author {@link http://wikka.jsnx.com/DarTar DarioTaraborelli}
- * @version 0.3
- *
- * @access public
- * @uses GetConfigValue()
- *
- * @param string $text source sent from the formatter
- * @return string $text source with known acronyms formatted as HTML elements
- */
- function RenderAcronyms($text){
- if (($this->GetConfigValue('enable_acronyms') == 1) && file_exists($this->GetConfigValue('acronym_table'))) {
- // define constants
- define('ACRONYM_PATTERN', '/\b([A-Z]{2,})\b/'); #matches sequences of 2 or more capital letters within word boundaries
- define('FORMATTED_ACRONYM', '<acronym title="%s">%s</acronym>'); #assumed output format (its definition is missing from this listing): first %s is the description, second %s the acronym
- // get acronym definitions
- global $wikka_acronyms;
- include($this->GetConfigValue('acronym_table'));
- // replace known acronyms with HTML elements
- $text = preg_replace_callback(
- ACRONYM_PATTERN,
- create_function(
- '$matches',
- 'global $wikka_acronyms; return (is_array($wikka_acronyms) && array_key_exists($matches[0], $wikka_acronyms))? sprintf(FORMATTED_ACRONYM, $wikka_acronyms[$matches[0]], $matches[0]) : $matches[0];'
- ),
- $text);
- }
- return $text;
- }
- // VARIABLES
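The callback in the listing above is passed as strings of PHP code, in the style of create_function(), which was deprecated in PHP 7.2 and removed in PHP 8.0. As a rough sketch only, the same lookup can be written with an anonymous function. The function name render_acronyms_sketch() and its parameters are made up for this illustration and are not part of Wikka; the default output format is also an assumption.

```php
<?php
// Hypothetical standalone variant of the lookup; not part of the Wikka engine.
function render_acronyms_sketch($text, array $acronyms, $format = '<acronym title="%s">%s</acronym>')
{
    // Same pattern as in the listing: 2 or more capital letters within word boundaries.
    $pattern = '/\b([A-Z]{2,})\b/';

    return preg_replace_callback(
        $pattern,
        function ($matches) use ($acronyms, $format) {
            $key = $matches[0];
            if (!array_key_exists($key, $acronyms)) {
                return $key; // unknown sequence: leave it untouched
            }
            // Escape the description so quotes cannot break the title attribute.
            return sprintf($format, htmlspecialchars($acronyms[$key], ENT_QUOTES), $key);
        },
        $text
    );
}

echo render_acronyms_sketch('Read the FAQ about XYZ.', array('FAQ' => 'Frequently Asked Questions'));
// prints: Read the <acronym title="Frequently Asked Questions">FAQ</acronym> about XYZ.
```

Passing the table and the format as arguments avoids the global variable and the constants, so the function can be called more than once per request without redefinition notices.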
3. Modify wikka.config.php
Add the following values to the configuration file:
"enable_acronyms" => "1",
"acronym_table" => "acronyms.php",
4. Create the acronym configuration file (acronyms.php)
Save the following code as acronyms.php in the root folder of your Wikka installation. You can obviously add as many acronym definitions as you like:
<?php
$wikka_acronyms = array(
"ACL" => "Access Control List",
"API" => "Application Program(ming) Interface",
"CSS" => "Cascading Style Sheets",
"CVS" => "Concurrent Versions System",
"DHTML" => "Dynamic HyperText Markup Language",
"DOM" => "Document Object Model",
"DTD" => "Document Type Definition",
"FAQ" => "Frequently Asked Questions",
"FF" => "Firefox",
"GIF" => "Graphics Interchange Format",
"GPL" => "GNU General Public License",
"GUI" => "Graphical User Interface",
"HTML" => "HyperText Markup Language",
"HTTP" => "HyperText Transfer Protocol",
"IE" => "Internet Explorer",
"PHP" => "PHP: Hypertext Preprocessor",
"RSS" => "Rich Site Summary", # or Really Simple Syndication or RDF Site Summary...
"SQL" => "Structured Query Language",
"TOC" => "Table of Contents",
);
?>
5. Add some style
Some browsers (Mozilla/FF) automatically highlight acronym elements in the page. To make acronyms visible also in other browsers, paste the following in your stylesheet (default: ./css/wikka.css):
acronym {
border-bottom: 1px dotted #333;
cursor: help /*modifies the mouse pointer as a question mark*/
}
CategoryDevelopmentFormatters, CategoryUserContributions
1. One problem with current browsers (related to a logical problem): strictly speaking, abbreviations (abbr) and acronyms (acronym) aren't the same thing - that's why they have separate HTML elements. Although different (human) languages also differ (slightly) in what they call acronym and what abbreviation - in general an acronym is a particular *type* of abbreviation. For instance, REGEX is an abbreviation, but not an acronym.
The problem here is that _some_ browsers support only the <acronym> element but not the <abbr> element. Structurally, you'd want to be able to use *both* (not either/or). I think this choice should be up to the wiki's admin.
It would probably not be too much work to extend the code to make use of two tables, one for abbreviations and one for acronyms (if something occurs in both, acronym would - logically - take precedence): create the element according to which array an abbreviation is found in. (Two entries needed in the config, of course.)
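A rough sketch of that two-table idea, assuming PHP 5.3 or later for the anonymous function. The function name render_abbr_and_acronyms() and both array parameters are hypothetical; in a real patch they would come from the two config entries mentioned above.

```php
<?php
// Hypothetical helper: $abbreviations and $acronyms stand in for the two definition tables.
function render_abbr_and_acronyms($text, array $abbreviations, array $acronyms)
{
    return preg_replace_callback(
        '/\b([A-Z]{2,})\b/',
        function ($matches) use ($abbreviations, $acronyms) {
            $key = $matches[0];
            // The acronym table is checked first, so it takes precedence.
            if (array_key_exists($key, $acronyms)) {
                return sprintf('<acronym title="%s">%s</acronym>',
                    htmlspecialchars($acronyms[$key], ENT_QUOTES), $key);
            }
            if (array_key_exists($key, $abbreviations)) {
                return sprintf('<abbr title="%s">%s</abbr>',
                    htmlspecialchars($abbreviations[$key], ENT_QUOTES), $key);
            }
            return $key; // found in neither table: leave it untouched
        },
        $text
    );
}

echo render_abbr_and_acronyms(
    'REGEX and FAQ',
    array('REGEX' => 'Regular Expression'),
    array('FAQ' => 'Frequently Asked Questions')
);
```

The element is chosen purely by which array contains the key, so moving an entry from one table to the other is all an admin needs to do to change its markup.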
2. Looking at the code, it looks as though the definition file(s) does not need to be in the root: include() takes a filename (current directory -or- somewhere in the PHP include path) or a *path* which can be relative (to the current script) or an absolute path on the server's file system. That's nice and flexible - but should be documented. ;-)
3. A possible problem with the format of the definition file (PHP array) is the same as we have for our current configuration: some users don't know PHP syntax enough to be able to edit (let alone create!) such a file.
A possible solution would be a simple INI-like (keyword = expansion) file (or two) which gets "cached" into an array file; you'd then have to re-create the array only when the INI file has changed (compare timestamps), if not, just include it.
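As a sketch of how such a cache could work, assuming the INI file holds one "KEY = expansion" pair per line: the function name load_acronym_table() and both file arguments are invented for this example. Note that parse_ini_file() is picky: expansions containing characters such as parentheses should be quoted, and a few words (yes, no, true, false, on, off, null, none) cannot be used as keys.

```php
<?php
// Hypothetical helper; the function name and both file paths are assumptions.
function load_acronym_table($iniFile, $cacheFile)
{
    clearstatcache(); // make sure the timestamps below are fresh

    // Rebuild the cache when it is missing or older than the INI source.
    if (!file_exists($cacheFile) || filemtime($cacheFile) < filemtime($iniFile)) {
        $table = parse_ini_file($iniFile);
        if ($table === false) {
            return array(); // unreadable or malformed INI file
        }
        // Write the table as a PHP file that returns the array when included.
        $code = "<?php\nreturn " . var_export($table, true) . ";\n";
        file_put_contents($cacheFile, $code, LOCK_EX);
        return $table;
    }

    // Cache is up to date: just include it.
    return include $cacheFile;
}

// Possible usage (file names are placeholders):
// $wikka_acronyms = load_acronym_table('acronyms.ini', 'acronyms.cache.php');
```

The timestamp comparison has one-second resolution, so an INI file edited within the same second as the cache was written would not trigger a rebuild.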
All in all: a great step forward, but we could have some refinements. ;-)
There is still a major issue to be fixed: how to prevent sequences of uppercase characters from being parsed in the *wrong* context. I've seen bugs of this formatter in the case of links or in the skin editor. Something similar to what happened in the FetchRemote action :)
Code blocks should also be excluded. Maybe some other things as well...
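One possible way to approach this, sketched under the assumption that the acronym pass runs on the formatter's HTML output: split the output into tags and text, never touch the tags themselves, and skip text that sits inside certain elements. The function name and the list of protected elements are assumptions for illustration, and the split is a simple regex, not a full HTML parser.

```php
<?php
// Hypothetical helper; the list of protected elements is only a guess.
function render_acronyms_outside_markup($html, array $acronyms)
{
    $protected = array('a', 'code', 'pre', 'textarea', 'acronym', 'abbr');

    // Split into text and tags; with one capture group, odd indexes hold the tags.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    $depth = 0; // > 0 while inside a protected element
    $out = '';

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // A tag: copy it unchanged and track protected open/close tags.
            if (preg_match('/^<(\/?)([a-zA-Z][a-zA-Z0-9]*)/', $part, $m)
                && in_array(strtolower($m[2]), $protected)) {
                $depth = max(0, $depth + (($m[1] === '/') ? -1 : 1));
            }
            $out .= $part;
            continue;
        }
        if ($depth > 0) {
            $out .= $part; // text inside a link, code block, etc.
            continue;
        }
        $out .= preg_replace_callback(
            '/\b([A-Z]{2,})\b/',
            function ($matches) use ($acronyms) {
                $key = $matches[0];
                return array_key_exists($key, $acronyms)
                    ? sprintf('<acronym title="%s">%s</acronym>',
                        htmlspecialchars($acronyms[$key], ENT_QUOTES), $key)
                    : $key;
            },
            $part
        );
    }
    return $out;
}

echo render_acronyms_outside_markup(
    '<a href="/FAQ">FAQ</a> and FAQ in <code>FAQ</code>',
    array('FAQ' => 'Frequently Asked Questions')
);
```

Because attribute values live inside the tag chunks, uppercase sequences in URLs are left alone as well. Form fields such as the skin editor would likely need their elements added to the protected list.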
I'd very much like to have a feature like this (like I said, good for accessibility) but I'm afraid it needs to "ripen" a little. :)
The disadvantage is obviously that it will depend on individual page editors to remember to add the "lookup" syntax - but it avoids mis-matches.
So if in the list we have "UML" = "Unified Modeling Language" one could use (say) (?UML User Mode Linux?) to locally override that.
Using only parsing and a translation list, such homonyms cannot be handled.
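To make the idea concrete, here is a minimal sketch of parsing the (?UML User Mode Linux?) notation suggested above. The notation itself is only a proposal in this discussion, and the function name render_inline_acronyms() is invented for the example. How this pass should be ordered relative to the table lookup is left open.

```php
<?php
// Hypothetical parser for the proposed "(?ACRONYM expansion?)" notation.
function render_inline_acronyms($text)
{
    return preg_replace_callback(
        // "(?", the acronym, whitespace, the shortest expansion up to "?)"
        '/\(\?([A-Z]{2,})\s+(.+?)\?\)/',
        function ($m) {
            return sprintf('<acronym title="%s">%s</acronym>',
                htmlspecialchars($m[2], ENT_QUOTES), $m[1]);
        },
        $text
    );
}

echo render_inline_acronyms('We run (?UML User Mode Linux?) here.');
// prints: We run <acronym title="User Mode Linux">UML</acronym> here.
```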