Revision history for WikkaEmailToolkit
Revision [18445]
Last edited on 2008-01-28 00:11:37 by JavaWoman [Modified links pointing to docs server]
=====Wikka Email Toolkit=====
Note: this is still incomplete but should already provide some solutions for the issues I've outlined on WikkaAndEmail. I'm giving code, instructions for how to implement in Wikka, and a peek at what's to follow.
All released under the [[http://www.gnu.org/copyleft/lesser.html LGPL]]. You could use this with minor changes in other web projects that need email functionality as well.
I'm documenting everything now in the [[http://www.phpdoc.org/ phpDocumentor]] format; please keep the documentation *with* the code. The documentation is readable (I think) even if it's not processed by phpDocumentor; and contains important information about what is supported (and why) and what is explicitly **not** supported, as well as how to use the functions. And, of course, it contains copyright and license information. :)
====Toolkit implementation - step 1====
[**NOTE**: the syntax highlighting below is nice, but makes a mess of the tabs used for formatting; try copying the code from the **source** of this page instead of from the rendered version: the tabs are still there and will make the code more readable!]
===Patterns===
Define patterns: the idea is to define a pattern only **once** so it can be used consistently in different places.
At the start of wikka.php - add the following (including documentation blocks and without <?php and ?>) right after the WAKKA_VERSION define:
**pattern defines**
%%(php)<?php
// Pattern defines (start every define in this block with 'PATTERN_' and attach a bit to indicate what it is a pattern for)
/**#@+
* Defines a pattern as a constant so it is available and consistent throughout the application
*/
define('PATTERN_NL',"/(\r?\n)|\r/"); # newline
define('PATTERN_INT','/^[0-9]+$/'); # integer defined as string
/**#@-*/
?>%%
(We'll add more here later.)
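To get a feel for what the two patterns match, here is a minimal sketch that runs on its own. The sample strings are made up for illustration, and the defines are repeated only to keep the snippet self-contained:
%%(php)<?php
define('PATTERN_NL',"/(\r?\n)|\r/");
define('PATTERN_INT','/^[0-9]+$/');

// PATTERN_NL treats CRLF as a single newline and also matches a lone LF or a lone CR
$samples = array("a\r\nb", "a\nb", "a\rb");
foreach ($samples as $sample)
{
	echo preg_replace(PATTERN_NL,'|',$sample)."\n";	// prints a|b for each sample
}

// PATTERN_INT accepts only strings that consist entirely of digits
var_dump(preg_match(PATTERN_INT,'42') === 1);	// bool(true)
var_dump(preg_match(PATTERN_INT,'4.2') === 1);	// bool(false)
var_dump(preg_match(PATTERN_INT,'-1') === 1);	// bool(false)
?>%%
Putting ##\r?\n## first in the alternation is what keeps a CRLF pair from being counted as two newlines.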
===Create an EMAIL section===
Create an "email section" in the ##Wakka## class by adding this right before the %% // VARIABLES%% line:
**email section**
%%
//EMAIL
%%
===Functions===
The toolkit currently consists of the pattern defines (above) and three functions which make use of them. The reasons for the functions and their usage are covered in their documentation blocks.
Copy the following three functions (including documentation blocks and without <?php and ?>) into the (new) EMAIL section in the ##Wakka## class:
**""NoCrlf()"" method**
%%(php)<?php
/**
* Replace CR and/or LF by space in user input to prevent CRLF injection in PHP email forms.
*
* Email forms (actions) that allow a user to enter a To: or From: email address and/or
* a name and/or a subject --in general fields to be used in constructing an email header--
* may be susceptible to CRLF injection which would allow an attacker to send arbitrary email
* to arbitrary addressees.
* Simply replacing any form of "newline" in such input by a space makes such an attempt
* futile.<br>
* Function inspired by article {@link http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities}
* but implemented differently.
*
* Usage:
* <ul>
* <li> Copy this whole file to the EMAIL section of the Wakka class</li>
* <li> Apply to every user-supplied value for email address, name or subject (anything that is,
* or CAN BE used in an email header!) to guard against this</li>
* </ul>
* Use as follows:
* <code>
* // get input
* // ....
* $input = trim($this->NoCrlf($input));
* </code>
* or directly as:
* <code>
* $email = trim($this->NoCrlf($_POST['email']));
* // get other variables from a submitted form
* // ...
* </code>
* Note that {@link trim()} is applied <i>after</i> applying this function to get rid of
* whitespace at start and end of the resulting string.
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses PATTERN_NL to recognize any type of "newline"
*
* @param string $string Required.
* User input to be sanitized
* @return string sanitized input
*/
function NoCrlf($string)
{
return preg_replace(PATTERN_NL,' ',$string);
}
?>%%
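As an illustration of what this guards against, here is a small stand-alone sketch. ##NoCrlf_demo()## is just a copy of the method body so it can run outside the ##Wakka## class, and the "subject" value is a made-up attack string:
%%(php)<?php
define('PATTERN_NL',"/(\r?\n)|\r/");

// stand-alone copy of the method body, for demonstration outside the Wakka class
function NoCrlf_demo($string)
{
	return preg_replace(PATTERN_NL,' ',$string);
}

// a made-up malicious "subject" that tries to smuggle in an extra Bcc: header
$subject = "Hello\r\nBcc: victim@example.com\r\n";
$clean = trim(NoCrlf_demo($subject));

echo $clean;	// Hello Bcc: victim@example.com
?>%%
With the newlines replaced by spaces, the would-be header ends up as harmless text inside the subject line instead of starting a header line of its own.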
**RE_AddrSpec() method**
%%(php)<?php
/**
* Builds an RE that can be used to validate an email address or to recognize something that
* "looks like" an email address.
*
* This function builds a regular expression to enable validation of a string as a valid email
* address or to recognize something that "looks like" an email address, based on applicable
* Internet standards (notably RFC 2822 --officially a "Proposed standard", replacing RFC 822--
* and RFC 1035). The regular expression returned is Perl-compatible for use in PHP's preg_...
* functions but does NOT include delimiters; this is to allow the RE to be used as part of a
* larger RE which could be used to match a string of which an actual email address is only
* part.
*
* This function is designed such that:
* <ul>
* <li> an email address that matches an RE generated by this function is guaranteed to be
* conforming to the format standard(s) specified to the function (using RFC 2822 rules and
* (if specified) the RFC 1035 "Preferred format" for the domain part);</li>
* <li> an email address that is found to NOT match the RE generated by this function <i>may</i>
* still be conforming to the format standard(s) specified.</li>
* </ul>
*
* Error reporting:<br>
* This design implies that a user-supplied email address that matches a generated RE SHOULD be
* silently accepted.<br>
* Conversely, when a user-supplied email address does not match, this SHOULD NOT result in an
* error message suggesting the address is "invalid" (it may not be); any error message SHOULD
* only indicate that the address format in question is "not supported" by the application
* using this function.
*
* Standards compliance:<br>
* The RE is built using building blocks based on the production rules as specified in:
* <ul>
* <li> RFC 2822 section 3.4.1 for the address format: 'addr-spec = local-part "@" domain'</li>
* <li> RFC 2822 section 3.2.4 for the 'local-part': using 'atext', 'atom' and 'dot-atom' (using
* a subset of the full production ruleset)</li>
* <li> RFC 2822 section 3.4.1 (dot-atom) <b>or</b> RFC 1035 section 3.5 for the 'domain'
* part</li>
* </ul>
*
* Note that the domain syntax as specified in RFC 1035 section 3.5 is merely a
* "Preferred format"; we use it here because this is the generally accepted (and widely
* enforced) format.
*
* By means of interfacing with external configuration and a possible override with the
* $email_format parameter, considerable flexibility in selecting an applicable format is
* provided while still returning a standards-compliant email address pattern.
*
* Explicitly NOT SUPPORTED are:
* <ul>
* <li> whitespace and comments ([CFWS]) in an email address (though allowed by RFC 2822 section
* 3.2.4)</li>
* <li> "quoted string" (strings of characters not allowed in the 'atom' production rule in
* section 3.2.4 RFC 2822) - these are considered "obsolete" in RFC 2822 although allowed
* </li>
* <li> domain literals instead of domain name; e.g., [10.0.0.67] (though allowed by RFC 2822
* section 3.4.1)</li>
* <li> "internationalized" domain names (see RFC 3490 and related RFCs: these are still very
* much proposals, not a standard yet)</li>
* <li> any check that a 'local part' is no longer than 64 characters (? mentioned in RFC 3696;
* no other reference found)</li>
* <li> any check that a domain name is no more than 255 bytes long (RFC 1035 section
* 2.3.4)</li>
* </ul>
*
* Behavior:
* <ul>
* <li> If no format is specified, the function delivers the default format</li>
* <li> If a valid format (0-5) is specified in the configuration variable 'email_format', this
* is used but:</li>
* <li> If a valid format (0-5) is specified in the $email_format parameter, this is used,
* overriding anything specified in the configuration; 0 specifies "default format" so it
* can override whatever is specified in the configuration</li>
* </ul>
*
* Formats supported:
* <ul>
* <li> ALL - local-part ('mailbox name'): RFC 2822 compliant but without support for
* whitespace, comments or "quoted string";</li>
* <li> 0-4 - local-part MUST be followed by a '@' to separate it from the domain part</li>
* <li> 0 [default] - domain: RFC 1034/1035 compliant 'domain' consisting of at least two labels
* results in the most "generally acceptable" format for an Internet email
* address</li>
* <li> 1 - domain: RFC 1034/1035 compliant but consisting of one or more labels
* allows relative domain (such as using single server name) while still being RFC
* 1035 compliant if a domain is attached</li>
* <li> 2 - domain: RFC 2822 compliant but consisting of at least two labels</li>
* <li> 3 - domain: RFC 2822 compliant</li>
* <li> 4 - domain: RFC 1035 compliant but allowing only a single level (an internal server
* name); use 1 if multiple levels are needed</li>
* <li> 5 - domain: NOT allowed (only 'user name', no '@' or domain accepted)</li>
* </ul>
*
* Formats 2-5 are specifically intended for Intranet use while 1 may be used for Intranets
* using relative domains (server names) that still need to result in an RFC 1035 compliant
* domain when a domain is appended for external use.
* To see what it's producing, add the following line just before the result is returned:
* <code>
* echo "resulting RE:<br/>$re<br/><br/>";
* </code>
*
* Usage:<br>
* The function deliberately does not include delimiters in its output to enable it to be used
* as a building block for a larger RE. However, it takes care that / is escaped enabling / to
* be used as delimiter. This results in the following usage patterns:
* <ul>
* <li> to use as building block:</li>
* </ul>
* <code>
* $this->RE_AddrSpec() // (optionally provide parameter)
* </code>
* <ul>
* <li> to use as pattern in any of the preg_... functions, add the / delimiters (and optionally
* 'start' and 'end' delimiters) first:</li>
* </ul>
* <code>
* $pattern = '/^'.$this->RE_AddrSpec().'$/';
* $is_match = preg_match($pattern,$email);
* </code>
* It is NOT necessary to add the i modifier to the pattern since the RE itself already
* takes care of case-insensitivity as per the standards used.
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses PATTERN_INT to validate a format specification value as "integer"
* @uses Wakka::$config['email_format'] to get specified email format;
* same rules apply as for parameter $email_format
*
* @param integer $email_format Optional.
* If specified must be 0-5; specifies which format is to be used
* (NULL (default) and integer string allowed); overrides optional
* Wakka::$config['email_format'].
* @return string RE to be used for validation or as building block for a larger RE
*/
function RE_AddrSpec($email_format=NULL)
{
// Which format do we want to validate against? We filter out invalid parameter and config values and then allow parameter to override a config value
// ignore invalid parameter (but allow integer value specified as string)
if (preg_match(PATTERN_INT,$email_format)) $email_format = (int)$email_format;
if (!is_int($email_format) || $email_format > 5 || $email_format < 0) $email_format = NULL;
// ignore invalid config value (but allow integer value specified as string)
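$cfg_email_format = isset($this->config['email_format']) ? $this->config['email_format'] : NULL; # the setting is optional: avoid a notice when it is absent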
if (preg_match(PATTERN_INT,$cfg_email_format)) $cfg_email_format = (int)$cfg_email_format;
if (!is_int($cfg_email_format) || $cfg_email_format > 5 || $cfg_email_format < 0) $cfg_email_format = NULL;
// pick up config value if parameter not specified (or invalid)
if (!isset($email_format)) $email_format = $cfg_email_format;
// RFC 2822: Email
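$atextchars = "!#$%&'*+/=?^_`{|}~"; # special characters allowed in 'atext' of an 'atom' (RFC 2822), apart from the hyphen
// letter and digit ranges stay out of preg_quote() (PHP 5.3+ escapes '-', which would break them); the hyphen goes last as a literal so it cannot form a range
$atom = 'A-Za-z0-9'.preg_quote($atextchars,'/').'\-'; # escape RE special chars; matches 'atom' but excludes allowed whitespace and comments ([CFWS])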
$dot_atom = '['.$atom.']+(\.['.$atom.']+)*'; # dot-atom as allowed for local part of an email address
$local_part_rfc2822 = $dot_atom; # dot-atom for local part; no [CFWS]
$domain_rfc2822 = $dot_atom; # domain part as allowed per RFC 2822 but excluding domain literals and [CFWS]
$domain_dot_atom = '['.$atom.']+(\.['.$atom.']+)+'; # dot-atom domain part but requiring at least two levels and excluding domain literals and [CFWS]
// RFC 1035: Domains (Preferred format)
$domain_labelchars = "A-Za-z0-9-"; # all characters allowed in a "label": letters, digits and a hyphen (no escaping needed here)
$domain_labelstart = "A-Za-z"; # label must start with a letter
$domain_labelend = "A-Za-z0-9"; # label cannot end in hyphen
$domain_label_rfc1035 = '['.$domain_labelstart.'](['.$domain_labelchars.']{0,61}['.$domain_labelend.'])?';
# conforms to RFC 1035; max 63 characters in a label
$domain_rfc1035 = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*'; # string of one or more dot-separated labels
$domain_rfc1035_abs = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*\.?'; # explicitly allows terminating dot to specify absolute domain
$domain_rfc1035_multi = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')+\.?'; # as $domain_rfc1035_abs but requires at least two labels (the most general case for addresses used on the Internet)
// build RE to match as specified (or default)
switch ($email_format)
{
// default: "Internet" email address
case NULL:
case 0:
$re = $local_part_rfc2822.'@'.$domain_rfc1035_multi;# strict Internet address; absolute assumed even if ending dot not present
break;
case 1:
$re = $local_part_rfc2822.'@'.$domain_rfc1035; # also usable for internal address (allows single label); syntactically always relative (no ending dot allowed)
break;
// all other specified formats for Intranet use *only*
case 2:
$re = $local_part_rfc2822.'@'.$domain_dot_atom; # domain pattern as per RFC 2822 but requires at least two levels
break;
case 3:
$re = $local_part_rfc2822.'@'.$domain_rfc2822; # domain pattern as per RFC 2822 (one or more labels)
break;
case 4:
$re = $local_part_rfc2822.'@'.$domain_label_rfc1035;# domain pattern as per RFC 1035 but allows only single label (server name); use 1 if more levels are needed
break;
case 5:
$re = $local_part_rfc2822; # just a name, no server
break;
}
// return the resulting RE
return $re;
}
?>%%
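To see how the RFC 1035 building blocks behave in isolation, here is a minimal sketch that repeats the label and "at least two labels" patterns from the function so it can run on its own. The test domains are made up for illustration:
%%(php)<?php
// same building blocks as in RE_AddrSpec(), repeated so this runs on its own
$label = '[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?';	// 1 + 61 + 1 = max 63 characters
$domain_multi = $label.'(\.'.$label.')+\.?';			// at least two labels, optional ending dot

$tests = array('example.com','example.com.','localhost','-bad.example.com','bad-.example.com','1example.com');
foreach ($tests as $domain)
{
	$ok = preg_match('/^'.$domain_multi.'$/',$domain) === 1;
	echo str_pad($domain,18).($ok ? 'match' : 'no match')."\n";
}
?>%%
Only the first two domains match: ##localhost## has a single label, and the others have a label that starts with a hyphen or digit or ends with a hyphen. The last case shows why a failed match should be reported as "not supported" rather than "invalid": RFC 1123 later relaxed the rule that a label must start with a letter, so such a domain can exist even though the RFC 1035 "Preferred format" does not cover it.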
**""IsValidEmail()"" method**
%%(php)<?php
/**
* Check whether a supplied email address is syntactically valid.
*
* The function serves as a wrapper around Wakka::RE_AddrSpec() to enable validation of a
* user-supplied email address. Best used when the address is already "sanitized" with
* {@link Wakka::NoCrlf()} and subsequently trimmed to get rid of any surrounding whitespace.
*
* Usage example:
* <code>
* $email = trim($this->NoCrlf($_POST['email']));
* if (!$this->IsValidEmail($email))
* {
* // report problem
* }
* else
* {
* // continue...
* }
* </code>
* See {@link Wakka::RE_AddrSpec()} documentation about Error reporting!
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses Wakka::RE_AddrSpec() to build a standards-compliant RE used for the validation
*
* @param string $email Required.
* String to be validated
* @param integer $email_format Optional.
* Passed on to {@link Wakka::RE_AddrSpec()}
* @return boolean TRUE if $email conforms to format specified in $email_format, FALSE
* if not
*/
function IsValidEmail($email,$email_format=NULL)
{
$pattern = '/^'.$this->RE_AddrSpec($email_format).'$/';
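return (bool) preg_match($pattern,$email); # preg_match() returns an integer (or FALSE on error); cast to return a real boolean as documented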
}
?>%%
====Toolkit implementation - step 2====
[**NOTE**: the syntax highlighting below is nice, but makes a mess of the tabs used for formatting; try copying the code from the **source** of this page instead of from the rendered version: the tabs are still there and will make the code more readable!]
Now that we have the defines and the functions available we can start to apply them.
Note that while the functions themselves are fully tested, the code for the implementation suggestions below is **untested**; this is because I'm actually working on complete replacements for the actions involved (which I will share when finished, of course). So **use at your own risk**, please test before making it live (and do let me know if there are any problems).
===Installation===
Currently there is only (limited) ""JavaScript"" validation for Admin's email address. The procedure (**##setup/default.php##**) should at least have validation in PHP as well; I'm only suggesting an approach here, not giving full code:
%%(php)<?php
if (!IsValidEmail_func($email,0)) // 0 = default "Internet" format; use whatever format is needed
// report problem
else
// continue...
?>%%
Note that since we don't have a configuration **yet** at this point, we will need to specify which validation format is to be used, unless the default format is what is desired for the installation.
===Configuration===
If you are working in an Intranet and standard Internet email addresses are not used, create an entry in wikka.config.php with the name email_format and a value between 1 and 5 (see **##RE_AddrSpec()##** documentation above); e.g.:
%%(php)<?php
"email_format" => "4", # name@server
?>%%
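How the configured value and the optional parameter interact can be sketched in isolation as follows. The two helper functions are hypothetical (they are not part of the toolkit) and only mirror the filtering that ##RE_AddrSpec()## does internally:
%%(php)<?php
define('PATTERN_INT','/^[0-9]+$/');

// hypothetical helper (not part of the toolkit): returns a valid format 0-5, or NULL
function normalize_format($value)
{
	if (is_string($value) && preg_match(PATTERN_INT,$value)) $value = (int)$value;
	if (!is_int($value) || $value > 5 || $value < 0) $value = NULL;
	return $value;
}

// hypothetical helper: a valid parameter wins over the configuration value
function resolve_format($param,$config)
{
	$param = normalize_format($param);
	return isset($param) ? $param : normalize_format($config);
}

var_dump(resolve_format(NULL,'4'));	// int(4): taken from the configuration
var_dump(resolve_format(0,'4'));	// int(0): parameter overrides, back to the default format
var_dump(resolve_format('9','4'));	// int(4): invalid parameter is ignored
var_dump(resolve_format(NULL,NULL));	// NULL: nothing specified, so the default format applies
?>%%
The second call is the reason 0 exists as an explicit value: it lets a single action fall back to the default "Internet" format even when the configuration says otherwise.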
===User Settings===
File: **##actions/usersettings.php##**
==Update block==
Starts at: //##""// is user trying to update?""##//
Change as follows:
%%(php)<?php
// is user trying to update?
if (isset($_REQUEST["action"]) && ($_REQUEST["action"] == "update"))
{
$email = trim($this->NoCrlf($_POST["email"]));
if ('' == $email)
$mailerror = "You must specify an email address";
elseif (!$this->IsValidEmail($email))
$mailerror = $email." - that email format is not supported by this system";
else
{
$this->Query("update ".$this->config["table_prefix"]."users set ".
"email = '".mysql_real_escape_string($email)."', ".
"doubleclickedit = '".mysql_real_escape_string($_POST["doubleclickedit"])."', ".
"show_comments = '".mysql_real_escape_string($_POST["show_comments"])."', ".
"revisioncount = '".mysql_real_escape_string($_POST["revisioncount"])."', ".
"changescount = '".mysql_real_escape_string($_POST["changescount"])."' ".
"where name = '".$user["name"]."' limit 1");
$this->SetUser($this->LoadUser($user["name"]));
// forward
$this->SetMessage("User settings stored!");
$this->Redirect($this->href());
}
}
?>%%
==Update form==
Insert after the first table row (**including** <?php and ?>!):
%%(php)
<?php
if (isset($mailerror))
{
print("<tr><td></td><td><div class=\"error\">".$this->Format($mailerror)."</div></td></tr>\n");
}
?>
%%
==Create new account==
Starts at: //##""// otherwise, create new account""##//
Change first section as follows:
%%(php)<?php
else
{
$name = trim($this->NoCrlf($_POST["name"]));
$email = trim($this->NoCrlf($_POST["email"]));
$password = $_POST["password"];
$confpassword = $_POST["confpassword"];
// check if name is WikiName style
if (!$this->IsWikiName($name)) $error = "User name must be WikiName formatted!";
else if ('' == $email) $error = "You must specify an email address.";
else if (!$this->IsValidEmail($email)) $error = "That email address format is not supported by this system.";
else if ($confpassword != $password) $error = "Passwords didn't match.";
else if (preg_match("/ /", $password)) $error = "Spaces aren't allowed in passwords.";
else if (strlen($password) < 5) $error = "Password too short.";
else
{
?>%%
===Feedback===
File: **##actions/feedback.php##**
Change first section as follows (note we get any input first and "sanitize" it before validation):
%%(php)<?php
$name = trim($this->NoCrlf($_POST["name"]));
$email = trim($this->NoCrlf($_POST["email"]));
$comments = $_POST["comments"];
$form = '<p>Fill in the form below to send us your comments:</p>
<form method="post" action="'.$this->tag.'?mail=result">
Name: <input name="name" value="'.$name.'" type="text" /><br />
Email: <input name="email" value="'.$email.'" type="text" /><br />
Comments:<br />
<textarea name="comments" rows="15" cols="45">'.$comments.'</textarea><br />
<input type="submit" value="Send" />
</form>';
if ($_GET["mail"]=="result") {
if ('' == $name) {
// a valid name must be entered
echo "<p class=\"error\">Please enter your name</p>";
echo $form;
} elseif ('' == $email) {
echo "<p class=\"error\">You must specify an email address</p>";
echo $form;
} elseif (!$this->IsValidEmail($email)) {
// a valid email address must be entered
echo "<p class=\"error\">That email address format is not supported by this system</p>";
echo $form;
} elseif (!$comments) {
?>%%
===##Link()## method in Wakka class===
File: **##/wikka.php##**
Change these lines:
%%(php)<?php
// check for email addresses
if (preg_match("/^.+\@.+$/", $tag))
?>%%
to:
%%(php)<?php
// check for email addresses
if (preg_match("/^".$this->RE_AddrSpec()."$/", $tag))
?>%%
This will match the default "Internet" address format or whatever is configured in wikka.config.php; optionally, provide a format override in the ##RE_AddrSpec()## method, for instance 2 for a very generic pattern that is still RFC 2822 compliant (but not necessarily usable as an Internet email address!).
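Because the RE comes without delimiters, it can also be wrapped in a larger pattern. The sketch below shows the idea on its own; note that ##$addr## is a deliberately simplified stand-in (an assumption for illustration, not the RE that ##RE_AddrSpec()## actually returns), and the optional "mailto:" prefix is a made-up example of a surrounding pattern:
%%(php)<?php
// simplified stand-in for the RE returned by RE_AddrSpec() (assumption: the real RE is stricter and longer)
$addr = '[A-Za-z0-9._-]+@[A-Za-z][A-Za-z0-9-]*(\.[A-Za-z][A-Za-z0-9-]*)+';

// used as a building block: an optional "mailto:" prefix in front of the address
$pattern = '/^(mailto:)?('.$addr.')$/';

if (preg_match($pattern,'mailto:someone@example.com',$matches))
{
	echo $matches[2];	// someone@example.com
}
?>%%
In the class itself, ##$this->RE_AddrSpec()## would take the place of ##$addr##.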
====TODO====
===""WikkaMail()"" method===
I'm still working on this but it needs a mention here since the documentation for the toolkit parts above refers to it. Something along these lines to give you an idea what I'm working on:
%%(php)<?php
/**
* Platform-independent smart email
*
* Provides <i>some</i> protection against CRLF injection; uses platform-dependent line
* separators for body and headers, regardless where email elements are coming from (included
* file, function output, user input...); allows "friendly" To: addresses and adds these to
* headers; returns output value from mail().
*
* More when it's finished...
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 0.5
*
* @param string $to Required.
* Addressee(s) in comma-delimited list (see description)
* @param string $subject Required.
* Email subject
* @param string $body Required.
* Email body text
* @param string $headers Optional.
* Additional headers (e.g., From: ); default ''
* @param string $extra Optional.
* Extra switches for MTA program (e.g., sendmail); default ''
* @param boolean $debug Optional.
* Debug mode; default FALSE
* @return boolean TRUE on success, FALSE on failure
*/
function WikkaMail($to,$subject,$body,$headers=NULL,$extra=NULL,$debug=FALSE)
{
// ... later ... still working on it
}
?>%%
When finished, this can then be used in the FeedBack and EmailPassword actions (or anything else that needs to send an email).
==References==
~-General:
~~-[[http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License]]
~~-[[http://www.phpdoc.org/ phpDocumentor]]
~~-[[http://www.faqs.org/rfcs/rfc2822.html RFC 2822 - Internet Message Format]] (Proposed standard)
~~-[[http://www.faqs.org/rfcs/rfc1035.html RFC 1035 - Domain names - implementation and specification]]
~~-[[http://www.faqs.org/rfcs/rfc3696.html RFC 3696 - Application Techniques for Checking and Transformation of Names]] (Informational)
~-CRLF and SQL injection:
~~-[[http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities]]
~~-[[http://www.unixwiz.net/techtips/sql-injection.html SQL Injection Attacks by Example]]
~~-[[http://msdn.microsoft.com/msdnmag/issues/04/09/SQLInjection/default.aspx Stop SQL Injection Attacks Before They Stop You]]
-- JavaWoman
----
CategoryDevelopmentArchitecture
* $is_match = preg_match($pattern,$email);
* </code>
* It is NOT necessary to add the i modifier to the pattern since the RE itself already
* takes care of case-insensitivity as per the standards used.
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses PATTERN_INT to validate a format specification value as "integer"
* @uses Wakka::$config['email_format'] to get specified email format;
* same rules apply as for parameter $email_format
*
* @param integer $email_format Optional.
* If specified must be 0-5; specifies which format is to be used
* (NULL (default) and integer string allowed); overrides optional
* Wakka::$config['email_format'].
* @return string RE to be used for validation or as building block for a larger RE
*/
function RE_AddrSpec($email_format=NULL)
{
// Which format do we want to validate against? We filter out invalid parameter and config values and then allow parameter to override a config value
// ignore invalid parameter (but allow integer value specified as string)
if (preg_match(PATTERN_INT,$email_format)) $email_format = (int)$email_format;
if (!is_int($email_format) || $email_format > 5 || $email_format < 0) $email_format = NULL;
// ignore invalid config value (but allow integer value specified as string)
$cfg_email_format = isset($this->config['email_format']) ? $this->config['email_format'] : NULL; # avoid a notice when nothing is configured
if (preg_match(PATTERN_INT,$cfg_email_format)) $cfg_email_format = (int)$cfg_email_format;
if (!is_int($cfg_email_format) || $cfg_email_format > 5 || $cfg_email_format < 0) $cfg_email_format = NULL;
// pick up config value if parameter not specified (or invalid)
if (!isset($email_format)) $email_format = $cfg_email_format;
// RFC 2822: Email
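$atextchars = "!#$%&'*+/=?^_`{|}~-"; # all non-alphanumeric characters allowed in 'atext' of an 'atom' (RFC 2822); hyphen placed last so it cannot form a range such as +-/
$atom = 'A-Za-z0-9'.preg_quote($atextchars,'/'); # escape RE special chars but keep the alphanumeric ranges unquoted (preg_quote() escapes '-' as of PHP 5.3); matches 'atom' but excludes allowed whitespace and comments ([CFWS])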
$dot_atom = '['.$atom.']+(\.['.$atom.']+)*'; # dot-atom as allowed for local part of an email address
$local_part_rfc2822 = $dot_atom; # dot-atom for local part; no [CFWS]
$domain_rfc2822 = $dot_atom; # domain part as allowed per RFC 2822 but excluding domain literals and [CFWS]
$domain_dot_atom = '['.$atom.']+(\.['.$atom.']+)+'; # dot-atom domain part but requiring at least two levels and excluding domain literals and [CFWS]
// RFC 1035: Domains (Preferred format)
$domain_labelchars = "A-Za-z0-9-"; # all characters allowed in a "label": letters, digits and a hyphen (no escaping needed here)
$domain_labelstart = "A-Za-z"; # label must start with a letter
$domain_labelend = "A-Za-z0-9"; # label cannot end in hyphen
$domain_label_rfc1035 = '['.$domain_labelstart.'](['.$domain_labelchars.']{0,61}['.$domain_labelend.'])?';
# conforms to RFC 1035; max 63 characters in a label
$domain_rfc1035 = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*'; # string of one or more dot-separated labels
$domain_rfc1035_abs = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*\.?'; # explicitly allows terminating dot to specify absolute domain
$domain_rfc1035_multi = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')+\.?'; # as $domain_rfc1035_abs but requires at least two labels (the most general case for addresses used on the Internet)
// build RE to match as specified (or default)
switch ($email_format)
{
// default: "Internet" email address
case NULL:
case 0:
$re = $local_part_rfc2822.'@'.$domain_rfc1035_multi;# strict Internet address; absolute assumed even if ending dot not present
break;
case 1:
$re = $local_part_rfc2822.'@'.$domain_rfc1035; # also usable for internal address (allows single label); syntactically always relative (no ending dot allowed)
break;
// all other specified formats for Intranet use *only*
case 2:
$re = $local_part_rfc2822.'@'.$domain_dot_atom; # domain pattern as per RFC 2822 but requires at least two levels
break;
case 3:
$re = $local_part_rfc2822.'@'.$domain_rfc2822; # domain pattern as per RFC 2822: one or more dot-separated atoms
break;
case 4:
$re = $local_part_rfc2822.'@'.$domain_label_rfc1035;# domain pattern as per RFC 1035 but allows only single label (server name); use 1 if more levels are needed
break;
case 5:
$re = $local_part_rfc2822; # just a name, no server
break;
}
// return the resulting RE
return $re;
}
?>%%
**""IsValidEmail()"" method**
%%(php)<?php
/**
* Check whether a supplied email address is syntactically valid.
*
* The function serves as a wrapper around Wakka::RE_AddrSpec() to enable validation of a
* user-supplied email address. Best used when the address is already "sanitized" with
* {@link Wakka::NoCrlf()} and subsequently trimmed to get rid of any surrounding whitespace.
*
* Usage example:
* <code>
* $email = trim($this->NoCrlf($_POST['email']));
* if (!$this->IsValidEmail($email))
* {
* // report problem
* }
* else
* {
* // continue...
* }
* </code>
* See {@link Wakka::RE_AddrSpec()} documentation about Error reporting!
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses Wakka::RE_AddrSpec() to build a standards-compliant RE used for the validation
*
* @param string $email Required.
* String to be validated
* @param integer $email_format Optional.
* Passed on to {@link Wakka::RE_AddrSpec()}
* @return boolean TRUE if $email conforms to format specified in $email_format, FALSE
* if not
*/
function IsValidEmail($email,$email_format=NULL)
{
$pattern = '/^'.$this->RE_AddrSpec($email_format).'$/';
return (1 === preg_match($pattern,$email)); # preg_match() returns an integer (or FALSE on error); convert to the documented boolean
}
?>%%
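To get a feel for what the RFC 1035 building blocks accept, here is a minimal stand-alone sketch (not part of the toolkit). It hard-codes the label pattern from ##RE_AddrSpec()## and tries the "at least two labels" variant on a few made-up sample domains:
%%(php)<?php
// Stand-alone sketch: the label pattern is copied from RE_AddrSpec();
// the sample domains are invented for illustration only.
$label = '[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?';	// starts with a letter, never ends in a hyphen
$multi = $label.'(\.'.$label.')+\.?';					// at least two labels, optional final dot

$samples = array('example.com', 'localhost', 'a-.example.com', 'mail.example.com.');
foreach ($samples as $domain)
{
	$ok = (1 === preg_match('/^'.$multi.'$/', $domain));
	echo $domain.' => '.($ok ? 'match' : 'no match')."\n";
}
?>%%
This should report a match for ##example.com## and ##mail.example.com.## and no match for ##localhost## (a single label) or ##a-.example.com## (a label ending in a hyphen).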
====Toolkit implementation - step 2====
[**NOTE**: the syntax highlighting below is nice, but makes a mess of the tabs used for formatting; try copying the code from the **source** of this page instead of from the rendered version: the tabs are still there and will make the code more readable!]
Now that we have the defines and the functions available we can start to apply them.
Note that while the functions themselves are fully tested, the code for the implementation suggestions below is **untested**; this is because I'm actually working on complete replacements for the actions involved (which I will share when finished, of course). So **use at your own risk**: please test before making it live (and do let me know if there are any problems).
===Installation===
Currently there is only (limited) ""JavaScript"" validation for Admin's email address. The procedure (**##setup/default.php##**) should at least have validation in PHP as well; I'm only suggesting an approach here, not giving full code:
%%(php)<?php
if (!IsValidEmail_func($email,0)) // 0 = default "Internet" format; use whatever format is needed
// report problem
else
// continue...
?>%%
Note that since we don't have a configuration **yet** at this point, we will need to specify which validation format is to be used, unless the default format is what is desired for the installation.
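Since no ##Wakka## object exists during installation, the check would have to be a plain function. The following is only a sketch under that assumption: the body of ##""IsValidEmail_func()""## is hypothetical, it handles just formats 0 and 1, and it repeats the patterns instead of sharing them with the class:
%%(php)<?php
// Hypothetical stand-alone variant for the installer (an assumption, not toolkit code).
// Only format 0 (two or more labels) and format 1 (one or more labels) are sketched.
function IsValidEmail_func($email, $email_format = 0)
{
	// alphanumeric ranges stay unquoted; only the special characters go through preg_quote()
	$atext = 'A-Za-z0-9'.preg_quote("!#$%&'*+/=?^_`{|}~-", '/');
	$local = '['.$atext.']+(\.['.$atext.']+)*';
	$label = '[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?';
	$domain = (1 === (int) $email_format)
		? $label.'(\.'.$label.')*'		// format 1: one or more labels
		: $label.'(\.'.$label.')+\.?';	// format 0: at least two labels
	// D modifier: '$' matches only at the very end, so a trailing newline is rejected
	return 1 === preg_match('/^'.$local.'@'.$domain.'$/D', $email);
}

var_dump(IsValidEmail_func('admin@example.com', 0));
var_dump(IsValidEmail_func('admin@localhost', 0));
var_dump(IsValidEmail_func('admin@localhost', 1));
?>%%
The ##D## modifier is an addition of this sketch; the toolkit functions rely on ##""NoCrlf()""## and ##trim()## being applied first instead.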
===Configuration===
If you are working in an Intranet and standard Internet email addresses are not used, create an entry in wikka.config.php with the name email_format and a value between 1 and 5 (see **##RE_AddrSpec()##** documentation above); e.g.:
%%(php)<?php
"email_format" => "4", # name@server
?>%%
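How the configured value and the function parameter interact can be sketched in isolation. ##resolve_email_format()## below is a hypothetical helper written only to illustrate the precedence rules (a valid parameter wins, then a valid configuration value, then the default); unlike ##PATTERN_INT## it accepts single digits only:
%%(php)<?php
// Sketch of the precedence rules only; resolve_email_format() is not a toolkit function.
function resolve_email_format($param = NULL, $config = NULL)
{
	foreach (array($param, $config) as $value)
	{
		// accept integers and integer strings in the range 0-5
		if ((is_int($value) || is_string($value)) && preg_match('/^[0-5]$/D', (string) $value))
		{
			return (int) $value;
		}
	}
	return 0;	// nothing valid specified: default "Internet" format
}

var_dump(resolve_email_format(NULL, '4'));	// config value is used
var_dump(resolve_email_format(0, '4'));		// parameter 0 overrides the config
var_dump(resolve_email_format('9', 'x'));	// both invalid: default format
?>%%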
===User Settings===
File: **##actions/usersettings.php##**
==Update block==
Starts at: //##""// is user trying to update?""##//
Change as follows:
%%(php)<?php
// is user trying to update?
if (isset($_REQUEST["action"]) && ($_REQUEST["action"] == "update"))
{
$email = trim($this->NoCrlf($_POST["email"]));
if ('' == $email)
$mailerror = "You must specify an email address";
elseif (!$this->IsValidEmail($email))
$mailerror = $email." - that email format is not supported by this system";
else
{
$this->Query("update ".$this->config["table_prefix"]."users set ".
"email = '".mysql_real_escape_string($email)."', ".
"doubleclickedit = '".mysql_real_escape_string($_POST["doubleclickedit"])."', ".
"show_comments = '".mysql_real_escape_string($_POST["show_comments"])."', ".
"revisioncount = '".mysql_real_escape_string($_POST["revisioncount"])."', ".
"changescount = '".mysql_real_escape_string($_POST["changescount"])."' ".
"where name = '".mysql_real_escape_string($user["name"])."' limit 1"); # escape the user name as well
$this->SetUser($this->LoadUser($user["name"]));
// forward
$this->SetMessage("User settings stored!");
$this->Redirect($this->href());
}
}
?>%%
==Update form==
Insert after the first table row (**including** <?php and ?>!):
%%(php)
<?php
if (isset($mailerror))
{
print("<tr><td></td><td><div class=\"error\">".$this->Format($mailerror)."</div></td></tr>\n");
}
?>
%%
==Create new account==
Starts at: //##""// otherwise, create new account""##//
Change first section as follows:
%%(php)<?php
else
{
$name = trim($this->NoCrlf($_POST["name"]));
$email = trim($this->NoCrlf($_POST["email"]));
$password = $_POST["password"];
$confpassword = $_POST["confpassword"];
// check if name is WikkiName style
// check if name is WikiName style
if (!$this->IsWikiName($name)) $error = "User name must be WikiName formatted!";
else if ('' == $email) $error = "You must specify an email address.";
else if (!$this->IsValidEmail($email)) $error = "That email address format is not supported by this system.";
else if ($confpassword != $password) $error = "Passwords didn't match.";
else if (preg_match("/ /", $password)) $error = "Spaces aren't allowed in passwords.";
else if (strlen($password) < 5) $error = "Password too short.";
else
{
?>%%
===Feedback===
File: **##actions/feedback.php##**
Change first section as follows (note we get any input first and "sanitize" it before validation):
%%(php)<?php
$name = trim($this->NoCrlf($_POST["name"]));
$email = trim($this->NoCrlf($_POST["email"]));
$comments = $_POST["comments"];
$form = '<p>Fill in the form below to send us your comments:</p>
<form method="post" action="'.$this->tag.'?mail=result">
Name: <input name="name" value="'.$name.'" type="text" /><br />
Email: <input name="email" value="'.$email.'" type="text" /><br />
Comments:<br />
<textarea name="comments" rows="15" cols="45">'.$comments.'</textarea><br />
<input type="submit" value="Send" />
</form>';
if ($_GET["mail"]=="result") {
if ('' == $name) {
// a valid name must be entered
echo "<p class=\"error\">Please enter your name</p>";
echo $form;
} elseif ('' == $email) {
echo "<p class=\"error\">You must specify an email address</p>";
echo $form;
} elseif (!$this->IsValidEmail($email)) {
// a valid email address must be entered
echo "<p class=\"error\">That email address format is not supported by this system</p>";
echo $form;
} elseif (!$comments) {
?>%%
===##Link()## method in Wakka class===
File: **##/wikka.php##**
Change these lines:
%%(php)<?php
// check for email addresses
if (preg_match("/^.+\@.+$/", $tag))
?>%%
to:
%%(php)<?php
// check for email addresses
if (preg_match("/^".$this->RE_AddrSpec()."$/", $tag))
?>%%
This will match the default "Internet" address format or whatever is configured in wikka.config.php; optionally, provide a format override in the ##RE_AddrSpec()## method, for instance 2 for a very generic pattern that is still RFC 2822 compliant (but not necessarily usable as an Internet email address!).
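Because the RE comes without delimiters or anchors, it can also be embedded in a larger pattern. The sketch below is stand-alone: ##$addr_spec## is a deliberately simplified stand-in for the string returned by ##RE_AddrSpec()##, and the sample text is invented:
%%(php)<?php
// $addr_spec stands in for the output of $this->RE_AddrSpec();
// a simplified pattern is hard-coded so the sketch runs on its own.
$addr_spec = '[A-Za-z0-9._+-]+@[A-Za-z][A-Za-z0-9-]*(\.[A-Za-z][A-Za-z0-9-]*)+';

// larger RE: an optional "mailto:" prefix followed by the address (captured as group 1)
$pattern = '/(?:mailto:)?('.$addr_spec.')/';
$text = 'Write to mailto:jane@example.com or bob@mail.example.org today.';
preg_match_all($pattern, $text, $matches);
print_r($matches[1]);	// the two addresses found in $text
?>%%
One thing to keep in mind: the generated RE itself contains capturing groups, so any group that follows it in a larger pattern is numbered higher than a quick count of your own parentheses suggests.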
====TODO====
===""WikkaMail()"" method===
I'm still working on this but it needs a mention here since the documentation for the toolkit parts above refers to it. Something along these lines to give you an idea of what I'm working on:
%%(php)<?php
/**
* Platform-independent smart email
*
* Provides <i>some</i> protection against CRLF injection; uses platform-dependent line
* separators for body and headers, regardless where email elements are coming from (included
* file, function output, user input...); allows "friendly" To: addresses and adds these to
* headers; returns output value from mail().
*
* More when it's finished...
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 0.5
*
* @param string $to Required.
* Addressee(s) in comma-delimited list (see description)
* @param string $subject Required.
* Email subject
* @param string $body Required.
* Email body text
* @param string $headers Optional.
* Additional headers (e.g., From: ); default NULL
* @param string $extra Optional.
* Extra switches for MTA program (e.g., sendmail); default NULL
* @param boolean $debug Optional.
* Debug mode; default FALSE
* @return boolean TRUE on success, FALSE on failure
*/
function WikkaMail($to,$subject,$body,$headers=NULL,$extra=NULL,$debug=FALSE)
{
// ... later ... still working on it
}
?>%%
When finished, this can then be used in the FeedBack and EmailPassword actions (or anything else that needs to send an email).
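One small piece of this, normalizing line separators regardless of where the text came from, can be sketched with the same pattern as the ##PATTERN_NL## define. ##normalize_eol()## is a hypothetical helper for illustration only, not the unfinished ##""WikkaMail()""## code:
%%(php)<?php
// normalize_eol() is a hypothetical helper, not part of the toolkit.
function normalize_eol($text, $eol = PHP_EOL)
{
	// same pattern as the PATTERN_NL define: CRLF, lone LF or lone CR
	return preg_replace("/(\r?\n)|\r/", $eol, $text);
}

// mixed line endings in, uniform CRLF out (shown as hex to make the bytes visible)
echo bin2hex(normalize_eol("a\r\nb\rc\nd", "\r\n"));
?>%%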
==References==
~-General:
~~-[[http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License]]
~~-[[http://www.phpdoc.org/ phpDocumentor]]
~~-[[http://www.faqs.org/rfcs/rfc2822.html RFC 2822 - Internet Message Format]] (Proposed standard)
~~-[[http://www.faqs.org/rfcs/rfc1035.html RFC 1035 - Domain names - implementation and specification]]
~~-[[http://www.faqs.org/rfcs/rfc3696.html RFC 3696 - Application Techniques for Checking and Transformation of Names]] (Informational)
~-CRLF and SQL injection:
~~-[[http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities]]
~~-[[http://www.unixwiz.net/techtips/sql-injection.html SQL Injection Attacks by Example]]
~~-[[http://msdn.microsoft.com/msdnmag/issues/04/09/SQLInjection/default.aspx Stop SQL Injection Attacks Before They Stop You]]
-- JavaWoman
----
CategoryDevelopmentArchitecture
Deletions:
Note: this is still incomplete but should already provide some solutions for the issues I've outlined on WikkaAndEmail. I'm giving code, instructions for how to implement in Wikka, and a peek at what's to follow.
All released under the [[http://www.gnu.org/copyleft/lesser.html LGPL]]. You could use this with minor changes in other web projects that need email functionality as well.
I'm documenting everything now in the [[http://www.phpdoc.org/ phpDocumentor]] format; please keep the documentation *with* the code. The documentation is readable (I think) even if it's not processed by phpDocumentor; and contains important information about what is supported (and why) and what is explicitly **not** supported, as well as how to use the functions. And, of course, it contains copyright and license information. :)
====Toolkit implementation - step 1====
[**NOTE**: the syntax highlighting below is nice, but makes a mess of the tabs used for formatting; try copying the code from the **source** of this page instead of from the rendered version: the tabs are still there and will make the code more readable!]
===Patterns===
Define patterns: the idea is to define a pattern only **once** so it can be used consistently in different places.
At the start of wikka.php - add the following (including documentation blocks and without <?php and ?>) right after the WAKKA_VERSION define:
**pattern defines**
%%(php)<?php
// Pattern defines (start every define in this block with 'PATTERN_' and attach a bit to indicate what it is a pattern for
/**#@+
* Defines a pattern as a constant so it is available and consistent throughout the application
*/
define('PATTERN_NL',"/(\r?\n)|\r/"); # newline
define('PATTERN_INT','/^[0-9]+$/'); # integer defined as string
/**#@-*/
?>%%
(We'll add more here later.)
===Create an EMAIL section===
Create an "email section" in the ##Wakka## class by adding this right before the %% // VARIABLES%% line:
**email section**
%%
%%
===Functions===
The toolkit currently consists of the pattern defines (above) and three functions which make use of them. Reason for the functions and their usage are covered in their documentation blocks.
Copy the following three functions (including documentation blocks and without <?php and ?>) into the (new) EMAIL section in the ##Wakka## class:
**""NoCrlf()"" method**
%%(php)<?php
/**
* Replace CR and/or LF by space in user input to prevent CRLF injection in PHP email forms.
*
* Email forms (actions) that allow a user to enter a To: or From: email address and/or
* a name and/or a subject --in general fields to be used in constructing an email header--
* may be susceptible to CRLF injection which would allow an attacker to send arbitrary email
* to arbitrary addressees.
* Simply replacing any form of "newline" in such input by a space makes such an attempt
* futile.<br>
* Function inspired by article {@link http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities}
* but implemented differently.
*
* Usage:
* <ul>
* <li> Copy this whole file to the EMAIL section of the Wakka class</li>
* <li> Apply to every user-supplied value for email address, name or subject (anything that is,
* or CAN BE used in an email header!) to guard against this</li>
* </ul>
* Use as follows:
* <code>
* // get input
* // ....
* $input = trim($this->NoCrlf($input));
* </code>
* or directly as:
* <code>
* $email = trim($this->NoCrlf($_POST['email']));
* // get other variables from a submitted form
* // ...
* </code>
* Note that {@link trim()} is applied <i>after</i> applying this function to get rid of
* whitespace at start and end of the resulting string.
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses PATTERN_NL to recognize any type of "newline"
*
* @param string $string Required.
* User input to be sanitized
* @return string sanitized input
*/
function NoCrlf($string)
{
return preg_replace(PATTERN_NL,' ',$string);
}
?>%%
**RE_AddrSpec() method**
%%(php)<?php
/**
* Builds an RE that can be used to validate an email address or to recognize something that
* "looks like" an email address.
*
* This function builds a regular expression to enable validation of a string as a valid email
* address or to recognize something that "looks like" an email address, based on applicable
* Internet standards (notably RFC 2822 --officially a "Proposed standard", replacing RFC 822--
* and RFC 1035). The regular expression returned is Perl-compatible for use in PHP's preg_...
* functions but does NOT include delimiters; this is to allow the RE to be used as part of a
* larger RE which could be used to match a string of which an actual email address is only
* part.
*
* This function is designed such that:
* <ul>
* <li> an email address that matches an RE generated by this function is guaranteed to be
* conforming to the format standard(s) specified to the function (using RFC 2822 rules and
* (if specified) the RFC 1035 "Preferred format" for the domain part);</li>
* <li> an email address that is found to NOT match the RE generated by this function <i>may</i>
* still be conforming to the format standard(s) specified.</li>
* </ul>
*
* Error reporting:<br>
* This design implies that a user-supplied email address that matches a generated RE SHOULD be
* silently accepted.<br>
* Conversely when a user-supplied email address does not match this SHOULD NOT result in an
* error message suggesting the address is "invalid" (it may not be); any error message SHOULD
* only indicate that the address format in question is "not supported" by the application
* using this function.
*
* Standards compliance:<br>
* The RE is built using building blocks based on the production rules as specified in:
* <ul>
* <li> RFC 2822 section 3.4.1 for the address format: 'addr-spec = local-part "@" domain'</li>
* <li> RFC 2822 section 3.2.4 for the 'local-part': using 'atext', 'atom' and 'dot-atom' (using
* a subset of the full production ruleset)</li>
* <li> RFC 2822 section 3.4.1 (dot-atom) <b>or</b> RFC 1035 section 3.5 for the 'domain'
* part</li>
* </ul>
*
* Note that the domain syntax as specified in RFC 1035 section 3.5 is merely a
* "Preferred format"; we use it here because this is the generally accepted (and widely
* enforced) format.
*
* By means of interfacing with external configuration and a possible override with the
* $email_format parameter, considerable flexibility in selecting an applicable format is
* provided while still returning a standards-compliant email address pattern.
*
* Explicitly NOT SUPPORTED are:
* <ul>
* <li> whitespace and comments ([CFWS]) in an email address (though allowed by RFC 2822 section
* 3.2.4)</li>
* <li> "quoted string" (strings of characters not allowed in the 'atom' production rule in
* section 3.2.4 of RFC 2822), although allowed for the 'local-part' by RFC 2822 section 3.4.1
* </li>
* <li> domain literals instead of domain name; e.g., [10.0.0.67] (though allowed by RFC 2822
* section 3.4.1)</li>
* <li> "internationalized" domain names (see RFC 3490 and related RFCs: these are still very
* much proposals, not a standard yet)</li>
* <li> any check that a 'local part' is no longer than 64 characters (the limit set in RFC 2821
* section 4.5.3.1 and repeated in RFC 3696)</li>
* <li> any check that a domain name is no more than 255 bytes long (RFC 1035 section
* 2.3.4)</li>
* </ul>
*
* Behavior:
* <ul>
* <li> If no format is specified, the function delivers the default format</li>
* <li> If a valid format (0-5) is specified in the configuration variable 'email_format', this
* is used but:</li>
* <li> If a valid format (0-5) is specified in the $email_format parameter, this is used,
* overriding anything specified in the configuration; 0 specifies "default format" so it
* can override whatever is specified in the configuration</li>
* </ul>
*
* Formats supported:
* <ul>
* <li> ALL - local-part ('mailbox name'): RFC 2822 compliant but without support for
* whitespace, comments or "quoted string";</li>
* <li> 0-4 - local-part MUST be followed by a '@' to separate it from the domain part</li>
* <li> 0 [default] - domain: RFC 1034/1035 compliant 'domain' consisting of at least two labels
* results in the most "generally acceptable" format for an Internet email
* address</li>
* <li> 1 - domain: RFC 1034/1035 compliant but consisting of one or more labels
* allows relative domain (such as using single server name) while still being RFC
* 1035 compliant if a domain is attached</li>
* <li> 2 - domain: RFC 2822 compliant but consisting of at least two labels</li>
* <li> 3 - domain: RFC 2822 compliant</li>
* <li> 4 - domain: RFC 1035 compliant but allowing only a single level (an internal server
* name); use 1 if multiple levels are needed</li>
* <li> 5 - domain: NOT allowed (only 'user name', no '@' or domain accepted)</li>
* </ul>
*
* Formats 2-5 are specifically intended for Intranet use while 1 may be used for Intranets
* using relative domains (server names) that still need to result in an RFC 1035 compliant
* domain when a domain is appended for external use.
* To see what it's producing, add the following line to just before the result is returned:
* <code>
* echo "resulting RE:<br/>$re<br/><br/>";
* </code>
*
* Usage:<br>
* The function deliberately does not include delimiters in its output to enable it to be used
* as a building block for a larger RE. However, it takes care that / is escaped enabling / to
* be used as delimiter. This results in the following usage patterns:
* <ul>
* <li> to use as building block:</li>
* </ul>
* <code>
* $this->RE_AddrSpec() // (optionally provide parameter)
* </code>
* <ul>
* <li> to use as pattern in any of the preg_... functions, add the / delimiters (and optionally
* 'start' and 'end' delimiters) first:</li>
* </ul>
* <code>
* $pattern = '/^'.$this->RE_AddrSpec().'$/';
* $is_match = preg_match($pattern,$email);
* </code>
* It is NOT necessary to add the i modifier to the pattern since the RE itself already
* takes care of case-insensitivity as per the standards used.
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses PATTERN_INT to validate a format specification value as "integer"
* @uses Wakka::$config['email_format'] to get specified email format;
* same rules apply as for parameter $email_format
*
* @param integer $email_format Optional.
* If specified must be 0-5; specifies which format is to be used
* (NULL (default) and integer string allowed); overrides optional
* Wakka::$config['email_format'].
* @return string RE to be used for validation or as building block for a larger RE
*/
function RE_AddrSpec($email_format=NULL)
{
// Which format do we want to validate against? We filter out invalid parameter and config values and then allow parameter to override a config value
// ignore invalid parameter (but allow integer value specified as string)
if (preg_match(PATTERN_INT,$email_format)) $email_format = (int)$email_format;
if (!is_int($email_format) || $email_format > 5 || $email_format < 0) $email_format = NULL;
// ignore invalid config value (but allow integer value specified as string)
$cfg_email_format = isset($this->config['email_format']) ? $this->config['email_format'] : NULL; # avoid a notice when nothing is configured
if (preg_match(PATTERN_INT,$cfg_email_format)) $cfg_email_format = (int)$cfg_email_format;
if (!is_int($cfg_email_format) || $cfg_email_format > 5 || $cfg_email_format < 0) $cfg_email_format = NULL;
// pick up config value if parameter not specified (or invalid)
if (!isset($email_format)) $email_format = $cfg_email_format;
// RFC 2822: Email
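$atextchars = "!#$%&'*+/=?^_`{|}~-"; # all non-alphanumeric characters allowed in 'atext' of an 'atom' (RFC 2822); hyphen placed last so it cannot form a range such as +-/
$atom = 'A-Za-z0-9'.preg_quote($atextchars,'/'); # escape RE special chars but keep the alphanumeric ranges unquoted (preg_quote() escapes '-' as of PHP 5.3); matches 'atom' but excludes allowed whitespace and comments ([CFWS])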
$dot_atom = '['.$atom.']+(\.['.$atom.']+)*'; # dot-atom as allowed for local part of an email address
$local_part_rfc2822 = $dot_atom; # dot-atom for local part; no [CFWS]
$domain_rfc2822 = $dot_atom; # domain part as allowed per RFC 2822 but excluding domain literals and [CFWS]
$domain_dot_atom = '['.$atom.']+(\.['.$atom.']+)+'; # dot-atom domain part but requiring at least two levels and excluding domain literals and [CFWS]
// RFC 1035: Domains (Preferred format)
$domain_labelchars = "A-Za-z0-9-"; # all characters allowed in a "label": letters, digits and a hyphen (no escaping needed here)
$domain_labelstart = "A-Za-z"; # label must start with a letter
$domain_labelend = "A-Za-z0-9"; # label cannot end in hyphen
$domain_label_rfc1035 = '['.$domain_labelstart.'](['.$domain_labelchars.']{0,61}['.$domain_labelend.'])?';
# conforms to RFC 1035; max 63 characters in a label
$domain_rfc1035 = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*'; # string of one or more dot-separated labels
$domain_rfc1035_abs = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*\.?'; # explicitly allows terminating dot to specify absolute domain
$domain_rfc1035_multi = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')+\.?'; # as $domain_rfc1035_abs but requires at least two labels (the most general case for addresses used on the Internet)
// build RE to match as specified (or default)
switch ($email_format)
{
// default: "Internet" email address
case NULL:
case 0:
$re = $local_part_rfc2822.'@'.$domain_rfc1035_multi;# strict Internet address; absolute assumed even if ending dot not present
break;
case 1:
$re = $local_part_rfc2822.'@'.$domain_rfc1035; # also usable for internal address (allows single label); syntactically always relative (no ending dot allowed)
break;
// all other specified formats for Intranet use *only*
case 2:
$re = $local_part_rfc2822.'@'.$domain_dot_atom; # domain pattern as per RFC 2822 but requires at least two levels
break;
case 3:
$re = $local_part_rfc2822.'@'.$domain_rfc2822; # domain pattern as per RFC 2822: one or more dot-separated atoms
break;
case 4:
$re = $local_part_rfc2822.'@'.$domain_label_rfc1035;# domain pattern as per RFC 1035 but allows only single label (server name); use 1 if more levels are needed
break;
case 5:
$re = $local_part_rfc2822; # just a name, no server
break;
}
// return the resulting RE
return $re;
}
?>%%
**""IsValidEmail()"" method**
%%(php)<?php
/**
* Check whether a supplied email address is syntactically valid.
*
* The function serves as a wrapper around Wakka::RE_AddrSpec() to enable validation of a
* user-supplied email address. Best used when the address is already "sanitized" with
* {@link Wakka::NoCrlf()} and subsequently trimmed to get rid of any surrounding whitespace.
*
* Usage example:
* <code>
* $email = trim($this->NoCrlf($_POST['email']));
* if (!$this->IsValidEmail($email))
* {
* // report problem
* }
* else
* {
* // continue...
* }
* </code>
* See {@link Wakka::RE_AddrSpec()} documentation about Error reporting!
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses Wakka::RE_AddrSpec() to build a standards-compliant RE used for the validation
*
* @param string $email Required.
* String to be validated
* @param integer $email_format Optional.
* Passed on to {@link Wakka::RE_AddrSpec()}
* @return boolean TRUE if $email conforms to format specified in $email_format, FALSE
* if not
*/
function IsValidEmail($email,$email_format=NULL)
{
$pattern = '/^'.$this->RE_AddrSpec($email_format).'$/';
return (1 === preg_match($pattern,$email)); # preg_match() returns an integer (or FALSE on error); convert to the documented boolean
}
?>%%
====Toolkit implementation - step 2====
[**NOTE**: the syntax highlighting below is nice, but makes a mess of the tabs used for formatting; try copying the code from the **source** of this page instead of from the rendered version: the tabs are still there and will make the code more readable!]
Now that we have the defines and the functions available we can start to apply them.
Note that while the functions themselves are fully tested, the code for the implementation suggestions below is **untested**, because I'm actually working on complete replacements for the actions involved (which I will share when finished, of course). So **use at your own risk**: please test before making it live (and do let me know if there are any problems).
===Installation===
Currently there is only (limited) ""JavaScript"" validation for Admin's email address. The procedure (**##setup/default.php##**) should at least have validation in PHP as well; I'm only suggesting an approach here, not giving full code:
%%(php)<?php
if (!IsValidEmail_func($email,0)) // 0 = default "Internet" format; use whatever format is needed
// report problem
else
// continue...
?>%%
Note that since we don't have a configuration **yet** at this point, we will need to specify which validation format is to be used, unless the default format is what is desired for the installation.
===Configuration===
If you are working in an Intranet and standard Internet email addresses are not used, create an entry in wikka.config.php with the name email_format and a value between 1 and 5 (see **##RE_AddrSpec()##** documentation above); e.g.:
%%(php)<?php
"email_format" => "4", # name@server
?>%%
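The ##RE_AddrSpec()## documentation describes the precedence: a valid ##$email_format## parameter overrides the configured ##email_format##, which in turn overrides the default (0). A small sketch of that lookup order, assuming the configuration is available as an array; ##ResolveEmailFormat()## is a hypothetical helper written for this illustration and is not part of the toolkit:
%%(php)<?php
/**
 * Sketch (hypothetical helper): decide which email format number applies.
 *
 * @param	array	$config			configuration array (e.g. from wikka.config.php)
 * @param	mixed	$email_format	optional override; NULL means "not specified"
 * @return	integer	format number 0-5
 */
function ResolveEmailFormat($config,$email_format=NULL)
{
	$valid = array('0','1','2','3','4','5');
	// 1. a valid parameter wins, including 0 to force the default format
	if (NULL !== $email_format && in_array((string) $email_format,$valid,TRUE))
	{
		return (int) $email_format;
	}
	// 2. otherwise use a valid configured value
	if (isset($config['email_format']) && in_array((string) $config['email_format'],$valid,TRUE))
	{
		return (int) $config['email_format'];
	}
	// 3. otherwise fall back to the default "Internet" format
	return 0;
}

$config = array('email_format' => '4');
var_dump(ResolveEmailFormat($config));		// int(4): taken from the configuration
var_dump(ResolveEmailFormat($config,0));	// int(0): parameter overrides configuration
var_dump(ResolveEmailFormat(array()));		// int(0): nothing specified, default
?>%%
An invalid value (say "9") is simply ignored in this sketch, so the next level in the chain applies.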
===User Settings===
File: **##actions/usersettings.php##**
==Update block==
Starts at: //##""// is user trying to update?""##//
Change as follows:
%%(php)<?php
// is user trying to update?
if (isset($_REQUEST["action"]) && ($_REQUEST["action"] == "update"))
{
$email = trim($this->NoCrlf($_POST["email"]));
if ('' == $email)
$mailerror = "You must specify an email address";
elseif (!$this->IsValidEmail($email))
$mailerror = $email." - that email format is not supported by this system";
else
{
$this->Query("update ".$this->config["table_prefix"]."users set ".
"email = '".mysql_real_escape_string($email)."', ".
"doubleclickedit = '".mysql_real_escape_string($_POST["doubleclickedit"])."', ".
"show_comments = '".mysql_real_escape_string($_POST["show_comments"])."', ".
"revisioncount = '".mysql_real_escape_string($_POST["revisioncount"])."', ".
"changescount = '".mysql_real_escape_string($_POST["changescount"])."' ".
"where name = '".$user["name"]."' limit 1");
$this->SetUser($this->LoadUser($user["name"]));
// forward
$this->SetMessage("User settings stored!");
$this->Redirect($this->href());
}
}
?>%%
==Update form==
Insert after the first table row (**including** <?php and ?>!):
%%(php)
<?php
if (isset($mailerror))
{
print("<tr><td></td><td><div class=\"error\">".$this->Format($mailerror)."</div></td></tr>\n");
}
?>
%%
==Create new account==
Starts at: //##""// otherwise, create new account""##//
Change first section as follows:
%%(php)<?php
else
{
$name = trim($this->NoCrlf($_POST["name"]));
$email = trim($this->NoCrlf($_POST["email"]));
$password = $_POST["password"];
$confpassword = $_POST["confpassword"];
// check if name is WikiName style
if (!$this->IsWikiName($name)) $error = "User name must be WikiName formatted!";
else if ('' == $email) $error = "You must specify an email address.";
else if (!$this->IsValidEmail($email)) $error = "That email address format is not supported by this system.";
else if ($confpassword != $password) $error = "Passwords didn't match.";
else if (preg_match("/ /", $password)) $error = "Spaces aren't allowed in passwords.";
else if (strlen($password) < 5) $error = "Password too short.";
else
{
?>%%
===Feedback===
File: **##actions/feedback.php##**
Change first section as follows (note we get any input first and "sanitize" it before validation):
%%(php)<?php
$name = trim($this->NoCrlf($_POST["name"]));
$email = trim($this->NoCrlf($_POST["email"]));
$comments = $_POST["comments"];
$form = '<p>Fill in the form below to send us your comments:</p>
<form method="post" action="'.$this->tag.'?mail=result">
Name: <input name="name" value="'.$name.'" type="text" /><br />
Email: <input name="email" value="'.$email.'" type="text" /><br />
Comments:<br />
<textarea name="comments" rows="15" cols="45">'.$comments.'</textarea><br />
<input type="submit" value="Send" />
</form>';
if ($_GET["mail"]=="result") {
if ('' == $name) {
// a valid name must be entered
echo "<p class=\"error\">Please enter your name</p>";
echo $form;
} elseif ('' == $email) {
echo "<p class=\"error\">You must specify an email address</p>";
echo $form;
} elseif (!$this->IsValidEmail($email)) {
// a valid email address must be entered
echo "<p class=\"error\">That email address format is not supported by this system</p>";
echo $form;
} elseif (!$comments) {
?>%%
===##Link()## method in Wakka class===
File: **##/wikka.php##**
Change these lines:
%%(php)<?php
// check for email addresses
if (preg_match("/^.+\@.+$/", $tag))
?>%%
to:
%%(php)<?php
// check for email addresses
if (preg_match("/^".$this->RE_AddrSpec()."$/", $tag))
?>%%
This will match the default "Internet" address format or whatever is configured in wikka.config.php; optionally, pass a format override to the ##RE_AddrSpec()## method, for instance 2 for a very generic pattern that is still RFC 2822 compliant (but not necessarily usable as an Internet email address!).
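To see why the original test in ##Link()## is worth replacing, here is a small sketch that runs the old pattern against a few strings; ##LooksLikeEmailOld()## and the sample strings are made up for this illustration:
%%(php)<?php
/**
 * Sketch (hypothetical helper): the old Link() test, wrapped for easy experimenting.
 */
function LooksLikeEmailOld($tag)
{
	// "anything, an @, anything": this is all the old pattern requires
	return (bool) preg_match("/^.+\@.+$/",$tag);
}

$samples = array('jane@example.com','not an @ address','a@@b','@example.com');
foreach ($samples as $sample)
{
	printf("%-20s %s\n",$sample,LooksLikeEmailOld($sample) ? 'treated as email' : 'not an email');
}
?>%%
Only the last sample is rejected (nothing precedes the @); the two strings in the middle are accepted although neither is an address anyone could mail to.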
====TODO====
===""WikkaMail()"" method===
I'm still working on this but it needs a mention here since the documentation for the toolkit parts above refers to it. Something along these lines, to give you an idea of what I'm working on:
%%(php)<?php
/**
* Platform-independent smart email
*
* Provides <i>some</i> protection against CRLF injection; uses platform-dependent line
* separators for body and headers, regardless where email elements are coming from (included
* file, function output, user input...); allows "friendly" To: addresses and adds these to
* headers; returns output value from mail().
*
* More when it's finished...
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 0.5
*
* @param string $to Required.
* Addressee(s) in comma-delimited list (see description)
* @param string $subject Required.
* Email subject
* @param string $body Required.
* Email body text
* @param string $headers Optional.
* Additional headers (e.g., From: ); default ''
* @param string $extra Optional.
* Extra switches for MTA program (e.g., sendmail); default ''
* @param boolean $debug Optional.
* Debug mode; default FALSE
* @return boolean TRUE on success, FALSE on failure
*/
function WikkaMail($to,$subject,$body,$headers=NULL,$extra=NULL,$debug=FALSE)
{
// ... later ... still working on it
}
?>%%
When finished, this can then be used in the FeedBack and EmailPassword actions (or anything else that needs to send an email).
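Until ##""WikkaMail()""## exists, the CRLF protection it promises can be illustrated with what the toolkit already has. The sketch below is an assumption about how a header value might be prepared, not the eventual implementation; ##NoCrlfSketch()## mirrors the ##""NoCrlf()""## method as a plain function and ##BuildFromHeader()## is a hypothetical helper:
%%(php)<?php
// same pattern as the PATTERN_NL define from step 1
if (!defined('PATTERN_NL')) define('PATTERN_NL',"/(\r?\n)|\r/");

/**
 * Sketch: standalone equivalent of the NoCrlf() method.
 */
function NoCrlfSketch($string)
{
	return preg_replace(PATTERN_NL,' ',$string);
}

/**
 * Sketch (hypothetical helper): build a From: header from user input.
 * Every newline is replaced before the value gets anywhere near a header,
 * so injected text cannot start a header line of its own.
 */
function BuildFromHeader($name,$email)
{
	$name  = trim(NoCrlfSketch($name));
	$email = trim(NoCrlfSketch($email));
	return 'From: '.$name.' <'.$email.'>';
}

// an attacker tries to smuggle in an extra Bcc: header via the name field
echo BuildFromHeader("Eve\r\nBcc: victim@example.com",'eve@example.com');
// prints: From: Eve Bcc: victim@example.com <eve@example.com>
?>%%
The injected text is still present, but only as harmless content of a single header line; validating ##$email## with ##""IsValidEmail()""## afterwards would reject it where it matters.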
==References==
~-General:
~~-[[http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License]]
~~-[[http://www.phpdoc.org/ phpDocumentor]]
~~-[[http://www.faqs.org/rfcs/rfc2822.html RFC 2822 - Internet Message Format]] (Proposed standard)
~~-[[http://www.faqs.org/rfcs/rfc1035.html RFC 1035 - Domain names - implementation and specification]]
~~-[[http://www.faqs.org/rfcs/rfc3696.html RFC 3696 - Application Techniques for Checking and Transformation of Names]] (Informational)
~-CRLF and SQL injection:
~~-[[http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities]]
~~-[[http://www.unixwiz.net/techtips/sql-injection.html SQL Injection Attacks by Example]]
~~-[[http://msdn.microsoft.com/msdnmag/issues/04/09/SQLInjection/default.aspx Stop SQL Injection Attacks Before They Stop You]]
-- JavaWoman
----
==categories==
CategoryDevelopment
Revision [4241]
Edited on 2005-01-07 23:40:15 by JavaWoman [Adding (and reorganizing) links (SQL injection - thanks GeorgePetsagourakis!)]
Additions:
~-General:
~~-[[http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License]]
~~-[[http://www.phpdoc.org/ phpDocumentor]]
~~-[[http://www.faqs.org/rfcs/rfc2822.html RFC 2822 - Internet Message Format]] (Proposed standard)
~~-[[http://www.faqs.org/rfcs/rfc1035.html RFC 1035 - Domain names - implementation and specification]]
~~-[[http://www.faqs.org/rfcs/rfc3696.html RFC 3696 - Application Techniques for Checking and Transformation of Names]] (Informational)
~-SQL injection:
~~-[[http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities]]
~~-[[http://www.unixwiz.net/techtips/sql-injection.html SQL Injection Attacks by Example]]
~~-[[http://msdn.microsoft.com/msdnmag/issues/04/09/SQLInjection/default.aspx Stop SQL Injection Attacks Before They Stop You]]
Deletions:
~-[[http://www.phpdoc.org/ phpDocumentor]]
~-[[http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities]]
~-[[http://www.faqs.org/rfcs/rfc2822.html RFC 2822 - Internet Message Format]] (Proposed standard)
~-[[http://www.faqs.org/rfcs/rfc1035.html RFC 1035 - Domain names - implementation and specification]]
~-[[http://www.faqs.org/rfcs/rfc3696.html RFC 3696 - Application Techniques for Checking and Transformation of Names]] (Informational)
Deletions:
* @license http://www.gnu.org/copyleft/lesser.html GNU esser General Public License
* @license http://www.gnu.org/copyleft/lesser.html GNU esser General Public License
Additions:
elseif (!$this->IsValidEmail($email))
Deletions:
Additions:
==References==
~-[[http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License]]
~-[[http://www.phpdoc.org/ phpDocumentor]]
~-[[http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities]]
~-[[http://www.faqs.org/rfcs/rfc2822.html RFC 2822 - Internet Message Format]] (Proposed standard)
~-[[http://www.faqs.org/rfcs/rfc1035.html RFC 1035 - Domain names - implementation and specification]]
~-[[http://www.faqs.org/rfcs/rfc3696.html RFC 3696 - Application Techniques for Checking and Transformation of Names]] (Informational)
-- JavaWoman
----
==categories==
CategoryDevelopment
Deletions:
Additions:
===""WikkaMail()"" method===
Deletions:
Additions:
Create an "email section" in the ##Wakka## class by adding this right before the %% // VARIABLES%% line:
Copy the following three functions (including documentation blocks and without <?php and ?>) into the (new) EMAIL section in the ##Wakka## class:
Deletions:
Copy the following three functions (including documentation blocks and without <?php and ?>) into the (new) EMAIL section in the wakka class:
Additions:
Currently there is only (limited) ""JavaScript"" validation for Admin's email address. The procedure (**##setup/default.php##**) should at least have validation in PHP as well; I'm only suggesting an approach here, not giving full code:
This will match the default "Internet" address format or whatever is configured in wikka.config.php; optionally, provide a format override in the ##RE_AddrSpec()## method, for instance 2 for a very generic pattern that is still RFC 2822 compliant (but not necessarily usable as an Internet email address!).
===WikkaMail() method===
I'm still working on this but it needs a mention here since the documentation for the toolkit parts above refers to it. Something along these lines, to give you an idea of what I'm working on:
* Platform-independent smart email
* Provides <i>some</i> protection against CRLF injection; uses platform-dependent line
* separators for body and headers, regardless where email elements are coming from (included
* file, function output, user input...); allows "friendly" To: addresses and adds these to
* headers; returns output value from mail().
* More when it's finished...
* @version 0.5
* @param string $to Required.
* Addressee(s) in comma-delimited list (see description)
* @param string $subject Required.
* Email subject
* @param string $body Required.
* Email body text
* @param string $headers Optional.
* Additional headers (e.g., From: ); default ''
* @param string $extra Optional.
* Extra switches for MTA program (e.g., sendmail); default ''
* @param string $debug Optional.
* Debug mode (e.g., sendmail); default FALSE
* @return boolean TRUE on success, FALSE on failure
function WikkaMail($to,$subject,$body,$headers=NULL,$extra=NULL,$debug=FALSE)
// ... later ... still working on it
When finished, this can then be used in the FeedBack and EmailPassword actions (or anything else that needs to send an email).
-- JavaWoman
Deletions:
This will match the default "Internet" address format or whatever is configured in wikka.config.php; optionally, provide a format override in the RE_AddrSpec() function, for instance 2 for a very generic pattern that is still RFC 2822 compliant (but not necessarily usable as an Internet email address!).
Additions:
[**NOTE**: the syntax highlighting below is nice, but makes a mess of the tabs used for formatting; try copying the code from the **source** of this page instead of from the rendered version: the tabs are still there and will make the code more readable!]
[**NOTE**: the syntax highlighting below is nice, but makes a mess of the tabs used for formatting; try copying the code from the **source** of this page instead of from the rendered version: the tabs are still there and will make the code more readable!]
===Installation===
Currently there is only (limited) JavaScript validation for Admin's email address. The procedure (**##setup/default.php##**) should at least have validation in PHP as well; I'm only suggesting an approach here, not giving full code:
if (!IsValidEmail_func($email,0)) // 0 = default "Internet" format; use whatever format is needed
// report problem
else
// continue...
Note that since we don't have a configuration **yet** at this point, we will need to specify which validation format is to be used, unless the default format is what is desired for the installation.
===Configuration===
If you are working in an Intranet and standard Internet email addresses are not used, create an entry in wikka.config.php with the name email_format and a value between 1 and 5 (see **##RE_AddrSpec()##** documentation above); e.g.:
"email_format" => "4", # name@server
===User Settings===
File: **##actions/usersettings.php##**
==Update block==
Starts at: //##""// is user trying to update?""##//
Change as follows:
// is user trying to update?
if (isset($_REQUEST["action"]) && ($_REQUEST["action"] == "update"))
$email = trim($this->NoCrlf($_POST["email"]));
if ('' == $email)
$mailerror = "You must specify an email address";
elseif (!$this->IsValidEmail_func($email))
$mailerror = $email." - that email format is not supported by this system";
else
$this->Query("update ".$this->config["table_prefix"]."users set ".
"email = '".mysql_real_escape_string($email)."', ".
"doubleclickedit = '".mysql_real_escape_string($_POST["doubleclickedit"])."', ".
"show_comments = '".mysql_real_escape_string($_POST["show_comments"])."', ".
"revisioncount = '".mysql_real_escape_string($_POST["revisioncount"])."', ".
"changescount = '".mysql_real_escape_string($_POST["changescount"])."' ".
"where name = '".$user["name"]."' limit 1");
$this->SetUser($this->LoadUser($user["name"]));
// forward
$this->SetMessage("User settings stored!");
$this->Redirect($this->href());
==Update form==
Insert after the first table row (**including** <?php and ?>!):
%%(php)
<?php
if (isset($mailerror))
print("<tr><td></td><td><div class=\"error\">".$this->Format($mailerror)."</div></td></tr>\n");
?>
==Create new account==
Starts at: //##""// otherwise, create new account""##//
Change first section as follows:
else
$name = trim($this->NoCrlf($_POST["name"]));
$email = trim($this->NoCrlf($_POST["email"]));
$password = $_POST["password"];
$confpassword = $_POST["confpassword"];
// check if name is WikiName style
if (!$this->IsWikiName($name)) $error = "User name must be WikiName formatted!";
else if ('' == $email) $error = "You must specify an email address.";
else if (!$this->IsValidEmail($email)) $error = "That email address format is not supported by this system.";
else if ($confpassword != $password) $error = "Passwords didn't match.";
else if (preg_match("/ /", $password)) $error = "Spaces aren't allowed in passwords.";
else if (strlen($password) < 5) $error = "Password too short.";
else
{
===Feedback===
File: **##actions/feedback.php##**
Change first section as follows (note we get any input first and "sanitize" it before validation):
$name = trim($this->NoCrlf($_POST["name"]));
$email = trim($this->NoCrlf($_POST["email"]));
$comments = $_POST["comments"];
$form = '<p>Fill in the form below to send us your comments:</p>
<form method="post" action="'.$this->tag.'?mail=result">
Name: <input name="name" value="'.$name.'" type="text" /><br />
Email: <input name="email" value="'.$email.'" type="text" /><br />
Comments:<br />
<textarea name="comments" rows="15" cols="45">'.$comments.'</textarea><br />
<input type="submit" value="Send" />
</form>';
if ($_GET["mail"]=="result") {
if ('' == $name) {
// a valid name must be entered
echo "<p class=\"error\">Please enter your name</p>";
echo $form;
} elseif ('' == $email) {
echo "<p class=\"error\">You must specify an email address</p>";
echo $form;
} elseif (!$this->IsValidEmail($email)) {
// a valid email address must be entered
echo "<p class=\"error\">That email address format is not supported by this system</p>";
echo $form;
} elseif (!$comments) {
===##Link()## method in Wakka class===
File: **##/wikka.php##**
Change these lines:
// check for email addresses
if (preg_match("/^.+\@.+$/", $tag))
to:
// check for email addresses
if (preg_match("/^".$this->RE_AddrSpec()."$/", $tag))
This will match the default "Internet" address format or whatever is configured in wikka.config.php; optionally, provide a format override in the RE_AddrSpec() function, for instance 2 for a very generic pattern that is still RFC 2822 compliant (but not necessarily usable as an Internet email address!).
====TODO====
Deletions:
Additions:
All released under the [[http://www.gnu.org/copyleft/lesser.html LGPL]]. You could use this with minor changes in other web projects that need email functionality as well.
I'm documenting everything now in the [[http://www.phpdoc.org/ phpDocumentor]] format; please keep the documentation *with* the code. The documentation is readable (I think) even if it's not processed by phpDocumentor; and contains important information about what is supported (and why) and what is explicitly **not** supported, as well as how to use the functions. And, of course, it contains copyright and license information. :)
[Note: the syntax highlighting below is nice, but makes a mess of the tabs used for formatting; try copying the code from the **source** of this page instead of from the rendered version.]
Define patterns: the idea is to define a pattern only **once** so it can be used consistently in different places.
**pattern defines**
define('PATTERN_INT','/^[0-9]+$/'); # integer defined as string
(We'll add more here later.)
===Create an EMAIL section===
Create an "email section" in the wakka class by adding this right before the %% // VARIABLES%% line:
**email section**
%%
//EMAIL
%%
===Functions===
The toolkit currently consists of the pattern defines (above) and three functions which make use of them. Reason for the functions and their usage are covered in their documentation blocks.
Copy the following three functions (including documentation blocks and without <?php and ?>) into the (new) EMAIL section in the wakka class:
**""NoCrlf()"" method**
/**
* Replace CR and/or LF by space in user input to prevent CRLF injection in PHP email forms.
*
* Email forms (actions) that allow a user to enter a To: or From: email address and/or
* a name and/or a subject --in general fields to be used in constructing an email header--
* may be susceptible to CRLF injection which would allow an attacker to send arbitrary email
* to arbitrary addressees.
* Simply replacing any form of "newline" in such input by a space makes such an attempt
* futile.<br>
* Function inspired by article {@link http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities}
* but implemented differently.
*
* Usage:
* <ul>
* <li> Copy this whole file to the EMAIL section of the Wakka class</li>
* <li> Apply to every user-supplied value for email address, name or subject (anything that is,
* or CAN BE used in an email header!) to guard against this</li>
* </ul>
* Use as follows:
* <code>
* // get input
* // ....
* $input = trim($this->NoCrlf($input));
* </code>
* or directly as:
* <code>
* $email = trim($this->NoCrlf($_POST['email']));
* // get other variables from a submitted form
* // ...
* </code>
* Note that {@link trim()} is applied <i>after</i> applying this function to get rid of
* whitespace at start and end of the resulting string.
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses PATTERN_NL to recognize any type of "newline"
*
* @param string $string Required.
* User input to be sanitized
* @return string sanitized input
*/
function NoCrlf($string)
{
return preg_replace(PATTERN_NL,' ',$string);
}
**RE_AddrSpec() method**
/**
* Builds an RE that can be used to validate an email address or to recognize something that
* "looks like" an email address.
*
* This function builds a regular expression to enable validation of a string as a valid email
* address or to recognize something that "looks like" an email address, based on applicable
* Internet standards (notably RFC 2822 --officially a "Proposed standard", replacing RFC 822--
* and RFC 1035). The regular expression returned is Perl-compatible for use in PHP's preg_...
* functions but does NOT include delimiters; this is to allow the RE to be used as part of a
* larger RE which could be used to match a string of which an actual email address is only
* part.
*
* This function is designed such that:
* <ul>
* <li> an email address that matches an RE generated by this function is guaranteed to be
* conforming to the format standard(s) specified to the function (using RFC 2822 rules and
* (if specified) the RFC 1035 "Preferred format" for the domain part);</li>
* <li> an email address that is found to NOT match the RE generated by this function <i>may</i>
* still be conforming to the format standard(s) specified.</li>
* </ul>
*
* Error reporting:<br>
* This design implies that a user-supplied email address that matches a generated RE SHOULD be
* silently accepted.<br>
* Conversely when a user-supplied email address does not match this SHOULD NOT result in an
* error message suggesting the address is "invalid" (it may not be); any error message SHOULD
* only indicate that the address format in question is "not supported" by the application
* using this function.
*
* Standards compliance:<br>
* The RE is built using building blocks based on the production rules as specified in:
* <ul>
* <li> RFC 2822 section 3.4.1 for the address format: 'addr-spec = local-part "@" domain'</li>
* <li> RFC 2822 section 3.2.4 for the 'local-part': using 'atext', 'atom' and 'dot-atom' (using
* a subset of the full production ruleset)</li>
* <li> RFC 2822 section 3.4.1 (dot-atom) <b>or</b> RFC 1035 section 3.5 for the 'domain'
* part</li>
* </ul>
*
* Note that the domain syntax as specified in RFC 1035 section 3.5 is merely a
* "Preferred format"; we use it here because this is the generally accepted (and widely
* enforced) format.
*
* By means of interfacing with external configuration and a possible override with the
* $email_format parameter, considerable flexibility in selecting an applicable format is
* provided while still returning a standards-compliant email address pattern.
*
* Explicitly NOT SUPPORTED are:
* <ul>
* <li> whitespace and comments ([CFWS]) in an email address (though allowed by RFC 2822 section
* 3.2.4)</li>
* <li> "quoted string" (strings of characters not allowed in the 'atom' production rule in
* section 3.2.4 RFC 2822) - these are considered "obsolete" in RFC 2822 although allowed
* </li>
* <li> domain literals instead of domain name; e.g., [10.0.0.67] (though allowed by RFC 2822
* section 3.4.1)</li>
* <li> "internationalized" domain names (see RFC 3490 and related RFCs: these are still very
* much proposals, not a standard yet)</li>
* <li> any check that a 'local part' is no longer than 64 characters (? mentioned in RFC 3696;
* no other reference found)</li>
* <li> any check that a domain name is no more than 255 bytes long (RFC 1035 section
* 2.3.4)</li>
* </ul>
*
* Behavior:
* <ul>
* <li> If no format is specified, the function delivers the default format</li>
* <li> If a valid format (0-5) is specified in the configuration variable 'email_format', this
* is used but:</li>
* <li> If a valid format (0-5) is specified in the $email_format parameter, this is used,
* overriding anything specified in the configuration; 0 specifies "default format" so it
* can override whatever is specified in the configuration</li>
* </ul>
*
* Formats supported:
* <ul>
* <li> ALL - local-part ('mailbox name'): RFC 2822 compliant but without support for
* whitespace, comments or "quoted string";</li>
* <li> 0-4 - local-part MUST be followed by a '@' to separate it from the domain part</li>
* <li> 0 [default] - domain: RFC 1034/1035 compliant 'domain' consisting of at least two labels
* results in the most "generally acceptable" format for an Internet email
* address</li>
* <li> 1 - domain: RFC 1034/1035 compliant but consisting of one or more labels
* allows relative domain (such as using single server name) while still being RFC
* 1035 compliant if a domain is attached</li>
* <li> 2 - domain: RFC 2822 compliant but consisting of at least two labels</li>
* <li> 3 - domain: RFC 2822 compliant</li>
* <li> 4 - domain: RFC 1035 compliant but allowing only a single level (an internal server
* name); use 1 if multiple levels are needed</li>
* <li> 5 - domain: NOT allowed (only 'user name', no '@' or domain accepted)</li>
* </ul>
*
* Formats 2-5 are specifically intended for Intranet use while 1 may be used for Intranets
* using relative domains (server names) that still need to result in an RFC 1035 compliant
* domain when a domain is appended for external use.
* To see what it's producing, add the following line just before the result is returned:
* <code>
* echo "resulting RE:<br/>$re<br/><br/>";
* </code>
*
* Usage:<br>
* The function deliberately does not include delimiters in its output to enable it to be used
* as a building block for a larger RE. However, it takes care that / is escaped enabling / to
* be used as delimiter. This results in the following usage patterns:
* <ul>
* <li> to use as building block:</li>
* </ul>
* <code>
* $this->RE_AddrSpec() // (optionally provide parameter)
* </code>
* <ul>
* <li> to use as pattern in any of the preg_... functions, add the / delimiters (and optionally
* 'start' and 'end' delimiters) first:</li>
* </ul>
* <code>
* $pattern = '/^'.$this->RE_AddrSpec().'$/';
* $is_match = preg_match($pattern,$email);
* </code>
* It is NOT necessary to add the i modifier to the pattern since the RE itself already
* takes care of case-insensitivity as per the standards used.
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses PATTERN_INT to validate a format specification value as "integer"
* @uses Wakka::$config['email_format'] to get specified email format;
* same rules apply as for parameter $email_format
*
* @param integer $email_format Optional.
* If specified must be 0-5; specifies which format is to be used
* (NULL (default) and integer string allowed); overrides optional
* Wakka::$config['email_format'].
* @return string RE to be used for validation or as building block for a larger RE
*/
function RE_AddrSpec($email_format=NULL)
{
// Which format do we want to validate against? We filter out invalid parameter and config values and then allow parameter to override a config value
// ignore invalid parameter (but allow integer value specified as string)
if (preg_match(PATTERN_INT,$email_format)) $email_format = (int)$email_format;
if (!is_int($email_format) || $email_format > 5 || $email_format < 0) $email_format = NULL;
// ignore invalid config value (but allow integer value specified as string)
$cfg_email_format = $this->config['email_format'];
if (preg_match(PATTERN_INT,$cfg_email_format)) $cfg_email_format = (int)$cfg_email_format;
if (!is_int($cfg_email_format) || $cfg_email_format > 5 || $cfg_email_format < 0) $cfg_email_format = NULL;
// pick up config value if parameter not specified (or invalid)
if (!isset($email_format)) $email_format = $cfg_email_format;
// RFC 2822: Email
$atextchars = "!#$%&'*+/=?^_`{|}~-"; # all non-alphanumeric characters allowed in 'atext' of an 'atom' (RFC 2822); hyphen last so it cannot form a range
$atom = 'A-Za-z0-9'.preg_quote($atextchars,'/'); # alphanumeric ranges are kept out of preg_quote() (which may escape the hyphen); matches 'atom' but excludes allowed whitespace and comments ([CFWS])
$dot_atom = '['.$atom.']+(\.['.$atom.']+)*'; # dot-atom as allowed for local part of an email address
$local_part_rfc2822 = $dot_atom; # dot-atom for local part; no [CFWS]
$domain_rfc2822 = $dot_atom; # domain part as allowed per RFC 2822 but excluding domain literals and [CFWS]
$domain_dot_atom = '['.$atom.']+(\.['.$atom.']+)+'; # dot-atom domain part but requiring at least two levels and excluding domain literals and [CFWS]
// RFC 1035: Domains (Preferred format)
$domain_labelchars = "A-Za-z0-9-"; # all characters allowed in a "label": letters, digits and a hyphen (no escaping needed here)
$domain_labelstart = "A-Za-z"; # label must start with a letter
$domain_labelend = "A-Za-z0-9"; # label cannot end in hyphen
$domain_label_rfc1035 = '['.$domain_labelstart.'](['.$domain_labelchars.']{0,61}['.$domain_labelend.'])?';
# conforms to RFC 1035; max 63 characters in a label
$domain_rfc1035 = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*'; # string of one or more dot-separated labels
$domain_rfc1035_abs = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*\.?'; # explicitly allows terminating dot to specify absolute domain
$domain_rfc1035_multi = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')+\.?'; # as $domain_rfc1035_abs but requires at least two labels (the most general case for addresses used on the Internet)
// build RE to match as specified (or default)
switch ($email_format)
{
// default: "Internet" email address
case NULL:
case 0:
$re = $local_part_rfc2822.'@'.$domain_rfc1035_multi;# strict Internet address; absolute assumed even if ending dot not present
break;
case 1:
$re = $local_part_rfc2822.'@'.$domain_rfc1035; # also usable for internal address (allows single label); syntactically always relative (no ending dot allowed)
break;
// all other specified formats for Intranet use *only*
case 2:
$re = $local_part_rfc2822.'@'.$domain_dot_atom; # domain pattern as per RFC 2822 but requires at least two levels
break;
case 3:
$re = $local_part_rfc2822.'@'.$domain_rfc2822; # domain pattern as per RFC 2822; also allows a single label (server name)
break;
case 4:
$re = $local_part_rfc2822.'@'.$domain_label_rfc1035;# domain pattern as per RFC 1035 but allows only single label (server name); use 1 if more levels are needed
break;
case 5:
$re = $local_part_rfc2822; # just a name, no server
break;
}
// return the resulting RE
return $re;
}
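The format selection at the top of RE_AddrSpec() (a valid parameter overrides the config value, invalid values are ignored) can be sketched in isolation. This is only an illustration: ##normalize_format()## and ##resolve_email_format()## are hypothetical helper names that are not part of the toolkit, and the config value is passed in as a plain argument instead of being read from the Wakka config.

```php
<?php
// Same define as in the pattern defines block.
define('PATTERN_INT', '/^[0-9]+$/');

// Hypothetical helper: returns an integer 0-5, or NULL for anything else.
function normalize_format($value)
{
	// allow an integer given as a string, e.g. '2'
	if (is_string($value) && preg_match(PATTERN_INT, $value)) {
		$value = (int) $value;
	}
	if (!is_int($value) || $value < 0 || $value > 5) {
		return NULL;
	}
	return $value;
}

// Hypothetical helper: a valid parameter wins, otherwise the config value is used.
function resolve_email_format($param, $config_value)
{
	$param = normalize_format($param);
	$config_value = normalize_format($config_value);
	return isset($param) ? $param : $config_value;
}

var_dump(resolve_email_format(NULL, '2'));	// int(2): config value used
var_dump(resolve_email_format(0, 4));		// int(0): parameter overrides config
var_dump(resolve_email_format(9, 3));		// int(3): invalid parameter ignored
var_dump(resolve_email_format('x', NULL));	// NULL: the switch then falls through to the default format
```

Note that 0 is a valid value here, which is what lets a caller force the default format even when the configuration specifies something else.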
**""IsValidEmail()"" method**
/**
* Check whether a supplied email address is syntactically valid.
*
* The function serves as a wrapper around Wakka::RE_AddrSpec() to enable validation of a
* user-supplied email address. Best used when the address is already "sanitized" with
* {@link Wakka::NoCrlf()} and subsequently trimmed to get rid of any surrounding whitespace.
*
* Usage example:
* <code>
* $email = trim($this->NoCrlf($_POST['email']));
* if (!$this->IsValidEmail($email))
* {
* // report problem
* }
* else
* {
* // continue...
* }
* </code>
* See {@link Wakka::RE_AddrSpec()} documentation about Error reporting!
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses Wakka::RE_AddrSpec() to build a standards-compliant RE used for the validation
*
* @param string $email Required.
* String to be validated
* @param integer $email_format Optional.
* Passed on to {@link Wakka::RE_AddrSpec()}
* @return boolean TRUE if $email conforms to format specified in $email_format, FALSE
* if not
*/
function IsValidEmail($email,$email_format=NULL)
{
$pattern = '/^'.$this->RE_AddrSpec($email_format).'$/';
return (preg_match($pattern,$email) === 1); # preg_match() returns an integer (or FALSE on error); make it the documented boolean
}
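The error reporting advice in the RE_AddrSpec() documentation (report "not supported", never "invalid") could look roughly like this. It is a sketch under assumptions: ##default_addr_spec()## is a simplified, hand-written stand-in for the default format (0) so the example runs outside the Wakka class, and ##email_problem()## and the message text are made up for illustration.

```php
<?php
// Simplified stand-in for RE_AddrSpec(0): dot-atom local part, RFC 1035 style
// domain with at least two labels. Not the toolkit function itself.
function default_addr_spec()
{
	$atext = 'A-Za-z0-9' . preg_quote("!#$%&'*+/=?^_`{|}~-", '/');
	$dot_atom = '[' . $atext . ']+(\.[' . $atext . ']+)*';
	$label = '[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?';
	return $dot_atom . '@' . $label . '(\.' . $label . ')+\.?';
}

// Hypothetical helper: returns an empty string if accepted, else a message.
function email_problem($email)
{
	$pattern = '/^' . default_addr_spec() . '$/';
	if (preg_match($pattern, $email) === 1) {
		return '';	// silently accept
	}
	// deliberately NOT "invalid": the address may be fine, we just don't support it
	return 'Sorry, this address format is not supported here.';
}

$samples = array('jane.doe@example.com', 'jane@3com.example', 'jane@localhost');
foreach ($samples as $email) {
	$problem = email_problem($email);
	echo $email, ': ', ($problem === '' ? 'accepted' : $problem), "\n";
}
```

The second sample is the interesting one: a label starting with a digit is rejected by the "Preferred format", although such host names do exist in practice, so calling the address "invalid" would be wrong.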
====Toolkit implementation - step 2====
Now that we have the defines and the functions available we can start to apply them.
Note that while the functions themselves are fully tested, the code for the implementation suggestions below is **untested**; this is because I'm actually working on complete replacements for the actions involved (which I will share when finished, of course). So **use at your own risk**: please test before making it live (and do let me know if there are any problems).
I'm documenting everything now in the [[http://www.phpdoc.org/ phpDocumentor]] format; please keep the documentation *with* the code. The documentation is readable (I think) even if it's not processed by phpDocumentor; and contains important information about what is supported (and why) and what is explicitly **not** supported, as well as how to use the functions. And, of course, it contains copyright and license information. :)
[Note: the syntax highlighting below is nice, but makes a mess of the tabs used for formatting; try copying the code from the **source** of this page instead of from the rendered version.]
Define patterns: the idea is to define a pattern only **once** so it can be used consistently in different places.
**pattern defines**
define('PATTERN_INT','/^[0-9]+$/'); # integer defined as string
(We'll add more here later.)
===Create an EMAIL section===
Create an "email section" in the wakka class by adding this right before the %% // VARIABLES%% line:
**email section**
%%
%%
===Functions===
The toolkit currently consists of the pattern defines (above) and three functions which make use of them. Reason for the functions and their usage are covered in their documentation blocks.
Copy the following three functions (including documentation blocks and without <?php and ?>) into the (new) EMAIL section in the wakka class:
**""NoCrlf()"" method**
/**
* Replace CR and/or LF by space in user input to prevent CRLF injection in PHP email forms.
*
* Email forms (actions) that allow a user to enter a To: or From: email address and/or
* a name and/or a subject --in general fields to be used in constructing an email header--
* may be susceptible to CRLF injection which would allow an attacker to send arbitrary email
* to arbitrary addressees.
* Simply replacing any form of "newline" in such input by a space makes such an attempt
* futile.<br>
* Function inspired by article {@link http://www.securiteam.com/unixfocus/6F00Q0K6AK.html PHP-Nuke mail CRLF Injection Vulnerabilities}
* but implemented differently.
*
* Usage:
* <ul>
* <li> Copy this whole file to the EMAIL section of the Wakka class</li>
* <li> Apply to every user-supplied value for email address, name or subject (anything that is,
* or CAN BE used in an email header!) to guard against this</li>
* </ul>
* Use as follows:
* <code>
* // get input
* // ....
* $input = trim($this->NoCrlf($input));
* </code>
* or directly as:
* <code>
* $email = trim($this->NoCrlf($_POST['email']));
* // get other variables from a submitted form
* // ...
* </code>
* Note that {@link trim()} is applied <i>after</i> applying this function to get rid of
* whitespace at start and end of the resulting string.
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses PATTERN_NL to recognize any type of "newline"
*
* @param string $string Required.
* User input to be sanitized
* @return string sanitized input
*/
function NoCrlf($string)
{
return preg_replace(PATTERN_NL,' ',$string);
}
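To see what this protects against, here is a small standalone sketch. The function is copied out of the class as ##no_crlf()## (a name used only for this example) and the "subject" value is a made-up injection attempt.

```php
<?php
// Same pattern as the PATTERN_NL define from step 1.
define('PATTERN_NL', "/(\r?\n)|\r/");

// Standalone copy of NoCrlf() for illustration; the toolkit version is a Wakka method.
function no_crlf($string)
{
	return preg_replace(PATTERN_NL, ' ', $string);
}

// Hypothetical form input trying to smuggle an extra header into the mail.
$subject = "Hello\r\nBcc: victim@example.com";
$clean = trim(no_crlf($subject));

echo $clean, "\n";	// prints: Hello Bcc: victim@example.com
```

The CRLF pair is replaced by a single space, so the would-be Bcc: header ends up as harmless text inside the subject line.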
**RE_AddrSpec() method**
/**
* Builds an RE that can be used to validate an email address or to recognize something that
* "looks like" an email address.
*
* This function builds a regular expression to enable validation of a string as a valid email
* address or to recognize something that "looks like" an email address, based on applicable
* Internet standards (notably RFC 2822 --officially a "Proposed standard", replacing RFC 822--
* and RFC 1035). The regular expression returned is Perl-compatible for use in PHP's preg_...
* functions but does NOT include delimiters; this is to allow the RE to be used as part of a
* larger RE which could be used to match a string of which an actual email address is only
* part.
*
* This function is designed such that:
* <ul>
* <li> an email address that matches an RE generated by this function is guaranteed to be
* conforming to the format standard(s) specified to the function (using RFC 2822 rules and
* (if specified) the RFC 1035 "Preferred format" for the domain part);</li>
* <li> an email address that is found to NOT match the RE generated by this function <i>may</i>
* still be conforming to the format standard(s) specified.</li>
* </ul>
*
* Error reporting:<br>
* This design implies that a user-supplied email address that matches a generated RE SHOULD be
* silently accepted.<br>
* Conversely when a user-supplied email address does not match this SHOULD NOT result in an
* error message suggesting the address is "invalid" (it may not be); any error message SHOULD
* only indicate that the address format in question is "not supported" by the application
* using this function.
*
* Standards compliance:<br>
* The RE is built using building blocks based on the production rules as specified in:
* <ul>
* <li> RFC 2822 section 3.4.1 for the address format: 'addr-spec = local-part "@" domain'</li>
* <li> RFC 2822 section 3.2.4 for the 'local-part': using 'atext', 'atom' and 'dot-atom' (using
* a subset of the full production ruleset)</li>
* <li> RFC 2822 section 3.4.1 (dot-atom) <b>or</b> RFC 1035 section 2.3.1 (same syntax as
* RFC 1034 section 3.5) for the 'domain' part</li>
* </ul>
*
* Note that the domain syntax as specified in RFC 1035 section 2.3.1 (and RFC 1034 section
* 3.5) is merely a "Preferred format"; we use it here because this is the generally accepted
* (and widely enforced) format.
*
* By means of interfacing with external configuration and a possible override with the
* $email_format parameter, considerable flexibility in selecting an applicable format is
* provided while still returning a standards-compliant email address pattern.
*
* Explicitly NOT SUPPORTED are:
* <ul>
* <li> whitespace and comments ([CFWS]) in an email address (though allowed by RFC 2822 section
* 3.2.4)</li>
* <li> "quoted string" (strings of characters not allowed in the 'atom' production rule in
* section 3.2.4 RFC 2822) - these are considered "obsolete" in RFC 2822 although allowed
* </li>
* <li> domain literals instead of domain name; e.g., [10.0.0.67] (though allowed by RFC 2822
* section 3.4.1)</li>
* <li> "internationalized" domain names (see RFC 3490 and related RFCs: these are still very
* much proposals, not a standard yet)</li>
* <li> any check that a 'local part' is no longer than 64 characters (? mentioned in RFC 3696;
* no other reference found)</li>
* <li> any check that a domain name is no more than 255 bytes long (RFC 1035 section
* 2.3.4)</li>
* </ul>
*
* Behavior:
* <ul>
* <li> If no format is specified, the function delivers the default format</li>
* <li> If a valid format (0-5) is specified in the configuration variable 'email_format', this
* is used but:</li>
* <li> If a valid format (0-5) is specified in the $email_format parameter, this is used,
* overriding anything specified in the configuration; 0 specifies "default format" so it
* can override whatever is specified in the configuration</li>
* </ul>
*
* Formats supported:
* <ul>
* <li> ALL - local-part ('mailbox name'): RFC 2822 compliant but without support for
* whitespace, comments or "quoted string";</li>
* <li> 0-4 - local-part MUST be followed by a '@' to separate it from the domain part</li>
* <li> 0 [default] - domain: RFC 1034/1035 compliant 'domain' consisting of at least two labels
* results in the most "generally acceptable" format for an Internet email
* address</li>
* <li> 1 - domain: RFC 1034/1035 compliant but consisting of one or more labels
* allows relative domain (such as using single server name) while still being RFC
* 1035 compliant if a domain is attached</li>
* <li> 2 - domain: RFC 2822 compliant but consisting of at least two labels</li>
* <li> 3 - domain: RFC 2822 compliant</li>
* <li> 4 - domain: RFC 1035 compliant but allowing only a single level (an internal server
* name); use 1 if multiple levels are needed</li>
* <li> 5 - domain: NOT allowed (only 'user name', no '@' or domain accepted)</li>
* </ul>
*
* Formats 2-5 are specifically intended for Intranet use while 1 may be used for Intranets
* using relative domains (server names) that still need to result in an RFC 1035 compliant
* domain when a domain is appended for external use.
* To see what it's producing, add the following line just before the result is returned:
* <code>
* echo "resulting RE:<br/>$re<br/><br/>";
* </code>
*
* Usage:<br>
* The function deliberately does not include delimiters in its output to enable it to be used
* as a building block for a larger RE. However, it takes care that / is escaped enabling / to
* be used as delimiter. This results in the following usage patterns:
* <ul>
* <li> to use as building block:</li>
* </ul>
* <code>
* $this->RE_AddrSpec() // (optionally provide parameter)
* </code>
* <ul>
* <li> to use as pattern in any of the preg_... functions, add the / delimiters (and optionally
* 'start' and 'end' delimiters) first:</li>
* </ul>
* <code>
* $pattern = '/^'.$this->RE_AddrSpec().'$/';
* $is_match = preg_match($pattern,$email);
* </code>
* It is NOT necessary to add the i modifier to the pattern since the RE itself already
* takes care of case-insensitivity as per the standards used.
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses PATTERN_INT to validate a format specification value as "integer"
* @uses Wakka::$config['email_format'] to get specified email format;
* same rules apply as for parameter $email_format
*
* @param integer $email_format Optional.
* If specified must be 0-5; specifies which format is to be used
* (NULL (default) and integer string allowed); overrides optional
* Wakka::$config['email_format'].
* @return string RE to be used for validation or as building block for a larger RE
*/
function RE_AddrSpec($email_format=NULL)
{
// Which format do we want to validate against? We filter out invalid parameter and config values and then allow parameter to override a config value
// ignore invalid parameter (but allow integer value specified as string)
if (preg_match(PATTERN_INT,$email_format)) $email_format = (int)$email_format;
if (!is_int($email_format) || $email_format > 5 || $email_format < 0) $email_format = NULL;
// ignore invalid config value (but allow integer value specified as string)
$cfg_email_format = $this->config['email_format'];
if (preg_match(PATTERN_INT,$cfg_email_format)) $cfg_email_format = (int)$cfg_email_format;
if (!is_int($cfg_email_format) || $cfg_email_format > 5 || $cfg_email_format < 0) $cfg_email_format = NULL;
// pick up config value if parameter not specified (or invalid)
if (!isset($email_format)) $email_format = $cfg_email_format;
// RFC 2822: Email
$atextchars = "!#$%&'*+/=?^_`{|}~-"; # all non-alphanumeric characters allowed in 'atext' of an 'atom' (RFC 2822); hyphen last so it cannot form a range
$atom = 'A-Za-z0-9'.preg_quote($atextchars,'/'); # alphanumeric ranges are kept out of preg_quote() (which may escape the hyphen); matches 'atom' but excludes allowed whitespace and comments ([CFWS])
$dot_atom = '['.$atom.']+(\.['.$atom.']+)*'; # dot-atom as allowed for local part of an email address
$local_part_rfc2822 = $dot_atom; # dot-atom for local part; no [CFWS]
$domain_rfc2822 = $dot_atom; # domain part as allowed per RFC 2822 but excluding domain literals and [CFWS]
$domain_dot_atom = '['.$atom.']+(\.['.$atom.']+)+'; # dot-atom domain part but requiring at least two levels and excluding domain literals and [CFWS]
// RFC 1035: Domains (Preferred format)
$domain_labelchars = "A-Za-z0-9-"; # all characters allowed in a "label": letters, digits and a hyphen (no escaping needed here)
$domain_labelstart = "A-Za-z"; # label must start with a letter
$domain_labelend = "A-Za-z0-9"; # label cannot end in hyphen
$domain_label_rfc1035 = '['.$domain_labelstart.'](['.$domain_labelchars.']{0,61}['.$domain_labelend.'])?';
# conforms to RFC 1035; max 63 characters in a label
$domain_rfc1035 = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*'; # string of one or more dot-separated labels
$domain_rfc1035_abs = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')*\.?'; # explicitly allows terminating dot to specify absolute domain
$domain_rfc1035_multi = $domain_label_rfc1035.'(\.'.$domain_label_rfc1035.')+\.?'; # as $domain_rfc1035_abs but requires at least two labels (the most general case for addresses used on the Internet)
// build RE to match as specified (or default)
switch ($email_format)
{
// default: "Internet" email address
case NULL:
case 0:
$re = $local_part_rfc2822.'@'.$domain_rfc1035_multi;# strict Internet address; absolute assumed even if ending dot not present
break;
case 1:
$re = $local_part_rfc2822.'@'.$domain_rfc1035; # also usable for internal address (allows single label); syntactically always relative (no ending dot allowed)
break;
// all other specified formats for Intranet use *only*
case 2:
$re = $local_part_rfc2822.'@'.$domain_dot_atom; # domain pattern as per RFC 2822 but requires at least two levels
break;
case 3:
$re = $local_part_rfc2822.'@'.$domain_rfc2822; # domain pattern as per RFC 2822; also allows a single label (server name)
break;
case 4:
$re = $local_part_rfc2822.'@'.$domain_label_rfc1035;# domain pattern as per RFC 1035 but allows only single label (server name); use 1 if more levels are needed
break;
case 5:
$re = $local_part_rfc2822; # just a name, no server
break;
}
// return the resulting RE
return $re;
}
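Because the RE comes without delimiters and anchors, it can also be dropped into a larger pattern or used to find things that "look like" an email address inside running text. A minimal sketch, assuming a hand-written stand-in for the default format (##default_addr_spec()## is not a toolkit function) and a made-up sample sentence:

```php
<?php
// Simplified stand-in for RE_AddrSpec(0), so the example runs outside the Wakka class.
function default_addr_spec()
{
	$atext = 'A-Za-z0-9' . preg_quote("!#$%&'*+/=?^_`{|}~-", '/');
	$dot_atom = '[' . $atext . ']+(\.[' . $atext . ']+)*';
	$label = '[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?';
	return $dot_atom . '@' . $label . '(\.' . $label . ')+\.?';
}

$text = 'Mail jane.doe@example.com, or bob@example.org for details';

// No ^ and $ here: we want to find addresses anywhere in the string.
preg_match_all('/' . default_addr_spec() . '/', $text, $matches);

// $matches[0] holds the full matches; the other entries are the RE's internal groups.
foreach ($matches[0] as $found) {
	echo $found, "\n";
}
```

One thing to keep in mind with this kind of use: the default format allows a terminating dot on the domain, so an address at the very end of a sentence would be matched including the full stop.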
**""IsValidEmail()"" method**
/**
* Check whether a supplied email address is syntactically valid.
*
* The function serves as a wrapper around Wakka::RE_AddrSpec() to enable validation of a
* user-supplied email address. Best used when the address is already "sanitized" with
* {@link Wakka::NoCrlf()} and subsequently trimmed to get rid of any surrounding whitespace.
*
* Usage example:
* <code>
* $email = trim($this->NoCrlf($_POST['email']));
* if (!$this->IsValidEmail($email))
* {
* // report problem
* }
* else
* {
* // continue...
* }
* </code>
* See {@link Wakka::RE_AddrSpec()} documentation about Error reporting!
*
* @author {@link http://wikka.jsnx.com/JavaWoman JavaWoman}
* @copyright Copyright © 2004, Marjolein Katsma
* @license http://www.gnu.org/copyleft/lesser.html GNU Lesser General Public License
* @version 1.0
*
* @access public
* @uses Wakka::RE_AddrSpec() to build a standards-compliant RE used for the validation
*
* @param string $email Required.
* String to be validated
* @param integer $email_format Optional.
* Passed on to {@link Wakka::RE_AddrSpec()}
* @return boolean TRUE if $email conforms to format specified in $email_format, FALSE
* if not
*/
function IsValidEmail($email,$email_format=NULL)
{
$pattern = '/^'.$this->RE_AddrSpec($email_format).'$/';
return (preg_match($pattern,$email) === 1); # preg_match() returns an integer (or FALSE on error); make it the documented boolean
}
====Toolkit implementation - step 2====
Now that we have the defines and the functions available we can start to apply them.
Note that while the functions themselves are fully tested, the code for the implementation suggestions below is **untested**; this is because I'm actually working on complete replacements for the actions involved (which I will share when finished, of course). So **use at your own risk**: please test before making it live (and do let me know if there are any problems).
Deletions:
I'm documenting everything now in the [[ phpDocumentor]] format; please keep the documentation *with* the code. The documentation is readable (I think) even if it's not processed by phpDocumentor; and contains important information about what is supported (and why) and what is explicitly **not** supported, as well as how to use the functions. And, of course, it contains copyright and license information. :)
Define patterns:
define('PATTERN_INT','/^[0-9]+$/'); # integer defined as string