Wikka And Email


Some thoughts and observations on Wikka and email.

The beginnings of a solution can now be found on WikkaEmailToolkit.

Email addresses


Wikka gathers and (often) stores email addresses at various moments. This starts at installation, when the WikiMaster provides an email address for the WikiAdmin to be stored in the configuration; every WikiVisitor signing up for an account to be able to become a WikiEditor must also provide an email address. Finally, someone providing "feedback" through the feedback form (displayed with the {{feedback}} action) must also provide an email address.

A problem I note is that various programs and actions use different validation rules for email addresses (and associated names).

Admin email address at installation


During installation the file setup/default.php requests an email address for the WikiAdmin. This address is validated with a very simple Regular Expression (RE) using JavaScript. This RE excludes many perfectly valid email addresses. The email address provided will (together with the Admin's WikiName) be stored in the [prefix-]users table as well. Although the RE does not allow many perfectly valid email addresses, at least any email address that is validated by the RE will actually be a valid address.

A possible workaround (to allow valid addresses that would be rejected) is to disable JavaScript during installation - but then the email address for a WikiAdmin is not validated at all.

[Question: can the installation be done / completed without JavaScript?]

WikiEditor email address at registration


When a WikiVisitor signs up for an account, she is asked for an email address; an email address is required and the address provided is validated with an RE.

The email validation RE used here is different from that used for Admin's email address during installation. More worrying is that the RE used is very "lax" and actually allows strings that are not syntactically valid as an email address: many characters are accepted that would be invalid in an email address.

WikiEditor email address when updating account


When a logged-in WikiUser wants to update his account, the form presented by the actions/usersettings.php file displays a field to permit changing the email address. However, no validation is performed here at all; it's possible to enter an invalid email address, or even to remove an email address completely.

Obviously, this negates the requirement of providing an email address during registration; apart from the fact that now (as opposed to registration-time) any string can be stored as "email address". It's safe from SQL injection - but that is all.

Why require an email address on registration when it can be erased immediately? Or: why not validate an email address during update when it is validated at registration?
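To illustrate the second question, here is a minimal sketch of applying one and the same rule at registration and at update. It is written in Python purely for illustration, not in Wikka's PHP; the function names, the form dictionaries and the placeholder pattern are all assumptions, not anything taken from Wikka.

```python
import re

# Deliberately simple placeholder check (an assumption, not one of Wikka's REs):
# one "@" with non-empty, whitespace-free text on both sides.
_PLACEHOLDER_RE = re.compile(r"^[^@\s]+@[^@\s]+$")


def check_email_field(value, required=True):
    """Return an error message for the user, or None if the value is acceptable."""
    value = value.strip()
    if not value:
        return "Please provide an email address." if required else None
    if not _PLACEHOLDER_RE.match(value):
        return "That does not look like a valid email address."
    return None


def register(form):
    # Registration path: the address is required and validated.
    return check_email_field(form.get("email", ""))


def update_settings(form):
    # Update path: the very same rule, so the address cannot be blanked
    # or replaced by an arbitrary string after registration.
    return check_email_field(form.get("email", ""))
```

With both paths calling the same helper, changing the rule (or making the address optional everywhere) becomes a single decision instead of two diverging ones.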

WikiEditor request temporary password


When a (registered) WikiUser forgets her password, she can request a temporary password to be sent via email. The action provided for this via actions/emailpassword.php blindly accepts whatever is stored as "email address" for the WikiName provided; this could in fact be an empty string (see "WikiEditor email address when updating account" above) or an invalid email address (see "WikiEditor email address at registration" above).

In addition, an email is "sent" to the presumed address as stored in the database (which may be empty) but there is no error trapping for the PHP mail() function used. If this WikiUser also forgot that an invalid email address was stored (or none at all - see above), she would be none the wiser as the application would not tell her there was a problem (or rather, an error message might result - but one formulated in language for a WikiMaster, not aimed at a WikiEditor end user).
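A rough sketch of what error trapping could look like, again in Python for illustration only. The `send_mail` callable is hypothetical: it is assumed to return True or False the way PHP's mail() reports whether the message was accepted for delivery. The message texts are made up as well.

```python
def send_temporary_password(stored_address, password, send_mail):
    """Return (ok, message_for_user).

    send_mail is a hypothetical callable(address, subject, body) returning
    True on success and False on failure.
    """
    address = (stored_address or "").strip()
    # Refuse early when nothing usable is stored for this account.
    if "@" not in address:
        return (False, "No usable email address is stored for this account; "
                       "please contact the wiki administrator.")
    try:
        accepted = send_mail(address, "Your temporary password",
                             "Temporary password: " + password)
    except OSError:
        # Treat transport-level failures like a refused message.
        accepted = False
    if not accepted:
        return (False, "The message could not be sent; please try again later "
                       "or contact the wiki administrator.")
    return (True, "A temporary password has been sent to your email address.")
```

The point is only that each failure path ends in a message written for the WikiEditor, not in a raw error meant for the WikiMaster.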

Feedback


In the distribution a {{feedback}} action is provided which can be used to enable a WikiVisitor (or any WikiUser) to send comments by email to the WikiAdmin. The form requires an email address to be provided for the sender and the provided email address is validated. However, the RE used for the validation - while less strict than that for WikiAdmin during installation - will reject many perfectly valid email addresses.

Many people, in view of being bombarded by spam, will want to provide a "throw-away" email address here (at least until they have sufficient trust in the operators of the Wiki); the problem is that some of the "throw-away" email address services may generate an email address that is perfectly valid, but would be rejected by the RE used in this form/action (this has happened to me on several occasions on (other) websites doing similar too-restrictive validation).

An additional problem is that the email address for the WikiAdmin role is blindly accepted while the WikiMaster may have disabled JavaScript during installation to get around the obvious limitations thus disabling all validation of that address: a simple typo might result in the WikiVisitors' feedback messages not going anywhere near a WikiAdmin...


The Link() method of the Wakka class takes care of formatting "linkable things" on a page as hyperlinks, including formatting email addresses as mailto: links. A regular expression is used to "recognize" something that "looks like" an email address. The expression used is "/^.+\@.+$/", which would result in some strings being formatted as email links that are not actually syntactically valid email addresses. That might confuse some people when trying to use such a link to send email - and lead some spammers astray as well...
Example: [[an[invalid]emailaddress@example.com]]
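A quick way to see how little that expression checks is to try it on a few strings. The sketch below uses Python's re module instead of PHP's regex functions (for a pattern this simple the two are assumed to behave the same); the candidate strings are made up for the demonstration.

```python
import re

# The pattern quoted above, minus the PHP-style "/" delimiters.
LAX_EMAIL_RE = re.compile(r"^.+\@.+$")

candidates = [
    "an[invalid]emailaddress@example.com",  # unquoted brackets
    "two words@example.com",                # unquoted space
    "a@@b",                                 # doubled at-sign
    "no-at-sign.example.com",               # no at-sign at all
]

# Map each candidate to whether the lax pattern accepts it.
results = {text: bool(LAX_EMAIL_RE.match(text)) for text in candidates}

for text, accepted in results.items():
    print("{!r}: {}".format(text, "accepted" if accepted else "rejected"))
```

Only the last candidate is rejected; anything with at least one character on each side of an at-sign gets through.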

(An additional concern about this method is that such an email link is not obfuscated, making the email address easy fodder for harvesting spambots.)

This last issue can be solved independently from the question of email validation. We just need to modify the function in charge of formatting email links so that it produces "safe" mailto links, e.g. [[email@example.com | Johnny Stecchino]] --> <a href="mailto:email[at]example[dot]com">Johnny Stecchino</a>. I think it is in our interest that HTML formatted mailto links result in invalid addresses. :)
-- DarTar
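A small sketch of the kind of "safe" link described above, in Python for illustration. The function name `safe_mailto` is hypothetical, and the [at]/[dot] replacement simply follows the example given; it is not an existing Wikka function.

```python
from html import escape


def safe_mailto(address, text=None):
    """Build a mailto link whose href is deliberately not a valid address."""
    obfuscated = address.replace("@", "[at]").replace(".", "[dot]")
    # Fall back to the obfuscated form when no link text is supplied,
    # so the raw address never appears in the page source.
    label = text if text is not None else obfuscated
    return '<a href="mailto:{}">{}</a>'.format(
        escape(obfuscated, quote=True), escape(label))


link = safe_mailto("email@example.com", "Johnny Stecchino")
```

Here `link` holds the markup from the example above. A human can still repair the address by hand, which is the intended trade-off.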

Size of email address


Another issue is that the length of an email address for WikiUsers is limited by the size of the database field used to store it. Currently this is 50 characters; no check on length is made, so a longer email address submitted would be silently truncated; if that WikiUser expects to be able to receive a "temporary password" by email, that WikiUser will be very disappointed when the time comes. ;-)

It sounds worse than it is - 50 characters is ample for most email addresses. Still, personally, I'd like to set the limit at 75 or so (just a minor database modification). I did some rather entertaining digging on my own HD to see what email addresses were used in the From: header of emails received (I'm a packrat, so I do have enough to make it worthwhile ;-)). These are the results I'm basing my preferred limit of 75 on:
Even if the field isn't enlarged, at least a check should be made on allowable length so the user will know when there is a problem.
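Such a length check is tiny. A sketch in Python, for illustration only: the constant mirrors the 50-character field mentioned above, the function name is hypothetical, and counting characters (not bytes) is an assumption about how the field is defined.

```python
MAX_EMAIL_LENGTH = 50  # field size mentioned above; adjust if the field is enlarged


def check_email_length(address, limit=MAX_EMAIL_LENGTH):
    """Return an error message if the address would not fit, else None."""
    if len(address) > limit:
        return ("The email address is {} characters long; "
                "at most {} can be stored.".format(len(address), limit))
    return None
```

Rejecting the address up front tells the user about the problem at the moment it can still be fixed, instead of storing a truncated address.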

Also interesting were the "special" characters I found in the "mailbox" part of the addresses - apart from letters (a-z), numbers (0-9), dashes (-), underscores (_) and dots (.), I also found these actually used (visual inspection only):
+ / ? ! | % = * and $
Yes, all of those (and more) are valid!
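For reference, RFC 2822 lists the characters allowed in an unquoted ("dot-atom") mailbox part as `atext`. The Python sketch below builds a pattern from that list. It is an illustration only: it covers only the dot-atom form (no quoted strings, no comments), it ignores the domain part entirely, and the names are hypothetical.

```python
import re

# RFC 2822 "atext": letters, digits and these printable specials.
ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"

# dot-atom: one or more atext runs separated by single dots;
# no leading, trailing or doubled dots.
LOCAL_PART_RE = re.compile(r"^[{0}]+(?:\.[{0}]+)*$".format(ATEXT))


def is_dot_atom(local_part):
    """True if local_part is a syntactically valid unquoted mailbox part."""
    return LOCAL_PART_RE.match(local_part) is not None
```

Every one of the characters found above passes this check, while brackets, spaces and doubled dots do not.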

Conclusions


  1. We have three different REs to validate email addresses
  1. One of these can be circumvented by disabling JavaScript
  1. In one context the email address provided is not validated or even required at all
  1. While seemingly WikiUsers need a "valid" email address, providing any old string that happens to validate and then removing or invalidating it via update is perfectly possible
  1. No check is made for email address length - an address longer than (currently) 50 characters would be silently truncated on storage in the users database.

There are (at least) three different issues with this:

Obviously, we have some problems here, and more potential problems when people are customising the action programs as provided without being aware of these inconsistencies.

Solutions

I'm working on some (more robust) solutions, but please have a little patience: I can think much faster than I can code (solid) solutions; writing email applications (especially robust email) is never easy, and the PHP mail() function isn't all that robust by itself either.

Meanwhile I wanted to share my observations, just to make my fellow Wikka users and implementors aware that there are a few problems lurking in there with respect to email from Wikka...

I will be posting code to address or get around some of these issues but I will not release any code before having tested it as best I can and having documented it properly.

Note:
The above is based on version 1.1.5.1.
More (or different) - including code - when it's ready ... hang in there. :)

References:
*) Proposed standards: RFC 2821 is intended to replace RFC 821, and RFC 2822 is intended to replace RFC 822. Although they are not official standards yet, most everyone now applies these rather than the older ones, so they effectively have the status of a "de facto" standard.

-- JavaWoman

I like the idea of creating a single, standards-compliant function in wikka.php for validating email addresses every time a user is asked to provide one. We should consider adding to this list also the Link() function (the function in charge of formatting [[email@example.com]] as mailto:email@example.com): it currently excludes some valid email addresses containing non alphanumeric characters.

-- DarTar

Thanks, DarTar. Just goes to show what a Wikka newbie I still am - I hadn't realized there was a Link() function that would also format email addresses into mailto: links (although having one that creates http: links was fully expected :)). So yes, Link() belongs here, too. (Added in its rightful place.)
(I see other issues in that function though - I'll leave those alone for now and concentrate on email first.)

BTW, I changed your 'server.com' examples into 'example.com' - in compliance with RFC 2606 - Reserved Top Level DNS Names and because 'server.com' actually belongs to someone. We don't want to invite spammers to send them email, do we? ;-)

-- JavaWoman

Progress report (sort of)

I'm wrapping (nearly) all the email functionality into a class, taking care that it can be used outside of Wikka as well. The basic structure for that is done now, and most of the email functionality as well.

There are issues with using PHP's mail() function on a Windows platform though, which I haven't really addressed yet; I know in principle how to work around them but it's a lot of work. I think I'll leave that out of a first version for now.

Another thing isn't addressed yet: the email class really needs its own configuration - and making it independent of Wikka means I can't simply use Wikka's configuration file/array. I wasn't too happy with using an array for configuration anyway - the more items you need, the messier it tends to get - I like things organized in logical groups, at least. The obvious solution is to use an ini file (with sections) to store the configuration parameters. So now I'm working on a class that can read and write ini files, as well as provide an interactive Admin interface to build and maintain an ini file - sections and comments and all. Making good progress on that, too: parsing and (re)writing work nicely now, Admin interface is next (not easy).
This should also open an avenue for a Wikka Admin Configuration action!
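To give an idea of what a sectioned configuration looks like in practice, here is a sketch using Python's standard configparser module. It is not the class described above: the section and key names are invented for the example, and configparser drops comments when it writes a file back, which the class described above is meant to preserve.

```python
import configparser
import io

# Hypothetical ini content: two logical groups of settings.
SAMPLE = """\
[sender]
from_address = wikiadmin@example.com
from_name = Wiki Admin

[limits]
max_address_length = 75
"""


def load_config(text):
    """Parse ini text into a ConfigParser with sections."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def dump_config(parser):
    """Serialize the configuration back to ini text."""
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


config = load_config(SAMPLE)
config["limits"]["max_address_length"] = "50"  # values are stored as strings
round_tripped = load_config(dump_config(config))
```

Grouping related parameters under a section header is what keeps a growing configuration readable, compared with one flat array.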

So hang in there, I'm still hacking. Though I did take some time "off" over the weekend and more to do some upgrades for an active site (where I'll want to use the email and ini classes as well). Sometimes you need to concentrate on one thing to give your brain a rest from another thing. :)
-- JavaWoman


CategoryDevelopmentDiscussion