=====Wikka And Email=====
Some thoughts and observations on Wikka and email.
A **start** of solutions can now be found on WikkaEmailToolkit.
====Email addresses====
Wikka gathers and (often) stores email addresses at various moments. This starts at installation, when the WikiMaster provides an email address for the WikiAdmin to be stored in the configuration; every WikiVisitor signing up for an account in order to become a WikiEditor must also provide an email address. Finally, someone providing "feedback" through the feedback form (displayed with the ""{{feedback}}"" action) must also provide an email address.
A problem I note is that various programs and actions use //different// validation rules for email addresses (and associated names).
===Admin email address at installation===
During installation the file **##setup/default.php##** requests an email address for the WikiAdmin. This address is validated with a very simple Regular Expression (RE) using ""JavaScript"". The email address provided will (together with the Admin's WikiName) be stored in the ##[prefix-]users## table as well. Although the RE rejects many perfectly valid email addresses, at least any address that it does accept will actually be a valid address.
A possible workaround (to allow valid addresses that would be rejected) is to disable ""JavaScript"" during installation - but then the email address for a WikiAdmin is not validated at all.
[Question: can the installation be done / completed //without// ""JavaScript""?]
===WikiEditor email address at registration===
When a WikiVisitor signs up for an account, she is asked for an email address; an email address is required and the address provided is validated with an RE.
The email validation RE used here is //different// from that used for Admin's email address during installation. More worrying is that the RE used is very "lax" and actually allows strings that are **not** syntactically valid as an email address: many characters are accepted that would be invalid in an email address.
===WikiEditor email address when updating account===
When a logged-in WikiUser wants to update his account, the form presented by the **##actions/usersettings.php##** file displays a field to permit changing the email address. However, no validation is performed here **at all**; it's possible to enter an invalid email address, or even to remove an email address completely.
Obviously, this negates the requirement to provide an email address during registration; on top of that, at update time (as opposed to registration time) **any string** can be stored as "email address". It's safe from SQL injection - but that is all.
Why require an email address on registration when it can be erased immediately? Or: why not validate an email address during update when it is validated at registration?
===WikiEditor request temporary password===
When a (registered) WikiUser forgets her password, she can request a temporary password to be sent via email. The action provided for this via **##actions/emailpassword.php##** blindly accepts whatever is stored as "email address" for the WikiName provided; this could in fact be an empty string (see "WikiEditor email address when updating account" above) or an invalid email address (see "WikiEditor email address at registration" above).
In addition, an email is "sent" to the presumed address as stored in the database (which may be empty) but there is no error trapping for the PHP ##mail()## function used. If this WikiUser also forgot that an invalid email address was stored (or none at all - see above), she would be none the wiser as the application would not tell her there was a problem (or rather, an error message might result - but one formulated in language for a WikiMaster, not aimed at a WikiEditor end user).
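The missing error trapping could look roughly like the sketch below. It is written in Python purely as an illustration, not as Wikka code: the function name, the messages and the ##send## callable (standing in for a mail call that reports success as true/false, the way PHP's ##mail()## does) are all assumptions, and the ##"@"## test is a crude placeholder rather than real validation.

```python
def send_temporary_password(address, send):
    """Return a message aimed at the end user.

    `send` is a hypothetical callable that takes the address and returns
    True or False, mimicking a boolean-returning mail function.
    """
    # Refuse early when nothing usable is stored for this account.
    if not address or "@" not in address:
        return ("No usable email address is stored for this account; "
                "please contact the WikiAdmin.")
    # Trap the failure instead of ignoring the return value.
    if not send(address):
        return ("The message could not be sent; "
                "please try again later or contact the WikiAdmin.")
    return "A temporary password has been sent to the address stored for your account."


# Usage with stand-in senders that always fail or always succeed.
print(send_temporary_password("", lambda addr: True))
print(send_temporary_password("user@example.com", lambda addr: False))
print(send_temporary_password("user@example.com", lambda addr: True))
```

The point is only that each failure path ends in a message written for a WikiEditor rather than for a WikiMaster.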
===Feedback===
In the distribution a ""{{feedback}}"" action is provided which can be used to enable a WikiVisitor (or any WikiUser) to send comments by email to the WikiAdmin. The form requires an email address to be provided for the sender and the provided email address is validated. However, the RE used for the validation - while less strict than that for WikiAdmin during installation - will reject many perfectly valid email addresses.
Many people, in view of being bombarded by spam, will want to provide a "throw-away" email address here (at least until they have sufficient trust in the operators of the Wiki); the problem is that some of the "throw-away" email address services may generate an email address that is perfectly valid, but would be rejected by the RE used in this form/action (this has happened to me on several occasions on (other) websites doing similar too-restrictive validation).
An additional problem is that the email address for the WikiAdmin role is blindly accepted, while the WikiMaster //may// have disabled ""JavaScript"" during installation to get around the obvious limitations, thus disabling all validation of that address: a simple typo might result in the [[WikiVisitor]]s' feedback messages not going anywhere near a WikiAdmin...
===##Link()## method===
This method of the Wakka class takes care of formatting "linkable things" on a page as hyperlinks, //including// formatting email addresses as mailto: links. A regular expression is used to "recognize" something that "looks like" an email address. The expression used is ##"/^.+\@.+$/"## which would result in some strings being formatted as email links that are not actually syntactically valid email addresses. That might confuse some people when trying to use such a link to send email - and lead some spammers astray as well...
Example: [[an[invalid][email protected]]]
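To see how permissive that expression is, here is a minimal sketch in Python (an illustration only, not Wikka's PHP code). Only the pattern itself is taken from the text above; the variable names and sample strings are made up.

```python
import re

# The pattern quoted above for Link(): "anything, an @, anything".
LINK_EMAIL_RE = re.compile(r"^.+\@.+$")

# Made-up sample strings; none of them is a syntactically valid address.
samples = [
    "not an address@all",     # unquoted spaces in the local part
    "a@@b",                   # two @ signs
    "<script>@example.com",   # angle brackets in the local part
]

for s in samples:
    # Every sample matches, so each would be treated as an email link.
    print(repr(s), bool(LINK_EMAIL_RE.match(s)))
```

In effect the expression only requires at least one character on either side of an ##@##.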
(An additional concern about this method is that such an email link is not obfuscated, making the email address easy fodder for harvesting spambots.)
''This last issue can be solved independently from the question of email validation. We just need to modify the function in charge of formatting email links so that it produces "safe" mailto links, e.g. ##""[[email@example.com | Johnny Stecchino]]""## --> ##<a href="mailto:email[at]example[dot]com">Johnny Stecchino</a>##. I think it is in our interest that HTML formatted mailto links result in //invalid// addresses. :)''
-- DarTar
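A minimal sketch of the munging suggested in the comment above, in Python for illustration only (this is not the ##Link()## method). The function name is hypothetical; the ##[at]## / ##[dot]## replacement follows the example given in the comment.

```python
from html import escape


def obfuscated_mailto(address, text):
    """Build a mailto: link whose href is deliberately not a usable address."""
    # Replace the separators the same way the example above does.
    munged = address.replace("@", "[at]").replace(".", "[dot]")
    # Escape both parts so they are safe inside HTML.
    return '<a href="mailto:{}">{}</a>'.format(
        escape(munged, quote=True), escape(text)
    )


print(obfuscated_mailto("email@example.com", "Johnny Stecchino"))
# prints: <a href="mailto:email[at]example[dot]com">Johnny Stecchino</a>
```

A human can still repair the address by hand, while a harvester that simply collects ##mailto:## targets gets an unusable string.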
===Size of email address===
Another issue is that the length of an email address for [[WikiUser]]s is limited by the size of the database field used to store it. Currently this is 50 characters; no check on length is made, so a longer email address would be silently truncated on submission. A WikiUser who expects to be able to receive a "temporary password" by email will then be very disappointed when the time comes. ;-)
It sounds worse than it is - 50 characters is ample for **most** email addresses. Still, personally, I'd like to set the limit at 75 or so (just a minor database modification). I did some rather entertaining digging on my own HD to see what email addresses were used in the From: header of emails received (I'm a packrat, so I do have enough to make it worthwhile ;-)). These are the results I'm basing my preferred limit of 75 on:
~- ##"" 519""## mailboxes scanned
~- ##""56,697""## unique, syntactically correct email addresses left after considerable editing ([[http://www.faqs.org/rfcs/rfc2822.html | RFC 2822]]);---##"" ""## not that they would necessarily be //working// addresses
~- ##"" 93""## of these were longer than 50 characters
~- ##"" 3""## were longer than 75 characters (the longest, a real working one, was 116)
Even if the field isn't enlarged, at least a check should be made on allowable length so the user will know when there is a problem.
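Such a check is small. The sketch below is in Python as an illustration only; the constant and function names are hypothetical, the limit of 50 is the field size mentioned above, and counting characters with ##len()## assumes the database field also counts characters rather than bytes.

```python
MAX_EMAIL_LENGTH = 50  # field size reported above; an assumption for any other setup


def check_email_length(address, limit=MAX_EMAIL_LENGTH):
    """Return an error message if the address would not fit, else None."""
    if len(address) > limit:
        return ("The email address is {} characters long; "
                "at most {} can be stored.".format(len(address), limit))
    return None


fits = "a" * 38 + "@example.com"      # exactly 50 characters
too_long = "a" * 39 + "@example.com"  # 51 characters
print(check_email_length(fits))
print(check_email_length(too_long))
```

Rejecting the address with a message, instead of truncating it, at least tells the user there is a problem.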
Also interesting were the "special" characters I found in the "mailbox" part of the addresses - apart from letters (**a-z**), numbers (**0-9**), dashes (**-**), underscores (**_**) and dots (**.**), I also found these actually used (visual inspection only):
**+ / ? ! | % = ""*""** and **$**
Yes, all of those (and more) are valid!
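As a sketch of why those characters are valid: RFC 2822 (section 3.2.4) defines the characters allowed in an unquoted "dot-atom", and all of the ones listed above are among them. The Python below is an illustration only, with hypothetical names; it checks just the dot-atom form of the mailbox part and ignores the quoted-string and obsolete forms that the RFC also allows.

```python
import re

# atext from RFC 2822 section 3.2.4: letters, digits and these specials.
ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"
# dot-atom: runs of atext separated by single dots, no leading or trailing dot.
DOT_ATOM_RE = re.compile(r"[{0}]+(?:\.[{0}]+)*".format(ATEXT))


def is_dot_atom_local_part(local_part):
    """True if the mailbox part is a valid unquoted dot-atom."""
    return DOT_ATOM_RE.fullmatch(local_part) is not None


# The special characters reported above, each placed inside a mailbox part.
for ch in "+/?!|%=*$":
    print(ch, is_dot_atom_local_part("user" + ch + "name"))

# Things that are not valid in the unquoted form.
for bad in ["two..dots", ".leading", "has space", "two@signs"]:
    print(repr(bad), is_dot_atom_local_part(bad))
```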
====Conclusions====
~1) We have three different REs to validate email addresses
~1) One of these can be circumvented by disabling ""JavaScript""
~1) In one context the email address provided is not validated or even required at all
~1) While seemingly [[WikiUser]]s need a "valid" email address, providing any old string that happens to validate and then removing or invalidating it via update is perfectly possible
~1) No check is made for email address length - an address longer than (currently) 50 characters would be silently truncated on storage in the users database.
There are (at least) three different issues with this:
~- Requiring users to provide an email address that matches an arbitrarily restrictive RE instead of merely providing a valid email address throws up a hurdle that may send some people away that could otherwise be(come) valuable contributors - while the ""{{feedback}}"" action will not provide them with a channel to actually provide feedback on this because the email address would likely be rejected there, too.
~- Different criteria in different places implies inconsistency, which detracts from usability. (Inversely, consistency breeds usability.)
~- Valid email addresses that just don't fit will be silently accepted - but truncated and rendered inoperable.
Obviously, we have some problems here, and more potential problems when people are customising the action programs as provided without being aware of these inconsistencies.
===Solutions===
I'm working on some (more robust) solutions, but please have a little patience: I can //think// much faster than I can //code// (solid) solutions; writing email applications (especially robust email) is never easy, and the PHP ##mail()## function isn't all that robust by itself either.
Meanwhile I wanted to share my observations, just to make my fellow Wikka users and implementors aware that there are a few problems lurking in there with respect to email from Wikka...
I //will// be posting code to address or get around some of these issues but I will not release any code before having tested it as best I can and having documented it properly.
Note:
The above is based on version 1.1.5.1.
More (or different) - including code - when it's ready ... hang in there. :)
==References:==
~- [[http://www.faqs.org/rfcs/rfc2822.html | RFC 2822 - Internet Message Format]] (proposed standard*)
~- [[http://www.faqs.org/rfcs/rfc2821.html | RFC 2821 - Simple Mail Transfer Protocol]] (proposed standard*)
~- [[http://www.faqs.org/rfcs/rfc2476.html | RFC 2476 - Message Submission]]
~- [[http://www.faqs.org/rfcs/rfc1035.html | RFC 1035 - Domain names - implementation and specification]]
*) Proposed standards: RFC 2821 is intended to replace RFC 821 and RFC 2822 is intended to replace RFC 822, and most everyone applies these now rather than the older ones. While not official standards yet, they effectively have the status of "de facto" standard.
-- JavaWoman
''I like the idea of creating a single, standards-compliant function in wikka.php for validating email addresses every time a user is asked to provide one. We should consider adding to this list also the ##Link()## function (the function in charge of formatting ##""[[email@example.com]]""## as ##mailto:email@example.com##): it currently excludes some valid email addresses containing non alphanumeric characters.''
-- DarTar
Thanks, DarTar. Just goes to show what a Wikka newbie I still am - I hadn't realized there was a ##Link()## function that would //also// format email addresses into mailto: links (although having one that creates http: links was fully expected :)). So yes, ##Link()## belongs here, too. (Added in its rightful place.)
(I see other issues in that function though - I'll leave those alone for now and concentrate on email first.)
BTW, I changed your 'server.com' examples into 'example.com' - in compliance with [[http://www.rfc-editor.org/rfc/rfc2606.txt | RFC 2606 - Reserved Top Level DNS Names]] and because 'server.com' actually belongs to someone. We don't want to invite spammers to send them email, do we? ;-)
-- JavaWoman
===Progress report (sort of)===
I'm wrapping (nearly) all the email functionality into a class, taking care that it can be used outside of Wikka as well. The basic structure for that is done now, and most of the email functionality as well.
There are issues with using PHP's ##mail()## function on a Windows platform though, which I haven't really addressed yet; I know in principle how to work around them but it's a lot of work. I think I'll leave that out of a first version for now.
Another thing isn't addressed yet: the email class really needs its own configuration - and making it independent of Wikka means I can't simply use Wikka's configuration file/array. I wasn't too happy with using an array for configuration anyway - the more items you need, the messier it tends to get - I like things organized in logical groups, at least. The obvious solution is to use an ini file (with sections) to store the configuration parameters. So now I'm working on a class that can read and write ini files, as well as provide an interactive Admin interface to build and maintain an ini file - sections and comments and all. Making good progress on that, too: parsing and (re)writing work nicely now, Admin interface is next (not easy).
This should also open an avenue for a Wikka Admin Configuration action!
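To make the idea of a sectioned ini file concrete, here is a small sketch using Python's standard ##configparser## module. It is an illustration only, not the PHP class described above; the section and option names are hypothetical. Note that ##configparser## drops comments when it rewrites a file, which is exactly the kind of thing the class described above sets out to handle.

```python
import configparser
import io

# Hypothetical sections and options, for illustration only.
config = configparser.ConfigParser()
config["mail"] = {"admin_address": "admin@example.com", "max_address_length": "75"}
config["smtp"] = {"host": "localhost", "port": "25"}

# Write the configuration as ini text (a real file object works the same way).
buffer = io.StringIO()
config.write(buffer)
ini_text = buffer.getvalue()
print(ini_text)

# Read it back and access values by section.
loaded = configparser.ConfigParser()
loaded.read_string(ini_text)
print(loaded["smtp"]["host"], loaded.getint("mail", "max_address_length"))
```

Grouping options into sections like this keeps related settings together, which is the organisation by logical groups asked for above.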
So hang in there, I'm still hacking. Though I did take some time "off" over the weekend and more to do some upgrades for an active site (where I'll want to use the email and ini classes as well). Sometimes you need to concentrate on one thing to give your brain a rest from another thing. :)
-- JavaWoman
----
CategoryDevelopmentDiscussion