Validating Email Addresses

OK, validating an email address should not be difficult: you look up the definition in the RFC and write a small regular expression. Here is what I extracted from RFC 822:

address        =  mailbox                      ; one addressee
                 /  group                        ; named list
mailbox        =  addr-spec                    ; simple address
                 /  phrase route-addr            ; name & addr-spec
domain         =  sub-domain *("." sub-domain)
sub-domain     =   domain-ref / domain-literal
domain-literal =  "[" *(dtext / quoted-pair) "]"
domain-ref     =  atom                         ; symbolic reference
atom           =  1*<any CHAR except specials, SPACE and CTLs>
dtext          =  <any CHAR excluding "[",     ; => may be folded
                     "]", "\" & CR, & including
                     linear-white-space>
quoted-pair    =  "\" CHAR                     ; may quote any char

Right, the notation is a bit unusual: / means "or", 1* means "one or more", and * means "zero or more". That clears things up a little. I intend to check simple addresses only; I do not care about named lists. I was a bit surprised by the sub-domain *("." sub-domain) expression: I had always thought that a domain suffix such as .com or .de would be somewhat more constrained than a sub-domain. But OK, this makes things easier, and I ended up with this simple expression:

\w[-.\w]*@[-\w]+(?:\.[-\w]+)+
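
To try the expression out, here is a minimal sketch in Python. The function name is_simple_address is made up for illustration, and using re.fullmatch to anchor the pattern to the whole string is my assumption, since the expression above has no ^ or $ of its own. Note that \w in Python 3 also matches non-ASCII letters by default, and that the expression is stricter than the grammar in some places: for example, it rejects a + in the local part, which RFC 822 allows in an atom.

```python
import re

# The expression from the post, compiled once for reuse.
EMAIL_RE = re.compile(r"\w[-.\w]*@[-\w]+(?:\.[-\w]+)+")


def is_simple_address(text: str) -> bool:
    """Return True if the whole string matches the simple expression.

    Hypothetical helper for illustration; fullmatch ensures that no
    leading or trailing characters are silently ignored.
    """
    return EMAIL_RE.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "user@example.com",
        "first.last@mail.example.de",
        "user@localhost",       # no dot in the domain part
        "@example.com",         # empty local part
        "a+b@example.com",      # '+' is not in the character class
    ]
    for sample in samples:
        print(sample, is_simple_address(sample))
```

With this pattern the first two samples are accepted and the last three are rejected.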
