109

Here is an extracted portion of my query, reflecting the EMAIL_ADDRESS column data type and property:

EMAIL_ADDRESS CHARACTER VARYING(20) NOT NULL, 

However, John Saunders uses VARYING(256).

This suggests me that I have not necessarily understood the VARYING correctly.

I understand it such that the length of an email address is 20 characters in my case, while 256 for Jodn.

Context in John's code

CREATE TABLE so."User"
  (
    USER_ID SERIAL NOT NULL,
    USER_NAME CHARACTER VARYING(50) NOT NULL,
    EMAIL_ADDRESS CHARACTER VARYING(256) NOT NULL, // Here
    HASHED_PASSWORD so.HashedPassword NOT NULL,
    OPEN_ID CHARACTER VARYING(512),                                                         
    A_MODERATOR BOOLEAN,
    LOGGED_IN BOOLEAN,
    HAS_BEEN_SENT_A_MODERATOR_MESSAGE BOOLEAN,
    CONSTRAINT User_PK PRIMARY KEY(USER_ID)
  );

I have never seen email addresses longer than 20 characters, used by ordinary people.

What is the optimal length for an email address in a database?

nyedidikeke
  • 6,899
  • 7
  • 44
  • 59
Léo Léopold Hertz 준영
  • 134,464
  • 179
  • 445
  • 697
  • What do you mean by "optimal"? What are you trying to "optimize"? – S.Lott Jul 29 '09 at 10:34
  • are you telling me I'm extraordinary? XD – fortran Jul 29 '09 at 10:38
  • 1
    @S.Lott: I want build a secure system. The increase in user's input increases the risk that they can run codes in the database. --- I see optimal as the best way to have a secure system. – Léo Léopold Hertz 준영 Jul 29 '09 at 10:40
  • Heres one that seems likely to be real and its 31 character: michael dot peterson at googlemail dot com (spaces, changed @ etc added cause this could be a real address). Basically, don't make assumptions, always code to the limits defined in whatever specification because you can never be completely sure otherwise. –  Jul 29 '09 at 11:05
  • 1
    Well, while there is security considerations in not making something unbounded, adhearing to the standards will always make the most sense. Following what is "common" or "optimal" will likely introduce security issues then reduce them. – Kitson Jul 29 '09 at 11:39
  • 1
    This question on StackOverflow suggests that the max length is now 254 characters including the "@" sign: http://stackoverflow.com/questions/386294/maximum-length-of-a-valid-email-id – dthrasher Feb 11 '10 at 16:35
  • 1
    Here's a related post on email length from @DominicSayers, with a really thorough answer: https://stackoverflow.com/a/574698/361842 – JohnLBevan Jul 28 '17 at 09:25
  • "I have never seen email addresses longer than 20 characters, used by ordinary people." -- non-ordinary people are prevalent on the internet too. and 20 characters barley covers some domain names, if you consider an `@outlook.com` email address, then there is 12 char just for the domain, leaving only 8 char for the actual email address. My valid outlook.com email address is 64 char long. – gattsbr Feb 28 '23 at 19:46

9 Answers9

160

The maximum length of an email address is 254 characters.

Every email address is composed of two parts. The local part that comes before the '@' sign, and the domain part that follows it. In "user@example.com", the local part is "user", and the domain part is "example.com".

The local part must not exceed 64 characters and the domain part cannot be longer than 255 characters.

The combined length of the local + @ + domain parts of an email address must not exceed 254 characters. As described in RFC3696 Errata ID 1690.

I got the original part of this information from here

Iain Hoult
  • 3,889
  • 5
  • 25
  • 39
  • 1
    It seems that it is the best to take 320 as the length. – Léo Léopold Hertz 준영 Jul 29 '09 at 11:12
  • 43
    I know this is an old thread and there's no problem using 320, but the actual maximum is 254 because of an overriding restriction from RFC2821 that imposes additional constraints over and above those quoted for the local and domain parts. If storage space is an issue, this may be worth people knowing if they stumble on this thread. See Errata ID 1690 in [errata to RFC3696](http://www.rfc-editor.org/errata_search.php?rfc=3696) – HexAndBugs Apr 25 '12 at 15:48
  • As @flightplanner said, Wikipedia summarizes those sections [here](http://en.wikipedia.org/wiki/Email_address#Syntax): "but the maximum...restricts the entire email address to be no more than 254 characters" – RustyTheBoyRobot Jun 15 '12 at 18:11
  • 2
    Especially if you want the email field to have a unique constraint; under INNODB and utf8 varchar(254) is small enough (less than 767bytes) to have a unique constraint and varchar(300) is not. – Autonomy Apr 11 '13 at 19:14
  • In the [RFC 3696 errata ID 1003](https://www.rfc-editor.org/errata/eid1003) I found it says that 256 chars is the practical limit (and 320 chars the maximum). – Arnold Schrijver Jun 25 '20 at 07:42
58

from Ask Metafilter:

My data comes from a database of 323 addresses. The distribution has some upper-end outliers (positively-skewed). It is normally distributed without the outliers (I tested it.)

Min: 12 1st quartile: 19 Mean (w/ outliers): 23.04 Mean w/o outliers): 22.79 3rd quartile: 26 Max (w/ outliers): 47 Max (w/o outliers): 35

Median: 23 Mode: 24 Std. Dev (w/ outliers): 5.20 Std. Dev (w/o outliers): 4.70

Ranges based on data including outliers 68.2% of data 17.8 - 28.2 95.4% of data 12.6 - 33.4 99.7% of data 7.4 - 38.6

Ranges based on data outliers excluded 68.2% of data 18.1 - 27.5 95.4% of data 13.4 - 32.2 99.7% of data 8.7 - 36.9

If you sign up for http://www.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijk.com/ then your email address would surely be an outlier :)

Here's What is the maximum safe length of an email address to allow in a website form? on Raycon with a slightly different mean (N=50,496, mean=23):

Email address length distribution

Rosdi Kasim
  • 24,267
  • 23
  • 130
  • 154
pageman
  • 2,864
  • 1
  • 29
  • 38
  • @Masi actually what's curious is that it's a Poisson distribution rather than a normal distribution - anyone have ideas why it's like that? :P – pageman Jul 29 '09 at 10:47
  • @pageman: The reason is that each event is randomly distributed AND each event is taken from the infinity space. - You get a similar distribution if you calculate the number of cars driving to RED such that you have time vs. number of cars driving to red in the axes. – Léo Léopold Hertz 준영 Jul 29 '09 at 11:01
  • Personally I like Benford's Law better: http://en.wikipedia.org/wiki/Benford%27s_law – Kitson Jul 29 '09 at 11:36
  • 2
    I've used 120 variable characters for years. The real world logic is that even if someone is ready to fill your 320 varchar field...I bet they have a 40 char alternative email just standing by – Chukky Nze Jul 28 '14 at 02:25
  • The link http://www.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijk.com/ seems to be spam now - potentially dangerous. would not click. – sommmen Apr 06 '22 at 12:24
  • Assuming there is infinitely(for arguments sake) more than 323 valid domains, and infinitely more valid email address at those more than 323 valid domains, starting with a@ then aa@, then aaa@, and so on... 323 is going to have a large sampling error for your statistics. – gattsbr Feb 28 '23 at 19:56
26

Just use varchar(50). Longer emails are crap, every time.

Just look how long 50 chars is:

peoplewithanemail@ddressthislongjustuseashorterone

If you allow 255 character emails:

  • Displaying them can mess up your UI (at best they will be cut off, at worst they push your containers and margins around) and
  • Malicious users can do things with them you can't anticipate (like those cases where hackers used a free online API to store a bunch of data)

(Statistics show that no-one actually enters more than about 50 chars for a legit email address, see e.g.: pageman's answer https://stackoverflow.com/a/1199245/87861)

MGOwen
  • 6,562
  • 13
  • 58
  • 67
Nicolas Manzini
  • 8,379
  • 6
  • 63
  • 81
  • 7
    Totally agree. Who in their right mind would have an email address any longer? Sure, it is theoretically correct that an email can be 320 chars but in the real world? In my systems I also use varchar(50) and I have never had a complaint that a user cannot register. – Norbert Norbertson Sep 23 '16 at 09:03
  • 2
    It would be interesting to know from huge datasets what the average real world email length is and what the outliers are and how big. – Norbert Norbertson Sep 23 '16 at 09:04
  • 9
    Wrong. There are plenty of real-world users who have more than 50 characters in their email, and more importantly they can't change it just for you. Refusing them access for something they can't fix is unfair. – Marcus Downing Feb 19 '18 at 13:16
  • 2
    they can make new emails of course they can. make google one. – Nicolas Manzini Feb 20 '18 at 09:54
  • 5
    Also, don’t forget about the plus notation. Some power users are using this to segregate and organize their emails in their inbox. Essentially, they’ll have a unique (sub-)email per each website/service/app. For example, let’s imagine that my normal email is my first name and last name at some company name: firstnameandlastone@superacmecompany.com. That’s already ~40 characters. Now, if I used a plus notation for a stackoverflow account: firstnameandlastone+stackoverflow@superacmecompany.com—that’s ~55 characters. Some plus notations may be longer, e.g., +stackoverflow-personal and *-work. – Waterlink Nov 15 '19 at 19:39
17

My work email address is more than 20 characters!

Read the appropriate RFC specification:

"The local-part of an e-mail address may be up to 64 characters long and the domain name may have a maximum of 255 characters"

Dan Diplo
  • 25,076
  • 4
  • 67
  • 89
5

Variable character types in databases don't occupy unneeded space. Thus, there is no reason to constrain such fields as much as possible. Depending on the name of a person, the naming scheme used by their organization and their domain name, an address can easily exceed 20 characters.

There is no limit as to the length of local-part and domain-name in RFC-2822. RFC-2181 limits the domain name to 255 octets/characters though.

Again, since a varchar only uses the space actually used by the string you store, there is no reason to have a small limit for email address length. Just go with 512 and stop worrying. Everything else is premature optimization

Community
  • 1
  • 1
VoidPointer
  • 17,651
  • 15
  • 54
  • 58
3

Initially the maximum is 320 characters (64+1+255, as show in other answers) but as RFC 3696 Errata 1003 said:

However, there is a restriction in RFC 2821 on the length of an address in MAIL and RCPT commands of 256 characters. Since addresses that do not fit in those fields are not normally useful, the upper limit on address lengths should normally be considered to be 256.

And from RFC 5321 section 4.5.3.1.3:

4.5.3.1.3. Path

The maximum total length of a reverse-path or forward-path is 256 octets (including the punctuation and element separators)

This is including the opening and closing brackets so it let us to only 254 octets of email address.

But get in mind that the number of octets may not be equal to the numbers of characters (a char may have 2 or more octets). Also the RFC section 4.5.3.1 tell that there can be fields of more that the maximum and this is possible but not guarantied to servers to catch they correctly.

And then you can/must use a VARCHAR(254) to store an email address.

Note: In MySQL at least, a column declared as VARCHAR whit less or equal than 255 octets will be all stored as 1 byte + length (the 1 is to store the length) so no space is gained if used a lower limit.

Community
  • 1
  • 1
PhoneixS
  • 10,574
  • 6
  • 57
  • 73
  • You fail to explain how you go from 256 bytes to 254. I know this is the result of the opening/closing brackets, but you should explain this as part of the answer. – Gili May 08 '14 at 02:36
2

TLDR Answer

Use a VARCHAR(256) to store the 256 character maximum entailed in current, prevailing RFC Internet standards.

Source

SMTP originally defined what a path was in RFC821, published August 1982, which is an official Internet Standard (most RFC's are only proposals). To quote it...

...a reverse-path, specifies who the mail is from.

...a forward-path, which specifies who the mail is to.

RFC2821, published in April 2001, is the Obsoleted Standard that defined our present maximum values for local-parts, domains, and paths. A new Draft Standard, RFC5321, published in October 2008, keeps the same limits. To quote RFC2821...

4.5.3.1.3. Path

The maximum total length of a reverse-path or forward-path is 256 characters (including the punctuation and element separators).

Common Mistaken Answers

In February 2004, RFC3696 was published and it mistakenly cites the max limit of email addresses as 320 characters. But this was an "informational-only" document, which states...

"This memo provides information for the Internet community. It does not specify an Internet standard of any kind."

We can disregard this limit, then, as it is published before RFC5321, which is a Draft Internet Standard that keeps the email address max limit unchanged at 256 characters.

Community
  • 1
  • 1
HoldOffHunger
  • 18,769
  • 10
  • 104
  • 133
1

As others have said, way bigger than 20. 256 + 64 sounds good to me, and is RFC compliant.

The only reason to not have such a large value for your database is if you are worrying about performance or space, and if you are doing that then I'm 99.99999999999999% sure that is premature optimization.

Go big.

Stu Thompson
  • 38,370
  • 19
  • 110
  • 156
1

A CHAR(20) field will always take up 20 characters, whether you use it all or not. (Often padded with spaces at the end.) A VARCHAR(20) field will take up up to 20 characters, but may take up less. One benefit of CHAR()s constant width is fast jumping to a row in a table, because you can just calculate the index it must be on. The drawback is wasting space.

The benefit of constant-sized CHAR(x)'s is lost if you have any VARCHAR(x) columns in your table. I seem to recall that MySQL silently converted any CHAR() fields into VARCHAR() behind the scenes if some columns were VARCHAR()s.