10

I'm trying to familiarize myself with Postgres (9.2) after a fair bit of MySQL (5.1) usage, since I've been bitten by a handful of MySQL's gotchas. However, in my first five minutes with Postgres I ran into one of its gotchas, which I'm sure hits everyone:

  • By default, PostgreSQL converts everything that isn't quoted to lower case.

This isn't too big of a deal to me, since there are a couple of obvious workarounds:

  • Encapsulate everything in quotes.
  • Allow everything to be named in a lower case fashion.

But I'm wondering why. Considering how much contention I imagine this design decision causes, I'm surprised that I couldn't find any rationale on the internet. Does anybody have a thorough explanation, or preferably a link to some developer manifesto, as to why Postgres was designed this way? I'm interested.

Hammer Bro.
  • 965
  • 1
  • 10
  • 23
  • Are you asking about this? http://stackoverflow.com/questions/153944/is-sql-syntax-case-sensitive?rq=1 – Matt Ball Nov 16 '12 at 01:31
  • 3
    AFAIK, the standard says that unquoted identifiers should be folded to upper case, [PostgreSQL folds to lower case](http://stackoverflow.com/a/8736088/479863) instead. Folding to upper or lower case shouldn't matter unless you're only quoting your identifiers sometimes but if you're doing that then you deserve what you get. You're welcome to `create table Pancakes (...)` and `select * from Pancakes`, just don't `create table "Pancakes" (...)` and try to `select * from Pancakes`. MySQL people seem to have this odd habit of backtick-quoting everything, lose that habit. – mu is too short Nov 16 '12 at 01:46
  • @muistooshort I've generally seen claims that the SQL spec doesn't specify whether case-folding must be to upper or lower case, only that it must occur. Do you know where it's specified to fold to upper case specifically? (I'm looking for references but not finding much that's useful at the moment). – Craig Ringer Nov 16 '12 at 01:55
  • 5
    @Craig: I'm trusting the PostgreSQL docs on this: ["The folding of unquoted names to lower case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be folded to upper case."](http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS). Unfortunately I don't have a copy of The Standard handy. I just go lower case these days, I've mellowed as I've aged so I don't want to run around shouting all the time. *Markdown fixed this time* – mu is too short Nov 16 '12 at 02:02
  • @muistooshort I missed that. Thanks. Might be worth posting as an answer. – Craig Ringer Nov 16 '12 at 02:22
  • 2
    @muistooshort I believe the mysql convention may have arisen to ensure system independence (possibly on older versions). When I used mysql some years ago, I had problems when I moved to linux from windows as the table names impact file names. Even though the table names weren't quoted, case would cause a problem under linux because it couldn't find the files. I'm not sure if this bug still exists in modern versions. – Philip Couling Nov 16 '12 at 09:37
  • @couling: So a combination of (possible) cargo-cult programming ("quote everything because someone else did") and a poor design decision in the MySQL implementation (tying table names to disk files)? That does sound likely. I've always used underscored lower case identifiers in my databases so I've never had to worry about any of these case problems. – mu is too short Nov 16 '12 at 17:00
  • Interesting; I never realized that the SQL Standard suggested everything should be upper-cased. That does make more sense out of Postgres' behavior, although then it raises the question of why they deliberately chose the opposite case. I can speculate (all caps makes my skin crawl), but I'd be surprised if there weren't some discussion about it somewhere. If there's not, @muistooshort has done a great job so far on shedding light on my dark curiosities. – Hammer Bro. Nov 16 '12 at 17:16
  • 2
    Back in the before-times, PostgreSQL was just "Postgres", the SQL interface and consideration for the SQL standard came later. So, the case folding behavior could be historic. I suspect that you'd have more success with this issue on one of the PostgreSQL mailing lists. – mu is too short Nov 16 '12 at 18:00
  • The case-folding is an artifact from the days when most large databases were on mainframes. Old mainframe languages and environments heavily relied on everything being in upper case (originally via necessity to minimize the set of symbols to parse, then later via habit). – Matthew Wood Dec 13 '12 at 21:30

1 Answers1

12

The SQL standard specifies folding unquoted identifiers to upper case. Many other RDBMS's follow the standard in this way. Firebird and Oracle both do. This means that identifier matching is, by default, case insensitive. This behavior is very important when it comes to compatibility in basic queries. In this regard MySQL's behavior is a real outlier.

However PostgreSQL deviates from the standard by folding to lower case. There are general reasons why this is considered more readable, etc. because you can use case for cuing syntax. Something like:

SELECT foo FROM bar WHERE baz = 1;

This is more natural when cases are folded to lower. The alternative folding opposite would be:

select FOO from BAR where BAZ = 1;

In general like the former behavior (folding to lower case) becasue it emphasizes the sql operations better while folding to the other case de-emphasizes the operations and emphasizes the identifiers. Given the complexity of many queries, I think the former works better.

In general the most discussion I have seen on the postgres mailing lists have been that everyone agrees the standard-mandated behavior is broken. so the above is my understanding of the issues.

Chris Travers
  • 25,424
  • 6
  • 65
  • 182
  • I do not understand the context of the line “In this regard MySQL's behavior is a real outlier”. Did you mean 'Postgres' instead of 'MySQL'? Or is here some other context for 'MySQL' that I missed? – Basil Bourque May 31 '17 at 01:24
  • Reference to the MySQL gotchas and PostgreSQL gotchas. – Chris Travers Jun 01 '17 at 07:40
  • 2
    I don't understand why folding to lower case (departing from the standard) helps readability, because the SQL you write can be any case you want. Just because ORACLE stores FOO and BAR doesn't stop you writing SELECT foo FROM bar; – xpusostomos Apr 08 '21 at 08:20
  • 3
    You really do have to wonder what the standard authors were smoking. – Andy Jun 12 '21 at 18:14
  • Because of MySQL, the idea of basic query compatibility is kind of a moot point. It doesn't seem like a worthwhile goal since most real-world queries aren't compatible anyway. I wish all the other RDBMSes would abandon the standard behavior as well. – Andy Jun 12 '21 at 18:21