563

Is there any way to write case-insensitive queries in PostgreSQL, E.g. I want that following 3 queries return same result.

SELECT id FROM groups where name='administrator'

SELECT id FROM groups where name='ADMINISTRATOR'

SELECT id FROM groups where name='Administrator'
ivanleoncz
  • 9,070
  • 7
  • 57
  • 49
Jame
  • 21,150
  • 37
  • 80
  • 107
  • 1
    if citext comes with your Postgres installation, try citext type. It's case-insensitive text – Michael Buen Aug 10 '11 at 04:06
  • 5
    For newcomers to this question, [this link](http://www.postgresql.org/docs/8.3/static/functions-matching.html) to official postgres documentation contains all the answers given here, as well as a few other options. – Parthian Shot Jul 07 '14 at 20:38
  • 2
    Sir reassign accepted answer to the one made by @Arun please. It is much less complicated and do not pull bunch of troubles after apply. – zelibobla Jun 07 '16 at 17:25

15 Answers15

690

Use LOWER function to convert the strings to lower case before comparing.

Try this:

SELECT id 
  FROM groups
 WHERE LOWER(name)=LOWER('Administrator')
Chandu
  • 81,493
  • 19
  • 133
  • 134
  • 143
    It's important to note that using LOWER (or any function) on the predicate columns--in this case "name"--will cause any indexes to no longer be seekable. If this is a large or frequently queried table, that could cause trouble. Case-insensitive collation, citext, or a function-based index will improve performance. – Jordan Aug 10 '11 at 04:23
  • 149
    Or just create an index like this: CREATE INDEX idx_groups_name ON groups lower(name); – Daniel Aug 10 '11 at 05:25
  • 25
    Also specify `varchar_pattern_ops` if you want the index to work with `LIKE 'xxx%'` query, i.e. `CREATE INDEX ix_groups_name ON groups (lower(name) varchar_pattern_ops)`. – sayap Aug 10 '11 at 11:44
  • 21
    Using the ILIKE operator (as shown in other answers below) is a simpler approach, even though this is the most voted answer. – Ryan Apr 21 '15 at 19:36
  • 4
    To add to @miR's comment: while the ILIKE approach is syntactically much simpler, it will NEVER attempt to use an index - thus, if performance is of great importance to you, consider using Chandu's solution w/ the index demonstrated by Daniel & sayap instead. – Priidu Neemre Jun 17 '15 at 16:38
  • 5 times slower than equal. `ILIKE` is 9 times slower – brauliobo Jun 26 '15 at 21:48
  • 1
    This is very serious bad approach, you don't add functions to columns in the left side of a comparative. You will have poor performance. – Juan Jul 02 '16 at 20:18
  • 1
    According to PostgreSQL 9.4 document " It is also possible to use B-tree indexes for ILIKE and ~*, but only if the pattern starts with non-alphabetic characters, i.e., characters that are not affected by upper/lower case conversion." So it's possible, not NEVER. @Priidu Neemre – Robert Aug 22 '16 at 10:40
  • 1
    If your application is searching for a fixed string, there is no need to have `LOWER()` on both sides: `WHERE LOWER(name)='administrator'` – Elliot Chance Sep 06 '16 at 23:37
  • 9
    Going through the comments here, a lot of suggestions here suggests `ILIKE`, It will work, `but with slow response`. To obtain fast access to tables based on the results of computations, I suggest anyone just checking this should go with the accepted answer. See more details [here](https://www.postgresql.org/docs/current/static/indexes-expressional.html) and [here](http://stackoverflow.com/a/11278943/5614748) – Afolabi Olaoluwa Dec 01 '16 at 19:35
  • This page notes the issue that causes the lower approach to cause performance problems as noted by @Jordan: https://www.postgresql.org/docs/current/citext.html – Sudhanshu Mishra Jan 09 '20 at 00:58
  • 2
    Does this address the "turkish problem"? http://www.i18nguy.com/unicode/turkish-i18n.html – Greg Apr 14 '20 at 17:02
  • 2
    This is *wrong*, case conversion is very much a locale specific thing, you cannot just convert strings to lower/uppercase and compare them, some strings might have several lowercase variants and hence wouldn't be matched. – paul23 Jun 01 '20 at 18:23
  • 1
    This is an awful solution. Using LOWER(Name) will force the query to run that function on EVERY SINGLE ROW - if it's a big table, that's a LOT of processing. Not to mention what @paul23 said about the complexity of case changes on some strings (albeit not really an issue with standard English) – cyberspy Feb 10 '22 at 13:06
  • @cyberspy It's easy to point out flaws, but much harder to actually be helpful and give some alternative method? – Peter R Aug 05 '22 at 09:16
524

using ILIKE instead of LIKE

SELECT id FROM groups WHERE name ILIKE 'Administrator'
Matthew Lock
  • 13,144
  • 12
  • 92
  • 130
Mohammad Reza Norouzi
  • 5,899
  • 1
  • 16
  • 18
  • 9
    Note that `ILIKE` is not supported by Hibernate when used in Spring Boot. – AnT Mar 27 '20 at 02:00
  • 2
    @AnT it works with `org.hibernate.dialect.PostgreSQL94Dialect` and Spring Boot 2.0.6.RELEASE. But IntelliJ complains about it. – skaveesh May 02 '20 at 07:46
  • 1
    Is ilike going to be much slower? Especially when the field is indexed? – Ye Zhang Jun 08 '22 at 20:19
  • 2
    This is the actual answer, the accepted one is a hack – Igor Beaufils Aug 26 '22 at 08:58
  • 3
    Actually, `ILIKE` is the simplest answer but not the "actual" answer for all cases. `ILIKE` is a non-standard extension to Postgres and it will perform very slowly. The accepted answer using the `LOWER` function, along with proper indexing, will perform much better and will be supported by all client libraries and ORMS. It depends on your particular use case. – gerrard00 Dec 05 '22 at 15:20
168

The most common approach is to either lowercase or uppercase the search string and the data. But there are two problems with that.

  1. It works in English, but not in all languages. (Maybe not even in most languages.) Not every lowercase letter has a corresponding uppercase letter; not every uppercase letter has a corresponding lowercase letter.
  2. Using functions like lower() and upper() will give you a sequential scan. It can't use indexes. On my test system, using lower() takes about 2000 times longer than a query that can use an index. (Test data has a little over 100k rows.)

There are at least three less frequently used solutions that might be more effective.

  1. Use the citext module, which mostly mimics the behavior of a case-insensitive data type. Having loaded that module, you can create a case-insensitive index by CREATE INDEX ON groups (name::citext);. (But see below.)
  2. Use a case-insensitive collation. This is set when you initialize a database. Using a case-insensitive collation means you can accept just about any format from client code, and you'll still return useful results. (It also means you can't do case-sensitive queries. Duh.)
  3. Create a functional index. Create a lowercase index by using CREATE INDEX ON groups (LOWER(name));. Having done that, you can take advantage of the index with queries like SELECT id FROM groups WHERE LOWER(name) = LOWER('ADMINISTRATOR');, or SELECT id FROM groups WHERE LOWER(name) = 'administrator'; You have to remember to use LOWER(), though.

The citext module doesn't provide a true case-insensitive data type. Instead, it behaves as if each string were lowercased. That is, it behaves as if you had called lower() on each string, as in number 3 above. The advantage is that programmers don't have to remember to lowercase strings. But you need to read the sections "String Comparison Behavior" and "Limitations" in the docs before you decide to use citext.

Mike Sherrill 'Cat Recall'
  • 91,602
  • 17
  • 122
  • 185
  • 1
    About #1: It shouldn't be a problem, since it would be two different strings (think of it like doing `col = 'a'` and `col = 'b'`). About #2: As you said, you can create an index on an expression, so it's not really a problem. But I agree with you that changing the collation is most likely the best solution. – Vincent Savard Aug 10 '11 at 04:21
  • Have you heard anyone saying that citext doesn't work with postgresql jdbc (hibernate)? – Bill Rosmus Aug 20 '12 at 01:48
  • @BillR: No, but I'm not using JDBC with PostgreSQL myself, so I could easily overlook a problem like that. There *have* been [some bugs in version upgrades](http://www.postgresql.org/docs/9.1/static/release-9-1-4.html) to databases using citext. – Mike Sherrill 'Cat Recall' Aug 20 '12 at 02:05
  • Upvoted but I think you're wrong. For #2 you will need: SELECT id FROM groups WHERE LOWER(name) = LOWER('ADMINISTRATOR'); If you want to make use of the additionaly created index. Using explain plan. – albertpeiro Aug 06 '13 at 18:48
  • @albertpeiro: Thanks for pointing that out. I tested before I posted, but sometimes I'm doing three things at one time. – Mike Sherrill 'Cat Recall' Aug 06 '13 at 21:27
  • i reordered citext to the top of your list because it's the only idea that isn't horrible. Creation a functional index on lower is a really bad idea. – Evan Carroll Sep 19 '13 at 00:41
  • 5
    Can someone tell me what case-insensitive collations are PostgreSQL built-in collations ? I see this as an option but can't find anything about a case-insensitive collation for Postgres on the net ? – khorvat Dec 20 '13 at 13:40
  • PostgreSQL doesn't really *have* built-in collations. It relies on the underlying operating system locales. – Mike Sherrill 'Cat Recall' Dec 20 '13 at 14:18
  • I'm porting my solution from MS SQL Server to PostgreSQL and I'm using the ORM tool to do my queries, so is there a way to do a case-insensitive search by setting the database to some collation or something ? – khorvat Dec 20 '13 at 14:38
  • No. Your best bet is probably the citext module, linked above. – Mike Sherrill 'Cat Recall' Dec 20 '13 at 16:58
  • Hey Mike, regarding the option 2 can you specify what should be values for windows env. I have exact same requirement that we will not need case sensitive search. – Anup Shah May 12 '15 at 14:59
  • @mike, So you are saying the windows locales should be suffice to achive the case-insensitive search for postgres. Can you give some tips how do I find out what locales my windows 7 OS has and what should I set to so that postgres will work as case-insensitive. – Anup Shah May 12 '15 at 15:56
  • 1
    @AnupShah: No, I'm not saying that. I'm not running PostgreSQL on Windows. [The 9.4 docs say this](http://www.postgresql.org/docs/9.4/static/collation.html): "On all platforms, the collations named default, C, and POSIX are available. Additional collations may be available depending on operating system support." You can see which collations PostgreSQL thinks are available with `select * from pg_collation;`. – Mike Sherrill 'Cat Recall' May 12 '15 at 22:18
  • Thanks Mike. I check that and looks like only those 3 are installed on my machine. but anyways, After asking around on other forums as well I get the same answer that CITEXT is my best option to go with and I tried it out as well last night. It just matter of selling it to other Architect team members and see any one has any concern with that option. – Anup Shah May 13 '15 at 13:53
  • I'm curious to know which language with upper/lower case "support" are not symetrical. IMHO, worst case would be upper=lower but I'm curious... – Matthieu Aug 10 '17 at 14:02
  • 1
    @Matthieu: This is best introduction (and caution) to the subject that I know about: [Edge Cases to Keep in Mind. Part 1 – Text](https://www.thedroidsonroids.com/blog/edge-cases-to-keep-in-mind-part-1-text). – Mike Sherrill 'Cat Recall' Aug 10 '17 at 16:08
  • 1
    @Matthieu: [The Unicode FAQ](http://www.unicode.org/faq/) is also fun to read. Here's [Why's there no unique uppercase character for . . .](http://unicode.org/faq/casemap_charprop.html#10) – Mike Sherrill 'Cat Recall' Aug 10 '17 at 16:15
  • @MikeSherrill'CatRecall' thank you for those really interesting links. For those wondering, [case problems can kill](http://gizmodo.com/382026/a-cellphones-missing-dot-kills-two-people-puts-three-more-in-jail) :) (wait, no, that's not funny). – Matthieu Aug 10 '17 at 17:12
134

You can use ILIKE. i.e.

SELECT id FROM groups where name ILIKE 'administrator'
Matthieu
  • 2,736
  • 4
  • 57
  • 87
ADJ
  • 1,531
  • 1
  • 11
  • 15
  • Its correct and working fine for me, I am using MAC OS X(Mountain Lion). – ADJ Apr 04 '13 at 05:58
  • 9
    This will work, but with slow response. To obtain fast access to tables based on the results of computations, I suggest to use the `lower` function. See more [details](https://www.postgresql.org/docs/current/static/indexes-expressional.html) – Afolabi Olaoluwa Dec 01 '16 at 19:28
  • 3
    @AfolabiOlaoluwaAkinwumi fundamentally this comes down to whether you're **searching** for results opposed to **filtering** *known* values. In the latter case, a single uniform case should be persisted at the data level allowing the equality operator to work. [Personal recommendation is upper pascal case for type code values] – Chris Marisic May 01 '17 at 20:10
59

You can also read up on the ILIKE keyword. It can be quite useful at times, albeit it does not conform to the SQL standard. See here for more information: http://www.postgresql.org/docs/9.2/static/functions-matching.html

biegleux
  • 13,179
  • 11
  • 45
  • 52
Priidu Neemre
  • 2,813
  • 2
  • 39
  • 40
  • 12
    Something to watch out for here is malicious user input. If you run a query like `email ILIKE 'user-input-email-here'`, make sure to escape the user input. Otherwise people can enter characters like % that match anything. – Matt De Leon May 13 '13 at 19:23
  • 2
    @MattDeLeon Hi. Well said. But I just want to ask you, if I use `ILIKE` and `prepared statements` will this protect me from `sql injection`? – slevin Oct 09 '13 at 00:00
  • Not sure, I suppose you want to send an escape string to the prepared statement. – Matt De Leon Oct 11 '13 at 09:36
  • 1
    "The key word ILIKE can be used instead of LIKE to make the match case-insensitive according to the active locale. This is not in the SQL standard but is a PostgreSQL extension." Works like a charm in 9.3 – Aleksey Deryagin Dec 23 '14 at 20:38
  • 2
    ILIKE is slower than `lower(column_name) like %expression%`. – Patryk Imosa Oct 03 '17 at 11:53
  • 1
    @PatrykImosa: Can you please elaborate or show an example of ILIKE being slower? – Blama Dec 03 '19 at 16:48
  • @Blama Sorry, it was two years ago, as far as I remember when I tested it on my db results where really better using `lower(column_name) like %expression%` – Patryk Imosa Dec 04 '19 at 14:54
  • @MattDeLeon when would you *not* escape user input? – Madbreaks Jan 12 '21 at 23:30
38

You could also use POSIX regular expressions, like

SELECT id FROM groups where name ~* 'administrator'

SELECT 'asd' ~* 'AsD' returns t

James Brown
  • 633
  • 1
  • 7
  • 11
  • 1
    I had the same problem, I needed case insensitive searches on my PostgreSQL database. I thought about transforming the user input string into a regular expression. Now, using ~* instead of = or LIKE worked perfectly! I didn't need to create new indexes, columns or whatever. Sure, regex search is slower than straight byte comparison, but I don't think the impact on performance would be so much greater than having to handle two sets of data (one lower or uppercased just for searching, then having to retrieve the corresponding original data from the other set). Besides, this is cleaner! – Cyberknight Jan 11 '15 at 03:21
  • 1
    Fine, but how to do with regexp_matches() for example ? – WKT Sep 09 '16 at 17:42
  • 2
    According to postgres docs: The operator ~~ is equivalent to LIKE, and ~~* corresponds to ILIKE. There are also !~~ and !~~* operators that represent NOT LIKE and NOT ILIKE, respectively. All of these operators are PostgreSQL-specific. – sh4 Aug 30 '19 at 13:02
  • 1
    I faced a issue when brackets are included in the text, its not working. like: "code (LC)" – Oshan Wisumperuma Sep 23 '19 at 16:06
  • faced issues with special characters. – Ben Walton Jan 10 '22 at 19:18
28

use ILIKE

select id from groups where name ILIKE 'adminstration';

If your coming the expressjs background and name is a variable use

select id from groups where name ILIKE $1;
MUGABA
  • 751
  • 6
  • 7
  • what's the performance tradeoff between direct matching and ILIKE ? – Nikhil VJ Aug 02 '22 at 06:54
  • The idea is to make sure not you match every string variations, To get to your point, The time and space are the same, However ILIKE does a deep comparision – MUGABA Aug 02 '22 at 07:13
14

Using ~* can improve greatly on performance, with functionality of INSTR.

SELECT id FROM groups WHERE name ~* 'adm'

return rows with name that contains OR equals to 'adm'.

EstevaoLuis
  • 2,422
  • 7
  • 33
  • 40
Robin Goh
  • 151
  • 1
  • 5
  • 4
    Hey, Robin, welcome to SO. James Brown's answer already proposed this solution. Additionally, your proposed answer does not leverage regex in any way. – Rafael Oct 08 '18 at 16:42
  • 1
    `~*` is not precisely case-insensitive match. It is regex pattern matching. In your example if db contains `ADM` then `where name ~* 'Adm'` or `where name ~* 'Ad'` would yield results. Use `ILIKE` instead – nabster Oct 14 '21 at 21:32
10

ILIKE work in this case:

SELECT id 
  FROM groups
 WHERE name ILIKE 'Administrator'
samzna
  • 405
  • 4
  • 8
2

For a case-insensitive parameterized query, you can use the following syntax:

 "select * from article where upper(content) LIKE upper('%' || $1 || '%')"
James Hudnall
  • 39
  • 1
  • 1
2

If you want not only upper/lower case but also diacritics, you can implement your own func:

CREATE EXTENSION unaccent;

CREATE OR REPLACE FUNCTION lower_unaccent(input text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
BEGIN
    return lower(unaccent(input));
END;
$function$;

Call is then

select lower_unaccent('Hôtel')
>> 'hotel'
pink_demon
  • 89
  • 4
2

A tested approach is using ~*

As in the example below

SELECT id FROM groups WHERE name ~* 'administrator'
VVD
  • 178
  • 1
  • 9
1
-- Install 'Case Ignore Test Extension'
create extension citext;

-- Make a request
select 'Thomas'::citext in ('thomas', 'tiago');

select name from users where name::citext in ('thomas', 'tiago');
DEV Tiago França
  • 1,271
  • 9
  • 9
1

None of the existing answers are correct.

The Turkish language has multiple representations of the letter I which are equal from a case-sensitivity point of view. Comparing the lowercase or uppercase forms of these characters will return false.

Georgian has a character that has no uppercase equivalent, and a character that has no lowercase equivalent. If you dig into Java's implementation of String.equalsIgnoreCase() you will find the following code snippet:

// Case insensitive comparison of two code points
private static int compareCodePointCI(int cp1, int cp2) {
    // try converting both characters to uppercase.
    // If the results match, then the comparison scan should
    // continue.
    cp1 = Character.toUpperCase(cp1);
    cp2 = Character.toUpperCase(cp2);
    if (cp1 != cp2) {
        // Unfortunately, conversion to uppercase does not work properly
        // for the Georgian alphabet, which has strange rules about case
        // conversion.  So we need to make one last check before
        // exiting.
        cp1 = Character.toLowerCase(cp1);
        cp2 = Character.toLowerCase(cp2);
        if (cp1 != cp2) {
            return cp1 - cp2;
        }
    }
    return 0;
}

Besides, if you'd like to compare strings in a case-insensitive manner you'd likely want to strip out other differences as well. You can use a process called "text normalization" to convert text into a canonical form that strips away accents, punctuation, characters that display the same but have different values, and invisible characters.

The way I personally dealt with this problem is storing two representations for each column that I want to search against:

  • foobar_display which corresponds to the original value that the user entered, and is used for display purposes.
  • foobar_compare which is used for search and comparison purposes.

Every time that I insert a row into the database I convert foobar_display into foobar_compareusing a [text normalization library](https://github.com/slugify/slugify). I store both values, and any time a user tries searching for a value I usefoobar_compare` to detect partial or full matches.

Finally, I add a unique constraint against foobar_compare to unwanted duplicates.

Gili
  • 86,244
  • 97
  • 390
  • 689
-20
select id from groups where name in ('administrator', 'ADMINISTRATOR', 'Administrator')
Anand Tripathi
  • 14,556
  • 1
  • 47
  • 52