1

I have a unique index to prevent users from taking identical, case insensitive, names:

CREATE UNIQUE INDEX accounts_lower_idx ON public.accounts USING btree (lower((account_name)::text))

Recently it came to my attention that some pairs of users did have these kinds of identical names.

Under normal cases, attempting to insert a user with a duplicate name will fail:

SQL Error [23505]: ERROR: duplicate key value violates unique constraint "accounts_lower_idx"
  Detail: Key (lower(account_name::text))=(cogg) already exists.

But these duplicate names are still popping up, somehow.

Attempting to create a duplicate index will result in an error:

SQL Error [23505]: ERROR: could not create unique index "accounts_lower_idx_1"
  Detail: Key (lower(account_name::text))=(some name) is duplicated.

I'm not making any fancy changes to these rows. I'm using trivial insert statements, and names are not changed once set.

This isn't another question about duplicate NULL's being allowed. account_name is marked not null.

Is there some reason this wouldn't work? Is using indexes for this bad practice? Should a constraint be used instead? As far as I understand, constraints use a unique index internally. In any case, the fact that trying to make a duplicate index fails seems to indicate something has gone badly wrong.

In case it's relevant, my postgres version:

postgres (PostgreSQL) 10.18 (Ubuntu 10.18-1.pgdg20.04+1)
cogg
  • 11
  • 1
  • 1
    It could be a bug in PostgreSQL's indexing code (vanishingly unlikely), it could be hardware problems that PostgreSQL failed to detect (possible, but not common) or it could be that you are mistaken. Nobody can say if you are mistaken because for some reason you haven't actually shown us any of these duplicates. Please show `SELECT ctid, '[' || account_name || ']', lower(account_name)` for the rows in question. – Richard Huxton Nov 01 '21 at 23:06
  • 2
    Sounds like index corruption. It is hard to know the root cause of such corruption. Have you had hardware problems recently? Has the database been moved? Major OS updates? And what is the OS? Do you have page checksums turned on? Do you have backups from back before you noticed problems? What is the character set and collation of your database? Have people reported data as being missing/unfindable? – jjanes Nov 02 '21 at 00:28
  • @jjanes I'm going to assume this is the case for the time being. Hardware problems seem unlikely as the issue has occurred on two separate servers. I have re-imaged and updated the servers multiple times, so it might be related to [this](https://postgresql.verite.pro/blog/2018/08/27/glibc-upgrade.html). If memory serves I was using Ubuntu 18.04 at one point, which evidently uses a pre-2.28 build of glibc. I'll reindex and consider changing the collation. – cogg Nov 02 '21 at 03:59
  • Yes, that is the likely cause. – Laurenz Albe Nov 02 '21 at 04:04

1 Answers1

0

Use of the citext extension is mentioned in a comment in this older similar post:

Postgres unique constraint vs index

Citext’s documentation is here: https://www.postgresql.org/docs/current/citext.html

"The standard approach to doing case-insensitive matches in PostgreSQL has been to use the lower function when comparing values…"

"If you declare a column as UNIQUE or PRIMARY KEY, the implicitly generated index is case-sensitive. So it's useless for case-insensitive searches, and it won't enforce uniqueness case-insensitively."

Citext's documentation also suggests this tip:

https://www.postgresql.org/docs/current/collation.html#COLLATION-NONDETERMINISTIC

Philip Wright
  • 306
  • 2
  • 3