30

I need to create a case-insensitive index on a column in rails. I did this via SQL:

execute(
   "CREATE UNIQUE INDEX index_users_on_lower_email_index 
    ON users (lower(email))"
 )

This works great, but in my schema.rb file I have:

add_index "users", [nil], 
  :name => "index_users_on_lower_email_index", 
  :unique => true

Notice the "nil". So when I try to clone the database to run a test, I get an obvious error. Am I doing something wrong here? Is there some other convention that I should be using inside rails?

Thanks for the help.

Mike Woodhouse
  • 51,832
  • 12
  • 88
  • 127
Binary Logic
  • 2,562
  • 7
  • 31
  • 39
  • 1
    The syntax shown looks fairly platform-specific, so the deficiency is likely to be in the adapter. Some additional information would be helpful,: what database? I'm guessing PostgreSQL? What version of ActiveRecord? – Mike Woodhouse Nov 01 '11 at 10:31
  • postgres extension can be used with rails/non-rails apps http://stackoverflow.com/a/32136337/1536309 – Blair Anderson Jan 10 '16 at 04:30

8 Answers8

41

Since MySQL indexes are already case-insensitive, I'm guessing you're dealing with PostgreSQL, which creates case-sensitive indexes by default. I'm answering here based on Rails 3.2.3 and PostgreSQL 8.4.

It seems functional indexes are one more example of things that ActiveRecord can't generate. Foreign keys and UUID columns are two more that come to mind. So there is no choice (other than monkey-patching ActiveRecord) but to use execute statements.

This means for an accurate dump of your database, you'll need to abandon the DB-agnostic schema.rb in favor of DB-specific structure.sql. See the Rails Guide on Migrations, section 6.2 Types of Schema Dumps. This is set as follows:

config/application.rb

config.active_record.schema_format = :sql

db/structure.sql should be updated automatically when you run a migration. You can generate it manually with this command:

rake db:structure:dump

The file is pure Postgres SQL. Although not documented when you use rake -T to list rake tasks, it seems that you can use this command to load the database from the structure.sql dump:

rake db:structure:load

There's nothing magic here: the source code (shown here from Rails 3.2.16) just calls psql on structure.sql.

Finally, here is my migration to drop an old, case-sensitive email constraint and add the case-sensitive functional index:

class FixEmailUniqueIndexOnUsers < ActiveRecord::Migration
  def up
    remove_index :users, :email
    execute "CREATE UNIQUE INDEX index_users_on_lowercase_email 
             ON users USING btree (lower(email));"
  end

  def down
    execute "DROP INDEX index_users_on_lowercase_email;"
    add_index :users, :email, :unique => true
  end
end
Mark Berry
  • 17,843
  • 4
  • 58
  • 88
  • Great answer, but wondering: in PostgreSQL (at least >= 8.3), the default index method is `btree` - why did you specify it explicitly? – Richard Michael Dec 08 '12 at 14:11
  • Don't remember. Either I was using old documentation or I snipped the script out of a GUI. – Mark Berry Dec 13 '12 at 01:46
  • @meagar, why did you edit this post? I was merely trying to call out that I updated the post on February 4, 2014 to fix broken links, one of which now points specifically to Rails 3.2.16. It's discouraging for someone who is trying to contribute to get slapped down by a trivial mod to a six-year-old post, a mod that in my opinion removes relevant information. – Mark Berry Jul 11 '18 at 15:29
  • @MarkBerry Meta information about edits should go in the "Edit summary" field when you're submitting the edit, that kind of information just clutters up the answer. Nobody "slapped" anybody down here, my edit was not punitive in any way, and there is absolutely no reason for you to take offence over it. It was harmless routine cleanup. – user229044 Jul 11 '18 at 15:39
  • @meagar, I disagree about where meta info belongs (I doubt most people know how to view edits), but now that you have explained the policy, I will keep it in mind. Perhaps you could also use the "Edit summary" to explain edits--it would both educate and minimize offence. Seems strange to discuss this here, but I did research how to ask about an edit: https://meta.stackexchange.com/a/75666/251926. – Mark Berry Jul 12 '18 at 17:28
31

If you are using PostgreSQL you can change your column type to citext - case-insensitive string. It also makes search independent from the register.

def change
  enable_extension :citext
  change_column :users, :email, :citext
  add_index :users, :email, unique: true
end
pragma
  • 1,290
  • 14
  • 16
  • Beautiful: works like a charm. The only downside is that the column "base" type is `text`, not `string` (in Postgres, `character varying`), which may have performance and storage implications for massive User tables. (?) – aec Jan 23 '16 at 00:18
  • 3
    @aec there should be no difference in storage or performance. See: http://www.postgresql.org/docs/9.1/static/datatype-character.html – Tijn Mar 23 '16 at 13:09
  • 2
    Note that PostgreSQL doesn't yet optimize LIKE or ILIKE conditions for citext fields. See: http://dba.stackexchange.com/questions/105244/index-on-column-with-data-type-citext-not-used – Jeff Tsay Aug 01 '16 at 07:26
8

I would simplify this...

In your model:

before_validation :downcase_email

def downcase_email
  self.email = email.downcase
end

That way, the index is database agnostic, and your emails are all lowercase in the database.

Jesse Wolgamott
  • 40,197
  • 4
  • 83
  • 109
  • 11
    Would work, but might not be appreciated if you send email to people who want to see case preserved in their email addresses (e.g. BobSmith@example.com). – Mark Berry Jun 22 '12 at 23:38
  • 8
    @MarkBerry Cannot tell if you're trolling. – Jesse Wolgamott Jun 23 '12 at 17:40
  • 17
    Sorry if my phrasing was off. I'm trying to say yes, you can simplify your indexes if you discard case information on emails, or anything else you want to index, before saving it to the database. With email addresses, you might get away with it--the email addresses would still work--but some recipients wouldn't like it. In general, I think that preserving user-supplied information (including case, accent marks, etc.) should take priority over technical simplification. – Mark Berry Jun 24 '12 at 20:42
  • `self.email = email.downcase if email` to trap nil otherwise Ruby will throw an exception `undefined method 'downcase' for nil:NilClass` – scarver2 Oct 15 '12 at 20:38
  • Depends on the use case, database agnostic is one thing but enforcing constraints on the database level has some advantages as well: different apps/scripts/batch jobs using the same database, what if you move away from Rails, etc. I mostly duplicate constraints, in Rails for convenience, on the database for protecting my data. – Cimm Jan 12 '13 at 12:20
4

Have you considered using schema_plus (https://github.com/lomba/schema_plus)? Among other things (support for enforcing foreign keys in the database and for views), it supports setting case-insensitive indexes for PostgreSQL databases and handles dumping them in the schema. From the Readme, "If you’re using Postgresql, SchemaPlus provides support for conditions, expressions, index methods, and case-insensitive indexes."

tovodeverett
  • 1,053
  • 8
  • 7
4

The documentation is unclear on how to do this but the source looks like this:

def add_index(table_name, column_name, options = {})
  index_name, index_type, index_columns = add_index_options(table_name, column_name, options)
  execute "CREATE #{index_type} INDEX #{quote_column_name(index_name)} ON #{quote_table_name(table_name)} (#{index_columns})"
end

So, if your database's quote_column_name is the default implementation (which does nothing at all), then this might work:

add_index "users", ['lower(email)'], :name => "index_users_on_lower_email_index", :unique => true

You note that you tried that one but it didn't work (adding that to your question might be a good idea). Looks like ActiveRecord simply doesn't understand indexes on a computed value. I can think of an ugly hack that will get it done but it is ugly:

  1. Add an email_lc column.
  2. Add a before_validation or before_save hook to put a lower case version of email into email_lc.
  3. Put your unique index on email_lc.

That's pretty ugly and you might feel dirty for doing it but that's the best I can think of right now.

mu is too short
  • 426,620
  • 70
  • 833
  • 800
1

For Rails 4.2, create case-insensitive unique index in users table on name column.

Create new migration file with empty change method:

$ rails generate migration add_index_in_users_on_name

Add call of add_index method to empty change method:

add_index :users, 'lower(name)', name: 'index_users_on_lower_name', unique: true

Run Rake db:migrate task:

$ rake db:migrate

In result, index will be added correctly and file db/schema.rb contains correct add_index:

add_index "users", ["LOWER(\"NAME\")"], name: "index_users_on_lower_name", unique: true

This tested only with RDB Oracle.

sadnix
  • 53
  • 2
  • 6
  • 2
    This doesn't actually work. I get: CREATE UNIQUE INDEX "index_users_on_lower_email" ON "users" ("lower(email)") PG::UndefinedColumn: ERROR: column "lower(email)" does not exist – Rob May 27 '15 at 19:06
  • Rails version? In this case, the column must be specified as (lower("email")) . In your example, double quotes are not set correctly. – sadnix May 27 '15 at 22:23
  • 1
    Did anyone else checked this solution? I'm using rails 4.2.3 and postgresql 9.4 and that solution doesn't work regardless of quotes place. I'm getting PG::UndefinedColumn: ERROR. – Jeremen Jul 11 '15 at 14:07
  • same here. the statement built is always ``CREATE UNIQUE INDEX "index_contacts_on_lower_email" ON "contacts" ("LOWER(contacts)")``, when it should be just ``(LOWER(contacts))``, regardless of quote placement. – srecnig Aug 25 '15 at 13:14
  • 1
    Rails 4.2.5 still wraps quotes around index columns, therefor breaking any attempt to use add_index on expressions. – maia Nov 27 '15 at 20:57
  • 4
    Works for Rails 5.1.4 + PostgresSQL – Alex Tatarnikov Jan 28 '18 at 14:00
0

I would suggest (just as an opportunity to consider among others) to use two separate fields:

  • email
  • email_original

The first one is always down cased and thus can be uniquely indexed in a database agnostic manner, while the second field keeps verbatim to what user enetered and can have upper characters.

Obviously in User model one then needs to set email based on email_original on each save and prohibit direct mangling with email field.

Hero Qu
  • 911
  • 9
  • 10
-3

I think you need column name(s) as below

   add_index "users", [email], {:name => "index_users_on_lower_email_index", :unique => true }

And you have to make email field in database with proper Case Insensitive collation, that way your index will be also case insensitive.

depending of db engine which you are using it syntax may be different but

alter table [users] alter column [email] varchar(250) collate utf8_general_ci ...

and when you add index to this column it will be case insensitive.

Milan Jaric
  • 5,556
  • 2
  • 26
  • 34
  • This is not what I asked. I need a case insensitive index. Which I know how to do, but rails is not recognizing my solution. – Binary Logic Oct 30 '11 at 23:05
  • For your question everything depends on collation you pick for your table or just one field in case if you want "expected" behaviour, or else you will need hacky code or ending patching your database with lots of sql scrits. schema dump is reading this "meta data" from your database and since it is generic solution for all database, some things are missing sometimes if you do not follow conventions. I'm not sure for your case, but I don't see why you need case sensitive column in table with case insensitive index? – Milan Jaric Aug 14 '13 at 19:59