24

I want to match a url field against a url prefix (which may contain percent signs), e.g. .where("url LIKE ?", "#{some_url}%"). What's the most Rails way?

Nick Ginanto
  • 31,090
  • 47
  • 134
  • 244
Costa Shapiro
  • 524
  • 1
  • 3
  • 8
  • I hope you don't have any plans to scale to millions of posts. That query will suck system resources faster than an SUV drains gas out of a tank. – Wes Apr 18 '11 at 23:42
  • @Wes: Depends on your database. AFAIK, recent versions of PostgreSQL can utilize an index for LIKE matches that use a prefix (i.e. those of the form `X%` for some fixed `X`). There are some notes on this over here: http://stackoverflow.com/questions/1566717/postgresql-like-query-performance-variations – mu is too short Apr 18 '11 at 23:50
  • @Wes: You're still hosed with `%X%` patterns though, that'll almost certainly hand you a table scan. Sorry that I don't have an authoritative reference but you could probably grok a query plan and see what happens. – mu is too short Apr 19 '11 at 01:48

4 Answers4

27

From Rails version 4.2.x there is an active record method called sanitize_sql_like. So, you can do in your model a search scope like:

scope :search, -> search { where('"accounts"."name" LIKE ?', "#{sanitize_sql_like(search)}%") }

and call the scope like:

Account.search('Test_%')

The resulting escaped sql string is:

SELECT "accounts".* FROM "accounts" WHERE ("accounts"."name" LIKE 'Test\_\%%')

Read more here: http://edgeapi.rubyonrails.org/classes/ActiveRecord/Sanitization/ClassMethods.html

phlegx
  • 2,618
  • 3
  • 35
  • 39
  • 2
    In my opinion, this is the most elegant way to solve this issue – Nikola M. Dec 14 '15 at 13:10
  • Guess from rails 5.* sanitize_sql_like is not supported Any other alternative? – Surya Dec 11 '18 at 09:41
  • It is supported on Rails 5.x. See here https://github.com/rails/rails/blob/v5.2.1.1/activerecord/lib/active_record/sanitization.rb#L107 – phlegx Dec 11 '18 at 16:40
  • 1
    FYI these are protected/private methods on 5.0 and 5.1. They're only public on 5.2. – Teoulas Jul 05 '19 at 09:49
  • 2
    @Teoulas is right. Rails 5.0 and 5.1, use this to call them: `ActiveRecord::Base.send(:sanitize_sql_like, "Test_%")` – stwr667 Sep 08 '20 at 02:13
22

If I understand correctly, you're worried about "%" appearing inside some_url and rightly so; you should also be worried about embedded underscores ("_") too, they're the LIKE version of "." in a regex. I don't think there is any Rails-specific way of doing this so you're left with gsub:

.where('url like ?', some_url.gsub('%', '\\\\\%').gsub('_', '\\\\\_') + '%')

There's no need for string interpolation here either. You need to double the backslashes to escape their meaning from the database's string parser so that the LIKE parser will see simple "\%" and know to ignore the escaped percent sign.

You should check your logs to make sure the two backslashes get through. I'm getting confusing results from checking things in irb, using five (!) gets the right output but I don't see the sense in it; if anyone does see the sense in five of them, an explanatory comment would be appreciated.

UPDATE: Jason King has kindly offered a simplification for the nightmare of escaped escape characters. This lets you specify a temporary escape character so you can do things like this:

.where("url LIKE ? ESCAPE '!'", some_url.gsub(/[!%_]/) { |x| '!' + x })

I've also switched to the block form of gsub to make it a bit less nasty.

This is standard SQL92 syntax, so will work in any DB that supports that, including PostgreSQL, MySQL and SQLite.

Embedding one language inside another is always a bit of a nightmarish kludge and there's not that much you can do about it. There will always be ugly little bits that you just have to grin and bear.

smathy
  • 26,283
  • 5
  • 48
  • 68
mu is too short
  • 426,620
  • 70
  • 833
  • 800
  • Btw, if (I don't know) you need to get double-backslashes into those strings, the above won't do it. `'\\' => \` in Ruby (and most languages). – smathy Apr 19 '11 at 03:01
  • @Jason: Five escapes produces the right thing in `irb`, thanks for catching that. I'm guessing that it goes through as three tokens "\\", "\\", and "\%" which, after escape processing, ends up as "\\%" as desired. I usually just keep adding them until it works then reverse engineer the justification. Does anyone know of a quote operator that really does produce literal results without any processing or big pile of mess? – mu is too short Apr 19 '11 at 03:21
  • 8
    Something else you could do is to add an `ESCAPE` clause, like this: `.where( "url LIKE ? ESCAPE '!'", some_url.gsub('%', '!%').gsub('_', '!_')` – smathy Apr 19 '11 at 04:13
  • @Jason: That's genius. I wish I could give you points for a comment or just transfer ownership of the answer to you. – mu is too short Apr 19 '11 at 04:34
  • @mu Unfortunately, your answer is probably correct. The only (barely convincing) reason for such a nuisance I can think of: LIKE queries suck, and should not generally be used. – Costa Shapiro Apr 19 '11 at 14:24
  • @Costa: Thanks for the correction. Sometimes you have a choice between different types of pain: use LIKE or SIMILAR TO inside the database and deal with the quoting nonsense or pull a ton of stuff out of the database, do the pattern matching in Ruby, and then throw most of the data away. Six of one, half a dozen of the other. – mu is too short Apr 19 '11 at 19:45
  • @mu - you're welcome - and no worries about the points, these things have a way of working themselves out in the end. – smathy Apr 20 '11 at 18:58
  • 2
    You could also use capturing groups instead of the block syntax (shorter and probably faster): `.where("url LIKE ? ESCAPE '!'", some_url.gsub(/([!%_])/, '!\1')` – Lucas Wiman Sep 12 '12 at 18:31
  • @RecursivelyIronic: I rarely use that form of `gsub`, the double meaning of backslashes can be problematic so I usually just go with the block form when working with a non-constant replacement so that I don't have to worry about when backslashes need to be doubled and when they don't. But that's just personal preference. – mu is too short Sep 12 '12 at 18:37
6

https://gist.github.com/3656283

With this code,

Item.where(Item.arel_table[:name].matches("%sample!%code%"))

correctly escapes % between "sample" and "code", and matches "AAAsample%codeBBB" but does not for "AAAsampleBBBcodeCCC" on MySQL, PostgreSQL and SQLite3 at least.

Vikdor
  • 23,934
  • 10
  • 61
  • 84
kaznum
  • 61
  • 1
  • 2
-3
Post.where('url like ?', "%#{some_url + '%'}%)
thenengah
  • 42,557
  • 33
  • 113
  • 157
  • @Sam Very close. `Post.where('url like ?', "%#{some_url}%")` – Rein Henrichs Apr 18 '11 at 23:31
  • You code just makes a like condition match other entries. @Costa-Shaprio is looking for a way to do what your doing plus having `%` or at least that is how I'm reading it ie 'which may contain percent signs' – thenengah Apr 18 '11 at 23:44
  • @Rein: No. That changes the nature of the match ("begins with" becomes "anywhere in") and does nothing to account for an embedded "%" in `some_url` let alone underscores. – mu is too short Apr 18 '11 at 23:48
  • @mu Ok, remove the first % then. That's a simple requirements misunderstanding. It still works for the 90% case. And at least it's syntactically valid, which `"%#{some_url + '%'}%` is not. – Rein Henrichs Apr 19 '11 at 01:19
  • 1
    @mu Did you check to see if ActiveRecord's sql sanitization already escapes %s before you commented? – Rein Henrichs Apr 19 '11 at 01:20
  • 1
    @Rein: How would AR tell the difference between an intentional `%x%x%` and an accidental `"%#{a}%"` when `a` just happens to be an unescaped `'x%x'`? The first (intentional) case wants "*anything* x *anything* x *anything*" whereas the second would want "*anything* x%x *anything*" but get "*anything* x *anything* x *anything*" because no one escaped it. By the time AR sees the argument for LIKE, it just has a string that has embedded SQL syntax and it is too late to infer the programmer's intent. – mu is too short Apr 19 '11 at 01:42
  • @ReinH - it doesn't, %s are not escaped by AR - nor is there a % escaper anywhere in AR. – smathy Apr 19 '11 at 02:56
  • Yes, I understand what % means and why it is important to escape it. I was just trying to fix a syntax error. Please direct your corrections to the original answer. – Rein Henrichs Apr 19 '11 at 04:20
  • bottom line the answer is not good. it wont escape %s in the url. I can't believe rails have no native answer for that. wtf? – Vitaly Kushner Apr 19 '11 at 11:59
  • yeah, it's not good. can you give examples of your urls that you need to get searched? That's why my answer sucks, I just pasted some code that was kinda related to you answer. Post some examples of the urls you wan to search. – thenengah Apr 19 '11 at 15:58