1

I can think of convoluted and ugly ways to do this in mysql but i'm looking for a nice way. Let's say i have a bunch of school names, like

Meopham County Infant School
Speldhurst Nursery School
Rainbow Pre-School
The Annex School House
Fleet Learning Zone
Dartford Grammar School
Kiddliwinks
Hextable Kindergarten
The Rocking Horse Montessori Kinder
Little Angels Day Nursery

and i have a list of stopwords:

["school", "primary", "nursery", "college", "junior", "church", "cofe", "community", "infant"]

I have a ruby function "short_name" which returns the school name up to, but not including, the first instance of any of the stop words, so that we get

"Bower Grove School" => "Bower Grove"
"Fulston Manor School" => "Fulston Manor"
"St Johns Church Hall Play" => "St Johns"
"St Botolph's Church of England Voluntary Aided Primary School" => "St Botolph's"
"Fawkham House School" => "Fawkham House"
"Silverdale Day Nursery" => "Silverdale Day"
"Vigo Village School" => "Vigo Village"
"Sevenoaks Primary School" => "Sevenoaks"
"High Weald Academy" => "High Weald Academy"
"The Ebbsfleet Academy" => "The Ebbsfleet Academy"

That's all fine. My question is: what's the simplest way to do the above string processing in mysql?

eg, if i wanted to search by this short_name, i'd want to do something like

"select * from schools where <function(name)> = 'Bower Grove'"

what's the simplest way to do <function>? I'd thought that some combination of substring(), and locate(), using a regex, would be the way to go, but it looks like i can't use a regex with locate.

I guess the regex would be

"school|primary|nursery|college|junior|church|cofe|community|infant"

thanks, Max

Max Williams
  • 32,435
  • 31
  • 130
  • 197
  • http://stackoverflow.com/questions/5361457/mysql-return-matching-pattern-in-regexp-query – Taemyr Feb 23 '15 at 12:38
  • Thanks @Taemyr, i'd prefer to avoid a solution which requires a patch to MySQL if possible. Appreciate the option though :) – Max Williams Feb 23 '15 at 12:41
  • Have you considered using a LIKE clause at all? `SELECT * FROM schools WHERE name LIKE 'Bower Grove%';` would return all schools where the name starts with the string `Bower Grove`. – Emile Pels Feb 23 '15 at 12:43
  • I need to use the function in an update, as well as a select, so that's not an option i'm afraid. Also, that would return schools called eg "Bower Grove Park" as well as "Bower Grove". – Max Williams Feb 23 '15 at 13:04

1 Answers1

2

MySQL does support regular expressions. Unfortunately, it is only used for matching.

Here is one method:

select least(substring_index(schoolname, ' School', 1),
             substring_index(schoolname, ' Primary', 1),
             . . .
            )

This uses substring_index() to extract the first part of a string before the separator. If the separator is not present, you get the whole string. The least() function will then choose the shortest string.

This assumes that that keyword has a space before it. After all, you probably don't want to totally eliminate everything for a name like "School for Little Angels".

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • That's great, thanks Gordon. `substring_index()` is case sensitive (i assumed the solution would be case insensitive since mysql usually is, by default) but i got round this by starting with the titleized stopwords ("Primary", "School" etc) and generating substring_index functions for both versions and adding them to the list. – Max Williams Feb 23 '15 at 13:42