0

I have a mysql query with REGEXP which match the starting of field with 'A', 'An' and 'The' Followed by space if match then trim the field from starting of first space, then i match the starting of field with special character like (','',[:space:]) if yes then trim all the leading special character. Mysql query is with CASE like this:

CASE
  WHEN field_data_field_display_title_field_display_title_value REGEXP '(^(A|An|The)[[:space:]])' = 1 THEN
  TRIM(SUBSTR(field_data_field_display_title_field_display_title_value , INSTR(field_data_field_display_title_field_display_title_value ,' ')))
  WHEN field_data_field_display_title_field_display_title_value REGEXP '(^[\"\'[:space:]])' = 1 THEN
    TRIM(SUBSTR(field_data_field_display_title_field_display_title_value ,2))
  ELSE field_data_field_display_title_field_display_title_value
END

I am not able to trim all leading special character while i can trim the first leading special character by passing '2' in SUBSTR function. As mysql doesn't support capturing group so i can't get the matched value in captured group.

So my question is how can i get the position of first alphabetic character in field with mysql query so that i can pass that position in SUBSTR function to trim all the leading special character. I tried with [:alpha:] class like:

TRIM(SUBSTR(field_data_field_display_title_field_display_title_value ,
 INSTR(field_data_field_display_title_field_display_title_value ,[:alpha:])))

but it give mysql syntax error. Or Anybody can suggest me any other approach to trim all the leading special characters.

Thanks in Advance!

MutantMahesh
  • 1,480
  • 15
  • 20
  • See [How to do a regular expression replace in MySQL?](http://stackoverflow.com/questions/986826/how-to-do-a-regular-expression-replace-in-mysql) – Bill Karwin Jul 20 '14 at 06:57
  • Or many other similar questions: http://stackoverflow.com/search?q=%5Bmysql%5D+replace+regular+expression – Bill Karwin Jul 20 '14 at 06:58
  • The first link you provided suggest using some mysql library but i don't have permission to install library on server. – MutantMahesh Jul 20 '14 at 07:22

3 Answers3

0

There's no regexp match function that reports the position in the string, nor is there any regexp replace function in MySQL.

(Update: MySQL 8.0 now supports more regex functions, but they are not available in earlier versions of MySQL.)

If you know you're searching for a short list of specific words, you could pick the least position among several matches:

SUBSTRING(field_data_field_display_title_field_display_title_value,
  LEAST(
    INSTR(field_data_field_display_title_field_display_title_value, 'A '),
    INSTR(field_data_field_display_title_field_display_title_value, 'An '),
    INSTR(field_data_field_display_title_field_display_title_value, 'The ')
  )
)

It's usually awkward to do substring matches or replaces in SQL, because SQL is fundamentally designed to treat a column as an irreducible piece of data. Any functions to work with substrings are extensions to the language, not something built-in.

If you want better handling by string functions, it would be easier to fetch the whole string into an application, and write code using a more rich set of functions. Though I understand this is not practical if the reason for the substring manipulation you describe is for expressions that affect query results, such as the WHERE clause to restrict rows, or in the ORDER BY clause to sort.

If so, then the better solution is to change the way you store strings. Split up the strings in a prefix portion with the special characters, then a separate column for the portion starting with A, An, or The, and then perhaps even a third column with trailing text that you don't want to be part of the main text.

The advantage of splitting it up is that SQL expressions to work on the main string are much simpler, and you can even index it normally to gain a lot of performance for certain queries.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
0

I was using the mysql snippet i posted in question in ORDER BY clause to sort the data. As i was having small list of matches which i want to remove so i followed @BillKarwin suggestion. ORDER BY clause in query become something like

ORDER BY 
  CASE
    WHEN field_data_field_display_title_field_display_title_value REGEXP '^(A|An|The)[[:space:]]' = 1 THEN
      TRIM(SUBSTR(field_data_field_display_title_field_display_title_value , INSTR(field_data_field_display_title_field_display_title_value ,' ')))
    WHEN field_data_field_display_title_field_display_title_value REGEXP '^[\']' = 1 THEN
      TRIM(LEADING '\'' FROM field_data_field_display_title_field_display_title_value)
    WHEN field_data_field_display_title_field_display_title_value REGEXP '^[[:space:]]' = 1 THEN
      TRIM(LEADING ' ' FROM field_data_field_display_title_field_display_title_value)
    WHEN field_data_field_display_title_field_display_title_value REGEXP '^[\"]' = 1 THEN
      TRIM(LEADING '"' FROM field_data_field_display_title_field_display_title_value)
    ELSE field_data_field_display_title_field_display_title_value
  END ASC
MutantMahesh
  • 1,480
  • 15
  • 20
0

If you update your MySql version (8.0+:) you can use REGEXP_INSTR function.

SELECT
  REGEXP_INSTR('500 Oracle Parkway, Redwood Shores, CA',
               '[^ ]+', 1, 6) "REGEXP_INSTR"
  FROM DUAL;

REGEXP_INSTR
------------
          37

Or if updating the MySql version is not an option, you can try a user-defined function (UDF) like mysql-udf-regexp. I haven't used it and the documentation isn't clear about how to install it.