0

I have this pattern in urls and I need to extract the 4 digit numbers out of the url. The patter looks like this:

hotel1234 
/1234.html

I tried

  select sdate,url,
  case when url like '/%.html' then substring_index(url,'.',1)
  when url REGEXP 'HOTEL'+'[0-9].{4}' then left(substring_index(url,'HOTEL',-1),4)          
        else url
        end digit
  from urltable

This does not work on some of the url. i would like to regexp on '/1234.html',what is the right expression?

Cœur
  • 37,241
  • 25
  • 195
  • 267
Sia
  • 1
  • 1

1 Answers1

0

Following should work:

SELECT url, 
CASE WHEN url LIKE '/%.html' THEN SUBSTRING(SUBSTRING_INDEX(url,'.',1), 2)
  WHEN url REGEXP '[a-zA-Z]{5}[0-9]{4}' THEN substring(url, -4)         
        ELSE url
        END digit
FROM test;

Here's the SQL Fiddle.

Darshan Mehta
  • 30,102
  • 11
  • 68
  • 102