I have a bunch of URLs, each containing a string in one of two forms:

hotel+4 digit number: hotel1234 
or slash+4digit.html: /1234.html

Is there a regex to extract the 4-digit number (e.g. 1234), using either Python or MySQL?

I'm thinking of something like 'hotel'[0-9][0-9][0-9][0-9].

Thanks!

Sia

2 Answers

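You can try the REGEXP operator. With the ^ and $ anchors, the patterns below match rows whose column value is exactly four digits: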

SELECT * FROM Table WHERE ColumnName REGEXP '^[0-9]{4}$'

or

SELECT * FROM Table WHERE ColumnName REGEXP '^[[:digit:]]{4}$';
Arulkumar

The following stackoverflow.com link, which shows how to extract a substring from inside a string in Python, might be useful.
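
For the Python side, a minimal sketch of the extraction could look like the following. It assumes the URLs follow the two forms given in the question; the function name `extract_id` and the example URLs are made up for illustration.

```python
import re

# Four digits after "hotel" (not followed by a fifth digit),
# or four digits between "/" and ".html".
PATTERN = re.compile(r"hotel([0-9]{4})(?![0-9])|/([0-9]{4})\.html")

def extract_id(url):
    """Return the 4-digit number as a string, or None if neither form is found."""
    match = PATTERN.search(url)
    if match is None:
        return None
    # Exactly one of the two groups is set, depending on which form matched.
    return match.group(1) or match.group(2)

print(extract_id("http://example.com/hotel1234"))
print(extract_id("http://example.com/rooms/5678.html"))
```

Each alternative has its own capture group, so the function returns whichever group actually matched.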

Unfortunately, MySQL's REGEXP only returns true or false depending on whether the pattern matches; it does not return the matched substring. I have found substring_index useful if you know the text surrounding the target:

select case when ColumnName like 'hotel____' then substring_index(ColumnName,'hotel',-1)
            when ColumnName like '/____.html' then substring_index(substring_index(ColumnName,'/',-1),'.html',1)
            else ColumnName
             end digit_extraction
  from TableName
 where ...;

The case statement above isn't necessary because of the way substring_index works (by returning the entire string if the search string isn't found).

select substring_index(substring_index(substring_index(ColumnName,'hotel',-1),'/',-1),'.html',1)
  from TableName
 where ...;
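
To see why the nested calls work without the case statement, here is a rough Python emulation of substring_index. It is only a sketch: the helper below is hypothetical, written for this illustration, and handles only non-zero counts and non-empty delimiters.

```python
def substring_index(s, delim, count):
    """Rough emulation of MySQL's SUBSTRING_INDEX for non-zero counts."""
    if count == 0:
        raise ValueError("count of 0 is not handled in this sketch")
    parts = s.split(delim)
    # Positive count: keep everything before the count-th delimiter from the left.
    # Negative count: keep everything after the count-th delimiter from the right.
    kept = parts[:count] if count > 0 else parts[count:]
    return delim.join(kept)

def extract_digits(value):
    # Same nesting as the SQL: strip up to 'hotel', then up to '/', then '.html'.
    after_hotel = substring_index(value, "hotel", -1)
    after_slash = substring_index(after_hotel, "/", -1)
    return substring_index(after_slash, ".html", 1)

print(extract_digits("hotel1234"))
print(extract_digits("/1234.html"))
```

When a delimiter is absent, the split yields a single piece and the string passes through unchanged, which is the behaviour the nested query relies on. Note that nothing here checks that the result is actually four digits: a value such as "/about.html" would come back as "about".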
RMathis