
I need to count distinct elements of a column, but I have to ignore parts of the string.

Example: I have a column with the values:

+-----------+
|    col1   |
+-----------+
|  xxx-xx-1 |
+-----------+
|  xxx-xx-2 |
+-----------+
|  yyy-yy-1 |
+-----------+
| zz-zz-z-1 |
+-----------+
| zz-zz-z-2 |
+-----------+

If I run SELECT COUNT(DISTINCT col1) FROM mytable, it returns 5, but I want to get just 3 (the trailing numbers should be ignored).

Is there a way to use regular expressions to ignore everything from the last hyphen to the end of the string?

I can't simply cut off a fixed number of characters, because the length of the part after the last hyphen may vary.

I hope you all understand the problem, thanks everyone!

  • Possible duplicate of [MySQL: Use REGEX to extract string (select REGEX)](https://stackoverflow.com/questions/4021507/mysql-use-regex-to-extract-string-select-regex) – BugHunterUK Aug 08 '17 at 14:23

1 Answer


MySQL versions before 8.0 have no built-in function for regex-based substring extraction or replacement (REGEXP_REPLACE() only arrived in 8.0). But you can use substring_index():

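SELECT COUNT(DISTINCT left(col1, char_length(col1) - char_length(substring_index(col1, '-', -1)) - 1) )
FROM mytable;

The expression substring_index(col1, '-', -1) returns everything after the last hyphen (the trailing number). Subtracting its length, plus one for the hyphen itself, from the length of col1 gives the length of the prefix, so left() returns everything up to the last hyphen. Note that substring_index(reverse(col1), '-', 1) would not work here: it returns the reversed trailing number, so counting its distinct values gives 2 for the sample data instead of 3.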
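To check the grouping on the question's sample data, the following sketch may help. It assumes the goal is to drop the last hyphen and everything after it, and it builds a throwaway temporary table named mytable (the name from the question; the VARCHAR(20) column type and the aliases prefix and n are assumptions made for illustration). The last statement shows the regex route the question asked about, which should only work on MySQL 8.0 or later, where REGEXP_REPLACE() is available.

```sql
-- Throwaway copy of the question's sample data (column type is an assumption).
CREATE TEMPORARY TABLE mytable (col1 VARCHAR(20));

INSERT INTO mytable (col1) VALUES
  ('xxx-xx-1'), ('xxx-xx-2'), ('yyy-yy-1'), ('zz-zz-z-1'), ('zz-zz-z-2');

-- Show the prefix each row is reduced to:
-- SUBSTRING_INDEX(col1, '-', -1) is the text after the last hyphen,
-- so the prefix length is total length - suffix length - 1 (the hyphen).
SELECT col1,
       LEFT(col1,
            CHAR_LENGTH(col1)
            - CHAR_LENGTH(SUBSTRING_INDEX(col1, '-', -1))
            - 1) AS prefix
FROM mytable;

-- Count the distinct prefixes (3 for the sample data above).
SELECT COUNT(DISTINCT LEFT(col1,
                           CHAR_LENGTH(col1)
                           - CHAR_LENGTH(SUBSTRING_INDEX(col1, '-', -1))
                           - 1)) AS n
FROM mytable;

-- MySQL 8.0+ only: strip the last hyphen and whatever follows it with a regex.
SELECT COUNT(DISTINCT REGEXP_REPLACE(col1, '-[^-]*$', '')) AS n
FROM mytable;
```

One caveat: for a value that contains no hyphen at all, the LEFT() expression reduces it to an empty string, while the REGEXP_REPLACE() version leaves it unchanged, so the two counts can differ on such data.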

Gordon Linoff