0

I am using mySQL to query a field which would be LastName. I am looking for any errors in the field such as any special characters or numbers. I am not terribly familiar with SQL so this has been a challenge so far. I have written simple statements with REGEXP but I have run into some issues the REGEXP i was using was:

SELECT LastName FROM `DB`.`PLANNAME` where LastName REGEXP '^([0-9])'

now this turned up results where numbers were the first character in the string and i realized that if anything was in the middle of the string that started with a letter this would not pick it out.

To be clear i just need to find the errors not write a code to clean them out.

Any help would be greatly appreciated

Thanks

Pete

chris85
  • 23,846
  • 7
  • 34
  • 51
Pete Walsh
  • 15
  • 1
  • 1
  • 5
  • 1
    Remove the `^`, if you want to find the pattern in the middle of the string. – Gordon Linoff Jul 30 '15 at 15:27
  • so is there anything to find special characters like parentheses or periods or commas that i could write in the same statement – Pete Walsh Jul 30 '15 at 15:40
  • Do you just want non-alpha characters found (or the inverse)? I think this thread would solve that, http://stackoverflow.com/questions/1471523/select-only-rows-that-contain-only-alphanumeric-characters-in-mysql (take out the `0-9`). – chris85 Jul 30 '15 at 15:48
  • that is correct Chris l only want non alpha characters found – Pete Walsh Jul 30 '15 at 15:50
  • So `^[A-Za-z]+$` will find all the correct ones? – maraca Jul 30 '15 at 15:52

2 Answers2

1

Maybe you are looking for something like this:

SELECT LastName FROM `DB`.`PLANNAME` WHERE NOT LastName REGEXP '[A-Za-z0-9]';

Here is a documentation on this:

Table 12.9 String Regular Expression Operators

FutbolFan
  • 13,235
  • 3
  • 23
  • 35
1

Something like this should do it for you.

SELECT column FROM table WHERE column REGEXP '[^A-Za-z]'

This will return any rows where a character that is not a-z. You might want to add in and '. For O'briens and von lansing etc. Any characters you think are acceptable should go in the character class [], http://www.regular-expressions.info/charclass.html.

Demo: https://regex101.com/r/nC9cG7/1

chris85
  • 23,846
  • 7
  • 34
  • 51
  • Great, please mark this as the accepted answer when you have tested this thoroughly; or post issues/questions you encounter. http://stackoverflow.com/help/accepted-answer – chris85 Jul 30 '15 at 16:08
  • Chris i tried adding hyphens and an apostrophe in the character class however even within brackets ' is just ending my code, and the [-] is still returning results with hyphens any idea on this? – Pete Walsh Jul 31 '15 at 13:11
  • Try escaping the hyphen and quote. The hyphen usually designates a range and the quote closes the regex encapsulation. `SELECT column FROM table WHERE column REGEXP '[^A-Za-z\-\']'`. That's untested, if it doesn't work let me know and I'll run a test.. – chris85 Jul 31 '15 at 14:16
  • i had to put a space between the a-z and \-\ that got rid of the hyphens however the ' is still closing the expression and throwing errors "unterminated string literal" becuase its seeing the ]' as a different section even when i double the '' it works but then shows the hyphens again and still shows names with apostrophes – Pete Walsh Jul 31 '15 at 15:18
  • and im not sure if this matters but i am running these expressions in talend SQL editor – Pete Walsh Jul 31 '15 at 15:19
  • Okay, I'm not sure why that's not working but I've encountered the same issue here. Try this solution, `SELECT column FROM table WHERE column NOT REGEXP '([:alpha:]|-|\')'` – chris85 Jul 31 '15 at 16:26
  • If you're interested in the why the first query failed I've started a question on it because I'm unsure, http://stackoverflow.com/questions/31751034/mysql-regex-error-1139-using-literal. – chris85 Jul 31 '15 at 16:35