I am taking the MySQL class by Duke on Coursera. In week two there is reference to messy data and I figured I would ask my question here. The scope of my question would be how to compare an entry in a row of table that would already match an instance except that it was entered with a hyphen, i.e. "Golden Retriever Mix" is the same instance as "Golden Retriever- Mix". And when I run a DISTINCT SELECT statement I do not want it to pull both results. The catch is, we cannot just remove all hyphens from the column fields because we still want them for instance for entry of "Golden Retriever-Airedale Terrier Mix". How would a query look for doing this. The example code that pulls in both "Golden Retriever Mix" and "Golden Retriever- Mix" is below.
SELECT DISTINCT breed,
TRIM(LEADING '-' FROM breed)
FROM dogs
ORDER BY (LEADING '-' FROM breed) LIMIT 1000, 1000;
I am thinking I need and IF/THEN statement that says
IF(REPLACE(breed,'-','') = breed)
THEN DELETE breed;
Obviously this is not correct syntax which is what I am looking for.