1

For example, I have the following table:

 id  Name - Target 
 1    A       B     
 2    C       D,G      
 3    E       D
 4    F       G,B
 5    L       M,D,B
 6    Q       G,N

I have an over 2 million records database, and I think I made an error by storing values with a comma inside a row. My goal is to query the whole database on the "Target" row, for unique, non-repeating values, in my example the query should return only "M" and "N", because "B", "D" and "G" are repeating..

Is there a single query line to get this done ?

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
ion
  • 540
  • 7
  • 20
  • @Strawberry on the "I think I made an error ?" :) thank you – ion Oct 14 '15 at 15:20
  • 1
    Yes. Correct again ;-) – Strawberry Oct 14 '15 at 15:21
  • @Strawberry, is impossible ? never happened such thing in history ? :) – ion Oct 14 '15 at 15:23
  • First, normalize your data. – Strawberry Oct 14 '15 at 15:23
  • @Strawberry I can think of a situation where my "Target" would be a "Comment" and the query would search for unique words in the whole database. There's no one single line query for that ? – ion Oct 14 '15 at 15:27
  • That's actually a common problem for people doing things like DNA analysis. While it makes sense to store the data in databases, it's often more practical to conduct that kind of analysis in application-level code. But that's not to say it cannot be done, it's just not what relational databases are really good at. – Strawberry Oct 14 '15 at 15:46
  • @Strawberry, yes I ended up getting all results of the "Target", and score the repeating times, filtered then by the 1 hit. Thank you. – ion Oct 14 '15 at 15:58
  • Good, and NOW see normalization! – Strawberry Oct 14 '15 at 15:59

1 Answers1

0

I agree that you should change your database schema.

Your question looks like duplicate of:

https://stackoverflow.com/a/17942691/4421474

So your approach if you really need one could be like:

http://sqlfiddle.com/#!9/0c3ce/4

SELECT
  SUBSTRING_INDEX(SUBSTRING_INDEX(t1.Target, ',', numbers.n), ',', -1) letter
FROM
  numbers 
INNER JOIN t1
  ON CHAR_LENGTH(t1.Target)
     -CHAR_LENGTH(REPLACE(t1.Target, ',', ''))>=numbers.n-1
GROUP BY letter
HAVING COUNT(*)=1

Notice that you need to set numbers table with values from 1 to max index of values in target column of your table.

Community
  • 1
  • 1
Alex
  • 16,739
  • 1
  • 28
  • 51