1

I need to select all data having non-duplicate IDs.. here's my sample table..

----------------------------------------------------------------------------------
ID        | Zip-Code       | Search Query        | ID_LIST
----------------------------------------------------------------------------------
1         | 1000           | Query Sample 1      | 13,14,15,
----------------------------------------------------------------------------------
2         | 2000           | Query Sample 2      | 16,13,17,
----------------------------------------------------------------------------------
3         | 3000           | Query Sample 3      | 18,17,13,
----------------------------------------------------------------------------------
4         | 4000           | Query Sample 4      | 15,16,17,18,
----------------------------------------------------------------------------------
5         | 5000           | Query Sample 5      | 19, 20,

u can notice that IDs 1 and 2 have duplicate, which is 13 on ID_LIST

2 and 3 also have duplicate, which is 13 and 17.

What I want to do is make it like this...

----------------------------------------------------------------------------------
ID        | Zip-Code       | Search Query        | ID_LIST
----------------------------------------------------------------------------------
1         | 1000           | Query Sample 1      | 13,14,15,
----------------------------------------------------------------------------------
2         | 2000           | Query Sample 2      | 16,17,
----------------------------------------------------------------------------------
3         | 3000           | Query Sample 3      | 18,
----------------------------------------------------------------------------------
5         | 5000           | Query Sample 5      | 19,20,

What query would be good for this? Any Help?

andil01
  • 377
  • 4
  • 19

1 Answers1

0

Best way to approach it is to normalize your data, as mentioned in comments. But if you absolutely have to do it this way, it would be very difficult to do in query on mysql.

I would suggest you to create a procedure for it. As and when you develop each step, you can google that particular solution of that step, and test it and build up on that. Let me know if any step sound confusing/unclear.

  1. Create a variable string, say v_vals. Initialize with null. At the end of procedure, it will contain all the distinct values of id_list (13,14...20)
  2. Iterate through each row.
  3. Count the number of comma in id_list.
  4. Loop from 1 to number of comma.
  5. In every iteration, use substring and instring to find position of each comma and then extract values from id_list. (13,14...)
  6. use another variable v_id_list. Put null in it.
  7. Search for the values (from step 5) in v_vals. If they exist in v_val, then skip them, else put them in v_val and v_id_list.
  8. Now run an update statement to update id_list with v_id_list.

Now repeat Step 3 to 8 for each row.

  1. Note that v_id_list will be reinitialize for each loop, however v_val will contain all the distinct values of id_list.
Utsav
  • 7,914
  • 2
  • 17
  • 38