1

Sorry if my question seems unclear, I'll try to explain.
I have a column in a row, for example /1/3/5/8/42/239/, let's say I would like to find a similar one where there is as many corresponding "ids" as possible.

Example:

   | My Column   |
#1 | /1/3/7/2/4/ |
#2 | /1/5/7/2/4/ |
#3 | /1/3/6/8/4/ |

Now, by running the query on #1 I would like to get row #2 as it's the most similar. Is there any way to do it or it's just my fantasy? Thanks for your time.

EDIT:
As suggested I'm expanding my question. This column represents favourite artist of an user from a music site. I'm searching them like thisMyColumn LIKE '%/ID/%' and remove by replacing /ID/ with /

Xriuk
  • 382
  • 2
  • 7
  • 26

2 Answers2

1

Since you did not provice really much info about your data I have to fill the gaps with my guesses.

So you have a users table

users table
-----------
id
name
other_stuff

And you like to store which artists are favorites of a user. So you must have an artists table

artists table
-------------
id
name
other_stuff

And to relate you can add another table called favorites

favorites table
---------------
user_id
artist_id

In that table you add a record for every artist that a user likes.

Example data

users
id    |  name
1     |  tom
2     |  john


artists
id    |  name
1     |  michael jackson
2     |  madonna
3     |  deep purple


favorites
user_id   |  artist_id
1         |  1
1         |  3
2         |  2

To select the favorites of user tom for instance you can do

select a.name
from artists a
join favorites f on f.artist_id = a.id
join users u on f.user_id = u.id
where u.name = 'tom'

And if you add proper indexing to your table then this is really fast!

juergen d
  • 201,996
  • 37
  • 293
  • 362
  • Wouldn't this be a waste of space,? Because a single column weights less than an entire table, and for me it's useless having one more table for this... – Xriuk May 06 '14 at 13:42
  • This is absolutely no waste and not useless. This is the way to do it. It is clean and fast and the data is easy selectable. – juergen d May 06 '14 at 13:43
  • 1
    @Xriuk No, it wouldn't. Space also shouldn't matter in a database. And about delimited numbers in one column read this: http://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad – fancyPants May 06 '14 at 13:44
  • @Xriuk: And BTW using `int` values instead of a string saves storage actually. – juergen d May 06 '14 at 13:59
-1

Problem is you're storing this in a really, really awkward way.

I'm guessing you have to deal with an arbitrary number of values. You have two options:

  1. Store the multiple ID's in a blob object in JSON format. While MySQL doesn't have JSON functions built in, there are user defined functions that will extract values for you, etc. See: http://blog.ulf-wendel.de/2013/mysql-5-7-sql-functions-for-json-udf/ Alternatively, switch to PostGres

  2. Add as many columns to your table as the maximum number of ID's you expect to have. So if /1/3/7/2/4/8/ is the longest entry, have 6 columns in your table. Reason this is bad: you'll have sparse columns that'll unnecessarily slow your tables.

I'm sure you could write some horrific regex to accomplish the task, but I caution on using complex regex's on enormous tables.

Alfalfa
  • 131
  • 7