0

Currently I have a column in my table which has a set of comma separated values. I am currently using it to filter the results. I am wondering if it would be possible to index on it and query directly using it.

My table is as below:

userId        types
123           A, B, C
234           B, C

If I want to query a user which has types A and C, should get 123 If with B and C then 123, 234

EDIT: I am aware the problem can be solved by normalization. However my table is actually storing json and this field is a virtual column referencing a list. there are no relations used anywhere. We are facing an issue where querying by types was not considered and is now causing performance impact

  • 1
    A proper way to store such of data is to use many to many relationship. – Maciej Los Sep 09 '20 at 10:21
  • edited the description. this is a production db and redoing the whole thing is not an option. – Antariksha Yelkawar Sep 09 '20 at 11:14
  • Believe me, this is the best option. You need to speak with your supervisor and explain him that every query which has to split data uses more resources and takes more time to execute. Here is an idea: [SQL split values to multiple rows](https://stackoverflow.com/questions/17942508/sql-split-values-to-multiple-rows) – Maciej Los Sep 09 '20 at 11:17
  • If it's a json column, check this: [12.18.3 Functions That Search JSON Values](https://dev.mysql.com/doc/refman/8.0/en/json-search-functions.html) – Maciej Los Sep 09 '20 at 12:09

3 Answers3

2

First of all, you should normalize your table and remove the CSV data. Use something like this:

userId | types
123    | A
123    | B
123    | C
234    | B
234    | C

For the specific query you have in mind, you might choose:

SELECT userId
FROM yourTable
WHERE types IN ('A', 'C')
GROUP BY userId
HAVING MIN(types) <> MAX(types);

With this in mind, MySQL might be able to use the following composite index:

CREATE INDEX idx ON yourTable (userId, types);

This index should cover the entire query above actually.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
1

The answer is basically no . . . but not really. The important point is that you should restructure the data and store it in a properly. And "properly" means that string columns are not used to store multiple values.

However, that is not your question. You can create an index to do what you want. Such an index would be a full-text index, allowing you to use match(). If you take this approach you need to be very careful:

  • You need to use boolean mode when querying.
  • You need to set the minimum word length so single characters are recognized as words.
  • You need to check the stop-words list, so words such as "A" and "I" are included.

So, what you want to do is possible. However, it is not recommended because the data in not in a proper relational format.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

MySQL supports Multi-Value Indexes for JSON columns as of MySQL 8.0.17. It seems like exactly your case.

Details: https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-multi-valued

MiSHuTka
  • 1,191
  • 1
  • 12
  • 20
  • Using this feature in combination with foreign keys and on delete (and maybe other scenarios) can cause segfaults. I've filed a bug report (#108663) on this which has been open for 10 months with no sign of a fix ever being released.. My faith in MySQL has been shattered. – ColinM Jul 13 '23 at 16:53