1

I'm trying to create a database structure (in sqlite or MySQL) that allows me to return values in col1 where col2 contains "tom." col2's type would ideally be some sort of list data structure but don't think this exists in SQL. The overall data is extremely large (10-50 GB), so achieving an acceptable speed is an issue. I've tried making the col2 data TEXT and using contains or like, but this is way too slow.

col1          col2
object1       tom, fred, mary, joe
object2       bob, tom, jerry
object3       sam, violet
timgeb
  • 76,762
  • 20
  • 123
  • 145

1 Answers1

0

A better design would be to normalize col2 to single values when inserting them, and then index the combination. E.g., for object1, the data would be:

col1          col2
object1       tom
object1       fred
object1       mary
object1       joe

The table then becomes (sizes are just an arbitrary example, make them as long as you need):

CREATE TABLE mytable (
    col1 VARCHAR(10) NOT NULL,
    col2 VARCHAR(10) NOT NULL,
    PRIMARY KEY (col2, col1)
);
Mureinik
  • 297,002
  • 52
  • 306
  • 350
  • Thanks. My concern here is that col2 will anywhere between have 5-20 values. Would this expand my database disk space requirement by 5-20x? –  Dec 06 '18 at 18:49