0

I have a table with images and tags in it (called tbl_images)

image_file    image_tags
----------    ----------------------------
test.png      tag1 another_tag tag2
hi.jpg        tag9 tag1 another_tag qwerty

How would I do it so that:

  • searching 'another' returns nothing
  • searching 'tag1' returns the test.png row
  • searching 'tag1 another_tag' returns the test.png row
  • searching 'qwerty tag2' returns nothing

?

As in, the ability to search using the full tag name and to search multiple tags at the same time if they all occur in the same image_tags field in the same row. Note:-

  • I'm only searching within one field
  • Its being done with a PHP search input (retrieved in $_GET['search_terms'])
  • I can't do LIKE '%tag9% because tag9 could appear in a larger tag, such as Atag98, and as I said I would only like the exact tag match.
  • I can't do LIKE ' tag9 ' (note the space before and after,) because tag9 may appear at the beginning of the image_tags field (where there is no space, so there would be no match)
  • I can't do LIKE 'tag9 another_tag' because I don't know if another_tag comes after tag9, similarly I can't do LIKE 'tag9%another_tag' for the same reason; also tag9 could be part of a larger tag (as mentioned previously.)

Also, is fulltext search appropriate for this? And if so, could you provide an example please?

A tricky problem (for me anyway;) any help would be greatly appreciated.

q3d
  • 3,473
  • 8
  • 34
  • 39
  • 2
    Are you set on this table design? Ideally, you would put the image tags into a junction table along with the image id, then do a join to pull it out. In that way, MySQL doesn't have to use the intensive LIKE to parse through individual rows - http://stackoverflow.com/questions/5812680/many-to-many-relationship – citizenen Apr 10 '12 at 22:28
  • It doesn't seem to work for me. How would I use a query to find the results? I tried: `SELECT * FROM images JOIN tags ON tags.image_id = images.id WHERE tag = 'hello'` which returned images with the hello tag, but this doesn't work when I have multiple tags. `WHERE tag = 'hello' OR tag = 'flowers'` works if the image has the 'hello' and 'flowers' tags next to its id in the tags table. But `WHERE tag = 'hello' OR tag = 'tag_doesnt_exist'` returns the row with the image in it, even if the image does not have the 'tag_dosent_exist' tag. – q3d Apr 10 '12 at 22:43
  • If you want the image to have both tags, use `AND`: `WHERE tag = 'hello' AND tag = 'tag_doesnt_exist'` – bfavaretto Apr 10 '12 at 22:55
  • @bfavaretto this returns 0 rows – q3d Apr 10 '12 at 22:57
  • Oh sorry, you're right. You'd have to add another join to search for both: `SELECT * FROM images INNER JOIN tags t1 ON t1.image_id = images.id INNER JOIN tags t2 ON t2.image_id = images.id WHERE t1.tag = 'hello' AND t2.tag = 'world'` – bfavaretto Apr 10 '12 at 23:06
  • @bfavaretto Although your solution works, unfortunately this isn't scalable when you have something like 50 tags (where lots of joins would be slow and the query output be *massive*) and are working in an automated environment, such as a web application. – q3d Apr 10 '12 at 23:09
  • So you want to search for images that have all 50 tags at the same time? You expect users to enter 50 tags into an input field and return all images that only match all 50? – bfavaretto Apr 10 '12 at 23:10
  • @bfavaretto The other problem being that it would be hard to go through the tag list automatically. 50 tags would be over the top, but even something like 5 seems inefficient. – q3d Apr 10 '12 at 23:13
  • @bfavaretto I'm sorry. This does seem like the most efficient way to do this. Thanks for your help with this. – q3d Apr 11 '12 at 18:48
  • 1
    `I can't do LIKE ' tag9 '` - actually, you could definitely use that search syntax - the only thing you would have to do is to be sure your delimiter ( in this case space ) is present around all tags - this means your field would have to start and end in space. Easy enough to do when programming the interface for it. The search would still be very slow compared to a single tag indexed approach ( 3 table system ). To search multiple tags, you would need to do multiple LIKEs `LIKE ' tag9 ' AND LIKE ' tag2 '` - the order wouldn't matter but each test does have overhead. – Steve Aug 23 '12 at 06:11

3 Answers3

7

A better solution to store tags with images would be to create a separate table where tags are linked to images. You would then create a row for each tag, e.g.

image_file (FK)  tag
---------------  -----------
test.png         tag1
test.png         another_tag
test.png         tag2
hi.jpg           tag9
hi.jpg           tag1
hi.jpg           another_tag
hi.jpg           qwerty

You can then simply look up which tags are related to an image:

SELECT tag FROM tags WHERE image_file = 'test.png'

or which images are related to a certain tag:

SELECT image_file FROM tags WHERE tag = 'tag1'

All problems with LIKE which you described in your question are avoided using this approach. Full-text search is not needed when normalizing your database like this.

Geert
  • 1,085
  • 9
  • 13
  • I over-simplified the table structure, each row has a unique Id for searching and other columns (such as username) which is used for counting and I need to select the data and display it; this doesn't seem like it would suit my needs. Thanks – q3d Apr 10 '12 at 22:28
  • 1
    @iyrag, this is a good answer (very similar to the junction table approach suggested in comments above). It doesn't matter if you have many fields, you can grab everything using joins. – bfavaretto Apr 10 '12 at 22:40
  • I can't use this approach when selecting all images with criteria. Also, if a user had 50 tags, 50 more rows would be created, which doesn't work well and is very inefficient. What if the same tag was used more than once in different images? – q3d Apr 10 '12 at 22:50
  • 1
    It is not inefficient to have multiple rows per image, relational databases are made for that. It's actually more efficient than trying to tell the tags apart if they are all in a single string. As for reusing the tags, the ideal approach would be to have three tables, one for images, one for tags, and a junction table with `image_id` and `tag_id`. – bfavaretto Apr 10 '12 at 23:09
2

With no clear delimiter for your tags, you'll need to make a regex selection.

SELECT * FROM tbl_images WHERE image_tags REGEXP '[[:<:]]tag[[:>:]]'

The [[:<:]] and [[:>:]] markers identify word boundaries in MySQL... so while a REGEX selection might be slower than LIKE, this will still work.

Read more on MySQL REGEX here: http://dev.mysql.com/doc/refman/5.1/en/regexp.html

Matt van Andel
  • 636
  • 7
  • 13
0

If all tags are seperated by a space, explode them into an array, and search the array for the keyword you are looking for. If the keyword was found, take that row.

I don't know what language you are using here is how it should go:

loop all rows:
   get values of the current row:
   tags = array of the tags cell, explode ' '
   foreach (tag in tags):
      if (tag == keyword)
         save current row info
   end foreach;
end loop;

You don't need to create a new row for each tag, like one here suggested, because this is not an ideal option.

Novak
  • 2,760
  • 9
  • 42
  • 63