Perhaps this was a wonky design decision on my part, but I created a table that has a column that holds from 1 to N comma-separated values, and I need to query whether any of several values are contained within those values.
To make it more clear, I'm allowing the user to select an array of movie genres (comedy, drama, etc.) and have a "movies" table where these genres are all contained in one column. For example, the genres column for the movie "The Princess Bride" contains the composite csv value "Adventure, Family, Fantasy"
So if the user selects at least one of those genres, that movie should be included in the result set that is returned based on the search criteria that they choose. But the movie should only be returned once, even if the user selected more than one of those genres to search for.
A previous incarnation of the database contained lookup tables and many-to-many tables which obviated this problem, but in an attempt to make it "easier" and more straightforward by making it one big (wide) table, I have run into this conundrum (of how to craft the query).
The query is currently dynamically built this way:
string baseQuery = "SELECT * FROM MOVIES_SINGLETABLE ";
string imdbRatingFilter = "WHERE IMDBRating >= @IMDBMinRating ";
string yearRangeFilter = "AND YearReleased BETWEEN @EarliestYear AND @LatestYear ";
string genreFilterBase = "AND GENRE IN ({0}) ";
string mpaaRatingFilterBase = "AND MPAARating IN ({0}) ";
string orderByPortion = "ORDER BY IMDBRating DESC, YearReleased DESC ";
...and the strings containing the selected criteria built like so:
if (filterGenres)
{
if (ckbxAction.Checked) genresSelected = "\'Action\',"; // changed to "@" hereafter:
if (ckbxAdventure.Checked) genresSelected = genresSelected + @"'Adventure',";
. . .
if (ckbxWar.Checked) genresSelected = genresSelected + @"'War',";
if (ckbxWestern.Checked) genresSelected = genresSelected + @"'Western',";
LastCommaIndex = genresSelected.LastIndexOf(',');
genresSelected = genresSelected.Remove(LastCommaIndex, 1)
}
// the same situation holds for mpaaRatings as for Genres:
if (filterMPAARatings)
{
if (ckbxG.Checked) mpaaRatingsSelected = @"'G',";
if (ckbxPG.Checked) mpaaRatingsSelected = mpaaRatingsSelected + @"'PG',";
if (ckbxPG13.Checked) mpaaRatingsSelected = mpaaRatingsSelected + @"'PG13',";
if (ckbxNR.Checked) mpaaRatingsSelected = mpaaRatingsSelected + @"'NR',";
LastCommaIndex = mpaaRatingsSelected.LastIndexOf(',');
mpaaRatingsSelected = mpaaRatingsSelected.Remove(LastCommaIndex, 1);
}
. . .
//string genreFilterBase = "AND GENRES IN ({0}) ";
if (filterGenres)
{
genreFilter = string.Format(genreFilterBase, genresSelected);
completeQuery = completeQuery + genreFilter;
}
//string mpaaRatingFilterBase = "AND MPAARating IN ({0}) ";
if (filterMPAARatings)
{
mpaaRatingFilter = string.Format(mpaaRatingFilterBase, mpaaRatingsSelected);
completeQuery = completeQuery + mpaaRatingFilter;
}
Is my design salvageable? IOW, can I retrieve the appropriate data given these admittedly questionable table design decisions?
UPDATE
I tested GMB's SQL by incorporating it into my SQL, but I may be doing something wrong, because it won't compile:
I don't know why those commas are there, but I reckon GMB is more of a SQL expert than I am...
Nevertheless, this does work (sans the commas and pipes):
UPDATE 2
I tried using CONTAINS, also:
SELECT * FROM MOVIES_SINGLETABLE WHERE IMDBRating >= 7.5 AND YearReleased BETWEEN '1980' AND '2020' AND CONTAINS (genres, 'Adventure') OR CONTAINS (genres,'Family') OR CONTAINS (genres, 'Fantasy') AND CONTAINS (MPAARating, 'G') OR CONTAINS (MPAARating, 'PG') OR CONTAINS (MPAARating, 'PG-13') ORDER BY IMDBRating DESC, YearReleased DESC
...but got "Cannot use a CONTAINS or FREETEXT predicate on table or indexed view 'MOVIES_SINGLETABLE' because it is not full-text indexed."
The answer by Alex Aza here [https://stackoverflow.com/questions/6003240/cannot-use-a-contains-or-freetext-predicate-on-table-or-indexed-view-because-it] gives a solution, but apparently it's not available for SQL Server Express:
...and besides, this will eventually (soon) be migrated to a SQLite table, anyway, and I doubt SQLite would support CONTAINS if doing it in SQL Server (albeit Express) requires, even if possible at all, hurtling through hoops.
UPDATE 3
I incorporated Lukasz's idea for SQLite (as that's what I'm now querying), with a query string that ended up being:
SELECT MovieTitle, MPAARating, IMDBRating, DurationInMinutes, YearReleased, genres, actors, directors, screenwriters FROM MOVIES_SINGLETABLE WHERE IMDBRating >= @IMDBMinRating AND (YearReleased BETWEEN @EarliestYear AND @LatestYear) AND (INSTR(genres, @genre1) > 0 OR INSTR(genres, @genre2) > 0) AND (MPAARating = @mpaaRating1) ORDER BY IMDBRating DESC, YearReleased DESC LIMIT 1000
...but still get no results.