0

I have a table:

CREATE TABLE [dbo].[usecase](
    [id] [int] NULL,
    [desc] [varchar](50) NULL,
    [comm] [varchar](50) NULL,
    [catgroup] [varchar](50) NULL
) ON [PRIMARY]

GO

Example data below:

insert into usecase (id, [desc], comm, catgroup) values(1,'desc 1', 'comment 1', '1')
insert into usecase (id, [desc], comm, catgroup) values(2,'desc 2', 'comment 2', '1,2')
insert into usecase (id, [desc], comm, catgroup) values(3,'desc 3', 'comment 3', '1,2,3')
insert into usecase (id, [desc], comm, catgroup) values(4,'desc 4', 'comment 4', '60,61')
insert into usecase (id, [desc], comm, catgroup) values(5,'desc 5', 'comment 5', '59,60,61')
insert into usecase (id, [desc], comm, catgroup) values(6,'desc 6', 'comment 6', '9,3')

The catgroup is the category id for each use case, it is a 1:n relationship - one use case could belong to more than one categories

In the above example data, category 1 has 3 use cases (1,2,3), category 2 has 2 use cases (2,3), category 3 has 2 use cases (3, 6) and so on.

What I am looking for is a query based on category id and return all the use cases that belong to this given category id.

The database is SQL Server 2014.

Dale K
  • 25,246
  • 15
  • 42
  • 71
mdivk
  • 3,545
  • 8
  • 53
  • 91
  • 3
    Read ["Is storing a delimited list in a database column really that bad?"](https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad) and revise the schema to use a linking table to assign categories to use cases. – sticky bit Feb 24 '20 at 22:47
  • 4
    Stop storing delimited data; that's the real problem here. – Thom A Feb 24 '20 at 22:47
  • Agreed, if the data was correctly normalised, then your query would be trivial. – ADyson Feb 24 '20 at 22:56
  • 1
    Probably, the first thing you should do is to learn proper database design. Your current design will only cause you headache later on. – Eric Feb 24 '20 at 23:17
  • Also, what happen if the column `catgroup` is longer than 50 characters? – Eric Feb 24 '20 at 23:18
  • Thank you Eric, if you read this post https://medium.com/@xie3208080/stackoverflow-data-analytics-with-bigquery-ea2ac3fdb3f8, you will see this is a design of the database of stackoverflow, every post's tags are stored in this pipe delimeted format. :) – mdivk Feb 25 '20 at 00:18
  • Yes but that's for BigQuery, which has a specific function for un-nesting. It's not a conventional relational database engine. SQL Server is more conventional, and in a conventional RDBMS every best-practice guidance you will ever read will tell you not to store multiple values in the same column. – ADyson Feb 25 '20 at 00:51

1 Answers1

2

As the initial comments have already stated, the best solution is to normalize your data so that you have a table like usecase_categories (usecase_id, category_id) so that no field holds more than one piece of data, but if you cannot change your data structure for some reason a clumsy condition like this can work:

WHERE catgroup = '5'  -- Is the only category
   OR catgroup LIKE '5,%' -- Is the first category
   OR catgroup LIKE '%,5' -- Is the last category
   OR catgroup LIKE '%,5,%' -- Is somewhere in the middle of the list

In this case, "category 5" is being looked for, you can modify it further to make it a little more flexible...

WHERE catgroup = '5' 
   OR catgroup LIKE CAST(somecategoryid AS VARCHAR) + ',%'
   OR catgroup LIKE '%,' + CAST(somecategoryid AS VARCHAR)
   OR catgroup LIKE '%,' + CAST(somecategoryid AS VARCHAR) + ',%'

There is probably a way with regular expressions as well, but really, you should just fix the table design.

GMB
  • 216,147
  • 25
  • 84
  • 135
Uueerdo
  • 15,723
  • 1
  • 16
  • 21
  • Thank you Uueerdo, if you read this post https://medium.com/@xie3208080/stackoverflow-data-analytics-with-bigquery-ea2ac3fdb3f8, you will see this is a design of the database of stackoverflow, every post's tags are stored in this pipe delimeted format. :) – mdivk Feb 25 '20 at 00:19
  • 1
    @mdivk "Someone does this" is not a good argument for that design strategy, and Sql Server != BigQuery; I am pretty sure BigQuery isn't even considered an RDBMS. – Uueerdo Feb 25 '20 at 00:23
  • Yes, @uueerdo, I completely agree with you. – mdivk Feb 25 '20 at 00:33