0

I have a table as below:

CREATE TABLE [dbo].[UseCase]
(
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [UseCase] [text] NOT NULL,
    [Comment] [nvarchar](40) NOT NULL,
    [Categories] [nvarchar](255) NULL
)

GO

INSERT INTO [dbo].[UseCase] ([UseCase], [Comment], [Categories])
VALUES ('This is use case 1', 'This is comment 1 for use case 1', 'Cat1|cat3|cat9')
INSERT INTO [dbo].[UseCase] ([UseCase], [Comment], [Categories])
VALUES ('This is use case 2', 'This is comment 1 for use case 2', 'Cat2|cat9')
INSERT INTO [dbo].[UseCase] ([UseCase], [Comment], [Categories])
VALUES ('This is use case 3', 'This is comment 1 for use case 3', 'Cat1|cat3')
INSERT INTO [dbo].[UseCase] ([UseCase], [Comment], [Categories])
VALUES ('This is use case 4', 'This is comment 1 for use case 4', 'Cat1|cat4')
INSERT INTO [dbo].[UseCase] ([UseCase], [Comment], [Categories])
VALUES ('This is use case 5', 'This is comment 1 for use case 5', 'Cat5')
INSERT INTO [dbo].[UseCase] ([UseCase], [Comment], [Categories])
VALUES ('This is use case 6', 'This is comment 1 for use case 6', 'Cat6|cat8')
INSERT INTO [dbo].[UseCase] ([UseCase], [Comment], [Categories])
VALUES ('This is use case 7', 'This is comment 1 for use case 7', 'Cat3|cat8')
INSERT INTO [dbo].[UseCase] ([UseCase], [Comment], [Categories])
VALUES ('This is use case 8', 'This is comment 1 for use case 8', 'Cat1|cat3|cat9')

Note the Categories column is a string with pipe | as delimiter.

Can anyone share your thought on how to write a query to get the stats like how many use cases of each category?

Thank you.

UPDATE:

Per requested by a comment, I am adding the following as expected result:

Cat1 4
Cat2 1
Cat3 4
Cat4 1
Cat5 1
Cat6 1
Cat7 0
Cat8 2
Cat9 9
mdivk
  • 3,545
  • 8
  • 53
  • 91
  • Please **[edit]** your question (by clicking on the [edit] link below it) and add the expected output based on your sample data. [Formatted text](https://meta.stackoverflow.com/a/251362) please, [no screen shots](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question/285557#285557). See [here](https://meta.stackexchange.com/questions/81852) for some tips on how to create nice looking tables. –  Feb 13 '20 at 20:53
  • 2
    The most correct answer here is that you should never store delimited data in a column. Your categories have a 1:many relationship with your usecase and should therefore be kept in a new table like `usecase_id | category`. Where one `usecase_id` has many records in this new table (one for each category).Then this problem disappears. – JNevill Feb 13 '20 at 20:57
  • What is your SQL Server version? – Zhorov Feb 13 '20 at 21:26
  • it's 2016, thank you. – mdivk Feb 13 '20 at 21:29
  • Read [Is storing a delimited list in a database column really that bad?](https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad), where you will see a lot of reasons why the answer to this question is **Absolutely yes!** – Zohar Peled Feb 13 '20 at 22:22

1 Answers1

2

If I understand your question correctly and you use SQL Server 2016+, the next approach using STRING_SPLIT() may help:

Statement:

SELECT ss.[value] AS Category, COUNT(*) AS [Count]
FROM dbo.UseCase uc
CROSS APPLY STRING_SPLIT(uc.Categories, '|') ss
GROUP BY ss.[value]

Result:

Category    Count
Cat1        4
Cat2        1
cat3        4
cat4        1
Cat5        1
Cat6        1
cat8        2
cat9        3
Zhorov
  • 28,486
  • 6
  • 27
  • 52
  • Assuming the OP can't normalize the database, this is the approach I would use. – Zohar Peled Feb 13 '20 at 22:22
  • Thank you very much, what if it is on and older database like 2008/R2? – mdivk Feb 13 '20 at 22:49
  • 1
    Did some search aroung to see if there is an equivamence in 2008R2, and I got lots but unfortunately none of them are elegant as STRING_SPLIT in 2016, thank you again Zhorov. – mdivk Feb 13 '20 at 23:26