1

I have a table that contains a line entry for each item that I need to group into an output with all the values of multiple rows that have the same uniqueID into one column result. The ItemType will be different but the UniqueID and OtherData shouldn't change as it's tied directly to the UniqueID, but I'm not 100% sure as there are well over 2M lines in this DB. I have seen similar questions and they would appear to do what I would like them to do, but the answers are over simplified and usually only include the unique id and the field they want on one line. I need to include about 5 other columns but I don't need to any anything fancy to them. Just trying to group results from the one column as well as return the other columns (that are likely not never be different).

To over simplify the data set this is what it looks like followed by what I'd like to do.

Example Table:

UniqueID | ItemType   | OtherData
----------------------------------
1234     | apples     | 123.1.123.1
1234     | oranges    | 123.1.123.1
2233     | red fish   | 123.5.67.2
1234     | grapes     | 123.1.123.1
2233     | blue fish  | 123.5.67.2

Desired Result:

UniqueID | ItemType   | OtherData
----------------------------------
1234     | apples, oranges, grapes | 123.1.123.1
2233     | red fish, blue fish     | 123.5.67.2

I've tried a couple versions of SELECT DISTINCT and GROUP BY but that either returns the same as if I didn't or some other undesirable result. Also tried STRING_AGG but that only works on MSSQL2017. Any help you can provide would be much appreciated, thank you!

Charlieface
  • 52,284
  • 6
  • 19
  • 43
Ryan Barnes
  • 107
  • 1
  • 10
  • The question/answer you linked to appears to do exactly what you are asking, for the RDBMS it relates to. – Stu Jul 28 '22 at 16:29
  • What is your DBMS? You want to aggregate your data and convert single strings to a comma-separated list. How to do this differs from one DBMS to another. Always tag your SQL requests with the DBMS you are using. – Thorsten Kettner Jul 28 '22 at 16:31
  • @ThorstenKettner thanks for that tip, I'll update the question. DBMS is Microsoft SQL Server 2012. – Ryan Barnes Jul 28 '22 at 16:49
  • @Stu correct, but it also only returns the initial column and the 'student names'. I have 5 or 6 other columns that I need to also include but don't need to group. I don't know enough SQL magic to also include those fields given the example in the linked question. – Ryan Barnes Jul 28 '22 at 16:50
  • Note: Tried STRING_AGG, only works in MSSQL 2017. – Ryan Barnes Jul 28 '22 at 16:57
  • Does this answer your question? [How to concatenate text from multiple rows into a single text string in SQL Server](https://stackoverflow.com/questions/194852/how-to-concatenate-text-from-multiple-rows-into-a-single-text-string-in-sql-serv) – Charlieface Jul 29 '22 at 09:13
  • Thanks for the suggestion Charlieface, that link is exactly the link I linked to in the OP. No, it's not quite as complete as I need for my purposes. – Ryan Barnes Jul 29 '22 at 19:03

3 Answers3

1

Building on the answer from the previous link you can create a cte then execute the query

This will given you the

SELECT Main.UniqueID,
       LEFT(Main.ItemTypes,Len(Main.ItemTypes)-1) As "ItemTypes"
FROM
    (
        SELECT DISTINCT ST2.UniqueID, 
            (
                SELECT ST1.ItemType + ',' AS [text()]
                FROM dbo.TheTable ST1
                WHERE ST1.UniqueID = ST2.UniqueID
                ORDER BY ST1.UniqueID
                FOR XML PATH (''), TYPE
            ).value('text()[1]','nvarchar(max)') ItemTypes
        FROM dbo.TheTable ST2
    ) [Main]

Once you have that you can build this into a cte with the with statement then join back on the table to get the rest of the data.

with ItemTypes as
(
SELECT Main.UniqueID,
       LEFT(Main.ItemTypes,Len(Main.ItemTypes)-1) As "ItemTypes"
FROM
    (
        SELECT DISTINCT ST2.UniqueID, 
            (
                SELECT ST1.ItemType + ',' AS [text()]
                FROM dbo.TheTable ST1
                WHERE ST1.UniqueID = ST2.UniqueID
                ORDER BY ST1.UniqueID
                FOR XML PATH (''), TYPE
            ).value('text()[1]','nvarchar(max)') ItemTypes
        FROM dbo.TheTable ST2
    ) [Main]
) 

Select Distinct TheTable.UniqueID, ItemTypes.ItemTypes, TheTable.OtherData
from TheTable join ItemTypes 
    on (TheTable.UniqueID = ItemTypes.UniqueID)

Results

UniqueID  ItemTypes                  OtherData
--------- -------------------------- --------------------------------
1234      apples,oranges,grapes      OtherData
2233      red fish,blue fish         OtherData

There are a few expensive operations this will be an expensive query to run. but with 2million rows should be ok with a good server.

Kevin Bosch
  • 116
  • 4
  • This is probably the most likely to work. I am still fiddling with the query itself to convert it to the dataset I use. One thing I realized after trying this is that I do a LEFT JOIN on the ItemTypes from another table. I think I can get this to work but my SQL magic isn't that good yet. Do you think you could provide a secondary example (keep what is there) but also show how to do it with a left join? Or maybe that's an entirely different solution? – Ryan Barnes Jul 29 '22 at 19:05
  • I am cobbling together a way to use this solution and I have some initial promising results. Thanks for your suggestion! – Ryan Barnes Jul 29 '22 at 19:26
  • One thing to note though, I had to remove the "LEFT (" line from the top. With that line in it didn't return anything and only threw errors at me. – Ryan Barnes Jul 29 '22 at 19:27
0

What you need is a group concat query in MySQL.

Your query will be

select UniqueID, GROUP_CONCAT(ItemType) ItemTypes, GROUP_CONCAT( DISTINCT OtherData) Otherdata from ExampleTable GROUP BY UniqueID;

A sample reference you can check here: enter image description here

What group concat does is that it will group all column values against that UniqID col and mark them in a comma separated format. Link to read more in depth about group concat can be found here: Link

Debarko
  • 62
  • 6
  • thanks, I made the mistake of not including the DBMS (MSSQL 2012). Group_Concat does not work (gives error: 'GROUP_CONCAT' is not a recognized aggregate function) which is expected as my DB isn't MySQL. Sorry. This is a great example though and looks like it would do what I'm hoping. – Ryan Barnes Jul 28 '22 at 16:53
  • You can try using STRING_AGG. https://learn.microsoft.com/en-gb/sql/t-sql/functions/string-agg-transact-sql?view=sql-server-2017 – Debarko Jul 28 '22 at 17:01
0

Group by all columns except for the item type. Then use the string aggregation function on the item type. In SQL Server this is STRING_AGG, but as you are using an antique version of the DBMS, you will have to emulate it. I am showing the proper query with STRING_AGGhere. Use Google or stackoverflow to see how it is emulated in your old SQL Server version.

select
  uniqueid,
  string_agg(itemtype, ', ') as itemtypes
  otherdata
from mytable
group by uniqueid, otherdata
order by uniqueid, otherdata;

Two remarks:

  • UniqueID is a funny name for an ID that is not unique but occurs multiple times in the table. Misnomers like this can lead to errors and bad maintainability.
  • As to "OtherData shouldn't change as it's tied directly to the UniqueID": This indicates that your data model is flawed and your database table is not normalized. You should have two tables instead, one for the unique data, one for the details.
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • I agree with your remarks for sure. UniqueID is not the actual table title and only used here to denote a unique identifier for the data. The data is actually linking to computer host IDs built which is a pretty long string of unique identifiers. – Ryan Barnes Jul 28 '22 at 23:09
  • Part 2 of the details, the database I have access to is not the production one and is a sort of amalgamation of the main databases/tables used in our environment. So yes, we have rows that would be: *computer1 | ip address | Windows OS | ItemType_A | more_detail_item1 | etc *computer1 | ip address | Windows OS | ItemType_B | more_detail_item1 | etc *computer1 | ip address | Windows OS | ItemType_C | more_detail_item1 | etc *computer5 | ip address | Windows OS | ItemType_A | more_detail_item1 | etc and so on. While not at all efficient, it is what I have to work with. – Ryan Barnes Jul 28 '22 at 23:10