2

I am new to SQL server, I am a R user but R can't be used with my huge dataset (not enough memory).

What I want to do:

I want to create a sparse matrix from a table with only 2 columns, I dont have any value column, it seems easy but I didn't find the right way to do it.

My data:

ID_patient | ID_product
-----------------------
123            A
123            B
111            C
222            A
333            D
333            E

Ouput wanted:

ID_patient |   A   |   B   |   C   |   D   |   E   |
----------------------------------------------------
123            1       1
111                            1
222            1
333                                    1       1    

I have read that I can use the GROUP BY function or Pivot feature but what I have tried so far failed.

Edit

I don't know all the products, so the right way to do that is by using dynamic pivot ?

  • It is a simple PIVOT/conditional aggregation. Please post your attempts. – Lukasz Szozda Jun 05 '18 at 09:10
  • Thank you for your comment, I will look for this feature then. – Alexandre georges Jun 05 '18 at 09:15
  • 1
    There are any number of examples if you search on "T-SQL PIVOT" or "T-SQL Crosstab" – Nick.Mc Jun 05 '18 at 09:21
  • Here what I have found as dynimac sql : `declare @col nvarchar(max), @query nvarchar(max) select @col = stuff( (select ','+quotename(id_product) from dbo.table for xml path('')), 1,1, '') set @query = N'select * from ( select * from dbo.table )a PIVOT ( MAX(id_patient) for id_product in ('+@col+') )pvt' EXECUTE sp_executesql @query` – Alexandre georges Jun 05 '18 at 09:33

2 Answers2

2

You can try something like PIVOT

See demo

Select * from 
(select *, copy=Id_product from t)t
pivot
(
    count(copy) for ID_product in ([A],[B],[C],[D],[E]))p

If you don't know A, B, C, D, .. before hand then you should go for dynamic pivot

update: updated dynamic piv demo

declare @cols nvarchar(max)
declare @query nvarchar(max) 

select @cols=  Stuff((select ','+  quotename( ID_product) from
      (select distinct id_product from t) t  for xml path ('')),1,1,'')

select @query='Select * from 
( select *, copy=Id_product from t ) t
pivot
(count(copy) for ID_product in ( '+@cols+' ))p '
  exec(@query)
DhruvJoshi
  • 17,041
  • 6
  • 41
  • 60
2

Try this

IF OBJECT_ID('tempdb..#Temp')IS NOT NULL 
DROP TABLE #Temp

DECLARE @temp AS TABLE (ID_patient INT, ID_product varchar(10))
INSERT INTO @temp
SELECT 123,'A' UNION ALL
SELECT 123,'B' UNION ALL
SELECT 111,'C' UNION ALL
SELECT 222,'A' UNION ALL
SELECT 333,'D' UNION ALL
SELECT 333,'E'  


SELECT * INTO #Temp 
    FROM @temp
DECLARE @Sql nvarchar(max),
        @Col  nvarchar(max),
        @Col2  nvarchar(max)

SELECT @Col=STUFF((SELECT DISTINCT ', '+QUOTENAME(ID_product) FROM #Temp FOR XML PATH ('')),1,1,'')
SELECT @Col2=STUFF((SELECT DISTINCT ', '+'ISNULL('+QUOTENAME(ID_product)+','' '') AS '+QUOTENAME(ID_product) FROM #Temp FOR XML PATH ('')),1,1,'')

SET @Sql='
SELECT ID_patient,'+@Col2+'
FROM
(
SELECT *, DENSE_RANK()OVER( PARTITION BY ID_patient ORDER By ID_patient) AS [Val] FROM #Temp
)AS Src
PIVOT 
(MAX(Val) FOR ID_product IN ('+@Col+')
)AS PVT '
PRINT @Sql
EXEC (@Sql)

Result

ID_patient  A   B   C   D   E
------------------------------
111         0   0   1   0   0
123         1   1   0   0   0
222         1   0   0   0   0
333         0   0   0   1   1
Sreenu131
  • 2,476
  • 1
  • 7
  • 18