Getting all distinct values and sorting them by hour

Question

I have the following table:

ID    TimeStamp    CarNumber
1    2018\12\03 14:05:32    433
2    2018\12\03 14:13:52    420
3    2018\12\03 14:55:14    433
4    2018\12\03 15:12:03    420
5    2018\12\03 16:15:55    570

My desired output is a list of each car number in each hour of the day:

Hour       CarNumbers
0          0
...
14         433, 420
15         420
16         570
...

I'm using SQL Server 2008 so I can't use string_agg(). Is there a way to do this?

Perhaps you can find something here https://stackoverflow.com/questions/17591490/how-to-make-a-query-with-group-concat-in-sql-server — jarlh, Dec 11 '18 at 08:20
I've been playing with for xml for a while, haven't found a way yet. — user2979612, Dec 11 '18 at 08:23
Should it be querying for only 1 day, or should it also group by & output the date? And do all 24 hours need to be in the result, regardless if the data is missing some hours? — LukStorms, Dec 11 '18 at 08:27
@LukStorms For 1 day ( `where TimeStamp between 0:00 and 23:59` ), optionally grouped, if data is missing it's ok if the row is just skipped (all that not that important, I just need the numbers for each hour). — user2979612, Dec 11 '18 at 08:31

LukStorms · Accepted Answer · 2018-12-12T16:04:48.067

1

To get counts for all the hours, you'll need to left join to a list with all the hours.
And probably combined with the date of the timestamps if they span more than 1 date.

To substitute the missing STRING_AGG you could use the FOR XML trick.

Example snippet:

-- Using a table variable for demonstration
declare @Table table (id int primary key identity(1,1), [Timestamp] datetime, CarNumber int);

-- Sample data
insert into @Table ([Timestamp], CarNumber) values
('2018-12-03 14:05:32', 433),
('2018-12-03 14:13:52', 420),
('2018-12-03 14:55:14', 433),
('2018-12-03 15:12:03', 420),
('2018-12-03 16:15:55', 570),
('2018-12-09 14:00:00', 999);

-- Query
WITH RCTE_HOURS AS
(
  select 0 as [Hour] 
  union all
  select [Hour] + 1
  from RCTE_HOURS
  where [Hour] < 23
)
SELECT h.[Hour],
 COALESCE(STUFF((
   SELECT ', ' + CONVERT(VARCHAR(10), CarNumber) 
   FROM @Table t2 
   WHERE CAST(t2.[Timestamp] AS DATE) = d.[Date]
     AND DATEPART(HOUR, t2.[Timestamp]) = h.[Hour]
     AND CarNumber IS NOT NULL
   GROUP BY CarNumber
   ORDER BY MIN(t2.[Timestamp])
   FOR XML PATH ('')
   ), 1, 2, ''),'0') AS CarNumbers
FROM 
(
  SELECT DISTINCT CAST([Timestamp] AS DATE) AS [Date]
  FROM @Table
  WHERE CAST([Timestamp] AS DATE) = CAST('2018-12-03' AS DATE)
) AS d
CROSS JOIN RCTE_HOURS h
ORDER BY d.[Date], h.[Hour];

Returns:

Hour CarNumbers
---- ---------- 
0    0
1    0
2    0
3    0
4    0
5    0
6    0
7    0
8    0
9    0
10   0
11   0
12   0
13   0
14   433, 420
15   420
16   570
17   0
18   0
19   0
20   0
21   0
22   0
23   0

edited Dec 12 '18 at 16:04

answered Dec 11 '18 at 09:22

LukStorms

28,916
5
31
45

`CONCAT` is an sql server 2012+ function. `VALUES` is also not supported as I see. – user2979612 Dec 11 '18 at 12:35
@user2979612 I replaced the CONCAT with a `+` concatination, and the VALUES with a Common Table Expression. A pitty that CONCAT isn't in 2008, because it has the benefits that NULL's are ignored & INT's are implicitly casted. – LukStorms Dec 11 '18 at 12:53
Thank you, this query works with some tables. However, it does not work with big data tables (the same thing with 200k+ rows). It returns a `Timeout Expired` error. I'm currently trying to work around it, but if you have a solution I'll very much appreciate. – user2979612 Dec 13 '18 at 09:18
Without knowing what primary keys or indexes on those tables? I rather not bother myself by guessing. But I solved a similar question like this by suggesting to load data into a #temporary table with a combined primary key that would benefit the speed of the SQL. F.e. `create table #tmpCars (dt date, hr int, carNumber int not null, primary key (dt, hr));`. Then select the data you need into that temp table. Then change the SQL so it uses that temp table and joins on those fields directly (without using CAST of DATEPART like in the current SQL). – LukStorms Dec 13 '18 at 11:05
And then while you're using a temporary table, might as well create a temp table for those hour numbers, and cross join to that temp table instead of using that Recursive CTE. – LukStorms Dec 13 '18 at 11:11
Correction, a primairy key on `(dt, hr)` would give errors on duplicates. So it should be a non-unique index on `(dt, hr)` I guess. Anyway, the point is that a linking to a table mostly performs better if you do it directly on indexed fields, instead of joining on transformations of those indexed fields. – LukStorms Dec 13 '18 at 11:24

Nikhil S · Answer 2 · 2018-12-11T08:53:22.920

0

SELECT   DATEPART(HOUR, Timestamp), STUFF(
             (SELECT ',' + carnumber 
              FROM tablename t1
              FOR XML PATH (''))
             , 1, 1, '') carnumbers 
from tablename 
group by  DATEPART(HOUR, Timestamp) asc ;

edited Dec 11 '18 at 08:53

answered Dec 11 '18 at 08:37

Nikhil S

3,786
4
18
32

`missing FROM clause` – user2979612 Dec 11 '18 at 08:52

Getting all distinct values and sorting them by hour

2 Answers2