1

I have a table of data for a particular stock that contains every single transaction going back a few years. I want to group this data on a per-day basis, so that I can see things like the daily volume, etc.

The best way I have come up with so far is:

select 
datepart(year, date),
datepart(month, date),
datepart(day, date), 
sum(volume) from hi 
group by 
datepart(year, date),
datepart(month, date),
datepart(day, date)
order by
datepart(year, date),
datepart(month, date),
datepart(day, date)

My SQL is somewhat limited, so I'm wondering if there's a more efficient way to do this?

steve8918
  • 1,820
  • 6
  • 27
  • 38
  • Thanks for the responses. I tried running my method, David B's method, and Andomar's method, and then checked the execution plan. Surprisingly, the queries all took about the same time (I have about 40 million rows in the table) but when I looked at the execution plan, the query cost of first two methods were 18% each, and the query cost of converting the date to a date type was 65%. Not sure why that is, but I'll do some more investigating. Thanks everyone for your answers, I learned a lot! – steve8918 Jan 29 '12 at 20:11

4 Answers4

1
SELECT DateAdd(dd, sub.DayCount, 0) as TheDate, SUM(sub.Volume)
FROM
(SELECT DateDiff(dd, 0, date) as DayCount, Volume FROM hi)
as Sub
GROUP BY Sub.DayCount
ORDER BY Sub.DayCount

As far as efficiency goes, both queries must read the whole table once. They incur the same IO. You should messure IO with SET STATISTICS IO ON to see if that's a problem.

Since it seems that IO is the real issue, one option is to create an index which includes both Volume, DayCount (and no other columns).

Amy B
  • 108,202
  • 21
  • 135
  • 185
1

In SQL Server 2008 and above, you can use the date type. It contains only the date part of a datetime:

group by cast([date] as date)

Or for earlier versions:

group by convert(varchar(10), [date],121)

This groups on the first 10 characters of the ODBC date, or yyyy-mm-dd.

Andomar
  • 232,371
  • 49
  • 380
  • 404
  • Thanks, this produced the most easily digestible SQL for me to understand (I'm using SQL Server 2008). However, as I noted in my comment above, when I ran the 3 queries together and checked the execution plan, this one had the highest query cost, even though the execution times were roughly the same. It seemed to spend more time in a hash map step so I will investigate more to understand why that is. – steve8918 Jan 29 '12 at 20:15
0

Here is what you are looking for

SELECT CONVERT(VARCHAR, hi.date,101), SUM(volume) 
FROM hi 
GROUP BY CONVERT(VARCHAR, hi.date,101)
0

Here's another method:

http://www.sqlfiddle.com/#!3/25480/4

select
  sum(volume) as totalPerDay,
  simple_date
from
(
select 
  convert(varchar, [date], 101) as simple_date, 
  volume, 
  id
from hi
) tmp
group by
  simple_date

Full disclosure: sqlfiddle.com is my site.

Jake Feasel
  • 16,785
  • 5
  • 53
  • 66