1

Column names: name, reading, channel, time

I'm currently trying to the find the max value for each distinct name. Each name has many values as it is read on a time interval of once every hour. The goal is to pick out the max value for each distinct name and record the time interval that it occurred in.

So far I have gotten this code:

 SELECT name, max(reading) AS "max_reading", data, channel, time 
 FROM interval_data 
 GROUP BY name, data, channel

However, this gives me an error saying that time has to be aggregated or grouped by. Is there any easy way to append the time that the max value occurred at to the output and not have to perform a grouped by/aggregate function on the column?


Sample Data:

NAME    READING     DATA    CHANNEL     TIME
1       1           1       1           1/15/2015 09:00
1       3           1       1           1/15/2015 10:00
1       2           1       1           1/15/2015 11:00
1       5           1       1           1/15/2015 12:00
2       2           1       1           1/15/2015 09:00
2       4           1       1           1/15/2015 10:00
2       6           1       1           1/15/2015 11:00
2       5           1       1           1/15/2015 12:00
3       7           1       1           1/15/2015 09:00
3       3           1       1           1/15/2015 10:00
3       5           1       1           1/15/2015 11:00
3       2           1       1           1/15/2015 12:00

Desired Output: (Max READING for each distinct NAME with TIME when the max READING reading occurred)

NAME    READING     DATA    CHANNEL     TIME
1       5           1       1           1/15/2015 12:00
2       6           1       1           1/15/2015 11:00
3       7           1       1           1/15/2015 09:00
pnear
  • 11
  • 3
  • 4
    What [tag:rdbms] are you using? – Mureinik Jan 15 '16 at 20:03
  • Time colum is not in your `group by` clause. – Nomeaning25 Jan 15 '16 at 20:03
  • It would be helpful if you can provide the sample data and the required output. Not sure if just adding TIME column to the group by clause would solve your problem – Techie Jan 15 '16 at 20:05
  • @Nimesh yes it will. The error clearly states that time has to be grouped by. Adding time under group by solves the problem – Richard Hamilton Jan 15 '16 at 20:07
  • 2
    @RichardHamilton It will resolve the error but will it produce the output required by OP. That was my concern. – Techie Jan 15 '16 at 20:09
  • 2
    I doesnt solve the problem because "Time" is probably a TimeStamp an he will get the same amount of rows as if he wasn't using MAX – Nomeaning25 Jan 15 '16 at 20:09
  • @RichardHamilton: The user's requirement was to *append the time that the max value occurred at to the output and not have to perform a grouped by/aggregate function on the column*. It would eliminate the error but it probably wouldn't give the correct result set. – Cᴏʀʏ Jan 15 '16 at 20:09
  • @Nomeaning25 that's absolutely correct. This timestamp column can be displayed by grouping at many levels like year, month, date, hour etc. so the problem has to be described properly, – Techie Jan 15 '16 at 20:11
  • @pnear What is the time interval that you are looking out for? and what is the sql rdbms? If you want to display the exact timestamp for the max reading, you will have to use subquery or join. – Techie Jan 15 '16 at 20:15
  • Look up `top-n-per-group` or `greatest-n-per-group`. For SQL Server see [Retrieving n rows per group](http://dba.stackexchange.com/questions/86415/retrieving-n-rows-per-group). – Vladimir Baranov Jan 17 '16 at 00:15
  • @Nimesh I've posted the sample data and desired output. Hopefully this should clear up the problem – pnear Jan 18 '16 at 15:07
  • @pnear You haven't mentioned what database system you are using? You can check the asnwers mentioned below if it resolves your query – Techie Jan 18 '16 at 18:59

4 Answers4

1

There is, but not in a single query, because adding time to the GROUP BY clause would render your query useless.

So, in a sub-query get all the fields you need for a reasonable grouping. Join this sub-query and get all the grouped fields, your max_reading and eventually time from interval_data.

MyBrainHurts
  • 2,480
  • 2
  • 27
  • 27
1

Without really knowing how DATA and CHANNEL come into play, you could use something like this to find what data you need for each NAME's MAX(READING):

SELECT
    id.NAME, id.READING, id.DATA, id.CHANNEL, id.TYPE
FROM 
    INTERVAL_DATA id
JOIN (
    SELECT 
        NAME, MAX(READING) AS READING
    FROM 
        INTERVAL_DATA
    GROUP BY
        NAME
    ) id_agg
    ON 
    id.NAME = id_agg.NAME 
    AND 
    id.READING = id_agg.READING    

If you need CHANNEL and TYPE to distinguish a unique row, include those in the inner-join sub-query and in the join's ON parameters.

Cᴏʀʏ
  • 105,112
  • 20
  • 162
  • 194
0

Select requires some fiddeling. Maybe sometnihg like this:

Select B.NAME, A."Max_READING", B.DATA, B.CHANNEL, B.TIME
    From INTERVAL_DATA B INNER JOIN 
        ( Select NAME, Max(READING) As "Max_READING", DATA, CHANNEL
        From INTERVAL_DATA Group By NAME, DATA, CHANNEL) A ON B.NAME = A.NAME AND B.DATA = A.DATA AND B.CHANNEL = A.CHANNEL

Similar problem here: SQL Select only rows with Max Value on a Column

Community
  • 1
  • 1
Nomeaning25
  • 263
  • 1
  • 11
0

You don't really need to do a GROUP BY clause, you can just grab the MAX(Reading) value using a sub query in the WHERE clause:

  IF OBJECT_ID('tempdb.dbo.#T','U') IS NOT NULL
    DROP TABLE #T

CREATE TABLE #T
    (NAME VARCHAR(50) NOT NULL,
    CHANNEL VARCHAR(50) NOT NULL,
    DATA VARCHAR(50) NOT NULL,
    READING INT NOT NULL,
    [TIME] DATETIME NOT NULL)

INSERT INTO #T
    (NAME, CHANNEL, DATA, READING, [TIME])
VALUES
    ('Oscar','NBC', 'green', 1, '2015-01-01'),
    ('Oscar','NBC', 'green', 200, '2015-01-02'),
    ('Oscar','NBC', 'green', 3, '2015-01-03'),
    ('Oscar','NBC', 'red', 4, '2015-01-01'),
    ('Oscar','NBC', 'red', 5, '2015-01-02'),
    ('Oscar','NBC', 'red', 62, '2015-01-03'),
    ('Oscar','CNN', 'red', 7, '2015-01-01'),
    ('Oscar','CNN', 'red', 8, '2015-01-02'),
    ('Oscar','CNN', 'red', 9, '2015-01-03'),
    ('Luke','NBC', 'green', 1, '2015-01-01'),
    ('Luke','NBC', 'green', 2, '2015-01-02'),
    ('Luke','NBC', 'green', 3, '2015-01-03'),
    ('Luke','NBC', 'red', 4, '2015-01-01'),
    ('Luke','NBC', 'red', 5, '2015-01-02'),
    ('Luke','NBC', 'red', 6, '2015-01-03'),
    ('Luke','CNN', 'red', 7, '2015-01-01'),
    ('Luke','CNN', 'red', 88, '2015-01-12'),
    ('Luke','CNN', 'red', 9, '2015-01-22'),
    ('George','NBC', 'green', 1, '2015-01-01'),
    ('George','NBC', 'green', 2, '2015-01-02'),
    ('George','NBC', 'green', 3, '2015-01-09'),
    ('George','NBC', 'red', 4, '2015-01-01'),
    ('George','NBC', 'red', 5, '2015-01-02'),
    ('George','NBC', 'red', 6, '2015-01-03'),
    ('George','CNN', 'red', 7, '2015-01-01'),
    ('George','CNN', 'red', 8, '2015-01-02'),
    ('George','CNN', 'red', 11, '2015-01-15')

--Provides the Max Reading for each Name
SELECT t.NAME, t.CHANNEL, t.DATA, t.READING, t.[TIME]
FROM #t AS t
WHERE
    t.reading = (SELECT MAX(sub.reading)
                FROM #t AS sub
                WHERE
                    sub.NAME = t.NAME)
--Provides the Max Reading for each name, channel, and data grouping
SELECT t.NAME, t.CHANNEL, t.DATA, t.READING, t.[TIME]
FROM #t AS t
WHERE
    t.reading = (SELECT MAX(sub.reading)
                FROM #t AS sub
                WHERE
                    sub.NAME = t.NAME
                    AND
                    sub.CHANNEL = t.CHANNEL
                    AND
                    sub.DATA = t.DATA);

The first query gives you the Max Reading and Time for each distinct name while the second provides the Max Reading and Time for each distinct name, channel, and data.

Jericho
  • 213
  • 1
  • 10