5

I have a database that looks like the following;

circuit_uid   |  customer_name   | location      | reading_date | reading_time | amps | volts  |  kw  | kwh | kva  |  pf  |  key
--------------------------------------------------------------------------------------------------------------------------------------
cu1.cb1.r1    | Customer 1       | 12.01.a1      | 2012-01-02   | 00:01:01     | 4.51 | 229.32 | 1.03 |  87 | 1.03 | 0.85 |    15
cu1.cb1.r1    | Customer 1       | 12.01.a1      | 2012-01-02   | 01:01:01     | 4.18 | 230.3 | 0.96 |  90 | 0.96 | 0.84 |    16
cu1.cb1.s2    | Customer 2       | 10.01.a1      | 2012-01-02   | 00:01:01     | 7.34 | 228.14 | 1.67 | 179 | 1.67 | 0.88 | 24009
cu1.cb1.s2    | Customer 2       | 10.01.a1      | 2012-01-02   | 01:01:01     | 9.07 |  228.4 | 2.07 | 182 | 2.07 | 0.85 | 24010
cu1.cb1.r1    | Customer 3       | 01.01.a1      | 2012-01-02   | 00:01:01     | 7.32 | 229.01 | 1.68 | 223 | 1.68 | 0.89 | 48003 
cu1.cb1.r1    | Customer 3       | 01.01.a1      | 2012-01-02   | 01:01:01     | 6.61 | 228.29 | 1.51 | 226 | 1.51 | 0.88 | 48004

What I am trying to do is produce a result that has the KWH reading for each customer from the earliest (min(reading_time)) on that date, the date will be selected by the user in a web form.

The result would be/should be similar to;

Customer 1   87
Customer 2   179
Customer 3   223

There are more than the number of rows per day shown here and there are more customers and the number of customers would change regularly.

I do not have much experience with SQL, I have looked at subqueries etc. but I do not have the chops to figure out how arrange it by the earliest reading per customer and then just output the kwh column.

This is running in PostgreSQL 8.4 on Redhat/CentOS.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Alan Ennis
  • 145
  • 1
  • 3
  • 10
  • Closely related: http://stackoverflow.com/q/3800551/939860 – Erwin Brandstetter Nov 17 '12 at 18:19
  • 2
    Please ask a **new question**. It's generally not ok to change the nature of the question substantially after good answers have been given. you can always link back to this one for context. I reverted your changes, you can find it in the edit log. – Erwin Brandstetter Nov 17 '12 at 18:46

3 Answers3

3
select customer_name,
       kwh,
       reading_date, 
       reading_time
from (
   select customer_name,
          kwh,
          reading_time,
          reading_date,
          row_number() over (partition by customer_name order by reading_time) as rn
   from readings
   where reading_date = date '2012-11-17'
) t
where rn = 1

As an alternative:

select r1.customer_name,
       r1.kwh, 
       r1.reading_date,
       r1.reading_time
from readings r1
where reading_date = date '2012-11-17'
and reading_time = (select min(r2.reading_time)
                    from readings
                    where r2.customer_name = r1.customer_name
                    and r2.read_date = r1.reading_date);

But I'd expect the first one to be faster.

Btw: why do you store date and time in two separate columns? Are you aware that this could be handled better with a timestamp column?

  • 1
    Good point about the timestamp column. Extracting date or time is super simple & super fast: `ts_column::date` or `ts_column::time`. – Erwin Brandstetter Nov 17 '12 at 18:24
  • i guess it is just my lack of experience that had me split out the date and time. Thank you for your excellent answer. Please accept my apologies and see the revised question. – Alan Ennis Nov 17 '12 at 18:45
3

This should be among the fastest possible solutions:

SELECT DISTINCT ON (customer_name)
       customer_name, kwh  -- add more columns as needed.
FROM   readings
WHERE  reading_date = user_date
ORDER  BY customer_name, reading_time

Seems to be another application of:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0
   SELECT rt.circuit_uid ,  rt.customer_name, rt.kwh
   FROM READING_TABLE rt JOIN  
       (SELECT circuit_uid, reading_time
       FROM READING_TABLE
       WHERE reading_date = '2012-01-02'
       GROUP BY customer_uid
       HAVING MIN(reading_time) = reading_time) min_time
   ON (rt.circuit_uid = min_time.circuit_uid 
      AND rt.reading_time = min_time.reading_time);

Parameterize the reading_date value in above query.

Yogendra Singh
  • 33,927
  • 6
  • 63
  • 73
  • That will give you an error because `kwh` is neither in the `group by` clause nor in an aggregate –  Nov 17 '12 at 18:11