The question was:
How to get row which was selected by aggregate function?
That question was answered, and the answer partially resolved my problem. But I still cannot replace GROUP BY with DISTINCT ON, for the following reason.
I need both:
- Select the id of the aggregated row (can be done with DISTINCT ON)
- Sum the ratio column (can be done with GROUP BY)
Some amount of a resource is consumed by a user. For one part of the day (10h) the user consumed 8, for another part of the day (10h) the user consumed 3, and for the remaining 4h he did not consume the resource at all. The task is to bill the consumed resource by the maximum amount and not to bill for the time when the resource was not consumed.
id | name | amount | ratio
----+------+--------+-------
1 | a | 8 | 10
2 | a | 3 | 10
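To make the expected result concrete, here is a minimal Python sketch of the billing rule applied to the two sample rows above. It only illustrates what I want to get back (the id of the row with the maximum amount, that maximum, and the summed ratio); the function name bill_by_maximum is made up for this example and has nothing to do with how Postgres evaluates a query.

```python
from collections import defaultdict

# Sample rows from the table above: (id, name, amount, ratio).
rows = [
    (1, "a", 8, 10),
    (2, "a", 3, 10),
]

def bill_by_maximum(rows):
    """For each name, return (id of the max-amount row, max amount, summed ratio)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row)

    result = {}
    for name, members in groups.items():
        top = max(members, key=lambda r: r[2])  # row with the largest amount
        result[name] = (top[0], top[2], sum(r[3] for r in members))
    return result

print(bill_by_maximum(rows))  # {'a': (1, 8, 20)}
```

So for user a the bill should use id 1, amount 8 and a total ratio of 20.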
I accomplish this task with the following query:
SELECT
    (
        SELECT id FROM t2
        WHERE id = ANY ( ARRAY_AGG( tf.id ) ) AND amount = MAX( tf.amount )
    ) AS id,
    name,
    MAX( amount ) AS ma,
    SUM( ratio )
FROM t2 tf
GROUP BY name
Why is it not allowed to use aggregate functions with DISTINCT ON?
select distinct on ( name ) id, name, amount, sum( ratio )
from t2
order by name, amount desc
Or even simpler:
select distinct on ( name ) id, name, max(amount), sum( ratio )
from t2
This would also resolve the issues with ORDER BY. There would be no need for a workaround with a subquery.
Are there technical reasons that prevent the query from the last example from working as described?
UPD
In theory this could work as follows:
First example:
select distinct on ( name ) id, name, amount, sum( ratio )
from t2
order by name, amount desc
When the first distinct row is found, its id and name are saved.
When the second and subsequent non-distinct rows are found, sum is called to accumulate ratio.
Second example:
select distinct on ( name ) id, name, max(amount), sum( ratio )
from t2
When the first distinct row is found, its id and name are saved, its ratio is accumulated, and its amount is set as the current maximum.
When the second and subsequent non-distinct rows are found, sum is called to accumulate ratio.
If any of these non-distinct rows has a greater value in the amount column, that value is saved as the new maximum and the saved value of id is updated.
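The steps for the second example can be sketched in Python as a single pass over the rows. This is only a model of the behaviour I am describing, not a claim about how Postgres works internally; the function name distinct_on_with_aggregates and the in-memory state dictionary are assumptions made for the illustration. The maximum is tracked on amount, as in max(amount) from the query.

```python
def distinct_on_with_aggregates(rows):
    """Hypothetical single pass: one output row per name,
    tracking max(amount) with its id and accumulating sum(ratio)."""
    state = {}  # name -> [id of max-amount row, max amount, ratio sum]
    for row_id, name, amount, ratio in rows:
        if name not in state:
            # First distinct row: save id and name, start both aggregates.
            state[name] = [row_id, amount, ratio]
            continue
        entry = state[name]
        entry[2] += ratio  # accumulate ratio for every non-distinct row
        if amount > entry[1]:
            # A greater amount was found: update the maximum and the saved id.
            entry[0], entry[1] = row_id, amount
    return [(v[0], name, v[1], v[2]) for name, v in state.items()]


rows = [(2, "a", 3, 10), (1, "a", 8, 10), (3, "b", 5, 4)]
print(distinct_on_with_aggregates(rows))  # [(1, 'a', 8, 20), (3, 'b', 5, 4)]
```

Note that the row order does not matter here: id 2 is seen first for a, but the saved id is replaced by 1 once the greater amount is found.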
UPD
If more than one row has amount = max(amount), Postgres can return the value from either row, as it does for any field that is not under DISTINCT ON.
To be sure which one is returned, the query may be qualified with an ORDER BY clause, as is done here
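A small Python sketch of what I mean by making the choice deterministic: sort by name, then by amount descending, then by some tiebreaker, and keep the first row of each name. Using id as the tiebreaker is my assumption for the example; any column that makes the order unambiguous would do. The function name first_row_per_name is likewise made up.

```python
from itertools import groupby

def first_row_per_name(rows):
    """Keep one row per name: the largest amount, ties broken by the smallest id."""
    # Rows are (id, name, amount, ratio); the sort key plays the role of
    # an ordering by name, amount descending, id.
    ordered = sorted(rows, key=lambda r: (r[1], -r[2], r[0]))
    return [next(group) for _, group in groupby(ordered, key=lambda r: r[1])]


# Two rows of user a share the maximum amount 8.
rows = [(2, "a", 8, 10), (1, "a", 8, 10), (3, "a", 3, 4)]
print(first_row_per_name(rows))  # [(1, 'a', 8, 10)]
```

Without the id tiebreaker either of the two rows with amount 8 could come first, which is the ambiguity described above.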