1

I have a PostgreSQL table like this:

CREATE TABLE foo (man_id, subgroup, power, grp)
AS VALUES
    ( 1, 'Sub_A',  1, 'Group_A' ),
    ( 2, 'Sub_B', -1, 'Group_A' ),
    ( 3, 'Sub_A', -1, 'Group_B' ),
    ( 4, 'Sub_B',  1, 'Group_B' ),
    ( 5, 'Sub_A', -1, 'Group_A' ),
    ( 6, 'Sub_B',  1, 'Group_A' ),
    ( 7, 'Sub_A', -1, 'Group_B' ),
    ( 8, 'Sub_B',  1, 'Group_B' );

The power calculation works like this:

Total Power of Subgroup Sub_A in the grp Group_A is (1 + (-1) ) = 0
Total Power of Subgroup Sub_B in the grp Group_A is ((-1) + 1 ) = 0
Total Power of Subgroup Sub_A in the grp Group_B is ((-1) + (-1) ) = -2
Total Power of Subgroup Sub_B in the grp Group_B is (1 + 1 ) = 2

So the power of Sub_A in the Group_A is not equal to power of Sub_A in the Group_B
So the power of Sub_B in the Group_A is not equal to power of Sub_B in the Group_B

I want to query the database with a subgroup name. If for a same subgroup name power is equal across all the other grp names, then it will return True, else False.

As an example, sub_A and sub_B both will return False. What would be the recommended way to do this?

I want something like:

SELECT * FROM foo (solution query will be added)
WHERE subgroup = 'sub_A'

And it returns False.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Dinesh Ahuja
  • 925
  • 3
  • 15
  • 29
  • Please don't cross post. – Walter Mitty Jun 18 '18 at 23:44
  • I took the liberty to fix some distracting typos, orthogonal to the question (or so I assume). Please always disclose your version of Postgres. Roughly how many rows, distinct groups and subgroups? Cardinality and value frequencies may be relevant for choosing the best query. – Erwin Brandstetter Jun 19 '18 at 02:00
  • what result do you expect if the subgroup `Sub_A` does not exist at all, or exists for only 1 group. I feel the choice isn't clear, as one might alternatively want `NULL`, `FALSE` or `TRUE` – Haleemur Ali Jun 19 '18 at 02:53

3 Answers3

1

Assuming that in your CREATE TABLE statement, 'sub_A' is meant to be 'Sub_A' (because Postgres is case-sensitive), and that your power values are actually integers (if they aren't, just add casts; the code below is simpler without them), then you can calculate the power for each subgroup within a group as follows:

select
    subgroup,
    grp,
    sum(power) as sum_power
from
    foo
group by
    subgroup,
    grp

To determine whether all of the total power values for a subgroup are the same, just check that the minimum and maximum values are the same. Convert the previous query into a subquery, where the main query does that comparison, as follows:

select
    subgroup
from (
    select
        subgroup,
        grp,
        sum(power) as sum_power
    from
        foo
    group by
        subgroup,
        grp
    ) as subpwr
group by
    subgroup
having
    min(sum_power) = max(sum_power);
rd_nielsen
  • 2,407
  • 2
  • 11
  • 18
1

Read the question carefully

I want to query the database with a subgroup name.

And:

I want something like

SELECT * FROM foo (solution query will be added)
WHERE subgroup = 'Sub_A'

The important point for performance is to exclude irrelevant rows early and only compute aggregates for the given subgroup. Then (assuming more than a few distinct subgroups), an index on (subgroup) can help:

CREATE INDEX ON foo (subgroup);

Each of the following queries returns FALSE if at least two groups have different total sums for the given subgroup, and TRUE in all other cases (with a minor exception for query 5, see below).

Query 1

SELECT count(DISTINCT total_power) = 1
FROM  (
   SELECT sum(power) AS total_power
   FROM   foo
   WHERE  subgroup = 'Sub_B'  -- exclude irrelevant rows early!
   GROUP  BY grp
   ) sub;

Query 2

SELECT count(*) = 1
FROM  (
   SELECT true
   FROM  (
      SELECT sum(power) AS total_power
      FROM   foo
      WHERE  subgroup = 'Sub_C'
      GROUP  BY grp
      ) sub2
   GROUP  BY total_power
   ) sub2;

Query 3

SELECT count(*) OVER () = 1
FROM  (
   SELECT sum(power) AS total_power
   FROM   foo
   WHERE  subgroup = 'Sub_A'
   GROUP  BY grp
   ) sub
GROUP  BY total_power
LIMIT  1;

Query 4

(
SELECT FALSE
FROM  (
   SELECT sum(power) AS total_power
   FROM   foo
   WHERE  subgroup = 'Sub_A'
   GROUP  BY grp
   ) sub
GROUP  BY total_power
OFFSET 1
LIMIT  1
)
UNION ALL
SELECT TRUE
LIMIT 1;

This one is special. Related answers with explanation:

Query 5

SELECT min(total_power) = max(total_power)  -- can fail for NULL values
FROM  (
   SELECT sum(power) AS total_power
   FROM   foo
   WHERE  subgroup = 'Sub_A'
   GROUP  BY grp
   ) sub;

The last can fail if NULL values in power are allowed. (But you would have to define expected results in this case anyway.)

I ran an extensive test and found all queries to perform about the same under ideal conditions:

db<>fiddle here

Query 5 tended to be a tad bit faster than the rest.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • as always, your answers are illustrative. I was wondering if you could compare my solution to the 5 alternatives you have tested. I'm curious what effect avoiding a subquery in favour of a window-expression & sorting has on performance in this setting. – Haleemur Ali Jun 19 '18 at 03:17
  • 1
    @HaleemurAli: I added [your smart query](https://stackoverflow.com/a/50920135/939860) to the test battery: works, and about the same performance as the rest: *db<>fiddle [here](https://dbfiddle.uk/?rdbms=postgres_10&fiddle=39628360b8711060bb79f0588229f8b3)* – Erwin Brandstetter Jun 19 '18 at 11:21
1

Here's 1 way not yet mentioned by the other answers

SELECT SUM(power) = FIRST_VALUE(SUM(power)) OVER () powpow 
FROM foo
WHERE subgroup = 'Sub_A'
GROUP BY grp
ORDER BY powpow
LIMIT 1

-- returns:
-- false if some values differ
-- true if all values are the same
-- no rows if the where condition fails to match any rows.
Haleemur Ali
  • 26,718
  • 5
  • 61
  • 85