Postgres: select all row with count of a field greater than 1

Question

i have table storing product price information, the table looks similar to, (no is the primary key)

no   name    price    date
1    paper   1.99     3-23
2    paper   2.99     5-25
3    paper   1.99     5-29
4    orange  4.56     4-23
5    apple   3.43     3-11

right now I want to select all the rows where the "name" field appeared more than once in the table. Basically, i want my query to return the first three rows.

I tried:

SELECT * FROM product_price_info GROUP BY name HAVING COUNT(*) > 1

but i get an error saying:

column "product_price_info.no" must appear in the GROUP BY clause or be used in an aggregate function

score 83 · Accepted Answer · answered Apr 01 '16 at 14:27

83

SELECT * 
FROM product_price_info 
WHERE name IN (SELECT name 
               FROM product_price_info 
               GROUP BY name HAVING COUNT(*) > 1)

answered Apr 01 '16 at 14:27

Juan Carlos Oropeza

47,252
12
78
118

5

Better (as in faster) to use e.g. `COUNT(id)` per the official docs. – Madbreaks Jul 09 '19 at 21:01
You could join "name" inside the subquery with the group by, and use the "exists" clause instead of "in". I think it's faster – Diego Souza Nov 09 '21 at 16:25

score 12 · Answer 2 · answered Apr 01 '16 at 14:27

12

Try this:

SELECT no, name, price, "date"
FROM (
  SELECT no, name, price, "date",
         COUNT(*) OVER (PARTITION BY name) AS cnt 
  FROM product_price_info ) AS t
WHERE t.cnt > 1

You can use the window version of COUNT to get the population of each name partition. Then, in an outer query, filter out name partitions having a population that is less than 2.

answered Apr 01 '16 at 14:27

Giorgos Betsos

71,379
9
63
98

I have tested the versions from Juan Carlos Oropeza, Giorgos Betsos, and jarlh in SQLite3. This version is the fastest one. It's 26% faster than the other two. – cyfex Nov 22 '21 at 15:24

Jeff C Johnson · Answer 3 · 2019-08-28T17:41:45.930

10

Window Functions are really nice for this.

SELECT p.*, count(*) OVER (PARTITION BY name) FROM product p;

For a full example:

CREATE TABLE product (no SERIAL, name text, price NUMERIC(8,2), date DATE);

INSERT INTO product(name, price, date) values
('paper', 1.99, '2017-03-23'),
('paper', 2.99, '2017-05-25'),
('paper', 1.99, '2017-05-29'),
('orange', 4.56, '2017-04-23'),
('apple', 3.43, '2017-03-11')
;

WITH report AS (
  SELECT p.*, count(*) OVER (PARTITION BY name) as count FROM product p
)
SELECT * FROM report WHERE count > 1;

Gives:

 no |  name  | price |    date    | count
----+--------+-------+------------+-------
  1 | paper  |  1.99 | 2017-03-23 |     3
  2 | paper  |  2.99 | 2017-05-25 |     3
  3 | paper  |  1.99 | 2017-05-29 |     3
(3 rows)

edited Aug 28 '19 at 17:41

answered Sep 25 '17 at 15:41

Jeff C Johnson

187
2
9

Nice but this doesn't fully answer op's question – Madbreaks Jul 09 '19 at 21:03
2

@Madbreaks thanks for catching that. I updated the answer. – Jeff C Johnson Aug 28 '19 at 17:43

score 2 · Answer 4 · answered Apr 01 '16 at 14:27

Self join version, use a sub-query that returns the name's that appears more than once.

select t1.*
from tablename t1
join (select name from tablename group by name having count(*) > 1) t2
  on t1.name = t2.name

Basically the same as IN/EXISTS versions, but probably a bit faster.

score -1 · Answer 5 · answered Jul 13 '22 at 10:48

-1

SELECT name, count(name)
FROM product_price_info
GROUP BY name
HAVING COUNT(name) > 1
LIMIT 3

answered Jul 13 '22 at 10:48

Vitaliy Turkevich

21

This doesn't do what the question asks for. It should select all columns for each row where the `name` value appears more than once. It should not add aggregation or new columns. – Z4-tier Jul 13 '22 at 15:11
1

Funny enough this is the answer to the question I asked google and it gave me this SO – Mike Graf May 11 '23 at 14:26

Postgres: select all row with count of a field greater than 1

5 Answers5

Linked