11

How to select distinct count over multiple columns?

SELECT COUNT(DISTINCT col1, col2, col3) FROM table; 

Is there a working equivalent of this in DB2?

Sunil
  • 3,404
  • 10
  • 23
  • 31
Tonan
  • 147
  • 1
  • 1
  • 9

2 Answers2

24

There are multiple options:

select count(*) from
   (select distinct col1, col2, col3 FROM table) t

The other would be to combine the columns via a CONCAT:

select count(distinct col1 || col2 || col3) from table

The first option is the cleaner (and likely faster) one.

Siavas
  • 4,992
  • 2
  • 23
  • 33
data_henrik
  • 16,724
  • 2
  • 28
  • 49
  • The first option doesn't work in DB2 . The second one works fine, thank you. – Tonan Jan 09 '18 at 13:00
  • There was a typo in the first which I fixed. – data_henrik Jan 09 '18 at 13:05
  • ...concatenating the columns may yield surprising results (for example, rows with `(11,1,1)` and `(1, 1, 11)`), so you at minimum need to know something about your domain, and will almost always require a separator be added. Plus the concatenation will be slow on a table of any real size. @user2329435 - what do you mean "doesn't work"? That's the proper way to structure that type of query. – Clockwork-Muse Jan 09 '18 at 20:49
  • 1
    `select count(*) from (select distinct col1, col2, col3 FROM table) as correlationname` works. – Tonan Jan 10 '18 at 07:22
  • Which platform / version of Db2? My version is on LUW – data_henrik Jan 10 '18 at 08:37
  • +1 for concatenation, which also works in a larger query with other summary statistitics (e.g. `select count(*), count(distinct(*)), count(distinct col1 || col2 || col3) from table`). – jarauh Apr 11 '19 at 12:18
  • @Clockwork-Muse At which size do you expect performance problems? For me, concatenation still works ok for ~ 300,000 rows, which, to me, looks like a "real size." – jarauh Apr 11 '19 at 12:19
3
select count(distinct col1 || '^' || col2 || '^' || col3) from table

to avoid problems during concatenation like between 1 || 11 which would be the same as 11 || 1.