3

I have a table in Oracle database, which have 40 columns. I know that if I want to do a group by query, all the columns in select must be in group by.

I simply just want to do:

select col1, col2, col3, col4, col5 from table group by col3

If I try:

select col1, col2, col3, col4, col5 from table group by col1, col2, col3, col4, col5

It does not give the required output.

I have searched this, but did not find any solution. All the queries that I found using some kind of Add() or count(*) function.

In Oracle is it not possible to simply group by one column ?

UPDATE:

My apologies, for not being clear enough.

My Table:

+--------+----------+-------------+-------+
| id     | col1     | col2        | col3  |
+--------+----------+-------------+-------+
| 1      | 1        | some text 1 | 100   |
| 2      | 1        | some text 1 | 200   |
| 3      | 2        | some text 1 | 200   |
| 4      | 3        | some text 1 | 78    |
| 5      | 4        | some text 1 | 65    |
| 6      | 5        | some text 1 | 101   |
| 7      | 5        | some text 1 | 200   |
| 8      | 1        | some text 1 | 200   |
| 9      | 6        | some text 1 | 202   |
+--------+----------+-------------+-------+

and by running following query:

select col1, col2, col3 from table where col3='200' group by col1;

I will get the following desired Output:

+--------+----------+-------------+-------+
| id     | col1     | col2        | col3  |
+--------+----------+-------------+-------+
| 2      | 1        | some text 1 | 200   |
| 3      | 2        | some text 1 | 200   |
| 7      | 5        | some text 1 | 200   |
+--------+----------+-------------+-------+
pro_newbie
  • 336
  • 2
  • 6
  • 19
  • 2
    You can not do group by like that. What is that you need? Maybe if you share some sample data/output, we can help –  May 11 '17 at 15:40
  • `order by` instead of `group by`? `select col1, col2, col3, col4, col5 from table order by col3` and no you can't group by a single column when you have multiple. Think about it. if we have a,b,c in row one and x,y,c in row 2. and we gorup by the 3rd column only... what should the system pick for columns 1 and 2? a or x? b or y? the system can't decide for you which is why it's invalid. (though valid in order versions of mySQL and current version if you change a setting) – xQbert May 11 '17 at 15:43
  • Put simply in most RDBMS engines it makes no sense to have a group by w/o aggregation as the row count on aggregation has to diminish from the original set and the group by explains to what level the aggregation should occur. DISTINCT makes sense when you have duplication of records and you need to remove the duplicates MySQL extends the group by so the system "PICKS" what to display from the select when not grouped by. It could pick a or it could pick x and it could vary based from one execution to the next. Which is why most engines don't extend the group by – xQbert May 11 '17 at 15:49

5 Answers5

7

Long comment here;

Yeah, you can't do that. Think about it... If you have a table like so:

Col1 Col2 Col3
A    A    1
B    A    2
C    A    3

And you're grouping by only Col2, which will group down to a single row... what happens to Col1 and Col3? Both of those have 3 distinct row values. How is your DBMS supposed to display those?

Col1 Col2 Col3
A?   A    1?
B?        2?
C?        3?

This is why you have to group by all columns, or otherwise aggregate or concatenate them. (SUM(),MAX(), MIN(), etc..)

Show us how you want the results to look and I'm sure we can help you.

Edit - Answer:

First off, thanks for updating your question. Your query doesn't have id but your expected results do, so I will answer for each separately.

Without id

You will still need to group by all columns to achieve what you're going for. Let's walk through it.

If you run your query without any group by:

select col1, col2, col3 from table where col3='200'

You will get this back:

+----------+-------------+-------+
| col1     | col2        | col3  |
+----------+-------------+-------+
| 1        | some text 1 | 200   |
| 2        | some text 1 | 200   |
| 5        | some text 1 | 200   |
| 1        | some text 1 | 200   |
+----------+-------------+-------+

So now you want to only see the col1 = 1 row once. But to do so, you need to roll all of the columns up, so your DBMS knows what do to with each of them. If you try to group by only col1, you DBMS will through an error because you didn't tell it what to do with the extra data in col2 and col3:

select col1, col2, col3 from table where col3='200' group by col1 --Errors

+----------+-------------+-------+
| col1     | col2        | col3  |
+----------+-------------+-------+
| 1        | some text 1 | 200   |
| 2        | some text 1 | 200   |
| 5        | some text 1 | 200   |
| ?        | some text 1?| 200?  |
+----------+-------------+-------+

If you group by all 3, your DBMS knows to group together the entire rows (which is what you want), and will only display duplicate rows once:

select col1, col2, col3 from table where col3='200' group by col1, col2, col3

+----------+-------------+-------+
| col1     | col2        | col3  |
+----------+-------------+-------+
| 1        | some text 1 | 200   |
| 2        | some text 1 | 200   | --Desired results
| 5        | some text 1 | 200   |
+----------+-------------+-------+

With id

If you want to see id, you will have to tell your DBMS which id to display. Even if we group by all columns, you won't get your desired results, because the id column will make each row distinct (They will no longer group together):

select id, col1, col2, col3 from table where col3='200' group by id, col1, col2, col3

+--------+----------+-------------+-------+
| id     | col1     | col2        | col3  |
+--------+----------+-------------+-------+
| 2      | 1        | some text 1 | 200   | --id = 2
| 3      | 2        | some text 1 | 200   |
| 7      | 5        | some text 1 | 200   |
| 8      | 1        | some text 1 | 200   | --id = 8
+--------+----------+-------------+-------+

So in order to group these rows, we need to explicitly say what to do with the ids. Based on your desired results, you want to choose id = 2, which is the minimum id, so let's use MIN():

select MIN(id), col1, col2, col3 from table where col3='200' group by col1, col2, col3
--Note, MIN() is an aggregate function, so id need not be in the group by

Which returns your desired results (with id):

+--------+----------+-------------+-------+
| id     | col1     | col2        | col3  |
+--------+----------+-------------+-------+
| 2      | 1        | some text 1 | 200   |
| 3      | 2        | some text 1 | 200   |
| 7      | 5        | some text 1 | 200   |
+--------+----------+-------------+-------+

Final thought

Here were your two trouble rows:

+--------+----------+-------------+-------+
| id     | col1     | col2        | col3  |
+--------+----------+-------------+-------+
| 2      | 1        | some text 1 | 200   |
| 8      | 1        | some text 1 | 200   |
+--------+----------+-------------+-------+

Any time you hit these, just think about what you want each column to do, one at a time. You will need to handle all columns any time you do grouping or aggregates.

  • id, you only want to see id = 2, which is the MIN()
  • co1, you only want to see distinct values, so GROUP BY
  • col2, you only want to see distinct values, so GROUP BY
  • col3, you only want to see distinct values, so GROUP BY
Aaron Dietz
  • 10,137
  • 1
  • 14
  • 26
  • I have update my question, can you please take a look at it – pro_newbie May 12 '17 at 13:36
  • @pro_newbie Updated my answer. This is conceptual stuff that everyone learns differently. I tried to explain it in a few different ways. Hopefully it helps. – Aaron Dietz May 12 '17 at 14:37
  • Thanks for explaining in detail. I have now much more understanding of it. One more thing, what if all the **col2**'s values are also different from each other, then what you explained in `Without id` section would not work. It will give me all the records. How to handle that scenario ? – pro_newbie May 13 '17 at 10:31
  • @pro_newbie In that scenario you would have to decide how you want to handle `col2`. Your options would be 1.) Don't select it. 2.) Group by it, but accept the fact that the rows won't consolidate as much. (but yeah, in your hypothetical, they would not group at all...) 3.) Use an aggregate to show only the one you want (such as `MIN()`, as we did with `id` in the with id section) – Aaron Dietz May 15 '17 at 14:35
  • 4.) Concatenate `col2` values into a string, using `LISTAGG()` for example. I would only do this if you are pulling data... It is not a good way to store data. See: http://stackoverflow.com/questions/12145379/how-to-retrieve-two-columns-data-in-a-b-format-in-oracle – Aaron Dietz May 15 '17 at 14:43
  • 1
    Great answer! Went really step by step to the gist of the problem. – Dan Macak Sep 06 '18 at 12:38
2

maybe analytic functions is what you need

try smth like this:

select col1, col2, col3, col4, col5 
, sum(*) over (partition by col1) as col1_summary
, count(*) over () as total_count
from t1 

if you google the article - you find thousands on examples for example this Introduction to Analytic Functions (Part 1)

are
  • 2,535
  • 2
  • 22
  • 27
2

Why do you want to GROUP BY , wouldn't you want to ORDER BY instead?

If you state an English language version of the problem you are trying to solve (i.e. the requirements) it would be easier to be more specific.

Roger Cornejo
  • 1,507
  • 1
  • 8
  • 7
0

I guess,maybe you need upivot function

or post your specific final result you want

select  col3, col_group 
from table
UNPIVOT ( col_group for value in ( col1,col2,col4,col5)) 
Maru Jiang
  • 26
  • 4
0
SELECT * FROM table 
WHERE id IN (SELECT MIN(id) FROM table WHERE col3='200' GROUP BY col1)