2

I was encountered with a "not a GROUP BY expression"error.

I would need to search for similar title, medium and description. Artist table consist of artistid, artist last name and artist first name. work table consist of workid, title, medium, description, artistid artistid are unique key.

List the details of any works of art that have more than one copy recorded in the database.

SELECT W.workid, W.title, W.medium, W.description, W.artistid, A.FirstName, A.LastName
FROM dtoohey.work W, dtoohey.artist A
GROUP BY W.artistid, A.FirstName, A.LastName 
having count(*) > 1;
Wei Jun
  • 31
  • 1
  • 7
  • Maybe exact error message would be helpful, otherwise `having count(*) > 1` does not seem ok for me but that may mean I simply did never use it that way. – Dilettant Jun 04 '16 at 17:06
  • 1
    You are selecting from work and from artist, with no relationship between the two? Did you mean to join them, perhaps on artistid? (Do you understand my question?) Separately, please edit your post to remove the tag that doesn't apply. This is tagged both as MySQL and as Oracle, which is probably not right. –  Jun 04 '16 at 18:19
  • In my little world, there is always a relationship in the model that links the facts modeled by the table, say in this case (taking the names as is) every artist will relate to 0 or more work items. So it could be helpful to have (in a more normalized system) a third table noting only the relationships artistid to workid. Than it is even more easy to answer these kind of questions. Non normalize, the work table might carry a foreign key ref to artist.artistid, and write operations on these tables would then be chained in transactions ensuring integrety of the relation. Is there any relation? – Dilettant Jun 05 '16 at 06:20
  • Sorry for concat of to comments, but time of edit would be over, so: Try your artistid in table work as join criterium. The third realtionship might nevertheless come in handy, when a work has mor artists associated. – Dilettant Jun 05 '16 at 06:34

4 Answers4

1

Seems you don't have a proper join between the tables (i have suggest one .. you should do the correct one)

If you want the group by on W.artistIid (alias the count of the workid for each artist) you cannot have W.workid, W.title, W.medium in select

SELECT  W.artistid, A.FirstName, A.LastName
FROM dtoohey.work W
INNER JOIN dtoohey.artist A ON A.artistid = W.artistid
GROUP BY W.artistid, A.FirstName, A.LastName 
having count(*) > 1;

otherwise if you want check if the select return more that one rows for the column select you must add all column to the group by clause

SELECT W.workid, W.title, W.medium, W.description, W.artistid, A.FirstName, A.LastName
FROM dtoohey.work W
INNER JOIN dtoohey.artist A ON A.artistid = W.artistid
GROUP BY W.workid, W.title, W.medium, W.description, W.artistid, A.FirstName, A.LastName 
having count(*) > 1;
ScaisEdge
  • 131,976
  • 10
  • 91
  • 107
  • Think perhaps the two tables need to be joined? –  Jun 04 '16 at 18:21
  • @mathguy could be .. (or not) the question is about group by .. and group by is also for cartesian product .. between the two table .. the question remain substantially the same .. with or without join .. – ScaisEdge Jun 04 '16 at 18:25
  • The question is about works that have more than one copy in the db. How could that possibly be about the Cartesian product? Let's be serious, OK? –  Jun 04 '16 at 18:34
  • In case of grouping the cartesian product `having count(*) > 1` doesn't make any sense, because all records have count>1 ... one exception is an edge case where there is at most one record in one of these two tables. – krokodilko Jun 04 '16 at 18:37
  • I have added a suggestion for JOIN .. so the OP can evaluate properly – ScaisEdge Jun 04 '16 at 18:42
  • @mathguy: isn't SELECT'ng FROM a table list with two or more tables an **implicit** join itself? – Dilettant Jun 05 '16 at 05:57
  • @Dilettant but if you don't declare the join condition you obtain a cartesia product non a proper related join .. – ScaisEdge Jun 05 '16 at 05:59
  • Yes, explicit is better than implicit in most cases, but it is not say a concat or zip of the tables columns but a FULL INNER JOIN. The target was the comment above, not the question as a whole. Did someone try the answer out? If I had more than a mobile on my hands now, I would do so ;-) – Dilettant Jun 05 '16 at 06:03
1

Maybe writing an answer (that focuses on @mathguy's observation of the missing join specifity and the one, that the mix of SELECT columns versus GROUP BY/HAVING does not fit) is better in finding out what your problem is and giving ideas how to enhance the question ;-) ... next time I suggest to be more heavy on the question, so the world does not have to work so heavy on the answers.

I do not think this is a MySQL, Oracle, or what not database specific problem, but more a SQL beginners learning journey ... you may want to also look for join explanations here: "Difference between Inner Join & Full join"

Starting from the minimal subset of info given in question: 2 tables artist and work related presumable over a shared id (say artist_id).

One thing directly asking for trouble in databases like PostgreSQL or ParStream is selecting columns in group by queries neither being grouped by nor aggregated / filtered. But here we go:

Create tables:

$psql> CREATE TABLE artist(artist_id INT, given_name VARCHAR(42), family_name VARCHAR(99));
CREATE TABLE
$psql> CREATE TABLE work(work_id INT, artist_id INT, title VARCHAR(42));
CREATE TABLE

Insert some data:

$psql> INSERT INTO artist VALUES(1, 'John', 'Doe');
INSERT 0 1
$psql> INSERT INTO artist VALUES(2, 'Natalie', 'Noir');
INSERT 0 1
$psql> INSERT INTO work VALUES(43, 1, 'The game is on');
INSERT 0 1
$psql> INSERT INTO work VALUES(44, 1, 'The game is over');
INSERT 0 1
$psql> INSERT INTO work VALUES(98, 2, 'La nuit commonce');
INSERT 0 1
$psql> INSERT INTO work VALUES(97, 2, 'Un jour se lve');
INSERT 0 1

Check what is in it:

$psql> SELECT * FROM work;
 work_id | artist_id |      title       
---------+-----------+------------------
      43 |         1 | The game is on
      44 |         1 | The game is over
      98 |         2 | La nuit commonce
      97 |         2 | Un jour se lve
(4 rows)

$psql> SELECT * FROM artist;
 artist_id | given_name | family_name 
-----------+------------+-------------
         1 | John       | Doe
         2 | Natalie    | Noir
(2 rows)

Show the implicit INNER JOIN:

$psql> SELECT * FROM work W, artist A;
 work_id | artist_id |      title       | artist_id | given_name | family_name 
---------+-----------+------------------+-----------+------------+-------------
      43 |         1 | The game is on   |         1 | John       | Doe
      43 |         1 | The game is on   |         2 | Natalie    | Noir
      44 |         1 | The game is over |         1 | John       | Doe
      44 |         1 | The game is over |         2 | Natalie    | Noir
      98 |         2 | La nuit commonce |         1 | John       | Doe
      98 |         2 | La nuit commonce |         2 | Natalie    | Noir
      97 |         2 | Un jour se lve   |         1 | John       | Doe
      97 |         2 | Un jour se lve   |         2 | Natalie    | Noir
(8 rows)

Show the explicit INNER JOIN with a dummy condition to let the parser pass our query (Update: Don't use this at home, only to show the mish-mash.):

$psql> SELECT * FROM work W INNER JOIN artist A ON 1 = 1;
 work_id | artist_id |      title       | artist_id | given_name | family_name 
---------+-----------+------------------+-----------+------------+-------------
      43 |         1 | The game is on   |         1 | John       | Doe
      43 |         1 | The game is on   |         2 | Natalie    | Noir
      44 |         1 | The game is over |         1 | John       | Doe
      44 |         1 | The game is over |         2 | Natalie    | Noir
      98 |         2 | La nuit commonce |         1 | John       | Doe
      98 |         2 | La nuit commonce |         2 | Natalie    | Noir
      97 |         2 | Un jour se lve   |         1 | John       | Doe
      97 |         2 | Un jour se lve   |         2 | Natalie    | Noir
(8 rows)

Now a more useful INNER JOIN matching only these entries from the two tables, that are related through "creator" relationship:

$psql> SELECT * FROM work W INNER JOIN artist A ON W.artist_id = A.artist_id;
 work_id | artist_id |      title       | artist_id | given_name | family_name 
---------+-----------+------------------+-----------+------------+-------------
      43 |         1 | The game is on   |         1 | John       | Doe
      44 |         1 | The game is over |         1 | John       | Doe
      98 |         2 | La nuit commonce |         2 | Natalie    | Noir
      97 |         2 | Un jour se lve   |         2 | Natalie    | Noir
(4 rows)

So above we blindly trust the data managing part to magically enter artist_id values always correctly and matching our expectations (in real life a REFERENCES foreign key constraint would surely placed on the column in the work table (no work without an artist/creator would dictate the artist table to be the "first" causally).

Above you also see that selecting from a list of tables is identical to an INNER JOIN without any constraint i.e. the cartesian product of all entries from table work with all all entries from table artist.

Now your query (edited a tad for the minimal table model) besides not clear to me in its request idea, errors out as explained on top of this answer text:

$psql>  SELECT W.work_id, W.title,  W.artist_id, A.given_name, A.family_name FROM work W, artist A GROUP BY W.artist_id, A.given_name, A.family_name HAVING COUNT(*) > 1;

ERROR:  column "w.work_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT W.work_id, W.title,  W.artist_id, A.given_name, A.fam...

This is of course not cured by using a more meaningful joined input set (claro, as the error points to the mismatch in select and group by lists:

$psql> SELECT W.work_id, W.title,  W.artist_id, A.given_name, A.family_name FROM work W INNER JOIN artist A ON W.artist_id = A.artist_id GROUP BY W.artist_id, A.given_name, A.family_name HAVING COUNT(*) > 1;

ERROR:  column "w.work_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT W.work_id, W.title,  W.artist_id, A.given_name, A.fam...

You need suggest input on what you want to achieve to obtain a single answer (working). Until you do so, here are some offerings:

Given that you join on only existing pairs of artis and work ids, you do not need that having clause, as neither non-existing artist nor missing works nor missing combinations of artists and works will ever enter the rowset your query works on, so:

$psql> SELECT title, R.* FROM ( SELECT W.work_id AS work_id_filtered, W.artist_id, A.given_name, A.family_name FROM work W INNER JOIN artist A ON W.artist_id = A.artist_id GROUP BY W.work_id, W.artist_id, A.given_name, A.family_name) R INNER JOIN work WW ON WW.work_id = R.work_id_filtered;
      title       | work_id_filtered | artist_id | given_name | family_name 
------------------+------------------+-----------+------------+-------------
 The game is on   |               43 |         1 | John       | Doe
 The game is over |               44 |         1 | John       | Doe
 La nuit commonce |               98 |         2 | Natalie    | Noir
 Un jour se lve   |               97 |         2 | Natalie    | Noir
(4 rows)

This should give you a bit clumsy but good enough for my sunday morning the nice list of all titles (non-grouped files) combined with the grouped fields from the inner query. Formatted query might be written as:

SELECT title,
       R.*
FROM
  (SELECT W.work_id AS work_id_filtered,
                       W.artist_id,
                       A.given_name,
                       A.family_name
   FROM
   work W
   INNER JOIN artist A ON W.artist_id = A.artist_id
   GROUP BY W.work_id,
            W.artist_id,
            A.given_name,
            A.family_name) R
INNER JOIN
work WW ON WW.work_id = R.work_id_filtered;

Removing any GROUP BY (until the question offers detail on why it would be needed for the task):

$psql> SELECT W.work_id, W.title,  W.artist_id, A.given_name, A.family_name FROM work W INNER JOIN artist A ON W.artist_id = A.artist_id;

 work_id |      title       | artist_id | given_name | family_name 
---------+------------------+-----------+------------+-------------
      43 | The game is on   |         1 | John       | Doe
      44 | The game is over |         1 | John       | Doe
      98 | La nuit commonce |         2 | Natalie    | Noir
      97 | Un jour se lve   |         2 | Natalie    | Noir
(4 rows)

Query formatted to not have to scroll horizontally:

SELECT W.work_id, W.title,  W.artist_id, A.given_name, A.family_name 
FROM work W INNER JOIN artist A ON W.artist_id = A.artist_id;

Note: Yes, as @ThorstenKettner pointed rightfully out, I made up the term "FULL INNER JOIN", I smoetimes do, sorry. Maybe my brain needs cartesian filling to balance the LEFT|RIGHT|FULL OUTER JOINs - who knows ;-)

Community
  • 1
  • 1
Dilettant
  • 3,267
  • 3
  • 29
  • 29
  • I think that you made up the term `FULL INNER JOIN`. At least I've never in my life heard of this. What you are showing is a `CROSS JOIN`. And using `INNER JOIN ON 1=1` is just obfuscating the matter and should not be used. – Thorsten Kettner Jun 06 '16 at 08:53
  • Thanks @ThorstenKettner, so yes, in effect yes, but not formally ;-) and yes (but smugglery is often depending on obfuscation) updated the answer. Thanks again for the feedback. – Dilettant Jun 06 '16 at 09:03
0

I used scaisEdge answer and edited. What I realize was if I select workid(unique key), I won't be able to display anything but without it, all my field is up.

Wei Jun
  • 31
  • 1
  • 7
0

As has been mentioned, the problem is mainly that you don't yet fully understand what you are doing.

First point is your join. By merely separating the tables with commas, you are using a syntax that was made redundant more than twenty years ago. It seems rare that you, as a beginner, use this. You must have found this in a very old book or tutorial. In short: Don't join tables like this. Use explicit joins. A comma means CROSS JOIN. So what you have is:

FROM dtoohey.work W CROSS JOIN dtoohey.artist A

That means that you combine every artist with every work. This is most likely not what you want. You want to join related artists and works. Your query shows there is an artistid in your work table, so one work is made by one artist in your model. The appropriate join would hence be:

FROM dtoohey.work w 
INNER JOIN dtoohey.artist a ON a.artistid = w.artistid

The second point is that you are aggregating rows. GROUP BY W.artistid, A.FirstName, A.LastName tells the DBMS to aggregate the rows such that you get one result row per artist. With having count(*) > 1 you say you only want artists with more than one work. But in your select clause you are showing a work (W.workid, W.title, W.medium, W.description). Which one? If you only show one row per artist and each artist has more than one work, which work of an artist do you show? The DBMS notices that you forgot to tell it what to select and raises an error. And you will probably agree now that the query with the given GROUP BY and HAVING clauses makes no sense.

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73