12

I have a table similar to:

domain      |   file      | Number 
------------------------------------     
aaa.com     | aaa.com_1   | 111
bbb.com     | bbb.com_1   | 222
ccc.com     | ccc.com_2   | 111
ddd.com     | ddd.com_1   | 222
eee.com     | eee.com_1   | 333

I need to query the number of Domains that share the same Number and their File name ends with _1. I tried the following:

select count(domain) as 'sum domains', file
from table 
group by Number
having
count(Number) >1 and File like '%\_1'; 

It gives me:

sum domains | file
------------------------------
2           | aaa.com
2           | bbb.com

I expected to see the following:

sum domains | file
------------------------------
1           | aaa.com
2           | bbb.com

Because the Number 111 appears once with File ends with _1 and _2, so it should count 1 only. How can I apply the 2 conditions that I stated earlier correctly ?

user1810868
  • 1,565
  • 8
  • 23
  • 30
  • MySQL is free to choose any value from non-aggregated columns in each group, so the value selected for `file` is indeterminate. Your expected output shows a count of 2 against `bbb.com` - but there is only one `bbb.com` domain/file in your table - why do you expect that `file` value in the second group (which presumably has `Number=222`) and not, for example, `ddd.com`? – eggyal Dec 15 '12 at 23:13
  • As you said, the selected value for `file` is undetermined. I put `bbb.com` as example. But the problem is not in the `file`. my problem in the `sum domains`. My query result counts counts the files that do not end with `_1`, which I do not want and I put a condition in the `having` clause but it seems there is something wrong. – user1810868 Dec 15 '12 at 23:21
  • Well, my comment was really intended to point out that the objective of your query is unclear. If the value returned for `file` is unimportant to you, why select it at all? It's more normal to select the column(s) on which one has grouped the results: i.e. `Number` in this case. To answer your question, it seems you *probably* intend for the filter on `file` to occur prior to the grouping - i.e. in a `WHERE` clause rather than in the `HAVING` clause (which is performed on the grouped results, after aggregation such as `COUNT(*)` has already been performed). – eggyal Dec 15 '12 at 23:23
  • You are right. the `file` is kind of meaningless. It is more meaningful to use the 'Number'. If I used `where' for to filter files wich ends with `_1`, then how can I add another condition which is that the `Number` appeared more than once. i.e, I need to query the number of files in `File` that end with `_1` and share the same `Number`. – user1810868 Dec 15 '12 at 23:39
  • Do you want those `Number`s which have more than one `file` ending with `1`, or those `NUmber`s which have more than one record at least one of which ends in `1`? – eggyal Dec 15 '12 at 23:57
  • None of them. I need to know the cases in which many (more than one) `file`that ends with `_1` use the same `number`. For example, the `number` 222 have two occurrence of files ending with `_1`. So, I want to print the counter of occurrence (i.e, 2) and the `number` that these two files that ends with `_1` have shared. i.e, my output should be: 2 | 2222 – user1810868 Dec 16 '12 at 00:09
  • So then: ``SELECT Number, COUNT(*) AS `sum domains` FROM `table` WHERE file LIKE '%\_1' GROUP BY Number HAVING `sum domains` > 1`` – eggyal Dec 16 '12 at 00:19
  • Wouldn't you also expect a result for `ddd.com`, since it qualifies just as much as `aaa.com`? – Bohemian Dec 16 '12 at 06:44
  • Yes, but when you use `group by`, it will print one item in the group. So, in the group of which has `number` is `222`, it will print normally the first item which is in my case `bbb`. `ddd` is also member of the group that has the `number` is `222`. there are other ways needed to print all items in the group. See: http://stackoverflow.com/questions/13897748/how-to-print-every-item-resulted-from-group-by-statement – user1810868 Dec 16 '12 at 10:01
  • @eggyal: Can you add it as a solution? – user1810868 Dec 16 '12 at 10:02

4 Answers4

15

As documented under SELECT Syntax:

The HAVING clause is applied nearly last, just before items are sent to the client, with no optimization.

In other words, it is applied after the grouping operation has been performed (in contrast with WHERE, which is performed before any grouping operation). See WHERE vs HAVING.

Therefore, your current query first forms the resultset from the following:

SELECT   COUNT(domain) AS `sum domains`, file
FROM     `table`
GROUP BY Number

See it on sqlfiddle:

| SUM DOMAINS |      FILE |
---------------------------
|           2 | aaa.com_1 |
|           2 | bbb.com_1 |
|           1 | eee.com_1 |

As you can see, the values selected for the file column are merely one of the values from each group—as documented under MySQL Extensions to GROUP BY:

The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.

Your current query then proceeds to filter these results according to your HAVING clause:

HAVING   COUNT(Number) > 1 AND file LIKE '%\_1'

With the values of file selected above, every single group matches on the second criterion; and the first two groups match on the first criterion. Therefore the results of the complete query are:

| SUM DOMAINS |      FILE |
---------------------------
|           2 | aaa.com_1 |
|           2 | bbb.com_1 |

Following your comments above, you want to filter the records on file before grouping and then filter the resulting groups for those containing more than one match. Therefore use WHERE and HAVING respectively (and select Number instead of file to identify each group):

SELECT   Number, COUNT(*) AS `sum domains`
FROM     `table`
WHERE    file LIKE '%\_1'
GROUP BY Number
HAVING   `sum domains` > 1

See it on sqlfiddle:

| NUMBER | SUM DOMAINS |
------------------------
|    222 |           2 |
Community
  • 1
  • 1
eggyal
  • 122,705
  • 18
  • 212
  • 237
6

i am using by following

having ( SUM(qty) > 4 AND SUM(qty) < 15 )
Ahmed Awan
  • 347
  • 3
  • 9
  • While this code snippet may solve the problem, it doesn't explain why or how it answers the question. Please [include an explanation for your code](//meta.stackexchange.com/q/114762/269535), as that really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. – Luca Kiebel Nov 08 '18 at 12:27
  • The braces, showing that you can string clauses similar to an inline select helped me with the syntax. – David Barnett May 05 '21 at 11:23
  • @LucaKiebel Next time i will explain. Thanks :) – Ahmed Awan May 24 '21 at 11:10
1

You cannot have the file name in the SELECT statement if it is not also in the GROUP BY. You have to get your GROUP BY result than JOIN back to the original and add the filter logic like so:

SELECT *
FROM
(
 select count(domain) as 'sum_domains', Number
 from table 
 group by Number
 having
 count(Number) >1
) result
join table t on result.Number = t.Number
WHERE file like '%\_1'
Ibo
  • 4,081
  • 6
  • 45
  • 65
Julie Kalu
  • 11
  • 1
0

Try the nested query below:

  select count(domain) as 'sum domains', domain as fileName
  from 
   (select domain, file from tableName
    group by Number
    having count(Number) >1) as temp
  WHERE file like '%\_1';
Yogendra Singh
  • 33,927
  • 6
  • 63
  • 73