First of all, even though this SQL: How do you select only groups that do not contain a certain value? thread is almost identical to my problem, it doesn't fully dissipate my confusion about the problem.
Let's have a table "Contacts" like this one:
+----------------------+
| Department FirstName |
+----------------------+
| 100 Thomas |
| 200 Peter |
| 100 Jerry |
+----------------------+
First, I want to group the rows by the department number and show number of rows in each displayed group. This, I believe, can be easily done by the following query.
SELECT Department, Count(*) As "Rows_in_group"
FROM Contacts
GROUP BY Department
This outputs 2 groups. First with dep.no. 100 containing 2 rows, second with 200 containing only one row.
But then, I want to extend the query to exclude any group that doesn't contain certain value in certain column (e.g. Thomas in FirstName). Here are my questions:
1) Reading the above-mentioned thread I was able to come up with this, which seems to work correctly:
SELECT Department, Count(*) As "Rows_in_group"
FROM Contacts
WHERE Department IN (SELECT Department FROM Contacts WHERE FirstName = "Thomas")
GROUP BY Department
Q: How does this work? I understand the "WHERE Department IN" part, but then I'd expect a value, but instead another nested query is included, which to me doesn't make much sense as I'm only beginner with SQL.
2) By accident I was able to come up with another query that also seems to work, but feels weird and I also don't understand its workings.
SELECT Department, Count(*) As "Rows_in_group"
FROM Contacts
GROUP BY Department
HAVING NOT SUM(FirstName = "Thomas") = 0
Q: How does this work? Why alteration: HAVING SUM(FirstName = "Thomas") > 0 doesn't work?
3) Q: Is there any simple and correct way to do this using the HAVING clause?
I expected, that simple "HAVING FirstName='Thomas'" after the GROUP BY would do the trick as it seems to follow a common language, but it does not.
Note that I want the whole groups to be chosen by the query so "WHERE FirstName='Thomas'" isn't s solution for my problem as it excludes all the rows that don't satisfy the condition before the grouping takes place (at least the way I understand it).