6

I'm pretty new to SQL, trying to wrap my head around it, but it's getting a little confusing. Here's a simplified version of what I'm working with.

I have this table called people:

+----+------------+-----------+
| id | name       | parent_id |
+----+------------+-----------+
|  1 | Bob        | 3         |
|  2 | John       | 5         |
|  3 | Larry      | 4         |
|  4 | Kevin      | 0         |
|  5 | Kyle       | 0         |
|  6 | Jason      | 5         |
|  7 | Mildred    | 4         |
|  8 | Mabel      | 6         |
+----+------------+-----------+

The above table is a list of people. Their parent_id column refers to who their parents are. If they have no parents on the table, their parent_id is 0.

Now, I want to be able to get separate lists of each group of people: grandparents, children, and grandchildren.

It's easy to get grandparents (Larry and Kevin), I can just do this query:

SELECT name FROM people WHERE parent_id = 0

But when it comes to getting children (John, Larry, Jason, and Mildred) and grandchildren (Bob and Mabel), I'm lost.

In English, this would be the process of getting the children: "Get all the results from the table. For each one, look at their parent id. Find the result in the table that has that as their id. See that person's id, and if it's 0, the original person was a child. Add them to the list of what we will display."

For the grandchildren, it would be the same as above, but just with an additional step.

Does that make sense? How can I write my process above as a SQL query?

eshellborn
  • 11,031
  • 5
  • 20
  • 29

5 Answers5

9

This can be solved using a simple JOIN.

To select the list of children:

SELECT c.name
FROM people p
JOIN people c ON c.parent_id = p.id
WHERE p.parent_id = 0

To select the list of grandchildren:

SELECT gc.name
FROM people p
JOIN people c ON c.parent_id = p.id
JOIN people gc ON gc.parent_id = c.id
WHERE p.parent_id = 0
  • Nice! Thank you. One more thing, I actually have a larger table than that, and there's another column, group_id. I am only interested in entries that have a group_id of 5, the rest totally I could ignore. So I tried filtering by adding "AND group_id = 5" at the end, but I got an error message that said "Column 'group_id' in where clause is ambiguous ". How can I fix that? – eshellborn Jul 09 '15 at 20:31
  • Since these statements select data from multiple `people` tables (same table, but used three different times), you need to specify which `group_id` (which table) you're talking about by referring to it by its table alias. In this case, do you want the grandchildren to have group_id = 5? If so, you would write `AND gc.group_id = 5`. –  Jul 09 '15 at 20:34
  • Could someone explain how the "letter.columnname" and "table letter" syntaxes work? – eshellborn Jul 09 '15 at 20:39
  • is there anyway to do this recursively... so select great-great grandchildren or any other grandchildren further down in the "family tree"? – shreddish Jul 12 '16 at 18:34
2

First of all, it's very important to know that this question is very easy to answer, IF you know that you're working with a fixed set of generations (down to grandchildren, for example). If this table is ultimately going to have many generations, and you want to (for example) find all of Kyle's descendants through the whole family tree, then you are not going to do it with a single query. (I have a stored procedure that deals with arbitrary levels of tree generations.) So for now, let's find up to grandparents / grandchildren.

As you said, finding the grandparents is easy...

mysql> select name from people where parent_id = 0;
+-------+
| name  |
+-------+
| Kevin |
| Kyle  |
+-------+
2 rows in set (0.00 sec)

Now, finding children isn't too bad.

Let's find Kyle's children:

mysql> select p1.name from people p1 where p1.parent_id in 
           (select p2.id from people p2 where p2.name = 'Kyle');             
+-------+
| name  |
+-------+
| John  |
| Jason |
+-------+
2 rows in set (0.02 sec)

And here's Kyle's grandchildren:

mysql> select p3.name from people p3 where p3.parent_id in
           (select p2.id from people p2 where p2.parent_id in
              (select p3.id from people p3 where p3.name = 'Kyle'));
+-------+
| name  |
+-------+
| Mabel |
+-------+
1 row in set (0.01 sec)

mysql> 

Going the other direction... who is Mabel's parent?

mysql> select p1.name from people p1 where p1.id = 
           (select p2.parent_id from people p2 where p2.name = 'Mabel');  
+-------+
| name  |
+-------+
| Jason |
+-------+
1 row in set (0.00 sec)

mysql> 

... and her grandparent:

mysql> select p1.name from people p1 where p1.id = 
           (select p2.parent_id from people p2 where p2.id = 
               (select p3.parent_id from people p3 where p3.name = 'Mabel'));
+------+
| name |
+------+
| Kyle |
+------+
1 row in set (0.00 sec)

So you can see the pattern I followed to make these queries should you need great-grandparents / great-grandchildren. However, the resulting query will become unwieldy if you need more generations, and a stored procedure that loops will be in order.

The Oracle database has a more elegant solution, a SQL extension called "CONNECT BY PRIOR". For some more reading (and a MySQL stored procedure example), check out Connect By Prior Equivalent for MySQL here on StackOverflow.

A final note: do yourself a favor, if you haven't already, and:

mysql> create index ix_parent_id on people(parent_id);
Query OK, 0 rows affected (0.06 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> 

It will dramatically improve performance of for these kind of queries.

Community
  • 1
  • 1
Michael C
  • 21
  • 2
  • Note that in my examples, I select "where name = "(name)". I did that for the readability of the example. In production code, you would want to know Mabel's id and select based on that. – Michael C Jul 09 '15 at 20:58
0

To get the grandchildren, Try this query.

SELECT name FROM people WHERE parent_id IN (SELECT id from people where parent_id >0);
King Size
  • 397
  • 1
  • 6
  • This only works for grandchildren that don't have children of their own. –  Jul 09 '15 at 20:39
  • Thanks @Sander for pointing this out. I just updated the query. I know this is not the most efficient or best solution, but it's good to know the options. – King Size Jul 09 '15 at 21:05
0

That would be easier to understand if you use a query designer. I am putting a picture of view designer of SQL Server. I hope it can help.

This is how you can find the children:

enter image description here

And I suggest you to allow null for parent_id field and use null instead of 0 so you can make a relation from id (as pk) to parent_id (as fk). That would make everything easier while using the designer.

Let me know if you also need an example for finding the grandchildren.

AliReza
  • 706
  • 2
  • 10
  • 30
0

This is without JOIN statement, but as a SQL learner, I find it as easier:

  1. For grandchildren:

    SELECT grandparent.name AS Grandparents, grandchild.name AS Grandchildren 
    FROM people AS grandparent, people AS parent, people
    AS grandchild WHERE grandchild.parent_id = parent.id 
    AND parent.parent_id = grandparent.id;
    
  2. For children:

    SELECT parent.name as Parent, child.name AS Child 
    FROM people AS parent, people AS child 
    WHERE child.parent_id = parent.id AND parent.parent_id = 0;
    
  3. And for all children-parent pairs:

    SELECT parent.name as Parent, child.name AS Child 
    FROM people AS parent, people AS child 
    WHERE child.parent_id = parent.id;
    

It took me a while, but it was fun :D

m.cekiera
  • 5,365
  • 5
  • 21
  • 35