1

I'm working on a MySQL database that contains persons. My problem is that, (I will simplify to make my point):

I have three tables:

Persons(id int, birthdate date)
PersonsLastNames(id int, lastname varchar(30))
PersonsFirstNames(id int, firstname varchar(30))

The id is the common key. There are separate tables for last names and first names because a single person can have many first names and many last names.

I want to make a query that returns all persons with, let's say, one last name. If I go with

select birthdate, lastname, firstname from Persons, PersonsLastNames,
PersonsFirstNames where Persons.id = PersonsLastNames.id and
Persons.id = PersonsFirstNames.id and lastName = 'Anderson'

I end up with a table like

1/1/1970 Anderson Steven //Person 1
1/1/1970 Anderson David  //Still Person 1
2/2/1980 Smith Adam      //Person 2
3/3/1990 Taylor Ed       //Person 3

When presenting this, I would like to have

1/1/1970 Anderson Steven David 
2/2/1980 Smith Adam [possibly null?]
3/3/1990 Taylor Ed [possibly null?]

How do I join the tables to introduce new columns in the result set if needed to hold several first names or last names for one person?

philfreo
  • 41,941
  • 26
  • 128
  • 141
Ed Taylor
  • 277
  • 1
  • 6
  • 13
  • What is the practical application of this? When does a person have multiple first and last names? And since the relationship you described is many to many, the result you're looking for could have any number of columns. – Ken Dec 23 '09 at 02:10
  • @Ken - it could exists in a "Known Aliases" sense. – Dereleased Dec 23 '09 at 02:22
  • You might consider renaming this question to something like "MySQL - How to return dynamic columns from JOIN?" to help future people searching for a similar issue. – philfreo Dec 23 '09 at 02:27
  • I want separation of first names and last names because there should be a simple way to get statistics of exactly how many "Steven" and "David" there are in the database. If I would store it like "Steven David" this would become more complicated I guess? The Person table is also a lot bigger in the real database. I wouldn't want the redundancy it would mean to skip the first name and last name table and only store that information in the person table? – Ed Taylor Dec 23 '09 at 02:30
  • 1
    @Ed But you lose the relationship between the two names. If a person is named Bob Jones and Steve Smith, if you store the name types in separate tables, there's no way to know if the person's alias is Bob Jones or Bob Smith. I think this should probably just be a single table. It can have a surrogate key to uniquely identify each row, a personID, firstname, lastname. This way you can still get stats on exactly how many "Steven"s there are. – Ken Dec 23 '09 at 02:36
  • The question is whether you intend the `PersonsLastNames.id` field to represent the `Person.id` or the `id` of the last name. Your question suggest the former, while your comments sort of suggest the latter, in which case you'll need another table associating each last name to each person. – philfreo Dec 23 '09 at 03:22

4 Answers4

1

SQL does not support a dynamic number of columns in the query select-list. You have to define exactly as many columns as you want (notwithstanding the * wildcard).

I recommend that you fetch the multiple names as rows, not columns. Then write some application code to loop over the result set and do whatever you want to do for presenting them.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
1

The short answer is, you can't. You'll always have to pick a fixed number of columns. You can, however, greatly improve the syntax of your query by using the ON keyword. For example:

SELECT
    birthdate,
    firstName,
    lastName
FROM
    Persons
    INNER JOIN PersonsLastNames
        ON Persons.id = PersonsLastNames.id
    INNER JOIN PersonsFirstNames
        ON Persons.id = PersonsFirstNames.id
WHERE
    lastName = 'Anderson'
GROUP BY
    lastName, firstName
HAVING
    count(lastName) = 1

Of course, my query includes a few extra provisions at the end so that only persons with only one last name specified would be grabbed, but you can always remove those.

Now, what you CAN do is choose a maximum number of these you'd like to retrieve and do something like this:

SELECT
    birthdate,
    lastName,
    PersonsFirstNames.firstName,
    IFNULL(p.firstName,''),
    IFNULL(q.firstName,'')
FROM
    Persons
    INNER JOIN PersonsLastNames
        ON Persons.id = PersonsLastNames.id
    INNER JOIN PersonsFirstNames
        ON Persons.id = PersonsFirstNames.id
    LEFT JOIN PersonsFirstNames p
        ON Persons.id = p.id
        AND p.firstName <> PersonsFirstNames.firstName
    LEFT JOIN PersonsFirstNames q
        ON Persons.id = q.id
        AND q.firstName <> PersonsFirstNames.firstName
        AND q.firstName <> p.firstName
GROUP BY
    lastName

But I really don't recommend that. The best bet is to retrieve multiple rows, and then iterate over them in whatever application you're using/developing.

Make sure you read up on your JOIN types (Left-vs-Inner), if you're not already familiar, before you start. Hope this helps.

EDIT: You also might want to consider, in that case, a slightly more complex GROUP BY clause, e.g.

GROUP BY
    Persons.id, lastName
Dereleased
  • 9,939
  • 3
  • 35
  • 51
1

Does your application really need to handle unlimited first/last names per person? I don't know your specific needs, but that seems like it may be a little extreme. Regardless...

Since you can't really have a dynamic number of columns returned, you could do something like this:

SELECT birthdate, lastname, GROUP_CONCAT(firstname SEPARATOR '|') AS firstnames
FROM Persons, PersonsLastNames, PersonsFirstNames
WHERE Persons.id = PersonsLastNames.id
AND Persons.id = PersonsFirstNames.id
GROUP BY Persons.id

This would return one row per person that has a last name, with the (unlimited) first names separated by a pipe (|) symbol, GROUP_CONCAT function.

birthdate             lastname   firstnames
---                   ---        ---
1970-01-01 00:00:00   Anderson   Steven|David
1980-02-02 00:00:00   Smith      Adam
1990-03-03 00:00:00   Taylor     Ed
philfreo
  • 41,941
  • 26
  • 128
  • 141
  • 1
    Yes the GROUP_CONCAT makes the output I want! Thank you! What about efficiency in a database with up to 8 million persons? – Ed Taylor Dec 23 '09 at 02:45
  • I assume that you won't actually have a lot of first names per person, even 10 would be a lot, right? If that's the case then I believe the GROUP_CONCAT function won't have any performance issue, rather it will be the simple joins and group by that take the time. With the right indices I think this should be fine though. Note: be sure to look at the `group_concat_max_len` setting. – philfreo Dec 23 '09 at 03:19
0

I think the closest thing you could do is to Group By Person.Id and then do string concatenation. Perhaps this post will help: How to use GROUP BY to concatenate strings in MySQL?

Community
  • 1
  • 1
Ken
  • 1,830
  • 3
  • 20
  • 32