6

I need to build a family tree in php and MySQL. I'm pretty surprised at the lack of open source customizable html family tree-building software there is out there, but I digress. I have spent a lot of time reading about storing MySQL digraphs and family trees. Everything makes sense to me: have a table with nodes (people) and a table with edges (relationships).

The only problem I have is I'm not sure of the best way to store relationships that are not necessarily adjacent, for example sibling and grandparent relationships. At first I didn't think this would be a big deal because I can just invisibly enforce a parent (everyone has parents) that would resolve these connections.

However, I also need to be able to store relationships that may not have a common parent such as romantic partners. Everything I have read suggests a parent-child relationship, but since romantic partners do not share a common parent (hopefully), I'm not sure how to store it in the edges table. Should I use a different table, or what? If it's in the same table, how do I represent this? As long as I am doing this with non-familiar relationships, I might as well do it with family too.

To sum up, three questions:

  • How do I represent lateral relationships?
  • If a lateral relationship has a common parent, how do I store it? Should this be a family flag on the table where other lateral relationships are stored?
  • How do I store parent-child relationships where the child is two or more edges away (a grandparent), but the immediate parent is unavailable?

Any help is appreciated, and if anyone has any suggestion for some javascript/html family tree building software, that would be wonderful.

MSalters
  • 173,980
  • 10
  • 155
  • 350
Explosion Pills
  • 188,624
  • 52
  • 326
  • 405

3 Answers3

7

An idea that comes from the Geneapro schema and RootsMagic.

person
------
person_id
name (etc)

life_event_types
----------------
life_event_type_id
life_event_type_description (divorce, marriage, birth, death)

life_events
-----------
life_event_id
life_event_type_id
life_event_description
life_event_date

life_event_roles
----------------
life_event_role_id
life_event_role (mother, father, child)

person_event_role
-----------------
person_id - who
life_event_id - what happened
life_event_role_id - what this person did

So you could have a life event of type "birth", and the role_id tells you who were the parents, and who was the child. This can be extended to marriages, deaths, divorces, foster parents, surrogate parents (where you might have 3 or 4 parents with a very complicated relationship), etc.

As for storing more distant relationships, you can calculate these. For example, you can calculate the Father of anybody by getting the person who has the 'father' role with a matching event_id. You can then get the father of that person, and you have the grandfather of the original person. Anywhere that somebody is unknown, create the person with unknown data.

Dan Blows
  • 20,846
  • 10
  • 65
  • 96
  • This is interesting and has the potential to solve the problem (and I'll adhere to the suggestion of leaving no spaces in distant relationships by using placeholders if necessary). The only question I have is how can I transparently determine the direction of a relationship? Right now it seems like I would have to examine the event type description or role each time to make a judgment about where one node sits relative to the other. – Explosion Pills Apr 25 '11 at 13:39
  • @tandu I'm not sure what you mean by 'direction of a relationship'? I think that comes down to your program logic, rather than SQL. Your database merely stores that (for example) in the 'birth' event of 'Bob', 'Geoff' was the Father, 'Alice' was the mother, and 'Bob' was the 'baby'. In your program, if you want to work out who are Bob's parents, then you select from person_event_role where the life_event_id is that of Bob's birth, and the life_event_role_id is either mother or father. – Dan Blows Apr 25 '11 at 13:54
  • That is what I mean by direction. For example, if the user was Bob, I need to find where in the tree Bob's son, father, and wife are. His son is south, his father north, his wife, west or east. It seems like you already answered, though: you are saying to calculate this direction based on the person event role. I can't do this transparently in MySQL if I have to examine each string in php and calculate the direction from that, but I guess you are saying that's not what the schema is for. – Explosion Pills Apr 25 '11 at 14:23
  • @tandu Precisely, that's the job of your code. You can write classes so you can use code like `$Bob->getFather()`, `$Bob->getGrandFather()` which then return a person object of the relevant person. The problem with hard-coding the direction of the relationship is that it can't cope with the massive variety of possible relationships. How would a hard-coded schema cope with Bob getting divorced and marrying his mother-in-law? You would have to rewrite a section of code just to deal with that issue. By having a loose schema like the one above, it's the same code, whoever Bob marries. – Dan Blows Apr 25 '11 at 14:41
  • @tandu Out of interest, which solution did you go with? – Dan Blows Apr 28 '11 at 14:42
  • Haven't decided yet. Probably will tonight; I'm leaning heavily towards something like this but I'm just worried about extensibility and modularity for the long haul. I'm also under big time constraints. I'll probably end up choosing this answer. – Explosion Pills Apr 28 '11 at 14:44
  • I've decided to do something very similar, but instead of having a life_events table, I just have a relationships table with the same function (and no event). The purpose of my project is to create a snapshot of relationships rather than a timeline, so I remove the unnecessarily complexity. I have a RelationshipUserMap table that has foreign keys on user, relationship, rumstatus (public, private), and relationshiptype (father, mother). The Relationship table is just like life_events, but it is not mapped to a specific event. I'm accepting your answer as inspiration. – Explosion Pills Apr 30 '11 at 14:50
  • By the way, if you think my solution is flawed in any way, please, please let me know. – Explosion Pills Apr 30 '11 at 14:50
  • @tandu Thanks for letting me know, it was an interesting question. Your solution sounds fine - removing unnecessary complexity is always a good idea. It sounds like, should you need to introduce the timeline concept, you could always migrate the data to my more complicated schema. Good luck! – Dan Blows Apr 30 '11 at 14:55
2
person
-------
person_id
other_stuff

relation
----------
person_id_1
person_id_2
relationship_type
begin_dt
end_dt

just populate the relationship type with any value you are interested in. (FK to some picklist would be great)

I put the dates on for an interesting subdiscussion/thought provokation.

Randy
  • 16,480
  • 1
  • 37
  • 55
  • This is what I was leaning towards, but I'm not sure if it's 100% the best way. This also does not necessarily solve the problem of a grandparent-child relationship with an absent parent. If the parent is not absent, there's no need to store this relationship since it is obviated by the grandparent-parent relationship, unless you are suggesting I store both. – Explosion Pills Apr 24 '11 at 23:44
  • this model would allow you to store both - i don't know that i recommend that. you are buying the work of managin redundant relationships, which is just like issues when structures are denormalized... either that or you should tightly constrain the records based on that relation type. – Randy Apr 24 '11 at 23:47
  • @tandu, biology.... a child is never born without parents. And even in computer_science trees that's a no no. The begin and end dates solves the problem nicely I think. – Johan Apr 24 '11 at 23:50
  • Please reread my question. I'm need to store a relationship between two nodes that don't share a parent. How does the dates solve this problem? Also, this provided schema doesn't indicate which person is the parent and which the child (unless you mean `1` to be parent, for instance). – Explosion Pills Apr 24 '11 at 23:53
1

The GEDCOM data model and the Gramps data model are two of the most popular formats for exchanging geneological data between different tools. Using either of these data models should both (1) make your tool more compatible with other tools and (2) ensure that your data model is compabible with many special cases, considering both data models are specially designed to deal with geneological data.

Tools like Oxy-Gen or the Gramps PHP exporter should get you on your way with respect to how to import GEDCOM data into a database.

For more details, see also my answer to “Family Tree” Data Structure.

Community
  • 1
  • 1
John Slegers
  • 45,213
  • 22
  • 199
  • 169