2

(Seems that I cannot recover my previous SE account)

I'm currently trying to make a database with the tree of life and I was wondering what would be the best approach to such a thing since it's quite huge and still evolving.

The levels (for those unfamiliar with this) are the following

Kingdom
Phylum
  SubPhylum (may be empty)
Class
  SubClass (may be empty)
Order
  SubOrder (may be empty)
Family
    SubFamily (may be empty)
  Tribe (may be empty)
    SubTribe (may be empty)
Genus
    SubGenus (may be empty)
  Section (may be empty)
    SubSection (may be empty)
  Series (may be empty)
    SubSeries (may be empty)
Species
    SubSpecies (may be empty)
  Variety (may be empty)
    SubVariety (may be empty)
  Form (may be empty)
    SubForm

(Wikipedia article: https://en.wikipedia.org/wiki/Taxonomic_rank)

My first idea was to make a single table with everything inside like this:

Class         Order         Family       Genus     Species
Actinop[...]  
Actinop[...]  Silurif[...]
Actinop[...]  Silurif[...]  Callich[...]
Actinop[...]  Silurif[...]  Callich[...] Corydoras 
Actinop[...]  Silurif[...]  Callich[...] Corydoras C. aeneus

Lots of duplicate data and can get very heavy. [...] used to make things shorter here.

So I though that I could do a table for each type:

Table Phylum :

phylum_name phylum_id kingdom_id

Table Class :

class_name class_id phylum_id kingdom_id

Table ClassSub :

subclass_name subclass_id class_id phylum_id kingdom_id

[...]

Table Species

 species_name species_id subgenus_id genus_id subtribe_id tribe_id subfamily_id family_id superfamily_id

Third approach:

Table TreeOfLife

 id   name            levelnumber          levelname  parentid   commonname
 1    Animalia        1                    kingdom    NULL
 2    Chordata        2                    phylum     1
 ...
 7    Corydoras       7                    genus      6
 8    Corydoras leucomelas  8              species    7        False spotted catfish

You get the concept. The goal here is to have a multi-level dropdown and once the user get to, let's say, the family, if the family isn't there, he can add it, add it's subfamily (if present), then the genus required to that family, then the species and add a picture of it (or whatever)

If the family Callichthyidae is present, he can then add the Corydoradinae subfamily and the Corydoras genus. Then he can add a species and a picture.

What would be the best approach to save diskspace and speed up SELECT queries? I've never worked with such complicated multi-level data and I would very much appreciate the help of the community.

Thanks!

Shadow
  • 33,525
  • 10
  • 51
  • 64
Jayd
  • 183
  • 3
  • 15

0 Answers0