I have a matrix of around 3000 species classifications, e.g.:
Arthropoda/Hexapoda/Insecta/Coleoptera/Cerambycidae/Anaglyptus
Each line is a sequence of taxonomic classifications. What I need to do is reduce the 3000 lines so that each one is unique, so that the file can be fed to a program that creates phylogenetic (evolutionary) trees.
I have tried to use a set, but I get an error because lists are not hashable objects. It is important to keep each line together, though, as the values in each column of a line are nested.
What's the best way to ensure I only have unique values in the last column but keep the integrity of each row?
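To make the goal concrete, here is a minimal sketch of the behaviour I am after. It assumes each line has already been split on "/" into a list, and that when two rows share the same last column it is acceptable to keep the first one seen. The function name unique_by_last_column and the genus "ExampleGenus" are made up for illustration.

```python
def unique_by_last_column(rows):
    """Keep the first row seen for each distinct value in the last column."""
    seen = set()          # last-column values already encountered (strings are hashable)
    unique_rows = []      # whole rows are kept intact, in their original order
    for row in rows:
        key = row[-1]
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Sample input; "ExampleGenus" is a made-up name used only for illustration.
lines = [
    "Arthropoda/Hexapoda/Insecta/Coleoptera/Cerambycidae/Anaglyptus",
    "Arthropoda/Hexapoda/Insecta/Coleoptera/Cerambycidae/Anaglyptus",
    "Arthropoda/Hexapoda/Insecta/Coleoptera/Cerambycidae/ExampleGenus",
]
rows = [line.strip().split("/") for line in lines]
unique_rows = unique_by_last_column(rows)

for row in unique_rows:
    print("/".join(row))
```

Only the last column goes into the set, so the unhashable-list error should not come up. If whole rows had to be unique instead, converting each row with tuple(row) before adding it to a set would presumably work as well, since tuples of strings are hashable.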
Many thanks.