I am writing a program to analyze some spreadsheet data. There are two columns: start time and duration (both variables are Doubles). The spreadsheet is not sorted. I need to sort the columns together by start time (that is, the durations have to stay with their matching start times). There are a few thousand rows, and analysis will happen periodically so I don't want to keep sorting the entire collection over and over again as more data gets added.
A TreeMap using start time as the key and duration as the value seemed perfect because it would insert each row into the correct position as it gets read in, and keep the two pieces of data together as it goes.
And it did work perfectly for 90% of my data. Unfortunately, I realized tonight that sometimes two events will have the same start time. Since a TreeMap doesn't keep duplicate keys, I lose a row whenever the new data overwrites the old one.
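To make the problem concrete, here is a minimal sketch of roughly what I'm doing. The class name, the `load` helper, and the sample rows are made up for illustration; my real code reads the rows from the spreadsheet.

```java
import java.util.TreeMap;

class OverwriteDemo {
    // Loads (start time, duration) rows into a TreeMap keyed by start time.
    static TreeMap<Double, Double> load(double[][] rows) {
        TreeMap<Double, Double> byStart = new TreeMap<>();
        for (double[] row : rows) {
            // A repeated start time replaces the duration stored earlier.
            byStart.put(row[0], row[1]);
        }
        return byStart;
    }

    public static void main(String[] args) {
        // Made-up sample rows: two events share the start time 3.0.
        double[][] rows = { {3.0, 0.5}, {1.0, 2.0}, {3.0, 1.5} };
        System.out.println(load(rows)); // prints {1.0=2.0, 3.0=1.5}
    }
}
```

Three rows go in but only two come out: the row with duration 0.5 is gone.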
There are many posts about this (see this and this and sort of this) and I see two suggestions keep coming up:
- a custom comparator to 'trick' the TreeMap into allowing duplicates.
- using something like `TreeMap<Double, List<Double>>` to store multiple values for a key.
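As far as I understand it, the comparator trick amounts to something like the sketch below. The comparator and class names are my own, and this is only my reading of those posts, not a recommended approach.

```java
import java.util.Comparator;
import java.util.TreeMap;

class ComparatorTrickDemo {
    // Never reports two keys as equal, so equal start times end up as
    // separate entries. This deliberately violates the Comparator contract.
    static final Comparator<Double> NEVER_EQUAL = (a, b) -> {
        int c = Double.compare(a, b);
        return c != 0 ? c : 1;
    };

    static TreeMap<Double, Double> load(double[][] rows) {
        TreeMap<Double, Double> byStart = new TreeMap<>(NEVER_EQUAL);
        for (double[] row : rows) {
            byStart.put(row[0], row[1]);
        }
        return byStart;
    }

    public static void main(String[] args) {
        // Made-up sample rows: two events share the start time 3.0.
        double[][] rows = { {3.0, 0.5}, {1.0, 2.0}, {3.0, 1.5} };
        TreeMap<Double, Double> byStart = load(rows);
        System.out.println(byStart.size());   // prints 3
        System.out.println(byStart.get(3.0)); // prints null
    }
}
```

All three rows survive and iterate in sorted order, but because the comparator never returns 0, lookups such as `get` and `containsKey` can no longer find any key.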
The first suggestion is the easiest for me to implement, but I read comments saying that it breaks the TreeMap's contract and isn't a good idea. The second suggestion is doable but will make the analysis more complicated, as I'll have to iterate through each key's list as I iterate through the keys, instead of simply iterating through the keys alone.
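For comparison, this is how I picture the second suggestion, again as a rough sketch with made-up names and sample rows. The nested loop in `main` is the extra iteration I was hoping to avoid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupedDemo {
    // Groups every duration under its start time instead of overwriting.
    static TreeMap<Double, List<Double>> load(double[][] rows) {
        TreeMap<Double, List<Double>> byStart = new TreeMap<>();
        for (double[] row : rows) {
            // Create the list the first time a start time is seen, then append.
            byStart.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return byStart;
    }

    public static void main(String[] args) {
        // Made-up sample rows: two events share the start time 3.0.
        double[][] rows = { {3.0, 0.5}, {1.0, 2.0}, {3.0, 1.5} };
        // Outer loop walks start times in sorted order; inner loop walks
        // the durations that share that start time.
        for (Map.Entry<Double, List<Double>> e : load(rows).entrySet()) {
            for (double duration : e.getValue()) {
                System.out.println(e.getKey() + " " + duration);
            }
        }
    }
}
```

This keeps every row, with durations for the same start time listed in the order they were read.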
What I need is a way to keep the two columns sorted together by start time while allowing duplicate start times. I'm hoping someone can suggest the best way to do this. Thanks so much for your help.