I have several "data sources", each of which provides ordered, timestamped data. I'd like to merge them into a single ordered stream (like the merge step of a merge sort). This answer describes how to do it for two enumerables, but I am not sure how to generalize it to an arbitrary number of sources.
The data sources are huge, so I cannot do this in memory; it has to be streamed.
To explain it with an example, I have something like this:
interface IDataSource
{
    IEnumerable<DateTime> GetOrderedRecords();
}
I would like to be able to have an extension method like this:
// get all sources
IEnumerable<IDataSource> dataSources = GetAllSources();
// merge sort
IEnumerable<DateTime> flattened = dataSources
    .MergeSort(s => s.GetOrderedRecords());
[Edit]
I can't load everything eagerly and then sort it because I am loading data from multiple databases and exporting it into a different one. Each IDataSource is basically Linq-to-NHibernate under the hood, and I have millions of data rows to return.
So what I need is something like:
- From all available sources, load the next (earliest) timestamp.
- Store it to disk and "forget it".
Data sources are already sorted, which makes the "merge sort" approach feasible.
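To make the intent concrete, here is a rough sketch of the kind of streaming k-way merge I have in mind. It is only an illustration, not a tested solution: the class name `MergeSortExtensions`, the helper `MergeIterator`, and the optional `comparer` parameter are made up for this example, and it assumes .NET 6 or later for `PriorityQueue<TElement, TPriority>`. It keeps one open enumerator per source and holds only the current item of each source in memory.

```csharp
using System;
using System.Collections.Generic;

public static class MergeSortExtensions
{
    // Hypothetical extension method matching the call shape shown above.
    // Assumes every inner sequence is already sorted in ascending order.
    public static IEnumerable<TItem> MergeSort<TSource, TItem>(
        this IEnumerable<TSource> sources,
        Func<TSource, IEnumerable<TItem>> selector,
        IComparer<TItem> comparer = null)
    {
        if (sources == null) throw new ArgumentNullException(nameof(sources));
        if (selector == null) throw new ArgumentNullException(nameof(selector));
        return MergeIterator(sources, selector, comparer ?? Comparer<TItem>.Default);
    }

    private static IEnumerable<TItem> MergeIterator<TSource, TItem>(
        IEnumerable<TSource> sources,
        Func<TSource, IEnumerable<TItem>> selector,
        IComparer<TItem> comparer)
    {
        var enumerators = new List<IEnumerator<TItem>>();
        try
        {
            // The queue holds one enumerator per non-exhausted source,
            // prioritized by the item that enumerator currently points at.
            var queue = new PriorityQueue<IEnumerator<TItem>, TItem>(comparer);
            foreach (var source in sources)
            {
                var e = selector(source).GetEnumerator();
                enumerators.Add(e);
                if (e.MoveNext()) queue.Enqueue(e, e.Current);
            }

            while (queue.Count > 0)
            {
                // Take the source whose current item is smallest, emit that item,
                // then advance the source and put it back if it has more data.
                var e = queue.Dequeue();
                yield return e.Current;
                if (e.MoveNext()) queue.Enqueue(e, e.Current);
            }
        }
        finally
        {
            // Runs when iteration completes or the consumer stops early.
            foreach (var e in enumerators) e.Dispose();
        }
    }
}
```

With something like this, the `dataSources.MergeSort(s => s.GetOrderedRecords())` call above would pull one record at a time from each source, so memory use grows with the number of sources rather than the number of rows. Note that `PriorityQueue` does not guarantee any particular order among equal timestamps.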