I have a sparse matrix containing roughly 100 million non-zero elements:
// [Row][Column][Element]
public IDictionary<int, IDictionary<int, decimal>> MyMatrix { get; private set; }
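For reference, the outer key is the row index and the inner key is the column index, so a single cell is written and read like this (a minimal sketch; the concrete Dictionary types and the sample indices are assumptions):

// Assumed setup: plain Dictionary instances backing the interface.
MyMatrix = new Dictionary<int, IDictionary<int, decimal>>();

// Store an element at row 5, column 42 (indices are illustrative only).
if (!MyMatrix.TryGetValue(5, out var row))
{
    MyMatrix[5] = row = new Dictionary<int, decimal>();
}
row[42] = 3.14m;

// Read it back.
decimal value = MyMatrix[5][42];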
Getting the sum of each row is very fast:
private void RowSum()
{
    var rowTotals = new ConcurrentDictionary<int, decimal>();
    Parallel.ForEach(MyMatrix, (row) =>
    {
        rowTotals.TryAdd(row.Key, row.Value.Sum(x => x.Value));
    });
}
Getting the sum of each column is much slower:
private void ColumnSum()
{
    var columnTotals = new ConcurrentDictionary<int, decimal>();
    Parallel.ForEach(MyMatrix, (row) =>
    {
        foreach (var column in row.Value)
        {
            columnTotals.AddOrUpdate(column.Key, column.Value,
                (key, old) => old + column.Value);
        }
    });
}
To make the column calculations faster, I could create a [Column][Row][Element] matrix, but that would double the RAM requirement. Is there an approach or data structure that would let the column sums run as fast as the row sums without doubling the RAM?
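For concreteness, the doubled-RAM approach I'm trying to avoid would be built roughly like this (a sketch only; BuildTransposed is a hypothetical helper name, and it uses the same dictionary types as above). It would make the column sums as cheap as the row sums, but every element gets stored a second time:

private IDictionary<int, IDictionary<int, decimal>> BuildTransposed()
{
    // [Column][Row][Element] - every value from MyMatrix is copied once more.
    var transposed = new Dictionary<int, IDictionary<int, decimal>>();
    foreach (var row in MyMatrix)
    {
        foreach (var column in row.Value)
        {
            if (!transposed.TryGetValue(column.Key, out var byRow))
            {
                transposed[column.Key] = byRow = new Dictionary<int, decimal>();
            }
            byRow[row.Key] = column.Value;
        }
    }
    return transposed;
}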