I've been wondering: in the company I work for, we manage lots of data, but as it's effectively given to us by customers, we don't necessarily trust it, with good reason. A lot of it has the wrong timestamp, or some of it is missing, or what have you.
One of the tasks that I have had to do a lot recently is to find elements that are null within a set of elements, then find the next non-null element, then spread that non-null value evenly across the null records and itself. That is, say we have dataset A:
A = { 0f, 1f, 2f, 5f, Null, Null, Null, 7f, Null, 8f }
It's important to note that we have to distinguish between 0 and Null. The difference is obviously that 0 is 0, while Null is no data at all.
Using LINQ, is there a way that we can basically access the following subsection of A:
Subsection { Null, Null, Null, 7f }
And have it in a collection such that we can transform it into 7f / 4 spread over the four records:
Subsection { 1.75f, 1.75f, 1.75f, 1.75f }
Such that when iterating over A again, we get the following output:
{ 0f, 1f, 2f, 5f, 1.75f, 1.75f, 1.75f, 1.75f, 4f, 4f }
Currently I do this with a pass using a numeric for loop, looking for a null element, then storing all consecutive nulls in a List<T>, and after finding the next non-null element, assigning all the values by iterating over said List<T>. It does the job, but it looks pretty nasty.
So, for the sake of narcissism, is there a way of doing this neatly (= less code clutter)?
Pseudocode:
a = { 0, 1, 2, 5, null, null, null, 7, null, 8 }
nullList = new List()   // indices of the current run of nulls
for i = 0, a.length
    if a[i] == null
        nullList.add(i)
    else
        if nullList.length > 0
            nullList.add(i)
            avg = a[i] / nullList.length   // spread the value over the whole run
            foreach index in nullList
                a[index] = avg
            nullList.clear()
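For reference, the loop described above might look roughly like this as runnable C#. This is only a sketch of the plain-loop approach, not a LINQ solution. The names GapFiller and SpreadOverNulls are made up for illustration, it assumes the data arrives as float? values, and it leaves trailing nulls (with no following value) as null, which is an assumption since that case isn't covered above.

```csharp
using System;
using System.Collections.Generic;

public static class GapFiller
{
    // Hypothetical helper: replaces each run of nulls, plus the non-null
    // value that ends the run, with that value divided evenly over the run.
    public static float?[] SpreadOverNulls(IReadOnlyList<float?> source)
    {
        var result = new float?[source.Count];
        int runStart = -1; // index of the first null in the current run, or -1

        for (int i = 0; i < source.Count; i++)
        {
            if (source[i] == null)
            {
                // Remember where the run of nulls began.
                if (runStart < 0) runStart = i;
                continue;
            }

            if (runStart < 0)
            {
                // No pending nulls: copy the value through unchanged.
                result[i] = source[i];
                continue;
            }

            // Spread the value over the nulls and the value itself.
            int count = i - runStart + 1;
            float share = source[i].Value / count;
            for (int j = runStart; j <= i; j++) result[j] = share;
            runStart = -1;
        }

        // Assumption: trailing nulls stay null because nothing follows them.
        return result;
    }
}
```

Calling GapFiller.SpreadOverNulls on dataset A gives { 0f, 1f, 2f, 5f, 1.75f, 1.75f, 1.75f, 1.75f, 4f, 4f }. Tracking only the start index of the run avoids the intermediate List<T>, but it is still a loop rather than a query.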