I have a series of datasets output by a program. My goal is to plot an average of the datasets as a line graph, using NumPy for the calculation and matplotlib's pyplot for the plot. My problem is that I cannot control the length of the output datasets.
For example, I have four datasets whose lengths vary between 200 and 400 points, with x values normalised to the range 0 to 1, and I want to calculate the median (or mean) line for the four datasets.
All I can think of so far is to interpolate (linearly would be sufficient) to add extra data points to the shorter sequences, or somehow periodically remove values from the longer sequences. Does anyone have any suggestions?
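A minimal sketch of the interpolation idea, assuming each dataset is a sequence of [x, y] pairs with increasing x. It resamples every line onto one shared x grid with `np.interp` (linear interpolation) and then combines the lines point by point. The function name `average_lines`, the `reducer` parameter and the synthetic test lines are placeholders made up for illustration.

```python
import numpy as np

def average_lines(lines, n_points=200, reducer=np.mean):
    """Resample each line onto a shared x grid and combine them point by point."""
    arrays = [np.asarray(line, dtype=float) for line in lines]
    # Use only the x range covered by every line, so nothing is extrapolated.
    x_min = max(a[0, 0] for a in arrays)
    x_max = min(a[-1, 0] for a in arrays)
    grid = np.linspace(x_min, x_max, n_points)
    # np.interp(x_new, x_known, y_known) expects x_known to be increasing.
    resampled = np.vstack([np.interp(grid, a[:, 0], a[:, 1]) for a in arrays])
    return grid, reducer(resampled, axis=0)

# Two synthetic lines of different lengths (assumed data, not real output)
x_a = np.linspace(0.0, 1.0, 5)
x_b = np.linspace(0.0, 1.0, 9)
line_a = np.column_stack([x_a, 2 * x_a])
line_b = np.column_stack([x_b, 4 * x_b])

grid, mean_y = average_lines([line_a, line_b], n_points=11)
```

Passing `reducer=np.median` would give the median line instead, and the result could then be drawn with `matplotlib.pyplot.plot(grid, mean_y)`.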
At the moment I am importing the data with csv.reader and appending row by row to a list, so the result is a list of lists, each holding one [x, y] pair, which I think is equivalent to a 2D array?
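On the list-of-lists point: a list of lists is not itself a NumPy array, but it converts to a 2D one as long as every row has the same length and the values are numeric. A small sketch, assuming a two-column CSV with no header row; the in-memory `fake_file` stands in for the real file.

```python
import csv
import io
import numpy as np

# Stand-in for a real file; with a file on disk, use open(path, newline="").
fake_file = io.StringIO("0.33,2\n0.66,5\n1,5\n")

rows = []
for row in csv.reader(fake_file):
    # csv.reader yields strings, so convert each value to float.
    rows.append([float(value) for value in row])

data = np.array(rows)            # 2D array of shape (n_points, 2)
x, y = data[:, 0], data[:, 1]    # column slices for x and y
```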
I was actually thinking it may be easier to delete excess data points than to interpolate. For example, starting with four lists, I could remove unnecessary points along the x axis, since the x values are normalised and increasing, then cull points whose step size is too small by referring to the step sizes of the shortest list. (This explanation may not be very clear; I will try to write up an example and post it tomorrow.)
An example data set would be:
line1 = [[0.33, 2], [0.66, 5], [1, 5]]
line2 = [[0.25, 43], [0.5, 53], [0.75, 6.5], [1, 986]]
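One possible reading of the point-removal idea, applied to the two example lines: for each x in the shortest line, keep only the point of the longer line whose x is nearest. This is a sketch under that assumption, and `thin_to_reference` is a made-up helper name.

```python
import numpy as np

line1 = np.array([[0.33, 2], [0.66, 5], [1, 5]], dtype=float)
line2 = np.array([[0.25, 43], [0.5, 53], [0.75, 6.5], [1, 986]], dtype=float)

def thin_to_reference(line, reference_x):
    """For each reference x, keep the point of `line` whose x is nearest."""
    # Pairwise |x_ref - x_line|, shape (len(reference_x), len(line))
    distance = np.abs(reference_x[:, None] - line[None, :, 0])
    nearest = distance.argmin(axis=1)
    return line[nearest]

thinned = thin_to_reference(line2, line1[:, 0])
# thinned keeps the points at x = 0.25, 0.75 and 1.0
mean_y = (line1[:, 1] + thinned[:, 1]) / 2
```

Note that the kept x values (0.25, 0.75) do not coincide with the reference x values (0.33, 0.66), so averaging the y values this way mixes points taken at slightly different positions. Interpolation avoids that mismatch, which may matter when y changes quickly between neighbouring points.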