A friend of mine needs to to read a lot of data (about 18000 data sets) that is all formatted annoyingly. Specifically the data is supposed to be 8 columns and ~ 8000 rows of data, but instead the data is delivered as columns of 7 with the last entry spilling into the first column of the next row.
In addition every ~30 rows there is only 4 columns. This is because some upstream program is reshaping a 200 x 280 array into the 7x8120 array.
My question is this: How can we read the data into a 8x7000 array. My usual arsenal of np.loadtxt and np.genfromtxt fail when there is an uneven number of columns.
Keep in mind that performance is a factor since this has to be done for ~18000 datafiles.
Here is a link to a typical data file: http://users-phys.au.dk/hha07/hk_L1.ref