I have several data files that I load using

import pandas as pd

# header=8: row 8 (zero-indexed) of each file holds the column names
df = pd.concat(pd.read_csv(f[:-4] + '.txt', sep=r'\s+', header=8)
               for f in files)
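Here files is a list of file names whose four-character extension gets swapped for .txt by the slicing, e.g. (illustrative names only):

files = ['run1.dat', 'run2.dat', 'run3.dat']   # f[:-4] strips the '.dat'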
The format of this DataFrame is then

   Field  Temp.  Momentum
0    200     25     0.541
1    300     26     0.580
2    400     25     0.700
.      .      .         .
0    NaN     25     0.700
1    NaN     50     0.500
2    NaN     70     0.300
.      .      .         .
I want to transform this into a pandas DataFrame where each row holds one NumPy array per column, like so:
   Field                      Temp.                    Momentum
0  np.array([200, 300, 400])  np.array([25, 26, 25])   np.array([0.541, 0.580, 0.700])
1  NaN                        np.array([25, 50, 70])   np.array([0.700, 0.500, 0.300])
.
.
The only way I can come up with is to loop through each row, append the values to NumPy arrays, turn those into a pandas Series, and append that to a DataFrame, roughly as sketched below. This seems like a very roundabout way of solving the problem, and it is very slow. Is there a neater way of handling this?
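Roughly, the loop I have in mind would be something like this (untested sketch; it buffers rows and flushes them into per-column arrays whenever the index resets to 0, i.e. a new file starts):

import numpy as np
import pandas as pd

out = pd.DataFrame(columns=df.columns)
buffer = []
for idx, row in df.iterrows():
    if idx == 0 and buffer:
        # index reset to 0 -> a new file started; flush the buffer
        block = np.array(buffer).T          # one array per column
        out = out.append(pd.Series(list(block), index=df.columns),
                         ignore_index=True)
        buffer = []
    buffer.append(row.values)
if buffer:                                  # flush the final file
    block = np.array(buffer).T
    out = out.append(pd.Series(list(block), index=df.columns),
                     ignore_index=True)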
Edit: The slow code is either loading with NumPy from the start, as shown below, or the method sketched above, which I haven't actually run but am guessing is very slow.
df = pd.DataFrame()
for f in files:
    # transpose so that contents holds one array per column
    contents = np.loadtxt(f, skiprows=12).T
    N = contents.shape[0]   # number of columns found in this file
    row = pd.Series(list(contents), index=columns[:N])   # columns: list of all column names
    df = df.append(row, ignore_index=True)