Update:
I would now do this sort of thing in two steps:
Step 1 -- Convert from pandas dataframe to numpy array or rec-array. This is trivial via the values attribute or the to_numpy() method. It's a little trickier if you have strings but see here for one technique. If you have simple numeric data (and no strings), just stick to a regular numpy array and don't bother with a rec-array or structured array.
Step 2 -- Use numpy's tofile to write out a Fortran-readable binary.
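A minimal sketch of the two steps, assuming an all-numeric frame that can be stored as float64. The file name foo.bin and the temporary directory are placeholders, not anything from the question. Note that tofile writes raw bytes in row-major (C) order with no header or record markers, so the Fortran side would most likely need stream access and an array dimensioned (ncols, nrows).

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1.03, 2.9, 3.7], 'y': [1, 22, 5]})

# Step 1: all-numeric frame -> plain 2-D float64 array (rows x columns)
arr = df.to_numpy(dtype='float64')

# Step 2: dump the raw bytes (hypothetical file name, temp directory)
path = os.path.join(tempfile.mkdtemp(), 'foo.bin')
arr.tofile(path)

# Round-trip check: the file holds only the values, so the dtype and
# shape have to be supplied again when reading
back = np.fromfile(path, dtype='float64').reshape(arr.shape)
```

The shape and dtype are not stored in the file, so they have to be agreed on separately (hard-coded on the Fortran side, or written to a small sidecar file).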
Original Answer:
I guess the bigger question is how to output from pandas to Fortran. I'm not sure of the best way, but I'll try to show some fairly simple solutions, mainly with to_csv().
As for the binary route in the update above: it will generally give you faster IO than text, and I actually find binary easier than text in this case, although you do lose the ability to view the data as text.
import pandas as pd
df = pd.DataFrame({'x': [1.03, 2.9, 3.7], 'y': [1, 22, 5]})
      x   y
0  1.03   1
1  2.90  22
2  3.70   5
Standard pandas output is actually exactly what you are asking for here, but I'm not sure how to get that into a file except with copy and paste. Maybe there is a way with ipython (though not that I can find).
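One option that may cover this is DataFrame.to_string(), which returns the same columnar text that pandas prints and can be written to a file directly. The sketch below assumes you don't want the index column; the file name foo.txt is a placeholder.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'x': [1.03, 2.9, 3.7], 'y': [1, 22, 5]})

# Render the frame as whitespace-aligned text, without the index column
text = df.to_string(index=False)

# Write it out (hypothetical file name, temp directory)
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')
with open(path, 'w') as f:
    f.write(text + '\n')
```

The exact column widths are chosen by pandas, so this suits list-directed input on the Fortran side better than a fixed format.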
And here's some default csv output, which is obviously not columnar:
df.to_csv('foo.csv',index=False)
%more foo.csv
x,y
1.03,1
2.9,22
3.7,5
But you may be able to read this into Fortran with list-directed input, which accepts commas as value separators.
If you can live with the same format for all numbers, you could do something like this:
df.astype(float).to_csv('foo.raw',index=False,float_format='%10.5f')
%more foo.raw
x,y
   1.03000,   1.00000
   2.90000,  22.00000
   3.70000,   5.00000
A couple of notes here. That's not bad, but it forces you to use the same format for all numbers, which is pretty wasteful for single-digit integers, for example. Also, I tried this with some NaNs and that didn't work very well. Finally, the commas are not needed there, but when I tried to change the separator to ' ', it quoted everything, so I just left the separator at its default.
Finally, the most flexible way might be to convert to strings and format them. This gives you some flexibility to format each column individually. Here's a simple example using a right justified format (and width of 8 for 'x' and 4 for 'y'):
df.x = df.x.map('{:>8}'.format)
df.y = df.y.map('{:>4}'.format)
df.to_csv('foo.str',index=False)
%more foo.str
x,y
    1.03,   1
     2.9,  22
     3.7,   5
I still can't figure out how to get rid of those commas, but this way does handle NaNs successfully.
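One way to drop the commas entirely is to skip to_csv() and write each row with ordinary Python string formatting. This is only a sketch: the widths (10.5f for x, 4d for y), the missing header line, and the file name foo.txt are my assumptions, chosen so the result would line up with a Fortran format along the lines of (F10.5,1X,I4).

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'x': [1.03, 2.9, 3.7], 'y': [1, 22, 5]})

# One format spec per column, separated by a single space
row_fmt = '{:10.5f} {:4d}'

# Hypothetical file name, temp directory
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')
with open(path, 'w') as f:
    for x, y in df.itertuples(index=False):
        # Cast explicitly so the format codes see plain Python numbers
        f.write(row_fmt.format(float(x), int(y)) + '\n')

with open(path) as f:
    lines = f.read().splitlines()
```

Each column gets its own width this way, and there are no separators or quotes to strip on the Fortran side.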