1

Link to dat file

I'm trying to convert .dat files into Excel (actually I would prefer to copy the contents into an excel file) but the numbers are pasting as text. I would like them to be floats.

I think one of the issues I'm having is that the .dat file is in a funky format.

import pandas as pd

df = pd.read_csv('C:/Users/me/Desktop/specimen/spec01/specimen.dat', sep='\t')
df.to_excel('C:/Users/me/Desktop/Rebar/specimen/specimen.xlsx')

Abb
  • 25
  • 3
  • Hi Abb, you haven't received responses because you do not have a good reproducible example. Please see https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples. If you update your Q, I'd be happy to help. In particular, having the input dat file is necessary. – Rich Andrews Mar 28 '19 at 22:39
  • Thanks for the tip Rich, I've uploaded an example dat file. If it makes it easier I only need the headers and data -- nothing above it. My ultimate goal is to take a bunch of dat files and dump them as separate worksheets in a single excel file. – Abb Mar 29 '19 at 02:50
  • ensure that there are no special character or spaces in your columns...this https://stackoverflow.com/questions/33257344/how-to-remove-special-characers-from-a-column-of-dataframe-using-module-re and http://www.datasciencemadesimple.com/strip-space-column-pandas-dataframe-leading-trailing-2/ could help – user2543622 Mar 29 '19 at 02:52

1 Answers1

0

The referenced file has an overly ornate header which must be treated. Presuming you are receiving this file repeatedly, from a source that cannot be altered, such as a server or experiment apparatus, you are left to programmatically handle the following:

MTS793|MPT|ENU|1|2|.|/|:|1|0|0|A

Operator Information                        Time:   15.133789   Sec 8/2/2018 12:18:30 PM
Is the extensometer pin pulled? ughiopuh;u
Operator Information End

Data Acquisition            Tensile Test Data           Time:   117.81934   Sec 8/2/2018 12:20:13 PM
Time    Axial Displacement  Axial Strain    Axial Force
Sec in  in/in   lbf

This must be removed to a proper set of headers to successfully use pandas to import the file.

Time;Axial Displacement;Axial Strain;Axial Force
Sec in;in/in;lbf

are likely the headers desired. This gets you into column multiindex world which is getting advanced.

Here is a functioning answer:

import openpyxl
import pandas as pd
import sys
if sys.version_info[0] < 3: 
    from StringIO import StringIO
else:
    from io import StringIO

csvdata = StringIO("""MTS793|MPT|ENU|1|2|.|/|:|1|0|0|A

Operator Information                        Time:   15.133789   Sec 8/2/2018 12:18:30 PM
Is the extensometer pin pulled? ughiopuh;u
Operator Information End

Data Acquisition            Tensile Test Data           Time:   117.81934   Sec 8/2/2018 12:20:13 PM
Time    Axial Displacement  Axial Strain    Axial Force
Sec in  in/in   lbf
15.372559   -0.00026854035  -0.00013428145  -3.7030973
15.472656   2.2532094e-08   -8.3925901e-05  11.109222
15.572754   -0.00026854035  -0.00011749626  3.7030623
15.672852   2.2532094e-08   -0.00011749626  -1.7583034e-05
15.772949   0.00026858543   -0.00010071108  7.4061422""")

df = pd.read_csv(csvdata, sep="\t", header=[5,6])

# Show how pandas parses the file headers
print(df.head(6))

# The columns are referenceable as a multiindex
level0 = df.columns.get_level_values(0)
level1 = df.columns.get_level_values(1)

# Set index from multiindex columns!
# But we are not going to assign this result, because 
# there is a better trick
print(df.set_index(('Time', 'Sec')))

flattened_cols = (["{} ({})".format(x,y) for x,y in zip(level0, level1)])
df.columns = flattened_cols

# The better trick
df.set_index(flattened_cols[0], inplace=True)

# show results
print(df.head(6))

# save in desired file format
df.to_excel('specimen.xlsx')

The result in memory

            Axial Displacement (in)  Axial Strain (in/in)  Axial Force (lbf)
Time (Sec)                                                                  
15.372559             -2.685404e-04             -0.000134          -3.703097
15.472656              2.253209e-08             -0.000084          11.109222
15.572754             -2.685404e-04             -0.000117           3.703062
15.672852              2.253209e-08             -0.000117          -0.000018
15.772949              2.685854e-04             -0.000101           7.406142

The result in xls

specimen.xls

I deserve a straight up 'accepted answer' for this one! Best of luck.

Rich Andrews
  • 1,590
  • 8
  • 12
  • Awesome! Thanks for the assistance and patience. – Abb Mar 29 '19 at 17:34
  • You are very welcome, enjoy your work with Pandas. Also for the benefit of Stack Overflow's community, you may wish to rename your question, "Specify header rows and read_csv() data into column multiindex and save to_excel()" or something that captures the useful tricks shown. – Rich Andrews Mar 29 '19 at 17:49