1

I have data measured with an instrument. The file format is .dat, which is simple OLE Structured Storage. I uploaded a sample here: http://www.filedropper.com/sample1

I searched a lot but could not find a way to extract the data using Python or R. Does anyone have a solution?

nik
  • 2,500
  • 5
  • 21
  • 48

3 Answers

1

You can use Python with the olefile module. Install it with: pip install olefile

Then to read and extract:

import olefile

# Open the OLE container and copy the raw '3D Data' stream to a new file.
ole = olefile.OleFileIO('sample1.dat')
datastream = ole.openstream('3D Data')
with open('extract.dat', 'wb') as f:
    f.write(datastream.read())
ole.close()
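If you are not sure which streams a file contains, a minimal sketch like the following may help. It relies on olefile's `isOleFile`, `listdir` and `get_size`; the helper names `stream_names` and `list_streams` are made up for this illustration, and I have only assumed (not verified) that your file lists '3D Data' among its streams.

```python
def stream_names(ole):
    """Return the path of every stream in an opened OLE file as 'a/b/c' strings."""
    # listdir() returns each stream as a list of path components.
    return ["/".join(parts) for parts in ole.listdir()]


def list_streams(path):
    """Open the OLE file at `path` and return (name, size in bytes) pairs."""
    import olefile  # third-party module: pip install olefile

    if not olefile.isOleFile(path):
        raise ValueError("%s is not an OLE Structured Storage file" % path)
    ole = olefile.OleFileIO(path)
    try:
        return [(name, ole.get_size(name)) for name in stream_names(ole)]
    finally:
        ole.close()
```

Calling `list_streams('sample1.dat')` should give you the names to pass to `openstream`.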
chrki
  • 6,143
  • 6
  • 35
  • 55
  • I installed olefile but I got an error. I put both the script and the data in a folder on my desktop and ran `python extract.py`, which failed with: Traceback (most recent call last): File "extract.py", line 1, in <module> import olefile ImportError: No module named olefile – nik May 17 '16 at 08:11
  • @nik Looks like the module wasn't properly installed, are you using Python 2 or 3? I tested on 2 – chrki May 17 '16 at 08:58
  • I am using a Mac and this is my Python version: Admins-MBP:~ admin$ python -V Python 2.7.10 – nik May 17 '16 at 09:12
  • Did you find the problem? – nik May 17 '16 at 14:00
  • @nik worked fine for me on OSX 10.10.5, Python 2.7.6 - I used `sudo easy_install olefile` to install the module – chrki May 17 '16 at 18:56
  • @nik Sorry, I have no idea how your data was produced or what format it is in. You should probably open another question on how to interpret and use the data you have. Binary data like that in your OLE file can be anything. – chrki May 17 '16 at 19:54
  • The data is generated by EzChrom. The .dat file is simple OLE Structured Storage (https://en.wikipedia.org/wiki/COM_Structured_Storage). You can extract binary blocks from the storage using the Win32 API Storage functions (https://msdn.microsoft.com/en-us/library/windows/desktop/aa380341(v=vs.85).aspx), or you can use file-management software that treats OLE storage as an archive and allows extracting data blocks as files. But I prefer to use Python. – nik May 17 '16 at 19:55
  • @nik Yeah, I was able to look at the OLE archive with another program and extract data from there as well, but the data in the archive is just more binary data, and I don't know what it's made of or how to use it. Sorry – chrki May 17 '16 at 19:59
0

I thought I'd post my findings in an answer.

Sorry to say, but it appears that you can't bring the OLE structured data into R in its current format.

OLEDB connection in R

Reading .dat files into R is usually quite simple (see "import dat file into R" for more info), but the OLE format complicates things. I'd recommend either using the answer provided by @chrki or extracting the data to a format other than OLE and then reading it into R.
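One possible way to do the extraction step is to write every stream of the OLE file to its own plain binary file. This is only a sketch: `safe_filename` and `dump_streams` are names invented here, `ole` is assumed to be an object opened with `olefile.OleFileIO` (as in @chrki's answer), and the output is still raw bytes whose layout you would need to know before parsing it in R.

```python
import os
import re


def safe_filename(parts):
    """Turn an OLE stream path (list of components) into a flat file name."""
    joined = "_".join(parts)
    # Replace anything that is not a letter, digit, dot, dash or underscore.
    return re.sub(r"[^A-Za-z0-9._-]", "_", joined) + ".bin"


def dump_streams(ole, out_dir):
    """Write every stream of an opened OLE file to its own file in out_dir."""
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    written = []
    for parts in ole.listdir():
        target = os.path.join(out_dir, safe_filename(parts))
        with open(target, "wb") as f:
            f.write(ole.openstream(parts).read())
        written.append(target)
    return written
```

For example, `dump_streams(olefile.OleFileIO('sample1.dat'), 'streams')` would create a `streams` folder with one `.bin` file per stream, which you could then try to read in R with `readBin`.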

Sorry I couldn't be of more help.

Community
  • 1
  • 1
plumbus_bouquet
  • 443
  • 6
  • 7
0

You can use the Pillow module for Python 3.* or PIL for Python 2.*. I use Python 3.4, so:

from PIL import OleFileIO
dir(OleFileIO) # to see all the stuff available inside

From there you can dump the streams and many more things.

Documentation about OleFileIO: Here
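Once you have dumped a stream, a quick way to get a first look at the bytes is a small hex preview. The sketch below is generic Python and knows nothing about the EzChrom layout; `hex_preview` is a name I made up, and `data` is assumed to be the bytes you read from a stream.

```python
def hex_preview(data, width=16, limit=64):
    """Format the first `limit` bytes of `data` as offset / hex / ASCII lines."""
    lines = []
    head = bytearray(data[:limit])
    for offset in range(0, len(head), width):
        chunk = head[offset:offset + width]
        hex_part = " ".join("%02x" % b for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%08x  %-*s  %s" % (offset, width * 3 - 1, hex_part, text))
    return "\n".join(lines)
```

Printing `hex_preview(data)` for each stream can hint at whether it holds text, a header, or packed numbers.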

Belial
  • 821
  • 1
  • 9
  • 12