I am using numpy.fromfile to construct an array, which I then pass to the pandas.DataFrame constructor:
import numpy as np
import pandas as pd

def read_best_file(file, **kwargs):
    '''
    Loads best price data into a dataframe
    '''
    names = [ 'time', 'bid_size', 'bid_price', 'ask_size', 'ask_price' ]
    formats = [ 'u8', 'i4', 'f8', 'i4', 'f8' ]
    offsets = [ 0, 8, 12, 20, 24 ]
    dt = np.dtype({
        'names': names,
        'formats': formats,
        'offsets': offsets
    })
    return pd.DataFrame(np.fromfile(file, dt))
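To make the record layout concrete, here is a minimal sketch that writes two made-up records with the same dtype to a temporary, uncompressed file and reads them back with numpy.fromfile. The sample values and the file name best.bin are assumptions for illustration only; the point is that each record occupies 32 bytes on disk.

```python
import os
import tempfile

import numpy as np

# Same layout as in read_best_file: five fields packed into 32 bytes.
dt = np.dtype({
    'names': ['time', 'bid_size', 'bid_price', 'ask_size', 'ask_price'],
    'formats': ['u8', 'i4', 'f8', 'i4', 'f8'],
    'offsets': [0, 8, 12, 20, 24],
})

# Two made-up records, purely for illustration.
sample = np.zeros(2, dtype=dt)
sample['time'] = [1, 2]
sample['bid_size'] = [10, 20]
sample['bid_price'] = [99.5, 99.75]
sample['ask_size'] = [15, 25]
sample['ask_price'] = [100.0, 100.25]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'best.bin')  # hypothetical file name
    sample.tofile(path)                   # raw bytes, no header
    file_size = os.path.getsize(path)
    loaded = np.fromfile(path, dtype=dt)  # works for a plain file path
```

This is the uncompressed case that already works; the gzipped case is where the problem appears.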
I would like to extend this method to work with gzipped files.
According to the numpy.fromfile documentation, the first parameter is file:
file : file or str
    Open file object or filename.
As such, I added the following to check for a gzip file path:
if isinstance(file, str) and file.endswith(".gz"):
    file = gzip.open(file, "r")
However, when I try to pass the resulting gzip file object to numpy.fromfile, I get an IOError:
IOError: first argument must be an open file
Question:
How can I call numpy.fromfile with a gzipped file?
Edit:
As requested in the comments, here is the implementation that checks for gzipped files:
import gzip

import numpy as np
import pandas as pd

def read_best_file(file, **kwargs):
    '''
    Loads best price data into a dataframe
    '''
    names = [ 'time', 'bid_size', 'bid_price', 'ask_size', 'ask_price' ]
    formats = [ 'u8', 'i4', 'f8', 'i4', 'f8' ]
    offsets = [ 0, 8, 12, 20, 24 ]
    dt = np.dtype({
        'names': names,
        'formats': formats,
        'offsets': offsets
    })
    # For a .gz path, open the file with gzip; this is the object fromfile rejects
    if isinstance(file, str) and file.endswith(".gz"):
        file = gzip.open(file, "r")
    return pd.DataFrame(np.fromfile(file, dt))
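For reference, one workaround that seems to sidestep the error is to decompress the bytes myself and hand them to numpy.frombuffer instead of numpy.fromfile, since fromfile appears to need a real on-disk file rather than a gzip file object. The sketch below is not a definitive solution: read_records is a hypothetical helper name, the sample records are made up, and it assumes the whole decompressed file fits in memory.

```python
import gzip
import os
import tempfile

import numpy as np

dt = np.dtype({
    'names': ['time', 'bid_size', 'bid_price', 'ask_size', 'ask_price'],
    'formats': ['u8', 'i4', 'f8', 'i4', 'f8'],
    'offsets': [0, 8, 12, 20, 24],
})

def read_records(path, dtype):
    """Hypothetical helper: load fixed-size records from a plain or .gz file."""
    if isinstance(path, str) and path.endswith('.gz'):
        with gzip.open(path, 'rb') as f:
            # Decompress everything into memory (assumes the data fits),
            # then reinterpret the bytes as records. frombuffer on a bytes
            # object gives a read-only view, so copy to get a writable array.
            return np.frombuffer(f.read(), dtype=dtype).copy()
    # Plain files can still go through fromfile directly.
    return np.fromfile(path, dtype=dtype)

# Round trip with made-up records to check the gzip branch.
records = np.zeros(3, dtype=dt)
records['time'] = [1, 2, 3]
records['ask_price'] = [100.0, 100.5, 101.0]

with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, 'best.bin.gz')  # hypothetical file name
    with gzip.open(gz_path, 'wb') as f:
        f.write(records.tobytes())
    from_gz = read_records(gz_path, dt)
```

The returned structured array could then be passed to pd.DataFrame in the same way as the result of numpy.fromfile above.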