I am reading numbers from a file and converting them to floats. The numbers look like this:
1326.617827, 1322.954823, 1320.512821, 1319.291819...
I split each line at commas and then create the list of floats through a list comprehension.
import time

def listFromLine(line):
    # time the split
    t = time.clock()
    temp_line = line.split(',')
    print "line operations: " + str(time.clock() - t)
    # time the conversion to floats
    t = time.clock()
    ret = [float(i) for i in temp_line]
    print "float comprehension: " + str(time.clock() - t)
    return ret
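For context, the function is called once per line while reading through the file; a minimal driver sketch (the path is a placeholder and the actual reading loop is not shown above):

with open('pathtofile') as f:
    for line in f:
        values = listFromLine(line)   # prints one pair of timings per line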
The output looks something like this:
line operations: 5.52103727549e-05
float comprehension: 0.00121321255003
line operations: 9.52025017378e-05
float comprehension: 0.000943885026522
line operations: 7.0782529173e-05
float comprehension: 0.000946716327689
Converting to an int and then dividing by 1.0 is a lot faster, but that is useless in my case as I need to keep the digits after the decimal point.
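For reference, a minimal sketch of that int-based variant (with made-up inline data); int() refuses strings that contain a decimal point, so it only applies once the fractional part has already been dropped, which is exactly the problem:

parts = "1326, 1322, 1320".split(',')
ret = [int(i) / 1.0 for i in parts]   # [1326.0, 1322.0, 1320.0] -- fractional digits already lost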
I saw this question and had a go at using pandas.Series, but that turned out to be slower than what I was doing previously.
In[38]: timeit("[float(i) for i in line[1:-2].split(',')]", "f=open('pathtofile');line=f.readline()", number=100)
Out[38]: 0.10676022701363763
In[39]: timeit("pandas.Series(line[1:-2].split(',')).apply(lambda x: float(x))", "import pandas;f=open('pathtofile');line=f.readline()", number=100)
Out[39]: 0.14640622942852133
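Another point of comparison that I have not benchmarked on the real data: numpy.fromstring can parse a comma-separated line of numbers directly (numpy is an assumption here, not something tried above); a minimal sketch with the same placeholder path:

timeit("numpy.fromstring(line[1:-2], dtype=float, sep=',')", "import numpy;f=open('pathtofile');line=f.readline()", number=100)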
Changing the format of the file may be an option if it would help, but speeding things up at the reading end would be preferable.