I have several large comma-separated datasets of float numbers (hundreds of thousands of values each). Some of the values may be missing and are marked with the string "null"; usually none are missing, or only a few hundred. I want to convert all the values to actual floats and store them in a list for further processing.
I use the following code with a try-except statement (which is the more Pythonic way, AFAIK) to convert the strings to floats. If an exception is raised, I check whether a "null" value caused it. If so, I just pass; otherwise, I raise a ValueError. Keep in mind that a typical dataset triggers no exceptions, or very few.
try:
    num = float(string)
except ValueError:
    if string == "null":
        pass
    else:
        raise ValueError
This works fine, but I realized that when I occasionally get a dataset with lots of "null" values (thousands), the code above takes ages to execute (literally several minutes or hours, whereas it would normally be processed within seconds)!
Then I tried adding an if statement that first checks whether the string equals "null" and calls float only when it does not. The drawback is that a typical dataset, with very few or no "null" values, incurs many redundant string comparisons:
try:
    if string != "null":
        num = float(string)
except:
    print("String {} could not be converted to float".format(string))
    raise
With the following code sample I measured the execution time of both approaches. To my surprise, the version with the additional if statement, which avoids all exceptions unless an unexpected non-float string other than "null" appears, executes much faster.
#!/usr/bin/env python
from __future__ import print_function
from timeit import Timer
mylist = "99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,
99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.06666
7,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99
.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,
null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,
null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,-89.955946,98.5,99.983333,99.45,108.925926,99.5,99.983333,100,99.966667,98.366667,99.983333,99.983333,91.516667,99.283333,99.65,99.933333,99.983333,99.933333,99.95,99.75,99.966667,99.733333,99.966667,100,99.75,99.916667,100,99.983333,99.983333,99.233333,99.933333,99.95,99.9,99.9,99.066667,99.933333,99.966667,99.966667,99.866667,99.316667,99.883333,99.9,99.983333,99.85,98.7,99.933333,99.95,99.983333,100,99.05,100,99.866667,99.983333,99.933333,99.883333,99.983333,100,99.983333,99.983333,99.95,100,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,
null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null"
float_list = []
def tryway():
    float_list = []
    mysplitlist = mylist.split(",")
    for string in mysplitlist:
        try:
            myfloat = float(string)
            float_list.append(myfloat)
        except ValueError:
            if string == 'null':
                pass
            else:
                raise ValueError
    #print(float_list)

def ifway():
    float_list = []
    mysplitlist = mylist.split(",")
    for string in mysplitlist:
        try:
            if string != "null":
                myfloat = float(string)
                float_list.append(myfloat)
        except:
            raise
    #print(float_list)

if __name__ == '__main__':
    print("Testing Try")
    tr = Timer("tryway()", "from __main__ import tryway")
    print(tr.timeit(1000))
    print("Testing If")
    ir = Timer("ifway()", "from __main__ import ifway")
    print(ir.timeit(1000))
Sample execution output:
$ ./test_try_if.py
Testing Try
2.23783922195
Testing If
0.631629943848
Can someone please explain why this is happening? What would be the best way to implement this?
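To see how the gap depends on the share of "null" entries, the comparison could be repeated on synthetic input, as in the minimal sketch below. The helper names `parse_try`, `parse_if` and `make_data` are hypothetical, the generated tokens are not the dataset above, and the printed timings will differ from machine to machine, so no expected numbers are shown.

```python
from timeit import timeit


def parse_try(tokens):
    """Convert first; treat a failing "null" token as a missing value."""
    out = []
    for tok in tokens:
        try:
            out.append(float(tok))
        except ValueError:
            if tok != "null":
                raise  # unexpected token: propagate the original error
    return out


def parse_if(tokens):
    """Skip "null" tokens up front, so float() only fails on bad input."""
    out = []
    for tok in tokens:
        if tok != "null":
            out.append(float(tok))
    return out


def make_data(n, null_every):
    """Synthetic tokens: every `null_every`-th item is "null" (0 means none)."""
    return ["null" if null_every and i % null_every == 0 else "99.95"
            for i in range(n)]


if __name__ == "__main__":
    # No nulls, 1% nulls, 50% nulls.
    for null_every in (0, 100, 2):
        data = make_data(10000, null_every)
        assert parse_try(data) == parse_if(data)  # same result either way
        t_try = timeit(lambda: parse_try(data), number=20)
        t_if = timeit(lambda: parse_if(data), number=20)
        print("null every {:>3}: try={:.4f}s  if={:.4f}s".format(
            null_every, t_try, t_if))
```

Both helpers return the same list for the same input, so any difference in the printed times comes only from how the "null" tokens are handled.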