I have a CSV file with 58K rows of time-series data. The first column is the timestamp, and all of its values are unique.
When I do pd.read_csv("data.csv"), it takes less than a second. But when I do pd.read_csv("data.csv", parse_dates=[0]), it takes more than 30 seconds.
Is this the expected performance when parsing timestamps in a CSV?
I tried all the solutions in "Pandas: slow date conversion", and none of them improved it.
Is there a way to improve the performance? What if I tell pandas the datetime format?
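One way to tell pandas the format is to read the column as plain strings and convert it afterwards with pd.to_datetime. A minimal sketch, assuming the timestamps look like 2019-01-01 00:00:00 (the inline sample data, the column names, and the format string are placeholders, not taken from the real file):

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv; the real file's timestamp format is an assumption.
csv_text = (
    "timestamp,value\n"
    "2019-01-01 00:00:00,1.0\n"
    "2019-01-01 00:00:01,2.0\n"
)

# Read without parse_dates so the first column stays as plain strings.
df = pd.read_csv(io.StringIO(csv_text))

# Convert in a separate step with an explicit format, so pandas does not
# have to infer the layout of each value.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d %H:%M:%S")

print(df.dtypes)
```

Whether this is actually faster depends on the real timestamp format and the pandas version, so it is worth timing both variants against the actual file.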
Here's a working repl to play with: https://repl.it/@eparizzi/Pandas-Testing-1