I have been trying different things for too long now.
How do I load CSV data containing dates into a NumPy array? The code below does not work as I expected: it creates a single row, and each line that is supposed to be a row ends up in a single cell.
import io
import datetime as dt

import numpy as np


def date_parser(d_bytes):
    # genfromtxt may pass bytes or str here, depending on the NumPy version
    # and the `encoding` argument, so handle both.
    s = d_bytes.decode('utf-8') if isinstance(d_bytes, bytes) else d_bytes
    return np.datetime64(dt.datetime.strptime(s, "%Y-%m-%d %H:%M:%S"))


def read_csv():
    # Columns: timestamp, open, high, low, close
    five_min_candles_str = """2020-06-01 17:05:00,9506.01,9523.31,9500.0,9514.52
2020-06-01 17:10:00,9513.44,9525.22,9500.32,9522.0
2020-06-01 17:15:00,9521.56,9525.59,9513.75,9523.53
2020-06-01 17:20:00,9523.21,9525.53,9518.78,9524.55
2020-06-01 17:25:00,9524.55,9538.4,9522.93,9528.73
2020-06-01 17:30:00,9528.73,9548.98,9527.95,9543.72
2020-06-01 17:35:00,9542.71,9547.34,9536.57,9543.66
2020-06-01 17:40:00,9543.67,9543.67,9530.0,9531.85
2020-06-01 17:45:00,9530.84,9535.01,9524.1,9526.75
2020-06-01 17:50:00,9526.47,9538.64,9521.87,9534.57
2020-06-01 17:55:00,9534.58,9548.9,9533.04,9546.98
2020-06-01 18:00:00,9548.18,9558.9,9536.99,9556.25
2020-06-01 18:05:00,9556.15,9579.8,9547.7,9574.09
2020-06-01 18:10:00,9575.0,9592.59,9571.3,9573.93
2020-06-01 18:15:00,9573.68,9610.0,9569.6,9597.78
2020-06-01 18:20:00,9597.78,9598.85,9578.0,9591.39
"""
    nparray = np.genfromtxt(
        io.StringIO(five_min_candles_str),
        delimiter=',',
        dtype=[('Timestamp', 'datetime64[us]'),
               ('Open', 'object'),
               ('High', 'object'),
               ('Low', 'object'),
               ('Close', 'object')],
        converters={0: date_parser},
    )
    print(nparray)


if __name__ == "__main__":
    read_csv()
A solution or hint would be much appreciated!
Edit: It turned out that it was already working, but I expected a 2D array, and it became a 1D array of tuples once I added the types or the converter. The reason is that the columns within a row have different types. See the other SO question.
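To illustrate why the result is 1D: when the columns have different types, NumPy stores each row as one record of a structured array, so the shape is `(n,)` rather than `(n, 5)`. The sketch below builds such an array by hand from two of the rows above (the `f8` field types and the name `candle_dtype` are my own choices for illustration, not part of the original code) and then pulls the numeric columns out into an ordinary 2D float array.

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

# Hypothetical dtype for illustration: one datetime field plus four floats.
candle_dtype = [('Timestamp', 'datetime64[s]'),
                ('Open', 'f8'), ('High', 'f8'), ('Low', 'f8'), ('Close', 'f8')]

rows = [
    ('2020-06-01T17:05:00', 9506.01, 9523.31, 9500.0, 9514.52),
    ('2020-06-01T17:10:00', 9513.44, 9525.22, 9500.32, 9522.0),
]

# Each tuple becomes one record, so the array is 1D.
candles = np.array(rows, dtype=candle_dtype)
print(candles.shape)   # (2,)

# Columns are addressed by field name instead of by a second index.
closes = candles['Close']

# The four price fields share one type, so they can be converted to a 2D array.
ohlc = structured_to_unstructured(candles[['Open', 'High', 'Low', 'Close']])
print(ohlc.shape)      # (2, 4)
```

The timestamps stay available as `candles['Timestamp']`; only the same-typed price fields can be merged into a plain 2D array.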
I marked the answer below as correct anyway, because I prefer it: it doesn't need any custom parsing of the date, and I also like the splitlines() solution more than io.StringIO().
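For reference, here is a minimal sketch of loading the data without a custom date converter and with `splitlines()` instead of `io.StringIO()`. It is not necessarily the accepted answer's exact code. It assumes `dtype=None` so that `genfromtxt` infers the column types (the fields then get the default names `f0` to `f4`), and it converts the timestamp strings with `astype` afterwards. The names `csv_text`, `raw`, `timestamps` and `prices` are made up for this example.

```python
import numpy as np

csv_text = """2020-06-01 17:05:00,9506.01,9523.31,9500.0,9514.52
2020-06-01 17:10:00,9513.44,9525.22,9500.32,9522.0
2020-06-01 17:15:00,9521.56,9525.59,9513.75,9523.53
"""

# genfromtxt accepts a list of strings, so splitlines() replaces io.StringIO().
# dtype=None lets it infer one type per column: text for column 0, float for the rest.
raw = np.genfromtxt(csv_text.splitlines(), delimiter=',',
                    dtype=None, encoding='utf-8')

# NumPy parses the "YYYY-MM-DD HH:MM:SS" strings itself, no strptime needed.
timestamps = raw['f0'].astype('datetime64[s]')

# Stack the four float columns into a regular 2D array of shape (n, 4).
prices = np.column_stack([raw[name] for name in ('f1', 'f2', 'f3', 'f4')])
```

Keeping the timestamps and the prices in two separate arrays sidesteps the mixed-type problem, at the cost of having to keep the two arrays aligned by row index.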