I have a .xlsb file that I want to use pandas and analyse. I have found how to use pyxlsb to open the file and create another dataset. However, the problem now is that the time formats have changed into a different number format (e.g 41256).
The code I'm using at the moment is:
dataset = []
with open_xlsb(file) as wb: #opening an xlsb file workbook
with wb.get_sheet(1) as sheet1:
for row in sheet1.rows():
dataset.append([item.v for item in row])
dataset= pd.DataFrame(dataset[1:], columns=dataset[0])
I have already tried the convert_date as follows:
convert_date(dataset)
I have also tried the to_datetime function, but unsure if I used it correctly. For reference, the dataset I am using has dates in multiple columns and rows so I'm looking for a way to covert all of them into the right format, whilst ignoring any errors.
EDIT: So I don't have a single column with "Date", rather I have multiple columns, such as StartDate, EndDate, Last Updated and a few others. The result I want to see is if I go to a column, e.g dataset.columns['StartDate'], I want to get a date value, such as 15/03/2019, as opposed to 42156.
Any help would be much appreciated!