1. I am using this code in my catalog.yml file:

equipment_data:
  type: pandas.ExcelDataSet
  filepath: data\01_raw\Equipment Profile.xlsb
  layer: raw

2. I get this error after running the kedro run command:

` kedro.io.core.DataSetError: Failed while loading data from data set ExcelDataSet(filepath=C:/Users/Akshay Salvi/Desktop/Bizmetrics/kedro-environment/petrocaeRepo/data/01_raw/2. Cycle data (per trip)-20210113T042557Z-001/2. Cycle data (per trip)/CycleData,2020.xlsb, load_args={'engine': xlrd}, protocol=file, save_args={'index': False}, writer_args={'engine': xlsxwriter}).

Excel 2007 xlsb file; not supported `

Akshay Salvi

1 Answer

pandas.ExcelDataSet simply calls pandas underneath, so you may have luck following the example from another thread: install the pyxlsb package (pip install pyxlsb), which provides an engine that can parse .xlsb files, then pass that engine as load_args in your YAML catalog. The error shows the data set falling back to the xlrd engine, which does not support the xlsb format.
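As a rough sketch, the catalog entry from the question might look like this with the engine set. It assumes pyxlsb is installed in the same environment as Kedro and that your pandas version is recent enough to accept the pyxlsb engine (1.0 or later, as far as I know); I have not run it against your file.

```yaml
equipment_data:
  type: pandas.ExcelDataSet
  filepath: data\01_raw\Equipment Profile.xlsb
  layer: raw
  load_args:
    # Assumption: overrides the default xlrd engine shown in the error message
    engine: pyxlsb
```

The same load_args block should apply to any other .xlsb entry in the catalog, such as the cycle data file named in the error.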

datajoely