2

I have a multi-sheet excel file saved in a .xlsb format that I wish to covert to .xlsx to utilize the openpyxl library - code already written to support the same workbook that used to be .xlsx until macro-enabled, and wouldn't save in .xlsm.

I have managed to convert from .xlsb to .csv, but cannot convert any further and have hit roadblocks with various libraries due to various formatting errors.

As my file has multiple sheets (all tables) I only need to copy and paste the text on every sheet (keeping the sheet names) and get it to a .xlsx format.

For simplicity sake, imagine all I need to do is: get sheet names, access a sheet, determine max row/column, loop: copy and paste cell values, write to .xlsx with sheet name. With the starting file being .xlsb.

Any suggestion would be much appreciated.

STGR
  • 21
  • 1
  • 1
  • 2

5 Answers5

4

Just tested this.

Pandas now supports xlsb and can open these files using Glen Thompson's method described in Read XLSB File in Pandas Python:

import pandas as pd
df = pd.read_excel('path_to_file.xlsb', engine='pyxlsb')

And then you could use:

df.to_excel('path_to_file.xlsx')

Make sure you pip install pyxlsb, openpyxl and xlrd, I always forget.

andmck
  • 101
  • 4
1

Try the latest xlsb2xlsx package on PyPI:

pip install xlsb2xlsx
python -m xlsb2xlsx /filepath_with_xlsb_file

See https://pypi.org/project/xlsb2xlsx/ for more info.

0

Office answer

Newer version of Microsoft Office Excel, OpenOffice Calc or LibreOffice Calc can read xlsb.

So you'll just need to open the file in one of those and export it to xlsx

Reference

Python answer

CSV -> xlsx

This can be done easily with pandas, a python package.

import pandas as pd
df = pd.read_csv("/path/to/file.csv")
df.to_excel("/path/to/result.xlsx")

xlsb -> xlsx

As mentioned in this answer, pandas 1.0.0 added the support for binary excel files.

import pandas as pd
df = pd.read_excel("/path/to/file.xlsb", engine="pyxlsb")
df.to_excel("/path/to/result.xlsx")

In order to use it you must upgrade pandas, and install pyxlsb and 'openpyxl':

pip install pandas --upgrade
pip install pyxlsb

(this assumes the use of python 3 since python 2 reached End of Life)

If you need more control over the file you can directly use pyxlsb

Community
  • 1
  • 1
Tommaso Fontana
  • 710
  • 3
  • 18
  • pyxlsb has an "extemely limited" functionality. :( https://github.com/willtrnr/pyxlsb – MGM Nov 06 '20 at 21:37
0

I got copy code to test run,but that return error,above error . ValueError Traceback (most recent call last) in () ----> 1 df = pd.read_excel(r'C:\Users\l84193928\Desktop\test.xlsb', engine='pyxlsb')

D:\Users\l84193928\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\util_decorators.py in wrapper(*args, **kwargs) 176 else: 177 kwargs[new_arg_name] = new_arg_value --> 178 return func(*args, **kwargs) 179 return wrapper 180 return _deprecate_kwarg

D:\Users\l84193928\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\util_decorators.py in wrapper(*args, **kwargs) 176 else: 177 kwargs[new_arg_name] = new_arg_value --> 178 return func(*args, **kwargs) 179 return wrapper 180 return _deprecate_kwarg

D:\Users\l84193928\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\excel.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, **kwds) 305 306 if not isinstance(io, ExcelFile): --> 307 io = ExcelFile(io, engine=engine) 308 309 return io.parse(

D:\Users\l84193928\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\excel.py in init(self, io, **kwds) 367 368 if engine is not None and engine != 'xlrd': --> 369 raise ValueError("Unknown engine: {engine}".format(engine=engine)) 370 371 # If io is a url, want to keep the data as bytes so can't pass

ValueError: Unknown engine: pyxlsb

  • Please provide additional details in your answer. As it's currently written, it's hard to understand your solution. – Community Sep 02 '21 at 05:13
  • This does not really answer the question. If you have a different question, you can ask it by clicking [Ask Question](https://stackoverflow.com/questions/ask). To get notified when this question gets new answers, you can [follow this question](https://meta.stackexchange.com/q/345661). Once you have enough [reputation](https://stackoverflow.com/help/whats-reputation), you can also [add a bounty](https://stackoverflow.com/help/privileges/set-bounties) to draw more attention to this question. – Tomerikoo Sep 02 '21 at 12:29
0

We can use Pandas, but we can lose some information from .xlsb (such as color, border, images, etc) Best way to convert .xlsb to .xlsx format without losing information is using aspose-cells Module in Python, it uses java in backend. We can install this module using pip install aspose-cells

# Use Aspose.Cells for Python via Java
# Install java runtime 64 bit
import jpype
import asposecells
jpype.startJVM()
from asposecells.api import *

# Open an excel file
workbook = Workbook(r"xlsb_filepath.xlsb")
workbook.save(r"xlsx_filepath.xlsx")

reference from:

https://blog.aspose.com/2021/05/28/convert-excel-to-image-in-python/#:~:text=Python%20Excel%20to%20Image%20Converter%20API%20In%20order,can%20install%20the%20API%20using%20the%20following%20command.

https://products.aspose.com/cells/cpp/conversion/xlsb-to-xlsx/

deepesh
  • 73
  • 1
  • 6