I want to parse an Excel document into lists in Python. Is there a Python library that helps with this? And which functions in that library are relevant?
- Can you export the Excel file as CSV? – Facundo Casco Sep 10 '11 at 15:32
- Possible duplicate of [How can I open an Excel file in Python?](http://stackoverflow.com/questions/3239207/how-can-i-open-an-excel-file-in-python) – orlp Sep 10 '11 at 15:37
6 Answers
Your best bet for parsing Excel files would be the xlrd library. The python-excel.org site has links and examples for xlrd and related Python Excel libraries, including a PDF document that has some good examples of using xlrd. There are also lots of related xlrd questions on StackOverflow that might be of use.
One caveat with the xlrd library is that it only works with the xls file format (Excel 2003 and earlier) and not the more recent xlsx file format. There is a newer library, openpyxl, for dealing with xlsx files, but I have never used it.
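UPDATE:
As per John's comment, the xlrd library now supports both the xls and xlsx file formats. Note that xlrd 2.0 and later dropped xlsx support again, so this applies only to older xlrd releases.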
Hope that helps.
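To answer the "which functions" part, here is a minimal sketch of reading one sheet of an xls file into a list of lists with xlrd. It uses `xlrd.open_workbook`, `sheet_by_index`, `nrows` and `row_values`; the helper names `sheet_to_lists` and `xls_to_lists` and the `sheet_index` parameter are made up for illustration.

```python
def sheet_to_lists(sheet):
    """Return every row of an xlrd sheet as a list of cell values."""
    # nrows is the number of rows; row_values(i) returns row i as a list
    return [sheet.row_values(i) for i in range(sheet.nrows)]


def xls_to_lists(path, sheet_index=0):
    """Open an .xls workbook and return one sheet as a list of lists."""
    # Imported here so sheet_to_lists can be used without xlrd installed
    import xlrd

    book = xlrd.open_workbook(path)
    return sheet_to_lists(book.sheet_by_index(sheet_index))
```

Something like `rows = xls_to_lists('Book1.xls')` would then give one inner list per spreadsheet row.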

- @John Machin, the principal maintainer of xlrd, has been known to frequent SO. – PTBNL Sep 10 '11 at 15:56
- Related question concerning xlsx: http://stackoverflow.com/questions/4371163/reading-xlsx-files-using-python – dgorissen Oct 19 '11 at 16:31
The pandas library has a quick and easy way to read Excel files. If the sheet is mostly just data and nothing too complicated, it will work:

    import pandas as pd
    ex_data = pd.read_excel('excel_file.xlsx')

This reads the sheet into a pandas DataFrame, which is handy for data munging, etc.
To convert a single column to a list:

    ex_data['column1_name'].values.tolist()

If you have multiple tables and things in each worksheet then you may want to use another library such as xlrd or openpyxl.
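If you want the whole sheet as a list of rows rather than a single column, something along these lines should work. This is only a sketch: `dataframe_to_lists` and its `include_header` flag are hypothetical names, not part of pandas.

```python
import pandas as pd


def dataframe_to_lists(df, include_header=True):
    """Convert a DataFrame into a list of rows, each row a plain list."""
    # .values gives a 2-D array; tolist() turns it into nested Python lists
    rows = df.values.tolist()
    if include_header:
        # Put the column names in front as the first row
        return [df.columns.tolist()] + rows
    return rows
```

With a file on disk this would be called as `dataframe_to_lists(pd.read_excel('excel_file.xlsx'))`.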

- The pandas library depends on xlrd < 2.0 or openpyxl to read xlsx files under the hood – Yuriy Petrovskiy Sep 06 '21 at 12:04
openpyxl is a great library and supports read/write to 2010 xlsx files.
Sample parsing code:

    from openpyxl import load_workbook

    wb = load_workbook('Book1.xlsx')
    ws = wb.active
    # Walk every cell, row by row, and print its value
    for row in ws.iter_rows():
        for cell in row:
            print(cell.value)

Sample writing code:

    from openpyxl import Workbook

    wb = Workbook()
    dest_filename = 'empty_book.xlsx'
    ws1 = wb.active
    ws1.title = "range names"
    # Append 39 rows, each holding the values 0..599
    for row in range(1, 40):
        ws1.append(range(600))
    wb.save(filename=dest_filename)

you can read more here: https://openpyxl.readthedocs.io/en/stable/index.html
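Since the question asks for lists, here is a rough sketch of collecting the active sheet into a list of lists instead of printing each cell. It relies on the `values_only=True` option of `iter_rows`, which I believe needs a reasonably recent openpyxl; `xlsx_to_lists` is just an illustrative name.

```python
from openpyxl import load_workbook


def xlsx_to_lists(path):
    """Read the active worksheet of an .xlsx file into a list of lists."""
    wb = load_workbook(path)
    ws = wb.active
    # values_only=True yields tuples of cell values instead of Cell objects
    return [list(row) for row in ws.iter_rows(values_only=True)]
```

For example, `rows = xlsx_to_lists('Book1.xlsx')` gives one inner list per row, with `None` for empty cells.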

- I am a developer just researching parsing techniques. This is excellent – Jason V Oct 17 '19 at 14:45
xlrd is great for simple tasks, but if you need to work with any of Excel's deeper functionality (macros, advanced plotting, etc.) and you are working on a Windows machine, you can use the pywin32 library to control the win32com layer. This provides access to just about everything that can be controlled via macros / Visual Basic.

pyExcelerator does not seem to be maintained any more, but I have been using it for quite some time and have come to really like it.
Key Points:
- Platform independent
- Does not require Excel to be installed (meaning it does not use COM communication)
Update
All of my new projects have moved to xlrd.

- @JohnY I would have to agree with you at this point. However, this still worked at the time. – Adam Lewis Jan 17 '13 at 20:52