
I have been importing Excel files as pandas data frames using the read_excel function with no apparent issues so far. However, I just noticed that after some recent updates I'm getting the warning below:

/usr/local/lib/python3.7/site-packages/xlrd/xlsx.py:266: PendingDeprecationWarning: This method will be removed in future versions. Use 'tree.iter()' or 'list(tree.iter())' instead.
  for elem in self.tree.iter() if Element_has_iter else self.tree.getiterator():
/usr/local/lib/python3.7/site-packages/xlrd/xlsx.py:312: PendingDeprecationWarning: This method will be removed in future versions. Use 'tree.iter()' or 'list(tree.iter())' instead.
  for elem in self.tree.iter() if Element_has_iter else self.tree.getiterator():

From what I found online, it seems that xlrd is being replaced by openpyxl. My questions are:

  • What does this warning mean and what should I do?
  • Is my data import safe at this moment? Do I have to worry that something is not working properly?
  • What are those tree.iter() or list(tree.iter()) methods, and what are they replacing?
  • Is there another way to import Excel files as pandas data frames without getting this warning?
  • Should I report a bug or issues somewhere? Where?

My environment is:

  • macOS Mojave 10.14.6
  • Python 3.7.6
  • Pandas 1.0.0
  • xlrd 1.2.0
Foad S. Farimani

1 Answer


Your data import is "safe" at the moment. To get rid of the warning and future-proof your code, try:

pd.read_excel(filename, engine="openpyxl")

or put this at the start of your script:

import pandas as pd
pd.set_option("io.excel.xlsx.reader", "openpyxl")
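
As for what `tree.iter()` is: it is the `xml.etree.ElementTree` method that walks every element of a parsed XML tree, and `getiterator()` is the older name for the same traversal, which was deprecated and then removed in Python 3.9. Below is a minimal sketch using only the standard library. The XML string and the variable names are made up for illustration and are not taken from xlrd; only the conditional mirrors the line shown in the warning output.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the XML found inside an .xlsx archive.
xml_text = "<sheet><row><c>1</c><c>2</c></row><row><c>3</c></row></sheet>"
tree = ET.ElementTree(ET.fromstring(xml_text))

# iter() visits the root and all descendants in document order.
tags = [elem.tag for elem in tree.iter()]

# Same shape as the conditional in the warning: prefer iter() when it
# exists and fall back to the legacy getiterator() only otherwise.
element_has_iter = hasattr(ET.ElementTree, "iter")
walker = tree.iter() if element_has_iter else tree.getiterator()
cell_values = [elem.text for elem in walker if elem.tag == "c"]

print(tags)
print(cell_values)
```

On any current Python the `iter()` branch is taken, so the fallback is never reached and your own code only needs `tree.iter()`.
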
adr
  • Is this a new feature? – Foad S. Farimani Mar 16 '20 at 19:56
  • The only thing constant with Python is change. With the possible exception of `xlrd`! – adr Mar 17 '20 at 20:05
  • I wish that `pd.set_option` existed, but after looking through the source I don't see it as of now. – shapiromatron Aug 28 '20 at 16:55
  • [set_option](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.set_option.html#pandas.set_option) does exist, but at least as of version 1.1.3, `io.excel.xlsx.reader` is ignored in the code. The only answer that seems to work is to pass `engine="openpyxl"` to `pd.read_excel`. – Leo Oct 18 '20 at 11:19