How to fix pandas pd.read_excel() error XLRDError: Excel xlsx file; not supported
Problem:
When trying to read an .xlsx file using pandas pd.read_excel(), you see this error message:
XLRDError: Excel xlsx file; not supported
Solution:
The xlrd library (version 2.0 and later) only supports .xls files, not .xlsx files. To enable pandas to read .xlsx files, install openpyxl:
sudo pip3 install openpyxl
After that, run your script again (if you are running a Jupyter Notebook, be sure to restart the notebook to reload pandas!).
If the error still persists, you have two choices:
Choice 1 (preferred): Update pandas
Pandas 1.1.3 doesn’t automatically select the correct XLSX reader engine, but pandas 1.3.1 does:
sudo pip3 install --upgrade pandas
If you are running a Jupyter Notebook, be sure to restart the notebook to load the updated pandas version!
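If you are unsure which versions are active in the environment your script or notebook runs in, a quick check can help. The following is a minimal sketch using only the Python standard library; the helper name installed_version is made up for this illustration and is not part of pandas.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the versions of the packages involved in reading Excel files
for name in ("pandas", "openpyxl", "xlrd"):
    print(name, installed_version(name) or "not installed")
```

Run this with the same interpreter (or in the same notebook) that raises the error. If openpyxl shows up as "not installed" there, the pip3 command above probably installed it into a different Python environment.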
Choice 2: Explicitly set the engine in pd.read_excel()
Add engine='openpyxl' to your pd.read_excel() call, for example:
pd.read_excel('my.xlsx', engine='openpyxl')
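If your script reads both old .xls and newer .xlsx files, you could choose the engine based on the file extension. This is only a sketch under that assumption: pick_engine and read_excel_any are hypothetical helper names, not pandas functions, and the extension mapping covers just the common cases.

```python
from pathlib import Path


def pick_engine(path):
    """Guess a pd.read_excel() engine name from the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix in (".xlsx", ".xlsm"):
        return "openpyxl"  # requires openpyxl to be installed
    if suffix == ".xls":
        return "xlrd"  # requires xlrd to be installed
    return None  # unknown extension: let pandas choose the engine itself


def read_excel_any(path, **kwargs):
    """Read an Excel file, passing an explicit engine where one is known."""
    # pandas is imported here so that pick_engine() can be used without it
    import pandas as pd

    return pd.read_excel(path, engine=pick_engine(path), **kwargs)
```

With this in place, read_excel_any('my.xlsx') behaves like the pd.read_excel() call above, while read_excel_any('old.xls') uses xlrd instead.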