How to read single value from XLS(X) using pandas

pandas can be used conveniently to read a table of values from Excel. When extracting data from real-life Excel sheets, there are often metadata fields which are not structured as a table readable by pandas.

Reading the pandas docs, it is not obvious how we can extract the value of a single cell (without any associated headers) with a fixed position, for example:

In this example, we want to extract cell C3, that is we want to end up with a string of value Test value #123. Since there is no clear table structure in this excel sheet (and other cells might contain other values we are not interested in - for example, headlines or headers), we don’t want to have a pd.DataFrame but simply a string.

This is how you can do it:

read_single_value_from_excel.py

def read_value_from_excel(filename, column="C", row=3):
    """Read a single cell value from an Excel file"""
    return pd.read_excel(filename, skiprows=row - 1, usecols=column, nrows=1, header=None, names=["Value"]).iloc[0]["Value"]

def read_value_from_excel(filename, column="C", row=3):
    """Read a single cell value from an Excel file"""
    return pd.read_excel(filename, skiprows=row - 1, usecols=column, nrows=1, header=None, names=["Value"]).iloc[0]["Value"]

Example usage

read_value_from_excel(“Test.xlsx”, “C”, 3) # Prints Let’s explain the parameters we’re using as arguments to pd.read_excel():

Test.xlsx: The filename of the file you want to read
skiprows=2: Row number minus one, so the desired row is the first we read
usecols="C": Which columns we’re interested in - only one!
nrows=1: Read only a single row
header=None: Do not assume that the first row we read is a header row
names=["Value"]: Set the name for the single column to Value
.iloc[0]: From the resulting pd.DataFrame, get the first row ([0]) by index (iloc)
["Value"] From the resulting row, extract the "Value" column - which is the only column available.

In my opinion, using pandas is the best way of extracting for most real-world usecases (i.e. more focused on development speed than on execution speed) because not only does it provide automatic engine selection for .xls and .xlsx files, it’s also present on most Data Science setups anyway and provides a standardized API.

Check out similar posts by category: Pandas, Python

If this post helped you, please consider buying me a coffee or donating via PayPal to support research & publishing of new posts on TechOverflow

Buy me a coffee