1

This thread doesn't seem to cover a situation I am routinely in.

Return single cell value from Pandas DataFrame

How does one return a single value, not a series or dataframe using a set of column conditions as keys? This seems to be a common need. Say you have a database of info and you need to pluck answers to questions from it, but you need one answer, not a series of possible answers. My method seems "hokey" -- not Pythonic? And maybe not good for technical reasons.

import pandas as pd
d = {'A': [1, 1, 1, 2, 2, 2, 3, 3, 3], 'B': [1, 2, 3, 1, 2, 3, 1, 2, 3], 'C': [1, 3, 5, 
     2, 9, 7, 4, 3, 2]}
df = pd.DataFrame(data=d)

df looks like:

        A   B   C
    0   1   1   1
    1   1   2   3
    2   1   3   5
    3   2   1   2
    4   2   2   9
    5   2   3   7
    6   3   1   4
    7   3   2   3
    8   3   3   2

How to get the value in the C column where A == 1 and B == 3? In my case it's always unique, but I can see how that cannot be assumed so this method returns a series:

df[(df['A'] == 1) & (df['B'] == 3)]['C']

I don't want a series. So how to get a single value, not a series or list of one row or one element?

My method:

df[(df['A'] == 1) & (df['B'] == 3)]['C'].tolist()[0]

In the Pandas library it seems DataFrame.at is the way to go, but this method doesn't look better, though I wonder if it is technically better:

df.at[df.loc[(df['A'] == 1) & (df['B'] == 3)].index[0], 'C']

So, in your opinion, what is the best way to using multiple column conditions to find a value in a dataframe and return a single value (not a list or series)?

Blaine Kelley
  • 85
  • 1
  • 6
  • Found the pd.DataFrame.squeeze function for future reference. May be useful. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.squeeze.html – Blaine Kelley Aug 05 '21 at 12:53

2 Answers2

1

If the combination of columns A and B is unique then we can set the index in advance to efficiently retrieve a single value

df.set_index(['A', 'B']).loc[(1, 3), 'C']

Alternative approach with item

df.loc[df['A'].eq(1) & df['B'].eq(3), 'C'].item()
Shubham Sharma
  • 68,127
  • 6
  • 24
  • 53
  • These are both helpful, but I suppose I should not assume they are unique as there is nothing in my database that would force it. Usually it's unique. These have issues when it is not. Also if doing it repeatedly, the df.set_index() seems to take more cycles to execute. – Blaine Kelley Jul 14 '21 at 15:13
1

I have sat with the same question a few times in the past. I have come to accept that it's not actually that common for me to do this anyway, so I usually just do this:

df.loc[(df['A'] == 1) & (df['B'] == 3), "C"].iat[0]

# frequently I also like to make it more readable like this
is1and3 = (df['A'] == 1) & (df['B'] == 3)
df.loc[is1and3, "C"].iat[0]

This is almost the same as

df.at[df.loc[(df['A'] == 1) & (df['B'] == 3)].index[0], 'C']

which essentially just grabs the first index matching the condition and passes it to .at, rather than subsetting and then grabbing the first returned value with .iat[0], but I don't really like seeing .loc and .index in the call to .at.

Obviously the problem that pandas needs to handle is that there is no guarantee that a condition will only be satisfied by exactly one value in the df, so it's left to the user to handle that.

Some basic guidance
some more in depth

Stryder
  • 848
  • 6
  • 9
  • 1
    I think it has helped me to realize one must have a clear coordinate system for looking up a value in a dataframe, which is visually 2 dimensions of rows and columns. I need a specific row and a specific column if I am to get a specific, single value. Thinking this way, how do I get the row? I have the column already. The .iat method seems to be the more pythonic option. df.loc says "get me the rows where condition is true", "C" picks the column to look up, and iat[0] says "get me the value at the first row in that returned df using that column" – Blaine Kelley Jul 14 '21 at 15:17