0

How can I get the number of the row and the column in a dataframe that contains a certain value using Pandas? For example, I have the following dataframe:

DataFrame

For example, i need to know the row and column of "Smith" (row 1, column LastName)

Dani Mesejo
  • 61,499
  • 6
  • 49
  • 76
  • Does this answer your question? [Get row and column in Pandas for a cell with a certain value](https://stackoverflow.com/questions/53856763/get-row-and-column-in-pandas-for-a-cell-with-a-certain-value) – noah Dec 17 '20 at 22:08
  • From above link: `df.where(df=='Smith').dropna(how='all').dropna(axis=1).index` and `df.where(df=='Smith').dropna(how='all').dropna(axis=1).columns` – noah Dec 17 '20 at 22:09
  • 1
    I think it is also maybe worth asking WHY you need to do this. Typically pandas is used the other way (get value at x,y based on conditions). It makes me wonder if you are approaching a bigger task wrong. – noah Dec 17 '20 at 22:15

6 Answers6

1

Maybe this is a solution or a first step to a solution.

If you filter for the value you are looking for all items which are not the value you want are replaced with NaN. Now you can drop all columns where all values are NaN. This leaves a DataFrame with your item and the indices. Then you can ask for index and name.

import numpy as np
import pandas as pd
df = pd.DataFrame({'LastName':['a', 'Smith', 'b'], 'other':[1,2,3]})

value = df[df=='Smith'].dropna(axis=0, how='all').dropna(axis=1, how='all')
print(value.index.values)
print(value.columns.values)

But I think this can be improved.

Nick ODell
  • 15,465
  • 3
  • 32
  • 66
mosc9575
  • 5,618
  • 2
  • 9
  • 32
0

Here's a one liner that efficiently gets the row and column of a value:

df = pd.DataFrame({"ClientID": [34, 67, 53], "LastName": ["Johnson", "Smith", "Brows"] })
result = next(x[1:] for x in ((v, i, j) for i, row_tup in enumerate(df.itertuples(index=False)) for j, v in zip(df.columns, row_tup)) if x[0] == "Smith")
print(result)

Output

(1, "LastName")

Unpacking that one liner

# This is a generator that unpacks the dataframe and gets the value, row number (i) and column name (j) for every value in the dataframe
item_generator = ((v, i, j) for i, row_tup in enumerate(df.itertuples(index=False)) for j, v in zip(df.columns, row_tup))
# This iterates through the generator until it finds a match
# It outputs just the row and column number by leaving off the first item in the tuple
next(x[1:] for x in item_generator if x[0] == "Smith")

Props to this this answer for the second half of the solution

Matthew Cox
  • 1,047
  • 10
  • 23
0

Just to add another possible solution to the bucket. If you really need to your search your whole DataFrame, you may consider using numpy.where, such as:

import numpy as np

value = 'Smith'
rows, cols = np.where(df.values == value)

where_are_you = [(df.index[row], df.columns[col]) for row, col in zip(rows, cols)]

So, if your DataFrame is like

   ClientID First Name LastName
0        34         Mr    Smith
1        67      Keanu   Reeves
2        53     Master     Yoda
3        99      Smith    Smith
4       100      Harry   Potter

The code output will be:

[(0, 'LastName'), (3, 'First Name'), (3, 'LastName')]

Edit: Just to satisfy everybody's curiosity, here it is a benchmark of all answers

enter image description here

The code is written below. I removed the print statements to be fair, because they would make codes really slow for bigger dataframes.

val = 0

def setup(n=10):
    return pd.DataFrame(np.random.randint(-100, 100, (n, 3)))


def nested_for(df):
    index = df.index  # Allows to get the row index
    columns = df.columns  # Allows to get the column name
    value_to_be_checked = val
    for i in index[df.isin([value_to_be_checked]).any(axis=1)].to_list():
        for j, e in enumerate(df.iloc[i]):
            if e == value_to_be_checked:
                _ = "(row {}, column {})".format(i, columns[j])


def df_twin_dropna(df):
    value = df[df == val].dropna(axis=0, how='all').dropna(axis=1, how='all')
    return value.index.values, value.columns.values


def numpy_where(df):
    rows, cols = np.where(df.values == val)
    return [(df.index[row], df.columns[col]) for row, col in zip(rows, cols)]


def one_line_generator(df):
    return [x[1:] for x in ((v, i, j) for i, row_tup in enumerate(df.itertuples(index=False))
                            for j, v in zip(df.columns, row_tup)) if x[0] == "Smith"]
Ralubrusto
  • 1,394
  • 2
  • 11
  • 24
0

I tried to simplify the code and make it more readable. This is my attempt:

df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})
index = df.index # Allows to get the row index
columns = df.columns # Allows to get the column name
value_to_be_checked = 6
for i in index[df.isin([value_to_be_checked]).any(axis=1)].to_list():
  for j, e in enumerate(df.iloc[i]):
    if e == value_to_be_checked:
      print("(row {}, column {})".format(i, column[j])
Ersen
  • 177
  • 8
0

You can do this by looping though all the columns and finding the matching rows. This will give you a list of all the cells that matches your criteria:

Method 1(without comprehension):

import pandas as pd
# assume this df and that we are looking for 'abc'
df = pd.DataFrame({
    'clientid': [34, 67, 53],
    'lastname': ['Johnson', 'Smith', 'Brows']
})

Searchval = 'Smith'
l1 = []
#loop though all the columns
for col in df.columns:
    #finding the matching rows
    for i in range(len(df[col][df[col].eq(Searchval)].index)):
        #appending the output to the list
        l1.append((df[col][df[col].eq(Searchval)].index[i], col))
print(l1)

Method 2 (With comprehension):

import pandas as pd

df = pd.DataFrame({
    'clientid': [34, 67, 53],
    'lastname': ['Johnson', 'Smith', 'Brows']
})
#Value to search
Searchval = 'Smith'
#using list comprehension to find the rows in each column which matches the criteria
#and saving it in a list in case we get multiple matches
l = [(df[col][df[col].eq(Searchval)].index[i], col) for col in df.columns
     for i in range(len(df[col][df[col].eq(Searchval)].index))]

print(l)
roshaga
  • 257
  • 3
  • 11
-3

Thanks for submitting your request. This is something you can find with a Google search. Please make some attempt to find answers before asking a new question.

You can find simple and excellent dataframe examples that include column and row selection here: https://studymachinelearning.com/python-pandas-dataframe/

You can also see the official documentation here: https://pandas.pydata.org/pandas-docs/stable/

Select a column by column name:

df['col']

select a row by index:

df.loc['b']  
cadvena
  • 1,063
  • 9
  • 17
  • 2
    Aside from the unfriendly tone you didn't even answer the correct question – noah Dec 17 '20 at 22:01
  • I did answer the question incorrectly. My apology. I see that other have answer the question, so I will not attempt to do so here. – cadvena Dec 17 '20 at 22:34