Let's say that I have a pandas DataFrame:

import pandas as pd

df = pd.DataFrame({'id': [0, 2, 1], 'name': ['Sheldon', 'Howards', 'Leonard'], 'points': [10, 5, 20]})

I want to search for a row with the values {'id': 2, 'name': 'Howards', 'points': 5} inside this DataFrame. How can I search for it and get its index, if it exists?

Here is my problem: I have a method that receives a dict with unknown keys and a DataFrame with unknown columns. I need to search the DataFrame to find out whether it contains the searched row.

I found this answer, which mentions a method named iterrows. Is this the best way to find the row? Code:

import pandas as pd

df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})
df = df.reset_index()

search = {'c1': 12, 'c2': 120}
index = -1
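for idx, row in df.iterrows():
    # Compare key by key; comparing the whole row to the dict with == does not work in an `if`
    if all(row[key] == value for key, value in search.items()):
        index = idx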

If not, what is the best way?

DazzRick

2 Answers

With a DataFrame you can select/filter the data according to your needs, using all of the conditions or only some of them. The resulting DataFrame will contain all the rows matching the conditions. This is more efficient than using loops.

import pandas as pd
df = pd.DataFrame({'id': [0, 2, 1], 'name': ['Sheldon', 'Howards', 'Leonard'], 'points': [10, 5, 20]})
row = df[(df['id'] == 2) & (df['name'] == 'Howards') & (df['points'] == 5)]
print(row)
print("index=", row.index[0])
print("id=", row.iloc[0].id)

The result is:

   id     name  points
1   2  Howards       5

index= 1
id= 2
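
For the case where the keys are not known in advance, the same `&` chain could be built from the dict itself. The following is only a minimal sketch: `build_mask` is a hypothetical helper name, and it assumes the dict is non-empty and that every key is a column of the DataFrame.

```python
from functools import reduce
import operator

import pandas as pd


def build_mask(df, search):
    """Return a boolean Series that is True where every key/value pair in `search` matches.

    Hypothetical helper: assumes `search` is non-empty and all keys are columns of `df`.
    """
    # One boolean Series per key, e.g. df['id'] == 2
    conditions = [df[key] == value for key, value in search.items()]
    # Combine them pairwise with &, exactly like writing (c1) & (c2) & (c3) by hand
    return reduce(operator.and_, conditions)


df = pd.DataFrame({'id': [0, 2, 1],
                   'name': ['Sheldon', 'Howards', 'Leonard'],
                   'points': [10, 5, 20]})
search = {'id': 2, 'name': 'Howards', 'points': 5}

mask = build_mask(df, search)
matching_index = df.index[mask]
print(matching_index.tolist())  # [1]
```

An empty `matching_index` means no row matched, so checking `len(matching_index)` before taking `[0]` avoids an IndexError.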
Malo
  • OK, and how do I do it dynamically in the code? – DazzRick Aug 08 '23 at 20:41
  • What do you mean? Replace the values you search for with variables or input in the code you have. You could use a text input, a GUI, or anything else. – Malo Aug 08 '23 at 20:43
  • I mean when I don't know the keys and values of the search in advance: I'll use this in a method that receives unknown dicts and DataFrames. It will search inside a class that holds multiple DataFrames, so the DataFrame depends on which one I need when I call the method. – DazzRick Aug 08 '23 at 20:47

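With np.logical_and.reduce over the filter clauses (np is numpy, search_d is the search dict). np.logical_and itself accepts only two input arrays, so .reduce is needed to combine an arbitrary number of clauses:

df.index[np.logical_and.reduce([df[k].eq(v) for k, v in search_d.items()])]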

Index([1], dtype='int64')
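
Since the question asks for a method that takes an arbitrary dict and an arbitrary DataFrame, this kind of filter could be wrapped in a function. What follows is a sketch under stated assumptions, not a definitive implementation: `find_row_index` is a hypothetical name, it returns the label of the first matching row, and it returns None (rather than the -1 used in the question) when the dict is empty, names a column that does not exist, or matches no row.

```python
import numpy as np
import pandas as pd


def find_row_index(df, search):
    """Return the index label of the first row matching all key/value pairs, or None.

    Hypothetical helper; the None-on-failure convention is an assumption.
    """
    # An empty dict or an unknown column cannot produce a meaningful match
    if not search or any(key not in df.columns for key in search):
        return None
    # One boolean array per key, reduced with logical AND across all keys
    clauses = [df[key].eq(value).to_numpy() for key, value in search.items()]
    mask = np.logical_and.reduce(clauses)
    matches = df.index[mask]
    return matches[0] if len(matches) else None


df = pd.DataFrame({'id': [0, 2, 1],
                   'name': ['Sheldon', 'Howards', 'Leonard'],
                   'points': [10, 5, 20]})

found = find_row_index(df, {'id': 2, 'name': 'Howards', 'points': 5})
missing = find_row_index(df, {'id': 2, 'name': 'Sheldon'})
bad_column = find_row_index(df, {'age': 30})
print(found, missing, bad_column)
```

Returning None instead of -1 avoids confusion when -1 could itself be a valid index label; callers can test the result with `is None`.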
RomanPerekhrest