I've read a pandas DataFrame from a file:

df = pd.read_csv('data_here.csv')

When I try "str2try" in df['col2search'], it returns False, but "str2try" in df['col2search'].values returns True, which is what I'd expect in this case.

I don't see why the behavior would differ. I read that .values returns the NumPy representation of the column, but why does "str2try" in <NDFrame representation of column> return False?

Thanks!

2 Answers

A pandas Series is like a dictionary: the in operator searches its index (its keys), not its values. So "str2try" in df['col2search'] checks whether the string is in the index of that Series:

df = pd.DataFrame({'A': [1, 2, 3]}, index=['x', 'y', 'z'])

df
Out: 
   A
x  1
y  2
z  3

'x' in df['A']
Out: True

2 in df['A']
Out: False

'x' in df['A'].values
Out: False

2 in df['A'].values
Out: True

Here's how it would behave in a dictionary:

d = {'x': 1, 'y': 2, 'z': 3}

'x' in d
Out: True

2 in d
Out: False

2 in d.values()
Out: True
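
If the goal is to test the values while staying in pandas, a few alternatives can be sketched as follows. This is a minimal illustration that rebuilds the same df; the helper name value_in_series is hypothetical and not part of pandas:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]}, index=['x', 'y', 'z'])

def value_in_series(series, value):
    # Hypothetical helper: True if any element of the Series equals value.
    return bool(series.eq(value).any())

# Membership in the index (labels), which is what `in` checks on a Series.
in_index = 'x' in df['A']

# Membership in the values, three equivalent ways.
in_values_eq = value_in_series(df['A'], 2)
in_values_isin = bool(df['A'].isin([2]).any())
in_values_list = 2 in df['A'].tolist()
```

Series.isin is convenient when checking several candidate values at once, while .tolist() or .values keeps the plain in syntax.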
ayhan

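The in operator tests the elements only when the right-hand side is a list or an array. A Series is a different type, while its .values attribute is a NumPy array, as the example below shows: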

import pandas as pd
f = pd.DataFrame({'a': ['the cat is blue', 'the sky is green', 'the dog is black']})
In [4]: f["a"]
Out[4]: 
0     the cat is blue
1    the sky is green
2    the dog is black
Name: a, dtype: object
In [5]: f["a"].values
Out[5]: array(['the cat is blue', 'the sky is green', 'the dog is black'], dtype=object)
In [6]: type(f["a"])
Out[6]: pandas.core.series.Series
In [7]: type(f["a"].values)
Out[7]: numpy.ndarray
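
To connect the types above to the in operator, here is a short sketch that rebuilds the same one-column frame as f. The variable names (target, on_series, and so on) are only illustrative:

```python
import pandas as pd

f = pd.DataFrame({'a': ['the cat is blue', 'the sky is green', 'the dog is black']})

target = 'the cat is blue'

# On a Series, `in` looks at the index labels (0, 1, 2 here), not the strings.
on_series = target in f['a']
on_label = 0 in f['a']

# On a NumPy array or a list, `in` looks at the elements themselves.
on_array = target in f['a'].values
on_list = target in list(f['a'])

print(on_series, on_label, on_array, on_list)
```

With the default integer index, the first check is False and the other three are True.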
Saket Mittal