.isin checks if each value in the column is contained in a list of arbitrary values. Roughly equivalent to value in [value1, value2].
str.contains checks if a given pattern (a substring or a regex) is contained in each value in the column. Roughly equivalent to substring in large_string.
In other words, .isin compares each value as a whole against the list and is available for all data types, while str.contains searches inside each value and makes sense only when dealing with strings (or values that can be represented as strings).
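A minimal sketch of that difference, run on the same data. The Series contents and the variable names (s, exact, partial, nums) are made up for illustration:

```python
import pandas as pd

s = pd.Series(["aa", "ba", "ca"])

# .isin: exact membership, like `value in ["a", "aa"]`
exact = s.isin(["a", "aa"]).tolist()

# .str.contains: substring search, like `"a" in value`
partial = s.str.contains("a").tolist()

# .isin is not limited to strings; here it is applied to integers
nums = pd.Series([1, 2, 3]).isin([1, 3]).tolist()

print(exact)    # [True, False, False]
print(partial)  # [True, True, True]
print(nums)     # [True, False, True]
```

Note that "a" in the .isin list matches nothing, because no value equals "a" exactly, whereas str.contains("a") matches every row.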
From the official documentation:
Series.isin(values)
Check whether values are contained in Series.
Return a boolean Series showing whether each element in the Series
matches an element in the passed sequence of values exactly.
Series.str.contains(pat, case=True, flags=0, na=nan, regex=True)
Test if pattern or regex is contained within a
string of a Series or Index.
Return boolean Series or Index based on whether a given pattern or
regex is contained within a string of a Series or Index.
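Two parameters in that signature are easy to trip over: regex=True is the default, so characters such as "." are treated as regex metacharacters, and na controls what missing values produce. A small sketch, using made-up sample data and variable names:

```python
import pandas as pd

s = pd.Series(["a.b", "axb", None])

# Default regex=True: "." matches any single character, so "axb" matches too.
# na=False makes the missing value evaluate to False instead of staying missing.
as_regex = s.str.contains("a.b", na=False).tolist()

# regex=False treats the pattern as a literal substring
as_literal = s.str.contains("a.b", regex=False, na=False).tolist()

print(as_regex)    # [True, True, False]
print(as_literal)  # [True, False, False]
```

Passing na=False is also what makes the result safe to use directly as a boolean mask when the column has missing values.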
Examples:
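import pandas as pd

df = pd.DataFrame({'a': ['aa', 'ba', 'ca']})
print(df)
#     a
# 0  aa
# 1  ba
# 2  ca

# Keep rows whose value exactly matches one of the listed values
print(df[df['a'].isin(['aa', 'ca'])])
#     a
# 0  aa
# 2  ca

# Keep rows whose value contains the substring 'b'
print(df[df['a'].str.contains('b')])
#     a
# 1  ba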