I want to sum one column over the rows where another column matches any of several strings. For example:

import pandas as pd
df = pd.DataFrame({'key': ['b', 'b d', 'a', 'c t', 'a', 'b p'], 'data1': range(6)})

I want the sum of the data1 column for the rows whose key string contains 'b' or 'c'. In this case the sum would be 9.

Salahuddin

1 Answer

Try this, using a regular expression with the string accessor's .str.contains method. The pattern 'b|c' matches any key that contains 'b' or 'c':

df.loc[df['key'].str.contains('b|c'), 'data1'].sum()

Output:

9
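
If the substrings come from a list rather than a hand-written pattern, something like the following might work. It is only a sketch: the names terms, pattern, mask and total are illustrative and not part of the question, and it assumes each term should be matched literally (hence re.escape) and that missing keys should count as non-matches (hence na=False).

```python
import re
import pandas as pd

df = pd.DataFrame({'key': ['b', 'b d', 'a', 'c t', 'a', 'b p'], 'data1': range(6)})

# Hypothetical list of substrings to look for; re.escape makes regex
# metacharacters such as '.' or '+' match literally.
terms = ['b', 'c']
pattern = '|'.join(re.escape(t) for t in terms)  # 'b|c'

# Boolean mask of rows whose key contains any term;
# na=False treats missing keys as non-matches.
mask = df['key'].str.contains(pattern, na=False)

# Sum data1 over the matching rows only.
total = df.loc[mask, 'data1'].sum()
print(total)  # 9
```

For the sample frame the mask selects rows 0, 1, 3 and 5, so the sum is 0 + 1 + 3 + 5 = 9, the same as the one-liner above.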
Scott Boston