I am trying to analyze WhatsApp messages. I counted (a) the messages that contain the word "salam", (b) the messages that contain both "salam" and "terima", and (c) the messages that contain "salam" but not "terima".
This is the code I used:
len(df[(df['message'].str.contains("salam"))])
len(df[(df['message'].str.contains("salam")) & (df['message'].str.contains("terima"))])
len(df[(df['message'].str.contains("salam")) != (df['message'].str.contains("terima"))])
In the image, a = 197, b = 143, and c = 72. Isn't it supposed to be a = b + c? Or perhaps != isn't the NOT operator I should've used? Does anyone have any idea what I did wrong?
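To make the expectation concrete, here is a minimal sketch on a made-up DataFrame (the five sample messages and the names `has_salam`, `has_terima`, `c_xor`, and `c_not` are hypothetical, not taken from the real chat). It suggests that `!=` between two boolean masks behaves like an exclusive OR, so it also counts messages that contain "terima" but not "salam", whereas combining `&` with `~` counts only the "salam without terima" case.

```python
import pandas as pd

# Hypothetical sample data standing in for the real chat export.
df = pd.DataFrame({"message": [
    "salam semua",
    "salam, terima kasih",
    "terima kasih banyak",
    "selamat pagi",
    "salam dan terima kasih",
]})

# Boolean masks, one entry per message.
has_salam = df["message"].str.contains("salam")
has_terima = df["message"].str.contains("terima")

a = int(has_salam.sum())                       # contains "salam"
b = int((has_salam & has_terima).sum())        # contains both words
c_xor = int((has_salam != has_terima).sum())   # exactly one of the two words
c_not = int((has_salam & ~has_terima).sum())   # "salam" but not "terima"

print(a, b, c_xor, c_not)
print(a == b + c_not)
```

On this sample, `a == b + c_not` holds while `c_xor` differs from `c_not`, because the message containing only "terima" is included in the `!=` count. If the real `message` column has missing values, `str.contains` returns NaN for them, so passing `na=False` may be needed before applying `~`.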
Thank you so much for your help.