I have a data frame with a categorical column (TweetType) that has three categories (T, RT and RE). I want to count how many times each category appears and then sum the counts. I created three new columns, T, RT and RE, respectively.
def tweet_type(df):
    # Work on a copy so the original frame is left untouched
    result = df.copy()
    # str.contains does a substring search on each value
    result['T'] = result['tweetType'].str.contains("T")
    result['RT'] = result['tweetType'].str.contains("RT")
    result['RE'] = result['tweetType'].str.contains("RE")
    return result

tweet_type(my_df)
Then I converted the booleans to 0 and 1. The problem is that the pattern "T" also matches RT, so the result is not right.
What I obtain is:
TweetType  T  RT  RE
RT         1   1   0
RE         0   0   1
T          1   0   0
RT         1   1   0
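
For comparison, here is a minimal sketch of what an exact-match version might look like. It assumes the column is named tweetType as in the function above and that each cell holds exactly one of the three values; the sample frame and the name tweet_type_exact are hypothetical and only there for illustration. Comparing with == tests the whole string, so "T" no longer matches "RT".

```python
import pandas as pd

# Hypothetical sample data mirroring the four rows shown above.
my_df = pd.DataFrame({"tweetType": ["RT", "RE", "T", "RT"]})

def tweet_type_exact(df, column="tweetType", categories=("T", "RT", "RE")):
    """Add one 0/1 indicator column per category using exact equality."""
    result = df.copy()
    for category in categories:
        # == compares the whole string, so "T" does not match "RT".
        result[category] = (result[column] == category).astype(int)
    return result

indicators = tweet_type_exact(my_df)

# Column-wise sum of the indicators gives the count per category.
totals = indicators[["T", "RT", "RE"]].sum()

print(indicators)
print(totals)
```

If only the counts are needed and not the indicator columns, my_df["tweetType"].value_counts() should give the same totals directly.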