I believe I have a simple problem. I have a pandas DataFrame df
that looks similar to this:
import pandas as pd

data = [{"Text": "Dog", "Dog": 1},
        {"Text": "Cat", "Dog": 0},
        {"Text": "Mouse", "Dog": 0},
        {"Text": "Dog", "Dog": 1}]
df = pd.DataFrame(data)
I am trying to search the Text column for a number of keywords and count how many times each one appears in each cell. Each count should be stored in a new column named after the keyword, just like the Dog column.
I tried using the pandas str.count method, and on its own it works fine. But as soon as I try to store the result in a new column, I run into trouble:
mykeywords = ('Cat', 'Mouse')
df['Cat'] = df.Text.str.count("Cat")
I get the following error message:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
if __name__ == '__main__':
I have two questions:
- What am I doing wrong and how can I solve it?
- How can I loop through all keywords in mykeywords and get one column per keyword?
Thank you very much for any help in advance!
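For reference, here is a minimal sketch of one way both questions might be approached. It assumes the warning appears because the real df was derived from a slice of another DataFrame (the question does not show that step), so the explicit .copy() is an assumption and not a confirmed fix. The use of re.escape is also an addition, included because str.count interprets its argument as a regular expression.

```python
import re

import pandas as pd

data = [{"Text": "Dog", "Dog": 1},
        {"Text": "Cat", "Dog": 0},
        {"Text": "Mouse", "Dog": 0},
        {"Text": "Dog", "Dog": 1}]

# Assumption: in the real code df comes from slicing another DataFrame.
# Taking an explicit copy makes df independent of its source, which is
# the usual way to avoid the "set on a copy of a slice" warning.
df = pd.DataFrame(data).copy()

mykeywords = ("Cat", "Mouse")

for keyword in mykeywords:
    # str.count expects a regex pattern, so escape the keyword to match
    # it literally, then store the per-row counts in a new column.
    df[keyword] = df["Text"].str.count(re.escape(keyword))

print(df)
```

Each pass through the loop adds one column named after the keyword, in the same shape as the existing Dog column.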