I am currently working on a big data set that looks like this:
The problem is that the cases reported daily are split into 'Negative', 'Positive' and 'Inconclusive' cases. My goal is to sum the number of cases reported each day, and also to create a separate column for each kind of case (one column for the Negative cases per day, one for the Positive ones and one for the Inconclusive ones).
To reach my goal, I think I need to filter the data set with a condition that uses the overall_outcome and new_results_reported columns. I tried it with the negative cases:
america3 = pd.DataFrame(data, columns=['overall_outcome', 'new_results_reported'])
contain_values = america3[america3['overall_outcome'].str.contains('Negative')]
contain_values.head(20)
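As a sanity check on the filtering idea, here is a minimal sketch on made-up data. The four rows and the name `toy` are invented for illustration and are not taken from the real data set. It uses an exact comparison instead of `str.contains`, which should behave the same here as long as the labels are exactly 'Negative', 'Positive' and 'Inconclusive'.

```python
import pandas as pd

# Toy data, invented for illustration only (not the real data set)
toy = pd.DataFrame({
    "overall_outcome": ["Negative", "Positive", "Inconclusive", "Negative"],
    "new_results_reported": [10, 3, 1, 7],
})

# Boolean mask: True on rows whose outcome is exactly 'Negative'
mask = toy["overall_outcome"] == "Negative"
negatives = toy[mask]

# The text column only selects rows; the integer column is what gets summed
negative_total = negatives["new_results_reported"].sum()
print(negative_total)
```

In this sketch the filter keeps only the Negative rows, and the count column can be summed even though the two columns have different dtypes.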
I just don't know if I am doing this correctly. If what I did is more or less correct, I still can't figure out how to create a new column that uses only the negative case numbers. And if it is not correct, I do not know what step to take next. My guess is that the problem is that overall_outcome is an object column and new_results_reported is an int64 column.
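To make the target shape concrete, here is a hedged sketch of the result I am after, again on made-up data. The `date` column name and all values are assumptions for illustration, since the real data set may name its date column differently. `pivot_table` is one possible way to get one column per outcome; it is not necessarily the only or the best one.

```python
import pandas as pd

# Toy data, invented for illustration; 'date' is an assumed column name
toy = pd.DataFrame({
    "date": ["2020-03-01", "2020-03-01", "2020-03-01",
             "2020-03-02", "2020-03-02", "2020-03-01"],
    "overall_outcome": ["Negative", "Positive", "Inconclusive",
                        "Negative", "Positive", "Negative"],
    "new_results_reported": [10, 3, 1, 7, 2, 5],
})

# One row per day, one column per outcome, cells hold the summed counts;
# outcomes missing on a given day are filled with 0
daily = toy.pivot_table(
    index="date",
    columns="overall_outcome",
    values="new_results_reported",
    aggfunc="sum",
    fill_value=0,
)

# Overall number of results reported per day, across all outcomes
daily["Total"] = daily.sum(axis=1)
print(daily)
```

Here the object column is only used as a grouping key and the int64 column is the one being summed, so the difference in dtypes does not get in the way in this sketch.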
I hope that I am making sense.