I am doing some dataset research and ended up with a dataset of classification results, as follows.

   Actual Prediction
0     DS    DS
1     DS    DS
2     WS    DS
3     WD    WD
4     WS    WS

I want to compare the Actual and Prediction columns and return only the rows where Actual == DS and Prediction == DS.

Desired Output:
    Actual Prediction
0    DS    DS
1    DS    DS

Rows with Actual != DS and Prediction == DS are considered wrong classifications.

With that, I will be able to calculate the accuracy of the classification.

I have searched quite a lot, but I was not able to solve this problem by trying out DataFrame built-in functions such as count, duplicate, etc.

Any help will be very much appreciated!

Sanny H

1 Answer

import pandas as pd

data = {'Actual': ['DS', 'WD'], 'Prediction': ['DS', 'WD']}
df = pd.DataFrame(data)
# Keep only the rows where both columns equal "DS"
df[(df.Actual == "DS") & (df.Prediction == "DS")]

Result:

    Actual  Prediction
0   DS      DS
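
Since the question also asks about accuracy, here is a minimal sketch of how the same boolean masks could be turned into a score. It uses the five sample rows from the question. The variable names (predicted_ds, correct_ds, ds_precision, overall_accuracy) are my own, and treating "accuracy" as the share of DS predictions that were really DS is an assumption about what the asker wants; the overall match rate is shown as well.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Actual":     ["DS", "DS", "WS", "WD", "WS"],
    "Prediction": ["DS", "DS", "DS", "WD", "WS"],
})

# Boolean masks: one per condition, combined with & (element-wise AND).
predicted_ds = df["Prediction"] == "DS"
correct_ds = predicted_ds & (df["Actual"] == "DS")

# Rows where both Actual and Prediction are DS.
matches = df[correct_ds]

# Assumed definition: share of DS predictions that were actually DS.
ds_precision = correct_ds.sum() / predicted_ds.sum()

# Alternative: share of all rows where the prediction equals the actual label.
overall_accuracy = (df["Actual"] == df["Prediction"]).mean()

print(matches)
print(ds_precision, overall_accuracy)
```

Summing a boolean Series counts its True values, so no separate count or duplicate call is needed. If scikit-learn is available, its metrics module offers ready-made scores, but the masks above are enough for a single class.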
sander