I cannot figure this out. I want to change the "type" column in this dataset to 0/1 values.

import pandas as pd

url = "http://www.stats.ox.ac.uk/pub/PRNN/pima.tr"
Pima_training = pd.read_csv(url, sep=r'\s+')
Pima_training["type"] = Pima_training["type"].apply(lambda x: 1 if x == 'Yes' else 0)

I get the following error:

A value is trying to be set on a copy of a slice from a DataFrame.
AyeTown

1 Answer

This is a warning, not an error, so it won't break your code. pandas raises it when it detects chained assignment, that is, when several indexing operations are used in a row and it is ambiguous whether you are modifying the original DataFrame or a copy of it. More experienced programmers have explained it in depth in another SO thread, so feel free to give it a read for further explanation.
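To make the ambiguity concrete, here is a minimal sketch. It assumes the DataFrame was derived from a filtered selection somewhere earlier, which the question does not show; the frame `full`, its values, and the `age > 30` filter are made up for illustration. Taking an explicit `.copy()` tells pandas that the result is an independent DataFrame, so assigning to it is unambiguous.

```python
import pandas as pd

# Hypothetical stand-in data; the values are invented for illustration.
full = pd.DataFrame({"age": [21, 35, 50, 42],
                     "type": ["Yes", "No", "Yes", "No"]})

# A filtered selection may be treated by pandas as a view or as a copy.
# .copy() makes the intent explicit: `subset` is its own DataFrame.
subset = full[full["age"] > 30].copy()

# Assigning a whole column on the explicit copy leaves `full` untouched.
subset["type"] = (subset["type"] == "Yes").astype(int)

print(subset)
print(full)
```

If the goal is instead to change the original frame, the assignment should be made on the original directly rather than on something selected from it.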

In your particular example, you don't need .apply at all (see this question for why not; in short, using apply on a single column is very inefficient because it loops over the elements internally). I think it makes more sense to use .replace instead and pass a dictionary.

Pima_training['type'] = Pima_training['type'].replace({"No":0,"Yes":1})
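As a sketch of two other vectorized options, not necessarily better than .replace: `Series.map` with a dictionary, and a plain boolean comparison cast to int. The small `labels` Series below is a made-up stand-in for the "type" column.

```python
import pandas as pd

# Hypothetical stand-in for the "type" column.
labels = pd.Series(["Yes", "No", "No", "Yes"], name="type")

# Option 1: map each label through a dictionary.
# Labels missing from the dictionary would become NaN.
via_map = labels.map({"No": 0, "Yes": 1})

# Option 2: compare against "Yes" and cast the booleans to integers.
# Anything that is not exactly "Yes" becomes 0, like the original lambda.
via_compare = (labels == "Yes").astype(int)

print(via_map.tolist())
print(via_compare.tolist())
```

The two differ only in how they treat unexpected labels: the comparison silently turns them into 0, while map surfaces them as NaN, which can be useful for spotting dirty data.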
Derek O