
I'm working on the Chicago crimes dataset and having trouble with the Arrest column. Its values read True or False, but they are strings, not booleans. I tried several things I found on this website, but none of them fixed it.

test = test['Arrest'].map({'False':False, 'True':True})

This just made everything in the dataset True across all columns.

I also tried a for loop though I'm not sure I got it right.

for i in test['Arrest']:
    if i=='True':
        return 1
    else:
        return 0
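
A bare `return` outside a function is a syntax error, so the loop above cannot run as written. A minimal sketch of the same idea without a loop, assuming the column holds only the strings 'True' and 'False' (the small frame and its `Count` values are made up for illustration, not taken from the Chicago data):

```python
import pandas as pd

# Hypothetical stand-in for the grouped frame; not the real Chicago data.
test = pd.DataFrame({"Arrest": ["True", "False", "True"], "Count": [3, 5, 2]})

# Elementwise comparison yields a boolean Series; anything that is not
# exactly the string 'True' becomes False.
test["Arrest"] = test["Arrest"] == "True"

print(test["Arrest"].dtype)  # bool
```

Note that this silently treats any unexpected value as False, which may or may not be what is wanted.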

I also found a suggestion for a similar problem. This was the suggested code:

def str_to_bool(s):
    if s == 'True':
         return True
    elif s == 'False':
         return False
    else:
         raise ValueError

But I find that very confusing, and it doesn't quite apply to my case.
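
For context, a helper like this converts one value at a time, so it would typically be applied to the column through `Series.map`. A rough sketch on made-up data (the frame below is hypothetical, and the error message is an addition for illustration):

```python
import pandas as pd

def str_to_bool(s):
    """Convert the strings 'True'/'False' to booleans; reject anything else."""
    if s == "True":
        return True
    elif s == "False":
        return False
    else:
        # The message is an illustrative addition; the original raised a bare ValueError.
        raise ValueError(f"unexpected value: {s!r}")

# Hypothetical sample column, not the real dataset.
test = pd.DataFrame({"Arrest": ["False", "True", "False"]})

# Apply the helper to each element and assign the result back to the column.
test["Arrest"] = test["Arrest"].map(str_to_bool)

print(test["Arrest"].tolist())
```

Unlike a plain comparison, this version fails loudly if the column contains anything other than the two expected strings.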

So for a minimal working example, I'm not sure how I'm supposed to present it:

import pandas as pd

crimes2012 = pd.read_csv("C:\\Users\\Owner\\Desktop\\Chicago Dataset\\Chicago_Crimes_2012_to_2017.csv", header=0)
primary = crimes2012[['Primary Type','Arrest']].copy()
test=primary.groupby(['Primary Type','Arrest']).size().sort_values().reset_index(name='Count')
test['Arrest'] = test.Arrest.map(pd.eval)
furas
Mustafa Moiz
  • `.map({'False':False, 'True':True})` works for me. Are you sure you used `map` and not `replace`? You can also use `.map(pd.eval)`. – cs95 Apr 07 '19 at 01:42
  • `def str_to_bool(s): return s == "True"` – furas Apr 07 '19 at 01:44
  • maybe create minimal working example so we could run it. – furas Apr 07 '19 at 01:46
  • .map({'False':False, 'True':True}) isn't working for me at all but .map(pd.eval) worked I think! – Mustafa Moiz Apr 07 '19 at 01:48
  • You're missing an `['Arrest']` in `test = test['Arrest'].map({'False':False, 'True':True})` – gmds Apr 07 '19 at 01:50
  • `.map()` works for me - but I have to assign it back to the same column `test['Arrest'] = test['Arrest'].map(...)`. If I test `print(test['Arrest'].dtype)` then it shows `bool` – furas Apr 07 '19 at 01:50
  • so for minimal working example: `test = pd.DataFrame({ "Arrest": ["False","False","False"], "X": ["True","False","True"], })`. I forgot `import pandas as pd` – furas Apr 07 '19 at 01:53
  • I tried it again the dtype is object – Mustafa Moiz Apr 07 '19 at 01:55
  • `test[['Arrest', 'X']] = test[['Arrest', 'X']].applymap(pd.eval)` for multiple columns. – cs95 Apr 07 '19 at 01:57
  • check all values in column - if you have value(s) different than `"True"`, `"False"` then `.dtype` can't be `bool` – furas Apr 07 '19 at 01:57
  • I think the suggestion of .map(pd.eval) worked, it certainly returned a bool type for Arrests. I don't know how to mark it as the answer but it's working right now – Mustafa Moiz Apr 07 '19 at 01:58
  • @coldspeed I created column "X" to confirm that `test['Arrest'].map()` can't change values in other columns :) – furas Apr 07 '19 at 01:58
  • `test['Arrest'] = test['Arrest'].map()` don't forget to assign it back... – cs95 Apr 07 '19 at 01:59
  • ask coldspeed to create real answer below and then you can mark it as accepted – furas Apr 07 '19 at 01:59
  • No worries, I've marked the question as a duplicate of another one. If you are looking for something to upvote, feel free to read some of [my answers here](https://github.com/Coldsp33d/stackoverflow-pandas-canonicals) ;) – cs95 Apr 07 '19 at 02:00
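
The points raised in the comments (assign the result back to the column, then check the dtype and any stray values) can be put together in a short sketch. It uses furas's two-column example frame from the comments; the `unexpected` variable name is an illustrative choice:

```python
import pandas as pd

# furas's example frame from the comments above.
test = pd.DataFrame({
    "Arrest": ["False", "False", "False"],
    "X": ["True", "False", "True"],
})

# Values missing from the mapping become NaN after .map, which leaves the
# column at dtype object, so look at the distinct values first.
unexpected = set(test["Arrest"].unique()) - {"True", "False"}
print(unexpected)  # set()

# Assign the result back to the column, not to the whole frame.
test["Arrest"] = test["Arrest"].map({"False": False, "True": True})

print(test["Arrest"].dtype)  # bool
print(test["X"].tolist())    # ['True', 'False', 'True']
```

Column `X` is left as strings, which shows that mapping one column does not touch the others.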

0 Answers