I have the following pandas DataFrame:

state_data = {'State':['Alabama','Alaska','Arizona','Arkansas'],'PostCode':['AL','AK','AZ','AR'],'Area':['52,423','656,424','*','53,182'],'Pop':['4,040,587','550,043','3,665,228','2,350,750']}
state_data

import pandas as pd

df = pd.DataFrame(state_data)

dfdiff = df.set_index('State')
dfdiff

myfunc = lambda x: x.replace(',','')

dfdiff = map(myfunc,(dfdiff['Area'],dfdiff['Pop'])
dfdiff

I can define an anonymous function with lambda, but I run into problems when I try to apply it with the map function.

The error reads as follows:

TypeError: 'map' object is not subscriptable

How can I apply these changes to my dfdiff data frame through the map function (or is there a better way)?

  • `dfdiff = map(myfunc,(dfdiff['Area'],dfdiff['Pop'])` I count two opening parentheses and one closing. Python should have returned a syntax error and not even gotten to any error with `map` itself. Did you leave out part of the code? – Acccumulation Jul 14 '22 at 05:42
  • Apologies. I've also tried ```dfdiff = dfdiff.apply(myfunc, axis = 1) dfdiff``` but when I return ```dfdiff```, the function did not seem to apply. @Acccumulation – getintoityuh Jul 14 '22 at 05:45

1 Answer

Use str.replace or apply. The built-in map returns a map object, not a pandas DataFrame.

state_data = {'State':['Alabama','Alaska','Arizona','Arkansas'],'PostCode':['AL','AK','AZ','AR'],'Area':['52,423','656,424','*','53,182'],'Pop':['4,040,587','550,043','3,665,228','2,350,750']}
state_data

import pandas as pd

df = pd.DataFrame(state_data)

dfdiff = df.set_index('State')
dfdiff['Area'] = dfdiff['Area'].str.replace(',', '')

dfdiff
         PostCode    Area        Pop
State
Alabama        AL   52423  4,040,587
Alaska         AK  656424    550,043
Arizona        AZ       *  3,665,228
Arkansas       AR   53182  2,350,750

For multiple columns:

df[['Area', 'Pop']] = df[['Area', 'Pop']].apply(lambda x: x.str.replace(',', ''))
      State PostCode    Area      Pop
0   Alabama       AL   52423  4040587
1    Alaska       AK  656424   550043
2   Arizona       AZ       *  3665228
3  Arkansas       AR   53182  2350750
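
As for the error in the question: the built-in map is lazy and returns a map object, which cannot be indexed like a DataFrame. It would also hand the lambda a whole column (a Series) rather than one string at a time. A minimal sketch of the difference, using pandas' own Series.map, which does work element by element. The names `lazy` and `cleaned` are only illustrative:

```python
import pandas as pd

state_data = {'State': ['Alabama', 'Alaska', 'Arizona', 'Arkansas'],
              'PostCode': ['AL', 'AK', 'AZ', 'AR'],
              'Area': ['52,423', '656,424', '*', '53,182'],
              'Pop': ['4,040,587', '550,043', '3,665,228', '2,350,750']}
dfdiff = pd.DataFrame(state_data).set_index('State')

myfunc = lambda x: x.replace(',', '')

# Built-in map: nothing is computed yet, and the result is not a DataFrame.
lazy = map(myfunc, (dfdiff['Area'], dfdiff['Pop']))
try:
    lazy['Area']
except TypeError as err:
    print(err)  # 'map' object is not subscriptable

# Series.map: calls myfunc on each string of a single column.
cleaned = dfdiff.copy()
for col in ['Area', 'Pop']:
    cleaned[col] = cleaned[col].map(myfunc)
print(cleaned)
```

This keeps the original lambda unchanged. The str.replace version above is usually the more idiomatic choice.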
Epsi95
  • You also mention the ```apply``` function. Could you provide an example of using ```apply``` as well? I'm looking to do this with multiple columns in the data frame and believe apply would be the more efficient way to do that. Let me know if I'm mistaken, @Epsi95 – getintoityuh Jul 14 '22 at 05:38
  • Also tried ```dfdiff = dfdiff.apply(myfunc, axis = 1)``` and this did not seem to work, @Epsi95 – getintoityuh Jul 14 '22 at 05:46
  • For apply use `dfdiff['Area'].apply(lambda x: x.replace(',',''))` but using apply is discouraged – Epsi95 Jul 14 '22 at 05:48
  • Why is using apply discouraged? Also, how would I use this for all columns, or multiple columns, at least? @Epsi95 – getintoityuh Jul 14 '22 at 05:49
  • @getintoityuh please check the edited answer – Epsi95 Jul 14 '22 at 05:52
  • https://stackoverflow.com/questions/54432583/when-should-i-not-want-to-use-pandas-apply-in-my-code#:~:text=It%20is%20because%20apply%20is,major%20overhead%20at%20each%20iteration. – Epsi95 Jul 14 '22 at 05:53
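
A likely reason `dfdiff.apply(myfunc, axis = 1)` appeared to do nothing: with axis=1 the lambda receives each row as a Series, and Series.replace(',', '') only replaces values that are exactly ',', not commas inside a longer string. The sketch below is one way to check this on a trimmed copy of the data. The names `by_row`, `by_col`, and `by_cell` are illustrative and not from the thread:

```python
import pandas as pd

df = pd.DataFrame({'Area': ['52,423', '656,424', '*', '53,182'],
                   'Pop': ['4,040,587', '550,043', '3,665,228', '2,350,750']})

myfunc = lambda x: x.replace(',', '')

# axis=1: x is a row (a Series), so x.replace matches whole cell values only.
by_row = df.apply(myfunc, axis=1)
print(by_row.values.tolist() == df.values.tolist())  # commas are still there

# Column-wise apply with the vectorised string method, as in the answer.
by_col = df.apply(lambda s: s.str.replace(',', '', regex=False))

# Element-wise alternative that reuses the original lambda on each string.
by_cell = df.apply(lambda s: s.map(myfunc))
print(by_col.values.tolist() == by_cell.values.tolist())
```

Both working variants give the same strings. The column-wise str.replace avoids a Python-level call per cell, which is the overhead the linked discussion refers to.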