I'm trying out dfply as an alternative to Pandas apply and applymap. Given some fake data:

import pandas as pd
from dfply import *
df = pd.DataFrame({'country':['taiwan','ireland','taiwan', 'ireland', 'china'],
                   'num':[10.00, 10.50, 33.99, 10.50, 300],
                   'score':[1, 1, 3, 5, 10]})

   country     num  score
0   taiwan   10.00      1
1  ireland   10.50      1
2   taiwan   33.99      3
3  ireland   10.50      5
4    china  300.00     10

IRL I often need to make custom mappings. Instead of .map I tried this:

@pipe
def update_country(country):
    if country == 'taiwan':
        return 'Republic of Taiwan'
    else:
        return country

df >> mutate(new_country=update_country(X.country)) >> select(X.new_country)

But I get this output:

                                      new_country
0  <dfply.base.pipe object at 0x000001CAECD9B4F0>
1  <dfply.base.pipe object at 0x000001CAECD9B4F0>
2  <dfply.base.pipe object at 0x000001CAECD9B4F0>
3  <dfply.base.pipe object at 0x000001CAECD9B4F0>
4  <dfply.base.pipe object at 0x000001CAECD9B4F0>

Am I using the wrong decorator? Or can I do without a custom function?

Chuck
    You should be doing `df = df.country.map({'taiwan': 'Republic of Taiwan'})` and putting the names of all the countries to be updated in the dictionary. Pandas has its own functions. Using `dfply` is not needed and introduces a large amount of unnecessary code. – Trenton McKinney Aug 08 '22 at 20:51

1 Answer

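Here you are passing the whole Series (X.country) to a function that was written to handle a single value. As far as I can tell, @pipe is meant for functions that take a DataFrame as their first argument, so calling the decorated function returns a pipe object rather than a value, and that object is what ends up in the new column. Use the Series' apply method to call the function on each element instead.

You can achieve this without a decorator.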

import pandas as pd
from dfply import *

# Data
df = pd.DataFrame({'country':['taiwan','ireland','taiwan', 'ireland', 'china'],
                   'num':[10.00, 10.50, 33.99, 10.50, 300],
                   'score':[1, 1, 3, 5, 10]})

# Utility function: takes one country name (a string), not a Series
def update_country(country):
    if country == 'taiwan':
        return 'Republic of Taiwan'
    else:
        return country

# Piping
# Note that apply is called on the Series, so update_country runs once per element
result = df >> mutate(new_country=X.country.apply(update_country)) >> select(X.new_country)

print(result)
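
On the "can I do without a custom function?" part of the question, here is a minimal pandas-only sketch, not something from the original post. It assumes the mapping can be written as a dictionary (the name `mapping` is made up for illustration) and compares the element-wise `apply` call with `Series.replace` and `Series.map`:

```python
import pandas as pd

df = pd.DataFrame({'country': ['taiwan', 'ireland', 'taiwan', 'ireland', 'china']})

def update_country(country):
    # Works on a single string, not on a whole Series
    return 'Republic of Taiwan' if country == 'taiwan' else country

# Element-wise call: the same thing X.country.apply(...) does inside mutate()
via_apply = df['country'].apply(update_country)

# Hypothetical dictionary holding the custom mapping
mapping = {'taiwan': 'Republic of Taiwan'}

# replace() leaves values that are not in the dictionary untouched
via_replace = df['country'].replace(mapping)

# map() with a dictionary turns values that are not in the dictionary into NaN
via_map = df['country'].map(mapping)

print(via_apply.tolist() == via_replace.tolist())
print(via_map.isna().sum())
```

If only a few values need changing, `replace` is probably the closer match to the custom function, since `map` with a dictionary needs an entry for every country you want to keep.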
Pavan Chandaka