
I have this function:

from decimal import Decimal

def dec(x):
    """Convert to Decimal and remove exponent and trailing zeros"""
    if not x:
        return Decimal(0)
    if not isinstance(x, Decimal):
        x = Decimal(str(x))
    return x.quantize(Decimal(1)) if x == x.to_integral() else x.normalize()
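
For example, with a few sample inputs (values chosen for illustration):

dec(1.50)   # Decimal('1.5')  - trailing zero stripped
dec(100.0)  # Decimal('100')  - quantize() keeps integral values readable;
            #                   normalize() alone would give Decimal('1E+2')
dec(0)      # Decimal('0')    - falsy inputs short-circuit to zero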

In pandas, I would do:

df['price'].apply(dec)

However, Dask doesn't support this, so what is another way to convert a column to the Decimal type?

Mad Physicist
  • The question is already answered here: see [Change column type in pandas](https://stackoverflow.com/questions/15891038/change-column-type-in-pandas) – Varad Dec 23 '21 at 16:39

2 Answers


Dask DataFrame does support apply, and it will work well for your example:

import dask.dataframe as dd

# s is the pandas Series you want to convert, e.g. df['price']
ds = dd.from_pandas(s, npartitions=2)
ds.apply(dec, meta=('price', 'object')).compute()
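
Passing meta is optional, but it tells Dask up front that the result holds Decimal objects (stored with object dtype), which avoids the metadata-inference warning.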

In general though, I'd suggest using map_partitions, as Sultan has demonstrated.

You can also check out the blog post Parallelize pandas apply() and map() with Dask DataFrame, which discusses these functions in detail.

pavithraes

Assuming your Dask DataFrame is called ddf, .map_partitions should do the trick:

def pandas_wrap(df):
    # df is one partition, i.e. a regular pandas DataFrame
    df['new_price'] = df['price'].apply(dec)
    # potentially some other pandas code
    return df

ddf = ddf.map_partitions(pandas_wrap)
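
For completeness, a minimal end-to-end sketch, assuming dec and pandas_wrap are defined as above (the column name and sample values are illustrative):

import pandas as pd
import dask.dataframe as dd

df = pd.DataFrame({'price': [1.50, 100.0, 0.0, 10.25]})
ddf = dd.from_pandas(df, npartitions=2)
ddf = ddf.map_partitions(pandas_wrap)
print(ddf.compute())  # new_price now holds Decimal objects

If Dask has trouble inferring the output metadata, you can pass an explicit meta argument to map_partitions.
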
SultanOrazbayev