How to create Pandas Series with Decimal?

Question

I'm calculating some standard deviations which are giving FloatingPointErrors. I wanted to try converting the data series to Decimal (using https://docs.python.org/3/library/decimal.html), to see if this fixes my issue.

I can't seem to make a pandas Series of Decimal values.

How can I take a normal pd.Series of float64 and convert it to a pd.Series of Decimal, so that I can do:

Series.pct_change().ewm(span=35, min_periods=35).std()

score 4 · Answer 1 · answered Feb 14 '22 at 08:51

from decimal import Decimal

df['col_a'] = df['col_a'].apply(lambda x: Decimal(str(x)))
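
For a standalone Series rather than a DataFrame column, the same idea can be sketched as follows. The sample values and the names `s_float` and `s_dec` are assumptions made for illustration. Going through `str` means the Decimal is built from the short printed form of the float rather than from its full binary expansion.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical sample data standing in for the asker's float64 Series
s_float = pd.Series([0.1, 0.2, 0.3], dtype="float64")

# Convert each float via its string form, so that Decimal('0.1') is stored
# instead of the long binary expansion that Decimal(0.1) would produce
s_dec = s_float.apply(lambda x: Decimal(str(x)))

print(s_dec.dtype)          # object
print(type(s_dec.iloc[0]))  # <class 'decimal.Decimal'>
```

The resulting Series has dtype object, so element-wise arithmetic is delegated to the Decimal objects themselves.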

David Wei


SerialDev · Answer 2 · 2016-06-29T09:34:40.533

2

Would something like this work?

from functools import partial
from pandas import Series

def column_round(decimals):
    return partial(Series.round, decimals=decimals)

df.apply(column_round(2))

Alternatively, let's use np.vectorize so that we can call decimal.Decimal.quantize to do the rounding; this leaves the values as Decimal instead of np.float64.

npquantize = np.vectorize(decimal.Decimal.quantize)
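
A minimal sketch of how such a vectorized quantize could be applied, assuming the values are already Decimal objects held in an object array. The helper `quantize_one`, the sample values, and the choice of `ROUND_HALF_UP` are assumptions for illustration, not part of the original answer. A small wrapper function is used here instead of passing `Decimal.quantize` directly, so that the rounding mode can be stated explicitly.

```python
from decimal import Decimal, ROUND_HALF_UP

import numpy as np


def quantize_one(value, exp):
    """Hypothetical helper: round one Decimal to the exponent of `exp`."""
    return value.quantize(exp, rounding=ROUND_HALF_UP)


# otypes=[object] tells NumPy to keep the results as Python objects (Decimal)
npquantize = np.vectorize(quantize_one, otypes=[object])

# Hypothetical sample data, stored as an object array of Decimal values
values = np.array([Decimal("1.005"), Decimal("2.345")], dtype=object)

# Round every element to two decimal places
rounded = npquantize(values, Decimal("0.01"))

print(list(rounded))  # [Decimal('1.01'), Decimal('2.35')]
```

Note that np.vectorize is a convenience loop, not a performance optimization, so this is no faster than Series.apply or Series.map with the same function.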

I have been looking into it, and this seems to solve the issue with pct_change:

ts.diff().div(ts.shift(1))
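
To make the arithmetic explicit, here is a sketch that computes the same quantity, (current - previous) / previous, element by element in plain Python. This is a swapped-in formulation rather than the pandas expression above: it does not depend on how pandas handles object dtype inside diff or div. The sample prices and the names `values`, `changes`, and `pct` are assumptions for illustration.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical price series, already converted to Decimal
ts = pd.Series([Decimal("100"), Decimal("110"), Decimal("99")])

values = ts.tolist()

# (current - previous) / previous for each consecutive pair; the first
# element has no predecessor, so it is left as a missing value (None)
changes = [None] + [(cur - prev) / prev for prev, cur in zip(values, values[1:])]

# dtype=object keeps the Decimal results as they are
pct = pd.Series(changes, index=ts.index, dtype=object)

print(pct.tolist())
```

As far as I know, the ewm(...).std() step works on float64 internally, so it would either reject or convert an object Series; that is an assumption worth checking against your pandas version.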

edited Jun 29 '16 at 09:34

answered Jun 29 '16 at 09:03

SerialDev


2

If I've understood correctly, this still uses floating point arithmetic; I want to enforce decimal arithmetic. – cjm2671 Jun 29 '16 at 09:10
Have you considered converting the series into a NumPy array and applying np.vectorize prior to applying todecimal? – SerialDev Jun 29 '16 at 09:20

score 1 · Answer 3 · answered Sep 28 '18 at 15:16

I think you can create the DataFrame directly with Decimal types and operate on the values:

import pandas as pd
import numpy as np
from decimal import Decimal

df = pd.DataFrame({
    'DECIMAL_1': [Decimal('2342.2345234'), Decimal('564.5678'), Decimal('76867.8923892')],
    'DECIMAL_2': [Decimal('67867.43534534323'), Decimal('67876.345345'), Decimal('234234.2345345')]
})
df['DECIMAL_3'] = df['DECIMAL_1'] + df['DECIMAL_2']
df.dtypes

The drawback is that the columns' dtype will be object, so performance will suffer. In any case, any operation on Decimal values requires more computation than the same operation on floats.
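
A small sketch of that trade-off, using made-up values: the float Series keeps the fast float64 dtype but carries binary rounding error, while the Decimal Series falls back to object dtype and stays exact. The names `s_float` and `s_dec` are assumptions for illustration.

```python
from decimal import Decimal

import pandas as pd

# Float arithmetic: fast float64 dtype, but 0.1 + 0.2 is not exactly 0.3
s_float = pd.Series([0.1]) + pd.Series([0.2])

# Decimal arithmetic: object dtype, each addition is done by Decimal itself
s_dec = pd.Series([Decimal("0.1")]) + pd.Series([Decimal("0.2")])

print(s_float.dtype, s_float.iloc[0] == 0.3)         # float64 False
print(s_dec.dtype, s_dec.iloc[0] == Decimal("0.3"))  # object True
```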

Maybe the best solution is to keep two copies of the DataFrame: one with floats and one with Decimal. When you need fast operations, use the float DataFrame; when you need to compare values or assign new values to cells with a specific precision, use the Decimal DataFrame.
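
The two-copy idea could look roughly like this. The column name `price` and the variables `df_dec`, `df_float`, `fast_mean`, and `exact_sum` are assumptions for illustration, and the built-in `sum` is used for the exact total so that the result does not depend on how pandas reduces object columns.

```python
from decimal import Decimal

import pandas as pd

# Exact copy: Decimal values, stored with object dtype
df_dec = pd.DataFrame({"price": [Decimal("2342.2345234"), Decimal("564.5678")]})

# Fast copy: the same values converted to float64 for vectorized numerics
df_float = df_dec.astype("float64")

# Use the float copy where speed matters and small rounding error is acceptable
fast_mean = df_float["price"].mean()

# Use the Decimal copy where exact values are required
exact_sum = sum(df_dec["price"], Decimal("0"))

print(fast_mean)
print(exact_sum)  # 2906.8023234
```

The obvious cost is that the two copies have to be kept in sync whenever either one is modified.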

Tell me what you think about my suggestions.

Note: I made my example with a DataFrame, but a DataFrame is built from Series, so the same approach applies to a single Series.
