1

I need help truncating columns of decimals within a dataframe. The truncation ties in with other calculations that output a binary (1 or 0) column called "scenario1". The dataframe has 20+ columns.

In this case I have two columns (ColA and ColB) of decimals along a time-based index. The columns may have varying numbers of decimal places, and they have NaN values in the earlier rows.

I'm trying to truncate the values in ONLY these two columns, use the truncated values in a calculation, and then throw them away, WITHOUT changing the original columns or generating new columns.

For example, ColA can have 4 decimals and ColB can have 6 decimals:

ColA ColB
NaN NaN
NaN NaN
0.9954 0.995642
0.9854 0.997450

If the values of ColA and ColB are close enough, I want to output True. To do that, I need to truncate both ColA and ColB to one decimal place without any rounding up or down, so that they become the following:

ColA ColB
NaN NaN
NaN NaN
0.9 0.9
0.9 0.9

I need this truncation to happen within a function "scenario1", and I'd like the code to be as efficient as possible. My current failed attempts are the lines with math.trunc(1000 * df.tenkan) / 1000, which I took from the 3rd solution in another post here:

def scenario1(df):
    df.loc[:, ('scenario1')] = np.where((df.close <= 2) & (df.tenkan > df.tenkan.shift(1)) &
     (df.kijun > df.kijun.shift(1)) & 
    ((math.trunc(1000 * df.tenkan) / 1000) == (math.trunc(1000 * df.kijun) / 1000)) & 
    (df.span_a > df.span_a.shift(1)) & (df.span_b > df.span_b.shift(1)) & 
    ((math.trunc(1000 * df.span_a) / 1000) == (math.trunc(1000 * df.span_b) / 1000)) & 
    (df.volume <= 300000), 1, 0)

    return df

But I got the following error: TypeError: type Series doesn't define trunc method

Any ideas how I can do this?

[Image: screenshot of the actual dataframe]

Garrad2810

3 Answers

2

You can use NumPy's trunc:

import numpy as np
import pandas as pd

df = pd.DataFrame({'ColA': [0.9954, 0.9854], 'ColB': [0.995642, 0.99745]})

np.trunc(10 * df) / 10

Result:

   ColA  ColB
0   0.9   0.9
1   0.9   0.9
Stef
Comment: To generalize, if `n` is the number of decimals to keep: `k = 10**n ; np.trunc(k * df) / k` – mozway Jun 15 '22 at 13:00
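Putting the answer and the comment above together, here is a minimal sketch of how the question's scenario1 function might look with np.trunc. The helper name `truncate` and the parameter `n` are illustrative, not from the original posts. The column names come from the question and the sample data is made up. The question's text asks for one decimal place (n=1), while its code multiplies by 1000 (n=3), so `n` is left adjustable.

```python
import numpy as np
import pandas as pd


def truncate(values, n=1):
    """Truncate toward zero, keeping n decimals. NaN stays NaN."""
    k = 10 ** n
    return np.trunc(k * values) / k


def scenario1(df, n=1):
    # The truncated Series exist only inside this expression;
    # the original columns are not modified and no helper columns are added.
    mask = (
        (df["close"] <= 2)
        & (df["tenkan"] > df["tenkan"].shift(1))
        & (df["kijun"] > df["kijun"].shift(1))
        & (truncate(df["tenkan"], n) == truncate(df["kijun"], n))
        & (df["span_a"] > df["span_a"].shift(1))
        & (df["span_b"] > df["span_b"].shift(1))
        & (truncate(df["span_a"], n) == truncate(df["span_b"], n))
        & (df["volume"] <= 300000)
    )
    df["scenario1"] = np.where(mask, 1, 0)
    return df


# Made-up sample data, for illustration only
sample = pd.DataFrame({
    "close": [1.0, 1.0, 1.0, 1.0],
    "tenkan": [np.nan, 0.9854, 0.9954, 0.9964],
    "kijun": [np.nan, 0.99, 0.995642, 0.997450],
    "span_a": [np.nan, 0.5, 0.51, 0.52],
    "span_b": [np.nan, 0.5, 0.511, 0.56],
    "volume": [100, 100, 100, 100],
})
out = scenario1(sample)
```

Because any comparison involving NaN evaluates to False, the leading NaN rows simply end up with 0 in the "scenario1" column, with no special handling needed.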
0

I think the easiest way to do this would be to use applymap together with the trunc function of the math module. Here is an example:

import math

trunc = lambda x: math.trunc(10 * x) / 10

df.applymap(trunc)

You'll need to apply this over your columns of interest, but I tested it on a few arbitrary examples and it worked well. Hope that helps! I can expand on the details if necessary.
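One caveat for the data in the question: math.trunc raises a ValueError when given NaN, so the leading NaN rows need a guard. Below is a small sketch of an element-wise version restricted to the two columns of interest. The names `trunc1`, `truncated` and `close_enough` and the extra column "other" are made up for illustration. It uses Series.map per column rather than applymap, since newer pandas releases (2.1 and later, to my knowledge) deprecate applymap in favour of DataFrame.map.

```python
import math

import pandas as pd


def trunc1(x):
    # math.trunc cannot handle NaN, so pass missing values through unchanged.
    return x if pd.isna(x) else math.trunc(10 * x) / 10


df = pd.DataFrame({
    "ColA": [float("nan"), 0.9954, 0.9854],
    "ColB": [float("nan"), 0.995642, 0.997450],
    "other": [1.23, 4.56, 7.89],  # hypothetical extra column, left untouched
})

# Only ColA and ColB are truncated, and the result is a temporary frame,
# so df itself is not changed.
truncated = df[["ColA", "ColB"]].apply(lambda col: col.map(trunc1))
close_enough = truncated["ColA"] == truncated["ColB"]
```

Element-wise Python calls are generally slower than the vectorized np.trunc approach, so this is mainly worth considering when the per-value logic cannot be expressed with NumPy operations.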

rsenne
0

You could also make use of a regular expression:

import re

df = pd.DataFrame({'ColA': [0.99989, 0.986767], 'ColB': [0.9890, 0.9588]})

# The dot is escaped so it matches a literal decimal point; the result is a string
func = lambda x: re.match(r'\d+\.\d{1}', str(x)).group(0)
df.applymap(func)

Alternatively, here is a not-so-elegant approach where you first convert the number into a string, then take the parts of the string separately, and finally convert the string back to a float (yes, this is not efficient):

def func(x): 
  # Convert number to a string 
  digits = str(x).split(".")

  # Manually put the number back together: 
  digit = digits[0] + "." + digits[1][:1]
  return float(digit)


df.applymap(func)

Results:

ColA  | ColB
0.9   |  0.9
0.9   |  0.9
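Both string-based versions above assume every value prints as plain digits with a decimal point, which does not hold for the NaN rows in the question. Here is a hedged sketch of a more tolerant variant. The names `PATTERN` and `truncate_via_string` are made up for illustration, and the optional minus sign is my addition.

```python
import re

import pandas as pd

# Optional sign, integer part, literal dot, then exactly one decimal digit
PATTERN = re.compile(r"-?\d+\.\d")


def truncate_via_string(x):
    match = PATTERN.match(str(x))
    # str(nan) is "nan", which does not match, so NaN is returned as NaN.
    # Limitation: values printed in scientific notation (e.g. 1e-05)
    # do not match either and also come back as NaN.
    return float(match.group(0)) if match else float("nan")


col = pd.Series([float("nan"), 0.99989, 0.986767])
result = col.map(truncate_via_string)
```

Given the scientific-notation limitation, the numeric np.trunc approach is probably the safer choice for real data; this variant only makes the string idea survive missing values.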
Tshilidzi Mudau