Hi, I have a DataFrame column like the following.

dataframe['BETA'] holds float numbers between 0 and 100. I need every value to have the same number of digits. Example:

dataframe['BETA']:

[0] 0.11 to [0] 110
[1] 1.54 to [1] 154
[2] 22.1 to [2] 221

I tried to change the values one by one, but it's a super inefficient process:

for i in range (len(df_ld)):
    nbeta=df_ld['BETA'][i]
    if nbeta<1:
        val=nbeta
        val=val*1000
        df_ld.loc[i,'BETA']=val
    if (nbeta>=1) and (nbeta<=10):
        val=nbeta
        val=val*100
        df_ld.loc[i,'BETA']=val

    if (nbeta>10) and (nbeta<=100):
        val=nbeta
        val=val*10
        df_ld.loc[i,'BETA']=val
        #print('%.f >10, %.f Nuevo valor'% (nbeta,val))

Note: The DataFrame has more than 80k elements.

Please help!

Edit: solved with numpy.select:

import numpy as np
x = df_ld['BETA']
condlist = [x<1, (x>=1) & (x<10),(x>=10) & (x<100)]
choicelist = [x*1000, x*100,x*10]
output=np.select(condlist, choicelist)
df_ld.insert(4,'BETA3',output,True)

Thank you!

  • always use inlines with pandas if you want efficiency... – Lore Sep 20 '19 at 14:12
  • I'd use `log10`, assuming 0 < Beta <= 100.. `s = (np.log10(df.Beta)//1)*-1+2; (df.Beta*(10**s)).astype(int)` – ALollz Sep 20 '19 at 14:40
  • Another approach is to use `.loc` to mask out the range you need to multiply and perform 3 multiplications. Check this: https://stackoverflow.com/questions/29370057/select-dataframe-rows-between-two-dates – llalwani11 Sep 20 '19 at 14:43
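
The `log10` idea from the comments can be spelled out as a minimal sketch. It assumes every value is strictly positive (a 0 would give `-inf` from `log10`) and reuses the question's `df_ld` and `BETA` names; the sample values and the `BETA_scaled` column name are only for illustration.

```python
import numpy as np
import pandas as pd

# Sample values taken from the question's example
df_ld = pd.DataFrame({"BETA": [0.11, 1.54, 22.1]})

# floor(log10(x)) gives the power of ten of the leading digit:
# -1 for 0.11, 0 for 1.54, 1 for 22.1
exponent = np.floor(np.log10(df_ld["BETA"]))

# Shift the leading digit into the hundreds place
scaled = df_ld["BETA"] * 10.0 ** (2 - exponent)

# Round before casting so float error (e.g. 109.999...) is not truncated
df_ld["BETA_scaled"] = scaled.round().astype(int)
```

Unlike the three explicit ranges, this handles any positive magnitude with one expression, at the cost of being less obvious to read.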

1 Answer

Try this.

I'm guessing your dataframe is called df_ld and your target column is df_ld['BETA'].

def multiply(column):
    # Scale each value by 1000, 100 or 10 depending on its magnitude
    newcol = []
    for item in column:
        # elif keeps the branches exclusive, so every value is
        # scaled and appended exactly once
        if item < 1:
            newcol.append(item * 1000)
        elif item < 10:
            newcol.append(item * 100)
        else:
            newcol.append(item * 10)
    return newcol

# apply function and create new column 
df_ld['newcol'] = multiply(df_ld['BETA'])
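
If the Python-level loop turns out to be too slow for 80k rows, the same three-way rule can be sketched with boolean masks and `.loc`, as one of the comments suggests. This is only an illustration on the question's sample values; the masks are computed from the original `BETA` column, so no value is scaled twice.

```python
import pandas as pd

# Sample values taken from the question's example
df_ld = pd.DataFrame({"BETA": [0.11, 1.54, 22.1]})

beta = df_ld["BETA"]

# Start with the factor for values >= 10, then overwrite the smaller ranges
df_ld["newcol"] = beta * 10
df_ld.loc[beta < 10, "newcol"] = beta * 100
df_ld.loc[beta < 1, "newcol"] = beta * 1000
```

Each assignment works on the whole column at once, so the cost is three vectorized passes rather than one Python iteration per row.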
SCool