
I am trying to use the `apply` function on a data frame.

Below is a sample data frame.

import pandas as pd
from CoolProp.HumidAirProp import HAPropsSI

df = pd.DataFrame()
df['T'] = [23, 35, 55]
df['RH'] = [50, 70, 35]
df['H'] = df.apply(lambda x: HAPropsSI('H', 'T', x['T'] + 273.15, 'P', 101325, 'R', x['RH'] / 100), axis=1)

The actual data frame contains 400,000 rows and 8 columns. When I apply the function above to it, it takes a long time to complete. Is there another way to speed up the computation?

Updated

I tried to use vectorization as follows:

df['Enthalpy'] = np.vectorize(HAPropsSI)('H', 'T', df['T'] + 273.15, 'P', 101325, 'R', df['RH'] / 100)

It shows the following error:

TypeError: Numerical inputs to HAPropsSI must be ints, floats, lists, or 1D numpy arrays.

The function is called as follows:

HAPropsSI('H', 'T', x['T'] + 273.15, 'P', 101325, 'R', x['RH'] / 100)

The first argument, 'H', is the property that I want to find.

The second argument, 'T', is the name of the first input, and the third argument is the value of that input.

The fourth and fifth arguments are the name and value of the second input (here 'P', the pressure). The same applies to the sixth and seventh arguments (here 'R', the relative humidity).

  • Have you looked into multithreading [`apply`](https://stackoverflow.com/questions/45545110/make-pandas-dataframe-apply-use-all-cores) ? – Michael Szczesny Sep 22 '21 at 13:34
  • Not yet. Thanks. – Zephyr Sep 22 '21 at 13:48
  • Why not directly use `HAPropsSI` on the `df` numpy values? Like `df["T"].values + 273.15`... Works in my test and is a fraction faster – Albo Sep 22 '21 at 13:51
  • What does the function you are applying do and can you vectorise the operation? – ifly6 Sep 22 '21 at 14:04
  • If at all possible, you should rewrite `HAPropsSI` such that it directly operates on vectors of data, or dataframes, rather than on individual numbers. – cadolphs Sep 22 '21 at 22:12
  • Hi All, the function is from coolprop python package, so I am not sure if I can rewrite it. – Zephyr Sep 23 '21 at 00:19

1 Answer

Have you tried to vectorize your function?

I would recommend the answers to this question here, which explain the advantages of "fake" and "true" vectorization very well.
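
A minimal sketch of the "true" vectorization route, along the lines of Albo's comment above: instead of calling the function once per row, pass whole 1D NumPy arrays in a single call. The error message in the question lists 1D NumPy arrays among the accepted inputs, which suggests that a pandas Series is rejected while its underlying array is not. The helper name `add_enthalpy` and its `props_func` parameter are hypothetical and not part of CoolProp; the property function is passed in so the sketch can be tried without CoolProp installed. Pressure is expanded to a full-length array as a cautious assumption, in case scalar and array inputs cannot be mixed.

```python
import numpy as np
import pandas as pd


def add_enthalpy(df, props_func, pressure=101325.0):
    """Return a copy of df with an 'H' column computed in one array call.

    props_func is assumed to have the same call signature as
    CoolProp's HAPropsSI (output name, then three name/value pairs).
    """
    # Convert the Series to plain 1D float arrays before the call.
    t_kelvin = df["T"].to_numpy(dtype=float) + 273.15  # deg C -> K
    rh_frac = df["RH"].to_numpy(dtype=float) / 100.0   # percent -> fraction

    # Assumption: give every input the same length instead of relying on
    # the library to broadcast a scalar pressure.
    p_array = np.full(t_kelvin.shape, float(pressure))

    out = df.copy()
    out["H"] = props_func("H", "T", t_kelvin, "P", p_array, "R", rh_frac)
    return out
```

With CoolProp installed, the call would be `df = add_enthalpy(df, HAPropsSI)` after `from CoolProp.HumidAirProp import HAPropsSI`. Whether this is faster than `apply` on 400,000 rows depends on how the library loops over the arrays internally, so it is worth timing both on a slice of the real data.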