Selectively running function on multiple pandas series depending on function input condition

Question

I have a function to calculate future payments on a loan. The function signature looks something like: calculate_payment(rate: pd.Series, remaining_term: pd.Series, remaining_balance: pd.Series)

I call the function within a larger function and feed it columns of a dataframe (e.g. payment = calculate_payment(df['rate'], df['remaining_term'], df['remaining_balance'])). When I call it on my data, I get a "ZeroDivisionError: float division by zero" at the line interest_monthly = npf.ipmt(rate, 1, remaining_term, amount), because some of the remaining_term values are 0.

I want to keep the variables separated in the function for readability and general usability.

Is there a way for me to call this function only on the indices where remaining_term > 0?

I tried the code below with np.where, but it did not work: it raised the same ZeroDivisionError. It seems the function is evaluated on all values first, including the entries where remaining_term is 0, and only afterwards does np.where pick which value to use for each entry.

interest_monthly = np.where(remaining_term > 0, npf.ipmt(rate, 1, remaining_term, amount), 0)
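To illustrate the behaviour described above, here is a minimal sketch showing that the arguments to np.where are computed in full before np.where itself runs. The function noisy and the list calls are hypothetical names used only for this demonstration; they are not part of my actual code.

```python
import numpy as np

calls = []  # records how many elements the function was asked to process

def noisy(values):
    # Hypothetical stand-in for an expensive or error-prone function.
    calls.append(len(values))
    return values * 2

term = np.array([12, 0, 6])

# Python evaluates noisy(term) first, on the whole array,
# and only then hands the finished result to np.where.
result = np.where(term > 0, noisy(term), 0)
```

Here noisy receives all three elements, including the zero, so any error it raises for that element happens before np.where can discard it.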

Another alternative is to calculate it manually without the npf functions (see code below), but I also get a ZeroDivisionError on the third line, for the same reason: the expression is evaluated for the entire series before a value is chosen for each entry.

mir = rate.add(1)
mirp = mir.pow(remaining_term)
monthly_pmt = np.where(mirp == 1, np.where(remaining_term <= 0, 0, remaining_balance/remaining_term), 0)
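For the division in the manual version, one possible approach is the where= and out= arguments of np.divide, which skip the division for masked-out elements instead of computing it and discarding it. This is a sketch with made-up sample values, and it assumes the columns can be converted to float arrays; it only covers the remaining_balance/remaining_term part, not the full payment formula.

```python
import numpy as np
import pandas as pd

# Made-up sample data; the second row has a remaining term of 0.
remaining_balance = pd.Series([1200.0, 500.0, 300.0])
remaining_term = pd.Series([12, 0, 6])

bal = remaining_balance.to_numpy(dtype=float)
term = remaining_term.to_numpy(dtype=float)

# Pre-fill the output with the default (0), then divide only where term > 0.
# Elements where the mask is False keep the pre-filled value.
straight_line = np.zeros_like(bal)
np.divide(bal, term, out=straight_line, where=term > 0)

monthly_pmt = pd.Series(straight_line, index=remaining_balance.index)
```

Because out is pre-filled, the rows with a zero term are never divided and simply stay at 0.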

I could probably write a row-wise function and use it with apply, but I would rather avoid apply: my data is large (~17 million rows), and apply is slow on data sets of that size.

EDIT: Basically what I am looking for is a way to selectively call functions on pandas columns/series based on a condition (in this case, on only one of the columns/inputs to the function). Is there a way to skip the function evaluation for those entries that would throw an error without using apply?
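To make the request concrete, this is roughly the pattern I have in mind: build a boolean mask, pass only the matching rows of every input to the function, and write the results back into a pre-filled series. The helper call_where and the function payment_stub are hypothetical; payment_stub is only a stand-in that fails on zero terms the way my real function does, not the actual calculation.

```python
import numpy as np
import pandas as pd

def call_where(func, mask, *columns, fill_value=0.0):
    """Call func only on rows where mask is True; other rows get fill_value."""
    mask = np.asarray(mask, dtype=bool)
    result = pd.Series(fill_value, index=columns[0].index, dtype=float)
    if mask.any():
        # Every input is filtered with the same mask, so rows stay aligned.
        subset = [col.loc[mask] for col in columns]
        result.loc[mask] = np.asarray(func(*subset), dtype=float)
    return result

def payment_stub(rate, remaining_term, remaining_balance):
    # Hypothetical stand-in: raises if it ever sees a zero term.
    if (remaining_term == 0).any():
        raise ZeroDivisionError("float division by zero")
    return remaining_balance * rate

df = pd.DataFrame({
    "rate": [0.01, 0.02, 0.03],
    "remaining_term": [12, 0, 6],
    "remaining_balance": [1000.0, 500.0, 200.0],
})

payment = call_where(
    payment_stub,
    df["remaining_term"] > 0,
    df["rate"], df["remaining_term"], df["remaining_balance"],
)
```

This keeps the three inputs as separate arguments and stays vectorized, since the function is called once on the filtered series rather than once per row.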

instead of nesting `numpy.where`, i would use `numpy.select` — Paul H, Aug 07 '23 at 20:51
_I tried the below code with np.where which did not end up working_ --> I can think of nearly limitless ways a line of code might not work. What actually happened? — Paul H, Aug 07 '23 at 20:52
`where` selects values. It does not selectively run functions. You give it arrays, not functions. — hpaulj, Aug 07 '23 at 21:41
Welcome to SO! A couple good pages to help formulate your question so as to encourage answers. They call it "Minimal Reproducible Verifiable Example": https://stackoverflow.com/help/minimal-reproducible-example. Also posting a representative portion of input data, as text: https://stackoverflow.com/q/20109391/12846804. Without those, it is possible you might receive no more than general advice. — OCa, Aug 07 '23 at 21:49
@PaulH sorry about that. It continued to give the same ZeroDivisionError as before. — Rebecca Qin, Aug 08 '23 at 15:10

0 Answers