2D interpolation between two curves (arrays of unequal lengths)

Question

I'm developing an open-source battery model, and I use datasheets of different cells as its input data. The temperature characteristics of a battery look like this:

Battery Temperature Characteristics

When the curves are sampled into numerical values, the resulting arrays do not have the same length:

Numerical data

I'm looking to perform a 2D interpolation in order to determine the voltage of the battery at a given capacity and temperature.

I am struggling to find a good way to interpolate this kind of data. I realize that interpolation between two arrays of unequal lengths might not be a well-defined problem, but I'm looking for a solution that gives reasonable results in this case.

I think that regularizing the data to a grid might work, but I suspect it is not a very good solution here because the lengths and shapes of the curves are so uneven. It might cause the interpolation to be performed between two points that are far apart and "do not correspond to each other" on the curves, if you know what I mean.

Instead, I am hoping for a solution that would "extend" the triangular part of the dataset.

I would be very thankful for any idea that could help me find a solution.

EDIT: I will try to clarify the problem; sorry if I did not express it clearly.

The inputs are the graphs from the datasheet, which are read into numeric values (let's say Excel/csv for storage and a pandas DataFrame for the Python code).

The output is a function that, for any point inside the domain of definition (x=Temperature, y=Capacity), provides the interpolated value of z=Voltage.

I do not fully understand the first question, but the confusion might come from the fact that I do not want graphs as outputs and I do not extrapolate any data.

I do not know which would be the best way to share the data, I think 170 lines might be a bit too much to copy-paste. I don't think it's exactly necessary either.

The point is that I sampled the curves on the graph every 25 mAh of capacity. Since the battery is cut off below a certain voltage, the arrays have varying lengths: the 60°C curve ends around 4200 mAh, while the -40°C curve ends sooner, around 3600 mAh.

EDIT2 N. Wouda : I hope it's allowed to share links, I uploaded the csv here : https://transferxl.com/08jXjy5T1814kr

Pranav Hosangadi: In that case, I would raise a ValueError.

How are the graphs plotted on the entire range if you don't have them sampled on the entire range? By extrapolation? If so, and the plots seem reasonable, why would a regular grid (say, with spline) not be enough? — Gulzar, Jun 01 '21 at 15:17
Also, Please post data as copy-pasteable code, preferably loadable by copy-paste into a dataframe. — Gulzar, Jun 01 '21 at 15:18
And, post copy-pasteable code to create the plots from the data. — Gulzar, Jun 01 '21 at 15:18
Or maybe I don't understand and you are trying to create these graphs you know to be correct from your incomplete data? Please define the problem as inputs as outputs. — Gulzar, Jun 01 '21 at 15:24
What do you want to happen if I try to get the voltage for -40C at 4000K? — Pranav Hosangadi, Jun 01 '21 at 16:47
If I understand correctly, you are interested in a function U(Temperature, Capacity) (so a 2D fit)? This function would accept temperatures like 13.73°C and a capacity for which there are no input data available, and returns a voltage. — natter1, Jun 01 '21 at 17:40

Nelewout · Accepted Answer · 2021-06-01T20:37:52.543

SciPy has a module entirely geared towards interpolation, scipy.interpolate. In the following code I use radial basis functions (RBFs) to create an interpolating function. IMHO, these give a somewhat smoother result than using e.g. interp2d directly, and fitting them is quite cheap when the data set is not too large. The downside is that radial basis functions need not respect the range (min/max) of your data, especially outside the initial domain. You should check for that before using the interpolant!

(see also this answer for a nice overview of the relative strengths of different interpolating functions)
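The check mentioned above might look something like the following sketch. It uses made-up discharge curves rather than the datasheet values, and the names `temps`, `cutoffs`, `overshoot` and `undershoot` are purely illustrative. It also assumes that the end of each curve varies linearly between neighbouring temperatures, which is only a guess at the shape of the sampled region.

```python
import numpy as np
from scipy import interpolate

# Made-up curves (NOT the datasheet values): voltage falls with used
# capacity, and colder curves sit lower and stop earlier.
temps = [-40.0, 0.0, 25.0, 60.0]
cutoffs = [3600.0, 3900.0, 4100.0, 4200.0]  # hypothetical curve ends, mAh

x, y, z = [], [], []
for t, cutoff in zip(temps, cutoffs):
    cap = np.arange(0.0, cutoff + 1.0, 200.0)
    volt = 4.2 - 1.2 * (cap / cutoff) ** 2 - 0.004 * (60.0 - t)
    x.extend([t] * len(cap))
    y.extend(cap)
    z.extend(volt)

x, y, z = np.array(x), np.array(y), np.array(z)
f = interpolate.Rbf(x, y, z)

# Dense grid over (temperature, capacity).
t_grid = np.linspace(min(temps), max(temps), 41)
c_grid = np.linspace(0.0, max(cutoffs), 85)
tt, cc = np.meshgrid(t_grid, c_grid)

# Keep only grid points inside the sampled region. Assumption: the
# cutoff capacity varies linearly between neighbouring curves.
cutoff_at = np.interp(t_grid, temps, cutoffs)
inside = cc <= cutoff_at[np.newaxis, :]
pred = f(tt[inside], cc[inside])

# How far does the interpolant leave the range of the sampled voltages?
overshoot = max(pred.max() - z.max(), 0.0)
undershoot = max(z.min() - pred.min(), 0.0)
print(f"data range: [{z.min():.3f}, {z.max():.3f}] V")
print(f"overshoot: {overshoot:.4f} V, undershoot: {undershoot:.4f} V")
```

If either number is large compared to the voltage resolution you care about, a different basis function or a smoothing term may be worth trying.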

Here's the code that builds the interpolant from the CSV:

import re

import numpy as np
import pandas as pd
from scipy import interpolate


def interp(df: pd.DataFrame):
    """Build an RBF interpolant: voltage = f(temperature, capacity)."""
    temp = []
    cap = []
    voltage = []

    # Stack every "Voltage..." column into flat (temp, cap, voltage) samples.
    for col in df.columns:
        if col.startswith("Voltage"):
            temp.extend([_col_to_temp(col)] * len(df.Capacity))
            cap.extend(df.Capacity)
            voltage.extend(df[col])

    x = np.array(temp)
    y = np.array(cap)
    z = np.array(voltage)

    # Shorter curves have missing (NaN) voltages; drop those samples.
    isna = pd.isna(z)

    return interpolate.Rbf(x[~isna], y[~isna], z[~isna])


def _col_to_temp(col: str) -> float:
    # Gets the temperature from the column name (first signed integer in it)
    res = re.findall(r'[+-]?\d+', col)
    return float(res[0])


df = pd.read_csv("Molicel_Temperature.csv")
f = interp(df)

It is hopefully fairly straightforward. f is the interpolating function: it takes a temperature and a capacity argument, e.g. f(50, 2000), and returns the interpolated voltage. This results in, say, the following graph for different values of (temp, capacity):

which seems to be what you are looking for!
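Since you mention wanting a ValueError for points outside the sampled data (e.g. -40°C at a capacity beyond the end of that curve), one possible way to wrap the interpolant is sketched below. The helper name `make_guarded_interpolant` and the toy numbers are hypothetical, and the sketch assumes that the cutoff capacity can be interpolated linearly between neighbouring temperature curves.

```python
import numpy as np
from scipy import interpolate


def make_guarded_interpolant(temp, cap, volt):
    """Return f(t, c) that raises ValueError outside the sampled region."""
    temp, cap, volt = (np.asarray(a, dtype=float) for a in (temp, cap, volt))
    rbf = interpolate.Rbf(temp, cap, volt)

    # Largest sampled capacity on each temperature curve.
    curve_temps = np.unique(temp)
    curve_cutoffs = np.array([cap[temp == t].max() for t in curve_temps])
    cap_min = cap.min()

    def f(t, c):
        if not curve_temps[0] <= t <= curve_temps[-1]:
            raise ValueError(f"temperature {t} is outside the sampled range")
        # Assumption: the cutoff varies linearly between neighbouring curves.
        cutoff = np.interp(t, curve_temps, curve_cutoffs)
        if not cap_min <= c <= cutoff:
            raise ValueError(f"capacity {c} is outside the sampled range at {t} degC")
        return float(rbf(t, c))

    return f


# Toy data (made up): a short curve at -40 degC and a longer one at 60 degC.
temp = np.array([-40.0] * 3 + [60.0] * 5)
cap = np.array([0.0, 1000.0, 2000.0, 0.0, 1000.0, 2000.0, 3000.0, 4000.0])
volt = np.array([3.9, 3.4, 2.8, 4.2, 4.0, 3.7, 3.4, 3.0])

f = make_guarded_interpolant(temp, cap, volt)
print(f(10.0, 2500.0))  # inside the region: returns a float

try:
    f(-40.0, 4000.0)  # beyond the end of the -40 degC curve
except ValueError as err:
    print("rejected:", err)
```

The guard only decides whether a query is allowed; the interpolation itself is unchanged.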
