Poor performance when looping over numpy array

Question

With the function get_height I calculate the height difference between two point clouds for every scan (y,z-coordinates in my example).

My algorithm works but takes 1.93 s on average. How can I improve the performance?

EDIT: I attached a fully working example

import numpy as np
import matplotlib.pyplot as plt

def generate_random_dataset(N,x_max):
    # Create the 'x' column
    unique_x = np.linspace(0, x_max, x_max*10+1)
    x = np.random.choice(unique_x, N) # Generate the array with repeated values

    # Create the 'y' column
    y = np.random.uniform(-5, 5, N)

    # Create the 'z' column
    z = - y**2 + 5 + np.random.normal(0, 1, N)

    # Create the 'A' array
    A = np.column_stack((x, y, z))

    return A

def get_height(A0,A1):
    # get unique x values that are in both scans
    ux0 = np.unique(A0[:,0])
    ux1 = np.unique(A1[:,0])
    ux = np.intersect1d(ux0,ux1)

    # get height at each unique x value
    h = []
    for x in ux:
        # get slice of lower scan
        mask0 = (A0[:,0] == x)
        z0 = A0[mask0,2]
        
        # get slice of upper scan
        mask1 = (A1[:,0] == x)
        z1 = A1[mask1,2]

        # get height difference
        height = np.max(z1) - np.max(z0)

        # append results to list
        h.append(height)

    # convert list to array
    h = np.array(h)

    return ux, h

# run script
A0 = generate_random_dataset(N=300000,x_max=100)
A1 = generate_random_dataset(N=310000,x_max=120)
A1[:,2] = A1[:,2] - 0.001*(A1[:,0]-50)**2 + 5 # make A1 higher and different than A0


# apply function
%timeit get_height(A0,A1)
ux,h = get_height(A0,A1) # %timeit does not keep variables assigned inside the timed statement

# plot
fig = plt.figure(figsize=(4.24*1.5,3*1.5))
ax = plt.subplot(111)
ax.scatter(ux,h)
ax.set_xlabel('x [mm]')
ax.set_ylabel('h [mm]')
plt.show()

I've tried using np.lexsort approach from a previous question of mine but that approach doesn't work for two arrays.

I want to approach this problem differently (without looping over unique x values) but I can't figure out a solution.

This may help: [Most efficient way to map function over numpy array](https://stackoverflow.com/questions/35215161/most-efficient-way-to-map-function-over-numpy-array) — deadshot, Jun 26 '23 at 11:49
Consider making a [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). It will be easier to help you, and more people will be willing to help. — dankal444, Jun 26 '23 at 12:24
I've attached a working example. Thank you for the suggestion :-) — Max M, Jun 26 '23 at 14:08

Stuart · Accepted Answer · 2023-06-26T15:15:39.087

There is probably a pure NumPy solution, but in the meantime pandas is much faster than a Python loop that builds a boolean mask over the whole array on every iteration, even including the overhead of converting the arrays into DataFrames.

import pandas as pd

def get_height_pd(A0, A1):
    df0 = pd.DataFrame(A0)
    df1 = pd.DataFrame(A1)
    m0 = df0.groupby(0)[2].max()
    m1 = df1.groupby(0)[2].max()
    return (m1 - m0).dropna()  # dropna gets rid of the non-intersecting ones

Alternatively, and possibly a little faster, use Series and group the z values by the x column directly.

def get_height_s(A0, A1):
    s0 = pd.Series(A0[:, 2])
    s1 = pd.Series(A1[:, 2])
    m0 = s0.groupby(A0[:, 0]).max()
    m1 = s1.groupby(A1[:, 0]).max()
    return (m1 - m0).dropna()
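
To illustrate why dropna removes the non-intersecting x values: subtracting two Series aligns them on the union of their indexes, and any x present in only one of them becomes NaN. The sketch below uses made-up per-x maxima (the numbers are assumptions for illustration only) and also shows one way to get back the two arrays that the original get_height returns.

```python
import pandas as pd

# Toy per-x maxima (made-up values): x = 0.0 exists only in scan 0,
# x = 2.0 exists only in scan 1, x = 1.0 exists in both.
m0 = pd.Series([1.0, 2.0], index=[0.0, 1.0])
m1 = pd.Series([5.0, 7.0], index=[1.0, 2.0])

diff = m1 - m0          # aligned on the union of both indexes; unmatched x become NaN
result = diff.dropna()  # keeps only the x values present in both scans

# Convert back to the (ux, h) pair of arrays used in the question
ux = result.index.to_numpy()
h = result.to_numpy()

print(diff)
print(ux, h)
```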

DataFrame solution: 21.7 ms. Series solution: 19.5 ms. — Max M, Jun 26 '23 at 15:31

score 0 · Answer 2 · answered Jun 26 '23 at 16:44

Here's an ugly NumPy solution that uses the agg_minmax function (taken from the answer linked in the code comment) to get the min and max for each x. Mirror the heights of one array so that they all lie below the other array's heights and run in the opposite direction (z' = offset - z, where offset is a suitably low number), concatenate the two arrays, then find the min and max for each x. For every x that occurs in both scans, the min will be offset minus the maximum from A1, and the max will be the maximum from A0. Then undo the mirroring to get the difference in heights.
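
Before the full implementation, here is a toy version of the mirroring step for a single x value. The heights are made-up numbers chosen only to make the arithmetic easy to follow; the variable names mirror those in the answer's code.

```python
import numpy as np

z0 = np.array([1.0, 3.0])   # heights of scan 0 at one x (made-up)
z1 = np.array([4.0, 6.0])   # heights of scan 1 at the same x (made-up)

min0 = z0.min()
offset = min0 + z1.min() - 1       # low enough that every mirrored value is < min0
mirrored = offset - z1             # flips scan 1 and pushes it below scan 0

combined = np.concatenate((z0, mirrored))
lo, hi = combined.min(), combined.max()
# lo == offset - max(z1) and hi == max(z0), so undoing the mirroring gives:
height = offset - lo - hi          # equals max(z1) - max(z0)

print(height, z1.max() - z0.max())
```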

def agg_minmax(a):  # from https://stackoverflow.com/a/58908648/567595
    sidx = np.lexsort(a[:,::-1].T)
    b = a[sidx]
    m = np.r_[True,b[:-1,0]!=b[1:,0],True]
    return np.c_[b[m[:-1],:2], b[m[1:],1]]

def get_height(A0, A1):
    min0 = A0[:, 2].min()
    offset = min0 + A1[:, 2].min() - 1
    b0 = A0[:, [0, 2]]
    b1 = np.array([A1[:, 0], offset - A1[:, 2]]).T
    c = np.concatenate((b0, b1))
    agg = agg_minmax(c)
    f = agg[(agg[:, 1] < min0) & (agg[:, 2] >= min0)]   # filter out the not-applicable rows
    return f[:, 0], offset - f[:, 1] - f[:, 2]

It's slower than the pandas solutions but perhaps can be tweaked.
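
One possible tweak, sketched here as an alternative rather than as a tuned version of the code above: sort each scan by x once, then take the per-group maximum with np.maximum.reduceat and match the groups with np.intersect1d. The names group_max and get_height_np are hypothetical helpers introduced for this sketch. It assumes both arrays are non-empty and, like the original loop, relies on exact equality of the x values. No timing is claimed for it.

```python
import numpy as np

def group_max(x, z):
    """Return the sorted unique x values and the maximum z for each of them."""
    order = np.argsort(x, kind="stable")
    xs, zs = x[order], z[order]
    # Indices where a new run of equal x values starts
    starts = np.flatnonzero(np.r_[True, xs[1:] != xs[:-1]])
    return xs[starts], np.maximum.reduceat(zs, starts)

def get_height_np(A0, A1):
    ux0, m0 = group_max(A0[:, 0], A0[:, 2])
    ux1, m1 = group_max(A1[:, 0], A1[:, 2])
    # Keep only the x values present in both scans, with their positions
    ux, i0, i1 = np.intersect1d(ux0, ux1, assume_unique=True, return_indices=True)
    return ux, m1[i1] - m0[i0]

# Tiny made-up example: only x = 1.0 occurs in both scans
A0 = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 3.0], [1.0, 0.0, 2.0]])
A1 = np.array([[1.0, 0.0, 5.0], [1.0, 0.0, 4.0], [2.0, 0.0, 9.0]])
ux, h = get_height_np(A0, A1)
print(ux, h)
```

Each scan is sorted once instead of being masked once per unique x, which is where the loop in the question spends its time.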

Using the provided example, I got 182 ms ± 10.1 ms per loop. — Max M, Jun 28 '23 at 08:47
