15

Goal

I would like to compute the 3D volume integral of a numeric scalar field.

Code

For this post, I will use an example of which the integral can be exactly computed. I have therefore chosen the following function:

Integral function

In Python, I define the function, and a set of points in 3D, and then generate the discrete values at these points:

import numpy as np


# Make data.
def function(x, y, z):
    return x**y**z

N = 5
grid = np.meshgrid(
    np.linspace(0, 1, N),
    np.linspace(0, 1, N),
    np.linspace(0, 1, N)
)

points = np.vstack(list(map(np.ravel, grid))).T

x = points[:, 0]
y = points[:, 1]
z = points[:, 2]

values = [function(points[i, 0], points[i, 1], points[i, 2])
          for i in range(len(points))]

Question

How can I find the integral, if I don't know the underlying function, i.e. if I only have the coordinates (x, y, z) and the values?

henry
  • 875
  • 1
  • 18
  • 48
  • 1
    https://en.wikipedia.org/wiki/Numerical_integration#Quadrature_rules_based_on_interpolating_functions – quester Oct 19 '21 at 18:06

4 Answers4

9

A nice way to go about this would be using scipy's tplquad integration. However, to use that, we need a function and not a cloud point.

An easy way around that is to use an interpolator, to get a function approximating our cloud point - we can for example use scipy's RegularGridInterpolator if the data is on a regular grid:

import numpy as np
from scipy import integrate
from scipy.interpolate import RegularGridInterpolator

# Make data.
def function(x,y,z):
    return x*y*z

N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1
x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)

values = function(*np.meshgrid(x,y,z, indexing='ij'))

# Interpolate:
function_interpolated = RegularGridInterpolator((x, y, z), values)

# tplquad integrates func(z,y,x)
f = lambda z,y,x : my_interpolating_function([z,y,x])

result, error = integrate.tplquad(f, xmin, xmax, lambda _: ymin, lambda _:ymax,lambda *_: zmin, lambda *_: zmax)

In the example above, we get result = 0.12499999999999999 - close enough!

Seon
  • 3,332
  • 8
  • 27
  • 1
    And the result is probably as accurate as it gets if you consider floating point errors ;-) – C Hecht Oct 13 '21 at 14:18
  • Thanks for the answer, but what do you do, if the volume over which you want to integrate is not a cubicle? My ultimate goal is to find the integral of any shape given by x,y,z coordinate. The coordinates do lay on a uniform grid, but the shape (the integration boundaries) is defined by the outer most points, which can be arbitrary.... – henry Oct 14 '21 at 06:09
  • Does this help answer your question: https://stackoverflow.com/questions/47900969/4d-interpolation-for-irregular-x-y-z-grids-by-python – C Hecht Oct 14 '21 at 08:15
  • Thank you for the link! This shows me how to get an interpolation for a non-cubic point cloud, however do you know how to get the volume integral of an arbitrary point cloud? – henry Oct 16 '21 at 10:46
  • 2
    A terribly inefficient way to go about it would be to integrate over the smallest cube containing your cloud point (`LinearNDInterpolator` lets you set a default value for all points outside the convex hull of your cloud point, and you can just set it to 0). A less wasteful way of doing this if your cloud point doesn't look like a cuboid at all would be to use the fact that `tplquad` lets you specify integration boundaries as functions of x and x,y, but I'm not quite sure how you'd go about computing them (though scipy's `ConvexHull` probably will be a piece of the puzzle). – Seon Oct 16 '21 at 11:04
4

The easiest way to achieve what you are looking for is probably scipy's integration function. Here your example:

from scipy import integrate

# Make data.
def func(x,y,z):
    return x**y**z

ranges = [[0,1], [0,1], [0,1]]

result, error = integrate.nquad(func, ranges)

Are you aware that the function that you created is different from the one that you show in the image. The one you created is an exponential (x^y^z) while the one that you are showing is just multiplications. If you want to represent the function in the image, use

def func(x,y,z):
    return x*y*z

Hope this answers your question, otherwise just write a comment!

Edit:

Misread your post. If you only have the results, and they are not regularly spaced, you would have to figure out some form of interpolation (i.e. linear) and a lookup-table. If you do not know how to create that, let me know. The rest of the stated answer could still be used if you define func to return interpolated values from your original data

C Hecht
  • 932
  • 5
  • 14
2

The first answer explains nicely the principal approach to handle this. Just wanted to illustrate an alternative way by showing the power of sklearn package and machine learning regression.

Doing the meshgrid in 3D gives a very large numpy array,

import numpy as np

N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1

x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)

grid = np.array(np.meshgrid(x,y,z, indexing='ij'))
grid.shape = (3, 5, 5, 5) # 2*5*5*5 = 250 numbers

Which is visually not very intuitive with 250 numbers. With different possible indexing ('ij' or 'xy'). Using regression we can get the same result with few input points (15-20).

# building random combinations from (x,y,z)

X = np.random.choice(x, 20)[:,None]
Y = np.random.choice(y, 20)[:,None]
Z = np.random.choice(z, 20)[:,None]

xyz = np.concatenate((X,Y,Z), axis = 1)
data = np.multiply.reduce(xyz, axis = 1)

So the input (grid) is just a 2D numpy array,

xyz.shape
(20, 3)

With the corresponding data,

data.shape = (20,)

Now the regression function and integration,

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from scipy import integrate

pipe=Pipeline([('polynomial',PolynomialFeatures(degree=3)),('modal',LinearRegression())])
pipe.fit(xyz, data)

def func(x,y,z):
    return pipe.predict([[x, y, z]])

ranges = [[0,1], [0,1], [0,1]]

result, error = integrate.nquad(func, ranges)
print(result)
0.1257

This approach is useful with limited number of points.

Damir Devetak
  • 726
  • 4
  • 10
0

Based on your requirements, it sounds like the most appropriate technique would be Monte Carlo integration:

# Step 0 start with some empirical data
observed_points = np.random.uniform(0,1,size=(10000,3))

unknown_fn = lambda x: np.prod(x) # just used to generate fake values

observed_values = np.apply_along_axis(unknown_fn, 1, observed_points) 

K = 1000000

# Step 1 - assume that f(x,y,z) can be approximated by an interpolation
# of the data we have (you could get really fancy with the 
# selection of interpolation method - we'll stick with straight lines here)

from scipy.interpolate import LinearNDInterpolator
f_interpolate = LinearNDInterpolator(observed_points, observed_values)

# Step 2 randomly sample from within convex hull of observed data
# Step 2a - Uniformly sample from bounding 3D-box of data
lower_bounds = observed_points.min(axis=0)
upper_bounds = observed_points.max(axis=0)

sampled_points = np.random.uniform(lower_bounds, upper_bounds,size=(K, 3))
# Step 2b - Reject points outside of convex hull...
# Luckily, we get a np.nan from LinearNDInterpolator in this case

sampled_values = f_interpolate(sampled_points)
rejected_idxs = np.argwhere(np.isnan(sampled_values))

# Step 2c - Remember accepted values of estimated f(x_i, y_i, z_i)
final_sampled_values = np.delete(sampled_values, rejected_idxs, axis=0)

# Step 3 - Calculate estimate of volume of observed data domain
#  Since we sampled uniformly from the convex hull of data domain,
#  each point was selected with P(x,y,z)= 1 / Volume of convex hull
volume = scipy.spatial.ConvexHull(observed_points).volume

# Step 4 - Multiply estimated volume of domain by average sampled value
I_hat = volume * final_sampled_values.mean()
print(I_hat)

For a derivation of why this works see this: https://cs.dartmouth.edu/wjarosz/publications/dissertation/appendixA.pdf

Mike Holcomb
  • 403
  • 3
  • 9
  • Does this work with a non cubic integration boundary? Any arbitrary shape? – henry – henry Oct 20 '21 at 06:42
  • The way it is coded, it works only with convex integration boundaries but that covers a lot cases. https://en.wikipedia.org/wiki/Convex_hull. Part of the issue with non-convex boundaries is estimating the value of the unknown function across interior angles. Imagine you have a 5 point star shaped integration boundary in 2-D. If I have two points on the boundary sitting across an interior angle on different "arms", what is the value of the unknown function in between them? You could come up with an alternative possibly like chopping up the region into convex parts that you sum together. – Mike Holcomb Oct 20 '21 at 06:58