3

I have some kind of a complicated geometrical question. I have two series of points represented by their x,y and z as numpy arrays (in reality I have thousands in each series but make it here simple for a better understanding). First data set (arr_1) represents a cutting surface. This surface is usually vertical and displaces horizontal surfaces. The second set is points of the vertical surfaces that are displaces by a surface (the surface that arr_1 is representing). At the moment I have some redundant data in these two sets. Firstly, in arr_1 I prefer to have only four corners points and make a surface by myself using them. Secondly, in arr_2 I also do not need the data that are too close to arr_1. Finally, I want to terminate the points excising in both sides of the surface which is made by four corners of the arr_1.

import numpy as np
arr_1= np.array ([[31.95548952,  7.5       , 12.5       ],
       [31.95548952, 22.5       , 12.5       ],
       [37.5       ,  7.5       , 24.20636043],
       [37.5       , 22.5       , 24.20636043],
       [43.78154278,  7.5       , 37.5       ],
       [43.78154278, 22.5       , 37.5       ],
       [55.32209575,  7.5       , 62.5       ],
       [55.32209575, 22.5       , 62.5       ],
       [62.5       ,  7.5       , 78.50696445],
       [62.5       , 22.5       , 78.50696445],
       [66.52446985,  7.5       , 87.5       ],
       [66.52446985, 22.5       , 87.5       ]])
arr_2= np.array([[87.5       ,  7.5       , 49.99997914],
       [87.5       , 22.5       , 50.00001192],
       [62.5       ,  7.5       , 50.00004172],
       [62.5       , 22.5       , 50.00007749],
       [46.8747884 ,  7.5       , 62.5       ],
       [46.87483609, 22.5       , 62.5       ],
       [37.5       ,  7.5       , 69.99973059],
       [37.5       , 22.5       , 69.99977231],
       [12.5       ,  7.5       , 70.00012398],
       [12.5       , 22.5       , 70.00015974]])

Then, I find the distances between each set of these two arrays using the following code:

from scipy.spatial import distance
distances=distance.cdist(arr_1, arr_2)

When I run it, distances is an array with 12 rows and 10 columns. Now, I want to remove the points of arr_2 which closer than a threshold (let's say 10) to any point of arr_1. I showed these two points in a black rectangle in the first uploaded photo. A good solution is proposed here, but it does not solve my problem because I want to compare the distance of each row of arr_1 with each row of arr_2. I appreciate any solution to do it. In fact, arr_1 contains the points of a surface but I do not need all of them to make my surface. I only pick for corners of this set to make my surface. I use a highly time consuming for loop, so I appreciate any faster method to find corners of my points:

corner1 = np.array(arr_1.min (axis=0)) # this gives the row containing MIN values of x,y and z
corner2 = np.array([])
corner4 = np.array([])
corner3 = np.array(arr_1.max (axis=0)) # this gives the row containing MAX values of x,y and z
# the next block will find the corner in which x and y are minimum and z is maximum
for i in arr_1[:,0]:
    if i == max (arr_1[:,0]):
        for j in arr_1[:,1]: 
            if j == min (arr_1[:,1]):
                for h in arr_1[:,2]:
                    if h == max (arr_1[:,2]):
                        corner2 = np.append(corner2, np.array([i,j,h]))
                        corner2=corner2[0:3]
# the next block will find the corner in which x and z are minimum and y is maximum
for m in arr_1[:,0]:
    if m == min (arr_1[:,0]):
        for n in arr_1[:,1]: 
            if n == max (arr_1[:,1]):
                for o in arr_1[:,2]:
                    if o == min (arr_1[:,2]):
                        corner4 = np.append(corner4, np.array([m,n,o]))
                        corner4=corner4[0:3]

Finally, after extracting four corners, I want to make a surface using them, and find the perpendicular (green arrows) or horizonal (red arrows) projection of adjacent points of arr_2 on it. I have no idea idea how to find the surface and projections. Thanks for reading and payying attention to my detailed issue! I appreciate if anyone propose any solution to any part of it. enter image description here

enter image description here

Ali_d
  • 1,089
  • 9
  • 24
  • Nice question - I think there might be some milage in the [3-point plane theorem](https://en.wikipedia.org/wiki/Plane_(geometry)#Determination_by_contained_points_and_lines) so that rather than picking 4 corners, you could just randomly choose any 3 points, and [determine your plane from those](https://math.stackexchange.com/questions/99299/best-fitting-plane-given-a-set-of-points). If there's noise in your data, then you might want to run a few samples and agree on the most characteristic set of point as a faster way of arriving at your baseline planes. – Thomas Kimber Oct 01 '20 at 11:50
  • Dear @Thomas Kimber, Thanks for your feedback on the surface creation from points. It was fantastic. Have you any idea how to find the projections of adjacent points on the surface? Thanks in advance. – Ali_d Oct 01 '20 at 12:02
  • 1
    For the first part you can get rid of the closest point with `arr_2[np.where(np.min(distance.cdist(arr_1, arr_2),axis=0)>10)[0],:]` – obchardon Oct 01 '20 at 12:24
  • 2
    For the corner, one option, that work with your example input (but perhaps not with more points if the repartition is not symetrical) could be `idx = np.argsort(distance.cdist([np.mean(x1,axis=0)],x1)).flatten()[0:4]`. We take the 4 farest points in comparison with the center of mass of all points. – obchardon Oct 01 '20 at 12:47
  • Dear @obchadron, thanks for your hints. Your solutions are truly fantastic. How can I subtract the points which define with the threshold from the arr_2? I saved the out put of `arr_2[np.where(np.min(distance.cdist(arr_1, arr_2),axis=0)>10)[0],:]` as a new array (`new_arr`). then, I tried using `np.setdiff1d(arr_2,new_arr)` but it was not successful. I appreciate your help. – Ali_d Oct 01 '20 at 13:14
  • I found how to subtract an array from another one here: https://stackoverflow.com/questions/55530975/efficiently-delete-each-row-of-an-array-if-it-occurs-in-another-array-in-pure-nu – Ali_d Oct 01 '20 at 13:41
  • Dear @obchardon, Thanks for your hints. Are you sure about the solution proposed for finding the corners? I tried it and it is giving me : `array([6, 7, 4, 5], dtype=int64)`. But, I want to extract the points which make four corners. – Ali_d Oct 01 '20 at 14:04
  • @Ali_d It output the index (the position in the point array) not the points itself. Get the corner points with `arr_1[idx,:]` – obchardon Oct 01 '20 at 14:16
  • @obchardon, Thanks for replying. But, it does not give the correct points. I checked it and resulted indexes do not represent the four corner points. – Ali_d Oct 01 '20 at 14:20
  • Look into triangulation algorithms. Delaunay for example. It will reduce your search space and noise dependence enormously if done right. Then you can just average all the resulting vectors – Mad Physicist Oct 01 '20 at 14:23
  • 1
    Just to be clear, are you looking for the intersection of two planes? – Mad Physicist Oct 01 '20 at 14:24
  • 1
    Dear @Mad Physicist, Yes. Exacltly. I want to know where my arr_2 is cutting arr_1. They meet this cutting surface two times. one plane is created by connecting four corners but another one is only the points which are close to this surface. – Ali_d Oct 01 '20 at 14:27
  • Can you do me a favor? Can you explain what `arr_1` and 2 are in detail in the question? The type of shape the data represents, etc. Perhaps in the first couple of paragraphs? – Mad Physicist Oct 01 '20 at 14:31
  • @Ali_d my bad I took the closest point no the farest, here you go : `idx = np.argsort(distance.cdist([np.mean(arr_1,axis=0)],arr_1)).flatten()[-4:]` – obchardon Oct 01 '20 at 14:34
  • @Mad Physicist, question is updated. Thanks for any feedback. – Ali_d Oct 01 '20 at 14:40
  • You can greatly reduce the complexity of the problem if, as you say, you find a fit for arr_1. Rather than four points, I recommend a normal vector. Trimming arr_2 becomes much easier then. Is that an acceptable approach? – Mad Physicist Oct 01 '20 at 15:22
  • Here's another link to a [least-squares solution](https://stackoverflow.com/questions/35070178/fit-plane-to-a-set-of-points-in-3d-scipy-optimize-minimize-vs-scipy-linalg-lsts) Here, if you calculated the formulae for plane 1 and plane 2, you could algebraically solve the two equations which should give the solution for a line vector describing the line of intersection - you could then calculate some threshold distance between points and locations on that vector to establish which points are in the intersection zone. – Thomas Kimber Oct 01 '20 at 15:35
  • @ThomasKimber. We're on the same page. Definitely the right solution in most cases – Mad Physicist Oct 01 '20 at 15:51
  • @Mad Physicist, passing a surface through all the points of arr_1 and trimming arr_2 into that surface is of course a very good solution. I thought maybe it would be easier to extract four corners. – Ali_d Oct 01 '20 at 16:43
  • @Ali_d. Unfortunately four corners don't make a plane most of the time. But it's the right idea. I'll work out an answer for you in a bit. – Mad Physicist Oct 01 '20 at 17:01
  • @ThomasKimber. Links appreciated. I've metabolized them into my answer. – Mad Physicist Oct 01 '20 at 21:21
  • @MadPhysicist cool you're very welcome - you beat me to the answer - I was still scribbling away trying to work out the formula for expressing the intersection line - I've posted anyway, if only to share the neat plot I put together in writing it up! – Thomas Kimber Oct 01 '20 at 21:37

2 Answers2

2

Let's split this problem up into a couple of parts.

  1. You have a bunch of datapoints that describe a noisy plane, arr_1.
  2. You want to know where it intersects another plane, described by arr_2.
  3. You want to establish some threshold in arr_2 about that intersection.

The approach I show here is to assume that the data is a measurement of some truth value, and you want to perform these operations on your best guess at that value, not the raw data. To that end:

Part 1: Least squares fit to plane

There are a couple of different ways to describe a plane, such as a normal vector and a point. The simplest for least-squares fitting is probably

a * x + b * y + c * z = 1

Assuming that your data is reasonably well behaved, there should be no catch to doing a simple fit using the equation

arr_1 @ [[a], [b], [c]] = 1   # almost python pseudo code

Since there is no single solution possible to this with more than four points, you can run np.linalg.lstsq to get the values optimized according to the MSE:

plane_1 = np.linalg.lstsq(arr_1, np.ones((arr_1.shape[0], 1), dtype=arr_1.dtype))[0].ravel()

If arr_2 is a plane as well, you do the same thing to it to get plane_2.

Part 2: Intersection of planes

Many solutions for the intersection of planes rely on the normal vectors of the planes. I will assume that both planes have dependence on the Y coordinate (which seems safe from the plots). In that case, you can follow this answer on Math Stack Exchange. Setting y = t, you can solve for the line from the system

a1 * x + c1 * z = 1 - b1 * t
a2 * x + c2 * z = 1 - b2 * t

Here, the vector [a1, b1, c1] is plane_1. After working out the details, you get

m = np.cross(plane_1, plane_2)
b = np.array([plane_1[2] - plane_2[2], 0, plane_2[0] - plane_1[0]]) / m[1]
m /= m[1]
line = (m * t + b)

This is a parametrization for any value of t.

Part 3: Distance from point to line

To threshold the values of arr_2 about the line computed above in terms of m and b, you need a formula for the distance between a point and a line. This can be done with e.g., the method in the posts here and here.

For example, a single point p can be processed as follows:

t = (p - b).dot(m) / m.dot(m)
dist = np.sqrt(np.sum((p - (m * t + b))**2))

If you are only interested in doing the thresholding, you can compare dist**2 to the square of the threshold and save some cycles on the square root since both functions are monotonic.

TL;DR

Inputs: arr_1, arr_2, distance_threshold.

# Find planes in data
plane_1 = np.linalg.lstsq(arr_1, np.ones((arr_1.shape[0], 1), dtype=arr_1.dtype))[0].ravel()
plane_2 = np.linalg.lstsq(arr_2, np.ones((arr_2.shape[0], 1), dtype=arr_2.dtype))[0].ravel()

# Find intersection line, assume plane is not y=const
m = np.cross(plane_1, plane_2)
b = np.array([plane_1[2] - plane_2[2], 0, plane_2[0] - plane_1[0]]) / m[1]
m /= m[1]

# Find mask for arr_2
t = ((arr_2 - b).dot(m) / m.dot(m))[:, None]
dist2 = np.sum((arr_2 - (m * t + b))**2, axis=1)
mask = dist2 >= distance_threshold**2

# Apply mask
subset = arr_2[mask, :]

Appendix 1: RMSE

The really nice thing about this approach (besides the fact that it limits your algorithm to ~O(n)), as mentioned before, is that it de-noises your data. You can use the result of the least squares fits to compute the RMSE of the plane data about the fit to get a sense of just how planar your measurements really are. The second return value of lstsq is the RMSE metric.

Appendix 2: Distance of point to plane

If the data in arr_2 is indeed not planar, you can subset it a little differently. Rather than finding the intersection of a pair of planes, you can use the formula for the distance between a single point and a plane directly, as given here:

np.abs(p * plane_1 - 1) / np.sqrt(plane1.dot(plane_1))

The code then becomes

# Find planes in data
plane_1 = np.linalg.lstsq(arr_1, np.ones((arr_1.shape[0], 1), dtype=arr_1.dtype))[0].ravel()
plane_2 = np.linalg.lstsq(arr_2, np.ones((arr_2.shape[0], 1), dtype=arr_2.dtype))[0].ravel()

# Find mask for arr_2
dist2 = (arr_2 * plane_1 - 1)**2 / plane_1.dot(plane_1)
mask = dist2 >= distance_threshold**2

# Apply mask
subset = arr_2[mask, :]
Mad Physicist
  • 107,652
  • 25
  • 181
  • 264
  • thanks a lot, but I couldn't run it. I saw an error in the fifth line(`m = np.cross(...)`): `incompatible dimensions for cross product (dimension must be 2 or 3)`. Also points of arr_2 are most of the time too chaotic. I showed a simple case here and points are gently distributed but in reality have complex pattern, it maybe too tough for `np.linalg.lstsq` to find the plane. arr_1 sit in a predictable way. To remove close points, method of @obchardon is efficient. My concern is project of adjacent points of arr_2 on the plane passing arr_1 (finding x,y,z of projection points?). – Ali_d Oct 02 '20 at 08:20
  • @Ali_d. I'll add the option of point projection. It's actually ok if `arr_2` is chaotic. Linear least squares will always converge as long as the problem is not under determined. I'm surprised that `cross` is failing. That can only happen if `arr_1` or `arr_2` has a shape [1] other than 3 – Mad Physicist Oct 02 '20 at 14:21
  • @ Mad Physicist, I want to get the coordinates of the projection (shadow) of the points of arr_2 which are sandwiching the generated plane. In my simplified case, four points of arr_2 should be projected on the created surface. both the horizontal or perpendicular projections are ok for me (as my fig shows). thanks for being that much supportive. – Ali_d Oct 02 '20 at 14:25
  • @Ali_d. `np.linalg.lstsq` returns a whole bunch of stuff. You only want the first output, so I've changed the code to grab that. – Mad Physicist Oct 02 '20 at 15:26
  • @Ali_d. Test this out. If it does what you want. I'll stop here. If you still want planar projection instead of linear, I can work that out for you in a couple of steps. – Mad Physicist Oct 02 '20 at 15:45
  • I appreciate your hints. Which variable has the coordinates of the projection of adjacent point of the surface passing points of arr_1? Is it `t`? Now, `t` represents the equivalent y of all points on the surface passing arr_1, isn't it? Is it possible to directly extract the x,y and z of projection of only four adjacent points (after removing that redundant close points to the plane) rather than all of them? – Ali_d Oct 02 '20 at 17:20
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/222421/discussion-between-mad-physicist-and-ali-d). – Mad Physicist Oct 02 '20 at 18:18
2

Here's some code I put together - it covers the first part of your question - which is to find and define "surfaces" from your collections of points.

Any flat surface (a plane) can be defined by the formula ax + by + cz + d = 0 where a,b,c are coefficients and d is some constant. We can write that into a function like the one below:

def linear_plane(data, a, b, c):
    x = data[0]
    y = data[1]
    z = data[2]
    return (a * x) + (b * y) + (c*z)

Then, after some imports, we can use scipy's curve_fit to search for and find the parameters that best fit that function for each of your arrays. curve_fit takes input values does the linear_plane function and tries to tweak a,b,c so that the results of that function match the a list containing lots of -1's. This comes from rearranging the equation for the case where d = 1

from scipy.optimize import curve_fit

v1,err1 = curve_fit(linear_plane,arr_1.T,np.repeat(-1,len(arr_1)))
v2,err2 = curve_fit(linear_plane,arr_2.T,np.repeat(-1,len(arr_2)))

Where each of v1, v2 are the coefficients a,b,c that most closely define the two planes you're talking about. err1 and err2 show the residual "errors" so watch out if these get too large.

Now the coefficients are found, you can use them to visualise the two planes:

%matplotlib inline
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter3D(xs=arr_1[:,0], ys=arr_1[:,1], zs=arr_1[:,2], c="blue")
ax.scatter3D(xs=arr_2[:,0], ys=arr_2[:,1], zs=arr_2[:,2], c="red")
xx1, yy1 = np.meshgrid([min(arr_1[:,0]),max(arr_1[:,0])],[min(arr_1[:,1]),max(arr_1[:,1])])
xx2, yy2 = np.meshgrid([min(arr_2[:,0]),max(arr_2[:,0])], [min(arr_2[:,1]),max(arr_2[:,1])])

# Use the coefficients in v1[0], v1[1], v1[2] to calculate z-values for all xx/yy values
z1 = (-v1[0] * xx1 - v1[1] * yy1 - 1) * 1. /v1[2]


# Use the coefficients in v2[0], v2[1], v2[2] to calculate z-values for all xx/yy values
z2 = (-v2[0] * xx2 - v2[1] * yy2 - 1) * 1. /v2[2]

# Using xx,yy and z values, plot the surfaces fit into the graph.
ax.plot_surface(xx1, yy1, z1, alpha=0.2, color=[0,0,1])
ax.plot_surface(xx2, yy2, z2, alpha=0.2, color=[1,0,0])

Here, we calculate a collection of z values for all x's and y's falling within the max and min bounds of each plane, and then plot these as surfaces showing the intersection:

enter image description here

I notice Mad Physicist has just posted a more complete answer, so will leave this here for now. One thing I'd like to be able to do is extend the surface fitting so that the planes don't have to be linear - like an analogous to some 3d polynomial, which should just be a case of substituting the function used in the curve_fit call.

The other part missing from here is how to calculate the line equation that describes where these two planes cross, which I believe is found in the other (better) answer.

Thomas Kimber
  • 10,601
  • 3
  • 25
  • 42
  • Nice. You can simplify the plane equation by dividing through by `d`. – Mad Physicist Oct 01 '20 at 21:57
  • @Thomas Kimber, thanks. The main issue is that the geometry of arr_2 is not simple. I just want to the locations where adjacent (to the surface passing arr_1) points of arr_2 are projected on that surface. Points of arr_1 are distributed gently and it is logical to consider a planar surface passing them. But for arr-2 it is not possible. – Ali_d Oct 02 '20 at 08:44
  • OK, so there's probably a way, similar to the finding of distance from a point to a line, where instead you could measure the distance from each point to the arr_1 plane. The shortest distance from any point onto that plane would be along the plane's normal vector (or its inverse) and you've got the starting point from which you're measuring, so you'd want to find where along that vector there was an intersection with the ax+by+cz+d formula for the plane arr_1 - that *should* just be a case of doing a little algebra. If I get time, I'll try and figure out how you might achieve that. – Thomas Kimber Oct 02 '20 at 09:32
  • @Thomas Kimber, I appreciate your contribution. The main point is that I just want to know the locations of the projection of adjacent point of the plane, on the plane. Like the uploaded fig, just coordinates will be OK. I mean, coordinate of the projection (shadow) of the points of arr_2 which are sandwiching the generated plane. – Ali_d Oct 02 '20 at 10:17