Generate random points in 3D space with minimum nearest-neighbor distance

Question

Similar questions:
Generating N random points with certain predefined distance between them

But they are either in matlab or does not fullfill the required task.

I have to create N number of points inside a box of length that the distance between any two points is larger than delta.

For example: Let's say I have a box of length 10 Angstrom on x,y,z axis.
I want to have 200 random points inside this box so that minimum distance between any two points is larger than 3 Angstrom.

Attempt:

#!python
# -*- coding: utf-8 -*-#
import numpy as np
np.random.seed(100)
np.set_printoptions(2)

box_length = 10
d = box_length
threshold = 6
num_points = 5
x1, y1, z1 = np.random.random(num_points)* box_length, np.random.random(num_points)* box_length, np.random.random(num_points)* box_length
x2, y2, z2 = np.random.random(num_points)* box_length, np.random.random(num_points)* box_length, np.random.random(num_points)* box_length

# print(len(x1))

# just for checking make ponts integers
for i in range(len(x1)):
    x1[i] = int(x1[i])
    x2[i] = int(x2[i])
    y1[i] = int(y1[i])
    y2[i] = int(y2[i])
    z1[i] = int(z1[i])
    z2[i] = int(z2[i])


print(x1)
print(y1)
print(z1)
print("\n")

pt1_lst = []
pt2_lst = []
for i in range(len(x1)):
    a, b, c    = x1[i], y1[i], z1[i]
    a2, b2, c2 = x2[i], y2[i], z2[i]
    dist2      = (a-a2)**2 + (b-b2)**2 + (c-c2)**2

    print("\n")
    print(a,b,c)
    print(a2,b2,c2)
    print(dist2)

    if dist2 > threshold**2:
        pt1 = (a,b,c)
        pt2 = (a2,b2,c2)
        pt1_lst.append(pt1)
        pt2_lst.append(pt2)



print("points")
print(pt1_lst)
print(pt2_lst)

Problem on Code: This code compares points from points1 to points2 but does not compare within itself inside points1 and points2.

There might be better algorithms to solve this problem and hats off to those guys who come with the brilliant idea to solve the problem.

Thanks.

PS: I did some research and try to find related links, however was unable to solve the problem. Still some related links are:

Update::

I attempted Stefans' code below, It works for N= 10 but I tried it for N = 200 and it is using extremely large time ( I stopped the code after 10 minutes).

Is there any efficient way of doing this?

Help will be truly appreciated!!

All paths of length L from node n using python
Create Random Points Inside Defined Rectangle with Python

You can use cdist in Scipy (https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html) with these steps: 1- Generate random points. 2- Use cdist. 3- Delete these points that are within a specified distance. — Mohamed Ibrahim, Nov 28 '20 at 02:33

score 1 · Answer 1 · answered Oct 31 '17 at 22:08

1

Common name for such distribution is Poisson spheres sampling. There is known O(n) which does this - please check here

answered Oct 31 '17 at 22:08

Severin Pappadeux

18,636
3
38
64

Thanks for the great info, It would be nicer if there is a working code to generate this kind of data, if you are aware of!!! – Nov 02 '17 at 17:22

Han-Kwang Nienhuys · Answer 2 · 2020-07-14T13:40:24.243

Several of the other solutions take n random points in the box, and discard the whole set if any pair is too close. This is not efficient; it's more effective to find the worst offending point and move it, as follows:

import numpy as np

def get_sphere_distribution(n, dmin, Ls, maxiter=1e4, allow_wall=True):
    """Get random points in a box with given dimensions and minimum separation.
    
    Parameters:
      
    - n: number of points
    - dmin: minimum distance
    - Ls: dimensions of box, shape (3,) array 
    - maxiter: maximum number of iterations.
    - allow_wall: whether to allow points on wall; 
       (if False: points need to keep distance dmin/2 from the walls.)
        
    Return:
        
    - ps: array (n, 3) of point positions, 
      with 0 <= ps[:, i] < Ls[i]
    - n_iter: number of iterations
    - dratio: average nearest-neighbor distance, divided by dmin.
    
    Note: with a fill density (sphere volume divided by box volume) above about
    0.53, it takes very long. (Random close-packed spheres have a fill density
    of 0.64).
    
    Author: Han-Kwang Nienhuys (2020)
    Copying: BSD, GPL, LGPL, CC-BY, CC-BY-SA
    See Stackoverflow: https://stackoverflow.com/a/62895898/6228891 
    """
    Ls = np.array(Ls).reshape(3)
    if not allow_wall:
        Ls -= dmin
    
    # filling factor; 0.64 is for random close-packed spheres
    # This is an estimate because close packing is complicated near the walls.
    # It doesn't work well for small L/dmin ratios.
    sphere_vol = np.pi/6*dmin**3
    box_vol = np.prod(Ls + 0.5*dmin)
    fill_dens = n*sphere_vol/box_vol
    if fill_dens > 0.64:
        msg = f'Too many to fit in the volume, density {fill_dens:.3g}>0.64'
        raise ValueError(msg)
    
    # initial try   
    ps = np.random.uniform(size=(n, 3)) * Ls
    
    # distance-squared matrix (diagonal is self-distance, don't count)
    dsq = ((ps - ps.reshape(n, 1, 3))**2).sum(axis=2)
    dsq[np.arange(n), np.arange(n)] = np.infty

    for iter_no in range(int(maxiter)):
        # find points that have too close neighbors
        close_counts = np.sum(dsq < dmin**2, axis=1)  # shape (n,)
        n_close = np.count_nonzero(close_counts)
        if n_close == 0:
            break
        
        # Move the one with the largest number of too-close neighbors
        imv = np.argmax(close_counts)
        
        # new positions
        newp = np.random.uniform(size=3)*Ls
        ps[imv]= newp
        
        # update distance matrix
        new_dsq_row = ((ps - newp.reshape(1, 3))**2).sum(axis=-1)
        dsq[imv, :] = dsq[:, imv] = new_dsq_row
        dsq[imv, imv] = np.inf
    else:
        raise RuntimeError(f'Failed after {iter_no+1} iterations.')

    if not allow_wall:
        ps += dmin/2
    
    dratio = (np.sqrt(dsq.min(axis=1))/dmin).mean()
    return ps, iter_no+1, dratio

I experimented with different strategies for deciding which ones to move (see also the edit history of this answer); it appears that moving one bad point with every iteration is much more efficient than trying to move multiple bad points at once. It makes the difference between converging up to fill density 0.25 versus 0.53.

Benchmarking below for dmin=1 and box size LxLxL; 3.33 corresponds to the size in the question. The highest n for each box size is the last one that converges within 3e+4 iterations. The column d_nn is the average nearest-neighbor distance.

        L    n  n_iter  d_nn  density
0    3.33   10       9  1.39    0.123
3    3.33   20      40  1.16    0.246
7    3.33   40    5510  1.06    0.493
8    3.33   43    2591  1.03    0.530

9    5.70   10       2  1.73    0.033
14   5.70   60      45  1.22    0.199
16   5.70  150    4331  1.05    0.499
17   5.70  165   25690  1.06    0.549

18  10.00   40       3  1.97    0.030
22  10.00   70      10  1.71    0.053
23  10.00  150      47  1.40    0.113
24  10.00  300     276  1.19    0.225
25  10.00  600    4808  1.07    0.451
27  10.00  720   26418  1.05    0.541

The density value is the density for spheres of diameter dmin (approximated to correct for wall effects). The limit for random packed spheres is about 0.64; this algorithm clearly doesn't do as well as throwing marbles in a box.

Note that the question asked for n=200 in an L=3.33*dmin box, which would be a packing density of 1.84 and which would never fit.

For comparison, the benchmark for an algorithm which discards the entire trial solution if there is any close pair:

        L   n  n_iter  d_nn  density
0    3.33  10      52  1.38    0.123
1    3.33  14     224  1.25    0.172
2    3.33  15   24553  1.15    0.185

3    5.70  10       2  2.02    0.033
4    5.70  20      93  1.42    0.066
5    5.70  29    2089  1.36    0.096
6    5.70  30   10886  1.33    0.100

7   10.00  40      10  2.05    0.030
8   10.00  60    1120  1.79    0.045
9   10.00  68    5521  1.66    0.051
11  10.00  70   28545  1.64    0.053

This algorithm takes many more iterations and moreover the iterations themselves ae slower because all points and all distances need to be regenerated and evaluated with every iteration.

Stefan Pochmann · Answer 3 · 2017-10-31T18:48:26.633

Let's say I have a box of length 10 Angstrom on x,y,z axis. I want to have 10 random points inside this box so that minimum distance between any two points is larger than 3 Angstrom.

I think this works, repeatedly generating ten random points in that box until the distances are all large enough:

>>> import numpy as np
>>> from itertools import combinations

>>> while True:
        P = np.random.rand(10, 3) * 10
        if all(np.linalg.norm(p - q) > 3
               for p, q in combinations(P, 2)):
            break

>>> P
array([[ 9.02322366,  6.13576854,  3.1745708 ],
       [ 6.48005836,  7.5280536 ,  4.66442095],
       [ 5.78306167,  1.83922896,  9.48337683],
       [ 0.70507032,  0.20737532,  5.31191608],
       [ 3.71977864,  6.40278939,  3.81742814],
       [ 0.03938102,  6.7705456 ,  6.28841217],
       [ 3.27845597,  2.98811665,  4.81792286],
       [ 7.74422021,  9.30027671,  8.1770998 ],
       [ 0.28544716,  0.35155801,  9.77847352],
       [ 4.84536373,  4.21378476,  0.4456017 ]])

Takes about 50 attempts to find a good set of points. Here I try 1000 times, 20 times it was good:

>>> sum(all(np.linalg.norm(p - q) > 3
            for p, q in combinations(np.random.rand(10, 3) * 10, 2))
        for _ in range(1000))
20

Could not solve it using the KDtree. For a larger number of points you could use `scipy.spatial.distance.pdist` and `scipy.spatial.distance.squareform` to speed up the calculation. — Joe, Oct 31 '17 at 19:00
@Joe Yeah, for larger numbers that could be better. Though they compute all distances or the minimum distance, right? I did that at first as well, but then realized that it's much faster to stop and retry as soon as I find a pair with a too small distance. That's an advantage of this self-made brute-force checking. — Stefan Pochmann, Oct 31 '17 at 19:10
@StefanPochmann, This is so good for N=10 but when I tried to run this algorithm for N=200 It takes too much time (may be hours) !!! — , Nov 02 '17 at 17:07

Jommy · Answer 4 · 2020-07-14T11:10:12.447

A Latin Hypercube will divide the domain in cells, randomly choose a set of them, and put a random point in each. In some unlucky cases, some points might get too close to each other, but we can check the distance and possibly generate a new case.

Here is an implementation for a 2D domain based on OpenTURNS. The distance is checked for all points, but in more complex examples kd-tree might be more efficient. The code for plotting is taken from here.

"""
Generate random points in a rectangle, with a minimum distance.
"""

# %% Import.

import openturns as ot
import numpy as np
from scipy.spatial.distance import pdist
import matplotlib.pyplot as plt
import matplotlib.animation as animation


# %% Generate points.

# Boundaries.
x_min, x_max = -15, +15
y_min, y_max = -10, +10

# Number of points per case.
n_points = 8

# Number of cases to generate.
n_cases = 50

# Minimum distance.
min_dist = 4

# Uniform distribution.
distribution = ot.ComposedDistribution([ot.Uniform(x_min, x_max),
                                        ot.Uniform(y_min, y_max)])

# Initialize.
ot.RandomGenerator.SetSeed(0)
p_all = np.zeros((n_cases, n_points, 2))
i = 0

# Generate the points.
while i < n_cases:
    # Create a new Latin Hypercube experiment.
    experiment = ot.LHSExperiment(distribution, n_points, False, True)
    # Generate the points.
    dat = np.array(experiment.generate())
    # Check the distance.
    dist = np.min(pdist(dat))
    if dist >= min_dist:
        p_all[i, :, :] = dat
        i += 1
        print('Case {} / {}'.format(i, n_cases))
    else:
        print('Case discarded, min dist = {:.3f}'.format(dist))


# %% Plot.

def init():
    pathcol.set_offsets([[], []])
    return [pathcol]


def update(i, pathcol, points):
    pathcol.set_offsets(points[i])
    return [pathcol]


fig, ax = plt.subplots()
ax.grid()
ax.set_aspect('equal')
ax.set_xlim(1.1 * x_min, 1.1 * x_max)
ax.set_ylim(1.1 * y_min, 1.1 * y_max)
ax.plot([x_min, x_max, x_max, x_min, x_min],
        [y_min, y_min, y_max, y_max, y_min],
        color=(0.3, 0.3, 0.3))

pathcol = ax.scatter([], [])
anim = animation.FuncAnimation(
    fig, update, init_func=init, fargs=(pathcol, p_all), interval=1000, frames=n_cases,
    blit=True, repeat=False)
plt.show()

For a 3D domain we can write:

# Boundaries.
z_min, z_max = -20, +20

# Uniform distribution.
distribution = ot.ComposedDistribution([ot.Uniform(x_min, x_max),
                                        ot.Uniform(y_min, y_max),
                                        ot.Uniform(z_min, z_max)])

# Initialize.
p_all = np.zeros((n_cases, n_points, 3))

A translation in MATLAB should be straightforward using lhsdesign and pdist.

As noticed by Han-Kwang Nienhuys, in some cases there might be performance issues, and in that case other solutions should be sought.

Interesting, but the question is about placing points in 3D. If I adapt your code with a z dimension (equal size as y) and set `n_points=40`, it runs like forever even though the total excluded volume of the points is only 20% of what's available. — Han-Kwang Nienhuys, Jul 14 '20 at 09:46
Thanks for noticing! Indeed, any solution based on a create and discard strategy will at some point face performance issues. — Jommy, Jul 14 '20 at 10:51

score 0 · Answer 5 · answered Nov 22 '21 at 04:57

0

I solved a similar problem by generating a random point in a grid, removing this point and neighbors from the random range of options and choosing another point.

answered Nov 22 '21 at 04:57

Matheus MN

1

1

To convince others, show your code. – pjs Nov 22 '21 at 05:14
As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Nov 22 '21 at 06:43
This does not provide an answer to the question. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](https://stackoverflow.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). - [From Review](/review/late-answers/30401244) – ZF007 Nov 22 '21 at 14:12

Generate random points in 3D space with minimum nearest-neighbor distance

5 Answers5

Linked