1

I have 200 data points, each point is a list of 3 numbers that represents the position. I want to sample N=100 points from this 3D space, but with the constraint that the minimum distance between every two points must be larger than 0.15. The script below is the way I sample the points, but it just keeps running and never stops. Also, if I set a N larger than a value, the code cannot find all N points because I sample each point randomly and it gets to a point where no points can be sampled that isn't too close to the current points, but in reality the N can be much larger than this value if the point distribution is very "dense" (but still satisfies minimum distance larger than 0.15). Is there a more efficient way to do this?

import numpy as np
import random
import time

def get_random_points_not_too_close(points, npoints, min_distance):
    random.shuffle(points)
    final_points = [points[0]]
    while len(final_points) < npoints:
        for point in points:
            if point in final_points:
                continue
            elif min([np.linalg.norm(np.array(p) - np.array(point)) for p in final_points]) > min_distance:
                final_points.append(point)

    return final_points


data = [[random.random() for i in range(3)] for j in range(200)]
t1 = time.time()
sample_points = get_random_points_not_too_close(points=data, npoints=100, min_distance=0.15)
t2 = time.time()
print(t2-t1)
Shaun Han
  • 2,676
  • 2
  • 9
  • 29
  • Back of an envelope calculation here, but if you have a cubic space, an N of 100, and a minimum density of 0.15, then assuming an even distribution, the bounds of that space must be no greater than 4.641 on each side. If your points all exist within a smaller space than that, then it's not reasonable to expect to find regions where the nearest neighbour is > 0.15. As you increase the lengths of the sides of the cube, given an N of 100, and measuring the extents of your cube, you should be able to find a more reasonable minimum distance. – Thomas Kimber Sep 15 '20 at 16:30
  • I want to say that I generate a random 3D space just to use as an example. In reality I have 200 points that are not evenly distributed. – Shaun Han Sep 15 '20 at 16:34
  • The same calculations should apply - yes, your dataset ought to be more clustered than this perfectly distributed artificial example, but that should still set your expectations. What happens when you reduce your threshold of 0.15? Does it complete more quickly? – Thomas Kimber Sep 15 '20 at 16:45
  • Another angle would be to calculate all the nearest neighbour distances, and review how many of them meet your criteria - once you're happy there's a good chance of success, then you can revert back to this code that relies on the assumption of separation. – Thomas Kimber Sep 15 '20 at 16:47
  • Does this answer your question? [Generate random points in 3D space with minimum nearest-neighbor distance](https://stackoverflow.com/questions/47041784/generate-random-points-in-3d-space-with-minimum-nearest-neighbor-distance) – Peter O. Sep 15 '20 at 16:50
  • No, sampling N points in 3D space is completely different from generating points. – Shaun Han Sep 15 '20 at 17:22

1 Answers1

0

Your algorithm could work for small sets of points, but it does not run in a deterministic time.

I did the following to create a random forest (simulated trees): first generate a square grid of points where the grid point distance is 3x the minimal distance. Now you take each point in the regular grid and translate it in a random direction, at a random distance which is max your distance. The result points will never be closer than max your distance.

enter image description here

Goodies
  • 1,951
  • 21
  • 26