0

I have a set of 500,000 spheres 3D coordinates along with their radii (non-unique) which are put in a pandas DataFrame object.

I would like to create an efficient search routine where I sample many points' coordinates, and the routine returns if each point is in a sphere, and if yes, in which sphere it is.

To do that, I read that using a cartesian search mesh can be the solution. I therefore created a mesh with a given cell size, and assigned ids to each sphere. In case it is useful, I also stored the differences between the center of the cell and the center of the sphere.

import pandas as pd
import numpy as np
size_cell = 20
fake_coordinates = np.uniform(-200,200, (500000, 3)) # Here spheres will cross, which should not be the case in the real input. If point in two spheres, choose the first one that comes.
data = pd.DataFrame(fake_coordinates, columns=['x','y','z'])
data['r'] = np.uniform(1,3, 500000)

x_vect = np.arange(data.x.min()-np.max(data.r), data.x.max()+np.max(data.r), size_cell)
y_vect = np.arange(data.y.min()-np.max(data.r), data.y.max()+np.max(data.r), size_cell)
z_vect = np.arange(data.z.min()-np.max(data.r), data.z.max()+np.max(data.r), size_cell)

data['i_x'] = ((data.x-x_vect[0])//size_cell).astype(int)
data['i_y'] = ((data.y-y_vect[0])//size_cell).astype(int)
data['i_z'] = ((data.z-z_vect[0])//size_cell).astype(int)

data['dx'] = data.x-(x_vect[data.i_x]+x_vect[data.i_x+1])/2
data['dy'] = data.y-(y_vect[data.i_y]+y_vect[data.i_y+1])/2
data['dz'] = data.z-(z_vect[data.i_z]+z_vect[data.i_z+1])/2

Then, I noticed that a sphere which has its center that belongs to a neighboring cell can cross its cell and therefore, the sampled point can be in this neighbor sphere. So I computed the crossing nature of each sphere.

data['crossing_x_left']  = data.dx < - (size_cell - data.r)
data['crossing_x_right'] = data.dx >   (size_cell - data.r)
data['crossing_y_left']  = data.dy < - (size_cell - data.r)
data['crossing_y_right'] = data.dy >   (size_cell - data.r)
data['crossing_z_down']  = data.dz < - (size_cell - data.r)
data['crossing_z_top']   = data.dz >   (size_cell - data.r)

Now I should sample N points:

sample = np.uniform(-200,200, (1000000, 3))

If I take into account only one point and only candidate spheres with the center within the cell:

i_x = (sample[0,0]-x_vect[0])//size_cell
i_y = (sample[0,1]-y_vect[0])//size_cell
i_z = (sample[0,2]-z_vect[0])//size_cell
ok_x = data['i_x']==i_x
ok_y = data['i_y']==i_y
ok_z = data['i_z']==i_z
candidates = data.loc[ok_x & ok_y & ok_z]

Now I want to test many points at the same time, but also take into account the crossing spheres. But then I run into a method issue. How to efficiently compute (probably meaning processing all points and all spheres at the same time using matrices) if the points belong to a sphere, and if yes, in which sphere they are? The part where spheres can cross the boundary makes it difficult for me to see.

yvrob
  • 105
  • 10
  • It is a little confusing that what you need. If you want to specify which points are placed in which spheres, why you get meshed the system in cubic? I think this meshing can not be helpful enough (you mentioned the meshing in the title too, is it a part of the needed code? if not, It was better to write the title more relevant without mentioning the meshing). IIUC, using `balltree` from scikit learn will get the main need that you need easily. Perhaps my ex question and answers will be helpful in this regard: https://stackoverflow.com/q/71104627/13394817 – Ali_Sh Jul 14 '22 at 14:33
  • Octrees with cartesian mesh, perhaps? – Severin Pappadeux Jul 14 '22 at 19:21

0 Answers0