I have a point cloud of 6 million x, y, z points to process. I need to look for specific points within these 6 million points, and I have been using the pandas df.isin() function to do it. I first load the 6 million points into a pandas DataFrame (named point_cloud), and I put the specific points I am looking for into a DataFrame as well (named specific_point). There are only two specific points to look for, so the df.isin() check should give 2 True values, but it gives 3 instead.
To confirm that 3 True values is wrong, I iterated through the 6 million points with iterrows(), looking for the two specific points. The result was indeed 2 True values. So why does df.isin() give 3 instead of the correct result of 2?
I have tried this, which results in true_count being 3:
# 1 where x, y and z each appear in the corresponding column of specific_point
label = (point_cloud['x'].isin(specific_point['x']) & point_cloud['y'].isin(specific_point['y']) & point_cloud['z'].isin(specific_point['z'])).astype(int).to_frame()
true_count = 0
for index, t_f in label.iterrows():
    # each row of label holds a single 0/1 value
    if int(t_f.iloc[0]) == 1:
        true_count += 1
print(true_count)
I have tried this as well, which also results in true_count being 3:
true_count = 0
# loop over the boolean mask directly instead of building a DataFrame first
for t_f in (point_cloud['x'].isin(specific_point['x']) & point_cloud['y'].isin(specific_point['y']) & point_cloud['z'].isin(specific_point['z'])).values:
    if t_f:
        true_count += 1
Lastly, I tried the most inefficient way, iterating through the 6 million points with iterrows(), and this gives the correct value for true_count, which is 2:
true_count = 0
# compare every specific point against every point in the cloud, row by row
for index_sp, sp in specific_point.iterrows():
    for index_pc, pc in point_cloud.iterrows():
        if sp['x'] == pc['x'] and sp['y'] == pc['y'] and sp['z'] == pc['z']:
            true_count += 1
print(true_count)
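For comparison, a vectorised version of the same row-wise check might look like the sketch below. It is only a sketch on small made-up data (the coordinates are assumptions, not my real points): it uses DataFrame.merge on all three columns, so a row counts only when its whole (x, y, z) triple equals one specific point.

```python
import pandas as pd

# Hypothetical toy data standing in for the real 6 million point cloud.
point_cloud = pd.DataFrame({
    "x": [1.0, 4.0, 1.0, 7.0],
    "y": [2.0, 5.0, 5.0, 8.0],
    "z": [3.0, 6.0, 3.0, 9.0],
})
specific_point = pd.DataFrame({
    "x": [1.0, 4.0],
    "y": [2.0, 5.0],
    "z": [3.0, 6.0],
})

# Inner merge on all three columns keeps only the rows of point_cloud whose
# complete (x, y, z) triple also appears as a row of specific_point.
# drop_duplicates() avoids counting a cloud row twice if a specific point repeats.
matches = point_cloud.merge(
    specific_point.drop_duplicates(), on=["x", "y", "z"], how="inner"
)
row_wise_count = len(matches)
print(row_wise_count)
```

On this toy data the merge counts the first two rows only, which agrees with what the nested iterrows() loop does.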
Does anyone know why df.isin() behaves this way? Or have I overlooked something?
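A toy reproduction with made-up coordinates (assumed values, not my real data) seems to show the same effect. The third row takes its x and z from one specific point and its y from the other, so each per-column isin() test passes even though the row equals neither specific point:

```python
import pandas as pd

# Hypothetical toy data; the third row mixes coordinates of both specific points.
point_cloud = pd.DataFrame({
    "x": [1.0, 4.0, 1.0, 7.0],
    "y": [2.0, 5.0, 5.0, 8.0],
    "z": [3.0, 6.0, 3.0, 9.0],
})
specific_point = pd.DataFrame({
    "x": [1.0, 4.0],
    "y": [2.0, 5.0],
    "z": [3.0, 6.0],
})

# Each isin() call tests one column on its own, so the combined mask is True
# whenever x, y and z each occur somewhere in specific_point, not necessarily
# together in the same row.
per_column = (
    point_cloud["x"].isin(specific_point["x"])
    & point_cloud["y"].isin(specific_point["y"])
    & point_cloud["z"].isin(specific_point["z"])
)
isin_count = int(per_column.sum())
print(isin_count)

# Index labels of the rows flagged by the per-column test.
flagged_rows = per_column[per_column].index.tolist()
print(flagged_rows)
```

If my real data contains a point like that third row, it might explain the extra True value, but I have not confirmed this on the full point cloud.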