I am working on the classification of a 3D point cloud using several Python libraries (whitebox, PCL, PDAL). My goal is to classify the soil. The dataset has already been classified by a company, so I am using their classification as ground truth.

So far I am able to classify the soil: I declassified the dataset and ran a new classification with PDAL. Now I am at the stage of comparing the two datasets to assess the quality of my classification.

I wrote a script which takes the XYZ coordinates of the 2 sets, puts them in lists, and compares them one by one. However, the dataset contains around 5 million points, and at the beginning it takes about 1 minute per 5 points. After a few minutes everything crashes. Can anyone give me tips? Here is a picture of my clouds: the set on the left is the ground truth and the one on the right is the one I classified.

André Caceres
Taha
  • Welcome to GIS SE. Would it be possible to include the images within your question rather than providing a link? That way if the URL becomes inactive the images will remain on this site. – Borealis Feb 13 '20 at 13:05
  • I need at least 10 reputation before I can do that … so for the moment it's not possible – Taha Feb 13 '20 at 16:01
  • You should be at 10+ now. – Borealis Feb 13 '20 at 16:06

1 Answer

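Your problem is that you are not using any spatial data structure to speed up your point proximity queries. Comparing every point of one cloud against every point of the other means the number of distance computations grows with the product of the two cloud sizes, which is not feasible for about 5 million points. There are several structures that mitigate this, such as a k-d tree or an octree.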

With such a spatial structure, each nearest-point query only visits a small part of the cloud, so you can discard a large portion of unnecessary distance computations and improve the performance considerably.
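
As an illustration, here is a minimal sketch of the k-d tree approach in Python. It assumes SciPy is installed and that both clouds are already loaded into NumPy arrays, for example XYZ as an `(N, 3)` float array and one boolean "is ground" flag per point. The function name `compare_ground_labels`, the boolean masks, and the matching tolerance `tol` are hypothetical choices for this example, not something taken from your script, so adapt them to your data.

```python
import numpy as np
from scipy.spatial import cKDTree


def compare_ground_labels(ref_xyz, ref_is_ground, test_xyz, test_is_ground, tol=1e-3):
    """Match each test point to its nearest reference point and count agreements.

    ref_xyz, test_xyz: (N, 3) and (M, 3) arrays of coordinates.
    ref_is_ground, test_is_ground: boolean arrays of length N and M.
    tol: assumed maximum distance for two points to count as the same point.
    """
    ref_xyz = np.asarray(ref_xyz, dtype=float)
    test_xyz = np.asarray(test_xyz, dtype=float)
    ref_is_ground = np.asarray(ref_is_ground, dtype=bool)
    test_is_ground = np.asarray(test_is_ground, dtype=bool)

    # Build the tree once on the reference (ground truth) cloud.
    tree = cKDTree(ref_xyz)

    # One vectorised nearest-neighbour query for all test points.
    dist, idx = tree.query(test_xyz, k=1)

    # Keep only test points that have a reference point within the tolerance.
    matched = dist <= tol
    ref = ref_is_ground[idx[matched]]
    test = test_is_ground[matched]

    return {
        "matched": int(matched.sum()),
        "unmatched": int((~matched).sum()),
        "true_positive": int(np.sum(ref & test)),
        "false_positive": int(np.sum(~ref & test)),
        "false_negative": int(np.sum(ref & ~test)),
        "true_negative": int(np.sum(~ref & ~test)),
    }
```

The returned counts form a confusion matrix for the ground class, from which you can derive accuracy, precision, or recall. The tree is built once and all queries are issued in a single call, which avoids the Python-level loop over pairs of points that makes the list-based comparison so slow.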

André Caceres