In R, I have a point cloud (X, Y, Z, thickness). For each point, I would like to calculate the standard deviation of the thickness over that point and its close neighbours.

My first attempt is:

library(dplyr)

CubeSize <- 50

# Pre-allocate the result column
Data_Ext$SD <- NA_real_

# Loop over every point (nrow, not length, which would count columns)
for (i in seq_len(nrow(Data_Ext))) {

  # Build a data.frame of all points inside the cube centred on point i
  CarreMobile <- Data_Ext %>%
    filter(X < Data_Ext$X[i] + CubeSize) %>%
    filter(X > Data_Ext$X[i] - CubeSize) %>%

    filter(Y < Data_Ext$Y[i] + CubeSize) %>%
    filter(Y > Data_Ext$Y[i] - CubeSize) %>%

    filter(Z < Data_Ext$Z[i] + CubeSize) %>%
    filter(Z > Data_Ext$Z[i] - CubeSize)

  # Standard deviation of the thickness inside the cube
  Data_Ext$SD[i] <- sd(CarreMobile$Th)
}

It does the job, but I have 2 million points, so the computation takes about 2 days. So I tried this:

CubeSize = 50

Data_Ext$Xmax <- Data_Ext$X + CubeSize
Data_Ext$Xmin <- Data_Ext$X - CubeSize

Data_Ext$Ymax <- Data_Ext$Y + CubeSize
Data_Ext$Ymin <- Data_Ext$Y - CubeSize

Data_Ext$Zmax <- Data_Ext$Z + CubeSize
Data_Ext$Zmin <- Data_Ext$Z - CubeSize

Data_Ext<- Data_Ext %>% 
  mutate(SD = sd(Data_Ext %>% 
                      filter(X < Xmax) %>% 
                      filter(X > Xmin) %>%
                      
                      filter(Y < Ymax) %>% 
                      filter(Y > Ymin) %>% 
                      
                      filter(Z < Zmax) %>% 
                      filter(Z > Zmin)
                )
         )

But this one doesn't work. Do you have a solution that avoids the for loop?
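
One possible way to cut the cost while keeping the same result is to sort the points by X once, use `findInterval()` to get the slice of points whose X already lies inside the cube, and test Y and Z only on that slice. This is a sketch in base R, not a benchmarked solution: the helper name `neighbour_sd` and its arguments are hypothetical, the column names (`X`, `Y`, `Z`, `Th`) follow the sample data, and the speed-up depends on how small the cube is relative to the X extent of the cloud.

```r
# Hypothetical helper: SD of Th over all points strictly inside the cube of
# half-width cube_size centred on each point (same test as the loop version).
neighbour_sd <- function(df, cube_size) {
  ord <- order(df$X)                       # sort once by X
  x <- df$X[ord]; y <- df$Y[ord]; z <- df$Z[ord]; th <- df$Th[ord]

  # Index range (in sorted order) of points with x - cube_size < X < x + cube_size
  lo <- findInterval(x - cube_size, x) + 1L
  hi <- findInterval(x + cube_size, x, left.open = TRUE)

  out <- vapply(seq_along(x), function(i) {
    idx  <- lo[i]:hi[i]                    # candidates already inside the X window
    keep <- y[idx] < y[i] + cube_size & y[idx] > y[i] - cube_size &
            z[idx] < z[i] + cube_size & z[idx] > z[i] - cube_size
    sd(th[idx][keep])                      # NA when the cube holds only the point itself
  }, numeric(1))

  res <- numeric(length(out))
  res[ord] <- out                          # restore the original row order
  res
}

# Same sample data as in the question, built as a plain data.frame
set.seed(1)
Data_Ext <- data.frame(
  X  = sample(900:1000, 100, replace = TRUE),
  Y  = sample(900:1000, 100, replace = TRUE),
  Z  = sample(90:100, 100, replace = TRUE),
  Th = sample(10:20, 100, replace = TRUE)
)

Data_Ext$SD <- neighbour_sd(Data_Ext, cube_size = 50)
head(Data_Ext)
```

The per-point work still exists, but each step only touches the points inside the X window instead of filtering the whole table six times.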


Data

library(dplyr)  # re-exports tibble(); data_frame() is deprecated

set.seed(1)

Data_Ext <- tibble(
  X  = sample(900:1000, 100, replace = TRUE),
  Y  = sample(900:1000, 100, replace = TRUE),
  Z  = sample(90:100, 100, replace = TRUE),
  Th = sample(10:20, 100, replace = TRUE)
)
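
Since speed is the concern, a non-equi self-join in `data.table` is one option that avoids an explicit loop. The sketch below assumes the `data.table` package is installed and rebuilds the sample data as a `data.table` named `DT` (a name chosen here for illustration). The bounds columns mirror the `Xmin`/`Xmax` idea from the second attempt. In that attempt, `filter(X < Xmax)` compares each row with its own bounds, so it is always true and never looks at the neighbours. This has not been benchmarked on 2 million points, and memory use grows with the number of neighbours per cube.

```r
library(data.table)

set.seed(1)
DT <- data.table(
  X  = sample(900:1000, 100, replace = TRUE),
  Y  = sample(900:1000, 100, replace = TRUE),
  Z  = sample(90:100, 100, replace = TRUE),
  Th = sample(10:20, 100, replace = TRUE)
)

# Integer here so the bounds keep the same type as the integer sample coordinates
CubeSize <- 50L

# Bounds of the cube centred on each point
DT[, `:=`(Xmin = X - CubeSize, Xmax = X + CubeSize,
          Ymin = Y - CubeSize, Ymax = Y + CubeSize,
          Zmin = Z - CubeSize, Zmax = Z + CubeSize)]

# Non-equi self-join: for each row of the inner DT (the cube bounds), collect the
# rows of the outer DT that fall strictly inside the cube and take sd() of their Th.
# by = .EACHI evaluates j once per row of the inner DT, in its row order.
res <- DT[DT,
          on = .(X > Xmin, X < Xmax,
                 Y > Ymin, Y < Ymax,
                 Z > Zmin, Z < Zmax),
          .(SD = sd(x.Th)),
          by = .EACHI]

DT[, SD := res$SD]
head(DT[, .(X, Y, Z, Th, SD)])
```

Each point always matches itself, so every row gets a result; `SD` is `NA` only when a cube contains no other point.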
Tobie
  • Please provide sample data. Is there a particular reason you're using `dplyr`? If you're concerned about speed, I would think `matrix` and/or `data.table`. – r2evans Oct 05 '21 at 18:42
  • I already use `dplyr` before this point, and switching from a data.frame to a matrix is resource intensive. – Tobie Oct 06 '21 at 19:46
  • Please provide sample data, see https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Oct 06 '21 at 21:58
  • My data looks like this (with many more rows): `DataRandom <- data_frame(X = sample(900:1000, 100, replace = TRUE), Y = sample(900:1000, 100, replace = TRUE), Z = sample(90:100, 100, replace = TRUE), Th = sample(10:20, 100, replace = TRUE) ) ` – Tobie Oct 07 '21 at 19:49
  • I added that sample data to your question, since (1) comments are not great for code, (2) some readers may skim/skip comments, and (3) when there are more comments, the Stack interface can *hide* comments. It's best to keep all relevant code, errors, messages, etc. within the question itself. Having said that, a few things: (1) with random data, please use `set.seed` so that we all have the same data; (2) please include your expected output given this sample data; (3) please bring your code into alignment with this sample data (e.g., variable names). Thank you. – r2evans Oct 07 '21 at 19:52
