Is it possible to call a custom function inside of DPLYR mutate()?

Question

calculated delta

The aim of the code is to compare sets of pixel coordinates from the data frames of three companies, compute the difference between them, and flag a match when that difference is within the buffer. I realised that I could check every possible combination across the three with conditional statements, but after many attempts to separate this out into a function I can't get my head around it. I would really appreciate some insight into this challenge :)

```
library(dplyr)
library(tidyr)

#--------------------------------------------------------------------
# function
#--------------------------------------------------------------------
## A function to calculate the (average) difference between
## like pixel coordinates for X and Y. It does this by
## using conditional statements on data frame columns; when conditions
## are met it performs a calculation on them to get the difference.

Delta <- function(a, b, c){

  Calc1 <- function(i,j,k){
    one  <- round(abs(((i - j) + (i - k) + (j - k))/3))
  }

  Calc2 <- function(ii,jj){
    two <- round(abs(ii - jj))
  }

  if (!is.na(a >= 0 & b >= 0 & c >= 0)){

    diff <- Calc1(a,b,c)

  } else if (!is.na(a >= 0 & b >= 0) & c == "NA"){

    diff <- Calc2(a,b)

  } else if (!is.na(a >= 0 & c >= 0) & b == "NA"){

    diff<- Calc2(a,c)

  } else {

    diff <- Calc2(b,c)
  }
}

buffer = 20

#--------------------------------------------------------------------
# creating the data set
#--------------------------------------------------------------------

## where jpg is the file name
## pixel X & Y are coordinates of the image
## and .r, .p, .t are the attached company suffixes

n <- 100

jpg <- paste0(sample(c(2000000:2045888), n, rep=TRUE),'.jpg')

pixelX.r <- sample(c(0:15000, 100), n, rep=TRUE)
pixelX.p <- sample(c(0:15000, 100), n, rep=TRUE)
pixelX.t <- sample(c(0:15000, 100), n, rep=TRUE)

pixelY.r <- sample(c(0:15000, 100), n, rep=TRUE)
pixelY.p <- sample(c(0:15000, 100), n, rep=TRUE)
pixelY.t <- sample(c(0:15000, 100), n, rep=TRUE)

df <- data.frame(pixelX.r, pixelY.r, pixelX.p, pixelY.p, pixelY.t, pixelX.t)

jpg.df <- data.frame(jpg)

## set roughly 10% of each column to NA
df <- apply (df, 2, function(x) {x[sample( c(1:n), floor(n/10))] <- NA; x} )

pixel.comparison <- df %>%
  bind_cols(jpg.df, id=NULL) %>%
  select('jpg', 'pixelX.r', 'pixelY.r', 'pixelX.p', 'pixelY.p', 'pixelY.t', 'pixelX.t') %>%
  rowwise() %>%
  mutate(delta.X = Delta(pixelX.p, pixelX.r, pixelX.t),
         delta.Y = Delta(pixelY.p, pixelY.r, pixelY.t),
         X.Match = if_else((delta.X <= buffer), 1, 0),
         Y.Match = if_else((delta.Y <= buffer), 1, 0)) %>%
  distinct()
```


This is the error message I keep getting
-------------------------------------------
```
Error in `mutate()`:
! Problem while computing `delta.X = Delta(PixelX,
  PixelX.r, PixelX.t)`.
i The error occurred in row 1.
Caused by error in `if (!is.na(a >= 0 & c >= 0) & b == "NA") ...`:
! missing value where TRUE/FALSE needed


#----------------------------------------------------------
# this was the original bit of my code before I started playing with function
#----------------------------------------------------------
#delta.X = round(abs(((PixelX.p - PixelX.r) + (PixelX.p-PixelX.t) + (PixelY.r-PixelY.t))/3)),
#delta.Y = round(abs(((PixelY.p - PixelY.r) + (PixelY.p-PixelY.t) + (PixelY.r-PixelY.t))/3)),
```

It is better to show a small reproducible example for testing — akrun, Mar 03 '22 at 16:36
Haven't gone through your code in detail, but you most likely want to change `b == "NA"` and `c == "NA"` to `is.na(b)` and `is.na(c)`. `b == "NA"` is testing whether `b` equals the character string "NA", rather than testing if it's a missing / `NA` value. — zephryl, Mar 03 '22 at 16:41
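
To illustrate the distinction drawn in the comment above, here is a minimal base R sketch. The variable `b` is only an illustrative missing value, not data from the question.

```r
b <- NA_real_

# Comparing a missing value with the string "NA" yields NA, not TRUE or FALSE
cmp <- b == "NA"

# is.na() returns a proper TRUE/FALSE that `if` can work with
miss <- is.na(b)

# Handing an NA condition to `if` raises the error quoted in the question
msg <- tryCatch(
  if (b == "NA") "matched" else "not matched",
  error = function(e) conditionMessage(e)
)

print(cmp)   # NA
print(miss)  # TRUE
print(msg)   # "missing value where TRUE/FALSE needed"
```

Because `TRUE & NA` is also `NA`, a condition such as `!is.na(a >= 0 & c >= 0) & b == "NA"` fails in the same way whenever `b` is missing.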
[See here](https://stackoverflow.com/q/5963269/5325862) on making a reproducible example that is easier for folks to help with. That includes a sample of your data, ideally as it is at the point of having the issue, i.e. if the question isn't about merging, we don't need that part, just the merged data. Have you verified that the function works on a vector *outside* of `mutate`? — camille, Mar 03 '22 at 16:49
Thank you for your suggestions; I really appreciate the support. I have edited the script to produce sample code. And yes, I originally had `is.na(b)` / `is.na(c)`; I thought both forms tested for a character string of "NA" and produced a binary 1 or 0, my bad. I haven't yet tested outside of `mutate()`; I wouldn't know the best way to go about this without seeing a column being added to the mutated data frame :) — MrChester-Morris, Mar 03 '22 at 18:33
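
One possible way to test such a function outside of `mutate()` is to call it directly on single values, then on plain vectors with `mapply()`. The sketch below is not the asker's final code: `delta_fixed` is a hypothetical name, the formulas are copied unchanged from the question, and returning `NA` when fewer than two values are present is an assumption.

```r
# Hypothetical corrected version of Delta(); formulas copied from the question
delta_fixed <- function(a, b, c) {
  n_missing <- sum(is.na(c(a, b, c)))
  if (n_missing == 0) {
    round(abs(((a - b) + (a - c) + (b - c)) / 3))
  } else if (n_missing > 1) {
    NA_real_               # assumption: nothing to compare with fewer than two values
  } else if (is.na(c)) {
    round(abs(a - b))
  } else if (is.na(b)) {
    round(abs(a - c))
  } else {
    round(abs(b - c))      # only `a` is missing
  }
}

# Direct calls on single values, one per branch
print(delta_fixed(10, 14, 19))
print(delta_fixed(10, 14, NA))

# Row-by-row over plain vectors, no data frame needed
x_p <- c(10, 10, NA, NA)
x_r <- c(14, 14, 14, NA)
x_t <- c(19, NA, 19, 5)
res <- mapply(delta_fixed, x_p, x_r, x_t)
print(res)
```

Once each branch returns what is expected here, the same function can be used inside `rowwise() %>% mutate(...)`.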
I managed to get it working on the sample code and the main script. It hasn't worked out completely as intended, though, as it produces multiple duplicate entries/rows; the conditions must be being met multiple times. Thank you all — MrChester-Morris, Mar 03 '22 at 19:15
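
For reference, one way to call a custom function inside `mutate()` without `rowwise()` is to make the function itself vectorised, for example with `dplyr::case_when()`. This is a sketch under stated assumptions rather than the asker's working code: `delta_vec` and `toy` are hypothetical names, the formulas are copied from the question, and rows with fewer than two values are assumed to yield `NA`.

```r
library(dplyr)

# Hypothetical vectorised variant: operates on whole columns at once
delta_vec <- function(a, b, c) {
  # coerce to double so every branch of case_when() has the same type
  a <- as.numeric(a); b <- as.numeric(b); c <- as.numeric(c)
  case_when(
    !is.na(a) & !is.na(b) & !is.na(c) ~ round(abs(((a - b) + (a - c) + (b - c)) / 3)),
    !is.na(a) & !is.na(b)             ~ round(abs(a - b)),
    !is.na(a) & !is.na(c)             ~ round(abs(a - c)),
    !is.na(b) & !is.na(c)             ~ round(abs(b - c)),
    TRUE                              ~ NA_real_  # assumption: fewer than two values
  )
}

buffer <- 20

# Small made-up data set covering several branches
toy <- data.frame(
  pixelX.p = c(100, 100, NA, 500),
  pixelX.r = c(110, 130, 300, NA),
  pixelX.t = c(105, NA, 310, NA)
)

result <- toy %>%
  mutate(
    delta.X = delta_vec(pixelX.p, pixelX.r, pixelX.t),
    X.Match = if_else(delta.X <= buffer, 1, 0)
  )

print(result)
```

With this toy data, `delta.X` comes out as 3, 30, 10, `NA` and `X.Match` as 1, 0, 1, `NA`.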
