-2

I have a problem using mutate; please see the code block below.

output1 <- mytibble %>% 
  mutate(newfield = FND(mytibble$ndoc)) 
output1

where FND is a function that applies a filter to a large file (5 GB):

FND <- function(n){
  result <- LARGETIBBLE %>% filter(LARGETIBBLE$id == n)
  return(paste(unique(result$somefield),collapse=" "))
}

I want to execute the FND function for each row of the output1 tibble, but it executes only once.

dan1st
  • 1
    You can remove the `LARGETIBBLE$`. The `paste` with `collapse` returns just a single string, which gets recycled. If you can show a small example of 10-15 rows and the expected output, it would be great. Also, `mytibble$ndoc` would be just `ndoc` – akrun May 02 '20 at 22:05
  • Akrun, I want to return a single string, but once for each row of output1. – YakovSingh May 02 '20 at 22:09
  • 2
    Would you be kind enough to provide a small example, or would you want me to do some guesswork for hours? – akrun May 02 '20 at 22:12
  • 2
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. We don't need your actual input data. Just something that would allow us to run and test the code. – MrFlick May 02 '20 at 22:13
  • I'll take care to include a reproducible example next time, sir. I found the answer: rowwise(). Thanks for your time. – YakovSingh May 02 '20 at 22:22
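
To illustrate the recycling akrun describes, here is a minimal sketch. The contents of LARGETIBBLE and mytibble are made up for the illustration (the question does not show any data); FND is copied from the question.

```r
library(dplyr)

# Made-up stand-ins for the real data (assumptions, not from the question)
LARGETIBBLE <- tibble(id = c(1, 1, 2, 3), somefield = c("a", "b", "c", "d"))
mytibble <- tibble(ndoc = c(1, 2))

# FND exactly as posted in the question
FND <- function(n){
  result <- LARGETIBBLE %>% filter(LARGETIBBLE$id == n)
  return(paste(unique(result$somefield), collapse = " "))
}

# The whole ndoc column goes into a single call, which returns one string
single_string <- FND(mytibble$ndoc)
length(single_string)

# mutate() then recycles that one string over every row
recycled <- mytibble %>% mutate(newfield = FND(mytibble$ndoc))
recycled
```

With this toy data every row ends up with the same newfield value, instead of one value per ndoc.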

2 Answers

1

Avoid using $ inside dplyr pipes; it is very rarely needed there. You can change your FND function to:

library(dplyr)

FND <- function(n){
   LARGETIBBLE %>% filter(id == n) %>% pull(somefield)  %>% 
                  unique %>% paste(collapse = " ")
}

Now apply this function to every ndoc value in mytibble.

mytibble %>% mutate(newfield = purrr::map_chr(ndoc, FND))

You can also use sapply:

mytibble$newfield <- sapply(mytibble$ndoc, FND)
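
As a quick check of both approaches, here is a small sketch on toy data. The contents of LARGETIBBLE and mytibble are invented for the example, and it assumes the purrr package is installed.

```r
library(dplyr)

# Invented stand-ins for the real data (assumptions for illustration only)
LARGETIBBLE <- tibble(id = c(1, 1, 2, 3), somefield = c("a", "b", "c", "d"))
mytibble <- tibble(ndoc = c(1, 2))

# FND takes a single id and returns one collapsed string
FND <- function(n){
  LARGETIBBLE %>% filter(id == n) %>% pull(somefield) %>%
    unique() %>% paste(collapse = " ")
}

# One FND call per element of ndoc, via purrr
with_map <- mytibble %>% mutate(newfield = purrr::map_chr(ndoc, FND))

# The same thing with base R
with_sapply <- sapply(mytibble$ndoc, FND)

with_map
```

Both versions should give "a b" for ndoc 1 and "c" for ndoc 2 here. Note that each call filters the whole of LARGETIBBLE, so on a 5 GB table this may be slow.
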
Ronak Shah
-1

The form FND(mytibble$ndoc) is more typical of base R data-frame code. When you use functions such as mutate on a tibble, there is no need to specify the name of the tibble, only that of the column. The %>% operator passes the tibble to mutate, which then looks up column names inside it. Thus your example would be:


output1 <- mytibble %>% 
  mutate(newfield = FND(ndoc)) 

FND <- function(n){
  result <- LARGETIBBLE %>% filter(id == n)
  return(paste(unique(result$somefield),collapse=" "))
}

That is the theory; however, I do not know whether your FND function will work. Try it, and if it does not, provide a practical example with data and describe what you are trying to achieve.

  • Thanks for the answer, I'll take note of your recommendations. In the end, I solved it using rowwise(). – YakovSingh May 02 '20 at 22:25
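
YakovSingh did not post the final code, so the following is only a guess at what the rowwise() solution might look like. The data is made up for the illustration.

```r
library(dplyr)

# Made-up stand-ins for the real data (assumptions, not from the thread)
LARGETIBBLE <- tibble(id = c(1, 1, 2, 3), somefield = c("a", "b", "c", "d"))
mytibble <- tibble(ndoc = c(1, 2))

FND <- function(n){
  result <- LARGETIBBLE %>% filter(id == n)
  return(paste(unique(result$somefield), collapse = " "))
}

# rowwise() makes mutate() evaluate FND once per row, with a single ndoc value
output1 <- mytibble %>%
  rowwise() %>%
  mutate(newfield = FND(ndoc)) %>%
  ungroup()   # drop the row-wise grouping afterwards
output1
```

On this toy data, newfield should be "a b" for the first row and "c" for the second.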