
I have code similar to this, which uses pipes to create the data frame "full_tb". It fails because the penultimate line (the mutate that produces an ID column) refers to an object ("full_tb") that hasn't been created yet.

library(random)
library(dplyr)  

set.seed(1)
Codes <- as.vector(randomStrings(n = 10, len = 3, digits = TRUE, upperalpha = FALSE,
         unique = TRUE))

frame1 <- data.frame(
  A = sort(Codes),
  B = sample(x = c("Tree", "Shrub", "Fern"), size = 10, replace = TRUE)
)

frame2 <- data.frame(
  Row_no = sort(sample(x = 1:10)),
  C = sample(x = sample(x = c("Tree", "Shrub", "Fern"), size = 30, replace = TRUE))
)

# Here is where the problem begins

full_tb <- frame1 %>% mutate(Row_no = as.numeric(rownames(frame1))) %>%  
           inner_join(frame2) %>%  
           mutate(ID = as.numeric(rownames(full_tb))) %>%  
           select(ID, A, B, C)

# Joining by = "Row_no"  
# Error in mutate_impl(.data, dots):   
# Evaluation error: object 'full_tb' not found   

However, if I split the pipes into two chunks, it runs ok.

full_tb <- frame1 %>% mutate(Row_no = as.numeric(rownames(frame1))) %>%  
           inner_join(frame2)

# Joining by = "Row_no"  

full_tb  <- full_tb %>% mutate(ID = as.numeric(rownames(full_tb))) %>%  
            select(ID, A, B, C)

Is there a workaround to pipe everything into one chunk without having to divide the first code block into two parts?

JJJ
Darius
    Please share sample of your data using `dput()` (not `str` or `head` or picture/screenshot) so others can help. See more here https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?rq=1 – Tung Sep 30 '18 at 00:58

1 Answer


Passing a dot as the argument of rownames in the piped mutate solves this. Inside a magrittr pipe, the dot stands for whatever the left-hand side produced, which here is the data frame after the join. The IDs are therefore generated for the whole joined data frame, and there is no need to name the data frame in rownames.

full_tb <- frame1 %>% mutate(Row_no = as.numeric(rownames(frame1))) %>%  
           inner_join(frame2) %>%  
           mutate(ID = as.numeric(rownames(.))) %>%  
           select(ID, A, B, C)
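
As a possible alternative, dplyr's row_number() returns the row index of the data frame currently flowing through the pipe, so neither rownames() nor the dot is needed. The sketch below is only an illustration: the small frame1 and frame2 are hypothetical stand-ins with fixed codes (so it runs without the random package), and the join key is spelled out with by = "Row_no".

```r
library(dplyr)

# Hypothetical stand-in data: fixed codes instead of randomStrings()
frame1 <- data.frame(
  A = c("a1x", "b2y", "c3z"),
  B = c("Tree", "Shrub", "Fern")
)

# Row_no 2 appears twice, so the join returns more rows than frame1 has
frame2 <- data.frame(
  Row_no = c(1L, 2L, 2L, 3L),
  C = c("Fern", "Tree", "Shrub", "Tree")
)

full_tb <- frame1 %>%
  mutate(Row_no = row_number()) %>%       # row index of frame1
  inner_join(frame2, by = "Row_no") %>%   # explicit key, no "Joining by" message
  mutate(ID = row_number()) %>%           # row index of the joined result
  select(ID, A, B, C)

print(full_tb)
```

Because row_number() is evaluated on the data at that step of the pipe, ID counts the rows of the joined result rather than the rows of frame1.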
Darius