How to bind multiple data frame output from a function without a for loop in R

Question

I have the following data frame

I have a function that takes the above data frame and a year value, performs a regression (V1 against ID), and returns a data frame containing the fitted coefficients for each ID for that year:

ID Coeff
 1   4  
 2   1  
 3   2  
 .....

I would like to run the above function for a set of year values, extract the ID and its corresponding fitted coefficients for that year, and bind them into a data frame:

Year ID Coeff 
2000 1  4 
2000 2  1 
2000 3  2  
2001 1  3  
2001 2  1  
2001 3  5  
.....

I can do the above with a for loop but I'm wondering if there's a better alternative (using dplyr or something else).

Edit:

data(iris)
set.seed(2)
iris$Sepal.Length <- as.factor(iris$Sepal.Length)
iris$Sepal.Width <- as.factor(iris$Sepal.Width)
iris$Random <- sample(0:1, size = nrow(iris), replace = TRUE)

fit_function <- function(df, Species) {
    fit <- glm(Random ~ -1+Sepal.Length + Sepal.Width, 
           data = df[df$Species == Species,], 
           family = "binomial")
    final_df <- data.frame(Species = Species, Name = names(coef(fit)), Coef = unname(coef(fit)))
    return(final_df)
}

all <- c()

for (i in unique(as.character(iris$Species))) {
    all <- rbind(all, fit_function(iris, i))    
}

I don't have a list of data frames. I'm calling a function that returns a data frame repeatedly inside a for loop, and right now I call rbind at each iteration to bind the accumulated data frame with the new one returned at that iteration, which is very inefficient. — Yandle, Mar 14 '19 at 22:18
I've only used lapply to apply a function over the columns of a data frame. How do I use lapply over multiple subsets of a data frame (grouped, in my example, by Species of iris)? — Yandle, Mar 14 '19 at 23:57
No, you're right. I totally misunderstood your problem. Take a look at this question: [Use dplyr's group_by to perform split-apply-combine](https://stackoverflow.com/questions/26664644/use-dplyrs-group-by-to-perform-split-apply-combine). I know I keep throwing duplicates at you, but `iris %>% group_by(Species) %>% do(fit_function(.))` replicates your for loop results (just remove mentions of `Species` from `fit_function`, since the `group_by` takes care of that). — divibisan, Mar 15 '19 at 00:26
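
A minimal base-R sketch of the lapply idea asked about in the comments, in case it helps: build a list with one data frame per species, then call rbind once at the end instead of growing the result inside the loop (each rbind inside a loop copies everything accumulated so far). `fit_one` is a hypothetical, simplified stand-in for `fit_function` (a plain lm on the unmodified iris columns) so the snippet runs on its own. Swapping in the real function should work the same way, assuming it returns a data frame with the same columns on every call.

```r
# Hypothetical stand-in for the question's fit_function:
# returns a small data frame of coefficients for one species.
fit_one <- function(df, species) {
  fit <- lm(Sepal.Length ~ Sepal.Width, data = df[df$Species == species, ])
  data.frame(Species = species,
             Name = names(coef(fit)),
             Coef = unname(coef(fit)))
}

species_levels <- unique(as.character(iris$Species))

# Collect every result in a list first ...
pieces <- lapply(species_levels, function(s) fit_one(iris, s))

# ... then bind all pieces in a single rbind call.
all_coefs <- do.call(rbind, pieces)
```

`all_coefs` ends up with one row per coefficient per species, in the same layout the for loop produces.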

Sandy · Answer 1 · 2019-03-15T00:16:34.153

0

You could try SQL within R through the sqldf package (which uses SQLite by default, not MySQL). Let's say your first data frame is df1 and your second data frame is df2. Then you could try:

# Load the necessary package (install it first with install.packages("sqldf") if needed)
library(sqldf)

sqldf('SELECT Year, df1.ID, Coeff
       FROM df1 JOIN df2
       ON df1.ID = df2.ID')

Since ID is common to both data frames, you always need to specify which data frame's ID you mean (here, df1.ID).
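
For comparison, a base-R sketch of the same join using merge(). The contents of df1 and df2 below are made up purely for illustration (the answer does not show them); the assumption is that df1 holds Year and ID and df2 holds ID and Coeff.

```r
# Hypothetical toy inputs; the real df1 and df2 are not shown in the answer.
df1 <- data.frame(Year = c(2000, 2000, 2001), ID = c(1, 2, 1))
df2 <- data.frame(ID = c(1, 2), Coeff = c(4, 1))

# Inner join on the shared ID column, like the SQL JOIN ... ON df1.ID = df2.ID.
joined <- merge(df1, df2, by = "ID")

# Reorder columns to match the SELECT list: Year, ID, Coeff.
joined <- joined[, c("Year", "ID", "Coeff")]
```

Because merge() joins on the named `by` column, there is no ambiguity about which ID is meant.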

edited Mar 15 '19 at 00:16

answered Mar 13 '19 at 22:01


score 0 · Answer 2 · answered Mar 13 '19 at 22:17

I don't really understand the logistics of your question, and without workable data or your code so far it's impossible to know exactly what you're asking. In the future, it's polite to include a sample of your data using dput() and to show the code you have so far. This is how I would go about solving your problem given the information you have posted:

library(tidyverse)

dat <- tribble(~"Year", ~"ID", ~"V1", 
        2000, 1,  4, 
        2000, 2,  1, 
        2000, 3,  2,  
        2001, 1,  3,  
        2001, 2,  1,  
        2001, 3,  5)

dat %>% 
  group_split(Year) %>% 
  map_df(~lm(V1 ~ as.factor(ID), data = .x) %>% 
        broom::tidy() %>% 
        select(term, estimate) %>% 
        mutate(YEAR = unique(.x$Year)))
#> # A tibble: 6 x 3
#>   term           estimate  YEAR
#>   <chr>             <dbl> <dbl>
#> 1 (Intercept)        4.    2000
#> 2 as.factor(ID)2    -3.    2000
#> 3 as.factor(ID)3    -2.    2000
#> 4 (Intercept)        3.    2001
#> 5 as.factor(ID)2    -2.    2001
#> 6 as.factor(ID)3     2.00  2001

Created on 2019-03-13 by the reprex package (v0.2.1)

My apologies, I have added sample code to my question above. My main goal is to get rid of the for loop at the end. — Yandle, Mar 14 '19 at 22:20
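
One way to drop that loop, along the lines of the split-apply-combine suggestions in this thread, sketched in base R under stated assumptions: split() hands each species' rows to a helper that no longer does its own subsetting, and the pieces are bound once at the end. `coef_table` is a hypothetical helper that uses a simplified lm in place of the question's glm, so the snippet is self-contained.

```r
# Split once by group; each list element holds the rows for one species.
pieces <- split(iris, iris$Species)

# Hypothetical helper: fit on one piece, with no subsetting inside.
coef_table <- function(piece) {
  fit <- lm(Sepal.Length ~ Sepal.Width, data = piece)
  data.frame(Name = names(coef(fit)), Coef = unname(coef(fit)))
}

fits <- lapply(pieces, coef_table)

# Attach the group label (taken from the list names) to each table,
# then bind everything in a single rbind call.
labelled <- Map(function(tab, sp) data.frame(Species = sp, tab), fits, names(fits))
result <- do.call(rbind, labelled)
rownames(result) <- NULL
```

The group label travels with the list names, which is the base-R counterpart of letting `group_by` or `group_split` handle the grouping.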
