
Suppose I have a data frame called pop, and I wish to split it by a categorical variable called replicate, which has 110 categories. I want to run an analysis on each resulting data frame and then combine the outputs into a new data frame. In other words, for replicate i, I want to create data frame i, fit a logistic regression on it, and save beta 0. All the beta 0 values should then be combined into one table covering replicates 1-110. I know that's a mouthful, but thanks in advance.

smci
wild west
  • *"I want to group pop by replicate, then summarize by the beta 0 coefficient of logistic regression on that group"* – smci May 21 '18 at 09:29
  • Can you try giving a reproducible example? [link](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Wilcar Jun 09 '18 at 05:22

1 Answer


Since you didn't provide sample data, I will use mtcars. You can use split to split a data.frame on a categorical variable. Combining this with map from purrr and tidy from broom, you can create a data frame with all the betas in one go.

What happens is: 1. split the data.frame, 2. fit a regression model to each piece, 3. tidy each model to extract the coefficients and bind them into a single data.frame.

You will need to adjust this to your data.frame and replicate variable. For logistic regression, replace lm with glm and family = binomial. Broom's tidy handles glm objects, so the rest should work unchanged.

library(purrr)
library(broom)

my_lms <- mtcars %>%
  split(.$cyl) %>%
  map(~ lm(mpg ~ wt, data = .x)) %>%
  map_dfr(~ tidy(.))

my_lms
         term  estimate std.error statistic      p.value
1 (Intercept) 39.571196 4.3465820  9.103980 7.771511e-06
2          wt -5.647025 1.8501185 -3.052251 1.374278e-02
3 (Intercept) 28.408845 4.1843688  6.789278 1.054844e-03
4          wt -2.780106 1.3349173 -2.082605 9.175766e-02
5 (Intercept) 23.868029 3.0054619  7.941551 4.052705e-06
6          wt -2.192438 0.7392393 -2.965803 1.179281e-02
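
The table above does not show which cyl group each row came from. A small variation, assuming the same purrr and broom setup (map_dfr also needs dplyr installed), passes .id to map_dfr so that the names of the split list become a column, and then keeps only the intercept rows. The names tidy_by_cyl and intercepts are illustrative only.

```r
library(purrr)
library(broom)

# .id = "cyl" stores the names of the split list ("4", "6", "8") in a column
tidy_by_cyl <- mtcars %>%
  split(.$cyl) %>%
  map(~ lm(mpg ~ wt, data = .x)) %>%
  map_dfr(tidy, .id = "cyl")

# Keep only the intercept (beta 0) of each group
intercepts <- tidy_by_cyl[tidy_by_cyl$term == "(Intercept)", c("cyl", "estimate")]
intercepts
```

This gives one row per group, which is closer to the "table of beta 0 per replicate" asked for in the question.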

EDIT

my_lms <- lapply(split(mtcars, mtcars$cyl), function(x) lm(mpg ~ wt, data = x))
my_coefs <- as.data.frame(sapply(my_lms, coef))
my_coefs
                    4         6         8
(Intercept) 39.571196 28.408845 23.868029
wt          -5.647025 -2.780106 -2.192438

# Or transpose the coefficients if you want one row per group.
t(my_coefs)
  (Intercept)        wt
4    39.57120 -5.647025
6    28.40884 -2.780106
8    23.86803 -2.192438
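
For the logistic regression case in the question, the same base R pattern might look like the sketch below. Since no data was posted, pop is simulated here; the column names y and x, the 50 rows per replicate, and the true coefficients are all assumptions made for illustration and should be replaced with your own variables.

```r
# Simulated stand-in for `pop`; y, x and the sizes below are assumptions
set.seed(1)
n_rep <- 110
n_per <- 50
pop <- data.frame(
  replicate = factor(rep(seq_len(n_rep), each = n_per)),
  x = rnorm(n_rep * n_per)
)
pop$y <- rbinom(nrow(pop), size = 1, prob = plogis(-0.5 + 0.8 * pop$x))

# Fit one logistic regression per replicate
fits <- lapply(split(pop, pop$replicate), function(d) {
  glm(y ~ x, data = d, family = binomial)
})

# Collect the intercept (beta 0) of every fit into one data frame
beta0 <- data.frame(
  replicate = names(fits),
  beta0 = unname(vapply(fits, function(m) coef(m)[["(Intercept)"]], numeric(1)))
)

head(beta0)
```

The result has one row per replicate. With real data, small or perfectly separated groups can make glm emit warnings, so it is worth checking those before trusting the estimates.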
phiver