Questions tagged [glm]

For questions relating to generalized linear models. For the GLM math library, see the [glm-math] tag.

Generalized linear models (GLMs) are a class of statistical models that includes ordinary least squares (OLS) regression (a.k.a. linear models), probit regression, logistic regression, Poisson regression, and other methods that can be expressed in the standard GLM form.
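
As a rough sketch of what "the standard GLM form" refers to: the response is assumed to follow an exponential-family distribution, and a link function connects its mean to a linear predictor. The symbols below (g, x, β, η) are notation chosen here for illustration and do not come from the tag description.

```latex
g\bigl(\mathbb{E}[Y \mid x]\bigr) = \eta = x^{\top}\beta,
\qquad Y \mid x \sim \text{an exponential-family distribution}
% identity link + Gaussian response  ->  OLS / linear model
% logit link    + binomial response  ->  logistic regression
% probit link   + binomial response  ->  probit regression
% log link      + Poisson response   ->  Poisson regression
```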

Consider whether your question is better suited to Cross Validated, the Stack Exchange site for statistics and machine learning. Questions on Stack Overflow should be about programming issues arising from fitting models to data.

In R, a software environment for statistical computing and graphics, a GLM can be fitted with the function glm.
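
A minimal sketch of such a fit in R, assuming the built-in mtcars data set as stand-in data (the data set and the chosen variables are illustrative assumptions, not part of the tag description). The family argument selects the response distribution and link function.

```r
# Assumption: the built-in mtcars data set stands in for real data.
# Logistic regression: binary outcome `am` (transmission type) on weight `wt`.
fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))

# Coefficient table: estimates, standard errors, z values, p-values.
coef_table <- coef(summary(fit))
print(coef_table)

# Poisson regression uses the same interface with a different family.
# `carb` (number of carburetors) is treated as a count purely for illustration.
fit_pois <- glm(carb ~ hp, data = mtcars, family = poisson(link = "log"))
print(coef(fit_pois))
```

Individual columns of the coefficient table can be pulled out by name, for example coef_table[, "Std. Error"] for the standard errors.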

2019 questions
158 votes, 6 answers

How to succinctly write a formula with many variables from a data frame?

Suppose I have a response variable and a data frame containing three covariates (as a toy example): y = c(1,4,6) d = data.frame(x1 = c(4,-1,3), x2 = c(3,9,8), x3 = c(4,-4,-2)) I want to fit a linear regression to the data: fit = lm(y ~ d$x1 + d$x2 +…
grautur
  • 29,955
  • 34
  • 93
  • 128
76 votes, 2 answers

Confidence intervals for predictions from logistic regression

In R predict.lm computes predictions based on the results from linear regression and also offers to compute confidence intervals for these predictions. According to the manual, these intervals are based on the error variance of fitting, but not on…
unique2
  • 2,162
  • 2
  • 18
  • 23
66 votes, 4 answers

Warning: non-integer #successes in a binomial glm! (survey packages)

I am using the twang package to create propensity scores, which are used as weights in a binomial glm using survey::svyglm. The code looks something like this: pscore <- ps(ppci ~ var1+var2+.........., data=dt....) dt$w <- get.weights(pscore,…
Robert Long
  • 5,722
  • 5
  • 29
  • 50
62 votes, 3 answers

How to debug "contrasts can be applied only to factors with 2 or more levels" error?

Here are all the variables I'm working with: str(ad.train) $ Date : Factor w/ 427 levels "2012-03-24","2012-03-29",..: 4 7 12 14 19 21 24 29 31 34 ... $ Team : Factor w/ 18 levels "Adelaide","Brisbane Lions",..: 1 1 1…
Troy
  • 683
  • 1
  • 7
  • 8
52 votes, 5 answers

Extract pvalue from glm

I'm running many regressions and am only interested in the effect on the coefficient and p-value of one particular variable. So, in my script, I'd like to be able to just extract the p-value from the glm summary (getting the coefficient itself is…
ch-pub
  • 1,664
  • 6
  • 29
  • 52
35 votes, 0 answers

Fractional logit model in R

I would like to estimate covariate effects on a response whose values take on values in [0,1]. That is, the values of the response variable live between 0-1 (inclusive). I would like to use the fractional logit model described by Papke and…
Chris
  • 3,401
  • 5
  • 33
  • 42
32 votes, 1 answer

caret train() predicts very differently than predict.glm()

I am trying to estimate a logistic regression, using the 10-fold cross-validation. #import libraries library(car); library(caret); library(e1071); library(verification) #data import and preparation data(Chile) chile <-…
Vincent
  • 1,361
  • 2
  • 20
  • 33
29 votes, 1 answer

How do I exclude specific variables from a glm in R?

I have 50 variables. This is how I use them all in my glm. var = glm(Stuff ~ ., data=mydata, family=binomial) But I want to exclude 2 of them. So how do I exclude 2 in specific? I was hoping there would be something like this: var = glm(Stuff ~ . #…
user3399551
  • 475
  • 1
  • 4
  • 11
24 votes, 1 answer

Can multinomial models be estimated using Generalized Linear model?

In analysis of categorical data, we often use logistic regression to estimate relationships between binomial outcomes and one or more covariates. I understand this is a type of generalized linear model (GLM). In R, this is implemented with the glm…
hxd1011
  • 885
  • 2
  • 11
  • 23
22 votes, 1 answer

Logistic regression - cbind command in glm

I am doing logistic regression in R. Can somebody clarify what is the differences of running these two lines? 1. glm(Response ~ Temperature, data=temp, family = binomial(link="logit")) 2. glm(cbind(Response, n - Response) ~…
Eddie
  • 783
  • 4
  • 12
  • 24
21 votes, 3 answers

Extract standard errors from glm

I did a glm and I just want to extract the standard errors of each coefficient. I saw on the internet the function se.coef() but it doesn't work, it returns "Error: could not find function "se.coef"".
user1096592
  • 227
  • 1
  • 2
  • 3
21 votes, 3 answers

Why is caret train taking up so much memory?

When I train just using glm, everything works, and I don't even come close to exhausting memory. But when I run train(..., method='glm'), I run out of memory. Is this because train is storing a lot of data for each iteration of the cross-validation…
Yang
  • 16,037
  • 15
  • 100
  • 142
21 votes, 2 answers

Specifying formula in R with glm without explicit declaration of each covariate

I would like to force specific variables into glm regressions without fully specifying each one. My real data set has ~200 variables. I haven't been able to find samples of this in my online searching thus far. For example (with just 3…
S.R.
  • 263
  • 1
  • 3
  • 5
21 votes, 2 answers

Why is it inadvisable to get statistical summary information for regression coefficients from glmnet model?

I have a regression model with binary outcome. I fitted the model with glmnet and got the selected variables and their coefficients. Since glmnet doesn't calculate variable importance, I would like to feed the exact output (selected variables and…
TongZZZ
  • 756
  • 2
  • 8
  • 20
17 votes, 2 answers

Get 95% confidence interval with glm(..) in R

Here are some data dat = data.frame(y = c(9,7,7,7,5,6,4,6,3,5,1,5), x = c(1,1,2,2,3,3,4,4,5,5,6,6), color = rep(c('a','b'),6)) and the plot of these data if you wish require(ggplot) ggplot(dat, aes(x=x,y=y, color=color)) + geom_point() +…
Remi.b
  • 17,389
  • 28
  • 87
  • 168