Questions tagged [speedglm]

14 questions
9
votes
2 answers

How to resolve integer overflow errors in R estimation

I'm trying to estimate a model using speedglm in R. The dataset is large (~69.88 million rows and 38 columns). Multiplying the number of rows and columns results in ~2.7 billion which is outside the integer limit. I can't provide the data, but the…
James
  • 123
  • 1
  • 5
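
As a worked check of the arithmetic in the question above, assuming the "integer limit" it mentions is R's 32-bit signed integer maximum (`.Machine$integer.max`):

$$
69.88 \times 10^{6} \times 38 \approx 2.66 \times 10^{9} \;>\; 2^{31} - 1 = 2{,}147{,}483{,}647
$$

Under that assumption, the row-by-column count cannot be stored in a 32-bit integer, which is consistent with the overflow described.
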
6
votes
1 answer

Why is `speedglm` slower than `glm`?

I am trying to use speedglm to achieve a faster GLM estimation than glm, but why is it even slower? set.seed(0) n=1e3 p=1e3 x=matrix(runif(n*p),nrow=n) y=sample(0:1,n,replace = T) ptm <- proc.time() fit=glm(y~x,family=binomial()) print(proc.time()…
hxd1011
  • 885
  • 2
  • 11
  • 23
6
votes
2 answers

R - Error using summary() from speedglm package

I'm using speedglm to estimate a logistic regression model on some data. I've created a reproducible example which generates the same error that I get using my original data. library(speedglm) n <- 10000 dtf <- data.frame( y = sample(c(0,1), n, 1), …
pietrop
  • 1,071
  • 2
  • 10
  • 27
3
votes
0 answers

Why does broom::tidy occasionally return a wrong type of 'estimate' with speedglm?

It is documented that broom::tidy can tidy a speedglm object: https://broom.tidyverse.org/reference/tidy.speedglm.html. In the following examples, applying broom::tidy to a speedglm object returns some columns as 'fct' rather than 'dbl'. I guess it happens when…
Zhiqiang Wang
  • 6,206
  • 2
  • 13
  • 27
3
votes
1 answer

lapply, glm, and speedglm inside a function: argument "data" is missing, with no default

I am using the mtcars data to show my problem. The following code works fine with glm. It generates new models by adding each variable in vlist to the model glm(vs ~ mpg, family = binomial(), data = mtcars). check_glm <- function(crude, vlist,…
Zhiqiang Wang
  • 6,206
  • 2
  • 13
  • 27
1
vote
0 answers

speedglm returns "$ operator is invalid for atomic vectors" while glm does not

Because I'm working on a logistic regression model with a large data set with many predictor variables, I decided to try working with speedglm. When I execute glm(y ~ x1+x2+x3+....+x100, family = "binomial", data = mydata) it runs without issue,…
Max
  • 487
  • 5
  • 19
1
vote
1 answer

speedglm: $ operator is invalid for atomic vectors

I am trying to fit a binary logit model with the following code: mylogit <- speedglm(dependent_variable ~ InterestRate, data = my_data, family = "binomial") But I get the following error: > mylogit <- speedglm(dependent_variable ~ InterestRate, data =…
adrCoder
  • 3,145
  • 4
  • 31
  • 56
1
vote
0 answers

R: speedglm predict does not work; can I modify the object?

Can anybody explain why predict does not work with speedglm? Working code: library(speedglm) mtcars2<-mtcars mtcars2$gear<-as.factor(mtcars2$gear) mtcars_train<-mtcars2[1:10,] mtcars_test<-mtcars2[11:nrow(mtcars2),] model<-speedglm(formula = cyl ~…
HeyJane
  • 143
  • 4
1
vote
0 answers

Interaction effect plots with speedglm output

I am working with a dataset large enough to require algorithms more advanced than stats glm to compute binomial regression models. glm takes over a week to compute. Large fixed effects binomial regression in R suggests…
deca
  • 730
  • 1
  • 8
  • 24
1
vote
0 answers

Weighted least squares with speedglm

library(speedglm) df <- data.frame(y = numeric(30), x = numeric(30), weights = numeric(30)) df$y <- c(5,3,8,3,8,9,3,1,3,5,6,7,8,9,1,4,3,2,4,7,2,5,9,2,3,1,4,5,5,7) df$x <- c(7,5,3,6,8,9,5,3,1,2,3,6,9,6,3,8,9,0,7,5,3,1,2,3,4,9,7,5,3,2) df$weights <-…
Tony
  • 781
  • 6
  • 22
1
vote
1 answer

Running speedlm on weighted data with missing values

I am trying to run a linear regression on weighted data. When using speedlm, I get an error message when there are missing values in the data. library(speedglm) sampleData <- data.frame(w = round(runif(12,0,1)), target =…
eliavs
  • 2,306
  • 4
  • 23
  • 33
0
votes
1 answer

fitted values from speedglm() look very different from fitted values with glm()

The fitted values returned from speedglm() look really different from those returned from glm(), and I don't know why. For example, if I run this: data("lalonde") glm <- glm(married ~ treat + age + educ + black + hisp + nodegr, data = lalonde, family…
C.Robin
  • 1,085
  • 1
  • 10
  • 23
0
votes
1 answer

The new version of broom::tidy with speedglm cannot get p-values

The following code is from help("tidy.speedglm"). It does not return correct p-values. The previous version of broom worked fine, but the current version of broom (0.7.0) does not work. I wonder if this is a bug or something?…
Zhiqiang Wang
  • 6,206
  • 2
  • 13
  • 27
0
votes
0 answers

Cross-validation with speedglm for logistic regression in R?

I would like to run a cross-validation function like cv.glm on a logistic regression model built with speedglm on a large (millions of rows) data set. Does any such function exist? I am finding that cv.glm (from boot package) and the train function…