
Good morning,

I'm currently trying to run a truncated regression in a loop over groups of my dataset. Below is a reproducible example of my data frame.

library(plyr)
library(truncreg)

# Example data: two grid cells with six height measurements (cm) each
df <- data.frame("grid_id" = rep(c(1, 2), 6),
                 "htcm" = rep(c(160, 170, 175), 4),
                 stringsAsFactors = FALSE)

View(df)


Next, I ran a truncated regression on the variable "htcm", grouped by grid_id, to extract only the coefficients (the intercept and sigma), which I then stored in a data frame. The code is based on ideas from @hadley.

reg <- dlply(df, "grid_id", function(.)
  truncreg(htcm ~ 1, data = ., point = 160, direction = "left")
)

regcoef <- ldply(reg, coef)

This code works for one of my three datasets, but I get an error message for the other two. The datasets have identical columns and differ only in their number of rows (df1: 4,000; df2: 100,000; df3: 13,000).

The error message is:

"Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), : 'data' must be of type vector, was 'NULL'

I do not even know how to construct an example that reproduces this error, because the code works fine with one of my three datasets. I have already accounted for missing values in both columns.

Does anyone have an idea what I could change in this code?

Thanks!!

EDIT:

I think I found the origin of the error. A truncated regression model estimates the standard deviation (sigma), which requires more than one observation per group. My data also contain groups with only n = 1 observation. For those groups the standard deviation is zero, which most likely leaves the code with a NULL vector. How can I drop the groups with fewer than two observations within the regression code?
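One possible way to drop the small groups before fitting, shown as a minimal sketch in base R. The toy data, the threshold `min_obs`, and the names `group_size`, `df_filtered` and `dropped_ids` are assumptions made for illustration. Only the column names `grid_id` and `htcm` come from the example above.

```r
# Toy data (hypothetical): groups 1 and 2 have several rows, group 3 has one
df <- data.frame(
  grid_id = c(1, 1, 1, 2, 2, 3),
  htcm    = c(165, 170, 175, 162, 180, 171),
  stringsAsFactors = FALSE
)

# Size of the group each row belongs to, aligned with the rows of df
group_size <- ave(seq_along(df$grid_id), df$grid_id, FUN = length)

# Keep only rows from groups with at least min_obs observations
min_obs <- 2  # assumed threshold taken from the question
df_filtered <- df[group_size >= min_obs, ]

# Record which groups were excluded, so the omission stays visible
dropped_ids <- setdiff(unique(df$grid_id), unique(df_filtered$grid_id))
```

The filtered data frame could then be passed to the dlply call in place of df. Whether a minimum of two observations is enough for the truncated fit to converge in every group is not verified here, so a higher threshold may be needed.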

  • just for specification as I reviewed my question: the error occurs after running the "dlply" code containing the grouped truncated regression loop – r_user_3417 Apr 19 '21 at 09:29
  • Check this - https://stackoverflow.com/questions/20204257/subset-data-frame-based-on-number-of-rows-per-group – Ronak Shah Apr 19 '21 at 11:07
