I have a dataset containing the daily rate of return for every industry (in total 10 industries) per country (in total 16 countries) from 1975 to 2018. Now I need to run cross sectional regressions per day and per week and save the coefficients in a separate dataset.

I tried the following code, but the estimates are the same for every day.

fitted_models = Data %>% 
                group_by(Data$Date) %>% 
                do(model = lm(Data$RoR ~ Data$Country + Data$Industry, data=Data))

fitted_models$model

I need to include the following contrasts:

contrasts(All0$Country) <- contr.sum(16, contrasts=TRUE)
contrasts(All0$Industry) <- contr.sum(10, contrasts=TRUE)

but then I get the following error message:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
In addition: Warning messages:
1: contrasts dropped from factor Country due to missing levels
2: contrasts dropped from factor Industry due to missing levels

This is a sample of my data. RoR is NA at the start of the sample; values appear for later dates.

   Country        Date       Industry     RoR
   <chr>          <date>     <chr>      <dbl>
 1 Finland        1975-01-01 Basic Mats    NA
 2 Austria        1975-01-01 Basic Mats    NA
 3 Spain          1975-01-01 Basic Mats    NA
 4 United Kingdom 1975-01-01 Basic Mats    NA
 5 Norway         1975-01-01 Basic Mats    NA
 6 Germany        1975-01-01 Basic Mats    NA
 7 France         1975-01-01 Basic Mats    NA
 8 Italy          1975-01-01 Basic Mats    NA
 9 Portugal       1975-01-01 Basic Mats    NA
10 Switzerland    1975-01-01 Basic Mats    NA 
Matt Summersgill
Amber
  • Export them to a file? https://stackoverflow.com/questions/49958828/exporting-and-formatting-regression-analysis-results-in-r-to-excel – duffymo Jun 12 '19 at 19:16
  • I need them in the environment for further analysis. The main issue is running the regressions daily. But thank you; otherwise I will export and import again. – Amber Jun 12 '19 at 19:19
  • Dropping the `Data$` prefixes might be a step in the right direction. Another would be to run a single regression with interactions between date and every other variable. – Michael M Jun 12 '19 at 19:20
  • But how do I get a time series of the coefficients out of it if I only run one regression? – Amber Jun 12 '19 at 19:24
  • I've run regressions on rows of data using `apply` in the past. Maybe look into that? – cory Jun 12 '19 at 19:49
  • There are still some key issues that make it hard to help. 1. The data you provided is not enough to perform a grouping operation -- if you'd like to group the answer on `Date`, there need to be multiple dates. 2. The data you've provided (with all `NA` values for `RoR`) doesn't allow the calculation of a valid linear regression. 3. On that topic, it doesn't seem like a linear regression with only two categorical variables makes much sense? 4. You reference another data set, `All0`, please either provide that data set or code showing how it's derived from `Data` – Matt Summersgill Jun 13 '19 at 14:07
  • For every date there are observations for every industry in every country. I'm not sure how to provide more than 120 rows for just two dates. Data and All0 are the same; I just wrote Data to make clear it is my dataset. There is just one dataset. On point 3, why doesn't it make much sense? I am trying to replicate papers on that topic, and they all use the same approach. – Amber Jun 13 '19 at 15:21
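
The suggestion in the comments to drop `Data$` can be sketched as follows. In the original call, `Data$RoR`, `Data$Country`, `Data$Industry` and `data=Data` all refer to the full data frame, so every group fits the same regression on all rows, which would explain the identical estimates. This is a minimal sketch, assuming dplyr's `group_modify()` is available (dplyr 0.8.1 or later); the toy data and the name `daily_coefs` are made up for illustration and are not the real data set.

```r
library(dplyr)

# Toy stand-in for the real data (hypothetical values, illustration only):
# 3 dates x 3 countries x 2 industries with random returns.
set.seed(1)
Data <- expand.grid(
  Date     = as.Date("2018-01-01") + 0:2,
  Country  = c("Finland", "Austria", "Spain"),
  Industry = c("Basic Mats", "Financials"),
  stringsAsFactors = FALSE
)
Data$RoR <- rnorm(nrow(Data))

# Use bare column names and fit on the current group (.x),
# not on the full data frame.
daily_coefs <- Data %>%
  group_by(Date) %>%
  group_modify(~ {
    fit <- lm(RoR ~ Country + Industry, data = .x)
    cf  <- coef(fit)
    # Long format: one row per coefficient per date.
    data.frame(term = names(cf), estimate = unname(cf))
  }) %>%
  ungroup()
```

The result has one row per date and coefficient (`Date`, `term`, `estimate`), so the time series of any single coefficient is a filter on `term`.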

1 Answer

Using the data.table package to do group-wise operations might be a good way to approach this -- I'm using mtcars as an example data set since you haven't provided one, but the approach would be the same with your data. Here, I use cyl as the grouping column, but in your case it would be Date.

library(data.table)

DT <- as.data.table(mtcars)

DT[,as.list(lm(mpg ~ wt+qsec)$coefficients), by = .(cyl)]

#    cyl (Intercept)        wt      qsec
# 1:   6    25.46173 -5.201906 0.5838640
# 2:   4    24.88427 -7.513576 0.9903892
# 3:   8    14.02093 -2.813754 0.7352592
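
Applied to data shaped like the sample in the question, the same pattern might look like the sketch below. It assumes the contrasts error comes from dates where, after dropping `NA` returns, a factor has fewer than two observed levels; such dates are skipped. Instead of attaching a fixed 16-level or 10-level contrast matrix to the columns, the contrasts are passed by name through the `contrasts` argument of `lm()`, so they are built for the levels actually present on each date. The toy data and the helper `fit_one_date` are hypothetical.

```r
library(data.table)

# Toy stand-in for the real data (hypothetical values, illustration only):
# 3 dates x 3 countries x 2 industries; the first date is all NA,
# as in the sample shown in the question.
set.seed(1)
DT <- as.data.table(expand.grid(
  Date     = as.Date("1975-01-01") + 0:2,
  Country  = c("Austria", "Finland", "Spain"),
  Industry = c("Basic Mats", "Financials"),
  stringsAsFactors = FALSE
))
DT[, RoR := rnorm(.N)]
DT[Date == as.Date("1975-01-01"), RoR := NA_real_]

# Fit one cross-sectional regression on a single date's rows and return
# the coefficients in long form; return NULL when the date cannot be fitted.
fit_one_date <- function(d) {
  d <- d[!is.na(d$RoR), ]
  # Sum contrasts need at least two observed levels per factor.
  if (uniqueN(d$Country) < 2L || uniqueN(d$Industry) < 2L) return(NULL)
  fit <- lm(RoR ~ Country + Industry, data = d,
            contrasts = list(Country = "contr.sum", Industry = "contr.sum"))
  cf <- coef(fit)
  list(term = names(cf), estimate = unname(cf))
}

# One row per date and coefficient; dates that return NULL are dropped.
daily_coefs <- DT[, fit_one_date(.SD), by = Date]
```

The long format is used because a date with missing levels produces fewer coefficients than a complete date, and `contr.sum` numbers its coefficients (`Country1`, `Country2`, ...) relative to the levels present in that group. On dates with missing countries or industries the numbered terms therefore may not refer to the same level as on complete dates, which is worth checking before treating them as one time series.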
Matt Summersgill
  • Thank you! I need to include the following contrasts: `contrasts(All0$Country) <- contr.sum(16, contrasts=TRUE) contrasts(All0$Industry) <- contr.sum(10, contrasts=TRUE)` but I get the following error message then `Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels In addition: Warning messages: 1: contrasts dropped from factor Country due to missing levels 2: contrasts dropped from factor Industry due to missing levels ` – Amber Jun 12 '19 at 19:44
  • Hmm. I suspect that's possible, but it's going to be hard to help much further without some clarity around the data set you are working with. Can you provide a subset of your `Data` and `All0` data.frames in your question? ([See Here for some guidelines and instructions](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) ) Also, please add the additional question in this comment to your original question. – Matt Summersgill Jun 12 '19 at 19:53
  • Hope this will help some more with my question. Thanks for the effort. – Amber Jun 13 '19 at 12:11