
I'm importing my data into R from a CSV file. The dataset contains 2500 variables over 16 columns. I'm trying to fit a regression with `lm`, but when I add a dummy variable for year effects or industry effects, the regression won't work.

This is how I create the dummy variable:

CNAME <- factor(Combined.data[6], levels=c(1:20), labels= c("AUSTRIA", "BELGIUM", "DENMARK", 
"FINLAND", "FRANCE", "GERMANY", "IRELAND", "ISLE OF MAN", "ITALY", "LUXEMBOURG",
"NETHERLANDS", "NORWAY", "POLAND", "PORTUGAL", "SPAIN", "SWEDEN", "SWITZERLAND",
"TURKEY", "UNITED KINGDOM", "UNITED STATES")) 

And this is what the regression function looks like:

results <- lm(Tax_Avoidance ~ ENVSCORE + CGVSCORE + SOCSCORE + ECNSCORE + Size +
                Leverage + ROA + MTB + ROA + RND + AUD + PPE + Intang + CDP +
                CHS + NET + CNAME,
              data = finalresults)

summary(results)

I cannot see what I'm doing wrong. I'd appreciate your help.

Jaap
VishJ
  • What do you mean "the regression doesn't work"? Are you getting an error of some kind? If so you should include that. When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick May 18 '18 at 15:23
  • What is `Combined.data[6]` and what is `finalresults`? Could you upload a small set of those datasets, so we can reproduce your problem? – SeGa May 18 '18 at 15:25
  • Related Q&A: [*Generate a dummy-variable*](https://stackoverflow.com/q/11952706/2204410) – Jaap May 18 '18 at 15:26
  • I'm sorry for my late response, I've been very busy the last couple of days. The Q&A Jaap posted has helped me figure it out. I want to thank you all for your time and help. Next time I post a question, I'll take all of your tips into consideration. For this project I have to deal with classified information, so I could not post any data or CSV files. Thank you all – VishJ May 20 '18 at 17:30

1 Answer


Will this not work for you? Without knowing the error, it's difficult to know what's going wrong.

CNAME <- c("AUSTRIA", "BELGIUM", "DENMARK",
           "FINLAND", "FRANCE", "GERMANY", "IRELAND", "ISLE OF MAN", "ITALY", "LUXEMBOURG",
           "NETHERLANDS", "NORWAY", "POLAND", "PORTUGAL", "SPAIN", "SWEDEN", "SWITZERLAND",
           "TURKEY", "UNITED KINGDOM", "UNITED STATES")

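# Simulated data: 10 numeric columns of 20 random integers, plus the country names
df <- data.frame(replicate(10, sample(0:50, 20, replace = TRUE)))
df <- cbind(df, CNAME)

# Expand CNAME into one 0/1 dummy column per country
library(dummies)
df <- dummy.data.frame(df)

# Regress X1 on all remaining columns
results <- lm(X1 ~ ., data = df)
summary(results)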

This gives the following data.frame (first three rows shown):

  X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 CNAMEAUSTRIA CNAMEBELGIUM
1 41 27  3 28  6  3 35 19  3  34            1            0
2 41 41 30 22 15 42 44 42  6  41            0            1
3 13  1 26 35 44 22 13 11 46  47            0            0
  CNAMEDENMARK CNAMEFINLAND CNAMEFRANCE CNAMEGERMANY CNAMEIRELAND
1            0            0           0            0            0
2            0            0           0            0            0
3            1            0           0            0            0
  CNAMEISLE OF MAN CNAMEITALY CNAMELUXEMBOURG CNAMENETHERLANDS
1                0          0               0                0
2                0          0               0                0
3                0          0               0                0
  CNAMENORWAY CNAMEPOLAND CNAMEPORTUGAL CNAMESPAIN CNAMESWEDEN
1           0           0             0          0           0
2           0           0             0          0           0
3           0           0             0          0           0
  CNAMESWITZERLAND CNAMETURKEY CNAMEUNITED KINGDOM CNAMEUNITED STATES
1                0           0                   0                  0
2                0           0                   0                  0
3                0           0                   0                  0
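As an alternative, here is a minimal base-R sketch that needs no extra package. It assumes (this is a guess, since the original data could not be shared) that the country column holds integer codes. The data frame `toy`, its columns `y`, `code` and `x`, and the shortened label vector `country_labels` are all made up for illustration. The sketch also contrasts `toy[2]` with `toy[[2]]`, because passing a one-column data.frame to `factor()`, as `Combined.data[6]` may do in the question, does not give one value per row.

```r
# Hypothetical stand-in for the asker's data: column 2 holds country codes 1..4.
set.seed(1)
toy <- data.frame(
  y    = rnorm(40),
  code = rep(1:4, each = 10),
  x    = rnorm(40)
)
country_labels <- c("AUSTRIA", "BELGIUM", "DENMARK", "FINLAND")

# Single brackets return a one-column data.frame, which factor() does not
# treat as one value per row, so the result has the wrong length for lm().
bad <- factor(toy[2], levels = 1:4, labels = country_labels)

# Double brackets return the underlying vector, giving one factor value per row.
# Storing the factor in the same data frame keeps it aligned with the other variables.
toy$CNAME <- factor(toy[[2]], levels = 1:4, labels = country_labels)

# lm() expands a factor into dummy columns by itself and, with the default
# treatment contrasts, uses the first level as the reference category.
fit <- lm(y ~ x + CNAME, data = toy)
summary(fit)

# The dummy columns that lm() actually used can be inspected with model.matrix().
head(model.matrix(fit))
```

Keeping the factor inside the data frame passed to `data =` also avoids a length mismatch between a free-standing `CNAME` object and the rows of that data frame.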
user113156