
The output of my multiple regression contains NAs and negative beta coefficients.

When I move a variable such as donderdag to the front of the weekday terms in the formula, it gets an estimate instead of NA, but then woensdag becomes NA. What is going on here? And what can I do to avoid the negative beta coefficients?

lm(formula = link_clicks_unique ~ expanded_view_avg_time + today_impressions_unique + 
    vrijdag + zaterdag + zondag + maandag + dinsdag + woensdag + 
    donderdag, data = zwitsal_t2_duration)

Result:

                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)               0.5861669  0.0209341  28.001  < 2e-16 ***
expanded_view_avg_time    0.0032145  0.0005866   5.480 4.36e-08 ***
today_impressions_unique  0.0037398  0.0054384   0.688 0.491684    
vrijdag                  -0.0186911  0.0191080  -0.978 0.328008    
zaterdag                  0.0067248  0.0193532   0.347 0.728238    
zondag                   -0.0267094  0.0190372  -1.403 0.160646    
maandag                  -0.0107537  0.0190001  -0.566 0.571421    
dinsdag                   0.0030009  0.0189215   0.159 0.873992    
woensdag                 -0.0257946  0.0192104  -1.343 0.179388    
donderdag                        NA         NA      NA       NA

After reordering the variables:

lm(formula = link_clicks_unique ~ expanded_view_avg_time + today_impressions_unique + 
    donderdag + vrijdag + zaterdag + zondag + maandag + dinsdag + 
    woensdag, data = zwitsal_t2_duration)

Result:

                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)               0.5603723  0.0206519  27.134  < 2e-16 ***
expanded_view_avg_time    0.0032145  0.0005866   5.480 4.36e-08 ***
today_impressions_unique  0.0037398  0.0054384   0.688 0.491684    
donderdag                 0.0257946  0.0192104   1.343 0.179388    
vrijdag                   0.0071035  0.0190190   0.373 0.708786    
zaterdag                  0.0325195  0.0192748   1.687 0.091606 .  
zondag                   -0.0009148  0.0189527  -0.048 0.961506    
maandag                   0.0150409  0.0189190   0.795 0.426622    
dinsdag                   0.0287955  0.0188352   1.529 0.126344    
woensdag                         NA         NA      NA       NA
  • Well, the variables are in Dutch, and they represent the 7 days of the week. Did you make dummy variables for each day? If yes, there's an explanation of this phenomenon [in this answer](https://stackoverflow.com/a/7341074/5805670). Also, what is wrong with negative coefficients? – slamballais Jun 07 '21 at 14:22
  • I recoded the dates myself into weekdays, so probably the recode went wrong. What can I do to create the dummy variables for the weekdays? ```zwitsal_t2_duration$date <- strftime(zwitsal_t2_duration$date, "%A")``` – Dani Jun 07 '21 at 15:39
  • `lm` creates dummy variables for you if you have factors. You can just make a factor with every day of the week. See for example [this tutorial](http://www.sthda.com/english/articles/40-regression-analysis/163-regression-with-categorical-variables-dummy-coding-essentials-in-r/). And just to be clear (which is the point of my 1st link): You cannot run this model with all 7 week days in there. You'll always get an `NA` given how the contrasts work. You'll need to pick one day as a reference. – slamballais Jun 07 '21 at 17:24
  • OK, clear. I have created a separate column for each weekday in my dataset, so does that mean I still can't run this model? – Dani Jun 09 '21 at 14:22
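
To make the point from the comments concrete, here is a minimal sketch in R. The data frame `d` is simulated and only stands in for `zwitsal_t2_duration`; the `weekday` column, the sample size, and all numeric values are assumptions made for illustration, and `today_impressions_unique` is left out to keep it short. It first reproduces the `NA` with seven 0/1 columns, then fits the same model with a single factor and an explicitly chosen reference day.

```r
# Simulated stand-in for zwitsal_t2_duration; all values are made up.
set.seed(1)
n <- 140
days <- c("maandag", "dinsdag", "woensdag", "donderdag",
          "vrijdag", "zaterdag", "zondag")
d <- data.frame(
  weekday = factor(rep(days, length.out = n), levels = days),
  expanded_view_avg_time = rnorm(n, mean = 10, sd = 2)
)
d$link_clicks_unique <- 0.5 + 0.003 * d$expanded_view_avg_time +
  rnorm(n, sd = 0.1)

# One 0/1 column per day, as in the question.
for (day in days) d[[day]] <- as.integer(d$weekday == day)

# The seven dummies sum to 1 in every row, so together they duplicate
# the intercept column and the design matrix is rank deficient.
stopifnot(all(rowSums(d[days]) == 1))

# All seven dummies: the last day entered is aliased and reported as NA.
fit_all <- lm(link_clicks_unique ~ expanded_view_avg_time + vrijdag +
                zaterdag + zondag + maandag + dinsdag + woensdag +
                donderdag, data = d)

# A single factor: lm() builds six dummies itself and treats the first
# level (here set to donderdag) as the reference day.
d$weekday <- relevel(d$weekday, ref = "donderdag")
fit_factor <- lm(link_clicks_unique ~ expanded_view_avg_time + weekday,
                 data = d)

coef(fit_all)     # donderdag is NA
coef(fit_factor)  # six weekday coefficients, none NA
```

Both fits give the same fitted values; only the parameterization differs. Each weekday coefficient is the expected difference from the reference day, so a negative estimate only means fewer expected clicks than on that day. This is also why the signs change between the two outputs in the question when the dropped day changes.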
