Linear regression is not returning all of my coefficients from my 'Team' column or category

Question

I want to look at the coefficient estimates for all of my different sales teams. I have 20 teams listed in a `Team` column, with about 24 observations for each. However, when I run my regression, only 15 of the 20 teams appear in the model summary. How can I see all of them?

Here is my code and output:

(log_teams <- lm(Worked ~ Team + Activity + Presented + Confirmed + Jobs_Filled + Converted, data = df)) %>%
  summary()

Output:

Call:
lm(formula = Worked ~ Team + Activity + Presented + Confirmed + 
    Jobs_Filled + Converted, data = WBY)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.4035 -1.0048  0.0000  0.8774  5.1677 
Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept)  2.609486   1.869903   1.396   0.1699  
TeamCRW      0.110828   1.908735   0.058   0.9540  
TeamEMW     -1.068797   2.767863  -0.386   0.7013  
TeamGSW     -0.424508   2.795353  -0.152   0.8800  
TeamNS2     -1.234508   2.388392  -0.517   0.6078  
TeamNUW     -1.458735   2.083549  -0.700   0.4875  
TeamOBW      3.224057   2.103054   1.533   0.1324  
TeamORT     -0.432185   1.884824  -0.229   0.8197  
TeamPC1      4.338479   2.115219   2.051   0.0462 *
TeamPC2     -1.002268   2.227166  -0.450   0.6549  
TeamPDW      2.560784   2.791501   0.917   0.3640  
TeamPLW      1.381216   2.151150   0.642   0.5242  
TeamPYW     -1.074374   2.799772  -0.384   0.7030  
TeamSB2     -0.646769   2.288132  -0.283   0.7788  
TeamSYW      2.252061   1.833820   1.228   0.2259  
TeamWMO      0.857452   2.302522   0.372   0.7114  
Activity    -0.000627   0.002906  -0.216   0.8302  
Presented    0.162181   0.331876   0.489   0.6275  
Confirmed   -0.242462   0.317139  -0.765   0.4486  
Jobs_Filled -0.025657   0.016099  -1.594   0.1182  
Converted    0.006213   0.002610   2.381   0.0217 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.247 on 44 degrees of freedom
  (427 observations deleted due to missingness)
Multiple R-squared:  0.5217,    Adjusted R-squared:  0.3043 
F-statistic:   2.4 on 20 and 44 DF,  p-value: 0.007786

some combination of (1) the fact that you will only have parameters for at most `n-1` of your factor levels, due to contrast settings [check out the `emmeans` package]; (2) maybe too many missing values in the predictors? Can you please post the output as text rather than as an image? What are the results of `levels(df$Team)` or `unique(df$Team)`? — Ben Bolker, Jan 09 '23 at 17:42
Your output states "(427 observations deleted due to missingness)," so it might be that some teams had missing data for every observation. If that doesn’t seem to explain it, please review [this thread](https://stackoverflow.com/q/5963269/17303805), then [edit] your question to provide a reproducible example (e.g., including data). — zephryl, Jan 09 '23 at 17:44
Note that R uses *complete case analysis* by default, i.e. any observation with missing values in *any* of the predictor variables will be discarded — Ben Bolker, Jan 09 '23 at 17:50
@BenBolker I do have some null values but that doesn't seem to be the issue here. Something I noticed is that for every predictor I add, an additional team is removed. As if the summary only shows a certain number of coefficients. When I run levels(df$Team) I see all of my teams. — ngarn, Jan 09 '23 at 18:43
We probably can't say much more without a [mcve] (can you post your data somewhere?) — Ben Bolker, Jan 09 '23 at 23:25
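
The two explanations raised in the comments (the baseline level being absorbed into the intercept, and complete-case deletion removing whole teams) can be checked with a small sketch. The data frame `toy` and its column `x` are made up for illustration and are not the asker's data. In the posted output, 44 residual degrees of freedom plus 21 estimated coefficients means only 65 rows were actually used, and 15 team coefficients plus one baseline team accounts for 16 of the 20 teams, which suggests (but does not prove) that the other four teams have no complete rows.

```r
# Hypothetical toy data: 4 teams, 6 rows each, one numeric predictor.
set.seed(1)
toy <- data.frame(
  Team   = factor(rep(c("A", "B", "C", "D"), each = 6)),
  x      = rnorm(24),
  Worked = rnorm(24)
)

# Make every row of team D incomplete, as if a predictor was never recorded.
toy$x[toy$Team == "D"] <- NA

fit <- lm(Worked ~ Team + x, data = toy)

# Team A is the baseline under the default treatment contrasts, so it has
# no coefficient of its own. Team D has no complete rows, so lm() drops
# that level before fitting and it does not appear at all.
coef_names <- names(coef(fit))
print(coef_names)

# Count the complete rows per team across all variables in the formula.
vars <- all.vars(formula(fit))
ok   <- complete.cases(toy[, vars])
complete_by_team <- tapply(ok, toy$Team, sum)
print(complete_by_team)
```

Running the same `complete.cases()` count on the real data, restricted to the variables in the model formula, should show whether the missing teams have zero usable rows. It would also be consistent with the observation that each added predictor removes another team, since each new predictor can only reduce the number of complete rows.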
