Libraries I use:

library(ggplot2)
library(dplyr)
library(statsr)



```{r}
factor_sc <- table(gss1$class, gss1$getahead) 
factor_sc <- addmargins(factor_sc)
factor_sc 
```

Running this gives the following output:

               Hard Work Both Equally Luck Or Help Other   Sum
  Lower Class        1063          368          299     0  1730
  Working Class     10229         3221         1870     0 15320
  Middle Class       9914         3624         1612     0 15150
  Upper Class         701          265          100     0  1066
  No Class              0            0            0     0     0
  Sum               21907         7478         3881     0 33266

I want to run a chi-square test on this data, so I want to remove the `Other` column and the `No Class` row.

However, I have already removed them with:

```{r}
gss1 <- gss %>%
  filter(!is.na(getahead), !is.na(class),
         class != "No Class", getahead != "Other")
```

Why do `Other` and `No Class` still appear in my table?

Jaap
Darae-Uri
  • Most likely, the `gss1$class` and `gss1$getahead` columns are factors that still contain the `No Class` and `Other` levels respectively. Running `gss1 <- droplevels(gss1)` before the table command, or using `table(droplevels(gss1$class), droplevels(gss1$getahead))`, will give you the desired result. – Jaap Oct 13 '18 at 08:44
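The behaviour described in this comment can be reproduced on a small example. This is only a sketch: the `toy` data frame and its values are made up for illustration (they are not the real `gss` data), and base-R row subsetting stands in for the dplyr `filter` call, which treats factor levels the same way.

```r
# Hypothetical stand-in for the gss data; values are invented.
toy <- data.frame(
  class = factor(
    c("Lower Class", "Working Class", "No Class", "Working Class"),
    levels = c("Lower Class", "Working Class", "No Class")
  ),
  getahead = factor(
    c("Hard Work", "Luck Or Help", "Other", "Hard Work"),
    levels = c("Hard Work", "Luck Or Help", "Other")
  )
)

# Row filtering: the unwanted rows are removed ...
kept <- toy[toy$class != "No Class" & toy$getahead != "Other", ]

# ... but the levels stay attached to the factors, so table()
# still shows an all-zero row and column for them.
levels(kept$class)
table(kept$class, kept$getahead)

# droplevels() discards levels that no longer occur in the data.
cleaned <- droplevels(kept)
levels(cleaned$class)
table(cleaned$class, cleaned$getahead)
```

After `droplevels()`, the table only has the rows and columns that actually occur in the filtered data.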

2 Answers

1

There is a small misunderstanding here. `filter` removes rows from your dataset; it does not touch the factor levels, so `table` still lists them. What you want is to set `class` to NA when it is "No Class", and `mutate` is what you need.

Try this code:

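# as.character() keeps the labels: ifelse() on a factor would return
# its integer codes instead of the level names.
gss1 <- gss %>%
  mutate(class = ifelse(class == "No Class", NA, as.character(class)),
         getahead = ifelse(getahead == "Other", NA, as.character(getahead))) %>%
  select(class, getahead) %>%
  table %>%
  chisq.test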

It is possible that you need to use NA_character_ instead of NA.
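One caveat when using `ifelse` on factor columns: it drops the factor attributes, so the "no" branch returns the integer codes rather than the labels. A minimal sketch, assuming `class` is a factor (the vector `f` below is made up for illustration):

```r
# Hypothetical factor; levels are sorted alphabetically, so the codes are
# "Lower Class" = 1, "No Class" = 2, "Upper Class" = 3.
f <- factor(c("Lower Class", "No Class", "Upper Class"))

# ifelse() strips the factor attributes: the result holds integer codes.
codes <- ifelse(f == "No Class", NA, f)
codes

# Converting to character first keeps the labels.
labels <- ifelse(f == "No Class", NA, as.character(f))
labels
```

Either version removes the unwanted category from a subsequent `table` call, since `table` ignores NA by default, but only the second keeps readable row and column names.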

Alternatively, you could drop that row and column directly from factor_sc (built without addmargins, otherwise the Sum margins end up in the test):

factor_sc[-5,-4] %>% chisq.test
Dan Chaltiel
  • `factor_sc[-5,-4]` seems wrong. If it worked, df would be 3*2 = 6, but it reports 12, which might come from the original table of 5 classes and 4 getaheads. – Darae-Uri Oct 13 '18 at 08:08
  • Without a bit of your dataset or a [MRE](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), it's difficult to say. When I try it on a dummy `table`, it works and df goes from 9 to 4, but maybe your case is specific. What about the other code? – Dan Chaltiel Oct 13 '18 at 08:14
  • I found the problem: it is the line `factor_sc <- addmargins(factor_sc)`. I deleted it, ran the code you gave, and it works. Thanks! – Darae-Uri Oct 13 '18 at 08:45
  • Good for you! (Sorry, I didn't even check that function.) I still think the first snippet is much better, though. Are you sure the `gss1` variable is empty? – Dan Chaltiel Oct 13 '18 at 08:51
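The degrees-of-freedom discrepancy discussed in these comments can be checked with the counts from the question. This is a sketch: the matrix below is typed in by hand from the printed table (without the Sum margins), and the names `counts`, `tab` and `with_margins` are chosen here for illustration.

```r
# Counts copied from the table in the question (filled column by column).
counts <- matrix(
  c(1063, 10229, 9914, 701, 0,   # Hard Work
     368,  3221, 3624, 265, 0,   # Both Equally
     299,  1870, 1612, 100, 0,   # Luck Or Help
       0,     0,    0,   0, 0),  # Other
  nrow = 5,
  dimnames = list(
    class = c("Lower Class", "Working Class", "Middle Class",
              "Upper Class", "No Class"),
    getahead = c("Hard Work", "Both Equally", "Luck Or Help", "Other")
  )
)
tab <- as.table(counts)
with_margins <- addmargins(tab)  # adds a Sum row and a Sum column

# Without margins, [-5, -4] leaves the 4 x 3 table of real categories.
dim(tab[-5, -4])
df_plain <- unname(chisq.test(tab[-5, -4])$parameter)
df_plain    # (4 - 1) * (3 - 1) = 6

# With margins, the Sum row and column survive the subsetting,
# so the test sees a 5 x 4 table.
dim(with_margins[-5, -4])
df_margins <- unname(chisq.test(with_margins[-5, -4])$parameter)
df_margins  # (5 - 1) * (4 - 1) = 12
```

This is why removing the `addmargins` line fixes the result: the margins should only be added for display, not passed to `chisq.test`.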
0

Try a simple `df$column <- NULL` to drop a column, and something similar for a row. If you are trying to achieve something bigger, good luck.
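As a rough illustration of this suggestion on a data frame, here is a sketch with a made-up `df`; negative row indexing is used as one possible reading of "similar for a row". Note that this drops data only and does not change any factor levels.

```r
# Hypothetical data frame for illustration.
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Drop a column by assigning NULL to it.
df$b <- NULL

# Drop a row with a negative index (here the second row).
df <- df[-2, ]

df
```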

Amith DS