I'm using haven-labelled data frames (the variables already have value labels when the datasets are imported). I need to run many cross-tabulations of two variables. I'm using the cro function from the expss package because it displays value labels by default and computes weighted crosstabs.

However, the output tables display unused value labels. How can I drop them without doing so manually for each variable? (By the way, the fre function from expss has the argument drop_unused_labels = TRUE by default, but cro doesn't.)

Here is a reproducible example:

# Load expss first; it provides var_lab, val_lab, make_labels and cro
library(expss)

# Data frame
df <- data.frame(sex = c(1, 2, 99, 2, 1, 2, 2, 2, 1, 2),
                 agegroup = c(1, 2, 99, 2, 3, 3, 2, 2, 2, 1),
                 weight = c(100, 20, 400, 300, 50, 50, 80, 250, 100, 100))

# Variable labels
var_lab(df$sex) <- "Sex"
var_lab(df$agegroup) <- "Age group"

# Value labels 
val_lab(df$sex) <- make_labels("1 Male 
                               2 Female
                               97 Didn't know
                               98 Didn't respond
                               99 Abandoned survey")

val_lab(df$agegroup) <- make_labels("1 1-29
                                        2 30-49
                                        3 50 and more
                                       97 Didn't know
                                       98 Didn't respond
                                       99 Abandoned survey")

cro(df$sex, df$agegroup, weight = df$weight)

 |     |                  | Age group |       |             |             |                |                  |
 |     |                  |      1-29 | 30-49 | 50 and more | Didn't know | Didn't respond | Abandoned survey |
 | --- | ---------------- | --------- | ----- | ----------- | ----------- | -------------- | ---------------- |
 | Sex |             Male |       100 |   100 |          50 |             |                |                  |
 |     |           Female |       100 |   650 |          50 |             |                |                  |
 |     |      Didn't know |           |       |             |             |                |                  |
 |     |   Didn't respond |           |       |             |             |                |                  |
 |     | Abandoned survey |           |       |             |             |                |              400 |
 |     |     #Total cases |         2 |     5 |           2 |             |                |                1 |

I want to get rid of the rows and columns labelled ‘Didn't know’ and ‘Didn't respond’, which never occur in the data.

MrFlick

1 Answer

You can use the drop_unused_labels function to remove value labels that do not occur in the data. Applied to the whole data frame, it cleans every variable in one call:

library(expss)
df1 <- drop_unused_labels(df)
cro(df1$sex, df1$agegroup, weight = df1$weight)
                                                                           
 |     |                  | Age group |       |             |                  |
 |     |                  |      1-29 | 30-49 | 50 and more | Abandoned survey |
 | --- | ---------------- | --------- | ----- | ----------- | ---------------- |
 | Sex |             Male |       100 |   100 |          50 |                  |
 |     |           Female |       100 |   650 |          50 |                  |
 |     | Abandoned survey |           |       |             |              400 |
 |     |     #Total cases |         2 |     5 |           2 |                1 |
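
To see what drop_unused_labels changes before building any table, the labels of a single variable can be inspected with val_lab. The following is only a minimal sketch: it rebuilds the sex variable from the question as a standalone vector, and the names sex and sex_clean are chosen here for illustration.

```r
library(expss)

# Same codes as the sex column in the question (standalone vector for brevity)
sex <- c(1, 2, 99, 2, 1, 2, 2, 2, 1, 2)
val_lab(sex) <- c("Male" = 1, "Female" = 2, "Didn't know" = 97,
                  "Didn't respond" = 98, "Abandoned survey" = 99)

# Before: five labels, although codes 97 and 98 never occur in the data
val_lab(sex)

# After: only labels whose codes are present in the data are kept
sex_clean <- drop_unused_labels(sex)
val_lab(sex_clean)
```

The underlying values are not modified, only the label attribute, so the crosstab counts should be unaffected.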
Ronak Shah