R table function including filtered out rows

Question

I have a dataframe that is read in with readRDS() as a df. This contains many rows with cities and states. I keep only data that is in the state of California as df_ca.

df_ca contains 100 columns, and I keep only a few categorical ones in a new data frame called df_cat. I want to loop over the categorical columns and get the frequencies with the table function. Ignoring the loop for troubleshooting, I set var to city and run the table function, storing the result in a new df called cat_freq. cat_freq contains all cities from df rather than only those in df_ca; the extra cities have a Freq of 0. Why are they showing up at all if they were filtered out? I am new to R but have a Python background.

df <- as.data.frame(readRDS('some.data.5140')) 
df_ca <- df[df$car.state == "ca",]
cat_col <- (unlist(list('color', 'city', 'deliver', 'type')))
df_cat <- df_ca[, cat_col]
var <- "city"
cat_freq <-  data.frame(table(df_cat[var]))

Thomas: You can go ahead and accept your answer to show that the question has been resolved. I'll delete my answer. — A5C1D2H2I1M1N2O1R2T1, Dec 03 '20 at 02:46

score 2 · Accepted Answer · answered Dec 03 '20 at 02:30

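Incorporating droplevels fixed the problem. Subsetting a data frame keeps every level of its factor columns, and table reports a count for each level, so the filtered-out cities showed up with a Freq of 0. droplevels removes the levels that no longer occur in the data.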

df <- as.data.frame(readRDS('some.data.5140')) 
df_ca <- df[df$car.state == "ca",]
cat_col <- c('color', 'city', 'deliver', 'type')
df_cat <- df_ca[, cat_col]
df_cat <- droplevels(df_cat)
var <- "city"
cat_freq <-  data.frame(table(df_cat[var]))
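To see the effect in isolation, here is a minimal sketch on toy data. The data frame below is made up for illustration (only the column names car.state and city come from the question), and stringsAsFactors = TRUE is set on the assumption that the real columns are factors.

```r
# Toy data standing in for the real file (hypothetical values).
df <- data.frame(
  car.state = c("ca", "ca", "ny", "tx"),
  city = c("LA", "SF", "NYC", "Austin"),
  stringsAsFactors = TRUE  # assume the real columns are factors
)

df_ca <- df[df$car.state == "ca", ]

# The subset still carries all four city levels.
levels(df_ca$city)

# table() counts every level, so NYC and Austin appear with a count of 0.
before <- table(df_ca$city)

# droplevels() removes levels that no longer occur in the data.
after <- table(droplevels(df_ca)$city)

print(before)
print(after)
```

Here before has four entries (two of them zero) and after has only the two California cities.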

score 0 · Answer 2 · answered Dec 03 '20 at 02:19

That happens because your columns are of type factor, and a factor keeps all of its original levels after subsetting. Converting them to character should help.

df_ca <- df[df$car.state == "ca",]
cat_col <- c('color', 'city', 'deliver', 'type')
df_cat <- df_ca[, cat_col]
# Convert all columns in df_cat to character
df_cat[] <- lapply(df_cat, as.character)
var <- "city"
cat_freq <-  data.frame(table(df_cat[var]))
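As a rough sketch of how this could extend to the loop over categorical columns mentioned in the question, the example below uses made-up toy data and a hypothetical freqs list; it assumes the columns start out as factors.

```r
# Toy data; the values are illustrative only.
df <- data.frame(
  car.state = c("ca", "ca", "ny"),
  city = c("LA", "LA", "NYC"),
  color = c("red", "blue", "red"),
  stringsAsFactors = TRUE  # assume the real columns are factors
)
cat_col <- c("color", "city")
df_cat <- df[df$car.state == "ca", cat_col]

# Character vectors carry no levels, so table() only sees values that are present.
df_cat[] <- lapply(df_cat, as.character)

# Build one frequency data frame per categorical column.
freqs <- lapply(cat_col, function(v) {
  as.data.frame(table(df_cat[[v]]), stringsAsFactors = FALSE)
})
names(freqs) <- cat_col

print(freqs$city)
```

Each element of freqs has the columns Var1 and Freq, and freqs$city lists only LA.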
