51

Given the following mock data:

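set.seed(123)
# stringsAsFactors = TRUE makes `let` a factor on R >= 4.0.0 as well
x <- data.frame(let = sample(letters[1:5], 100, replace = TRUE),
                num = sample(1:10, 100, replace = TRUE),
                stringsAsFactors = TRUE)
y <- subset(x, let != 'a')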

Creating a table of y$let yields

a  b  c  d  e 
0 20 21 22 18

But I don't want a to show anymore. If I try to do this:

levels(y$let) <- factor(y$let)

I mess up the frequencies, since table(y$let) now gives me

b  d  c  e 
0 20 21 40 

I'm aware I could do xtabs(~ y$let, drop.unused.levels = TRUE) to work around the problem, but that doesn't reset the variable's levels at their core (which matters to me, since this is an early change to the dataset that will carry through the whole analysis). Moreover, xtabs returns a different class from table, which will give me headaches later in the project.

The question is: how can I automatically change levels(y$let) so it doesn't show levels that were dropped when I created the subset? In this case, how can I make it show [1] "b" "c" "d" "e"?

Brian Tompsett - 汤莱恩
Waldir Leoncio
    The winning answer in the duplicate question is not as good as the answer here. The other one should be marked as a duplicate of this, since this is a MUCH better answer – TheSteve0 Dec 20 '15 at 18:47

4 Answers

140

R has a built-in function for this, droplevels() (available since R 2.12.0):

y <- droplevels(y)
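
As a minimal sketch of what this does, the example below uses a small hypothetical data frame (not the one from the question) and compares the data frame method, which cleans every factor column, with the factor method applied to a single column. The names x, y and y_all are only illustrative.

```r
# Hypothetical data mirroring the question; stringsAsFactors = TRUE is
# spelled out because R >= 4.0.0 no longer creates factors by default.
x <- data.frame(let = c("a", "b", "c", "d", "e", "b"),
                num = 1:6,
                stringsAsFactors = TRUE)
y <- subset(x, let != "a")

levels(y$let)            # "a" is still listed although no row uses it

# Data frame method: drops unused levels in every factor column.
y_all <- droplevels(y)
levels(y_all$let)

# Factor method: applies to a single column only.
y$let <- droplevels(y$let)
table(y$let)             # counts unchanged, no empty "a" column
```

Either form should leave the data itself untouched; only the set of levels attached to the factor changes.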
Señor O
23

Just do y$let <- factor(y$let). Running factor on an existing factor variable will reset the levels to only those that are present.
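
A small sketch may help show why this works while the levels(...) <- assignment from the question does not: assigning to levels() relabels the existing levels by position, whereas factor() rebuilds the level set from the values that are present. The vectors f, g and h are hypothetical and only serve the illustration.

```r
# A factor that carries an unused level "a".
f <- factor(c("b", "c", "c"), levels = c("a", "b", "c"))

# levels<- relabels by position: old "a" -> "b", old "b" -> "c", old "c" -> "d".
g <- f
levels(g) <- c("b", "c", "d")
as.character(g)   # the values themselves have been renamed

# factor() keeps the values and recomputes the levels from them.
h <- factor(f)
levels(h)
as.character(h)
```

This positional relabelling is why the counts in the question end up attached to the wrong letters.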

Hong Ooi
3

Adding to Hong Ooi's answer, here is an example I found on R-Bloggers.

# Create some fake data
x <- as.factor(sample(head(colors()),100,replace=TRUE))
levels(x)
x <- x[x!="aliceblue"]
levels(x) # still the same levels
table(x) # even though one level has 0 entries!

The solution is simple: run factor() again:
x <- factor(x)
levels(x)
CRich
2

The forcats package for working with factors is often a good choice.

library(forcats)
y$let <- fct_drop(y$let)
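
As a hedged sketch, assuming the forcats package is installed, fct_drop() also takes an only argument that restricts which unused levels may be removed. The factor f below is hypothetical.

```r
library(forcats)  # assumes the forcats package is installed

# "a" and "d" are declared as levels but never occur in the data.
f <- factor(c("b", "c"), levels = c("a", "b", "c", "d"))

levels(fct_drop(f))               # every unused level is dropped
levels(fct_drop(f, only = "a"))   # only "a" is dropped; unused "d" is kept
```

This can be useful when some empty levels should stay visible in later tables or plots.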
Linda Marsh