I'm trying to use forcats::fct_relevel to specify the levels in a column, the way I've used it in ggplot, but it's giving an error about "unknown levels".

Here is a chart of the cheeses I have eaten per month:

library(tidyverse)  # attaches tibble (tribble), dplyr (mutate, %>%) and forcats

cheeses <- tribble(
  ~mymonth, ~Brie, ~Stilton,
  1, 4, 2, 
  2, 4, 1,
  3, 1, 3,
  4, 1, 5,
  5, 2, 4,
  6, 3, 1
)

and a list of the months:

cheesemonth<-c("Jan", "Feb", "Mar", "Apr", "May", "Jun")

According to pages like this one, I should be able to do the following:

cheeses %>% 
  mutate(mymonth=factor(mymonth)) %>% 
  mutate(mymonth=fct_relevel(mymonth, cheesemonth))

and have the items in mymonth replaced by the items in cheesemonth. But instead I get:

6 unknown levels in `f`: Jan, Feb, Mar, Apr, May, and Jun 

and I'm at a loss to understand why.

If I replace the last line with:

mutate(mymonth=case_match(mymonth, "1" ~ "Jan", "2" ~ "Feb", "3" ~ "Mar", "4" ~ "Apr", "5" ~ "May", "6" ~ "Jun"))

then it's fine, but this is more typing, and means I can't re-use the cheesemonth list.

So why do I get the unknown levels error?

Z.Lin
donnek
  • Why not simply use `cheeses %>% mutate(mymonth = factor(mymonth, labels = cheesemonth))`? It doesn't sound like you really need the forcats package here. – Z.Lin Mar 19 '23 at 12:35
  • I re-opened this since the nature of those questions is that `factor` is a good answer but here `ordered` would be better. – G. Grothendieck Mar 19 '23 at 14:31

2 Answers

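fct_relevel only reorders levels that already exist. After factor(mymonth) the levels are "1" to "6", so "Jan", "Feb" and so on are reported as unknown. To change the labels, which forcats calls values, use lvls_revalue: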

library(forcats)

lvls_revalue(as.character(cheeses$mymonth), cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan Feb Mar Apr May Jun

or use fct

library(forcats)

fct(cheesemonth[cheeses$mymonth], cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan Feb Mar Apr May Jun

It is even easier with base R:

factor(cheeses$mymonth, labels = cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan Feb Mar Apr May Jun

or given that months have a natural order you may wish to create an ordered factor (also base R):

ordered(cheeses$mymonth, labels = cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan < Feb < Mar < Apr < May < Jun

Note that R has a built-in month.abb vector (English only) so we could eliminate cheesemonth and write:

month.abb
## [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"

ordered(cheeses$mymonth, labels = month.abb[1:6])
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan < Feb < Mar < Apr < May < Jun

or to allow for months that are not present in the data

ordered(cheeses$mymonth, levels = 1:12, labels = month.abb)
## [1] Jan Feb Mar Apr May Jun
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
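
To see why the original call complained, it may help to compare reordering with relabelling side by side. The following is only a minimal sketch: the factor `f` is a stand-in for `factor(cheeses$mymonth)`, and the name-based mapping via `fct_recode` with `!!!` splicing is an alternative suggested here, not something the question used.

```r
library(forcats)

cheesemonth <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun")

# Stand-in for factor(cheeses$mymonth): the levels are "1" to "6".
f <- factor(c(1, 2, 3, 4, 5, 6))

# fct_relevel() moves existing levels to the front; it never renames them.
moved <- fct_relevel(f, "6", "5")
levels(moved)
## [1] "6" "5" "1" "2" "3" "4"

# "Jan" etc. are not levels of f, so forcats flags them as unknown.
# Catch both conditions, since whether this is a warning or an error
# is treated here as version-dependent.
res <- tryCatch(
  fct_relevel(f, cheesemonth),
  warning = function(w) "warning",
  error   = function(e) "error"
)

# Relabelling by name: build a named vector new = "old" and splice it in.
mapping <- setNames(levels(f), cheesemonth)
relabelled <- fct_recode(f, !!!mapping)
levels(relabelled)
## [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun"
```

The named mapping pairs each new label with an explicit old level, so it does not depend on the order in which the levels happen to be stored.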
G. Grothendieck
  • A BIG thank-you to jay.sf and you, Gabor - I've spent another couple of hours typing stuff, and I think I finally understand what I was doing wrong. Thanks too for the info on ordered() and the month.abb list, which will be really useful. So kind of you both to share your time. – donnek Mar 19 '23 at 16:13

You have levels 1:6 and the six labels in cheesemonth, which can be combined in factor like so:

cheeses$mymonth <- factor(cheeses$mymonth, levels=1:6, labels=cheesemonth)
cheeses
#   mymonth Brie Stilton
# 1     Jan    4       2
# 2     Feb    4       1
# 3     Mar    1       3
# 4     Apr    1       5
# 5     May    2       4
# 6     Jun    3       1

This also works with pipes in base R,

cheeses |> transform(mymonth=factor(mymonth, levels=1:6, labels=cheesemonth))

or using dplyr.

library(magrittr)
cheeses %>% dplyr::mutate(mymonth=factor(mymonth, levels=1:6, labels=cheesemonth))
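
Spelling out levels, as in the calls above, also guards against a sorting surprise. This is a small base R sketch using hypothetical data (twelve months stored as character strings, which is not the case in the question, where mymonth is numeric): without levels, factor() sorts the unique values as text, so the labels can land on the wrong months.

```r
# Hypothetical input: month numbers held as strings.
m <- as.character(1:12)

# Without `levels`, the values are sorted as text ("1", "10", "11", "12", "2", ...),
# so the fifth label, "May", is attached to the value "2".
wrong <- factor(m, labels = month.abb)
as.character(wrong[2])
## [1] "May"

# With explicit levels, each value is paired with the intended label.
right <- factor(m, levels = as.character(1:12), labels = month.abb)
as.character(right[2])
## [1] "Feb"
```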

Data:

cheeses <- structure(list(mymonth = c(1, 2, 3, 4, 5, 6), Brie = c(4, 4, 
1, 1, 2, 3), Stilton = c(2, 1, 3, 5, 4, 1)), class = "data.frame", row.names = c(NA, 
-6L))
jay.sf