Why does case_when not work when two conditions involve the same column?

Question

I'm using case_when within dplyr to look up values in different tables depending on the value in one of my columns. It works with a single condition (either of the conditions below on its own), but when I add a second condition involving the same column I get this error:

'names' attribute [1] must be the same length as the vector [0]

Here's a small example of the data, and the script that produces the error:

library(lubridate)
directory <- structure(list(data_label = c("BufT.1", "CTID.1", "CTNM.1", "CTOw.1", 
"CTTL.1", "CpEP.1", "DATA.1", "DATA.2", "DATA.3", "DATA.4", "DATA.5", 
"DATA.6", "DATA.7", "DATA.8", "DATA.105", "DCHT.1", "DSam.1", 
"DySN.1", "Dye#.1", "DyeB.1", "DyeB.2", "DyeB.3", "DyeB.4", "DyeB.5", 
"DyeN.1", "DyeN.2", "DyeN.3", "DyeN.4", "DyeN.5", "DyeW.1", "DyeW.2", 
"DyeW.3", "DyeW.4", "DyeW.5", "EPVt.1", "EVNT.1", "EVNT.2", "EVNT.3", 
"EVNT.4", "GTyp.1", "HCFG.1", "HCFG.2", "HCFG.3", "HCFG.4", "InSc.1", 
"InVt.1", "LANE.1", "LIMS.1", "LNTD.1", "LsrP.1", "MCHN.1", "MODF.1", 
"MODL.1", "NAVG.1", "NLNE.1", "OfSc.1", "PSZE.1", "PTYP.1", "PXLB.1", 
"RGNm.1", "RGOw.1", "RMXV.1", "RMdN.1", "RMdV.1", "RMdX.1", "RPrN.1", 
"RPrV.1", "RUND.1", "RUND.2", "RUND.3", "RUND.4", "RUNT.1", "RUNT.2", 
"RUNT.3", "RUNT.4", "Rate.1", "RunN.1", "SCAN.1", "SMED.1", "SMLt.1", 
"SVER.1", "SVER.3", "SVER.4", "Satd.1", "Scal.1", "Scan.1", "SpNm.1", 
"TUBE.1", "Tmpr.1", "User.1"), abif_type = c("short array", "cString", 
"cString", "cString", "pString", "byte", "short array", "short array", 
"short array", "short array", "short array", "short array", "short array", 
"short array", "short array", "short", "short", "pString", "short", 
"char", "char", "char", "char", "char", "pString", "pString", 
"pString", "pString", "pString", "short", "short", "short", "short", 
"short", "long", "pString", "pString", "pString", "pString", 
"pString", "cString", "cString", "cString", "cString", "long", 
"long", "short", "pString", "short", "long", "pString", "pString", 
"char[4]", "short", "short", "long array", "long", "cString", 
"long", "cString", "cString", "cString", "cString", "cString", 
"char array", "cString", "cString", "date", "date", "date", "date", 
"time", "time", "time", "time", "user", "cString", "long", "pString", 
"pString", "pString", "pString", "pString", "long array", "float", 
"short", "pString", "pString", "long", "pString")), row.names = c(NA, 
-90L), class = "data.frame")


data.date <- structure(list(index = c("RUND.1", "RUND.2", "RUND.3", "RUND.4"
), date = structure(c(19234, 19234, 19234, 19234), class = "Date")), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"))

data.time <- structure(list(index = c("RUNT.1", "RUNT.2", "RUNT.3", "RUNT.4"
), time = new("Period", .Data = c(56, 47, 8, 48), year = c(0, 0, 0, 0), 
month = c(0, 0, 0, 0), day = c(0, 0, 0, 0), hour = c(17, 19, 18, 19), 
minute = c(48, 0, 19, 0)), hsecond = c(0L, 0L, 0L, 0L)), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"))



library(dplyr)
results <- directory %>%
  mutate(value = case_when(
    abif_type == "date" ~ data.date$date[match(data_label, data.date$index)],
    abif_type == "time" ~ data.time$time[match(data_label, data.time$index)]))

You may need to do a join and then coalesce with the same type i.e. `directory %>% left_join(data.date, by = c("data_label" = "index")) %>% left_join(data.time, by = c("data_label" = "index")) %>%mutate(value = coalesce(as.character(date), as.character(time))) %>% select(-date, -time, -hsecond)` — akrun, Sep 13 '22 at 20:05
@GregorThomas why does it work for both arguments on their own then? — Mike, Sep 13 '22 at 20:07
@akrun that's one way round it, though my real world example has quite a few conditions, so it would be good to have an `if` statement like `case_when` working — Mike, Sep 13 '22 at 20:10

score 1 · Answer 1 · answered Sep 13 '22 at 20:18

I've found an answer, borrowing from this related question: "dplyr case_when throws error names' attribute [1] must be the same length as the vector [0]".

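The problem was that the date and the time are of different classes: one branch returns a Date and the other a lubridate Period, and case_when needs every right-hand side to return the same type. Each condition works on its own because there is then only one class involved. Converting both branches to character fixes it: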

results <- directory %>%
    mutate(value = case_when(
      abif_type == "date" ~ as.character(data.date$date[match(data_label, data.date$index)]),
      abif_type == "time" ~ as.character(data.time$time[match(data_label, data.time$index)])))
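
To see the mismatch in isolation, here is a minimal sketch that does not use the question's data. The values `d`, `p` and `kind` are made up for illustration; the point is only that the two branches have different classes until both are converted to character.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-ins for one looked-up date and one looked-up time
d <- as.Date("2022-08-30")
p <- hms("17:48:56")

# The two right-hand sides have different classes
class(d)  # Date
class(p)  # Period (an S4 class from lubridate)

# Hypothetical type column with one row that matches neither condition
kind <- c("date", "time", "other")

# Converting every branch to character gives case_when a single common type
value <- case_when(
  kind == "date" ~ as.character(d),
  kind == "time" ~ as.character(p)
)

value
```

Rows that match no condition come back as NA. The trade-off is that the result is a character column, so dates and times have to be parsed again if they are needed for arithmetic later.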

Mike

Good find! It's the attribute that has mismatched length, not the result! – Gregor Thomas Sep 13 '22 at 20:23
