
I have a simple data set with two columns: a customer ID and a PESEL ID.

library(data.table)

df <- data.table(id_customer = c(12072510, 12072518, 12072222, 12072456, 
                             12072590, 12356518, 12657318, 12227018),
             id_pesel = c('97031112028', '00230604128', '01321802593', '18901500020', 
                          '31061701887', '65111611046', '81012409371', '86120607640'))

When I run my code, I get the warning message "the condition has length > 1 and only the first element will be used". The PESEL ID encodes the date of birth, which I need to extract. The first two digits are the year, the next two the month, and the next two the day. For people born in 1900-1999 the month field is 1 to 12. For people born in 2000-2099 the month field is 21 to 32 (the real month plus 20, so 20 has to be subtracted to decode it). For example: 99122011111 --> 1999/12/20 and 02221511111 --> 2002/02/15. If the PESEL ID is not 11 characters long, or the month field is outside the ranges 1-12 and 21-32, the result should be NA.
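To make the encoding rule concrete, here is a minimal base R sketch that decodes the two example numbers above into year, month and day. The names `pesel`, `century`, `month` and `decoded` are only illustrative, and the sketch assumes every input is a string of 11 digits.

```r
# The two example PESEL numbers from the description above.
pesel <- c("99122011111", "02221511111")

# Split the first six characters into two-digit fields.
yy <- as.integer(substr(pesel, 1, 2))
mm <- as.integer(substr(pesel, 3, 4))
dd <- as.integer(substr(pesel, 5, 6))

# Month field 1-12 means 19xx, 21-32 means 20xx, anything else is invalid.
century <- ifelse(mm >= 21 & mm <= 32, 2000L,
                  ifelse(mm >= 1 & mm <= 12, 1900L, NA_integer_))

# Remove the +20 offset for people born in 2000-2099.
month <- ifelse(century == 2000L, mm - 20L, mm)

decoded <- data.frame(pesel, year = century + yy, month = month, day = dd)
print(decoded)
```

Both steps use `ifelse()`, which works element by element, so the same lines handle a whole column at once.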

Code:

test <- df %>%  mutate(
birthdate =  if(nchar(id_pesel) == 11 & between(as.numeric(substring(id_pesel,3,4)),1,12)) {
  as.Date(paste0('19',substring(id_pesel,1,2),'-',substring(id_pesel,3,4),'-',substring(id_pesel,5,6)))
} else if (nchar(id_pesel) == 11 & between(as.numeric(substring(id_pesel,3,4)),1,12)) {
  as.Date(paste0('20',substring(id_pesel,1,2),'-',as.numeric(substring(id_pesel,3,4))-20,'-',substring(id_pesel,5,6)))
} else if (nchar(id_pesel) == 11 & !between(as.numeric(substring(id_pesel,3,4)),1,12) & !between(as.numeric(substring(id_pesel,3,4)),21,32)) {
  as.Date(NA)
} else if (nchar(id_pesel) != 11) {
  as.Date(NA)
})

How can I correct this code or write a new, clean function?

  • Have a look at the `case_when()` function :) – Julian May 04 '22 at 15:01
  • `if()` isn't vectorized---it's for checking a single condition, not a condition for every value in a vector (or every row in a column). `ifelse()` is vectorized, but when you have multiple nested conditions, Julian's suggestion to use `case_when` will make things much simpler. – Gregor Thomas May 04 '22 at 15:06
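Following the two comments, a possible rewrite with `case_when()` might look like the sketch below. It is one way to do it under a few assumptions, not a definitive solution: it uses a plain `data.frame` with the first four rows of the sample data, the helper columns `yy`, `mm`, `dd`, `valid` and `birth_chr` are names introduced here for illustration, and impossible dates (for example day 31 in a 30-day month) are assumed to be acceptable as `NA`.

```r
library(dplyr)

# First four rows of the sample data; the last one has an invalid month field (90).
df <- data.frame(
  id_customer = c(12072510, 12072518, 12072222, 12072456),
  id_pesel = c("97031112028", "00230604128", "01321802593", "18901500020"),
  stringsAsFactors = FALSE
)

test <- df %>%
  mutate(
    yy = substr(id_pesel, 1, 2),
    # Non-numeric month fields become NA instead of raising a warning.
    mm = suppressWarnings(as.integer(substr(id_pesel, 3, 4))),
    dd = substr(id_pesel, 5, 6),
    valid = nchar(id_pesel) == 11 & !is.na(mm),
    # case_when() checks the conditions row by row, unlike if().
    birth_chr = case_when(
      valid & between(mm, 1L, 12L)  ~ sprintf("19%s-%02d-%s", yy, mm, dd),
      valid & between(mm, 21L, 32L) ~ sprintf("20%s-%02d-%s", yy, mm - 20L, dd),
      TRUE ~ NA_character_
    ),
    # With an explicit format, strings that are not real dates parse to NA.
    birthdate = as.Date(birth_chr, format = "%Y-%m-%d")
  ) %>%
  select(id_customer, id_pesel, birthdate)

print(test)
```

Building the date as a string first and converting once at the end keeps every `case_when()` branch the same type, which the function requires. Both invalid cases (wrong length, month out of range) fall through to the final `TRUE` branch, so they do not need separate conditions.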
