1

I want to identify who are default and voluntary members in an Insurance database. Default members are ones with a certain number of units depending on their age. Voluntary members are any members with more units than default members at that age.

I want to create a column in R that says either "Default" or "Voluntary"

I have a table of the number of units a default member has. For example:

Age Units
18  2
19  2
20  2
21  2
22  2
23  2
24  2
25  3
26  3
27  3
28  3
29  3
30  3
31  4
32  4
33  4
34  4
35  4
36  4
37  4
38  4
39  4
40  4
41  4
42  4
43  4
44  4
45  4
46  4
47  4
48  4
49  4
50  3
51  3
52  3
53  3
54  3
55  3
56  3
57  3
58  3
59  3
60  2
61  2
62  2
63  2
64  2
65  1
66  1
67  1
68  1
69  1

I would usually do this in excel by vlookup-ing the member's number of units and if it equals the default number of units from above table I would say they are default and if not non default.

This is how I would achieve in excel

if( MembersUnits = vlookup(memberage,defaultunitstable,2,0),"Default", "Voluntary")

I expect out put to be "Default" or "Voluntary"

Axeman
  • 32,068
  • 8
  • 81
  • 94
Dane
  • 19
  • 2

2 Answers2

0

Using the data you supplied as a lookup table, I created data of person age and the number of units they have, joined the threshold values from lookup and compared the values with ifelse:

library(dplyr)

lookup <- structure(list(Age = 18:69,
                         Units = c(2L, 2L, 2L, 2L, 2L, 2L, 
                                   2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
                                   4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 
                                   3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L)),
                    row.names = c(NA, 
                                  -52L), class = c("tbl_df", "tbl", "data.frame"))

dat <- tibble(Age = c(50, 50, 49, 32, 18), Units = c(3, 5, 5, 4, 3))

left_join(dat, rename(lookup, "Threshold" = "Units"), by = "Age") %>%
  mutate(member = ifelse(Units == Threshold, "Default", "Voluntary"))

# A tibble: 5 x 4
    Age Units Threshold member   
  <dbl> <dbl>     <int> <chr>    
1    50     3         3 Default  
2    50     5         3 Voluntary
3    49     5         4 Voluntary
4    32     4         4 Default  
5    18     3         2 Voluntary
Paul
  • 2,877
  • 1
  • 12
  • 28
0
if (!require("prodlim")) {
  install.packages("prodlim")
  require("prodlim")
} # ensure installation and loading of package "prodlim"

ifelse(is.na(row.match(as.data.frame(dat), as.data.frame(lookup))),
       "Voluntary", 
       "Default")
## [1] "Default"   "Voluntary" "Voluntary" "Default"   "Default"   "Default" 

## the function
## prodlim::row.match(as.data.frame(dat), as.data.frame(lookup))
## returns for each row in dat,
##    the matching row number in lookup or 
##    NA if there is no match
## 
## This resulting vector one can use to translate any non-NA to "Default" and
## any NA to "Voluntary" using the vectorized `ifelse`

Ah I used as example data following @Paul:

require(dplyr)
dat <- tibble(Age = c(50, 50, 49, 26, 32, 18), Units = c(3, 5, 5, 3, 4, 2))
lookup <- structure(list(Age = 18:69,
                         Units = c(2L, 2L, 2L, 2L, 2L, 2L, 
                                   2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
                                   4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 
                                   3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L)),
                    row.names = c(NA, 
                                  -52L), class = c("tbl_df", "tbl", "data.frame"))
Gwang-Jin Kim
  • 9,303
  • 17
  • 30