
I have an age variable and I need to recode it into categories. I've seen both of these questions asked, but the answers only seem to create something in memory. When I open the data.table, the new categorical variable is not there: I can't see it and I can't subset with it, yet I can run a frequency on it. I need it to be its own variable.

R code to categorize age into group/ bins/ breaks

Convert Age variable into ordinal variable

How do I convert a continuous variable into a factor and have a tangible variable afterwards? Or, how do I take whatever is being created in memory and make it real?

`setDT(LSSCM)[client_age <=17, agegroup := "0-17"]`
`LSSCM[client_age >=18 & client_age <=24, agegroup := "18-24"]`
`LSSCM[client_age >=25 & client_age <=30, agegroup := "25-30"]`
`LSSCM[client_age >=31 & client_age <=39, agegroup := "31-39"]`
`LSSCM[client_age >=40 & client_age <=54, agegroup := "40-54"]`
`LSSCM[client_age >=55 & client_age <=64, agegroup := "55-64"]`
`LSSCM[client_age >=65 & client_age <=75, agegroup := "65-75"]`
`LSSCM[client_age >=76, agegroup := "76+"]`

Also tried.

LSSCM$age_cat <- case_when(LSSCM$client_age <= 17 ~ '0-17',
                           between(LSSCM$client_age, 18, 24) ~ '18-24',
                           between(LSSCM$client_age, 25, 30) ~ '25-30',
                           between(LSSCM$client_age, 31, 39) ~ '31-39',
                           between(LSSCM$client_age, 40, 54) ~ '40-54',
                           between(LSSCM$client_age, 55, 64) ~ '55-64',
                           between(LSSCM$client_age, 65, 75) ~ '65-75',
                           LSSCM$client_age >= 76 ~ '76+')
getoffmylap
  • After you try the one with `case_when`, what happens? How do you know it didn't work? – iod Nov 01 '19 at 15:17
  • The recoding worked, but when I opened the dataset (frame, whatever), I couldn't find it. There are only 32 variables, so it's not like it was hard to see. It just wasn't there. I scrolled back and forth a ton. But I could run a freq on it, so it was somewhere. – getoffmylap Nov 01 '19 at 15:24
  • Ok, I ended up getting frustrated, deleted everything and started over, and now it's coming up. ??? I did literally nothing different, I swear. I re-ran the setDT one when I started over and it IS showing up in the data.frame now. Thanks. I swear, I breathe on this stuff wrong and stuff goes wrong. – getoffmylap Nov 01 '19 at 15:25

1 Answer


Simply assign the result of your preferred solution into a column in the data.frame. For example:

df$agegroups<-cut(df$ages, breaks=c(20, 30, 40, 50), right = FALSE)
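
To confirm that the new column really is part of the data frame, and not just something floating around in memory, a quick check with `names()` can help. This is only a minimal sketch: the `df` below and its `ages` values are made up for illustration.

```r
# Hypothetical sample data; `ages` stands in for your real age column
df <- data.frame(ages = c(22, 35, 41, 48))

# Assign the result of cut() to a new column of the data frame
df$agegroups <- cut(df$ages, breaks = c(20, 30, 40, 50), right = FALSE)

# The new column should now be listed alongside the original one
names(df)
"agegroups" %in% names(df)

# It is an ordinary factor column, so it can be used for subsetting
class(df$agegroups)
df[df$agegroups == "[40,50)", ]
```

If the column shows up in `names(df)` but not in a viewer tab that was already open, the viewer may simply be showing an older snapshot; calling `View(df)` again is one thing to try.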

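A fuller example with a small data frame (ages of 75 and over come out as `<NA>` here because the breaks stop at 75 and `right = FALSE` leaves the last interval open on the right):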

df<-data.frame(age = c(55, 60, 65, 70, 75, 80, 85, 90, 95))
df
  age
1  55
2  60
3  65
4  70
5  75
6  80
7  85
8  90
9  95
df$age_cat<-cut(df$age, breaks=c(0,17,24,30,39,54,64,75), right = FALSE)
df
  age age_cat
1  55 [54,64)
2  60 [54,64)
3  65 [64,75)
4  70 [64,75)
5  75    <NA>
6  80    <NA>
7  85    <NA>
8  90    <NA>
9  95    <NA>
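
To get the eight groups from the question without any `<NA>` values, the breaks can be set at the lower edge of each group and capped with `Inf`. This is a sketch that assumes whole-number ages; the `Inf` upper bound, the `labels` vector, and the two extra boundary ages (17 and 18) are additions for illustration, not part of the code above.

```r
# Sample ages from above, plus 17 and 18 to show the boundary behaviour
df <- data.frame(age = c(17, 18, 55, 60, 65, 70, 75, 80, 85, 90, 95))

# Lower edge of each group; with right = FALSE each interval is [lower, next lower)
breaks <- c(0, 18, 25, 31, 40, 55, 65, 76, Inf)
labels <- c("0-17", "18-24", "25-30", "31-39", "40-54", "55-64", "65-75", "76+")

# One label per interval, so there must be one fewer label than breaks
df$age_cat <- cut(df$age, breaks = breaks, labels = labels, right = FALSE)

# Frequency of each group, including any NA values if they occur
table(df$age_cat, useNA = "ifany")
```

Because `cut()` returns a factor whose levels follow the order of `labels`, the groups stay in age order in tables and plots.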
iod
  • I tried that with one of them and it still didn't show up:
    `LSSCM$age_cat <- case_when(LSSCM$client_age <= 17 ~ '0-17'`
    – getoffmylap Nov 01 '19 at 15:01
  • Still didn't work: `LSSCM$age_cat<-cut(LSSCM$client_age, breaks=c(0,17,24,30,39,54,64,75), right = FALSE) ` – getoffmylap Nov 01 '19 at 15:06
  • It's hard to tell what's going on without sample data for reproducing it... – iod Nov 01 '19 at 15:07
  • Can you provide in your original question some sample data using `dput(head(LSSCM))`? – iod Nov 01 '19 at 15:11
  • I apparently suck at formatting, so any help on making the code parts look right, would be greatly appreciated. I'm using the tilde, but apparently to hit or miss effect. – getoffmylap Nov 01 '19 at 15:15
  • To format code, just add four spaces before each line. No tildes or apostrophes needed – iod Nov 01 '19 at 15:15
  • But what I'm missing is some actual data, so I can see what's going on, because what you tried should work. – iod Nov 01 '19 at 15:16