I am relatively new to R and am trying to create a new variable as part of a homework assignment. Any help would be appreciated!

I have a data set that looks like this:

State    agegr
1         15-17
1         18-20
1         21-24
2         15-17
2         18-20
2         21-24

Currently State is the ID column. I would like to replace it with a single ID column that combines the state and the start of the age range, something like this:

State
1-15
1-18
1-21
2-15
2-18
2-21

I would also like to still be able to identify the state from the new ID.

Blisskarthik

1 Answer

Using your data:

df <- data.frame(State = c(1, 1, 1, 2, 2, 2), agegr = c('15-17', '18-20', '21-24', '15-17', '18-20', '21-24'), stringsAsFactors = FALSE)
df
##   State agegr
## 1     1 15-17
## 2     1 18-20
## 3     1 21-24
## 4     2 15-17
## 5     2 18-20
## 6     2 21-24

Here's an approach using sub() and paste():

data.frame(State = paste(df$State, sub('^(\\d+).*', '\\1', df$agegr), sep = '-'))
##   State
## 1  1-15
## 2  1-18
## 3  1-21
## 4  2-15
## 5  2-18
## 6  2-21

Here's an approach using strsplit() and paste():

data.frame(State = paste(df$State, unlist(strsplit(df$agegr, '-'))[c(TRUE, FALSE)], sep = '-'))
##   State
## 1  1-15
## 2  1-18
## 3  1-21
## 4  2-15
## 5  2-18
## 6  2-21
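
Since you also want to be able to identify the state afterwards, it may be simpler to add the combined ID as a new column and leave State alone. The following is a minimal sketch along those lines, not the only way to do it. The column name id and the helper variables age_start and state_from_id are names made up for illustration, and it assumes every agegr value has the form "start-end".

```r
# Rebuild the example data from the question.
df <- data.frame(
  State = c(1, 1, 1, 2, 2, 2),
  agegr = c("15-17", "18-20", "21-24", "15-17", "18-20", "21-24"),
  stringsAsFactors = FALSE
)

# Start of each age range: drop everything from the first hyphen onwards.
age_start <- sub("-.*$", "", df$agegr)

# "id" is a hypothetical column name; State and agegr stay untouched.
df$id <- paste(df$State, age_start, sep = "-")

# If only the combined ID is available later, the state can be recovered
# by taking the part before the first hyphen.
state_from_id <- as.numeric(sub("-.*$", "", df$id))

df
```

Keeping State as its own column means nothing has to be parsed back out of the ID, while the last step shows that the state is still recoverable from the ID alone if needed.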
bgoldst