I have a data frame like this:

TAGNAME                                  VALUE
XX:YY:ZZ:WXYX:title_for_this.and_that_a   20.2
PP:YY:ZZ:ABCF:title_for_this.and_that_b   45.7
QQ:YY:ZZ:FGHJ:title_for_this.and_that_c   27.2
RR:YY:ZZ:JYHG:title_for_this.and_that_d   30.9

I need to remove all the characters from TAGNAME that occur before the last colon (including the colon itself). So what I need is this:

TAGNAME                     VALUE
title_for_this.and_that_a    20.2
title_for_this.and_that_b    45.7
title_for_this.and_that_c    27.2
title_for_this.and_that_d    30.9

I can get all the characters before the last colon using:

tagnames <- sapply(strsplit(data_frame$TAGNAME, "\\:[^\\:]*$"), "[", 1)

I tried to use this to gsub the characters out from the TAGNAME like this:

for(i in 1:nrow(data_frame)) {
   data_frame[i,1] <- gsub(data_frame[i,1], tagnames[i],'')
 }

which, besides being an awful way to loop through a data frame, doesn't work.

Cybernetic

1 Answer

df$TAGNAME = sub(".*:","", df$TAGNAME)

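Explanation of the regex ".*:"
"." matches any character and "*" repeats it zero or more times, so ".*:" matches everything up to and including a colon. Because "*" is greedy, the match extends to the last colon in the string, and sub() replaces that match with an empty string.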
See this website for additional information on regex.
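
As a rough check, the approach can be run against the sample data from the question. This is only a sketch: the data frame is rebuilt by hand from the rows shown above, and the no_colon variable is a hypothetical extra case added for illustration.

```r
# Sample data rebuilt by hand from the question (an assumption about its exact types)
df <- data.frame(
  TAGNAME = c("XX:YY:ZZ:WXYX:title_for_this.and_that_a",
              "PP:YY:ZZ:ABCF:title_for_this.and_that_b",
              "QQ:YY:ZZ:FGHJ:title_for_this.and_that_c",
              "RR:YY:ZZ:JYHG:title_for_this.and_that_d"),
  VALUE = c(20.2, 45.7, 27.2, 30.9),
  stringsAsFactors = FALSE
)

# Greedy ".*" runs to the last colon, so everything up to and
# including that colon is replaced with the empty string
df$TAGNAME <- sub(".*:", "", df$TAGNAME)

# Hypothetical extra case: a string without a colon has no match
# and is returned unchanged
no_colon <- sub(".*:", "", "title_only")

print(df)
```

sub() is vectorised over its third argument, so no loop is needed. The loop in the question also passes its arguments in the wrong order: the signature is gsub(pattern, replacement, x), while the call there puts the string to be modified first.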

Haboryme