0

I want to extract the first letter, counting from the right-hand side, of each element of a character vector:

ABC20
BCD3
B1
AB2222
BX4444

So for the group above I would want C, D, B, B, X. Is there an easy way to do this? I know there is a substr and a numindex/charindex, so I think I can use these, but I am not sure exactly how in R.

Sotos
JRH31
  • related: [regex](https://stackoverflow.com/questions/4736/learning-regular-expressions) – jogo Feb 21 '18 at 08:07

3 Answers

3

You can use the stringi package:

stringi::stri_extract_last_regex(x, '[A-Z]')
#[1] "C" "D" "B" "B" "X"

DATA

x <- c('ABC20', 'BCD3', 'B1', 'AB2222', 'BX4444')
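
If adding a package is not an option, roughly the same extraction can be sketched in base R with regexpr and regmatches. The lookahead pattern below is an assumption about how to express "last uppercase letter" and is not part of the answer above; it needs perl = TRUE.

```r
x <- c("ABC20", "BCD3", "B1", "AB2222", "BX4444")

# Match an uppercase letter that has no further uppercase letter after it,
# i.e. the last uppercase letter in each string (lookahead needs perl = TRUE).
m <- regexpr("[A-Z](?=[^A-Z]*$)", x, perl = TRUE)
last_letter <- regmatches(x, m)
last_letter
```

Note that regmatches drops elements that have no match, so the result can be shorter than x if some strings contain no uppercase letter.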
Sotos
0

We can use sub to capture the last uppercase letter, `([A-Z])`, followed by zero or more digits (`\\d*`) up to the end (`$`) of the string, and replace the whole match with the backreference (`\\1`) to the captured group.

sub(".*([A-Z])\\d*$", "\\1", x)
#[1] "C" "D" "B" "B" "X"

data

x <- c("ABC20", "BCD3", "B1", "AB2222", "BX4444")
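
One thing to keep in mind with this approach: sub returns any string that does not match the pattern unchanged. The sketch below illustrates that and shows one possible way to get NA instead. The helper name last_upper is hypothetical and not part of the answer above.

```r
# Hypothetical helper (not from the answer): NA where the pattern does not match.
last_upper <- function(x) {
  pattern <- ".*([A-Z])\\d*$"
  out <- sub(pattern, "\\1", x)
  out[!grepl(pattern, x)] <- NA_character_
  out
}

y <- c("ABC20", "1234", "abc")

# Plain sub(): inputs without an uppercase letter come back as-is.
plain <- sub(".*([A-Z])\\d*$", "\\1", y)

# With the helper, those inputs become NA.
checked <- last_upper(y)
```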
akrun
  • 1
    When I added my answer with `stringi`, it was NOT in your answer. – Sotos Feb 21 '18 at 08:15
  • @Sotos I was adding the same answer and I didn't find a new answer – akrun Feb 21 '18 at 08:15
  • Ok. I'm just saying. I posted it and yours only had the `sub` solution. – Sotos Feb 21 '18 at 08:17
  • @Sotos I was updating at that time. Once I posted, I find yours too – akrun Feb 21 '18 at 08:17
  • 1
    @Sotos I removed the answer. What I meant is that I came up with the answer independently. By checking the timings, you posted it first. It is a fair argument. Having said that, I noticed many times others posting the same answer as mine (that too after 5 or 10 mins) and if I comment, then the obvious reaction would be `when i started typing there was no answer`. – akrun Feb 21 '18 at 10:44
  • 1
    I agree. Thank you for that. – Sotos Feb 21 '18 at 12:25
0

Try this:

Your data:

  x <- c("ABC20", "BCD3", "B1", "AB2222", "BX4444")

Identify the position of the first digit in each element

  number_pos <- gregexpr(pattern = "[0-9]", x)
  number_first <- unlist(lapply(number_pos, `[[`, 1))

Extract the character just before that position

  substr(x, number_first - 1, number_first - 1)
[1] "C" "D" "B" "B" "X"
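
Since only the first digit matters here, a shorter variant is possible with regexpr, which returns just the first match per element and so avoids the lapply step. This is a sketch, assuming every element has a letter immediately before its first digit, as in the example data.

```r
x <- c("ABC20", "BCD3", "B1", "AB2222", "BX4444")

# regexpr() returns the position of the first match per element (-1 if none).
first_digit <- as.integer(regexpr("[0-9]", x))

# Take the single character just before the first digit.
before_digit <- substr(x, first_digit - 1, first_digit - 1)
before_digit
```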
Terru_theTerror