How to select a portion of a string in R?

Question

I have strings like the one below in a big text file. How can I select only the part of the second line between "SN=" and ":2832397", i.e. "RK:7573-0"?

SIGN="000F 0E70 FA83 B72F D215 C7EE 4AF4 6440 A547 12B1 0603 \

SN=RK:7573-0:2832397:369963

1086 0857 BFF1 5FC2 CE6F C87D 7C00 DF64 C1AD DD39") }

Comment: Using a regular expression with `gsub` would be your best bet; maybe http://stackoverflow.com/questions/6109882/regex-match-all-characters-between-two-strings can be of assistance. (zacdav, Jan 09 '17 at 12:03)

score 2 · Accepted Answer · answered Jan 09 '17 at 12:04

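We can use `str_extract` from the stringr package, with a lookbehind for "SN=" and a lookahead for ":2832397". Here `lines` is the character vector of lines read from the file: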

library(stringr)
as.vector(na.omit(str_extract(lines, "(?<=SN=).*(?=:2832397)")))
#[1] "RK:7573-0"

Or with base R

gsub("^[^=]+\\=|(:\\d+){2,}$", "", grep("SN=", lines, value = TRUE))
#[1] "RK:7573-0"
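For anyone who wants to try this without the original file, here is a minimal self-contained sketch in base R. It assumes `lines` holds the three lines from the question (in practice it would probably come from something like `readLines()` on your own file). The names `sn_lines`, `res_gsub` and `res_look` are only illustrative, and the `regmatches`/`regexpr` call is a base R stand-in for the `str_extract` pattern, not part of the answer above.

```r
# Sample input mirroring the question. In practice `lines` might come from
# readLines("yourfile.txt"); that file name is a placeholder.
lines <- c(
  'SIGN="000F 0E70 FA83 B72F D215 C7EE 4AF4 6440 A547 12B1 0603 \\',
  'SN=RK:7573-0:2832397:369963',
  '1086 0857 BFF1 5FC2 CE6F C87D 7C00 DF64 C1AD DD39") }'
)

# Keep only the lines that contain "SN=".
sn_lines <- grep("SN=", lines, value = TRUE)

# Base R approach from the answer: strip everything up to and including the
# first "=", and the trailing run of two or more ":digits" groups.
res_gsub <- gsub("^[^=]+\\=|(:\\d+){2,}$", "", sn_lines)

# Same lookaround pattern as the str_extract call, but with base R.
# perl = TRUE is needed because lookbehind/lookahead are PCRE features.
res_look <- regmatches(
  sn_lines,
  regexpr("(?<=SN=).*(?=:2832397)", sn_lines, perl = TRUE)
)

print(res_gsub)
print(res_look)
```

Both calls should give the same result on this input. Note that the `gsub` version does not depend on the literal ":2832397", so it may be the better fit if that number varies from record to record.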

akrun


score 0 · Answer 2 · answered Jan 09 '17 at 12:09

You can start with a regex like this:

SN=([A-Z]+:[\d-]+):

Example: https://regex101.com/r/0qBwYc/1

Explanation:

SN= => matches "SN=" literally

[A-Z]+ => matches one or more uppercase letters

: => matches ":" literally

[\d-]+ => matches any digit or the character "-", one or more times

: => matches ":" literally

([A-Z]+:[\d-]+) => the parentheses create a capturing group, so you can retrieve only the part that matches "[A-Z]+:[\d-]+"
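The answer gives only the pattern, so here is one hedged way to apply it in R: a small sketch using `sub()` with a backreference to the capturing group. The variable names `x`, `pattern` and `captured` are illustrative. The pattern is anchored with `^` and `.*$` (an addition, so that the whole string is replaced by the group), and `perl = TRUE` is passed so that `\d` inside the character class is read as in the regex101 example.

```r
# One line in the format shown in the question.
x <- "SN=RK:7573-0:2832397:369963"

# The answer's pattern, with the backslash doubled for an R string and
# anchors added so that sub() replaces the entire string.
pattern <- "^SN=([A-Z]+:[\\d-]+):.*$"

# "\\1" refers to the first capturing group.
captured <- sub(pattern, "\\1", x, perl = TRUE)

print(captured)
```

Keep in mind that `sub()` returns the input unchanged when the pattern does not match, so it is worth filtering to the "SN=" lines first (for example with `grep`) before applying it to a whole file.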
