I have many strings of the form
name1, name2 and name3, 0, 1, 2
or
name1, name2, name3 and name4, 0, 1, 2
and would like to split each string into 4 elements, where the first is the whole text string of names. The problem is that strsplit doesn't differentiate between text and numbers, so splitting on the comma yields 5 elements in the first case and 6 in the second. How can I tell R to dynamically skip the text part of the string, which has a variable number of names?
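For reference, a minimal reproduction of the problem. The vector name `x` and the helper variables are introduced here for illustration only:

```r
# x is a hypothetical vector holding the two example strings
x <- c("name1, name2 and name3, 0, 1, 2",
       "name1, name2, name3 and name4, 0, 1, 2")

# a plain split on the comma also cuts the list of names apart
pieces <- strsplit(x, ",")

# number of pieces per string, where 4 are wanted in both cases
n_pieces <- sapply(pieces, length)
n_pieces
# [1] 5 6
```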

nopeva
2 Answers
3
You have two main options:
(1) grep for the numbers, and extract those.
(2) split on the comma, then coerce to numeric and check for `NA`s.
I prefer the second:
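x <- "name1, name2 and name3, 0, 1, 2"  # example string from the question
splat <- strsplit(x, ",")[[1]]
# TRUE for pieces that parse as numbers; the name pieces coerce to NA
numbs <- !is.na(suppressWarnings(as.numeric(splat)))
# rejoin the name pieces, then append the numeric pieces
c(paste(splat[!numbs], collapse=","), splat[numbs])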
# [1] "name1, name2 and name3" " 0" " 1" " 2"
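Option (1) is not spelled out above, so here is one possible sketch of it. It assumes the numbers always form a trailing run of comma-separated non-negative integers; the function name `split_names_numbers` is made up for this example and is not part of any package.

```r
# Hypothetical helper: split a string into its name part plus trailing integers.
split_names_numbers <- function(x) {
  # start of the trailing run ", <int>, <int>, ..." (-1 if there is none)
  start <- as.integer(regexpr("(, *[0-9]+)+ *$", x))
  if (start == -1L) return(x)

  names_part <- substr(x, 1, start - 1)   # everything before the numeric run
  tail_part  <- substring(x, start)       # e.g. ", 0, 1, 2"

  # pull the individual numbers out of the tail only, so digits in names are ignored
  nums <- regmatches(tail_part, gregexpr("[0-9]+", tail_part))[[1]]
  c(names_part, nums)
}

res1 <- split_names_numbers("name1, name2 and name3, 0, 1, 2")
res2 <- split_names_numbers("name1, name2, name3 and name4, 0, 1, 2")
res1
# [1] "name1, name2 and name3" "0" "1" "2"
```

Unlike the split-and-coerce approach, this returns the numbers without leading spaces, but it would need a different pattern for negative or decimal values.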

Ricardo Saporta
-
thanks for the answer, I get `Error in withCallingHandlers` when trying to generate the `numbs` object – nopeva Oct 06 '13 at 20:20
-
whoops, there should have been a `[[1]]` at the end of the `strsplit` line. fixed! – Ricardo Saporta Oct 06 '13 at 20:33
2
You could also insert a delimiter in the right places, and then split on that:
delimmed <- gsub('(.*[a-z][0-9]+| [0-9]+),','\\1%',strr)
strsplit(delimmed,'%')
The first part of the regular expression (to the left of the `|`) matches everything (`.*`) up to the final letter-number-comma combo; the second matches any space-number-comma combo. The comma is dropped (since it's outside the parentheses) and replaced by `%`.
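To see the substitution at work, here is a small sketch applied to both example strings from the question. It assumes `strr` is a character vector holding them and that `%` never occurs in the data; the `trimws` step is an optional addition to drop the leading spaces.

```r
# strr is assumed to hold the two example strings from the question
strr <- c("name1, name2 and name3, 0, 1, 2",
          "name1, name2, name3 and name4, 0, 1, 2")

# replace only the commas that should act as separators with "%"
delimmed <- gsub("(.*[a-z][0-9]+| [0-9]+),", "\\1%", strr)
delimmed
# [1] "name1, name2 and name3% 0% 1% 2"
# [2] "name1, name2, name3 and name4% 0% 1% 2"

# split on the new delimiter and strip surrounding whitespace
parts <- lapply(strsplit(delimmed, "%"), trimws)
```

Each element of `parts` then has 4 entries, with the full list of names in the first one.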

Frank