I have a list ("x") of 1000 smaller vectors of 1 line each. These sub-vectors contain strings and numbers. One of the lines contains an "id: XXXX" field, which is embedded within a string. I can use the following R code to combine successive vectors within the list, but only when I consider the first 2 vectors (i.e. x[[i]] and x[[i+1]]).
first_vec<-c("Page 1 of 1000", "Report of vectors within a list", "id: 1234 height: 164 cms", "health: good")
second_vec<-c("Page 2 of 1000", "Report of vectors within a list", "id: 1235 height: 180 cms", "health: moderate")
third_vec<-c("Page 3 of 1000", "Report of vectors within a list", "id: 1235 weight: 200 pounds", "health: moderate")
x<-list(first_vec, second_vec, third_vec)
X <- for (i in i:unique(length(x))) {
  t1 <- unlist(stringr::str_extract_all(x[[i]][!is.na(sample)], "(id: [0-9]+)"))
  t2 <- unlist(stringr::str_extract_all(x[[i + 1]][!is.na(sample)], "(id: [0-9]+)"))
  if (t1 == t2) {
    c(x[[i]], x[[i + 1]])
  }
}
The desired result is:
x<-list(first_vec, c(second_vec, third_vec))
This works for me when I have just two sub-vectors. However, my list has 1000 vectors. How can I apply the above piece of code across all the vectors within the list x?
At the moment I get the following warning and error:
Warning in is.na(sample) :
is.na() applied to non-(list or vector) of type 'closure'
Error in x[[i + 1]] : subscript out of bounds
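Both messages appear to come from the loop itself rather than from the data. No object called `sample` is ever created, so R resolves the name to the base function `sample` (a closure), and on the last iteration `i + 1` is one past the end of the list. A minimal sketch that reproduces each condition in isolation, using a small stand-in list (`demo`, a hypothetical name) instead of the real data:

```r
# `sample` was never assigned, so the name resolves to base::sample, a function
is_function <- is.function(sample)

# is.na() on a function warns and returns a single FALSE;
# the warning is suppressed here so the sketch runs quietly
na_check <- suppressWarnings(is.na(sample))

# Indexing one past the end of a list with [[ ]] raises an error;
# tryCatch() captures the message instead of stopping the script
demo <- list("a", "b", "c")
oob_message <- tryCatch(
  demo[[length(demo) + 1]],
  error = function(e) conditionMessage(e)
)
```

Because `!is.na(sample)` evaluates to a single `TRUE`, the subsetting keeps every element, so that part of the loop only produces the warning. The out-of-bounds access is what stops the loop.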
I am including an example of a typical input file I am applying the code to. In the example below, I would like to combine pages 2 and 3, since the ids match.
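One possible way to do this for the whole list is to extract the id from every vector first and then merge runs of adjacent vectors that share an id, which avoids indexing `x[[i + 1]]` altogether. The sketch below is a hedged illustration and not a definitive solution. It assumes each vector contains at most one "id: NNNN" string and that only adjacent vectors should be merged. It uses base R (`regmatches`, `rle`, `split`) in place of `stringr`, and `get_id`, `ids`, `group` and `merged` are hypothetical names introduced here.

```r
# Example data from the question
first_vec  <- c("Page 1 of 1000", "Report of vectors within a list",
                "id: 1234 height: 164 cms", "health: good")
second_vec <- c("Page 2 of 1000", "Report of vectors within a list",
                "id: 1235 height: 180 cms", "health: moderate")
third_vec  <- c("Page 3 of 1000", "Report of vectors within a list",
                "id: 1235 weight: 200 pounds", "health: moderate")
x <- list(first_vec, second_vec, third_vec)

# Hypothetical helper: first "id: NNNN" match in a vector, or NA if there is none
get_id <- function(v) {
  hits <- regmatches(v, regexpr("id: [0-9]+", v))
  if (length(hits) == 0) NA_character_ else hits[1]
}

# One id string per vector in the list
ids <- vapply(x, get_id, character(1))

# Run-length encode the ids so that only adjacent equal ids share a group number
runs  <- rle(ids)
group <- rep(seq_along(runs$lengths), runs$lengths)

# Concatenate the vectors inside each group; unname() drops the group labels
merged <- unname(lapply(split(x, group), unlist))
```

With the three example vectors, `merged` has two elements: page 1 on its own, and pages 2 and 3 concatenated. Since `rle()` starts a new run at every `NA`, a vector without an id would stay unmerged under this approach.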