Selecting and matching multiple vectors in a list in R

Question

I have a list of vectors like this:

> list
[[1]]
[1] "a" "m" "l" "s" "t" "o"

[[2]]
[1] "m" "y" "o" "t" "e"

[[3]]
[1] "n" "a" "s"

[[4]]
[1] "b" "u" "z" "u" "l" "a"

[[5]]
[1] "c" "m" "u" "s" "r" "i" "x" "t"

1. First, I want to select the vector in the list with the most elements (in this case the 5th vector, with 8 elements). This is easy.

2. Second, I want to select all vectors in the list whose length is equal to, or immediately below, that of the previous vector, and intersect them with the previous vector.

Another possibility is to select by first character. In this case that would be equivalent to selecting the vectors starting with "a" or "b", the first and fourth in the list. What I do not know is how to select multiple vectors in a list given their first element.

3. Finally, I want to keep just the intersection with the minimum number of matches.

In this case that is the fourth vector in the list, the one starting with "b". Then the process starts again for the rest of the vectors, but with the 4th and 5th vectors already taken into account when intersecting. Here that would mean picking the second vector and intersecting it with the unique() combination of the 4th and 5th.

I hope I have explained myself! Is there a way to do this in R without 3-4 "for" loops and "if" statements? In other words, is there a clever way to do it using lapply or similar?

Please make it easier for people to help you by `dput`ing your input list and the desired result. See [**here**](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). Please show us the code you've tried and why it didn’t meet your needs. Sharing your attempts helps everyone. It demonstrates that you’ve taken the time to try to help yourself, and it saves us from reiterating obvious answers, and it helps you get a more specific and relevant answer. Cheers. — Henrik, Apr 08 '14 at 08:55
Sorry for the vague explanation. For me it is enough if the output is a list with the first character of each selected vector. In this case, it would be "c" "b" "m" "n". Once I have the entire loop set up I will upload it here. Sorry again. — Javier, Apr 08 '14 at 09:06
Anyway, I have solved it now; perhaps I jumped in here too quickly :) — Javier, Apr 08 '14 at 11:14

score 0 · Accepted Answer · answered Apr 08 '14 at 09:04

This should do it?

# build the example list; strsplit() needs a character vector, not a list
# (named vecs so that base::list is not masked)
vecs <- strsplit(c("amlsto", "myote", "nas", "buzula", "cmusrixt"), "")
# find the longest vector (step 1 of the question)
lens <- sapply(vecs, length)
which.max(lens)
# which are the same length as, or 1 shorter than, the preceding vector in the list
prev <- c(-1, head(lens, -1))
inds <- which(lens == prev | lens == prev - 1)
# get the intersections of those vectors with their predecessors
inters <- mapply(intersect, vecs[inds], vecs[inds - 1], SIMPLIFY = FALSE)
# get items whose first element is in the target set
target <- c("a", "b")
isTarget <- sapply(vecs, "[[", 1) %in% target

# minimum number of overlaps
which.min(sapply(inters, length))
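
For what it's worth, here is a minimal sketch of one possible reading of steps 1 to 3, since the question leaves some room for interpretation. The assumption (mine, not confirmed by the asker) is that you walk the distinct vector lengths from longest to shortest and, at each length, keep the one vector that shares the fewest elements with everything selected so far. The names `vecs`, `first_chars`, `starts_ab` and `pick_by_length` are made up for illustration.

```r
# Example data from the question
vecs <- list(
  c("a", "m", "l", "s", "t", "o"),
  c("m", "y", "o", "t", "e"),
  c("n", "a", "s"),
  c("b", "u", "z", "u", "l", "a"),
  c("c", "m", "u", "s", "r", "i", "x", "t")
)

# Selecting several vectors by their first element (the question's alternative)
first_chars <- vapply(vecs, function(v) v[1], character(1))
starts_ab <- which(first_chars %in% c("a", "b"))  # indices 1 and 4
vecs[starts_ab]

# Hypothetical helper, assuming the reading described above:
# one pick per distinct length, longest first, minimal overlap with the pool.
pick_by_length <- function(vecs) {
  lens <- lengths(vecs)
  selected <- integer(0)   # indices of the vectors picked so far
  pool <- character(0)     # union of the elements of the picked vectors
  for (len in sort(unique(lens), decreasing = TRUE)) {
    cand <- which(lens == len)
    # number of distinct elements each candidate shares with the pool
    overlap <- vapply(cand,
                      function(i) length(intersect(vecs[[i]], pool)),
                      integer(1))
    best <- cand[which.min(overlap)]  # ties go to the earliest candidate
    selected <- c(selected, best)
    pool <- union(pool, vecs[[best]])
  }
  selected
}

picked <- pick_by_length(vecs)
first_chars[picked]
```

On the example data this picks vectors 5, 4, 2 and 3, whose first characters are "c" "b" "m" "n", the result mentioned in the comments. A single `for` loop remains because each pick depends on the pool built by the earlier ones; the inner work is done with `vapply`.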
