13

For example, how do I get a vector of every person's age from the list people below?

> people = vector("list", 5)
> people[[1]] = c(name="Paul", age=23)
> people[[2]] = c(name="Peter", age=35)
> people[[3]] = c(name="Sam", age=20)
> people[[4]] = c(name="Lyle", age=31)
> people[[5]] = c(name="Fred", age=26)
> ages = ???
> ages
[1] 23 35 20 31 26

Is there an equivalent of a Python list comprehension or something to the same effect?

Brian Tompsett - 汤莱恩
c00kiemonster

5 Answers

24

You can use sapply:

> sapply(people, function(x){as.numeric(x[2])})
[1] 23 35 20 31 26
tflutre
8

Given the data structure you provided, I would use sapply:

> sapply(people, function(x) x[2])
 age  age  age  age  age 
"23" "35" "20" "31" "26" 

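However, you'll notice that the results are character data. That is because c() coerces each name/age pair to a single type, so every element of people is a character vector: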

> class(people[[1]])
[1] "character"

One approach would be to coerce with as.numeric() or as.integer() inside the call to sapply.
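As a rough sketch of that coercion, assuming the people list from the question: the version below uses vapply rather than sapply (my substitution, not something this answer proposed) so that R checks that every element yields exactly one number. Indexing by the name "age" instead of position 2 is also an assumption about what is convenient here.

```r
# The list from the question; each element is a named character vector.
people <- list(
  c(name = "Paul", age = 23),
  c(name = "Peter", age = 35),
  c(name = "Sam", age = 20),
  c(name = "Lyle", age = 31),
  c(name = "Fred", age = 26)
)

# Convert inside the applied function. vapply() fails loudly if any
# element does not produce a single numeric value.
ages <- vapply(people, function(x) as.numeric(x[["age"]]), numeric(1))
ages
```

Using `[[` rather than `[` drops the "age" name from each value, so the result is a plain numeric vector.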

Alternatively, if you have flexibility over how you store the data in the first place, it may make sense to store it as a list of data.frames:

people = vector("list", 5)
people[[1]] = data.frame(name="Paul", age=23)
people[[2]] = data.frame(name="Peter", age=35)
...
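With that layout the ages stay numeric, so no coercion is needed when extracting them. A minimal sketch, assuming all five records from the question are stored this way (the name people_df is hypothetical, chosen so it does not clash with people above):

```r
# Hypothetical list of one-row data frames built from the question's data.
people_df <- list(
  data.frame(name = "Paul", age = 23),
  data.frame(name = "Peter", age = 35),
  data.frame(name = "Sam", age = 20),
  data.frame(name = "Lyle", age = 31),
  data.frame(name = "Fred", age = 26)
)

# Each data frame keeps age as a number, so `$age` is already numeric.
ages <- vapply(people_df, function(p) p$age, numeric(1))
ages
```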

If you are going to go that far, you may also want to consider a single data.frame for all of your data:

people2 <- data.frame(name = c("Paul", "Peter", "Sam", "Lyle", "Fred")
                      , age = c(23,35,20,31, 26))
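With a single data frame, the vector of ages is just a column, and no apply call is needed. A small sketch using the same people2 definition; the mean is only there to illustrate that the column is numeric:

```r
people2 <- data.frame(
  name = c("Paul", "Peter", "Sam", "Lyle", "Fred"),
  age = c(23, 35, 20, 31, 26)
)

# The age column is already a numeric vector.
ages <- people2$age
ages

# Vectorised operations work directly on the column.
mean_age <- mean(people2$age)
```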

There may be some other reason why you didn't consider this approach the first time around though...

Chase
  • 1
    Yes in this simple example a data frame is probably better. But the 'real' example is much more complicated so a list is a better fit. Thanks anyway – c00kiemonster Aug 02 '11 at 04:13
  • @c00kie - that's what I figured, but sometimes it's easy to overlook the seemingly obvious :) – Chase Aug 02 '11 at 04:16
  • 7
    +1 For the advice on R-ish, rather than Pythonic, data structures. Also, just because I think it's nifty: `sapply(people, "[", 2)`. – joran Aug 02 '11 at 04:19
  • @joran, what does the "[" do in the sapply call? I've seen that notation before, but I can't really remember where... – c00kiemonster Aug 02 '11 at 04:33
  • @C00kie - `[` is used as an indexing tool. The help page is pretty useful and has some good info in it `?'['`. `[[` and `$` are discussed there as well. – Chase Aug 02 '11 at 04:39
  • So `[` is a function; type `?"["` for more. So you can apply the 'extract element function' to each element of `people`. `2` is passed as an additional argument to `[` via `...` in `sapply`. – joran Aug 02 '11 at 04:40
  • +1 to @joran's method. Another option is `sapply(people,'[','age')`. – nullglob Aug 02 '11 at 05:58
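To make the comments above concrete, here is a small sketch of `[` and `[[` used as ordinary functions, assuming the people list from the question. The final as.numeric step is my addition, since both calls return character data.

```r
people <- list(
  c(name = "Paul", age = 23),
  c(name = "Peter", age = 35),
  c(name = "Sam", age = 20),
  c(name = "Lyle", age = 31),
  c(name = "Fred", age = 26)
)

# `[` is a function: x[2] is the same as calling `[`(x, 2).
x <- people[[1]]
same <- identical(x[2], `[`(x, 2))

# sapply passes the extra argument (2, or "age") on to the function.
by_position <- sapply(people, "[", 2)      # keeps the "age" names
by_name     <- sapply(people, "[[", "age") # `[[` drops the names

# Both results are character vectors, so convert to get numbers.
ages <- as.numeric(by_name)
ages
```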
2
# seq_along() is safer than 1:length(), which misbehaves on an empty list
ages <- sapply(seq_along(people), function(i) as.numeric(people[[i]][[2]]))
ages

Output:

[1] 23 35 20 31 26

PeterVermont
1

Though this question is pretty old, I'd like to share my approach. It is certainly possible with sapply, as tflutre suggested, but I find the unlist function more intuitive:

> ages <- unlist(people, use.names = FALSE)[seq(2, 2 * length(people), 2)]
> ages
[1] "23" "35" "20" "31" "26"

Note the multiplication by two in 2 * length(people): each element of the people list stores two values. This can be made more generic by writing length(people[[1]]) * length(people).

Here unlist(people, use.names = FALSE) yields

[1] "Paul"  "23"    "Peter" "35"    "Sam"   "20"    "Lyle"  "31"    "Fred" 
[10] "26" 

and we take every other element of that using the seq command.
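The more generic form mentioned above might look like the following sketch. It assumes every element of people has the same length and field order (the helper name n_fields is hypothetical), and it adds an as.numeric call, which the original snippet did not have, to turn the character result into numbers.

```r
people <- list(
  c(name = "Paul", age = 23),
  c(name = "Peter", age = 35),
  c(name = "Sam", age = 20),
  c(name = "Lyle", age = 31),
  c(name = "Fred", age = 26)
)

# Number of values per record (2 here: name and age).
n_fields <- length(people[[1]])

# Flatten to one character vector: name, age, name, age, ...
flat <- unlist(people, use.names = FALSE)

# Pick the 2nd value of each record, then convert to numeric.
idx <- seq(2, n_fields * length(people), by = n_fields)
ages <- as.numeric(flat[idx])
ages
```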

CermakM
0

As an alternative to the apply family, there's Hadley Wickham's purrr package, which offers the map_* functions for this kind of job.

(There are a few differences from the apply family, discussed for example here.)

OP's example:

people = vector("list", 5)
people[[1]] = c(name="Paul", age=23)
people[[2]] = c(name="Peter", age=35)
people[[3]] = c(name="Sam", age=20)
people[[4]] = c(name="Lyle", age=31)
people[[5]] = c(name="Fred", age=26)

The sapply approach:

ages_sapply <- sapply(people, function(x){as.numeric(x[2])})
print(ages_sapply)
[1] 23 35 20 31 26

And the map approach:

ages_map <- purrr::map_dbl(people, function(x){as.numeric(x[2])})
print(ages_map)
[1] 23 35 20 31 26

Of course they are identical:

identical(ages_sapply, ages_map)
[1] TRUE
symbolrush