I want to draw a random sample from all the values of one column of my data frame, separately for each group defined by another column. To do so, I use tapply.

ex <- data.frame(
  loc = c("1", "1", "2", "2", "2", "3", "3"),
  sp = c("a", "b", "b", "c", "d", "a", "d"))
ex

all_sp <- unique(ex[, "sp"])
all_sp <- data.frame(all_sp)

ex$sp_random <- ""

sp_rand <- tapply(ex$sp_random, ex$loc, function(x)
  base::sample(all_sp$all_sp, size = length(x), replace = FALSE, prob = NULL))

Now I would like to put the sp_rand list back into the original ex data frame, but I don't know how to do it properly.

The only way I found is to sort ex by loc first, like this:

ex <- ex[order(ex$loc), ]
ex$sp_random <- as.character(unlist(sp_rand))
ex

However, order is quite slow with big data frames.

P. Denelle
  • What exactly is your question/problem? – Heroka Dec 16 '15 at 13:40
  • I would like to have the `sp_rand` result in my original data frame `ex`. But `tapply` gave me a list, and I can't find an efficient way to put it back in the original data frame. – P. Denelle Dec 16 '15 at 13:45

3 Answers

If I understand your question, you can do this with dplyr:

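# Note: levels(sp) assumes sp is a factor (the data.frame() default before R 4.0.0).
# With a character column, sample from unique(ex$sp) instead.
library(dplyr)
ex %>%
  group_by(loc) %>%
  mutate(sp_random = sample(levels(sp), n()))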
user2802241

We can use data.table. We convert the 'data.frame' to a 'data.table' by reference (setDT(ex)); then, grouped by 'loc', we draw a sample of size .N from levels(sp) and assign (:=) it to 'sp_random'.

# Note: levels(sp) assumes sp is a factor (the data.frame() default before R 4.0.0).
library(data.table)
setDT(ex)[, sp_random := sample(levels(sp), .N), by = loc]
akrun

I may not have understood your problem, but why not simply do this:

ex <- data.frame(loc = c("1", "1", "2", "2", "2", "3", "3"),
    sp = c("a", "b", "b", "c", "d", "a", "d"))

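spz <- unique(ex$sp)
# unlist(tapply(...)) returns the values in group order,
# so this assignment assumes ex is already sorted by loc
ex$sp_random <- unlist(tapply(ex$sp, ex$loc, function(x) sample(spz, length(x))))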
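
A possible base R alternative that avoids sorting altogether is ave(), which applies a function per group and writes the results back to the original row positions. This is only a minimal sketch, not part of the code above: the stringsAsFactors = FALSE argument and the set.seed() call are assumptions added so the example behaves the same across R versions, and it assumes no group has more rows than there are distinct species (otherwise sample() without replacement fails).

```r
# Same example data as in the question, with rows left in their original order.
# stringsAsFactors = FALSE is an assumption: it keeps sp as character in any R version.
ex <- data.frame(
  loc = c("1", "1", "2", "2", "2", "3", "3"),
  sp  = c("a", "b", "b", "c", "d", "a", "d"),
  stringsAsFactors = FALSE
)

# Pool of all species to draw from
all_sp <- unique(ex$sp)

# Fixed seed only so repeated runs give the same draw (illustrative choice)
set.seed(1)

# ave() splits ex$sp by ex$loc, applies FUN to each group,
# and puts each result back at the rows the group came from
ex$sp_random <- ave(ex$sp, ex$loc,
                    FUN = function(x) sample(all_sp, size = length(x)))
ex
```

Because ave() keeps the row order, this should also work when ex is not sorted by loc.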
Vongo