Sampling elements in data frame

Question

I'm trying to resample the rows of a data frame. I'm open to using other data structures if recommended, but my understanding is that a data frame is the better choice for combining strings, numbers, etc.

Let's say my input is this data frame:

16  x  y  z  2
11  a  b  c  1
.........

And I'd like to build another data structure as output (I assume another data frame) like this:

16  x   y   z
16  x   y   z
11  a   b   c  
.........

I guess my main issue is how to append the content, which is in columns df[,1:4].

Thanks in advance, p.

David Arenburg · Accepted Answer · 2014-10-31T13:12:51.240

3

It's unclear from your description, but your desired output implies that you want to repeat each row of columns 1:4 the number of times given in column 5. This should do the job:

df[rep(seq_len(nrow(df)), df[, 5]), -5]
#     V1 V2 V3 V4
# 1   16  x  y  z
# 1.1 16  x  y  z
# 2   11  a  b  c
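
To make the indexing step concrete, here is a minimal sketch that rebuilds the two example rows and shows the row index that rep() produces before it is used for subsetting. The column names V1 to V5 are an assumption, chosen to match the output above.

```r
# Rebuild the example input; the column names V1..V5 are assumed
df <- data.frame(
  V1 = c(16, 11),
  V2 = c("x", "a"),
  V3 = c("y", "b"),
  V4 = c("z", "c"),
  V5 = c(2, 1),
  stringsAsFactors = FALSE
)

# Row i appears df[i, 5] times in the index
idx <- rep(seq_len(nrow(df)), df[, 5])
idx
# [1] 1 1 2

# Subset rows by the repeated index and drop the count column
expanded <- df[idx, -5]
```

The repeated rows get row names such as 1.1; if those are unwanted, rownames(expanded) <- NULL resets them to 1, 2, 3.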

edited Oct 31 '14 at 13:12

answered Oct 31 '14 at 12:08

David Arenburg


Yes, that clever combination did the trick. It could also be done in a less elegant way with a for loop. Thank you – user3310782 Oct 31 '14 at 15:02

score 2 · Answer 2 · answered Oct 31 '14 at 17:05

Assuming you're starting with something like:

mydf
#   V1 V2 V3 V4 V5
# 1 16  x  y  z  2
# 2 11  a  b  c  1

Then, you can just use expandRows from my "splitstackshape" package, like this:

library(splitstackshape)
expandRows(mydf, count = "V5")
#     V1 V2 V3 V4
# 1   16  x  y  z
# 1.1 16  x  y  z
# 2   11  a  b  c

By default, the function assumes that you are expanding your dataset based on an existing column, but you can just as easily add a numeric vector as the count argument, and set count.is.col = FALSE.
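
As a rough sketch of that second form, the snippet below passes the counts as a separate vector instead of a column. The data frame here is a hypothetical copy of the example without V5, and the call is guarded so that nothing runs if the package is not installed.

```r
# Hypothetical copy of the example data WITHOUT a count column
mydf_nocount <- data.frame(
  V1 = c(16, 11),
  V2 = c("x", "a"),
  V3 = c("y", "b"),
  V4 = c("z", "c"),
  stringsAsFactors = FALSE
)

# Skip the example if splitstackshape is not available
if (requireNamespace("splitstackshape", quietly = TRUE)) {
  # Repeat row 1 twice and row 2 once; counts come from the vector,
  # one value per row of the data frame
  expanded_vec <- splitstackshape::expandRows(
    mydf_nocount, count = c(2, 1), count.is.col = FALSE
  )
  print(expanded_vec)
}
```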

answered Oct 31 '14 at 17:05

A5C1D2H2I1M1N2O1R2T1


Whaaat, this is nice – Rich Scriven Oct 31 '14 at 17:06
@RichardScriven, it's essentially David's answer with a few other bells and whistles.... Maybe I should CW this.... – A5C1D2H2I1M1N2O1R2T1 Oct 31 '14 at 17:10
I see that now. Just reading the source – Rich Scriven Oct 31 '14 at 17:11
It just seemed like a common enough thing that people ask for, so I put it in a function :-) – A5C1D2H2I1M1N2O1R2T1 Oct 31 '14 at 17:11

score 0 · Answer 3 · answered Oct 31 '14 at 12:06

If you want to sample n rows with replacement from the data frame df:

df[sample(nrow(df), n, replace=TRUE), ]
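
A self-contained version of the line above might look like the following sketch. The data, the value of n, and the seed are all assumptions made only so that the snippet runs on its own.

```r
# Example data; the column names V1..V4 are assumed
df <- data.frame(
  V1 = c(16, 11),
  V2 = c("x", "a"),
  V3 = c("y", "b"),
  V4 = c("z", "c"),
  stringsAsFactors = FALSE
)

set.seed(42)  # arbitrary seed, only so the draw is repeatable
n <- 5        # hypothetical number of rows to draw

# Draw n row indices with replacement, then subset the rows
resampled <- df[sample(nrow(df), n, replace = TRUE), ]
nrow(resampled)
# [1] 5
```

Note that this draws rows at random, so it does not repeat each row a fixed number of times the way the count column in the question does.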

answered Oct 31 '14 at 12:06

Tim

