
I would like to expand a sample survey to simulate a population. For example, suppose I have the following data sample (very small, just to explain my question):

control weight  sex age  race
      1      2    F   23    W
      2    3.1    M   21    B
      3    5.3    F   19    W

Here, control identifies the people interviewed. I would like to get a data frame in which control 1 (a 23-year-old white female) is repeated 2 times (2 rows). The difficulty arises when I try to repeat control 2 3.1 times and control 3 5.3 times while preserving sex, age and race.

There is the "survey" package, but I don't know if there is some function for this situation.

How can I find a solution for this problem?

JasonMArcher
MAOC

1 Answer


If you need to expand the rows of the dataset based on the value in the 'weight' column, one option is expandRows from splitstackshape. This is similar to df1[rep(1:nrow(df1), df1$weight), ].

 library(splitstackshape)
 expandRows(df1, 'weight')
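
The weights in the question are fractional (3.1, 5.3), and as far as I know rep() simply truncates non-integer counts, so it may help to decide explicitly how to turn them into whole numbers first. The following is only a minimal base R sketch, not part of splitstackshape: the names n_round, expanded_round and expanded_random are made up for illustration, and both rounding rules are assumptions about what "repeating 3.1 times" should mean.

```r
# Sample data from the question
df1 <- data.frame(
  control = 1:3,
  weight  = c(2, 3.1, 5.3),
  sex     = c("F", "M", "F"),
  age     = c(23, 21, 19),
  race    = c("W", "B", "W")
)

# Option 1 (assumption): round each weight to the nearest whole number
n_round <- round(df1$weight)
expanded_round <- df1[rep(seq_len(nrow(df1)), times = n_round), ]

# Option 2 (assumption): keep the integer part, then add one extra copy with
# probability equal to the fractional part, so that the expected number of
# copies of each row equals its weight
set.seed(1)  # only to make the random draw reproducible
base_n <- floor(df1$weight)
extra  <- rbinom(nrow(df1), size = 1, prob = df1$weight - base_n)
expanded_random <- df1[rep(seq_len(nrow(df1)), times = base_n + extra), ]

# Copies per respondent under option 1
table(expanded_round$control)
```

In both cases sex, age and race are carried along unchanged, because whole rows are repeated. Which rounding rule is appropriate depends on what the simulated population will be used for.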
akrun
  • @vasco this is probably the correct answer. The `survey` package is mostly for analyzing microdata that's already in some sort of respondent-level format. – Anthony Damico Jun 24 '15 at 06:31
  • Unfortunately, this approach does not work for panel data, because it does not make sense to replicate an individual five times one year and four times the next. The current treatment of sampling weights for estimation in R remains poor. Not that I can contribute to improving it :( – luchonacho Jul 17 '18 at 16:29