-1

I have a data frame (40 x 3, where the number of rows equals the number of people) and I want to randomly assign each person to one of 10 groups. To do that, I created a new column called "group" and ran:

for (i in 1:dim(data)[1]) {data$group[i] = sample(1:10,1)}

Output:

Gr1  Gr2  Gr3  Gr4  Gr5  Gr6  Gr7  Gr8  Gr9 Gr10
 2    5    8    8    3    3    2    4    3    2

It works, but I would like to have almost the same number of individuals in each group. How can I do that? Thanks.

Frank
PaulaF
  • You could start with a reproducible example (if you want an actual solution, rather than a collection of some vague tips as below). – David Arenburg May 19 '15 at 06:07
  • @DavidArenburg: What. The question is perfectly understandable, and my answer gives an easily implementable recipe to get the rows divided evenly into groups. It's not a "vague tip". The snobbery on this site is really getting hard to stand. – cfh May 19 '15 at 17:08
  • @cfh I understood neither the question nor your answer, nor how to actually implement it. You yourself said that you don't know R, so why would you answer an R question? Anyone can say "do this and that", but this is not how coding works. To solve an actual problem you need actual code. Your answer doesn't even contain pseudocode. [There are certain rules for how to ask a question on SO](http://stackoverflow.com/help/how-to-ask) or [how to make it reproducible in R](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). OP didn't follow them. – David Arenburg May 19 '15 at 18:53
  • @DavidArenburg: I gave an algorithm. Algorithms are independent of the language you implement them in. And in fact, Jeff came up with a simple one-liner to implement my algorithm in R. That's great, we worked together and came up with an elegant solution. We provided value. You only complained about "rules". – cfh May 19 '15 at 20:17
  • @cfh I didn't complain about rules. I wanted to help the OP solve his problem, but first he needs to help himself and provide a reproducible example. I don't know about the value of your or any other answer because I don't know what is being asked. It is also possible that it can be solved much more easily. It is also possible that you too didn't understand the question. You don't know R, so you don't know what is or isn't possible to do, or what is an efficient solution. You probably also don't know what a vectorized language is and how it differs from Java/C, for example. – David Arenburg May 19 '15 at 20:20
  • @DavidArenburg: Suffice it to say that if you didn't understand the question, but several others were perfectly capable of understanding it and providing useful solutions, maybe the problem wasn't with the question. – cfh May 20 '15 at 06:39
  • @cfh Suffice to say that the question ["How to make a great R reproducible example?"](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) was received a "bit" better than this one. And even my initial comment was received much better. Not to mention that no [top](http://stackoverflow.com/tags/r/topusers) [tag:r] users tried to answer this question, which tells you something too. I have no worse a record than you in answering questions, btw, and I believe that when a person seeks help on SO he should at least make a minimal effort to make it reproducible. – David Arenburg May 20 '15 at 08:10

4 Answers

2

Create a vector of the numbers 1..10 and repeat it four times so that you get a vector of length 40. Then shuffle this vector randomly and put it into your group column.

I don't know enough R to put this into code, sorry, but it should be rather easy for someone who knows the language.
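
For reference, a minimal R sketch of the recipe above. The data frame name `data` follows the question; the stand-in columns `id`, `x` and `y` are made up for illustration, and `set.seed` is only there to make the run repeatable.

```r
set.seed(1)  # only to make the illustration repeatable

# Hypothetical stand-in for the 40 x 3 data frame from the question
data <- data.frame(id = 1:40, x = rnorm(40), y = rnorm(40))

# Repeat 1..10 four times (length 40), then shuffle the labels
labels <- rep(1:10, times = 4)
data$group <- sample(labels)

table(data$group)  # every group has exactly 4 members
```

Because the labels are fixed before shuffling, the group sizes are guaranteed rather than left to chance.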

cfh
2

Drawing each group label independently at random will give you uneven group sizes. For example, when picking 10 numbers from 1:10 with replacement, there is a non-negligible probability that you won't choose a single 3.
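
To put a number on that example: the chance that a given label never appears in 10 independent draws from 1:10 is (9/10)^10. The sketch below computes it and, as a rough sanity check, estimates it by simulation; the seed and the number of replications are arbitrary choices.

```r
# Exact probability that a given label (say 3) is never drawn
# in 10 independent draws from 1:10
p_exact <- (9 / 10)^10
p_exact  # about 0.349

# Rough check by simulation (seed and replication count are arbitrary)
set.seed(42)
misses <- replicate(10000, !(3 %in% sample(1:10, 10, replace = TRUE)))
p_sim <- mean(misses)
p_sim    # should land close to p_exact
```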

Rather than assigning the group to the person, you should assign the person to the group. If you want the same number of people in each group, randomly pick four from your list to be in group one, four to be in group two, etc.

Edit: I don't have enough reputation to add a comment to @cfh's post, but in R to do this you can type in group <- sample(rep(1:10,each=4)) and then add it to your data frame. That is the easiest implementation of a solution, I believe.
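
If the number of people is not a multiple of 10, the same idea still gives "almost the same number" per group. This is only a sketch: `n = 43` is a made-up example size, and `rep_len` is used to recycle the labels up to length `n`.

```r
set.seed(2)  # only to make the illustration repeatable
n <- 43      # hypothetical number of people, not a multiple of 10
k <- 10      # number of groups

# Shuffle the labels first so the groups that get an extra person are random,
# recycle them up to length n, then shuffle the assignment itself
group <- sample(rep_len(sample(k), n))

sizes <- table(group)
sizes  # group sizes differ by at most one
```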

Jeff Yontz
2

Simply create a vector that repeats 1 to 10 up to a known length:

groups <- rep(1:10, 4)

Then shuffle it. This can be done with rnorm or any other random number generator: the sort order of the random draws gives an index that you can use to reorder the vector groups. More simply, sample does the shuffle directly:

sample(groups)

Ex. output:

 [1]  7  5  3  7  9  8  9  8  7 10  8 10  5 10  6  5  8  2  4 10  7  5  4  2  3  2  6
[28]  3  1  4  1  2  1  6  1  3  6  9  9  4
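
The random-number route mentioned above can be sketched as follows. The variable names `keys` and `shuffled` are illustrative; the result is equivalent in spirit to `sample(groups)`, just spelled out by hand.

```r
set.seed(3)  # only to make the illustration repeatable
groups <- rep(1:10, 4)

# Draw one random key per element; sorting by the keys gives a random order
keys <- rnorm(length(groups))
shuffled <- groups[order(keys)]

table(shuffled)  # still 4 of each label, only the order has changed
```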
chappers
0

sample is a great solution in this case, but you could also use the complete random assignment function, complete_ra, in the randomizr package (the example below uses N = 30, so each of the 10 groups gets 3):

library(randomizr)
Z <- complete_ra(N = 30, condition_names = paste0("gr", 1:10))

> table(Z)
Z
 gr1  gr2  gr3  gr4  gr5  gr6  gr7  gr8  gr9 gr10 
   3    3    3    3    3    3    3    3    3    3 
Alex Coppock