Gist:

I'm looking for a randomization algorithm that can, for each subject, look at all previous measurements, determine whether the subject falls into the high or low end, and then ensure that approximately half of each of these groups gets assigned to the experimental condition.

Details:

I'm building a research application that has two independent variables. One is an experimental manipulation, which I can assign. We'll call its levels X (experimental) and C (control). The other is a personal characteristic with two categorical types, measured through a scale. We'll call these P1 (type 1) and P2 (type 2).

So it's essentially a 2x2 where I have 4 conditions (P1X, P1C, P2X, P2C). I'm recruiting about 120 subjects so ideally I'd have a distribution of 30 subjects in each condition.

I have three problems.

1) Based on the literature, I'm expecting a natural 50/50 split between P1 and P2 characteristics in my sample. However, I can't be sure, because my population isn't what I'd consider the general population, which is where the split estimate comes from.

2) Simple randomization of the X or C manipulation won't necessarily guarantee an equal distribution. This exacerbates the first problem: if I see, e.g., a 40%/60% split between P1 and P2, the smaller group has 48 of my 120 subjects, and 50% (X or C) of 48 leaves me with 24 people in a cell. If the random assignment puts fewer than 50% of this smaller sub-sample into one condition, it gets worse still. The fear is that I could be left with a cell too small to run my analyses.

3) Another complication is that the categories P1 and P2 are sometimes less definitive and more relative. Usually we'd just split P1 and P2 at the median of the aggregate value of several scale measurements (total range 0-20). However, my sample might be biased toward one end, in which case I'd have to do a relative comparison instead. I don't know in advance where the median will lie, or what the mean or SD will be for my sample. But whatever the median turns out to be, I would then say something like, "these people are more P1 than P2" or vice versa.

What I can do in my experiment:

What I can do is measure each person's P1/P2 score before assigning them to either X or C. I can't pre-test everyone before assignment, so I'll only learn the bigger picture one subject at a time, until the estimate stabilizes with a large enough sample.

Question:

So the question, in short, is: is there a randomization algorithm that can adapt to these unknowns as I learn more from one experiment to the next?

Basically, I want to take the subject's P1/P2 measure, compare it against the entire sample up to that point, and find out whether they fall in the upper half (toward P2) or the lower half (toward P1). Once I've figured this out, I want to assign them to either X or C in a way that does better than simple randomization at giving me an equal number of participants in each condition.

I'm not sure if this is the right place to ask; maybe Cross Validated would be better? Anyway, thanks ahead of time if you have some suggestions.

Extra note:

The application is programmed completely in Javascript.

jmk2142
  • Math.random() is fairly unbiased. You can bias it based on previous results, so that the chance of each side coming up shifts relative to that bias. See if this can be helpful: http://stackoverflow.com/a/29325222/1693593 –  Jun 02 '15 at 22:04
  • I'd try to talk to statisticians, and I'd also run a pilot test beforehand using data generated from random numbers, analyzing it just as you would the real data. The horror story you want to avoid is when you describe what you have done to a statistician and give them your hard-won data, and they say they can't provide a conclusive analysis because you didn't collect the data they needed, or because your assignment of subjects to control or experimental conditions introduced a bias they can't remove. – mcdowella Jun 03 '15 at 04:41

1 Answer

First trick, you'll need a priority queue. Google gives me https://github.com/adamhooper/js-priority-queue for JavaScript. Actually you need two. One to give you the smallest in P2, and the other to give you the largest in P1.
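The two-queue idea can be sketched as follows. This is only a minimal illustration, not a definitive implementation: rather than assuming the API of the linked library, it uses a small hand-rolled binary heap, and the names `Heap`, `addSubject`, `score` and `group` are made up for the example. It also assumes P1 is the lower half and P2 the upper half, as in the question. Note that with integer scores in the 0-20 range, subjects tied at the median can land on either side of the split.

```javascript
// Minimal binary heap. `before(a, b)` is true when a should sit nearer the top.
class Heap {
  constructor(before) { this.items = []; this.before = before; }
  get size() { return this.items.length; }
  peek() { return this.items[0]; }
  push(item) {
    const a = this.items;
    a.push(item);
    let i = a.length - 1;
    while (i > 0) {                       // sift the new item up
      const parent = (i - 1) >> 1;
      if (!this.before(a[i], a[parent])) break;
      [a[i], a[parent]] = [a[parent], a[i]];
      i = parent;
    }
  }
  pop() {
    const a = this.items;
    const top = a[0];
    const last = a.pop();
    if (a.length > 0) {
      a[0] = last;
      let i = 0;
      for (;;) {                          // sift the moved item down
        const l = 2 * i + 1, r = l + 1;
        let best = i;
        if (l < a.length && this.before(a[l], a[best])) best = l;
        if (r < a.length && this.before(a[r], a[best])) best = r;
        if (best === i) break;
        [a[i], a[best]] = [a[best], a[i]];
        i = best;
      }
    }
    return top;
  }
}

// P1 = lower half (largest on top), P2 = upper half (smallest on top).
const p1 = new Heap((a, b) => a.score > b.score);
const p2 = new Heap((a, b) => a.score < b.score);

// Adds a subject, keeps the halves within one of each other in size, and
// returns the subjects whose half changed as a result of rebalancing.
function addSubject(subject) {
  if (p1.size > 0 && subject.score <= p1.peek().score) {
    subject.group = 'P1'; p1.push(subject);
  } else {
    subject.group = 'P2'; p2.push(subject);
  }
  const moved = [];
  if (p1.size > p2.size + 1) {
    const s = p1.pop(); s.group = 'P2'; p2.push(s); moved.push(s);
  } else if (p2.size > p1.size + 1) {
    const s = p2.pop(); s.group = 'P1'; p1.push(s); moved.push(s);
  }
  return moved;
}

// Hypothetical scores arriving one subject at a time.
[10, 4, 15, 7, 12, 3].forEach((score, i) => addSubject({ id: i + 1, score }));
console.log(p1.peek().score, p2.peek().score); // current boundary between the halves
```

Anyone returned in `moved` has changed half after the fact, so any per-group tallies kept elsewhere need to be adjusted for them.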

With that done, you need to keep 4 counters for your 4 groups.

The first person is randomly assigned to control or experimental. The second is assigned to the other condition; then, of the two, the one with the lower score goes in P1 and the other in P2. You then initialize your four counters to 1s and 0s accordingly.

With each subsequent person you encounter, you compare them to the smallest in P2 and the largest in P1, decide which group they go in, and decide whether you need to move someone between the groups to keep them even. If you need to move someone, do so and update the counters. Then assign the new person to treatment or control based on which is less common in the Pi they are in, breaking ties by which is less common in the other Pi, and breaking any further ties randomly.
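The counter-based assignment rule might look roughly like this. It is a sketch under assumptions: the `counts` object layout and the function names `chooseCondition`, `assign` and `moveSubject` are invented for illustration, and the `random` parameter exists only so the tie-break can be swapped out when testing.

```javascript
// Picks X or C for a subject in `group` ('P1' or 'P2'):
// 1. whichever condition is less common in the subject's own group,
// 2. on a tie, whichever is less common in the other group,
// 3. on a further tie, a fair coin flip.
function chooseCondition(counts, group, random = Math.random) {
  const other = group === 'P1' ? 'P2' : 'P1';
  const own = counts[group].X - counts[group].C;
  if (own !== 0) return own < 0 ? 'X' : 'C';
  const cross = counts[other].X - counts[other].C;
  if (cross !== 0) return cross < 0 ? 'X' : 'C';
  return random() < 0.5 ? 'X' : 'C';
}

// Chooses a condition and records it in the counters.
function assign(counts, group, random) {
  const condition = chooseCondition(counts, group, random);
  counts[group][condition] += 1;
  return condition;
}

// Updates the counters when an already-assigned subject crosses the split.
// Their X/C condition stays the same; only their P1/P2 label changes.
function moveSubject(counts, from, to, condition) {
  counts[from][condition] -= 1;
  counts[to][condition] += 1;
}

// Hypothetical usage: four counters, one per cell of the 2x2 design.
const counts = { P1: { X: 0, C: 0 }, P2: { X: 0, C: 0 } };
const first = assign(counts, 'P2');   // all counters tied, so this is a coin flip
const second = assign(counts, 'P2');  // gets whichever condition `first` did not
console.log(first, second, counts);
```

Because the rule is deterministic whenever a group is unbalanced, most assignments are predictable from the counters; only the fully tied cases are random.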

This will not guarantee an even split, but it does make a good faith effort to generate one.

btilly
  • Sorry. It took me a while to actually notice this was answered. In the end I ended up doing a regular random assignment and hoped not to see groups too strangely distant from each other. It worked out but here's your check. I'll keep this answer in mind for next time. :-) Thank you. – jmk2142 Aug 20 '15 at 05:23