
For an experiment, I need to pseudorandomize a vector of 100 trials of stimulus categories: 80% are category A, 10% are B, and 10% are C. Any two B trials must have at least two non-B trials between them, and each C trial must be preceded by two A trials and followed by two A trials.

At first I tried building a script that randomized the vector, "popped" out the trials that were not where they should be, and reinserted them where the vector had a long run of A trials. I'm worried, though, that this is overcomplicated, that it will produce an endless series of unforeseen errors to debug, and that the result will not be random enough.

After that I tried building a script that simply reshuffles the vector until it meets the criteria, which seems to require less code. However, now that I have spent several hours on it, I am wondering whether the criteria are too strict for this to be practical, meaning the script could keep shuffling almost forever before it produces a valid vector.

What do you think is the simplest way to handle this problem? Additionally, which would be the best shuffle function to use, since Shuffle in psychtoolbox seems to not be working correctly?

Jeremy
  • Real numbers? integers? What is "3 spaces"? – Mendi Barel Aug 08 '17 at 09:25
  • What I mean is the index on the vector, so that if vector(8) is stimulus class C, the next index which can be class C would be vector(11). There have to be two A trials in between. I'm updating the original post to reflect this. – Jeremy Aug 08 '17 at 09:35
  • Can you explain what you require by "pseudorandom"? Since the number of trials of each category are different, the random trials too will contain more of "A" than "B" and "C". Is that okay? – crazyGamer Aug 08 '17 at 09:51
  • Yes that's right, it should contain the same proportion of each of the categories, only randomized. I'm not completely clear on why I was given the term "pseudorandomization" but from my understanding you could just substitute the term randomization. I think it's just saying that any of the functions which don't produce true random numbers (or shuffles) would suffice. – Jeremy Aug 08 '17 at 10:24

1 Answer


The scope of this question goes well beyond language-specific constructs and requires a good understanding of probability and of permutations and combinations.

An approach to solving this question is:

  1. Create blocks of vectors such that each block can be placed anywhere, independently of the others.
  2. Randomly allocate these blocks to get a final random vector satisfying all constraints.

Part 0: Category A

Since category A has no constraints imposed on it, we will go to the next category.

Part 1: Make category C independent

The only constraint on category C is that it must have two A's before and after. Hence, we first create random groups of 5 vectors, of the pattern A A C A A.

At this point, we have an array of A vectors (excluding blocks), blocks of A A C A A vectors, and B vectors.
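The pools described above can be sketched in MATLAB as follows. This is a minimal sketch, assuming the counts from the question (80 A, 10 B, 10 C) and representing each trial as a single character; the names `cBlocks`, `freeA`, and `bPool` are hypothetical, and `X` anticipates the pooled array used in Part 2.

```matlab
% Assumed counts from the question: 100 trials = 80 A, 10 B, 10 C.
nA = 80; nB = 10; nC = 10;

% Wrap every C in its own A A C A A block; each block uses up 4 A trials.
cBlocks = repmat({'AACAA'}, 1, nC);     % 1x10 cell array of blocks

% A trials that remain after the C blocks have been built.
nFreeA = nA - 4 * nC;                   % 80 - 40 = 40
freeA  = repmat({'A'}, 1, nFreeA);

% Pool X holds the free A trials and the C blocks; B trials stay separate.
X     = [freeA, cBlocks];
bPool = repmat({'B'}, 1, nB);

% Shuffle the pool so that free As and C blocks appear in random order.
X = X(randperm(numel(X)));
```

A cell array is used so that a five-trial block and a single trial can both be treated as one movable element.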

Part 2: Resolving placement of B

The constraint on B is that two consecutive Bs must have at least 2 non-B vectors between them.

Visualize as follows: Let's pool A and A A C A A in one array, X. Let's place all Bs in a row (suppose there are 3 Bs):
s0 B s1 B s2 B s3
where each s is the number of vectors placed in that gap. Hence, we require that s1 and s2 each be at least 2, and that s0 + s1 + s2 + s3 equal the number of vectors in X.

The task is then to choose random vectors from X and assign them to each s. At the end, we finally have a random vector with all categories shuffled, satisfying the constraints.

P.S. This can be mapped to the classic problem of finding a set of random numbers that add up to a certain sum, with constraints. It is easier to reduce the constrained sum problem to one with no constraints. This can be done as:
s0 B s1 t1 B s2 t2 B s3
where t1 and t2 each take just enough vectors from X to satisfy the constraint on B (two each), and s0 + s1 + s2 + s3 equals the number of vectors in X that are not used in any t.
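One way to draw the gap sizes is sketched below. It is an illustration under stated assumptions, not necessarily the algorithm the answer has in mind: the unconstrained part s is drawn with a stars-and-bars sample, and the counts (50 elements in X, 10 Bs) follow from the question's numbers. All variable names are hypothetical.

```matlab
% Assumed sizes: X has 50 elements (40 free As + 10 C blocks), 10 Bs.
nX = 50; nB = 10; minGap = 2;

nGaps  = nB + 1;                 % s0 ... s10
nT     = minGap * (nB - 1);      % elements reserved for t1 ... t9 (18)
nSpare = nX - nT;                % elements left for the s gaps (32)

% Stars and bars: pick nGaps-1 bar positions among nSpare+nGaps-1 slots.
% The runs of stars between bars are non-negative integers summing to nSpare.
barPos = sort(randperm(nSpare + nGaps - 1, nGaps - 1));
s = diff([0, barPos, nSpare + nGaps]) - 1;

% Add the reserved t elements back onto the interior gaps only.
t = [0, minGap * ones(1, nB - 1), 0];
gaps = s + t;                    % number of X elements to place in each gap
```

Every entry of `gaps` is then a count of elements from X, with the two outer gaps allowed to be empty and every interior gap holding at least two elements.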

Implementation

An implementation in MATLAB could benefit from using cell arrays, and from an algorithm for generating random numbers with a constant sum.

You would also need to maintain separate pools for each category, and keep building blocks and piece them together.
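Putting the pieces together might look like the sketch below. It assumes the counts from the question, stores trials as characters, and uses a stars-and-bars draw for the gap sizes, which is a substitute for whichever constant-sum algorithm is preferred. All variable names are hypothetical.

```matlab
nA = 80; nB = 10; nC = 10; minGap = 2;   % assumed counts from the question

% Pool X: free A trials plus one A A C A A block per C, in random order.
X = [repmat({'A'}, 1, nA - 4 * nC), repmat({'AACAA'}, 1, nC)];
X = X(randperm(numel(X)));

% Gap sizes around the Bs: random non-negative parts (stars and bars),
% plus minGap extra elements in every interior gap.
nGaps  = nB + 1;
nSpare = numel(X) - minGap * (nB - 1);
barPos = sort(randperm(nSpare + nGaps - 1, nGaps - 1));
gaps   = diff([0, barPos, nSpare + nGaps]) - 1;
gaps(2:end-1) = gaps(2:end-1) + minGap;

% Interleave: gap 1, B, gap 2, B, ..., B, gap 11.
parts = cell(1, 2 * nGaps - 1);
next  = 1;
for g = 1:nGaps
    parts{2*g - 1} = [X{next:next + gaps(g) - 1}];  % merge blocks into chars
    next = next + gaps(g);
    if g < nGaps
        parts{2*g} = 'B';
    end
end
trials = [parts{:}];   % 1x100 char vector of 'A', 'B', 'C'
```

Because every element of X contains at least one trial and no B, two elements between consecutive Bs always give at least two non-B trials. The sequence can be mapped to numeric codes afterwards if the experiment script needs them.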

Really, this is not trivial but also not impossible. This is the approach you could try, if you want to step aside from brute-force search like you have tried before.
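Whichever approach is used, a small validity check helps confirm that a generated sequence meets both constraints. The sketch below assumes the sequence is a char row vector of 'A', 'B', and 'C'; `isValid` is a hypothetical helper, not a built-in function.

```matlab
% True when (1) consecutive Bs are at least 3 positions apart, i.e. at
% least two non-B trials lie between them, and (2) every C has two As
% immediately before it and two As immediately after it.
isValid = @(t) all(diff(find(t == 'B')) >= 3) && ...
    numel(regexp(t, '(?<=AA)C(?=AA)')) == sum(t == 'C');

okExample  = isValid('AACAABAAB');   % both constraints hold
badBs      = isValid('AACAABAB');    % Bs only one trial apart
badC       = isValid('ACAAB');       % C has a single A before it
```

The lookbehind and lookahead in the regular expression test the neighbours of each C without consuming them, so the count of matches can be compared directly with the number of Cs.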

crazyGamer
  • @Damon Let me know if you have any doubts. I hope you would be able to try an implementation on your own - and perhaps when I'm a little more free in the future I could write up the code myself. – crazyGamer Aug 08 '17 at 11:39
  • Thank you, this is very helpful. I believe I understand the idea, but I'm a bit caught on some of the terminology. Let's say for simplicity's sake that conditions B and C are the same in that their vectors should look like [A A B A A] and [A A C A A] respectively. Does this mean that I should start with a larger vector of [A A A [A A B A A] [A A B A A] [A A C A A] etc] and randomly combine the elements? – Jeremy Aug 08 '17 at 16:20
  • *Yes, exactly*. So in this case, [A A A [A A B A A] [A A B A A] [A A C A A]] should be randomly shuffled (can use [`randperm`](https://in.mathworks.com/help/matlab/ref/randperm.html) in MATLAB), and then the blocks should be "merged" finally, so you have [A A A A A B A A A A B A A A A C A A]. – crazyGamer Aug 09 '17 at 04:51
  • And before you reach this step, you must first randomly create the [A A B A A] blocks and [A A C A A] blocks, and put them all in a pool as you mentioned above. – crazyGamer Aug 09 '17 at 04:52
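The shuffle-and-merge step discussed in the comments above can be sketched as follows, using the small example pool from the comments. It assumes trials are stored as characters in a cell array; `pool`, `shuffled`, and `trials` are hypothetical names.

```matlab
% Example pool from the comments: three free As, two B blocks, one C block.
pool = {'A', 'A', 'A', 'AABAA', 'AABAA', 'AACAA'};

% Shuffle whole elements so that each block stays intact.
shuffled = pool(randperm(numel(pool)));

% Merge the blocks into one flat sequence of trials.
trials = [shuffled{:}];
```

Since each B and C carries its own pair of As on both sides, any ordering of the elements satisfies the constraints in this simplified variant.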