Shuffle an array of unequally repeating entries, so they do not repeat

Question

I am writing an experiment in Matlab Psychtoolbox and my conditions are stored in an array like this

Cond = ["VM" "VM" "VN" "VS" "AM" "AM" "AN" "AS" "CM" "CM" "CN" "CS"...
"VM" "VM" "VN" "VS" "AM" "AM" "AN" "AS" "CM" "CM" "CN" "CS"];

I now want to shuffle the array in a way that I don't have repeating conditions.

There are a lot of treats regarding this problem e.g.this one, but they always have every condition equally often.

Some suggested a brute force method, shuffling so often until this criteria is reached. But since I have tree of this condition arrays and I have to shuffle them several times per experimental run I don't think that is a good solution.

Hope someone can help

I am not sure you can "conditionally shuffle". If you want random shuffling, just do so until it meets your criteria. Otherwise, find an algorithm that sorts (not shuffles) your data with specific constrains. But if you want randomness, then you need to do the brute force method. (I think) — Ander Biguri, Aug 15 '19 at 12:48
Further, you don't have to brute force it every experiment, just generate a list of all possible combinations which satisfy your "shuffled" condition and randomly choose one of these permutations. — Wolfie, Aug 15 '19 at 14:00
How about randomly shuffling the array, then iterating through it and taking each element that is equal to its predecessor and move it to the end. Then, at the end you might have some elements that don't satisfy your condition. Takes these elements and find a spot for each by going through the array again and finding two consecutive elements that are different to this element, and inserting it in between. — Cris Luengo, Aug 15 '19 at 19:17

gnovice · Accepted Answer · 2019-08-15T20:30:46.987

I devised an algorithm that should do what you're asking for. Given a sequence, it will randomly reorder it such that no value repeats. However, it does appear to have a tendency to create repeated sub-sequences (e.g. ..."A" "B" "C" "A" "B" "C"...). I wrapped it in a function reorder_no_reps:

function seq = reorder_no_reps(seq)

  % Find unique values and counts:
  N = numel(seq);
  [vals, ~, index] = unique(seq(:), 'stable');
  counts = accumarray(index, 1);
  [maxCount, maxIndex] = max(counts);

  % Check the maximum number of occurrences:
  if (2*maxCount-1 > N)
    error('Can''t produce sequence without repeats!');
  end

  % Fill cell array column-wise with permuted and replicated set of values:
  C = cell(maxCount, ceil(N/maxCount));
  if (3*maxCount-1 > N)
    permIndex = [maxIndex(1) ...
                 setdiff(randperm(numel(vals)), maxIndex(1), 'stable')];
  else
    permIndex = randperm(numel(vals));
  end
  C(1:N) = num2cell(repelem(vals(permIndex), counts(permIndex)));

  % Transpose cell array and extract non-empty entries:
  C = C.';
  seq = reshape([C{~cellfun('isempty', C)}], size(seq));

end

A description of the steps for the algorithm:

Find the unique values in the input sequence (vals) and how many times they each appear (counts).
Find the maximum occurrence of a single value (maxCounts) and check if it is too large to make a sequence without repreats.
A random permutation order is applied to both the unique values and their counts. If the maximum occurrence exceeds a given threshold, the corresponding value index is moved to the beginning of the randomized order (the reason why is addressed in the next bullet).
Each unique value is then replicated in place by its number of occurrences in the sequence. A cell array is filled in column-wise order with these replicated values, possibly leaving some cells at the end empty. Since the cell array has a number of rows equal to the maximum occurrence of a value, no single row of the resulting cell array will have a value appearing more than once in it. In addition, the last value in each row should be different than the first value in the subsequent row (ensured by filling first with the most frequently occurring value in certain cases).
Transpose the cell array, then pull all the non-empty cell values in column-wise order and reshape them into the same size as the input sequence. This is the same as pulling values row-wise from the non-transposed cell array, which should give you a sequence without repeated values.

Hey @gnovice, your algorithm works nice. You are right, it creates repeating subsequences. To me it is fine, but for someone with more repetitions this could be problematic. — Pibe_chorro, Aug 19 '19 at 12:43

Shuffle an array of unequally repeating entries, so they do not repeat

1 Answers1