Finding the number of values you need to add/remove is pretty trivial.
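For example: given a vector A with N = length(A) elements, n_val of which equal val, you want to add n_new copies of val to A so that val makes up a desired percentage, DP, of the result (say 30%). So you start with this equation:

$$ DP = \frac{n_{\text{val}} + n_{\text{new}}}{N + n_{\text{new}}} $$

And solve for the number of values to add:

$$ n_{\text{new}} = \frac{n_{\text{val}} - DP \cdot N}{DP - 1} $$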
Once you have your n_new value, you know how many occurrences of val you need to add to your array. You can append them to either end of A (or both) and then reorder the resulting array. For a random order, you can use randperm to generate a random permutation of the indices and index the array with it. See also: MATLAB's Matrix Indexing documentation, specifically accessing multiple elements.
Removing values uses pretty much the same logic. If your n_new value is negative, it means you need to remove abs(n_new) occurrences of val to get your DP.
In MATLAB this gives us something like the following:
% Sample Vector
A = [61 52 67 58 62 69 51 57 66 68 67 55 69 54 57 64 53];
% Criteria
DP = 0.4;
val = 57;
% Find count of val in A
n_val = nnz(A == val); % Ignore floating point issues for brevity
% Find number of new values to add/remove to get to DP
n_new = (n_val - DP*length(A))/(DP - 1);
n_new = fix(n_new); % Need a whole number of values; fix rounds toward zero
if n_new > 0
    % Need to add values
    % Create new vector, append appropriate number of values
    B = horzcat(A, repmat(val, 1, n_new));
    % Randomly shuffle
    newidx = randperm(length(B)); % Generate a random permutation of our indices
    B = B(newidx);
elseif n_new < 0
    % Need to remove values
    B = A; % Copy vector
    val_idx = find(B == val); % Ignore floating point issues for brevity
    remidx = val_idx(randperm(length(val_idx), abs(n_new))); % Pick abs(n_new) random indices of val
    B(remidx) = []; % Delete values
else
    B = A; % Already at DP (after rounding), nothing to add or remove
end
% Test
p = nnz(B == val)/length(B);
Which gives us the following:
B =
57 51 52 57 57 69 57 57 55 67 53 57 64 69 57 57 54 57 61 58 57 66 67 68 62
p =
0.4000
And to test removal:
% Sample Vector
A = [57 51 52 57 57 69 57 57 55 67 53 57 64 69 57 57 54 57 61 58 57 66 67 68 62];
% Criteria
DP = 0.10;
val = 57;
And we get:
B =
57 51 52 69 57 55 67 53 64 69 54 61 58 66 67 68 62
p =
0.1176
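As a sanity check, both runs can be worked by hand. This is only a sketch of the arithmetic, writing N for length(A) (a shorthand used here) and n_val for the count of val in A, with fix truncating toward zero as in the code:

```latex
\begin{aligned}
\text{Adding:}\quad & N = 17,\; n_{\text{val}} = 2,\; DP = 0.4 \\
& n_{\text{new}} = \frac{2 - 0.4 \cdot 17}{0.4 - 1} = \frac{-4.8}{-0.6} = 8,
  \qquad p = \frac{2 + 8}{17 + 8} = \frac{10}{25} = 0.4 \\[4pt]
\text{Removing:}\quad & N = 25,\; n_{\text{val}} = 10,\; DP = 0.1 \\
& n_{\text{new}} = \frac{10 - 0.1 \cdot 25}{0.1 - 1} = \frac{7.5}{-0.9} \approx -8.33
  \;\xrightarrow{\text{fix}}\; -8,
  \qquad p = \frac{10 - 8}{25 - 8} = \frac{2}{17} \approx 0.1176
\end{aligned}
```

The removal case lands on 0.1176 rather than exactly 0.10 because n_new has to be a whole number. Removing 9 values instead would give 1/16 = 0.0625, so truncating to 8 is the closer of the two options here.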
I'll also add the obligatory caveat about comparing two floats for equality if you are not working with MATLAB's integer data types. In the == comparisons against val you will want to incorporate a tolerance to account for floating point issues. For more information see: What Every Computer Scientist Should Know About Floating-Point Arithmetic and the more MATLAB-specific Why is 24.0000 not equal to 24.0000 in MATLAB?
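A minimal sketch of what a tolerance-based comparison might look like. The data and the tolerance tol are made up for illustration; a suitable tolerance depends on the scale of your own values:

```matlab
% Hypothetical data: several values that are "almost" 0.3
A = [0.1+0.2, 0.3, 0.5, 0.3+1e-12];
val = 0.3;
tol = 1e-9; % Assumed tolerance, pick one suited to your data

% Exact comparison only matches the element stored as exactly 0.3
n_exact = nnz(A == val);

% Tolerance comparison treats anything within tol of val as a match
is_val = abs(A - val) < tol;
n_val = nnz(is_val);

% Indices of the matches, usable in place of find(A == val)
val_idx = find(is_val);
```

Here n_exact is 1 while n_val is 3, since 0.1+0.2 and 0.3+1e-12 are both within tol of 0.3 but not bit-for-bit equal to it. The same logical mask can replace each A == val or B == val expression in the code above.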