If the number of characters to remove/replace is small compared to the length of the
string, then your solution is good, because the probability of a "collision" in the
while-loop is small. You can improve the method by using a single mutable string instead of
allocating a new string in each step:
NSString *string = @"Remove Some Characters";
int totalRemove = 5;
// Assumes totalRemove does not exceed the number of replaceable characters;
// otherwise the do-while loop below never terminates.

NSMutableString *result = [string mutableCopy];
for (int j = 0; j < totalRemove; j++) {
    NSUInteger replaceLocation;
    unichar ch;
    do {
        // Pick a random position; retry if it is a space or was already replaced.
        replaceLocation = arc4random_uniform((uint32_t)[result length]);
        ch = [result characterAtIndex:replaceLocation];
    } while (ch == '_' || ch == ' ');
    [result replaceCharactersInRange:NSMakeRange(replaceLocation, 1) withString:@"_"];
}
If the number of characters to remove/replace is about the same magnitude as the
length of the string, then a different algorithm might be better.
The following code uses the ideas from "Unique random numbers in an integer array in the C programming language" to replace characters
at random positions with a single loop over all characters of the string.
An additional (first) pass is necessary because of your requirement that space characters
are not replaced.
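The selection rule itself can be sketched in plain C (which also compiles as Objective-C). This is only an illustration of the idea, not code from the linked answer: the names `uniform_below` and `select_positions` are made up for this sketch, and `rand()` stands in for `arc4random_uniform()` so that it runs on any platform. Each position is selected with probability "still needed / still remaining", which yields exactly `k` selected positions after one pass.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: uniform integer in [0, n), n > 0.
   Uses rand() with rejection to avoid modulo bias; on Apple platforms
   arc4random_uniform(n) could be used instead. */
unsigned uniform_below(unsigned n)
{
    assert(n > 0 && n <= (unsigned)RAND_MAX);
    unsigned range = (unsigned)RAND_MAX + 1u;
    unsigned limit = range - (range % n);   /* largest multiple of n <= range */
    unsigned x;
    do {
        x = (unsigned)rand();
    } while (x >= limit);
    return x % n;
}

/* Hypothetical helper: mark exactly k of the n positions (selected[i] = 1)
   in a single pass over the array. */
void select_positions(int *selected, int n, int k)
{
    assert(k >= 0 && k <= n);
    int remaining = n;   /* positions not yet visited */
    int needed = k;      /* positions still to be selected */
    for (int i = 0; i < n; i++) {
        selected[i] = 0;
        /* Select position i with probability needed / remaining. */
        if (needed > 0 && uniform_below((unsigned)remaining) < (unsigned)needed) {
            selected[i] = 1;
            needed--;
        }
        remaining--;
    }
}
```

When `needed` equals `remaining`, the test always succeeds, so the loop cannot finish with fewer than `k` selections. The Objective-C code below applies the same rule to the non-space characters of the string, with `c` in the role of `remaining` and `r` in the role of `needed`.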
NSString *string = @"Remove Some Characters";
int totalRemove = 5;

// First pass: Determine number of non-space characters:
__block int count = 0;
[string enumerateSubstringsInRange:NSMakeRange(0, [string length])
                           options:NSStringEnumerationByComposedCharacterSequences
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    if (![substring isEqualToString:@" "]) {
        count++;
    }
}];

// Second pass: Replace characters at random positions:
__block int c = count;       // Number of remaining non-space characters
__block int r = totalRemove; // Number of remaining characters to replace
NSMutableString *result = [string mutableCopy];
[result enumerateSubstringsInRange:NSMakeRange(0, [result length])
                           options:NSStringEnumerationByComposedCharacterSequences
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    if (![substring isEqualToString:@" "]) {
        // Replace this character with probability r/c:
        if (arc4random_uniform((uint32_t)c) < (uint32_t)r) {
            [result replaceCharactersInRange:substringRange withString:@"_"];
            r--;
            if (r == 0) *stop = YES; // Stop enumeration, nothing more to do.
        }
        c--;
    }
}];
Another advantage of this solution is that it handles surrogate pairs (e.g. emoji) and composed character sequences correctly, even if these are stored as two separate characters in the string.
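To see why this matters: NSString indexes its contents as UTF-16 code units, so a character outside the Basic Multilingual Plane occupies two indices. The following plain C sketch (the function name `utf16_encode` is made up for this illustration) shows the standard UTF-16 encoding step. Replacing a range of length 1 at either of the two indices, as the first approach does, would leave half of such a pair behind.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
   Writes 1 or 2 code units to out and returns how many were written. */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    /* Must be a valid scalar value: in range and not itself a surrogate. */
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;               /* fits in a single code unit */
        return 1;
    }
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high (leading) surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low (trailing) surrogate */
    return 2;
}
```

For example, U+1F600 (a grinning-face emoji) encodes as the two code units 0xD83D 0xDE00, whereas the letter "A" needs only one.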