The goal is to replace multiple (or all) occurrences of a given text in another string using only C strings.
(self-answered question)
This uses fixed-size buffers, so you must make sure they are big enough to hold the string after the replacement is done.
Define the size before use:
#define LINE_LEN 256
This code was tested with MSVC 2019.
void replaceN(char* line,const char* orig,const char* new, int times){
char* buf;
if(times==0) return; //nothing to do
if((times==-1||--times>0) && (buf = strstr(line,orig))!=NULL){ //find orig
for(const char *c=orig;*c;c++) buf++; //advance buf
replaceN(buf,orig,new,times); //repeat until the last occurrence
}
//this will run first for the last match
if((buf = strstr(line,orig))!=NULL){
char tmp[LINE_LEN];
int i = buf-line; //pointer difference
strncpy(tmp,line,i); //copy everything before the match
for(const char *k=orig;*k;k++) buf++; //skip the search string
for(const char *k=new;*k;k++) tmp[i++]=*k; //copy replace chars
for(;*buf;buf++) tmp[i++]=*buf; //copy the rest of the string
tmp[i]='\0';
strcpy(line,tmp);
}
}
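//static inline avoids link errors with C99 inline semantics on compilers other than MSVC
static inline void replace(char* line,const char* orig,const char* new){replaceN(line, orig, new, 1);}
static inline void replaceAll(char* line,const char* orig,const char* new){replaceN(line,orig,new,-1);}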
Turns out I had too much self-esteem. The code was not tested, and I should not have posted it without proper testing. I add this comment to remind others not to make the same mistake. If you find any other errors, please let me know.
In order to keep it simple, I don't do it in place. Instead, it requires a preallocated output buffer. Doing it in place is risky if the new string is longer than the original. There's also an edge case that can be tricky to handle: when the original substring to replace is itself a substring of the new string.
The headers needed to run all this:
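To make those two risks concrete, here is a minimal sketch of what an in-place version would have to take care of. It is not the approach used in this answer; the function name `replaceInPlace` and the parameters `cap` and `repl` are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the answer: replaces every occurrence of
 * orig in buf in place. cap is the total size of buf in bytes.
 * Returns the number of replacements, or (size_t)-1 as soon as the next
 * replacement would not fit (replacements made so far are kept). */
size_t replaceInPlace(char *buf, size_t cap, const char *orig, const char *repl) {
    const size_t origlen = strlen(orig);
    const size_t repllen = strlen(repl);
    size_t len = strlen(buf);
    size_t count = 0;
    if (origlen == 0) return 0;

    char *pos = buf;
    while ((pos = strstr(pos, orig)) != NULL) {
        /* Risk 1: a longer replacement can overflow the buffer. */
        if (repllen > origlen && len + (repllen - origlen) + 1 > cap)
            return (size_t)-1;
        /* Shift the tail, including the terminator, to its new position. */
        memmove(pos + repllen, pos + origlen,
                len - (size_t)(pos - buf) - origlen + 1);
        memcpy(pos, repl, repllen);
        len = len - origlen + repllen;
        /* Risk 2: resume after the inserted text. Searching again from pos
         * would find orig inside repl and never terminate. */
        pos += repllen;
        count++;
    }
    return count;
}
```

With a 32-byte buffer holding "!asdf!asdf!", replacing "asdf" with "asdf#asdf" gives "!asdf#asdf!asdf#asdf!". With an 8-byte buffer the call reports failure instead of writing past the end.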
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stddef.h>
#include <stdint.h>
The main replace function. It replaces at most n occurrences and returns the number of replacements. dest is a buffer big enough to hold the result. All pointers need to be non-NULL and valid. You may notice that I'm using goto, which may be frowned upon, but using it to exit cleanly is very convenient.
size_t replace(char *dest, const char *src, const char *orig,
const char *new, size_t n) {
size_t ret = 0;
// Maybe an unnecessary optimization to avoid multiple calls in
// loop, but it also adds clarity
const size_t newlen = strlen(new);
const size_t origlen = strlen(orig);
if(origlen == 0 || n == 0) goto END; // Edge cases
do {
const char *match = strstr(src, orig);
if(!match) goto END;
// Length of the part of src before first match
const ptrdiff_t offset = match - src;
memcpy(dest, src, offset); // Copy before match
memcpy(dest + offset, new, newlen); // Replace
src += offset + origlen; // Move src past what we have already copied.
dest += offset + newlen; // Advance pointer to dest to the end
ret++;
} while(n > ret);
END:
strcpy(dest, src); // Copy whatever is remaining
return ret;
}
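Since replace trusts the caller to size dest correctly, a bounds-checked variant may be useful when the buffer size is fixed. The following is only a sketch under that assumption: `replaceBounded`, `destsize` and `repl` are hypothetical names, and the loop is restructured without goto.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of replace(), not from the answer: destsize is the
 * capacity of dest in bytes. Returns the number of replacements, or
 * (size_t)-1 if the result (including the terminator) does not fit.
 * On failure the contents of dest are unspecified. */
size_t replaceBounded(char *dest, size_t destsize, const char *src,
                      const char *orig, const char *repl, size_t n) {
    const size_t origlen = strlen(orig);
    const size_t repllen = strlen(repl);
    size_t room = destsize;   /* bytes still free in dest */
    size_t ret = 0;

    while (origlen != 0 && ret < n) {
        const char *match = strstr(src, orig);
        if (!match) break;
        const size_t offset = (size_t)(match - src);
        /* Check each piece separately so the sums cannot overflow. */
        if (offset > room || repllen > room - offset) return (size_t)-1;
        memcpy(dest, src, offset);             /* text before the match */
        memcpy(dest + offset, repl, repllen);  /* the replacement */
        src += offset + origlen;
        dest += offset + repllen;
        room -= offset + repllen;
        ret++;
    }
    const size_t rest = strlen(src);
    if (rest >= room) return (size_t)-1;       /* need rest + 1 bytes */
    memcpy(dest, src, rest + 1);               /* tail and terminator */
    return ret;
}
```

For example, with `char out[10]`, the call `replaceBounded(out, sizeof out, "hihihi!!!", "hi", "of", 2)` returns 2 and leaves "ofofhi!!!" in out, which uses exactly the 10 available bytes.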
It's easy to write a wrapper for the allocation. We borrow and modify some code from "find the count of substring in string":
size_t countOccurrences(const char *str, const char *substr) {
if(strlen(substr) == 0) return 0;
size_t count = 0;
const size_t len = strlen(substr);
while((str = strstr(str, substr))) {
count++;
str+=len; // We're standing at the match, so we need to advance
}
return count;
}
Then some code to calculate buffer size
size_t calculateBufferLength(const char *src, const char *orig,
const char *new, size_t n) {
const size_t origlen = strlen(orig);
const size_t newlen = strlen(new);
const size_t baselen = strlen(src) + 1;
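if(origlen > newlen) return baselen; // Result cannot grow; also avoids unsigned wrap-around below
const size_t count = countOccurrences(src, orig);
n = n < count ? n : count; // Min of n and count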
return baselen +
n * (newlen - origlen);
}
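As a worked example of that formula: "!asdf!asdf!asdf!" is 16 characters, so baselen is 17. Replacing all 3 occurrences of "asdf" (4 characters) with "asdf#asdf" (9 characters) adds 3 * (9 - 4) = 15 bytes, giving 32. The same arithmetic, written as a hypothetical helper `bufferSize` that takes lengths instead of strings:

```c
#include <stddef.h>

/* Hypothetical pure-arithmetic version of the size formula, for illustration.
 * srclen, origlen and repllen are string lengths without the terminator;
 * count is the number of occurrences of orig in src. */
size_t bufferSize(size_t srclen, size_t origlen, size_t repllen,
                  size_t count, size_t n) {
    const size_t baselen = srclen + 1;          /* room for the terminator */
    /* size_t is unsigned, so repllen - origlen would wrap around to a huge
     * value if the replacement is shorter. Return early in that case; the
     * source size is then a safe upper bound. */
    if (origlen > repllen) return baselen;
    const size_t k = n < count ? n : count;     /* replacements actually made */
    return baselen + k * (repllen - origlen);
}
```

Here `bufferSize(16, 4, 9, 3, 3)` yields 32, while a shrinking replacement such as `bufferSize(12, 1, 0, 12, 3)` yields 13, a few bytes more than strictly needed.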
And the final function. It combines allocation and replacement. It returns a pointer to the buffer, or NULL if allocation fails. The caller is responsible for freeing the buffer.
char *replaceAndAllocate(const char *src, const char *orig,
const char *new, size_t n) {
const size_t size = calculateBufferLength(src, orig, new, n);
char *buf = malloc(size);
if(buf) replace(buf, src, orig, new, n);
return buf;
}
And finally, a simple main with a few test cases:
int main(void) {
puts(replaceAndAllocate("hoho", "ha", "he", SIZE_MAX ));
puts(replaceAndAllocate("", "", "", 5));
puts(replaceAndAllocate("", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", "", 5));
puts(replaceAndAllocate("", "", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", 5));
puts(replaceAndAllocate("hihihi!!!", "hi", "of", 2));
puts(replaceAndAllocate("!!!hihihi", "hi", "x", 3));
puts(replaceAndAllocate("asdfasdfasdf", "asdf", "x", 2));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "y", SIZE_MAX ));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "y", 0));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "y", 1));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "", SIZE_MAX ));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "", 3 ));
puts(replaceAndAllocate("!asdf!asdf!asdf!", "asdf", "asdf#asdf", SIZE_MAX));
// Yes, I skipped freeing the buffers to save some space
}
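No warnings with -Wall -Wextra -pedantic, and the output is as follows (the four test cases that produce an empty string each print an empty line, which is not shown here):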
$ ./a.out
hoho
ofofhi!!!
!!!xxx
xxasdf
yyyyyyyyyyyy
xxxxxxxxxxxx
yxxxxxxxxxxx
xxxxxxxxx
!asdf#asdf!asdf#asdf!asdf#asdf!
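The main above skips freeing, and it would also pass NULL to puts if an allocation failed, which is undefined behavior. In real code, a small helper along these lines could handle both. The name `putsAndFree` is made up for this sketch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: prints a heap-allocated result and releases it.
 * Returns 0 on success, -1 if result is NULL (allocation failed). */
int putsAndFree(char *result) {
    if (!result) return -1;   /* never hand NULL to puts */
    puts(result);
    free(result);             /* the buffer came from malloc */
    return 0;
}
```

Each test line would then read, for example, `putsAndFree(replaceAndAllocate("hihihi!!!", "hi", "of", 2));`.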
Note that I don't have any special functions for replacing one and replacing all. If you really want those, just write wrappers with n=1 or n=SIZE_MAX. Using SIZE_MAX is safe, because a string cannot be bigger than that.
Another reason that I got rid of a special function for one replacement is that it was very inefficient. Also, it was easier to write it that way and it is much cleaner.
I changed the code a lot from last time, and that's very much thanks to the awesome help I got at Codereview. You can see how the code was before on the question I posted there: https://codereview.stackexchange.com/q/263785/133688