New to C here. While searching through books online, I found the following algorithm to concatenate strings:

Algorithm: STRING_CONCAT (T, S)
[string S appends at the end of string T]
1. Set I = 0, J = 0
2. Repeat step 3 while T[I] ≠ Null do
3. I = I + 1
[End of loop]
4. Repeat step 5 to 7 while S[J] ≠ Null do
5. T[I] = S[J]
6. I = I + 1
7. J = J + 1
[End of loop]
8. Set T[I] = NULL
9. return

I have tried to implement this with my current working knowledge of C. However, I am unsure how to get the char* pointers to point correctly inside the function. For example:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

const char* stringConcat(char* T, char* S){
    int i = 0;
    int j = 0;
    char* Q;
    while(*S[i] != NULL & *T[i] != NULL){
        i += 1;
    
    while(*S[j] != NULL){
        *T[i] = *S[j];
        i += 1;
        j += 1;
    }
        }
    *T[i] = NULL;
    return *T
}

int main(void){
    char* sentence = "some sentence";
    char* anotherSentence = "another sentence";
    const result;

    result = stringConcat(sentence, anotherSentence);

    return EXIT_SUCCESS;
}

I get a logged error output with the following:

exe_4.c:8:11: error: indirection requires pointer operand ('int' invalid)
    while(*S[i] != NULL & *T[i] != NULL){
          ^~~~~
exe_4.c:8:27: error: indirection requires pointer operand ('int' invalid)
    while(*S[i] != NULL & *T[i] != NULL){

...

...
Dollar X
  • As a rule of thumb: *Never* assign string literals to `char*` pointers, always to `char const*` pointers – while the literals indeed are of type `char[]` they are still *immutable* (not being of type `char const[]` as it is in C++ is due to the fact that the definition arises from times where `const` did not yet exist...). As a consequence, modifying (or trying to) such literals yields *undefined behaviour*! – Aconcagua Dec 02 '22 at 12:11
  • Apart from modifying string literals (assume you had written `char sentence[] = "some sentence";` instead, or had used `malloc`), you'll discover that the array defined is simply too short to append further characters to. Make sure you allocate sufficient memory to hold the *combined* string (e.g. `char sentence[128] = ...`)! – Aconcagua Dec 02 '22 at 12:14
  • Note that `NULL` is a macro typically yielding a null *pointer*, e.g. defined as `(void*)(0)` – you shouldn't use it in a numeric context in C, i.e. simply use `0` here instead (or, if you want to make more explicit that you are dealing with a char, `'\0'`, though in C this is absolutely equivalent: character literals have type `int` anyway, differing from C++, where they have type `char`). – Aconcagua Dec 02 '22 at 12:18
  • And your initial loop is wrong – doesn't reflect the code in your book either: You only check for `T[i] != 0`; Assume you wanted to append `"ss"` to `"ttt"` – *your* loop would only increment `i` twice for the two `'s'` in the string, resulting in the third `'t'` getting overwritten on copying, final result being `"ttss"`. – Aconcagua Dec 02 '22 at 12:21
  • Apart from `*S[i]` etc being nonsense syntax, please check out the linked duplicate. You cannot concatenate string literals in run-time. It's super easy to do at compile-time however, but that's another story. – Lundin Dec 02 '22 at 12:21
  • Side note: Instead of using separate variables `i` and `j` you could, this being C, simply increment the pointers themselves. That results in more compact and actually even more efficient code: `while(*S) { ++S; } while(*T) { *S++ = *T++; } *S = 0;` – and *here* the asterisk *is* correct, while in your code, see previous comment, it is *not*. Be aware that in C `X[Y]` is equivalent to `*(X + Y)`, so the dereferencing is already contained within the index operation. – Aconcagua Dec 02 '22 at 12:22
  • And a funny fact: As `X[Y]` is equivalent to `*(X + Y)` and the addition is commutative, i.e. you can write `*(Y + X)` instead, the initial expression is equivalent to `Y[X]` as well, so you actually could legally and correctly write `7[someArray]` – admitted, looks pretty odd, but it *is* valid ;) Note, though, that this is specific for C (and C++), won't work in other languages with same syntax like Java, C# and others (and not either in C++ on data types with an overloaded index operator – just for completeness). – Aconcagua Dec 02 '22 at 12:28
  • @Aconcagua Many thanks for the detailed explanations, they've improved my script writing a lot! A question on `*T[i] = NULL` suppose its just `*T = 0`, what does this reflect because I cannot see it. Does this set the string to nothing? – Dollar X Dec 03 '22 at 17:36
  • @DollarX Well, it *is* there in my penultimate comment – **but** I discover right now that I accidentally swapped `S` and `T`, appending `T` to `S` instead of inverse. So you need to swap variables back (you might alternatively just swap names in the signature, that's less changes...). `*T = 0` (or `*S = 0` in my variant) appends a terminating null character to the string (from point of view of `T` that would create an empty string, but note that `T` has advanced away from the beginning since long). – Aconcagua Dec 05 '22 at 06:50
  • It doesn't matter, by the way, that you've lost the original beginnings of `S` and `T` *within* the function, as these are only *copies* of the addresses passed to the function (if you pass an array to a function directly it *decays* to a pointer automatically). – Aconcagua Dec 05 '22 at 06:52
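
The comments above note that `X[Y]` is equivalent to `*(X + Y)`, which is also why `*S[i]` fails to compile: `S[i]` is already a `char` (promoted to `int`), so there is nothing left to dereference. A minimal sketch of that equivalence follows; the function name `sameElement` is made up for illustration and does not appear in the original post.

```c
#include <stdbool.h>

/* Hypothetical checker: returns true if the three spellings below
   all read the same element of the string s at index i. */
bool sameElement(const char *s, int i) {
    /* s[i], *(s + i) and i[s] are the same expression in C;
       writing *s[i] instead would apply '*' to a char, which is an error. */
    return s[i] == *(s + i) && s[i] == i[s];
}
```

Calling `sameElement("hello", 1)` compares `'e'` with itself three ways, so it should return `true`.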

1 Answer

To concatenate two strings, code needs a valid place to save the result.

Attempting to write to a string literal is undefined behavior (UB).

//           v------v This points to a string literal, which is not a valid place to save the result.
stringConcat(sentence, anotherSentence);

Instead use a writeable character array.

// Make it big enough
char sentence[100] = "some sentence";
char* anotherSentence = "another sentence";

stringConcat(sentence, anotherSentence);
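
"Big enough" means room for both strings plus the terminating null character. One way to make that requirement explicit is to pass the destination's capacity and refuse to write when the result would not fit. The sketch below is only an illustration under that assumption; `concat_checked` and its `dst_size` parameter are hypothetical names, not part of the original code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): appends src to dst only
   if the result, including the terminating '\0', fits in dst_size bytes.
   Assumes dst already holds a valid null-terminated string.
   Returns 0 on success, -1 if there is not enough room (dst is unchanged). */
int concat_checked(char *dst, size_t dst_size, const char *src) {
    size_t dst_len = strlen(dst);
    size_t src_len = strlen(src);

    if (dst_len + src_len + 1 > dst_size) {
        return -1;  /* would overflow the destination array */
    }
    memcpy(dst + dst_len, src, src_len + 1);  /* copies the '\0' as well */
    return 0;
}
```

With `char sentence[100] = "some sentence";`, a call such as `concat_checked(sentence, sizeof sentence, anotherSentence)` would succeed, whereas the same call on a 14-byte array holding only `"some sentence"` would return -1.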

The concatenation code attempts to dereference a char with *S[i]. This is not possible: S[i] is already a char, not a pointer.

Instead, walk the destination string to its end and then append each character of the source.

const char *stringConcat_alt(char *destination, const char* source) {
  const char *t = destination;  
  // Get to the end of destination
  while (*destination) {
    destination++;
  }

  while (*source) {
    *destination++ = *source++;
  }

  *destination = 0;
  return t;
}
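
For comparison, the index-based algorithm from the question can also be transcribed fairly directly. This is a sketch rather than a replacement for the pointer version above; the name `stringConcatIndexed` is made up here, and the caller is assumed to supply a writable destination that is large enough.

```c
#include <stddef.h>

/* Index-based transcription of the book's STRING_CONCAT(T, S).
   The function name is hypothetical. T must point to a writable array
   with room for both strings plus the terminator. */
char *stringConcatIndexed(char *T, const char *S) {
    size_t i = 0;
    size_t j = 0;

    while (T[i] != '\0') {   /* steps 2-3: find the end of T only */
        i++;
    }
    while (S[j] != '\0') {   /* steps 4-7: copy S behind it */
        T[i] = S[j];         /* T[i] is already a char: no extra '*' */
        i++;
        j++;
    }
    T[i] = '\0';             /* step 8: terminate the result */
    return T;
}
```

Note that the first loop tests only `T[i]`, as the book's step 2 does, so appending `"ss"` to `"ttt"` gives `"tttss"`.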

Do not use NULL for the null character. Use '\0'. NULL is a null pointer constant and may not convert to a char 0.
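
The two can be kept apart by remembering that `NULL` describes a pointer while `'\0'` describes a character. A small sketch of a check that uses each in its proper place follows; `isEmptyString` is a hypothetical helper introduced only for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if s points to an empty string, 0 otherwise.
   NULL is compared against the pointer, '\0' against the character. */
int isEmptyString(const char *s) {
    return s != NULL      /* is there a string at all? (pointer check)  */
        && *s == '\0';    /* is its first character the terminator?     */
}
```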

chux - Reinstate Monica
  • Quick question: Why does `destination` inside the first while-loop not use the `*` pointer, whereas the second one does? – Dollar X Dec 03 '22 at 16:48
  • And what would be the difference in `char const*` and `const char *var`? – Dollar X Dec 03 '22 at 16:50
  • @DollarX The first loop has `*destination` and `destination++`. `*destination` is used. – chux - Reinstate Monica Dec 03 '22 at 17:44
  • @DollarX `char const*` and `const char *` are functionally the same. Difference is _style_. `const char *` is more common. – chux - Reinstate Monica Dec 03 '22 at 17:46
  • @DollarX ... while `char const*` is more consistent – `const` in general refers to what is left from it, except it being the first token in a variable/parameter declaration. Consider `char const* const`, where both times `const` refers to its respective preceding token, so a *const pointer to a const char* (read it from right to left...). `const char* const` mixes both west- and east-consting (how these variants are sometimes referred to humoristically). – Aconcagua Dec 05 '22 at 06:58
  • @DollarX Note, too, that returning `t` actually is not *necessary* – the returned pointer is the same one that was originally passed in, so the caller already knows it anyway. *Still* returning it is a convenience feature that would e.g. allow chaining function calls like `doSomethingWith(stringConcat_alt(x, y));`. I have some doubts about that style, though, as `x` getting modified is less obvious; I personally would avoid it unless *perhaps* being part of a macro. – Aconcagua Dec 05 '22 at 07:07