1

I have a function called encrypt that returns a string called ciphertext. Inside this function, I create a char[] array called cptxt_arr. A loop builds the ciphertext from the plaintext (which I get from the user) and stores each letter in the array. I then assign the array to the string and return the string.

Here's my question: In the main(void) function, why do I get no output on the CLI when I print the result of encrypt as a string (%s), yet I do get the letters when I print it as a char (%c)?

PS: From what I've read, I know I can't return arrays in C, which is why I'm assigning the char[] to a string variable and returning that.

I just want to know why this is happening, because if I print the string ciphertext inside the encrypt function, it works, which suggests that the code itself is fine.

NOTE: For the sake of the question, use positive integers for the "key" variable. I cleaned up the code a bit so that the important part is easier to see.

Here's the code:

#include <cs50.h>
#include <stdio.h>
#include <string.h>

string encrypt(string plaintext, int key);

int main(void)
{
    int key;
    string plaintext;

    // Get the key.
    key = get_int("Key: ");

    // Get plaintext from the user.
    plaintext = get_string("plaintext:  ");

    // Print encrypted text.
    printf("ciphertext: %s\n", encrypt(plaintext, key));

    // Same but with the formatted char (%c).
    // printf("ciphertext: %c\n", encrypt(plaintext, key)[0]);
}

// Encrypt text.
string encrypt(string plaintext, int key)
{
    // Get plaintext length.
    int length = strlen(plaintext);

    char cptxt_arr[length];
    string ciphertext;

    for (int i = 0; i < length; i++)
    {
        if (plaintext[i] >= 'A' && plaintext[i] <= 'Z')
        {
            cptxt_arr[i] = ((plaintext[i]) + key);

        }
        else if (plaintext[i] >= 'a' && plaintext[i] <= 'z')
        {
            cptxt_arr[i] = ((plaintext[i]) + key);
        }
    }

    // Assign cptxt_arr[] to the string and return it.
    ciphertext = cptxt_arr;

    return ciphertext;
}

I have also (I don't know if it matters) created a variable in the main(void) function and assigned the result of the encrypt function to it, which doesn't work either.

Andreas Wenzel
Fito
  • 3
    You're returning the address of a local variable. When the function ends that variable no longer exists and you have a dangling pointer. Using it causes undefined behavior. – Retired Ninja Jul 07 '23 at 22:24
  • 2
    Q: I can't return arrays in C. A: Of *COURSE* you can return arrays in C! You just can't allocate the array inside a function (as a local variable). In C, local variables are *NO LONGER VALID* when the function exits. SOLUTIONS: 1) `malloc()` the array inside the function, or (better!) 2) allocate a char[] buffer *OUTSIDE*, and pass it as an argument *INTO* the function. – paulsm4 Jul 07 '23 at 22:24
  • 1
    @paulsm4 ... no? – Ted Lyngmo Jul 07 '23 at 22:25
  • 1
    For clarity: I questioned the statement when it was: _"Q: I can't return arrays in C. A: Of COURSE you can return arrays"_ - which is ... no, not in C. – Ted Lyngmo Jul 07 '23 at 22:28
  • You can return pointers to arrays, which seems like what @paulsm4 meant from the rest of the comment. – Dave S Jul 07 '23 at 22:31
  • @DaveS Probably not. `Of COURSE you can return arrays in C!` - What that means can be removed if not meant. – Ted Lyngmo Jul 07 '23 at 22:33
  • @Fito Just to clarify something that may not be obvious to you, the `string` type declared by `cs50.h` is just another name for `char *`, which means you *are* returning the address of a local object as stated in the answers. – zwol Jul 07 '23 at 22:56
  • @zwol That's a misunderstanding. The `string` in `cs50` is an automatically freed classic C string. A `string` in cs50 is `atexit` `free`d. So, don't use `string` and `char*` interchangeably. – Ted Lyngmo Jul 07 '23 at 22:58
  • @TedLyngmo That may well be true when you actually use the CS50 API, but *as OP is using it* what I said is true. See https://github.com/cs50/libcs50/blob/6d916ef457528b67bee2ac3e5ea3735acac19669/src/cs50.h#L51 – zwol Jul 07 '23 at 23:03
  • @zwol You misunderstand the intent of the cs50 documentation. Nowhere do they teach you to `free` something returned by `get_string` - which returns `string`. It's because, what you and I know is hidden in implementation details. A `string` _is_ freed as far as me as a user is concerned. All are registered to be freed `atexit`. – Ted Lyngmo Jul 07 '23 at 23:05
  • 2
    ... which is why it's utterly dangerous to use `string` and `char*` interchangeably in cs50. – Ted Lyngmo Jul 07 '23 at 23:11
  • @TedLyngmo You're still missing the point. OP thought that `string ciphertext = cptxt_arr;` would convert a char array into something that could safely be returned. chux's answer presumes that you already know it does something very different. What I wrote was intended to clue OP in to this, and nothing more; in particular I was never talking about the behavior of "proper" CS50 strings because *that's not what OP's code is using*. – zwol Jul 07 '23 at 23:59
  • @zwol If it's me missing the point, I'll leave it here for people to have something to lean on. – Ted Lyngmo Jul 08 '23 at 00:02
  • I have voted to reopen the question because in this case, I don't think it was appropriate to close this CS50 question as a duplicate of a non-CS50 question. Even if the underlying technical issue is the same, the required explanations are very different for people using CS50 than for people not using CS50. – Andreas Wenzel Jul 09 '23 at 23:23

2 Answers

3

The CS50 data type string is nothing more than a reference to an array which contains a sequence of characters. It is not an array itself.

These references are called "pointers" in C and you will learn all about them in week 4 of CS50.

In the line

ciphertext = cptxt_arr;

you are not making a copy of the characters contained in the array cptxt_arr. Instead, you are making ciphertext refer to the array cptxt_arr.

In the line

return ciphertext;

the function encrypt will pass this reference to cptxt_arr to the function main. However, as soon as the function encrypt returns, all of its local variables will cease to exist. This includes the array cptxt_arr. This means that the function main cannot do anything with the returned reference, because the reference is invalid, as it refers to an array that no longer exists.

In week 2 of CS50, you are supposed to solve the exercises by overwriting the array that was created and returned by get_string.
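
As a rough sketch of this week 2 approach, the function below overwrites the characters of the array it is given instead of creating a new one. The name encrypt_in_place and the wrap-around within the alphabet are assumptions made for illustration (the code in the question adds the key without wrapping). It is written with char * because string in cs50.h is a typedef for char *.

```c
#include <stddef.h>

// Hypothetical helper: shifts every letter of s by key positions, in place.
// s must point to a modifiable, null-terminated character array, such as
// the one returned by get_string.
void encrypt_in_place(char *s, int key)
{
    // Reduce the key to the range 0..25 so that the arithmetic below
    // cannot leave the alphabet (this also handles negative keys).
    int shift = ((key % 26) + 26) % 26;

    for (size_t i = 0; s[i] != '\0'; i++)
    {
        if (s[i] >= 'A' && s[i] <= 'Z')
        {
            s[i] = (char) ('A' + (s[i] - 'A' + shift) % 26);
        }
        else if (s[i] >= 'a' && s[i] <= 'z')
        {
            s[i] = (char) ('a' + (s[i] - 'a' + shift) % 26);
        }
        // Any other character is left unchanged.
    }
}
```

In the program from the question, this could be used by calling encrypt_in_place(plaintext, key); and then printing plaintext with %s. No local array is involved, so nothing ceases to exist when the function returns.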

In week 4 of CS50, you will learn how to use the functions strcpy and malloc, with which you can copy character arrays and allocate memory for these copies in such a way that the copies do not cease to exist when the function returns.
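
The following is a minimal sketch of what that week 4 approach might look like, not the course's official solution. The name encrypt_copy and the wrap-around within the alphabet are assumptions for illustration. Memory obtained from malloc stays valid after the function returns, so returning the pointer is safe, but the caller is then responsible for calling free on it.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns a newly allocated, null-terminated copy of
// plaintext with every letter shifted by key, or NULL if allocation fails.
// The caller must free() the returned pointer.
char *encrypt_copy(const char *plaintext, int key)
{
    size_t length = strlen(plaintext);

    // One extra byte for the terminating null character.
    char *ciphertext = malloc(length + 1);
    if (ciphertext == NULL)
    {
        return NULL;
    }

    // Copy everything first, including the terminator, so that
    // non-letter characters are carried over unchanged.
    strcpy(ciphertext, plaintext);

    // Reduce the key to the range 0..25 (also handles negative keys).
    int shift = ((key % 26) + 26) % 26;

    for (size_t i = 0; i < length; i++)
    {
        if (ciphertext[i] >= 'A' && ciphertext[i] <= 'Z')
        {
            ciphertext[i] = (char) ('A' + (ciphertext[i] - 'A' + shift) % 26);
        }
        else if (ciphertext[i] >= 'a' && ciphertext[i] <= 'z')
        {
            ciphertext[i] = (char) ('a' + (ciphertext[i] - 'a' + shift) % 26);
        }
    }

    return ciphertext;
}
```

A caller would store the result in a variable, check it against NULL, print it with %s, and finally pass it to free.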

Andreas Wenzel
  • _"The CS50 data type `string` is nothing more than a reference to an array which contains a sequence of characters."_ - But it is. That's what leads cs50 students astray. It's the `atexit` registration that makes it special. – Ted Lyngmo Jul 07 '23 at 23:49
  • @TedLyngmo: The `atexit` registration that you are referring to only applies to the function `get_string`, but not to the data type `string` in general. For example, when writing `int main( int argc, string argv[] )`, the elements of `argv` do not have the `atexit` properties that you describe, despite them having the data type `string`. – Andreas Wenzel Jul 08 '23 at 00:03
  • Exactly! They should not be confused! `char*` `!=` `string` (typedefined aside) – Ted Lyngmo Jul 08 '23 at 00:05
  • @TedLyngmo: I don't understand what point you are trying to make. Are you claiming that the data type `string` should only be used with `get_string` and not with `argv`? The CS50 course also uses the data type `string` with `argv`. Is it your intent to criticize the CS50 course in this respect? – Andreas Wenzel Jul 08 '23 at 00:11
  • 1
    In so many words, yes. I want to separate the two types `char*` and `string` since, in the world of cs50, they are just as different as `std::string` and `char*` in C++. It's not nice to lead people learning the crazy mid-language "cs50" to believe anything else. A `string` in cs50 is automatically destroyed. That clue could be enough. – Ted Lyngmo Jul 08 '23 at 00:23
  • @TedLyngmo: The CS50 data type `string` only exists so that students can work with `get_string` and `argv` before they learn about pointers in week 4. I think that the CS50 course does this well. A week 2 CS50 student will not care about memory leaks, because at that point, they haven't even learned about pointers yet, so they have not learned about `malloc` yet. However, I do agree with you that from week 4 on, it is not good that the behavior of `get_string` is inconsistent with `malloc`, as the return value of `malloc` must be explicitly `free`d, but not the one from `get_string`. – Andreas Wenzel Jul 09 '23 at 21:10
2

Why my returned value is not printed as a string but it is printed as a char?

The code attempts to return the address of a local object, which leads to undefined behavior (UB).

char cptxt_arr[length];
string ciphertext;
...
ciphertext = cptxt_arr;
return ciphertext; // Bad

Certainly a dupe someplace.

chux - Reinstate Monica
  • _"attempts to return the address"_ - Not only does it attempt to do it, it does. – Ted Lyngmo Jul 07 '23 at 22:38
  • @TedLyngmo Perhaps. In either case a fine point. As I understand, even the attempt to return the value is UB, even if the calling code does not use it. – chux - Reinstate Monica Jul 07 '23 at 22:39
  • It's not a problem if not dereferenced - if I got it right. – Ted Lyngmo Jul 07 '23 at 22:41
  • @TedLyngmo " not a problem if not dereferenced" --> I don't think so. Simply passing around a defunct address is problematic. We could spend time looking it up, yet it is still poor programming. Yet even if UB, I have never seen a system that had much trouble with simply copying such addresses. – chux - Reinstate Monica Jul 07 '23 at 22:43
  • ❤ - Indeed. It's only at the first dereference that the problem occurs - but sure, catching it early is good. – Ted Lyngmo Jul 07 '23 at 22:46
  • 1
    I'm not gonna look at the standard either, but it seems like you could return an uninitialized pointer with an indeterminate value and it only becomes a problem if you try and use it. A pointer to an array that used to exist seems like the same thing to me. I bet we can all agree it's just a bad idea. :) – Retired Ninja Jul 07 '23 at 22:47
  • @RetiredNinja As I understand `a = b`, where `b` is a defunct pointer, or an uninitialized (e.g. random) pointer is UB as copying _pointers_ can involve more than just a binary bit pattern on some sophisticated memory systems. Pointers can have a devilish _under the table_ encoding involving segments, privilege, rights, mapping, usage counts, etc. – chux - Reinstate Monica Jul 07 '23 at 22:51
  • If I remember correctly, there is special language for function returns such that this would definitely not be UB if the returned value were *discarded* (e.g. `(void)encrypt(...)`). Also IIRC, whether it is UB to copy indeterminate values with `=` is ... controversial. – zwol Jul 07 '23 at 22:59
  • Yeah, trap representations and all. Good point. – Retired Ninja Jul 07 '23 at 23:02