I have written a small program to convert vowels in a word to 'leet', for an assignment for the Harvard CS50 course I'm taking online. NB: this is relevant because we are using CS50 headers which give us access to a string data type. I'm not a student at Harvard, just taking the free version through edx.org.

My main function runs an if-else to validate the input, and then calls the converter function if the input is correct. In the else block, a string variable is assigned the value returned from my replace() function, and then the variable is (supposed to be) printed. Main will then return 0 and exit.

Problem: when I run this program, the leeted variable is purged when the printf statement is called in main. When I run our course debugger, I can see that the leeted variable is correctly assigned, but the second the debugger runs the printf line in main, the variable becomes empty. I'm absolutely baffled.

I have added another printf line at the end of the replace() function. When this print statement is commented out, the above problem occurs as described. However, when I activate the print statement at the end of the replace() function, the print statement in main() works.

Can anyone tell me why this is happening?

code:

#include <cs50.h>
#include <stdio.h>
#include <string.h>

// function declarations
bool argcheck (int argc, string argv[]);
string replace(string input);

// #######################################
// MAIN
int main(int argc, string argv[])
{
    if (argcheck(argc, argv) == false)
    {
        return 1;
    }
    else
    {
        string leeted = replace(argv[1]);
        printf("from main(): %s\n", leeted);
    }
    return 0;
}

// #######################################
// FUNCTIONS

// input validation
bool argcheck (int argc, string argv[])
{
    if (argc != 2)
    {
        printf("Usage: ./no-vowels word\n");
        return false;
    }
    return true;
}

// converter implementation
string replace(string input)
{
    char output[strlen(input)];

    int i = 0;
    while (input[i] != '\0')
    {
        switch(input[i])
        {
            case 'a':
                output[i] = (char) '6';
                break;

            case 'e':
                output[i] = (char) '3';
                break;

            case 'i':
                output[i] = (char) '1';
                break;

            case 'o':
                output[i] = (char) '0';
                break;

            default:
                output[i] = input[i];
                break;
        }
        i++;
    }
    string finished = (string) output;
    // printf("from replace(): %s\n", finished);
    return finished;
}
the busybee
skytwosea
  • 3
    You're returning a pointer to a local variable. You can't do that with any defined semantics. Let me see if I can find a dupe. – Carl Norum Jan 17 '23 at 00:28
  • 2
    This is WRONG: `string finished = (string) output...return finished;` Variables declared inside a C function go OUT OF SCOPE when the function exits. If you try using "finished" outside of replace(), it will cause [undefined behavior](https://en.wikipedia.org/wiki/Undefined_behavior). SUGGESTION: declare a buffer for your string *OUTSIDE* of replace(), then pass the buffer as a parameter. – paulsm4 Jan 17 '23 at 00:28
  • And `char output[strlen(input)];` is too short to hold a string copied from `input` because `strlen()` doesn't count the `'\0'` terminator. – Andrew Henle Jan 17 '23 at 00:30
  • 1
    @EricPostpischil I have tried to stay true to exactly what I have observed; I hope you can forgive a novice for using the wrong terminology. When debugging, the local variable leeted goes from having content, to not having content as I step into the printf line, and what I assume is the leeted pointer *leeted also changes to zero. – skytwosea Jan 17 '23 at 00:34
  • @EricPostpischil Did they not do that? "...the second the debugger runs the printf line in main, the variable becomes empty." – John Kugelman Jan 17 '23 at 00:34
  • @paulsm4 Thanks for your suggestion, but I am confused: as I understand it, the return statement should be returning whatever that variable contains, so it shouldn't matter if the 'finished' variable is no longer in scope? The call to the replace() function in main assigns its result to a variable that is in scope in main. THAT variable, here called 'leeted', goes from having the correct value, to being empty when the printf line is run. – skytwosea Jan 17 '23 at 00:40
  • 3
    The problem is `finished` doesn't contain a string. It contains a *pointer* to a string. The string, the array of characters, is stored in `output`. That array is deallocated when the function returns, which means the function is returning an invalid pointer. – John Kugelman Jan 17 '23 at 00:43
  • @CarlNorum I haven't learned about pointers properly yet; I think they are covered in the next lecture, and I am only on chapter 2 of the K&R book (pointers are in ch 5). Regardless, I think you're on to something here - when I run the debugger, as I step into the printf line in main(), the local variable 'leeted' becomes empty, and the (pointer?) marked *leeted goes to zero. – skytwosea Jan 17 '23 at 00:44
  • @JohnKugelman Ok, I'm starting to follow. I noted that someone has closed this question and marked a duplicate so I'll go read that. However, I still can't sort out why this behaviour does not occur when I run the printf statement at the end of the replace() function? Why would using that printf statement keep the pointer allocated? – skytwosea Jan 17 '23 at 00:46
  • 1
    To peek ahead at pointers, `string` is a typedef alias for `char *`, which is a pointer type. You can think of a `string` variable as storing not an array of characters but a reference (or pointer) to one. It's like having an envelope with an address on it. You can hand that envelope to someone else, or you can copy the address onto additional envelopes as many times as you want, as long as the house isn't demolished in the meantime! – John Kugelman Jan 17 '23 at 00:47
  • 1
    The short answer is that undefined behavior is really weird and tricky. Accessing an invalid pointer triggers undefined behavior, and once you do that all kinds of bizarre things will happen. It's really hard to figure out "why" things happen. Strange as it sounds, it's quite common for the presence or absence of benign debugging printouts to alter the behavior of your code. It happens all the time. C's a pretty gnarly language. If you follow tech news you've probably heard that 70% of bugs are caused by memory safety errors. Well, this is one of them. – John Kugelman Jan 17 '23 at 00:53
  • @JohnKugelman: Variables do not “become empty”; there is no “empty“ state in C, except possibly an array used as a string might have a null character at its start, in which case they should state that. But, even so, you cannot see variables or their contents. You see output from the program or the debugger. That is what novices should describe. – Eric Postpischil Jan 17 '23 at 00:58
  • @JohnKugelman Definitely interesting. Thanks for this. I don't understand how to fix my code yet but I've got enough hints & tips here to figure it out. Cheers – skytwosea Jan 17 '23 at 00:59
  • 1
    @EricPostpischil I'm not trying to square up against you, but you are not being helpful. I'm learning, and I don't really know enough yet to ask questions to the standard you're demanding. I've been at this for two weeks. I described exactly what my debugger showed me: under the 'variables' dropdown, the 'leeted' variable changed in front of my eyes from "answer" to "". It changed from text contained in quotation marks, to quotation marks with no content. I think, 'the variable became empty' is a pretty reasonable way to describe this situation at this stage of my education. – skytwosea Jan 17 '23 at 01:06
  • @EricPostpischil You're being incredibly picky and not in a helpful way. If the debugger shows `leeted = "abc"` one minute and `leeted = ""` the next, it's perfectly reasonable to call that "empty" or "purged". If you want to know what the OP means by it then ask them! Don't jump on them for using English words correctly but not to your desired level of precision. – John Kugelman Jan 17 '23 at 01:07
  • Q: `as I understand it, the return statement should be returning whatever that variable contains`. NO! Like John Kugelman said, "finished" doesn't contain a string. It contains a pointer to a string. In C, a "string variable" is actually an ARRAY of characters, ending with a "null byte". The variable "finished" just refers to the LOCATION (the "address") of the array. The data in the array itself is LOST when you exit the function. You simply need to "allocate" the array differently. Look here for more details: https://www.tutorialspoint.com/cprogramming/c_strings.htm – paulsm4 Jan 17 '23 at 01:08
  • Re “I described exactly what my debugger showed me: under the 'variables' dropdown, the 'leeted' variable changed in front of my eyes from "answer" to "".”: That is what you should do to describe program behavior. It is a description of an observation, not a conclusion. Beginners often come to incorrect conclusions because they do not yet have good models of how programs work internally. Describing direct observations avoids making mistakes. – Eric Postpischil Jan 17 '23 at 01:09
  • @JohnKugelman: Re “If the debugger shows leeted = "abc" one minute and leeted = "" the next, it's perfectly reasonable to call that "empty" or "purged".”: It is a reasonable description **if we know what it means**. But if somebody just says something is “empty” or “purged” without saying the debugger showed it as empty, we do not know they mean the debugger display is empty. That needs to be said. It is not picky; it is a method of communicating clearly. – Eric Postpischil Jan 17 '23 at 01:10
  • @paulsm4 Thanks. I'm catching on. As I mentioned in another comment, I just haven't gotten to pointers yet - they are in an upcoming lecture for cs50, and an upcoming chapter in my K&R book. Up to this point, our exercises have only required us to use a void function that prints our answer out directly. So, one step at a time. Thanks for the link, I'm headed there now; I've got enough hints & tips to figure it out from here. – skytwosea Jan 17 '23 at 01:17
  • 1
    It's unfortunate that the code scaffolding you got with this assignment implies that "string" is a "type" (vs. a "char *"). For a newbie, "string" looks like an "object" (like a C++ string). Oh well... SOLUTIONS: 1) allocate a C buffer outside of the function (like I suggested above), 2) use [malloc()](https://man7.org/linux/man-pages/man3/malloc.3.html) (which you probably haven't covered either), or 3) use [static](https://www.geeksforgeeks.org/static-variables-in-c/) (probably NOT a good choice here!) – paulsm4 Jan 17 '23 at 01:34
  • 1
    @paulsm4 yes, I agree, and that's the root of my problem. I learned to script in Python during grad school (not a CS degree, obviously), and I've been having a hell of a time with C so far - I've been looking at things through my mental model of a basic Python script. That is changing, albeit slowly! Supposedly the 'training wheels' (ie. use of cs50's header files incl. the string type) will be taken away in the coming weeks. I kind of wish the course were taught without those training wheels in the first place, but oh well. No biggie, just a bit of broken code while I learn. Thanks again – skytwosea Jan 17 '23 at 01:53
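
For anyone landing here later, the first two options paulsm4 lists (a buffer owned by the caller, or malloc()) can be sketched roughly as below. This is a minimal sketch, not the course's reference solution: the names `leet_char`, `replace_into`, and `replace_alloc` are hypothetical, and plain `char *` is used in place of the CS50 `string` typedef so that it compiles without cs50.h. Both versions reserve `strlen(input) + 1` bytes, which also covers the missing `'\0'` terminator that Andrew Henle pointed out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: map one character to its leet substitute,
// using the same table as the switch in the question.
static char leet_char(char c)
{
    switch (c)
    {
        case 'a': return '6';
        case 'e': return '3';
        case 'i': return '1';
        case 'o': return '0';
        default:  return c;
    }
}

// Option 1 (hypothetical name): the caller owns the buffer, so it is
// still alive after this function returns. Returns false if `out` is
// too small for the converted text plus the '\0' terminator.
bool replace_into(const char *input, char *out, size_t out_size)
{
    size_t len = strlen(input);
    if (out_size < len + 1)
    {
        return false;
    }
    for (size_t i = 0; i < len; i++)
    {
        out[i] = leet_char(input[i]);
    }
    out[len] = '\0'; // terminate explicitly; strlen() did not count this byte
    return true;
}

// Option 2 (hypothetical name): allocate on the heap. Heap memory is not
// released when the function returns, so the pointer stays valid until
// the caller passes it to free(). Returns NULL if malloc() fails.
char *replace_alloc(const char *input)
{
    size_t len = strlen(input);
    char *out = malloc(len + 1);
    if (out == NULL)
    {
        return NULL;
    }
    replace_into(input, out, len + 1);
    return out;
}
```

With option 1, main() would declare something like `char leeted[64];` and call `replace_into(argv[1], leeted, sizeof leeted)` before printing. With option 2, main() would print the pointer returned by `replace_alloc(argv[1])` and then call `free()` on it. In both cases the characters live in storage that still exists when printf runs, which is the part the original `char output[...]` local could not guarantee.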
