0

EDIT: The problem was not undefined behaviour, but rather "mis-use" of char-arrays

I haven't worked a lot with pointers and dynamic memory allocation, so I decided I'd try to make a really simple encrypter using these methods. (this encrypter isn't supposed to be great, it uses Caesars method, only +1 instead of 3, and uses symbols in between the letters to make it harder to decrypt, not here for criticism on the algorithm)

The problem I'm facing I believe is undefined behaviour, but I don't really see how this happens. Say I want "Hello" to be encrypted, it only prints 'I', which comes after 'H' in the alphabet, but it stops there and the program becomes unresponsive, so I suppose the problem lies in the else part of the for loop. The compiler also warns me about heap corruption, but after looking through this page I think I'm freeing the memory correctly.

#include <iostream>
#include <string>
#include <ctime>

using namespace std;

char * enc(string);

int main()
{
    string foo = "Hello";

    char * bar = enc(foo);

    cout << *bar;

    delete[] bar;

    cin.get();

    return 0;
}

char * enc(string str)
{
    char * encrypted = new char[int(str.size())];
    srand(unsigned int(time(NULL)));

    // Disguise symbols for obscurifying text
    char * disg = new char[37]
    {
        //37 symbols, unrelevant and takes a lot of space.
    };

    for (int i = 0; i < str.size(); i++)
    {
        encrypted[i] = int(str[i]) + 1;
        if (i == str.size())
            break;
        else
            encrypted[i + 1] = disg[rand() % 37];
    }

    delete[] disg;

    return encrypted;
}

As a sidenote, I do realize that vectors might be better for this purpose, but I haven't gotten into that just yet, this is for practicing memory management.

Chaost
  • 137
  • 3
  • 12
  • @Someprogrammerdude No, I don't entirely understand why this is UB myself, and what I mean by unresponsive is that it stops after 'I', I can't click enter to close the program (which is what I use `cin.get()` for), and the only way to close it is through task manager, or the stop button in VS – Chaost Jan 24 '18 at 08:40
  • it never enters if (i == str.size()) since you stated in your for loop i < str.size() – Moia Jan 24 '18 at 08:41
  • Somewhat unrelated to your question: why do you use raw char pointers? The `enc` function should return a `string` and not a `char*`. – Jabberwocky Jan 24 '18 at 08:48
  • @MichaelWalz I understand it seems weird, I didn't want to do it like this either, but as I stated in Werner Henze's answer "I tired having `enc` return string at first, but I don't know how I can initialize a string with dynamic size, so I went with char." – Chaost Jan 24 '18 at 08:54
  • I also didn't want to overload this post with tons of questions (which I do have) as it becomes too inconclusive and wide, I figured I can try to fix the rest myself or make a separate, more in-depth question on these things – Chaost Jan 24 '18 at 08:55

3 Answers3

1
cout << *bar;

is printing the first character that bar points to.

You want

cout << bar;

But for that you would need to null terminate the returned string.

Side notes: please don't use raw pointers. disg should be for example a std::array. And enc should return a std::string.

if (i == str.size())can never be true inside your loop. Also as Some programmer dude says in his answer you are writing out of bounds in the lineencrypted[i + 1] = disg[rand() % 37];. But you will better see that if you remove the wrong if.

And for (int i = 0; i < str.size(); i++) could be rewritten as for(auto c : str).

Werner Henze
  • 16,404
  • 12
  • 44
  • 69
  • I tired having `enc` return string at first, but I don't know how I can initialize a string with dynamic size, so I went with char. I might try to improve upon this function as I get better, thank you for the help. – Chaost Jan 24 '18 at 08:52
  • @Chaost `string (size_t n, char c);` or `string::resize` or even simpler: initialize the result string as a copy of the input string. In your case str is not const, so you can modify str and then return the modified str. – Werner Henze Jan 24 '18 at 11:52
1

Lets take a closer look at the loop in the enc function:

for (int i = 0; i < str.size(); i++)
{
    encrypted[i] = int(str[i]) + 1;
    if (i == str.size())
        break;
    else
        encrypted[i + 1] = disg[rand() % 37];
}

The loop will iterate with i from 0 to str.size() - 1 (inclusive). That means the condition inside the loop, i == str.size() will never be true, and you will for the last iteration write out of bounds in the else clause.

The solution is to change the condition in the if to i == str.size() - 1.


There are some other problems as well, but none that will lead to UB like the above. For example, the value you write to encrypted[i + 1] will be overwritten the very next iteration.

Some programmer dude
  • 400,186
  • 35
  • 402
  • 621
  • I've made these changes and it kind of works, but this answers my question at least! I also noticed it overwriting the symbols, so I fixed that as well. There are just a few quirks going on right now, but I think I can fix that. Both answers here are correct and helped, but this explains how I can fix the code rather than make changes I don't entirely understand so I'll go with this. – Chaost Jan 24 '18 at 08:51
1

but I don't know how I can initialize a string with dynamic size

I've taken your code and modified it, to use a std::string and removed the allocations as they are simply not needed here.

Please read up on the code, debug it, and check the comments.

class Enc
{
public:
    //For arrays of fixed size, you may use this to obtain the size of the given array
    template <typename T, std::size_t N>
    constexpr static std::size_t FixedArraySize(T(&arr)[N])
    {
        return N;
    }

    //The parameter should not be modified, pass it as constant value by reference
    static std::string enc(const std::string &str)
    {
        //It looks like your symbols do not change, let's make them static to initialize only once
        static char symbols[] =
        {
            'A', 'x', '!' //...
        };

        auto originalStringSize = str.size();

        //Create the destination string and resize it to the source string's length
        std::string encrypted;
        encrypted.resize(originalStringSize);

        //Your loop
        for (int i = 0; i < originalStringSize; ++i)
        {
            //Make safe cast, which is also obvious to the reader of the code
            encrypted[i] = static_cast<int>(str[i]) + 1;

            //Fixed with - 1 to break correctly
            if (i == (originalStringSize - 1))
                break;
            else
                encrypted[i + 1] = symbols[rand() % FixedArraySize(symbols)]; //Notice the helper function
        }

        //Return the 'encrypted' string
        return encrypted;
    }
};

int wmain(int, wchar_t**)
{
    std::string testString = "Hello World";

    //Call srand only once in your program !
    srand(static_cast<unsigned int>(time(nullptr)));
    auto encrypted = Enc::enc(testString);

    std::cout << encrypted << std::endl;

    return 0;
}

As a sidenote, I do realize that vectors might be better for this purpose, but I haven't gotten into that just yet, this is for practicing memory management.

You could use a std::vector but it is here also not neccessary. You will rarely have to deal with raw pointers in modern C++.

Blacktempel
  • 3,935
  • 3
  • 29
  • 53
  • Although memory allocation was kind of the point with this entire project, I really appreciate this answer, it gives me a better understanding of how code should be planned and executed. – Chaost Jan 25 '18 at 07:47
  • I have 1 question though, is there any specific reason you went with `wmain()` rather than `main()` or is that just generally good practice? – Chaost Jan 25 '18 at 07:47
  • 1
    @Chaost I'm using Unicode Character Set rather than Multi-Byte as project settings. For example check this question: https://stackoverflow.com/questions/3064052/c-project-type-unicode-vs-multi-byte-pros-and-cons – Blacktempel Jan 25 '18 at 07:52