-4

I am trying to make a knockoff string struct that will give me the bare bones of what I need for my code (I don't need everything and want to make my code as fast&small as possible). Thus, other than grabbing source for strcpy and strcmp (am I allowed to do that?) I have made a struct hstring to help with my code. So far I have the following for the struct:

struct hstring{
private:
    char *s; // pointer to what holds the string
    int size; // size of the string
public:
    hstring(){
        s=(char *)malloc(0);
        size=0;
    }
    void set(const char* str){ // set the string
        size=0;
        while(str[size]!='\0')
            size++;
        s=(char*)realloc((void *)s,size*sizeof(*s)); // reallocate memory to hold just enough for the character array
        for(int i=0;i<size;i++)
            s[i]=str[i];
        s[size]='\0';
    }
    bool is(const char* str){ // check if something is equal to the string
        int i=0;
        while((s[i]==str[i])&&(str[i]!='\0'))
            i++;
        if((i==size)&&(str[i]=='\0'))
            return true;
        return false;
    }
    inline char* get(){ // return the string
        return s;
    }
    inline int length(){ // return the size of the string
        return size;
    }
};

I've noticed that the only way the set() function works is if I put an explicit string in there or don't have an array. For ex.

// This works
printf("\nTest1\n");
hstring test;
char tmp_c[50];
scanf("%s",tmp_c);
test.set(tmp_c);
printf("%s\n",test.get());

// This works
printf("\nTest2\n");
hstring test2[2];
test2[0].set("Hello ");
test2[1].set("world!");
printf("%s %s\n",test2[0].get(),test2[1].get());

// This works
printf("\nTest3\n");
hstring test3[2];
scanf("%s",tmp_c);
test3[0].set(tmp_c);
scanf("%s",tmp_c);
test3[1].set(tmp_c);
printf("%s %s\n",test3[0].get(),test3[1].get());

// This, what I want to do, does NOT work
printf("\nTest4\n");
hstring *test4 = (hstring *)malloc(2*sizeof(hstring));
for(int i=0;i<2;i++){
    scanf("%s",tmp_c);
    test4[i].set(tmp_c);
}
printf("%s %s",test4[0],test4[1]);
free(test4);

I'm confused as to why the fourth test doesn't properly run. It compiles but crashes upon reaching test4 and trying to reallocate memory in the .set() function. I get an "access violation reading location" error which made me assume I was writing/reading somewhere that I'm not supposed to; however, I cannot determine the exact cause (though I can tell the line causing the error is s=(char*)realloc((void *)s,size*sizeof(*s)); when trying to reallocate the size of the character array. Does someone notice an issue that I have overlooked?

xaxxon
  • 19,189
  • 5
  • 50
  • 80
hherbol
  • 91
  • 12
  • 2
    Why in `s=(char *)malloc(0)`, allocating `0` bytes **?** – Grijesh Chauhan Jul 24 '13 at 04:40
  • 2
    Even more importantly, why are you using `malloc()` and not `new`? –  Jul 24 '13 at 04:44
  • 1
    Also, you should be running this in a debugger in first place. And after all, why not just use `std::string` if you're in C++? –  Jul 24 '13 at 04:46
  • H2CO3 - I don't need the full functionality of strings and worry that it will slow down my code as I loop over 1000000 Times. Secondly, isn't new doesn't let me reduce so I chose to go with malloc. Finally @grijesh - I have to allocate something or else the code won't compile. As I don't know the size until I set it I decided to allocate to 0. – hherbol Jul 24 '13 at 04:52
  • @hherbol Sorry, your argument is invalid. Use `std::string`, seriously. Don't worry about efficiency - unless you benchmarked your app and actually found that this is a bottleneck, worrying about efficiency is premature optimization. Also, you shouldn't really shrink the array if the new string is shorter, only expand if it is longer. If you keep reallocating, it will quickly lead to terrible memory fragmentation. –  Jul 24 '13 at 04:56
  • @H2CO3 - I'm mainly worried about efficiency because I'm doing over 1,000,000 loops and doing high end calculations per loop. I'm simulating particles traveling a delta distance of an angstrom over several hundred microns... Every millisecond counts. – hherbol Jul 24 '13 at 05:02
  • 2
    @hherbol So, have you benchmarked it? Or are you still making assumptions only? (I'm pretty sure that `std::string` actually has better reallocation techniques and compiler-supported copying intrinsics that make it **faster** than your naive implementation...) –  Jul 24 '13 at 05:04
  • I never got around to benchmarking std::string as this code never workd. It was actually an assumption as you say and I was hoping to understand this code properly and then benchmark them and use the faster version... I think I'll probably just give in though and use std::string though I'm still going to try and accomplish this code for my education. Thanks for the assistance with this and the idea to redesign the code to do construction and destruction as little as possible. All constructive criticism is welcome. – hherbol Jul 24 '13 at 05:17

2 Answers2

2

For test4, you use malloc() to allocate memory for two objects:

hstring *test4 = (hstring *)malloc(2*sizeof(hstring));

However, the constructor is not called for either of them. So, neither test4[0] nor test4[1] have been properly initialized. Your class methods probably assume that the referenced object is initialized, and so you get non-deterministic behavior.

The way to fix this is to not use malloc() to allocate your test4, but use std::vector instead:

std::vector<hstring> test4(2);

Then, you can remove your call to free(), as test4 will be destructed properly when the object falls out of scope.

The hstring class itself is managing a pointer to allocated memory. Because of that, you will need to define a destructor for it that will free that memory.

struct hstring {
    //...
    ~hstring () { free(s); }
    //...
};

However, because the destructor has become necessary, it is now also necessary to define a copy constructor, and an assignment operator for your hstring class. This is known as the Rule of Three.

You can avoid this complexity by utilizing an object to manage the memory for you. For your hstring class, it is probably easiest to make the internal char * a std::vector<char> instead. Then there is no need for a destructor, copy constructor, and assignment operator.

Community
  • 1
  • 1
jxh
  • 69,070
  • 8
  • 110
  • 193
  • I was hoping there was another method without using vectors to accomplish this. My old code used vectors all over it and I was trying to improve the speed by sticking to basic arrays. (I ran a clock for vector vs basic array and found that the normal arrays were on average 1 millisecond to 10 milliseconds faster. At 1000000 loops which I'm going to be doing that's around 100 minutes or more I believe). – hherbol Jul 24 '13 at 05:00
  • @hherbol What's slow is construction and destruction of objects, `vector` is typically only at most 10% slower than plain arrays (unless the compiler optimizes it away completely, in which case even identical assembly may be generated in the case of vectors and raw arrays). You should redesign your code instead so that object construction and destruction happens only once (or at least as rarely as possible) and the same, continuously alive temporary object performs the operations. –  Jul 24 '13 at 05:05
  • 1
    @hherbol: Please refer to [this article](http://stackoverflow.com/a/3664349/315052) for a comparison of arrays and `vector`. We would need to see your program and how you timed it to judge whether you had accurately determined the cost of one vs. the other. – jxh Jul 24 '13 at 05:18
  • @jxh - I think I understand it a little more now. But I will have to do a lot more research on copy constructors and assignment operators as I've never had to use them until now. The reason I went through this whole ordeal with creating the hstring was that originally I wanted to hold `const std::string s*;` in a struct and couldn't figure out how to initialize it dynamically. I'll try to figure it out and if I can't come back to StackOverflow. ps. what was wrong with my question that led to (as of now) a -3? – hherbol Jul 24 '13 at 05:21
  • @jxh - hahaha oddly enough that's one of the articles I read that got me worried. It says that he optimized it down to being only 10 milliseconds slower. That's where I came up with that number originally. I then made some program (don't remember, this was about 10 hrs ago) to double check. Assuming 1,000,000 loops, 10 milliseconds slower for just one vector call each loop, leads to 166 min. Though I don't think I realized he did 1000 loops and push_backs, not calls... so that probably makes a big difference... I need sleep before I make any more stupid mistakes... – hherbol Jul 24 '13 at 05:26
  • @hherbol: If I had to guess, the down votes are due to that the `malloc()` on your object not calling your constructor was perhaps too obvious of a problem. The code that was run to test `vector` and array does over a million operations. So the difference in time is 10ms divided by a million, which takes into the nanosecond range for difference in speed of a single op between array and `vector`. – jxh Jul 24 '13 at 05:30
1

Passing 0 to malloc either returns NULL or special pointer which can be used to call free, it is better to make it initialize simply with NULL or better an empty string with a '\0' character.

You are printing printf("%s %s",test4[0],test4[1]); that is printing the structure itself. To print the string you should do printf("%s %s",test4[0].get (),test4[1].get ());

Also, you have a memory leak in the program. You malloced and realloced the s pointer in the hstring structure objects, but never freed them.

I noticed the constructor was not called when you do a malloc. You need to use new/delete instead of malloc/free

For the minimum change you need to do: hstring *test4 = new hstring[2]; when allocating and delete [] test4 when freeing.

See jxh's suggestions also.

phoxis
  • 60,131
  • 14
  • 81
  • 117
  • I'm not sure that's the error, but it certainly doesn't help... – Jiminion Jul 24 '13 at 04:53
  • After the modification, compiles without a warning with `-Wall` and `-Wextra` and runs fine. I don't think I am missing anything. – phoxis Jul 24 '13 at 04:58
  • I agree with the initialization to `'\0'` instead; however, I left out the .get cause I've been staring at this code for the last 6 hours at least and it's 1AM now :'( Re-running and it still gives the same error. – hherbol Jul 24 '13 at 04:58
  • I missed something, the constructor was not called, pointed out by jxh. See edited answer. – phoxis Jul 24 '13 at 05:07