2

Could someone please help me understand the following behaviour:

I have a little piece of code for cloning a float image. The method Clone takes a pointer to another image and its dimensions as arguments.

Everything works as expected, but sometimes this line clone[i] = color; causes an Access Violation. The occurrence of the exception is not predictable neither periodic. Inspecting variables at crash time shows that Color color = source[i]; is always set and valid.

How is it possible that malloc returns a bad pointer?

The code:

typedef struct
{
    float r;
    float g;
    float b;
    float a;
} Color;

Color* Clone(Color* source, int width, int height)
{
    int s = width * height;

    Color *clone;
    clone = (Color *)malloc(s * sizeof(Color));

    if (clone)
    {
        for (int i = 0; i < s; i++)
        {
            Color color = source[i];

            // Sometimes app crash here: Access violation
            clone[i] = color;
        }
    }

    return clone;
}

Any help is much appreciated.

Update:

Platform: Windows 64bit

Values of variables at crash time:

width = 256
height = 256
s = 655536
i = 0
JimDoddle
  • 51
  • 1
  • 5
  • 5
    Never cast the results of `malloc` – Magisch Oct 28 '15 at 14:24
  • Why would you not want to cast the void* returned by malloc in C? – Chris Oct 28 '15 at 14:27
  • Also please post the rest of your code and the exact error output. – Magisch Oct 28 '15 at 14:27
  • 2
    Most likely, somewhere in your program there is something that causes a heap corruption, which in turn destroys other parts of the program using the heap. – Lundin Oct 28 '15 at 14:27
  • @Chris: Because `void*` implicitly and safely casts to *any* pointer type. – datenwolf Oct 28 '15 at 14:28
  • 1
    @Chris Because in C any void* pointer is automaticly and safely promoted to the proper type on use, unlike in c++ – Magisch Oct 28 '15 at 14:28
  • 1
    Which compiler and platform do you use? What's the value of `width`, `height`, `i` and `s` when it crashes? – skyking Oct 28 '15 at 14:28
  • @Chris And casting to it can hide errors – Magisch Oct 28 '15 at 14:28
  • 4
    But FFS, please *stop complaining* about casting malloc. I'm sure there are more important things to discuss in the comment section of every bloody C question. – Karoly Horvath Oct 28 '15 at 14:29
  • All `int`s should be `size_t`s. – alk Oct 28 '15 at 14:29
  • 3
    @Chris [Dead horse malloc cast debate can be found here](http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc?rq=1). Now, can everyone kindly stop spamming this question with these irrelevant, off-topic remarks. – Lundin Oct 28 '15 at 14:30
  • You have a strange way of copying your Color struct. Why don't you just try to do a memcpy of sizeof(Color) ? – KeylorSanchez Oct 28 '15 at 14:35
  • 1
    As @Lundin mentioned, the problem is somewhere else. Produce an MCVE. And for your own mental health use a memory santizier (valgrind, clang, gcc...) – Karoly Horvath Oct 28 '15 at 14:35
  • We need to know more about `Color* source`. – Kotshi Oct 28 '15 at 14:37
  • 2
    @KeylorSanchez What's strange about it? It is perfectly fine practice and equivalent to a memcpy. In fact it will most likely result in the same code as memcpy, after optimization. – Lundin Oct 28 '15 at 14:37
  • 1
    @KarolyHorvath the hope is that this stupid dogma will die someday. A single one-line comment with a link to the canonical SO answer (kindly provided by Lundin above) would be sufficient though. – Quentin Oct 28 '15 at 15:00
  • 4
    @KeylorSanchez: Actually if we go into full language C struct assignment is the proper way to do it. memcpy does a flat, void copy on the whole struct, but according to the C language standard in fact only accesses to actual struct elements are defined. Depending on how the compiler pads the struct a memcpy will also read/write "between" the elements, which is, if we're super-technical and totally OCD, illegal and undefined behavior. – datenwolf Oct 28 '15 at 15:04
  • @datenwolf thanks for the explanation – KeylorSanchez Oct 28 '15 at 15:26
  • Thanks for the comments and answers. I will change the points suggested and come back as soon as I can exclude some points or got the problem resolved. – JimDoddle Oct 28 '15 at 18:41

7 Answers7

3

I can see nothing terribly wrong with this code. However malloc can indeed return garbage if the heap has been corrupted before. Actually quite often malloc is when one detects that something went wrong and you get an explicit "heap corruption" error message.

My suggestion is, if possible, to run the program under valgrind in the hope to catch the real bad guy that corrupts heap data structures... something that happens BEFORE calling this cloning function.

6502
  • 112,025
  • 15
  • 165
  • 265
2

This

int s = width * height;

is prone to an multiplication integer overflow. If that happens you're invoking undefined behavior (since the overflow behavior of signed integers is not defined); usually malloc will allocate a buffer too short.

EDIT: Such undefined overflow can also happen if either one of width or height is negative.

To avoid this you have to check for an multiplication overflow. The only reliable way to do this, is by using unsigned arithmetic (for which overflow behavior is defined).

if( 0 > width
 || 0 > height
){
    return ERROR_INVALID_VALUE;
}
size_t const sz_width  = width;
size_t const sz_height = height;
/* ((size_t)x) != x makes use of arithmetic conversion
 * rules to check for truncation by the cast  */
if( sz_width  != width
 || sz_height != height
){
    return ERROR_TRUNCATION;
}

/* now check if the multiplication overflows */
/* size_t is unsigned, so overflow is well behaved */
size_t const sz = sz_width * sz_height;
if( (sz / sz_width) != sz_height ) {
    return ERROR_OVERFLOW;
}
datenwolf
  • 159,371
  • 13
  • 185
  • 298
  • minusone. *Seriously*? Have many megapixels do you think this image has? – Karoly Horvath Oct 28 '15 at 14:39
  • @KarolyHorvath: `(4*1024*1024*1024/(sizeof(float)*4))**0.5 = 16384` – that's not so much, truth to be told. I frequently work with images 4 times as wide and high. – Also maybe OP is passing a negative value into either width or height, which causes the same problem. – datenwolf Oct 28 '15 at 14:42
  • 1
    Actually, that problem shouldn't result in this error. First of all overflow is not undefined behavior, it's actually well defined behavior (IIRC). If you get wrong value for `s` you would still have that `i – skyking Oct 28 '15 at 14:44
  • @skyking: Signed overflow is **undefined behavior** in C. – datenwolf Oct 28 '15 at 14:44
  • It seems a reasonable point to me @KarolyHorvath. In C a signed `int` only needs to support up to +32767, so simple 640x480 image would be too big for `s`. – GrahamS Oct 28 '15 at 14:44
  • @datenwolf: Now you're pulling numbers out of thin air? Just delete this answer and leave a comment... – Karoly Horvath Oct 28 '15 at 14:45
  • @GrahamS: *Reasonable*? No, it's just a very very very vey improbable scenario. – Karoly Horvath Oct 28 '15 at 14:46
  • @KarolyHorvath: I'm not pulling numbers out of thin air here. 32 bit is still a widely used size for integers (for example on Windows, even Windows 64 a int is 32 bits). This gives you 4GiB you can address and the struct up there required 16 bytes (if packed). 4 bytes a float, 4 floats. divide that and take the square root. – datenwolf Oct 28 '15 at 14:47
  • Ah. But that's width and height, not megapixels. – Karoly Horvath Oct 28 '15 at 14:48
  • 1
    OK, so it's undefined, but in practice you use a multiplication instruction here and the result should in most cases result in either result being whatever the processor delivers or that the processor raises exception right there. – skyking Oct 28 '15 at 14:49
  • @KarolyHorvath: Yes. I'd expect somebody actually follow a calculation thinking about what's going on. Of course the simple answer would have been 256 megapixels, but how wide and high would that be. – datenwolf Oct 28 '15 at 14:50
  • @skyking: Nope. If a compiler detects that your calculation (like comparing results on such a signed multiplication) tests for a value that can only happen in the undefined case, then it legally any discard any code using the resulting values. It may and modern compilers **do that!** – datenwolf Oct 28 '15 at 14:52
  • Which could be the case if width and height are *constants*. Well, another assumption. – Karoly Horvath Oct 28 '15 at 14:53
  • @KarolyHorvath: Could have been a negative value for wither width or height either. Same outcome: Undefined behavior and very likely a value passed into malloc that's way outside a reasonable range so a buffer to small is returned. OP does test for malloc returning NULL and since that test passes the only explanation is, that somewhere some arithmetic overflows. – datenwolf Oct 28 '15 at 14:53
  • @KarolyHorvath: No, not an assumption. For example the DIB file format (you know, Windows `.bmp` permits the height header field to be negative to indicate that the scanlines are in bottom–up order instead top–down. If OP does read a DIB that's been written that way, this happens. – datenwolf Oct 28 '15 at 14:55
  • @datenwolf: oh no Sir, heap corruption is the likely culprit here. But let's stop this ridiculously silly discussion here. I'm sure the OP has something to say on the matter. – Karoly Horvath Oct 28 '15 at 14:55
  • @datenwolf Nope, GCC for example seem to emit just an `imul`, no test seem to be done for overflow. Besides why would a "modern" compiler waste time on checks for conditions that have undefined behavior anyway? The reasonable way would to be to just multiply and let whatever happens happen if it overflows. This means that you would end up with a value in `s` and the execution would continue after that. – skyking Oct 28 '15 at 14:56
  • Anyway if now overflow is so bad, you should not just multiply `sz_width` and `sz_height` without first checking so they can't overflow. – skyking Oct 28 '15 at 14:56
  • @skyking: **signed** overflow is undefined. **unsigned** overflow *is well defined*. `size_t` is unsigned, so the overflow behavior for `sz_width * sz_height` is well defined, and by using the identity `sz = sz_width * sz_height` => `sz_height = sz / sz_width` one can check if a overflow happened. – datenwolf Oct 28 '15 at 14:58
  • @skyking: Please read those two articles on UB: http://blogs.msdn.com/b/oldnewthing/archive/2014/06/27/10537746.aspx and http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html – datenwolf Oct 28 '15 at 15:01
  • 1
    @KarolyHorvath: Since OP now gave actual values for the variables, yes, heap corruption seems to be the culprit. Nevertheless my answer still holds significance (IMHO) so I see no point in altering or deleting it. OP likely does have heap corruption. Other people with a similar problem, that's due to invalid values however may search for this and then a discussion about overflow behavior and checks is necessary reading material. – datenwolf Oct 28 '15 at 15:07
  • @datenwolf: Sry, I was just following my instincts, based on likelihoods. Good old Bayes' theorem. – Karoly Horvath Oct 28 '15 at 15:09
  • @KarolyHorvath: And I was following mine. As already told, our group is working with **huge** images all the time and this kind of overflow error I frequently have to deal with in programs newbies in our group write. I always tell them "use the libraries provided", but they often don't listen. – datenwolf Oct 28 '15 at 15:22
  • @datenwolf: and you've extrapolated your group's tasks and tendencies to the rest of the world. I guess this is some kind of occupational hazard ;) – Karoly Horvath Oct 28 '15 at 15:26
  • @datenwolf I **know** what undefined behavior means, but still the fact is that I haven't seen a implementation that will do anything entirely unexpected in this situation. The odds is in favor of that the OP's compiler does not do that either. – skyking Oct 29 '15 at 06:10
1

I suppose that if you have range checked the width and height as being reasonable (so you won't overflow) the best course of action would be to try using valgrid. That way you will be able to see if you've had some memory error before this that might result in malloc misbehaving or if you've actually got a memory block that isn't large enough.

skyking
  • 13,817
  • 1
  • 35
  • 57
1

You don't say what your target platform is, but this:

int s = width * height;

will cause an overflow if width * height would produce a number greater than MAX_INT. The C standard only requires a signed int to store up to +32767.

Your target platform may use larger ints, especially if it is a desktop OS, but it is still bad practise.

Also your function signature allows for negative values to be passed as width or height, but your code doesn't handle that possibility.

Edit: in summary, use more appropriate types. width and height should probably be unsigned int. If so then s and i should be unsigned long.

GrahamS
  • 9,980
  • 9
  • 49
  • 63
  • More an attempt to clarify one aspect of the @datenwolf answer and explain why it is a bad idea. I think it is better to use more appropriate types from the start, rather than try to detect overflows. – GrahamS Oct 28 '15 at 15:12
  • Note: The OP has added that the target platform is Win64, so `int` is probably sufficiently wide there that this won't cause an overflow. Still bad practise though and not cross-platform. – GrahamS Oct 28 '15 at 15:15
  • 1
    "I think it is better to use more appropriate types from the start, rather than try to detect overflows." -- Good point, but you don't make the suggestion in the answer. ;-) If you do, *please* use standard types, not Windows-specific ones. There are children watching. ;-) – DevSolar Oct 28 '15 at 15:52
  • Okay thanks @DevSolar, I've edited the answer to make that point clearer. – GrahamS Oct 28 '15 at 16:00
  • "*should probably be*" `size_t`. – alk Oct 28 '15 at 18:45
  • Personally I wouldn't use `size_t` here @alk. It would work but would be misleading as `size_t` is intended to be used for values that are sizes in bytes. – GrahamS Oct 28 '15 at 20:41
  • ... and to index arrays. @GrahamS – alk Nov 01 '15 at 08:31
  • @alk There is a nice summary of the reasons behind introducing the size_t type here: http://www.embedded.com/electronics-blogs/programming-pointers/4026076/1/Why-size-t-matters – GrahamS Nov 01 '15 at 09:13
1

When you are compiling .c source in Visual Studio that doesn't include stdlib.h (where malloc defined as returning void*) and Visual Studio uses its own definition, where malloc returns int.

Visual studio prints:

warning C4013: 'malloc' undefined; assuming extern returning int

So your pointer is truncated at 4 bytes rather than 8. It seems this problem appears only when compiling in x64 mode in .c files.

So, solution is - just include stdlib.h.

1

Well I cannot upvote but my problem was the problem Lance Larsen described. I was compiling C code using Visual Studio and malloc, or in my case calloc, was not working. VS was throwing no errors at the point calloc was run, however no memory was being allocated to my struct and then later on in the code when it came time to use the struct an error would be thrown stating that a memory access violation occurred while attempting to write to memory.

As Lance advised, I included stdlib.h and that resolved the problem. So a big thanks to Lance.

nucode
  • 51
  • 5
  • Please don't add "thank you" as an answer. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation), you will be able to [vote up questions and answers](https://stackoverflow.com/help/privileges/vote-up) that you found helpful. - [From Review](/review/late-answers/33455719) – Gerhard Dec 23 '22 at 05:47
0

The malloc call is probably not your problem (as @KarolyHorvath said, the size is not really very big). The most likely problem is that the incoming source is null or empty; you should check for that before trying to reference source[i].

FredK
  • 4,094
  • 1
  • 9
  • 11