0

My professor in C++ has shown us this as an example in overloading the operator new (which i believe is wrong):

class test {
    // code
    int *a;
    int n;
public:
    void* operator new(size_t);
};

void* test::operator new(size_t size) {
    test *p;
    p=(test*)malloc(size);
    cout << "Input the size of array = ?";
    cin >> p->n;
    p->a = new int[p->n];
    return p;
}

Is this right?

nikitas350
  • 57
  • 4
  • 3
    It isn't wrong, but it isn't really useful in a real-world sense either. It is a purely academic exercise to show you how you would go about overloading `operator new`. – Zac Howland Dec 21 '10 at 18:42
  • What do you think is wrong about it? – Joel Rondeau Dec 21 '10 at 18:44
  • The definition of `operator new` should have return type `void*`, not `void`. This will break horribly if any other class derives from `test`. It's also a really terrible mind-wrenching idea as a whole. – aschepler Dec 21 '10 at 18:45
  • 3
    I believe that is "wrong" because he access the object before the constructor. – nikitas350 Dec 21 '10 at 18:46
  • Hm. Also, I seriously doubt C++ guarantees the compiler won't generate code that stomps on bytes in the `test` object after `operator new` returns. – aschepler Dec 21 '10 at 18:48
  • @Noah: There are some special performance-related cases where using a custom memory manager is required. Those cases are very rare and specialized, however. @nikitas: There is nothing "wrong" with the order or syntax of the code. The only thing wrong is that it is simply a terrible practice. A much better way to show you how to overload `new` would be to give you a real-world case where using a custom memory manager was necessary and show you a simple version of how it can be done. Take everything your professors show you with a grain of salt; many times, their code isn't useful. – Zac Howland Dec 21 '10 at 18:52
  • @Zac - duh. Very rare indeed. – Edward Strange Dec 21 '10 at 19:08
  • Just some examples of when you might need to use your own memory manager: game engines, embedded development, high-performance simulations, intensive graphics processing. – Zac Howland Dec 21 '10 at 19:21
  • @Zac Howland: It is clearly wrong. Please check the code I posted below that shows it does not work. – Martin York Dec 21 '10 at 19:36
  • @ Joel Rondeau: @Zac Howland: and all the other fools that votted Zac's comment up. It is wrong so wrong. Stop voting with the herd and actally know the answer before you make a comment like that. See code below to see how it blows up. Or at least read the accepted answer by @Steve Jessop: that explains why it will not work in several situations. – Martin York Dec 21 '10 at 19:38
  • @Zac: @Noah: Never its broken. – Martin York Dec 21 '10 at 19:39
  • @Martin: It isn't "wrong" in that it is legal, and for the sole purpose of the code that the professor is likely using it for (using a default constructor by way of `test* t = new test;`) it will compile and operate as expected. It is "wrong" in that it isn't useful for anything beyond that purely academic exercise. I completely agree that it is a horrible example. – Zac Howland Dec 21 '10 at 19:42
  • @Zac Howland: Syntactically legal does not make the code legal. There are hundreds(probably thousands) of examples of syntactically legal code that are undefined behavior. Doing `test* t = new test();` generates a memory leak and probably a de-referencing of a NULL pointer. So it is wrong in that it can not be consistently used correctly without looking at the implementation (and who looks at the implementation while using a library (if it is even available)). – Martin York Dec 21 '10 at 19:52
  • @Martin: I don't disagree; but at the same time, I don't hold professors to the same standard I hold programmers. Professors rarely (if ever) test out more than one use case. If it worked for what he wanted to show, he probably didn't look any further. As a side note, I'd facepalm if you actually wrote `test* t = new test();` in production code. Just reading it makes my face cringe ... – Zac Howland Dec 21 '10 at 20:22
  • @Zac: It is wrong in the sense that it is not legal code. It evokes undefined behavior. Just because you can sneak it past the compiler doesn't make it "OK" – John Dibling Dec 21 '10 at 20:23
  • @Zac: Professors should be held to a higher standard. That they aren't is IMO a big reason why all newbie programmers coming out of college can't program their way out of a wet paper bag. It's the professors' faults. – John Dibling Dec 21 '10 at 20:24
  • @John: It doesn't invoke undefine behavior when used in the method the professor is likely using it (only in the other use cases). I don't disagree that its professors' faults that newbies are terrible, but I cannot in good conscious hold a professor who either has never written production code, or it has been so long since they've seen it that they wouldn't recognize it if it smacked them upside the head, to a higher standard than those who write real code everyday. Theoretically they should be, but it just isn't practical. – Zac Howland Dec 21 '10 at 20:37
  • @Zac: I'm pretty sure it does invoke undefined behavior. The class is non-POD, so regardless of whether it's in `operator new` or not, `test *p = (test*)malloc(sizeof(test)); p->a = 0;` invokes undefined behavior. You can't access the members of a non-POD class before construction starts. I think the compiler is permitted to make `p->a` expand to something looking like a virtual function call, that gets the location of `a` using hidden magic data in `p` that has indeterminate values before constructions starts. It's just that no implementation does that, because it's pointless. – Steve Jessop Dec 22 '10 at 00:44
  • Yes, probably the standard does go too far in allowing implementations free reign with non-POD classes, and IIRC C++0x pulls it back somewhat, but if we're arguing whether the code is valid today, we're using today's standard... – Steve Jessop Dec 22 '10 at 00:49

7 Answers7

2

It's definitely "not right", in the sense that it's giving me the creeps.

Since test has no user-declared constructors, I think it could work provided that the instance of test isn't value-initialized (which would clear the pointer). And provided that you write the corresponding operator delete.

It's clearly a silly example, though - user interaction inside an overloaded operator new? And what if an instance of test is created on the stack? Or copied? Or created with test *tp = new test(); in C++03? Or placement new? Hardly user-friendly.

It's constructors which must be used to establish class invariants (such as "I have an array to use"), because that's the only way to cover all those cases. So allocating an array like that is the kind of thing that should be done in a constructor, not in operator new. Or better yet, use a vector instead.

As far as the standard is concerned - I think that since the class is non-POD the implementation is allowed to scribble all over the data in between calling operator new and returning it to the user, so this is not guaranteed to work even when used carefully. I'm not entirely sure, though. Conceivably your professor has run it (perhaps many years ago when he first wrote the course), and if so it worked on his machine. There's no obvious reason why an implementation would want to do anything to the memory in the specific case of this class.

I believe that is "wrong" because he access the object before the constructor.

I think you're correct on this point too - casting the pointer returned from malloc to test* and accessing members is UB, since the class test is non-POD (because it has private non-static data members) and the memory does not contain a constructed instance of the class. Again, though, there's no reason I can immediately think of why an implementation would want to do anything that stops it working, so I'm not surprised if in practice it stores the intended value in the intended location on my machine.

Steve Jessop
  • 273,490
  • 39
  • 460
  • 699
  • Actually we have an exercise to do using this style, (also he frees the pointed by the pointer a memory in delete) – nikitas350 Dec 21 '10 at 18:53
  • 3
    @nikitas350: my sympathy regarding the exercise. In order of preference, I would: (a) skip the class; (b) do the exercise, but put things that should be in the constructor in the constructor and dare the marker to disagree; (c) do as I'm told but try not to learn anything. Depends how badly you need the course credits and if it's a prerequisite for something better. There's no excuse for teaching this. Sometimes you don't want to introduce too many concepts at once, but `operator new` doesn't come before constructors. Just using `malloc` is enough to demonstrate how to write an `operator new`. – Steve Jessop Dec 21 '10 at 19:07
  • It does have a constructor, copy-constructor, and copy-assignment. If you do not define these, the compile does so for you. The code will work (in the sense that it is operational) despite being completely useless from a non-academic perspective. It is a really bad example of implementing your own memory manager. – Zac Howland Dec 21 '10 at 19:19
  • @Zac: yes, the jargon term I intended to use was "user-declared constructors", I wrongly said "defined constructors" instead. The point is that a non-POD class with no user-declared default constructor value-initializes all its members when it is value-initialized, whereas a non-POD class with a user-declared default constructor calls the constructor when value-initialized. – Steve Jessop Dec 21 '10 at 19:22
2

Did some Standard checking. Since test has private non-static members, it is not POD. So new test default-initializes the object, and new test() value-initializes it. As others have pointed out, value-initialization sets members to zero, which could come as a surprise here.

Default-initialization uses the implicitly defined default constructor, which omits initializers for members a and n.

12.6.2p4: After the call to a constructor for class X has completed, if a member of X is neither specified in the constructor's mem-initializers, nor default-initialized, nor value-initialized, nor given a value during execution of the body of the constructor, the member has indeterminate value.

Not "the value its memory had before the constructor, which is usually indeterminate." The Standard directly says the members have indeterminate value if the constructor doesn't do anything about them.

So given test* p = new test;, p->a and p->n have indeterminate value and any rvalue use of them results in Undefined Behavior.

aschepler
  • 70,891
  • 9
  • 107
  • 161
1

The creation/destruction of objects in C++ is divided into two tasks: memory allocation/deallocation and object initialization/deinitialization. Memory allocation/deallocation is done very differently depending on an object's storage class (automatic, static, dynamic), object initialization/deinitialization is done using the object's type's constructor/destructor.

You can customize object initialization/deinitialization by providing your own constructors/destructor. You can customize the allocation of dynamically allocated objects by overloading operator new and operator delete for this type. You can provide different versions of these operators for single objects and arrays (plus any number of additional overloads).

When you want to fine-tune the construction/destruction of objects of a specific type you first need to decide whether you want to fiddle with allocation/deallocation (of dynamically allocated objects) or with initialization/deinitialization. Your code mixes the two, violating one of C++' most fundamental design principle, all established praxis, every known C++ coding standard on this planet, and your fellow-workers' assumptions.

sbi
  • 219,715
  • 46
  • 258
  • 445
1

Your professor is completely misunderstanding the purpose of operator new whose only task is to allocate as much memory as was asked and to return a void* to it.

After that the constructor is called to initialize the object at that memory location. This is not up to the programmer to avoid.

As the class doesn't have a user-defined constructor, the fields are supposed to be uninitialized, and in such a case the compiler has probably freedom to initialize them to some magic value in order to help finding use of uninitialized values (e.g for debug builds). That would defeat the extra work done by the overloaded operator.

Another case where the extra work will be wasted is when using value-initialization: new test();

UncleBens
  • 40,819
  • 6
  • 57
  • 90
1

This is very bad code because it takes initialization code that should be part of a constructor and puts it in operator new which should only allocate new memory.

The expression new test may leak memory (that allocated by p->a = new int[p->n];) and the expression new test() definitely will leak memory. There is nothing in the standard that prevents the implementation zeroing, or setting to an alternate value, the memory returned by a custom operator new before that memory is initialized with an object even if the subsequent initialization wouldn't ordinarily touch the memory again. If the test object is value-initialized the leak is guaranteed.

There is also no easy way to correctly deallocate a test allocated with new test. There is no matching operator delete so the expression delete t; will do the wrong thing global operator delete to be called on memory allocated with malloc.

CB Bailey
  • 755,051
  • 104
  • 632
  • 656
0

This does not work.

Your professor code will fail to initialize correctly in 3/4 of cases.

  • It does not initialize objects correctly (new only affects pointers).
  • The default constructor generated for tests has two modes.
    • Zero Initialization (which happens after new, but POD are set to zero)
    • Default Initialization (POD are uninitialized)

Running Code (comments added by hand)

$ ./a.exe
Using Test::new
Using Test::new
A Count(  0)               // zero initialized:  pointer leaked.
A Pointer(0)
B Count(  10)              // Works as expected because of default init.
B Pointer(0xd20388)
C Count(  1628884611)      // Uninitialized as new not used.
C Pointer(0x611f0108)
D Count(  0)               // Zero initialized because it is global (static storage duration)
D Pointer(0)

The Code

#include <new>
#include <iostream>
#include <stdlib.h>

class test
{
    // code
        int *a;
        int n;
    public:
        void* operator new(size_t);

        // Added dredded getter so we can print the values. (Quick Hack).
        int* getA() const { return a;}
        int  getN() const { return n;}
};

void* test::operator new(size_t size)
{
    std::cout << "Using Test::new\n";
    test *p;
    p=(test*)malloc(size);
    p->n = 10;             // Fixed size for simple test.
    p->a = new int[p->n];
    return p;
}

// Objects that have static storage duration are zero initialized.
// So here 'a' and 'n' will be set to 0
test d;

int main()
{
    // Here a is zero initialized. Resulting in a and n being reset to 0
    // Thus you have memory leaks as the reset happens after new has completed.
    test* a = new test();
    // Here b is default initialized.
    // So the POD values are undefined (so the results are what you prof expects).
    // But the standard does not gurantee this (though it will usually work because
    // of the it should work as a side effect of the 'zero cost principle`)
    test* b = new test;

    // Here is a normal object.
    // New is not called so its members are random.
    test  c;

    // Print out values
    std::cout << "A Count(  " << a->getN() << ")\n";
    std::cout << "A Pointer(" << a->getA() << ")\n";
    std::cout << "B Count(  " << b->getN() << ")\n";
    std::cout << "B Pointer(" << b->getA() << ")\n";
    std::cout << "C Count(  " << c.getN() << ")\n";
    std::cout << "C Pointer(" << c.getA() << ")\n";
    std::cout << "D Count(  " << d.getN() << ")\n";
    std::cout << "D Pointer(" << d.getA() << ")\n";
}

A valid example of what the professor failed to do:

class test
{
    // code
        int n;
        int a[1];  // Notice the zero sized array.
                   // The new will allocate enough memory for n locations.
    public:
        void* operator new(size_t);

        // Added dredded getter so we can print the values. (Quick Hack).
        int* getA() const { return a;}
        int  getN() const { return n;}
};

void* test::operator new(size_t size)
{
    std::cout << "Using Test::new\n";

    int  tmp;
    std::cout << How big?\n";
    std::cin  >> tmp;

    // This is a half arsed trick from the C days.
    // It should probably still work.
    // Note: This may be what the professor should have wrote (if he was using C)
    //       This is totally horrible and whould not be used.
    //       std::vector is a much:much:much better solution.
    //       If anybody tries to convince you that an array is faster than a vector
    //       The please read the linked question below where that myth is nailed into
    //       its over sized coffin.
    test *p =(test*)malloc(size + sizeof(int) * tmp);
    p->n = tmp;
    // p->a  = You can now overflow a upto n places.
    return p;
}

Is std::vector so much slower than plain arrays?

Community
  • 1
  • 1
Martin York
  • 257,169
  • 86
  • 333
  • 562
  • 1
    Professor's answer is going to be, "well don't do the things that don't work, then". You and I think it's stupid to write a class that only works when created in a specific way. We further think that since the compiler-generator copy constructor also causes leaks and double-frees, it should be suppressed. But the Professor's response is superficially plausible, since it's the same response that people who actually know what they're talking about give, when asked "deleting a subclass of `vector` through a pointer to `vector` doesn't work - does this mean the `vector` class is wrong?". – Steve Jessop Dec 21 '10 at 19:37
  • Btw, I don't want to suggest that the professor has a leg to stand on. Just that writing some code that doesn't work, that uses code X in a particular way, doesn't prove that code X is wrong. Other things prove that. – Steve Jessop Dec 21 '10 at 19:39
  • 2
    @Steve Jessop: The problem is we (the old developers) have to deal with this kids that leave college thinking they can program (but have no clue) because they have been tough by a bunch of half wit incompetents into doing stupid things like the above. Sure there are a couple of things that canbe done to fix it, but it is much simpler to do it the way the language expects you to do it (constructor) and doing this way makes the life of the maintainer (me) easier. For you kids. I know where you live and I own an Axe. **DONT** write code like this. – Martin York Dec 21 '10 at 19:45
  • @Steve: The professor's more likely answer would be something along the lines of, "This was meant as a simple example of creating your own memory manager." A real-world example would have been far more educational, even though the code would be more complex. – Zac Howland Dec 21 '10 at 19:46
  • @Zac: if it's meant as a simple example of creating your own memory manager, then why not stop at calling `malloc` from `operator new` and `free` from `operator delete`, as proof of concept that this code controls the source of the memory? This *isn't* a simple example, it's needlessly over-complicated by the invalid stuff. Doesn't necessarily mean that the prof wouldn't say what you say he will, because he might *think* almost anything, given that he doesn't know what he's doing, so I'll stop guessing what specific delusion he suffers from... – Steve Jessop Dec 21 '10 at 19:52
  • @Martin: true dat. I'd say that the professor might not be a half-wit. It's possible he just knows very little about C++, and cares even less, but ended up running the class anyway for some stupid reason. – Steve Jessop Dec 21 '10 at 19:55
  • @Zac: @Steve: @nikitas350: Updated my example with a working version of how he probably should have done it. – Martin York Dec 21 '10 at 19:59
  • Also note: A lot of debug builds will zero out or reset the memory content to some magic number after the call to new. This makes debugging easier and of course will cause all sorts of headaches to the debuggers as the code that is supposed to help them is breaking the code. – Martin York Dec 21 '10 at 20:01
  • @Martin: maybe. Much better chance of working, although that's still not strictly conforming. Bad C++ programmers frequently seem to display this almost overwhelming aversion to `vector`, I'm not sure there's much value in offering them new and ever more sophisticated ways to dynamically allocate an array badly ;-) – Steve Jessop Dec 21 '10 at 20:02
  • In pedantic mode, g++ informs me that zero-sized array is not ISO C++. – UncleBens Dec 21 '10 at 20:18
  • @Steve: I don't disagree; you just have to remember this is an academic we are talking about. 99.99% of what I saw in college was either completely useless or absurdly wrong when I started working. @Martin: Don't get me wrong, I feel your pain with bad young programmers and I'm not defending the coding at all - just noting that with what the professor was doing, it will (at least seemingly) work. I've fired programmers for less poorly written code ... – Zac Howland Dec 21 '10 at 20:20
  • UncleBens: @Steve: That's why I noted it was a half arsed C trick. OK my ranting is over. I just accept there are stupid people teaching programming (including myself sometimes) and move on. – Martin York Dec 21 '10 at 20:21
  • While it is a motivating example, shouldn't you rather use a factory function (since `new` doing I/O is too bizarre), and wouldn't it be possible to just overallocate without having the non-standard array member (`getA` returning the address beyond the end of the instance - alignment issues being dealt with)? – UncleBens Dec 21 '10 at 20:23
  • @UncleBens: I would never ever ever do anything like this. The least example was just a historic curiosity better left for the C boys and girls to play with. – Martin York Dec 21 '10 at 20:24
-2

As you show this is wrong. You can also see how easy it is to get this wrong.

There usually isn't any reason for it unless you are trying to manage your own memory allocations and in a C++ environment you would be better off learning the STL and write custom allocators.

Casey
  • 12,070
  • 18
  • 71
  • 107