Use case

I have a Node class that reads, creates and sends Message objects to other Node classes. I do not wish to modify the Message class constructor because I use it in other classes as well.

Message objects basically look like this (Message.hpp):

class Message {
public:
    size_t size;
    char* buffer;

    // Members are initialized in declaration order: size first, then buffer.
    Message(size_t capacity) : size(0), buffer(new char[capacity]) {}
};

A use case could look like this (.cpp file):

Node::Node() : header_size(sizeof(int)) {}

void Node::processMessage(const unique_ptr<Message>& message){
    vector<Event>& events = readEvents(message); // reads the char* buffer
    const int message_id = readMID(message); // reads another part of the buffer
    vector<unique_ptr<Message>> messagesToSend;

    
    for (const Event& event : events){
        unique_ptr<Message> message(new Message(header_size + sizeof(Event)));
        writeInMessage<Event>(message, event);
        writeInMessage<int>(message, message_id);
        messagesToSend.push_back(std::move(message));
    }

    sendMessages(messagesToSend);
}

This use case simply splits the events contained in a message into many separate messages, all keeping the same message id.

In my use cases, the message header is usually an integer that identifies the message, and the rest of the buffer is filled with a sequence of events that may be of different types. I now want to modify Node so that it manages the identifiers automatically, letting the user concentrate on event processing. I wondered whether I could overload or replace operator new specifically to add header_size to the amount of memory allocated whenever a new Message object is created inside the Node class.

The overload should allow me to write something like this (.cpp):

void Node::processMessage(const unique_ptr<Message>& message){
    vector<Event>& events = readEvents(message);
    vector<unique_ptr<Message>> messagesToSend;

    
    for (const Event& event : events){
        unique_ptr<Message> message(new Message(sizeof(Event)));
        writeInMessage<Event>(message, event);
        messagesToSend.push_back(std::move(message));
    }

    sendMessages(messagesToSend);
}

The management of the header would be done by other methods called before or after processMessage().

I have already looked at

I have read a few Stack Overflow questions and some documentation on how replacing operator new works, but it was always done globally or for the class being allocated (in my case I do not wish to replace the operator in the Message class but in the Node class).

I vaguely remember a Stack Overflow FAQ on operator overloading, including new/delete overloading, that recommended against modifying these operators: What are the basic rules and idioms for operator overloading?

Questions

For this purpose, is overloading operator new in the Node class a good idea, or at least a legitimate one? If not, what would be the correct approach?

If this approach is legitimate, what would it look like? Would the Message class need to declare the overloaded operator as a friend function or something like that? Also, would I need to overload operator delete as well?

  • So the only way to allocate more memory to the buffer is to go through the constructor of the `Message` class ? I do not wish to modify the `Message` class. Can't I increase the amount of raw memory and write more than sizeof(Event) bytes in the buffer ? Will that cause a memory leak ? – Big Ben Baggle Jun 23 '20 at 20:35
  • Are you trying to get `Message` to be allocated with an extra `sizeof(int)` bytes when `Node` creates it? – François Andrieux Jun 23 '20 at 20:47
  • If there is no other choice, I will just keep the code as it is - using `new Message(header_size + sizeof(Event))`. But if there is a way to write `new Message(sizeof(Event))` without touching the `Message` constructor and reserve the same amount of memory then I'll take it. Maybe `operator new` is not the way though. – Big Ben Baggle Jun 23 '20 at 20:49
  • I just saw your answer. Yes, I wish to allocate `sizeof(int)` more bytes without having to write it every time. But only in the code of the `Node` class. – Big Ben Baggle Jun 23 '20 at 20:50
  • There are only two options for overloading `operator new`. You can overload the global `operator new` or the member `operator new`. The global `operator new` will impact all uses of `new` in the current translation unit, so we can ignore that one. It won't be useful here. The other option is overloading `operator new` for a specific class as a member operator. However, that only changes how allocating *that* class is done. If you overload `Node::operator new` then you only change what happens when you do `new Node`. You can't change `Message::operator new` from within `Node`. – François Andrieux Jun 23 '20 at 20:54
  • You can achieve what you want with placement `new`, where you allocate enough aligned memory for both `Message` and `int`, then placement `new` the `Message` in that memory space. But this is very advanced and messy. You'll need to manage the destruction yourself in two steps (calling the destructor, and then freeing the memory separately) . If you just want to make the constructor a bit easier to call, this is probably not worth the effort. – François Andrieux Jun 23 '20 at 20:57
  • Thank you, that's exactly what I wanted to know. Placement new doesn't seem that hard to implement, but I hardly see how I can make that easy to use. The only other option I see is to create a member method in `Node` that returns a new message with the correct amount of memory (some kind of factory pattern if I am not wrong), but that would force the user to know he has to use this method instead of the `new` keyword. – Big Ben Baggle Jun 23 '20 at 21:07
  • Consider a factory pattern where the only way to create a message is to call a function that makes them. Then, they have no choice but to know about them. – François Andrieux Jun 23 '20 at 21:12
  • Additionally, consider using `std::vector` instead of tracking the size with a pointer. – François Andrieux Jun 23 '20 at 21:14
  • I have little choice here. I am not the creator of the `Message` class so I try to not modify it. Also, message buffer management is made using `memcpy` calls to write inside it and `reinterpret_cast` to read, so I don't think I could use `vector`. Though, in another situation I agree it would be better to use standard containers indeed. – Big Ben Baggle Jun 23 '20 at 21:17
  • As for the factory pattern, I'll consider it. This might be the best solution I have here. – Big Ben Baggle Jun 23 '20 at 21:22
  • I think you were very comprehensive in your answers. Do you wish to create the answer that I'll validate later, or do you wish for me to create one based on your comments ? – Big Ben Baggle Jun 23 '20 at 21:25
  • Feel free to make an answer. – François Andrieux Jun 23 '20 at 21:44

1 Answer

My thanks to François Andrieux for his timely answers. The short answer to my question is to use a factory method to create Message objects, something like:

(Node.cpp)

unique_ptr<Message> Node::createMessage(size_t capacity) {
    // unique_ptr cannot deduce its type from a raw pointer, so name it explicitly.
    return unique_ptr<Message>(new Message(capacity + header_size));
}

That way, the user doesn't have to specify the header size every time they create a new Message object. The downside is that the user has to know which method to call. Forcing the user through this method is possible if the Message constructor is made inaccessible.
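A minimal, self-contained sketch of this factory is shown below. It uses a stand-in Message rather than the real class: the destructor and the `allocated` member are assumptions added only so that the sketch does not leak and its effect can be observed.

```cpp
#include <cstddef>
#include <memory>

// Stand-in for the Message class from the question.
// Assumptions: the destructor and the `allocated` member are not in the original.
class Message {
public:
    explicit Message(std::size_t capacity)
        : size(0), allocated(capacity), buffer(new char[capacity]) {}
    ~Message() { delete[] buffer; }
    Message(const Message&) = delete;
    Message& operator=(const Message&) = delete;

    std::size_t size;
    std::size_t allocated;  // hypothetical: remembers how many bytes buffer holds
    char* buffer;
};

class Node {
public:
    Node() : header_size(sizeof(int)) {}

    // Factory: callers pass only the payload size; the header is added here.
    std::unique_ptr<Message> createMessage(std::size_t payload_capacity) const {
        return std::make_unique<Message>(payload_capacity + header_size);
    }

private:
    std::size_t header_size;
};
```

With this in place, code inside Node can call `createMessage(sizeof(Event))` instead of `new Message(header_size + sizeof(Event))`. If Message could be changed, making its constructor private and declaring Node a friend would force callers through the factory; in that case the factory would have to use `std::unique_ptr<Message>(new Message(...))`, since `std::make_unique` is not a friend of Message.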

Overloading operator new is not the answer, for the following reason:

There are only two options for overloading operator new. You can overload the global operator new or the member operator new. The global operator new will impact all uses of new in the current translation unit, so we can ignore that one. It won't be useful here. The other option is overloading operator new for a specific class as a member operator. However, that only changes how allocating that class is done. If you overload Node::operator new then you only change what happens when you do new Node. You can't change Message::operator new from within Node.
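To illustrate the reasoning above, here is a small sketch using a stand-in Message. The destructor, the `last_requested` counter and the `bytesRequestedFor` helper are assumptions added for the demonstration. It suggests a further obstacle: a member operator new is only asked for `sizeof(Message)` bytes and never sees the constructor argument, so even `Message::operator new` would have no say over the size of `buffer`.

```cpp
#include <cstddef>
#include <new>

// Hypothetical counter: records the last size passed to Message::operator new.
static std::size_t last_requested = 0;

// Stand-in for the Message class from the question (destructor added).
class Message {
public:
    explicit Message(std::size_t capacity)
        : size(0), buffer(new char[capacity]) {}  // buffer comes from the global new[]
    ~Message() { delete[] buffer; }
    Message(const Message&) = delete;
    Message& operator=(const Message&) = delete;

    // Class-specific allocation function: used only for `new Message(...)`.
    static void* operator new(std::size_t bytes) {
        last_requested = bytes;
        return ::operator new(bytes);
    }
    static void operator delete(void* p) noexcept { ::operator delete(p); }

    std::size_t size;
    char* buffer;
};

// Hypothetical helper: creates and destroys a Message, then reports how many
// bytes Message::operator new was asked for.
std::size_t bytesRequestedFor(std::size_t capacity) {
    Message* m = new Message(capacity);
    delete m;
    return last_requested;
}
```

Whatever capacity is passed, the allocation function receives the size of the Message object itself; the buffer is allocated separately inside the constructor.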

Another way to obtain more space than the constructor would allocate is placement new. However, it makes construction and destruction overly complicated, which defeats the purpose here. It is also less risky to let users write new Message(header_size + capacity) every time than to have them allocate and free memory themselves.
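For comparison, a rough sketch of the placement new route described in the comments follows. The stand-in Message (with an added destructor) and the `placementRoundTrip` helper are assumptions made for illustration. Note that the extra int lives next to the Message object in the raw block; `buffer` is still sized by the constructor.

```cpp
#include <cstddef>
#include <new>

// Stand-in for the Message class from the question (destructor added).
class Message {
public:
    explicit Message(std::size_t capacity)
        : size(0), buffer(new char[capacity]) {}
    ~Message() { delete[] buffer; }
    Message(const Message&) = delete;
    Message& operator=(const Message&) = delete;

    std::size_t size;
    char* buffer;
};

// The int is placed right after the Message, so that offset must suit an int.
static_assert(sizeof(Message) % alignof(int) == 0,
              "an int placed after Message would be misaligned");

// Hypothetical helper: builds a Message followed by an int in one raw block,
// tears everything down again, and returns the int that was stored.
int placementRoundTrip(std::size_t capacity, int message_id) {
    // One raw block large enough for the Message object plus one int.
    void* storage = ::operator new(sizeof(Message) + sizeof(int));

    // Construct both objects inside that block with placement new.
    Message* message = ::new (storage) Message(capacity);
    int* id = ::new (static_cast<char*>(storage) + sizeof(Message)) int(message_id);
    int stored = *id;

    // Destruction takes two explicit steps: run the destructor, then free the block.
    message->~Message();
    ::operator delete(storage);
    return stored;
}
```

A `unique_ptr<Message>` with the default deleter could not own such an object, because `delete` would not match the way the storage was obtained; a custom deleter would be needed, which is part of why the factory is the simpler choice.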