1

I'm working on a simple JSON parser for fun, and I've got my value type:

typedef enum {
    JSON_NULL,
    JSON_NUMBER,
    JSON_STRING,
    JSON_ARRAY,
    JSON_OBJECT,
    JSON_BOOLEAN
} json_type_t;

// datatype for json value
struct json_value {
    using arr_type = vector<json_value>;
    using obj_pair = pair<string, json_value>;
    using obj_type = unordered_map<string, json_value>;

    // constructors 
    json_value()
        : type(JSON_NULL) {}

    json_value(json_type_t type)
        : type(type) {
        switch(type) {
            case JSON_STRING:  str = new string;   break;
            case JSON_ARRAY:   arr = new arr_type; break;
            case JSON_OBJECT:  obj = new obj_type; break;
            default:
                break;
        }
    }

    // copy construct
    json_value(const json_value& other) {
        printf("copying json value\n");
        if (other.type != JSON_NULL) {
            type = other.type;
            switch(type) {
                case JSON_NULL:                                    return;
                case JSON_NUMBER:  num = other.num;                return;
                case JSON_BOOLEAN: val = other.val;                return;         
                case JSON_STRING:  str = new string  (*other.str); return;
                case JSON_ARRAY:   arr = new arr_type(*other.arr); return;
                case JSON_OBJECT:  obj = new obj_type(*other.obj); return;
            }
        }
    }

    // move construct
    json_value(json_value&& other) {
        type = other.type;
        switch(type) {
            case JSON_NULL:                     break;
            case JSON_NUMBER:  num = other.num; break;
            case JSON_BOOLEAN: val = other.val; break; 
            case JSON_STRING:  str = other.str; other.str = nullptr; break;
            case JSON_ARRAY:   arr = other.arr; other.arr = nullptr; break;
            case JSON_OBJECT:  obj = other.obj; other.obj = nullptr; break;
        }
    }

    // assignment operator copy/swap idiom
    json_value& operator =(json_value other) {
        destroy();
        type = other.type;
        switch(type) {
            case JSON_NULL:                     break;
            case JSON_NUMBER:  num = other.num; break;
            case JSON_BOOLEAN: val = other.val; break; 
            case JSON_STRING:  str = other.str; other.str = nullptr; break;
            case JSON_ARRAY:   arr = other.arr; other.arr = nullptr; break;
            case JSON_OBJECT:  obj = other.obj; other.obj = nullptr; break;
        }
        return *this;
    }

    // destructor
    ~json_value() {
        destroy();
    }

    // type of value and union to hold data
    json_type_t type = JSON_NULL;
    union { 
        bool      val;
        double    num;
        string   *str;
        arr_type *arr;
        obj_type *obj;
    };

private:
    // cleanup our memory
    void destroy() { 
        switch(type) {
            case JSON_NULL:    break;
            case JSON_NUMBER:  break;
            case JSON_BOOLEAN: break; 
            case JSON_STRING:  delete str; break;
            case JSON_ARRAY:   delete arr; break;
            case JSON_OBJECT:  delete obj; break;
        }
        type = JSON_NULL;
    }
};

I've written proper copy/move constructors and an assignment operator. My issue is that, when running a particular benchmark, the parser takes about 40ms. Trying to optimize a bit, I commented out the copy constructor to make sure I'm not making any unecessary copies. Sure enough, my code still compiles, indicating the move constructor is sufficient, and it's ~25% faster!

Instrumenting the copy constructor, I can see it is indeed being called, but as I've shown, the move constructor is sufficient.

So, my question is, in what contexts would a copy constructor be preferred over the move constructor, and how can I find where that's happening?

gct
  • 14,100
  • 15
  • 68
  • 107
  • 1
    Do you, by chance, ever use `json_value` inside a standard container? – Justin Jul 18 '18 at 18:35
  • @Justin indeed, see the first three `using`s at the top of `json_value` – alter_igel Jul 18 '18 at 18:36
  • I use std::pair, std::vector and std::unordered_map as a keypair, json_array and json_object respectively. They're part of json_value, and recursively hold json_values. – gct Jul 18 '18 at 18:36
  • Unrelated: A move assignment operator might be a good addition. – user4581301 Jul 18 '18 at 18:37
  • You likely just need to make your move constructor and move assignment operator `noexcept`. For exception safety, if you don't, containers will prefer the copy constructor. – François Andrieux Jul 18 '18 at 18:37
  • 1
    @SeanMcAllister [This](https://stackoverflow.com/questions/8001823/how-to-enforce-move-semantics-when-a-vector-grows) might be your issue – NathanOliver Jul 18 '18 at 18:37

1 Answers1

2

Standard containers all try to have the strong exception guarantee, meaning that if an exception is thrown, it's as if nothing happened.

Take the case of std::vector. In order to keep this guarantee, it can only use the move constructor if it's guaranteed that moving won't throw. Consider the case when the vector needs to resize its buffer:

d // new element to push_back
[a][b][c] // old, filled buffer
[ ][ ][ ][ ][ ][ ] // new, empty buffer

Moving the new element in place isn't a problem, even if it throws, since we'll still have the old buffer that we can use:

[a][b][c]
[ ][ ][ ][d][ ][ ]

But when we move the old buffer's elements into the new buffer, what happens if we throw in the middle?

[ ][#][c]
[a][#][ ][d][ ][ ]

We don't know the state that b is in when it throws, so how do we recreate the old state? Even if we were able to resurrect b, we can't just go and move the prior elements back, because moving those could throw too.

If we fall back on a copy, we could back out at any time by just discarding the new buffer.

Thus, in order to keep the strong exception guarantee, std::vector can't move unless the move constructor is noexcept.


Declaring the move operations as noexcept is necessary for the standard containers to use them. Most of the time, move constructors and move assignments can be noexcept, so declare them as such when they are:

json_value(json_value&& other) noexcept {
    // ...
}

json_value& operator=(json_value&& other) noexcept {
    // ...
}
Justin
  • 24,288
  • 12
  • 92
  • 142
  • And indeed it does! – gct Jul 18 '18 at 18:40
  • Nitpick : If you don't have a copy constructor, the throwing move constructor will be used. So it's not impossible to use the move constructor. `std::vector` can use a throwing move constructor but will only do so if there's no choice. – François Andrieux Jul 18 '18 at 18:49
  • @FrançoisAndrieux Indeed, in that case, you lose the *strong exception guarantee*. – Justin Jul 18 '18 at 18:50