What is wrong with "checking for self-assignment" and what does it mean?

Question

In Herb Sutter's book Exceptional C++ (1999), he has words in item 10's solution:

"Exception-unsafe" and "poor design" go hand in hand. If a piece of code isn't exception-safe, that's generally okay and can simply be fixed. But if a piece of code cannot be made exception-safe because of its underlying design, that almost always is a signal of its poor design.

Example 1: A function with two different responsibilities is difficult to make exception-safe.

Example 2: A copy assignment operator that is written in such a way that it must check for self-assignment is probably not strongly exception-safe either

What does he mean by the term "check for self-assignment"?

[INQUIRY]

Dave and AndreyT shows us exactly what "check for self-assignment" means. That's good. But the question is not over. Why does "check for self-assignment" hurts "exception safety"(according to Hurb Sutter)? If the caller tries to do self-assignment, that "check" works as if no assignment ever occurs. Does it really hurt?

[MEMO 1] In item 38 Object Identity later in Herb's book, he explains about self-assignment.

"in such a way that **it must**" the most important word here is: must — curiousguy, Aug 18 '12 at 03:35

score 53 · Accepted Answer · edited Aug 03 '22 at 16:53

53

A question that's of greater importance in this case would be:

"What does it mean, when a you write function in a way, that requires you to check for self assignment???"

To answer my rhetorical question: It means that a well-designed assignment operator should not need to check for self-assignment. Assigning an object to itself should work correctly (i.e. have the end-effect of "doing nothing") without performing an explicit check for self-assignment.

For example, if I wanted to implement a simplistic array class along the lines of

class array {
    // code...
 
    int *data;
    size_t n;
};

...and came up with the following implementation of the assignment operator...

array &array::operator =(const array &rhs) 
{
    delete[] data;

    n = rhs.n;
    data = new int[n];

    std::copy_n(rhs.data, n, data);

    return *this;
}

That implementation would be considered "bad" since it obviously fails in case of self-assignment.

In order to "fix" this, you have two options;

Add an explicit self-assignment check

array &array::operator =(const array &rhs) {
    if (&rhs != this) {
        delete[] data;

        n = rhs.n;
        data = new int[n];

        std::copy_n(rhs.data, n, data);
    }
    return *this;
}

Follow a "check-less" approach:

array &array::operator =(const array &rhs) {
      size_t new_n = rhs.n;
      int *new_data = new int[new_n];

      std::copy_n(rhs.data, new_n, new_data);

      delete[] data;

      n = new_n;
      data = new_data;

      return *this;
}

The latter approach is better in a sense that it works correctly in self-assignment situations without making an explicit check. This implementation is far for perfect from a 'safety point of view', it is here to illustrate the difference between "checked" and "check-less" approaches to handling self-assignment. The later check-less implementation can be written more elegantly through the well-known copy-and-swap idiom.

This does not mean that you should avoid explicit checks for self-assignment. Such check do make sense from the performance point of view: there's no point in carrying out a long sequence of operations just to end up "doing nothing" in the end. But in a well-designed assignment operator such checks should not be necessary from the correctness point of view.

edited Aug 03 '22 at 16:53

user438383

5,716
8
28
43

answered Aug 18 '12 at 02:00

AnT stands with Russia

312,472
42
525
765

2

"_Such check do make sense from the performance point of view_" Rally, self assignment often happens in your programs? – curiousguy Aug 18 '12 at 03:40
@curiousguy: Depends on the program, on the algorithm. If it allows self-assignment, so be it. Branchless code always look more elegant than branched one, meaning that I'd allow self-assigment to happen in such cases instead of trying to catch and prevent it on the outside. – AnT stands with Russia Aug 18 '12 at 04:32
OK. I should admit, understanding that "check for self-assignment" statement needs a strong context. – Jimm Chen Apr 22 '13 at 01:20
1

I think when having an `if (this != &rhs) return *this;` at the top will often result in a more easily verifiable as correct function. In your third scenario I find myself checking for correctness. Working under the assumption that self-assignment isn't the case, it's often a lot easier to reason about the code. – Aidiakapi May 06 '16 at 01:22
3

@Aidiakapi - Don't you mean if (this == &rhs) return *this; – John Bowers May 13 '16 at 20:48
@JohnBowers Exactly, made a typo. – Aidiakapi May 14 '16 at 11:09
[C++ Core Guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c62-make-copy-assignment-safe-for-self-assignment) recommends check-less implementations for performance reasons because self-assignments rarely happens (see [details](https://stackoverflow.com/a/57676637/5447906)) in real programs. – anton_rh Aug 28 '19 at 07:26
Wouldn't any move assignment operator NEED to check for self assignment? – Zebrafish Aug 22 '21 at 23:17
@Zebrafish: Well, just as described above, it depends on how it is implemented. If you implement move semantics using the move-is-swap idiom, then you don't formally need to check for self-assignment. – AnT stands with Russia Aug 25 '21 at 19:56
@anton_rh - the guidelines recommend check-less implementations for a class IIF the constituent members are also self-assignment-safe for efficiency purposes. It is not just for efficiency (given that it has a mandatory safety prerequisite for a self-check-less assignment to be implemented that way.) – luis.espinal Jul 21 '22 at 18:11

camino · Answer 2 · 2018-08-09T21:11:19.753

11

From c++ core guide

Foo& Foo::operator=(const Foo& a)   // OK, but there is a cost
{
    if (this == &a) return *this;
    s = a.s;
    i = a.i;
    return *this;
}

This is obviously safe and apparently efficient. However , what if we do one self-assignment per million assignments? That's about a million redundant tests (but since the answer is essentially always the same, the computer's branch predictor will guess right essentially every time). Consider:

Foo& Foo::operator=(const Foo& a)   // simpler, and probably much better
{
    s = a.s;
    i = a.i;
    return *this;
}

Note: The code above only apply to classes without pointer, for classes with pointer point to dynamic memory. Please refer to Ant's answer.

edited Aug 09 '18 at 21:11

answered May 04 '17 at 16:33

camino

10,085
20
64
115

1

err, what if something is dynamically allocated? you have to check for self-assignment then as the other answers have made clear – Aryaman Aug 08 '18 at 17:05
2

What if something is not dynamically allocated? That's the different aspects of the same thing. – camino Aug 09 '18 at 14:53
I agree, but you should make it clear that this is only a good optimization if nothing in your class is a pointer. – Aryaman Aug 09 '18 at 17:55
I add a comment as you suggested. – camino Aug 09 '18 at 21:11

David · Answer 3 · 2012-08-18T01:59:21.610

3

MyClass& MyClass::operator=(const MyClass& other)  // copy assignment operator
{
    if(this != &other) // <-- self assignment check
    {
        // copy some stuff
    }

    return *this;
}

Assigning an object to itself is a mistake but it shouldn't logically result in your class instance changing. If you manage to design a class where assigning to itself changes it, it's poorly designed.

edited Aug 18 '12 at 01:59

answered Aug 18 '12 at 01:51

David

27,652
18
89
138

6

@Mooing Duck Don't edit my answer so drastically. I don't even agree with what you added - it's the easy way out and almost always sub-optimal. Sometimes that's okay, sometimes it's not. Write your own answer. – David Aug 18 '12 at 01:59
I figured the answer would be better if it also contained a sample of a good way to do things alongside the bad. Sorry if I stepped over the line. You're right that it's sub-optimal, but it's pretty standard, because it's fast enough, and hard to screw up. If you'd provide some bits on the right way to do assignment, I'd upvote. (doesn't have to be copy and swap, you're allowed to disagree with me) – Mooing Duck Aug 18 '12 at 02:13
@MooingDuck I think _almost_ all the time the right way is not to explicitly provide a copy assignment op. If you design your class well the default is all you'll ever want. – David Aug 18 '12 at 02:25
That should be in the answer then. – Mooing Duck Aug 18 '12 at 02:55

score 2 · Answer 4 · answered Aug 18 '12 at 03:04

2

The general reason to check for self-assignment is because you destroy your own data before copying in the new one. This assignment operator structure is also not strongly exception safe.

As an addendum, it was determined that self-assignment does not benefit performance at all, because the comparison must run every time but self-assignment is extremely rare, and if it does occur, this is a logical error in your program (really). This means that over the course of the program, it's just a waste of cycles.

answered Aug 18 '12 at 03:04

Puppy

144,682
38
256
465

"_this is a logical error in your program (really)._" really, it isn't. – curiousguy Aug 18 '12 at 03:36
Self-assignment is a meaningless statement. Meaningless statements are never intentionally written by programmers; and therefore always errors. – Puppy Aug 18 '12 at 03:53
4

No, self assignment should have no effect. It is not meaningless. – curiousguy Aug 18 '12 at 04:21
Having no effect is by definition meaningless, as the program is identical before and after it's execution. – Puppy Aug 18 '12 at 04:35
12

Self-assignments aren't always obvious. It's totally reasonable to assume that many STL algorithms (partitioning, sorting and such) will do self-assignments on the elements. – fredoverflow Aug 18 '12 at 07:59
A logical error. Unless, of course, you have machine-generated source code. Oops. – Andrew Lazarus Aug 21 '14 at 00:54
@Puppy - a self-assignment is never an error if a design allows it to create a readable algorithm uncluttered by self-checks at the point the caller invokes the assignment operator. This is particularly true when the algorithm is performing "bulk" operations. The corollary of this design choice is that the assignment implementations *must* ensure they are safe. In essence, the obligation to prevent or allow (survive) self-checks is hidden from the caller (for the sake of providing a simplified API.) T – luis.espinal Jul 21 '22 at 18:16

What is wrong with "checking for self-assignment" and what does it mean?

4 Answers4

A question that's of greater importance in this case would be:

"What does it mean, when a you write function in a way, that requires you to check for self assignment???"

That implementation would be considered "bad" since it obviously fails in case of self-assignment.

Linked