Unspecified behaviour in C++ when adding to map

I recently happened upon a bug in a C++ codebase which had me scratching my head. When the behaviour differed between GCC and Clang it became even more muddled.
This is the code in question
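A minimal sketch of the kind of statement being discussed, assuming a `std::map<int, std::size_t>` named `m` and a wrapper function `store_size_via_subscript` (all of these names and types are illustrative, not taken from the original codebase):

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical helper: returns the value stored by the statement under discussion.
std::size_t store_size_via_subscript() {
    std::map<int, std::size_t> m;  // illustrative name and types
    m[0] = m.size();               // is size() read before or after m[0] inserts the entry?
    assert(m.size() == 1);         // either way, exactly one entry exists afterwards
    return m[0];                   // 0 or 1, depending on the evaluation order chosen
}
```

Calling `store_size_via_subscript()` and printing the result is enough to compare compilers. Note that C++17 later tightened the rules so that the right operand of an assignment is sequenced before the left one, so a compiler in C++17 mode or later should always produce 0 here.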

Simple enough, right? We’re adding an entry to the map, where the value of the entry is the size of the map. However, which size? Is it the size of the map before the insertion, or after? With C++ things are not so easy.
As it turns out, the result differs between GCC and Clang. GCC (4.9.2) will put the value of 1 into the map, i.e. the size of the map after insertion. And Clang (3.6.0) will put the value of 0 into the map, i.e. the size of the map before the insertion. Neither will print any warning during compilation. Changing optimization levels has no effect.

So how could this be?
On one hand, the logical thing would be to calculate the value on the right side of the assignment (the map size) first and then assign it to the left side. This is what Clang seems to be doing.
On the other hand, the left side returns a reference to the newly inserted entry, as specified in the standard. Thus it would also make sense to get hold of the reference first, before assigning a value to it, which is what GCC seems to be doing.
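The two orderings can be made explicit by splitting the statement in two. This is only a sketch; the function names `size_then_subscript` and `subscript_then_size` and the map's types are assumptions made for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Right side first: read the size, then create the entry (what Clang appeared to do).
std::size_t size_then_subscript() {
    std::map<int, std::size_t> m;
    assert(m.empty());
    const std::size_t n = m.size();  // map is still empty, so n is 0
    m[0] = n;
    return m[0];
}

// Left side first: create the entry, then read the size (what GCC appeared to do).
std::size_t subscript_then_size() {
    std::map<int, std::size_t> m;
    assert(m.empty());
    std::size_t& slot = m[0];  // operator[] inserts a value-initialised entry
    slot = m.size();           // the map now holds one element
    return m[0];
}
```

Written this way there is nothing left for the compiler to choose: the first function always stores 0 and the second always stores 1.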

Or is this perhaps undefined behaviour?

Section 5.17 of the standard says

The assignment operator (=) and the compound assignment operators all group right-to-left.

Which seems to indicate that Clang is correct. However, it also says

With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation.

This seems to indicate that something like

would be undefined behaviour. But that’s not what we’re doing here, not exactly. However we are performing an operation for the left side which has a side effect on the right side.

The more succinct summary over at cppreference.com describes the order of evaluation in C++ in this way

Order of evaluation of the operands of almost all C++ operators (including the order of evaluation of function arguments in a function-call expression and the order of evaluation of the subexpressions within any expression) is unspecified. The compiler can evaluate operands in any order, and may choose another order when the same expression is evaluated again.
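The same freedom is easy to observe with function arguments. In the sketch below, the names `call_log`, `first`, `second`, `add` and `evaluation_trace` are all hypothetical; the only point is that either call may run before the other:

```cpp
#include <cassert>
#include <string>

std::string call_log;  // hypothetical global used to record the call order

int first()  { call_log += 'a'; return 1; }
int second() { call_log += 'b'; return 2; }
int add(int x, int y) { return x + y; }

// Returns the order in which the two arguments of add() were evaluated.
std::string evaluation_trace() {
    call_log.clear();
    static_cast<void>(add(first(), second()));  // argument order is unspecified
    assert(call_log.size() == 2);               // both calls ran, without interleaving
    return call_log;                            // "ab" or "ba"
}
```

Both results are valid, which is why neither compiler can be called wrong for picking one of them.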

And this is perhaps the reason for this behaviour. If so, shouldn’t the compilers print out a warning? Perhaps some of our readers have more insight into the C++ standard?

You can check for yourself with this code

Run with

and

This Post Has 12 Comments

  1. I got shouted at for saying that it is undefined behavior, so to be correct, it is “Unspecified Behavior”. This is because the evaluation of the [] operator will either happen before, or after the size() call and they cannot interleave.

    1. That is absolutely correct: this is “unspecified” and not “undefined” behavior. I forgot that C++ provides two clearly different definitions for those two.

      One question remains though: can and should the compilers print a warning about this? On one hand, the side effects of all calls can’t always be properly calculated. On the other hand, std::map is a commonly used class.
      In this specific example it took me some time to find the cause of the bug, since the code seems to be ok at first glance.

      1. It is infeasible to track all possible side effects, which is what you would have to do to properly detect this problem.

        1. Yep, there’s no way to catch this in general.
          You could write specific rules for std::map (given it’s so commonly used). But then you wouldn’t catch other map implementations, and that would perhaps only provide a false sense of security.

    1. Indeed, using insert() or emplace() or just first calling size() in a separate statement would prevent the issue.

    1. Nice.
      I love how this exact case is mentioned in the introduction.

      1. I discuss the other main case from the paper in my Stack Overflow question/answer “Does this code from ‘The C++ Programming Language’ 4th edition section 36.3.6 have well-defined behavior?”: http://stackoverflow.com/q/27158812/1708801

        This one requires a lot more work to understand all the odds and ends

  2. > The assignment operator (=) and the compound assignment operators all group right-to-left.

    This quote from the standard doesn’t have anything to do with order of evaluation. It’s simply saying that: a = b = c means a = (b = c) rather than (a = b) = c.

    No matter how the operator groups, the program is still free to evaluate the sub expressions a, b, and c, in any order. For example, a = (b = c) can result in b being evaluated first, then a, then c.

  3. Thanks a lot! I was scratching my head all day yesterday over this bug and finally realized it is an issue with calculating the size. Felt so validated after seeing your post!
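
The workarounds suggested in the comments above, `insert()` and `emplace()`, can be sketched as follows. The function names and the map's types are assumptions made for illustration. In both cases the arguments are fully evaluated before the member function body runs, so `size()` always sees the map before the insertion:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// emplace(): m.size() is evaluated before emplace() modifies the map.
std::size_t store_size_via_emplace() {
    std::map<int, std::size_t> m;
    const auto result = m.emplace(0, m.size());
    assert(result.second);         // the key was new, so the insertion happened
    return result.first->second;   // value stored for key 0
}

// insert(): the pair is built, reading size(), before insert() is entered.
std::size_t store_size_via_insert() {
    std::map<int, std::size_t> m;
    const auto result = m.insert(std::make_pair(0, m.size()));
    assert(result.second);
    return result.first->second;
}
```

Unlike `operator[]`, neither call overwrites an existing entry, so these are drop-in replacements only when the key is known to be new.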
