C++ FAQ Celebrating Twenty-One Years of the C++ FAQ!!!
[10.9] Does return-by-value mean extra copies and extra overhead?

Not necessarily.

All(?) commercial-grade compilers optimize away the extra copy, at least in cases as illustrated in the previous FAQ.

To keep the example clean, let's strip things down to the bare essentials. Suppose function caller() calls rbv() ("rbv" stands for "return by value") which returns a Foo object by value:

class Foo { ... };

Foo rbv();

void caller()
{
  Foo x = rbv();  // the return-value of rbv() goes into x
  ...
}
Now the question is, How many Foo objects will there be? Will rbv() create a temporary Foo object that gets copy-constructed into x? How many temporaries? Said another way, does return-by-value necessarily degrade performance?

The point of this FAQ is that the answer is No: commercial-grade C++ compilers implement return-by-value in a way that lets them eliminate the overhead, at least in simple cases like those shown in the previous FAQ. In particular, all(?) commercial-grade C++ compilers will optimize this case:

Foo rbv()
{
  ...
  return Foo(42, 73);  // suppose Foo has a ctor Foo::Foo(int a, int b)
}
Certainly the compiler is allowed to create a temporary, local Foo object, then copy-construct that temporary into variable x within caller(), then destruct the temporary. But all(?) commercial-grade C++ compilers won't do that: the return statement will directly construct x itself. Not a copy of x, not a pointer to x, not a reference to x, but x itself.
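One way to check this claim on your own compiler is to instrument Foo so it counts its constructor calls. The sketch below is only an illustration: the counters ctor_calls and copy_calls, and the fact that caller() returns a count, are additions made for this example and are not part of the original code. On a compiler that performs the optimization (and, since C++17, the language requires it when returning an unnamed temporary like this), the copy count should stay at zero.

```cpp
#include <cstdio>

// Hypothetical instrumented Foo: counts how often each constructor runs.
struct Foo {
    static int ctor_calls;  // calls to Foo::Foo(int, int)
    static int copy_calls;  // calls to the copy constructor
    int a, b;
    Foo(int a_, int b_) : a(a_), b(b_) { ++ctor_calls; }
    Foo(const Foo& other) : a(other.a), b(other.b) { ++copy_calls; }
};
int Foo::ctor_calls = 0;
int Foo::copy_calls = 0;

Foo rbv()
{
    return Foo(42, 73);  // the case discussed above: returning an unnamed temporary
}

// Returns the number of copy-constructor calls observed so far.
int caller()
{
    Foo x = rbv();  // the return-value of rbv() goes into x
    std::printf("ctor calls: %d, copy calls: %d (x.a=%d, x.b=%d)\n",
                Foo::ctor_calls, Foo::copy_calls, x.a, x.b);
    return Foo::copy_calls;
}
```

Call caller() from your own main() to see the counts. If your compiler offers a switch that disables copy elision in pre-C++17 modes, the copy count is the number to watch.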

You can stop here if you don't want to genuinely understand the previous paragraph, but if you want to know the secret sauce (so you can, for example, reliably predict when the compiler can and cannot provide that optimization for you), the key is to know that compilers usually implement return-by-value using pass-by-pointer. When caller() calls rbv(), the compiler secretly passes a pointer to the location where rbv() is supposed to construct the "returned" object. It might look something like this (it's shown as a void* rather than a Foo* since the Foo object has not yet been constructed):

// Pseudo-code
void rbv(void* put_result_here)  // Original C++ code: Foo rbv()
{
  ...     // Note: rbv() initializes (not assigns to) the variable pointed to by put_result_here
}

// Pseudo-code
void caller()
{
  // Original C++ code: Foo x = rbv()
  struct Foo x;  // Note: x does not get initialized prior to calling rbv()
  rbv(&x);       // Note: rbv() initializes a local variable defined in caller()
  ...
}
So the first ingredient in the secret sauce is that the compiler (usually) transforms return-by-value into pass-by-pointer. This means that commercial-grade compilers don't bother creating a temporary: they directly construct the returned object in the location pointed to by put_result_here.

The second ingredient in the secret sauce is that compilers typically implement constructors using a similar technique. This is compiler-dependent and somewhat idealized (I'm intentionally ignoring how to handle new and overloading), but compilers typically implement Foo::Foo(int a, int b) using something like this:

// Pseudo-code
void Foo_ctor(Foo* this, int a, int b)  // Original C++ code: Foo::Foo(int a, int b)
{
  ...
}
Putting these together, the compiler might implement the return statement in rbv() by simply passing put_result_here as the constructor's this pointer:
// Pseudo-code
void rbv(void* put_result_here)  // Original C++ code: Foo rbv()
{
  ...
  Foo_ctor((Foo*)put_result_here, 42, 73);  // Original C++ code: return Foo(42,73);
  return;
}
So caller() passes &x to rbv(), and rbv() in turn passes &x to the constructor (as the this pointer). That means the constructor directly constructs x.
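The pseudo-code above cannot be compiled as written, but the same idea can be imitated by hand in legal C++ using placement new, which runs a constructor at an address you supply. The following is a rough sketch of what the compiler does behind the scenes, not its actual output. The names rbv_lowered, caller_lowered and x_storage are made up for this illustration, and rbv_lowered returns the Foo* produced by placement new (the pseudo-code returns void) only so the caller does not need a cast.

```cpp
#include <cstdio>
#include <new>

struct Foo {
    int a, b;
    Foo(int a_, int b_) : a(a_), b(b_) {}
};

// Hand-written stand-in for the transformed "Foo rbv()".
Foo* rbv_lowered(void* put_result_here)
{
    // Placement new constructs a Foo at the given address, so the
    // constructor's this pointer is put_result_here.
    return ::new (put_result_here) Foo(42, 73);
}

// Hand-written stand-in for the transformed caller().
// Returns x.a + x.b if the object was built inside caller's storage, else -1.
int caller_lowered()
{
    // Raw, suitably aligned space for x; no Foo exists here yet.
    alignas(Foo) unsigned char x_storage[sizeof(Foo)];

    Foo* x = rbv_lowered(x_storage);  // rbv constructs x directly in our storage

    bool in_place = static_cast<void*>(x) == static_cast<void*>(x_storage);
    int sum = x->a + x->b;
    std::printf("constructed in caller's storage: %s, a+b=%d\n",
                in_place ? "yes" : "no", sum);

    x->~Foo();  // manual destruction, since x was created with placement new
    return in_place ? sum : -1;
}
```

Exactly one Foo is ever constructed here, and it lives in memory owned by the caller, which is the effect the return-by-value optimization achieves automatically.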

In the early 90s I did a seminar for IBM's compiler group in Toronto, and one of their engineers told me that they found this return-by-value optimization to be so fast that you get it even if you don't compile with optimization turned on. Because the return-by-value optimization causes the compiler to generate less code, it actually improves compile-times in addition to making your generated code smaller and faster. The point is that the return-by-value optimization is almost universally implemented, at least in cases like those shown above.

Final thought: this discussion was limited to whether there will be any extra copies of the returned object in a return-by-value call. Don't confuse that with other things that could happen in caller(). For example, suppose you changed caller() from Foo x = rbv(); to Foo x; x = rbv(); (note the ; after the declaration). The compiler is then required to use Foo's assignment operator. Unless it can prove that Foo's default constructor followed by its assignment operator is exactly equivalent to its copy constructor, the language requires it to put the returned object into an unnamed temporary within caller(), use the assignment operator to copy the temporary into x, then destruct the temporary. The return-by-value optimization still plays its part, since there will be only one temporary, but by changing Foo x = rbv(); to Foo x; x = rbv(); you have prevented the compiler from eliminating that last temporary.
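The difference between initialization and assignment can be observed with the same counting trick. This is a minimal sketch under stated assumptions: the counters, the default constructor that zeroes a and b, and the name caller_with_assignment are all invented for the illustration. Because the counters make every constructor, assignment and destructor call observable, the compiler cannot quietly merge them away.

```cpp
#include <cstdio>

// Hypothetical instrumented Foo: counts constructions, copies,
// assignments and destructions.
struct Foo {
    static int ctor_calls, copy_calls, assign_calls, dtor_calls;
    int a, b;
    Foo() : a(0), b(0) { ++ctor_calls; }
    Foo(int a_, int b_) : a(a_), b(b_) { ++ctor_calls; }
    Foo(const Foo& o) : a(o.a), b(o.b) { ++copy_calls; }
    Foo& operator=(const Foo& o) { a = o.a; b = o.b; ++assign_calls; return *this; }
    ~Foo() { ++dtor_calls; }
};
int Foo::ctor_calls = 0;
int Foo::copy_calls = 0;
int Foo::assign_calls = 0;
int Foo::dtor_calls = 0;

Foo rbv()
{
    return Foo(42, 73);
}

// Returns the number of destructor calls seen while x is still alive,
// i.e. the number of temporaries that have already been destroyed.
int caller_with_assignment()
{
    Foo x;      // default-constructs x
    x = rbv();  // rbv() builds an unnamed temporary, operator= copies it into x,
                // then the temporary is destroyed at the end of this statement
    std::printf("ctors=%d copies=%d assigns=%d dtors so far=%d (x.a=%d, x.b=%d)\n",
                Foo::ctor_calls, Foo::copy_calls, Foo::assign_calls,
                Foo::dtor_calls, x.a, x.b);
    return Foo::dtor_calls;
}
```

Here two Foo objects are constructed (x and the temporary) and one assignment runs, whereas Foo x = rbv(); needs only a single construction. The copy constructor is still never called, which is the return-by-value optimization doing its part.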