What is the problem with returning a local variable?

5

What can happen if I return a local variable? I read on the internet that it is not a good idea to return a local variable.

Maybe because the variable is deleted when the function exits?

Example:

std::string StrLower(std::string str)
{
    std::transform(str.begin(), str.end(), str.begin(), tolower);
    return str;
}
    
asked by anonymous 22.04.2018 / 14:56

2 answers

2
This advice is old and exists precisely because compilers in the past did not optimize this kind of code well, which resulted in executables with questionable performance (the variable is copied, and copies can be costly). Nowadays, returning a local may even be better than using output parameters. Of course, never rely on popular optimization folklore: always measure the performance of your own program before drawing any conclusion.

We have two names for the possible optimization types in this case:

  • RVO: Return Value Optimization, and
  • NRVO: Named Return Value Optimization, which is basically a variation of RVO for cases where the value has a name (i.e. it is a variable).

These two optimization techniques are forms of copy elision (the omission of copies). In C++17, copy elision became mandatory in some cases, such as returning a temporary. Previously, the standard merely permitted the technique without requiring it, and NRVO remains optional.

With all that said, we can now see the effects of RVO and NRVO:

When RVO is successfully applied, the copy that would otherwise be made of an object that has just been created and returned by the function is omitted: the storage of that object becomes the same as the storage of the object receiving the return value. To make this clearer, the following code:

#include <string>

std::string foo() { return "teste"; }

auto s = foo();

is conceptually transformed into the following:

#include <string>

std::string s;

void foo() { s = "teste"; }

int main() { foo(); }  // conceptually, foo writes directly into the caller's s

Notice how the optimized version uses the storage of the outer variable s to hold the string literal "teste", instead of creating a new std::string and copying that object to s. Compiling with GCC 7.3 at optimization level 3, we get the following body for the std::string foo() function:

foo[abi:cxx11]():
  lea rdx, [rdi+16]                    # Computes the location of the buffer inside 's'.
  mov rax, rdi
  mov DWORD PTR [rdi+16], 1953719668   # Writes "teste" into the buffer of 's'.
  mov BYTE PTR [rdi+20], 101
  mov QWORD PTR [rdi+8], 5             # Writes the size of the string.
  mov QWORD PTR [rdi], rdx
  mov BYTE PTR [rdi+21], 0             # Writes the string's null terminator.
  ret

Instead of creating a new std::string object, the function foo simply assumes that the storage for the object already exists (that is, the caller has already allocated space for it) and makes use of it.
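One way to observe this without reading assembly is to count constructor calls. The sketch below is only an illustration: Tracker, make_tracker and rvo_was_applied are hypothetical names that do not appear in the answer above, and the counters are there purely for the demonstration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper type (not from the answer): counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    std::string text;

    explicit Tracker(std::string t) : text(std::move(t)) {}
    Tracker(const Tracker& other) : text(other.text) { ++copies; }
    Tracker(Tracker&& other) noexcept : text(std::move(other.text)) { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Returns a temporary, which is the case RVO targets.
Tracker make_tracker() { return Tracker("teste"); }

// Reports whether the result was built directly in the caller's storage,
// that is, with no copy or move construction at all.
bool rvo_was_applied() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker t = make_tracker();
    assert(t.text == "teste");
    return Tracker::copies == 0 && Tracker::moves == 0;
}
```

Calling rvo_was_applied() should report true under C++17, where this elision is mandatory. Older standards only permit it, although GCC and Clang normally apply it anyway.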

The NRVO variation does exactly the same thing, except that it is extended to named variables. If we had the following code:

#include <string>

std::string foo()
{
    std::string s_local = "teste";
    s_local[0] = 'T';
    return s_local;
}

auto s = foo();

We would get exactly the same optimized output, with the sole addition of a mov BYTE PTR [rdi+16], 84 at the end, which changes the first character of the string to a capital T. That is, s_local and s share the same storage location after optimization.
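The shared storage can also be checked at run time by comparing addresses. This is a minimal sketch under the assumption that taking the address of the local does not change the compiler's decision; g_local_address, make_named and nrvo_was_applied are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Address of the local variable, recorded inside the function (illustrative global).
static std::uintptr_t g_local_address = 0;

std::string make_named() {
    std::string s_local = "teste";
    s_local[0] = 'T';
    g_local_address = reinterpret_cast<std::uintptr_t>(&s_local);
    return s_local;  // NRVO candidate: a single named local is returned
}

// Reports whether s_local and the caller's variable occupied the same storage.
bool nrvo_was_applied() {
    std::string s = make_named();
    assert(s == "Teste");
    return reinterpret_cast<std::uintptr_t>(&s) == g_local_address;
}
```

Because NRVO is optional, the result of nrvo_was_applied() depends on the compiler and its flags. If the optimization is not applied, the string is moved rather than copied, so the returned value is the same either way.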

There are some cases where NRVO cannot be applied easily. If we always return the same local variable, applying NRVO is trivial. Otherwise, if different return paths return different objects, we are in a difficult case for NRVO, and the optimization will probably not take place. For example:

std::string foo(bool b)
{
    std::string s1 = "abc";
    std::string s2 = "def";
    return b ? s1 : s2;
}

Here the return expression is a conditional rather than the name of a single local, so NRVO proper does not apply and the chosen string is copied. In a case this simple the optimizer may still manage to write "abc" or "def" straight into the result, depending on the value of b, but as soon as the code becomes more complex, the chances of that decrease. In contrast, if every return statement returns the same variable, the function can be as complex as you like and applying NRVO remains trivial.
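The cost of the conditional form can be made visible by counting constructions. The following is a sketch with hypothetical names (Counted, pick_with_ternary, pick_with_if, copies_made) that compares the conditional return with an equivalent if-based version, where each return statement names a local directly.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper type: counts copy and move constructions.
struct Counted {
    static int copies;
    static int moves;
    std::string text;

    explicit Counted(const char* t) : text(t) {}
    Counted(const Counted& o) : text(o.text) { ++copies; }
    Counted(Counted&& o) noexcept : text(std::move(o.text)) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// The conditional expression is an lvalue, so the chosen object is copied.
Counted pick_with_ternary(bool b) {
    Counted s1("abc");
    Counted s2("def");
    return b ? s1 : s2;
}

// Each return names a local directly, so it is moved (or elided) instead of copied.
Counted pick_with_if(bool b) {
    Counted s1("abc");
    Counted s2("def");
    if (b) return s1;
    return s2;
}

// Runs one of the functions above and reports how many copy constructions it caused.
int copies_made(Counted (*pick)(bool), bool b) {
    Counted::copies = 0;
    Counted::moves = 0;
    Counted result = pick(b);
    assert(result.text == (b ? "abc" : "def"));
    return Counted::copies;
}
```

With this setup, copies_made(pick_with_ternary, true) reports one copy, while copies_made(pick_with_if, true) reports none, because returning a local by name lets the compiler use the move constructor when it cannot elide.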

Finally, here is the output for your function (slightly modified) from a couple of compilers (compiling with C++17 in all cases).

#include <string>
#include <algorithm>

std::string foo()
{
    std::string s = "teste";
    std::transform(begin(s), end(s), begin(s),
                   [](char c) { return c - 32; });
    return s;
}

With GCC 7.3 and optimization level 3:

foo[abi:cxx11]():
  lea rdx, [rdi+16] # Computes the start of the string that already exists outside the function
  mov DWORD PTR [rdi+16], 1953719668 # Writes "teste"
  mov BYTE PTR [rdi+20], 101
  mov rax, rdi
  mov QWORD PTR [rdi+8], 5
  mov BYTE PTR [rdi+21], 0
  mov QWORD PTR [rdi], rdx
  sub BYTE PTR [rdi+16], 32 # Sequence of subtractions (to convert to uppercase)
  sub BYTE PTR [rdi+17], 32 # unrolled from 'std::transform'
  sub BYTE PTR [rdi+18], 32
  sub BYTE PTR [rdi+19], 32
  sub BYTE PTR [rdi+20], 32
  ret

With Clang 6.0.0 at optimization level 3, also compiling against libstdc++:

foo[abi:cxx11](): # @foo[abi:cxx11]()
  lea rax, [rdi + 16]
  mov qword ptr [rdi], rax
  mov qword ptr [rdi + 8], 5
  mov dword ptr [rdi + 16], 1414743380 # Clang managed to eliminate the 'std::transform'
  mov word ptr [rdi + 20], 69          # and writes the string already in uppercase
  mov rax, rdi
  ret

You can experiment with compiler outputs on Compiler Explorer (Godbolt).

    
22.04.2018 / 19:48
3

You never return variables. A variable is an abstract concept that makes the code easier to understand; since it is not something concrete, it cannot be carried over to another place. So the local variable exists only there and cannot leave the function.

You return a value. Okay, I understand that this is what you meant. A locally created value (allocated on the stack) can only be copied out, because it lives in an area that is guaranteed to be alive only while the function is running; after that, for all intents and purposes, consider the area destroyed. If you try to access the value at its original location you will potentially be accessing junk, which you should not do. Copying brings the value to an area that is sure to be alive when you access it.

Large types have a lot of data to copy, so copying them ends up being slow. This is why most large types are handled by reference.

Some types have reference semantics, so instead of returning the value of the object itself you return a reference to it. The object is then considered moved, since only the reference (pointer) is copied, and the reference points to the actual value. And that is the problem: you have returned a reference to a value that will be destroyed at the end of the function's execution, which will obviously cause trouble, to put it simply. That is what people are talking about.

In this case the solution is to allocate on the heap, or to pass the value by reference, which lets the function write directly to the place where the value will be used in the calling function. That guarantees the value will be alive when it is used.
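The alternatives described above could look roughly like the sketch below. All function names (by_value, by_out_param, on_heap, all_agree) are hypothetical and exist only for this illustration; the problematic version is left commented out on purpose, because using its result would be undefined behavior.

```cpp
#include <cassert>
#include <memory>
#include <string>

// PROBLEM (do not do this): returns a reference to a local that is destroyed
// when the function ends, so the caller would be left with a dangling reference.
// const std::string& broken() { std::string local = "teste"; return local; }

// Returning by value: the caller receives its own object (copied, moved or elided).
std::string by_value() {
    std::string local = "teste";
    return local;
}

// Passing by reference: the function writes into storage owned by the caller.
void by_out_param(std::string& out) { out = "teste"; }

// Allocating on the heap: the object outlives the function and is owned by the pointer.
std::unique_ptr<std::string> on_heap() {
    return std::unique_ptr<std::string>(new std::string("teste"));
}

// Checks that the three safe variants hand back the same value.
bool all_agree() {
    std::string out;
    by_out_param(out);
    std::unique_ptr<std::string> heap = on_heap();
    assert(heap != nullptr);
    return by_value() == out && out == *heap;
}
```

In all three safe variants the value is alive when the caller uses it; which one fits best depends on who should own the object.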

  

Maybe because the variable is deleted when the function exits?

Yes, although to use the correct terminology I would say it is because the value is potentially destroyed when leaving the function.

With the bugs fixed, the code works fine (I hope it is a generic example; I do not think this is a good idea in real code):

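#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>  // tolower
using namespace std;

string StrLower(string str) {
    // unsigned char avoids undefined behavior in tolower for negative char values
    transform(str.begin(), str.end(), str.begin(),
              [](unsigned char c) -> char { return static_cast<char>(tolower(c)); });
    return str;
}
int main() {
    cout << StrLower("TESTE");
}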
    
22.04.2018 / 17:34