Understanding Move Semantics
Overview
One of the powerful features of modern C++ is move semantics. So, what is move semantics? Move semantics makes it possible for the compiler to replace expensive copy operations with less expensive moves. Instead of constructing or assigning an object by a deep copy, you can do it with a move constructor or a move assignment operator.
To appreciate the use of move semantics, we need to know where it applies, which brings us to the topic of rvalue and lvalue.
What are rvalue and lvalue?
To understand move semantics, it’s important to understand the concepts of lvalue and rvalue. Let’s first look at an example:
// test.cpp
int foo() { return 2; }
int main()
{
    int a = 3; // okay
    foo() = a; // this won't get compiled
    return 0;
}
If you try to compile this code, a compiler error occurs:
test.cpp:7:8: error: expression is not assignable
foo() = a; // this won't get compiled
~~~~~ ^
1 error generated.
Apparently, foo() cannot be assigned to, because the call returns a temporary value rather than a named object. It is therefore an rvalue. For simplicity, I personally interpret an rvalue as an expression that can only appear on the right-hand side of =, whereas an lvalue can appear on either the left-hand or the right-hand side.
A more formal definition, quoted from here, states that:
An lvalue (locator value) represents an object that occupies some identifiable location in memory (i.e. has an address). rvalues are expressions that are not lvalue.
Therefore, from the above definition of lvalue, an rvalue is an expression that does not represent an object occupying some identifiable location in memory.
Instead, rvalues are typically temporary results (which may live in a register or in unnamed storage) that are discarded at the end of the full expression in which they appear. Thus, you cannot assign to an rvalue of a built-in type such as int, but you can assign to an lvalue (for example, by assigning an rvalue to it).
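As a small illustration of the distinction, the sketch below uses overload resolution to report the value category of an expression and shows which kinds of references can bind to which expressions. The helper names category, demo_* and bind_demo are made up for this example.

```cpp
#include <string>

int foo() { return 2; }  // returns by value, so the call foo() is an rvalue

// Hypothetical helpers: overload resolution picks one of these based on
// the value category of the argument.
std::string category(int&)  { return "lvalue"; }
std::string category(int&&) { return "rvalue"; }

std::string demo_named()     { int a = 3; return category(a); }  // "lvalue"
std::string demo_temporary() { return category(foo()); }         // "rvalue"
std::string demo_literal()   { return category(42); }            // "rvalue"

// A named rvalue reference is itself an lvalue: it has a name and an address.
std::string demo_named_rref() { int&& r = foo(); return category(r); }  // "lvalue"

int bind_demo() {
    int a = 3;
    int& lref = a;            // OK: lvalue reference binds to an lvalue
    // int& bad = foo();      // error: non-const lvalue reference cannot bind to an rvalue
    const int& cref = foo();  // OK: const lvalue reference may bind to an rvalue
    int&& rref = foo();       // OK: rvalue reference binds to an rvalue
    lref = rref;              // assigns 2 to a through lref
    return a + cref;          // 2 + 2
}
```

The demo_named_rref case matters later: inside a function, a parameter declared with && is an lvalue, which is why std::move is needed to treat it as an rvalue again.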
For more details, I recommend reading this great article by Eli 1.
Why do we need move semantics?
Since lvalues reside in memory, they can be modified (assigned, deleted, updated). However, copying large objects can be expensive. Consider the following example:
// StringVec.h
#include <algorithm>  // std::copy
#include <cstddef>    // size_t
#include <iostream>
#include <string>
#include <utility>    // std::swap
using namespace std;
class StringVec
{
public:
    const size_t TEST_SIZE = 1000;
    StringVec() : ptr(new string[TEST_SIZE]), size(TEST_SIZE) {
        cout << "default constructor invoked" << endl;
    }
    ~StringVec() { delete[] ptr; }
    // Copy constructor:
    StringVec(const StringVec &rhs) :
        ptr(new string[rhs.size]), size(rhs.size) {
        // Copy element by element; memcpy must not be used on std::string,
        // which is not trivially copyable.
        copy(rhs.ptr, rhs.ptr + size, ptr);
        cout << "copy constructor invoked" << endl;
    }
    // Copy assignment operator (copy-and-swap):
    StringVec& operator=(const StringVec &rhs) {
        StringVec tmp(rhs);
        _swap(tmp);
        cout << "copy assignment invoked" << endl;
        return *this;
    }
    void _swap(StringVec &rhs) {
        swap(size, rhs.size);
        swap(ptr, rhs.ptr);
    }
private:
    string *ptr;
    size_t size;
};
It has two constructors (default and copy). Notice that in the assignment operator, we are not supposed to modify the input object. Thus, we must create a temporary object from rhs (using the copy constructor), then swap the temporary object with our object.
Then, if we have the following driver code:
// main.cpp
#include "StringVec.h"
int main()
{
    StringVec v1;
    StringVec v2;
    v2 = v1;
    return 0;
}
The output will be like:
default constructor invoked
default constructor invoked
copy constructor invoked
copy assignment invoked
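The first line comes from constructing v1 and the second from constructing v2. Lines 3 and 4 come from the assignment v2 = v1, which copy-constructs a temporary from v1 (a deep copy of all 1000 strings) and then swaps it into v2. When the source of an assignment is an rvalue that is about to be discarded anyway, this deep copy is wasted work. Move semantics lets us avoid it by adding a move constructor and a move assignment operator to StringVec, which take over the source's buffer instead of copying it: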
// Move constructor
StringVec(StringVec &&rhs) noexcept : ptr(rhs.ptr), size(rhs.size) {
    rhs.ptr = nullptr;
    rhs.size = 0;
}
// Move assignment operator
StringVec &operator=(StringVec &&rhs) noexcept {
    StringVec tmp(std::move(rhs));
    _swap(tmp);
    return *this;
}
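Notice that && in the parameter list denotes an rvalue reference, so the move constructor and the move assignment operator accept rvalues. Also, there is a std::move in the move assignment operator: rhs is a named parameter and therefore an lvalue, so it must be cast back to an rvalue before it can be passed to the move constructor. Despite its name, std::move does not move anything, and neither does std::forward. Neither does any work at runtime; they only perform casts.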
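To see the effect of the move operations, here is a minimal self-contained sketch. MiniVec is a condensed stand-in for StringVec, and the global counters copies and moves, as well as the helper function names, are assumptions added only to make the behavior observable.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>

// Counters added for this sketch only.
int copies = 0;
int moves = 0;

// MiniVec: a condensed, hypothetical stand-in for StringVec.
class MiniVec {
public:
    explicit MiniVec(std::size_t n) : ptr(new std::string[n]), size(n) {}
    ~MiniVec() { delete[] ptr; }  // delete[] on nullptr is a no-op

    // Copy constructor: allocates a new buffer and copies every element.
    MiniVec(const MiniVec& rhs) : ptr(new std::string[rhs.size]), size(rhs.size) {
        std::copy(rhs.ptr, rhs.ptr + size, ptr);
        ++copies;
    }
    // Move constructor: steals the buffer and leaves rhs empty.
    MiniVec(MiniVec&& rhs) noexcept : ptr(rhs.ptr), size(rhs.size) {
        rhs.ptr = nullptr;
        rhs.size = 0;
        ++moves;
    }
    MiniVec& operator=(const MiniVec& rhs) { MiniVec tmp(rhs); swap(tmp); return *this; }
    MiniVec& operator=(MiniVec&& rhs) noexcept { MiniVec tmp(std::move(rhs)); swap(tmp); return *this; }

    std::size_t length() const { return size; }

private:
    void swap(MiniVec& rhs) noexcept { std::swap(ptr, rhs.ptr); std::swap(size, rhs.size); }
    std::string* ptr;
    std::size_t size;
};

// v2 = v1 selects the copy assignment operator: one deep copy.
void assign_by_copy() { MiniVec v1(1000), v2(1000); v2 = v1; }

// v2 = std::move(v1) selects the move assignment operator: no deep copy.
void assign_by_move() { MiniVec v1(1000), v2(1000); v2 = std::move(v1); }

// After being moved from, v1 no longer owns a buffer.
std::size_t moved_from_length() { MiniVec v1(1000), v2(1); v2 = std::move(v1); return v1.length(); }
```

With these definitions, assign_by_copy increments copies once, while assign_by_move only increments moves: the 1000 strings change owner without being duplicated.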
std::move
Applying std::move to an object tells the compiler that the object is eligible to be moved from. That’s why std::move has the name it does: to make it easy to designate objects that may be moved from.
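To make the "it is only a cast" point concrete, here is a sketch of how a function like std::move could be written in C++14. The name my_move is hypothetical, and a real standard library implementation may differ in detail.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical re-implementation of std::move (C++14): strip any reference
// from T, then cast the argument to an rvalue reference of that type.
template <typename T>
constexpr std::remove_reference_t<T>&& my_move(T&& param) noexcept {
    return static_cast<std::remove_reference_t<T>&&>(param);
}

// The cast changes only the type and value category of the expression.
static_assert(std::is_same<decltype(my_move(std::declval<std::string&>())),
                           std::string&&>::value,
              "lvalue in, rvalue reference out");
static_assert(std::is_same<decltype(my_move(std::declval<const std::string&>())),
                           const std::string&&>::value,
              "const is preserved by the cast");

// The actual move happens in std::string's move constructor, not in my_move.
std::string steal(std::string& source) {
    std::string target = my_move(source);
    return target;
}

// Casting without constructing or assigning anything leaves the object untouched.
bool cast_alone_changes_nothing() {
    std::string s = "hello";
    static_cast<void>(my_move(s));  // result discarded: nothing is moved
    return s == "hello";
}
```

Note the second static_assert: casting a const object yields a const rvalue, which becomes important in the "Incorrect usage" section.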
std::forward
Unlike std::move, which unconditionally casts its argument to an rvalue, std::forward casts it only when a certain condition is met: the cast happens if and only if the argument was initialized with an rvalue.
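The difference can be sketched with a pair of overloads and three relay templates. All names here (process, relay_forward, relay_plain, relay_move) are made up for illustration.

```cpp
#include <string>
#include <utility>

// Two overloads that report which kind of argument they received.
std::string process(const std::string&) { return "lvalue overload"; }
std::string process(std::string&&)      { return "rvalue overload"; }

// std::forward<T> casts to an rvalue only if the caller passed an rvalue.
template <typename T>
std::string relay_forward(T&& arg) { return process(std::forward<T>(arg)); }

// Without a cast, arg is a named parameter and therefore always an lvalue.
template <typename T>
std::string relay_plain(T&& arg) { return process(arg); }

// std::move casts unconditionally, even if the caller passed an lvalue.
template <typename T>
std::string relay_move(T&& arg) { return process(std::move(arg)); }
```

Given a named std::string s, relay_forward(s) reaches the lvalue overload and relay_forward(std::string("tmp")) reaches the rvalue overload. relay_plain always reaches the lvalue overload, and relay_move(s) reaches the rvalue overload even though the caller may still expect s to be intact.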
Incorrect usage
class Annotation {
public:
    explicit Annotation(const std::string text)
    : value(std::move(text)) // "move" text into value; this code
    { ... }                  // doesn't do what it seems to!
    ...
private:
    std::string value;
};
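This brings up a key point about move semantics: you cannot move from a const object. Moving modifies the source object (just like passing it by non-const reference), which would violate the const-ness of the original. Because text is const, std::move(text) yields an rvalue of type const std::string. The move constructor takes a non-const rvalue reference and cannot accept it, but the copy constructor, which takes a reference to const, can. As a result, text is silently copied into value rather than moved. To see why, look at the relevant constructors of std::string: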
class string {                  // std::string is actually a
public:                         // typedef for std::basic_string<char>
    ...
    string(const string& rhs);  // copy constructor
    string(string&& rhs);       // move constructor
    ...
};
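The silent fallback to copying can be observed with a small sketch. Tracker is a hypothetical type that records which constructor created it, and AnnotationConst and AnnotationFixed are illustrative variants of the Annotation class above with and without the const on the parameter.

```cpp
#include <string>
#include <utility>

// Hypothetical type that records how it was constructed.
struct Tracker {
    std::string how = "default";
    Tracker() = default;
    Tracker(const Tracker&) : how("copy") {}
    Tracker(Tracker&&) noexcept : how("move") {}
};

// Same shape as Annotation above: the parameter is const, so
// std::move(text) is a const rvalue and only the copy constructor matches.
struct AnnotationConst {
    explicit AnnotationConst(const Tracker text) : value(std::move(text)) {}
    Tracker value;
};

// Dropping const lets std::move(text) bind to the move constructor.
struct AnnotationFixed {
    explicit AnnotationFixed(Tracker text) : value(std::move(text)) {}
    Tracker value;
};

std::string via_const_param() { Tracker t; return AnnotationConst(t).value.how; }  // "copy"
std::string via_plain_param() { Tracker t; return AnnotationFixed(t).value.how; }  // "move"
```

The takeaway is that an object you intend to move from should not be declared const; the code still compiles, but it copies.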
Benchmark
This blog was inspired by Leor Zolman’s 2 major reasons why modern C++ is a performance beast. In that article, he provided code for benchmarking constructors with and without move semantics.
Extension
Move semantics also enables the creation of move-only types, such as std::unique_ptr (only one owner of the resource can exist at a time), std::future, and std::thread.
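As a brief illustration of a move-only type, the sketch below uses std::unique_ptr. The function names make_value and transfer_demo are made up for this example.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// std::unique_ptr can be moved but not copied.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr cannot be copied");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr can be moved");

// Returning by value works because the result is moved (or elided), not copied.
std::unique_ptr<int> make_value(int v) { return std::unique_ptr<int>(new int(v)); }

int transfer_demo() {
    std::unique_ptr<int> a = make_value(42);
    // std::unique_ptr<int> b = a;          // error: the copy constructor is deleted
    std::unique_ptr<int> b = std::move(a);  // OK: ownership moves from a to b
    return (a == nullptr) ? *b : -1;        // a is null after the move; b owns 42
}
```

Ownership can be handed from one std::unique_ptr to another, but there is never more than one owner of the same object.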
Reference
- Meyers, S. (2014). Effective modern C++. 1st ed. O’Reilly Media.