Understanding Move Semantics

Overview

One of the powerful features of modern C++ is move semantics. So, what is move semantics? Move semantics makes it possible for the compiler to replace expensive copy operations with less expensive moves. Instead of constructing or assigning an object through a deep copy, you can do it through a move constructor or a move assignment operator.

To appreciate the use of move semantics, we need to know where it applies, which brings us to the topic of rvalues and lvalues.

What are rvalues and lvalues

To understand move semantics, it’s important to understand the concepts of lvalue and rvalue. Let’s first look at an example:

// test.cpp

int foo() {return 2;}
int main()
{
	int a = 3;    // okay
	foo() = a;    // this won't get compiled

	return 0;
}

If you try to compile this code, a compiler error will occur, saying:

test.cpp:7:8: error: expression is not assignable
        foo() = a;       // this won't get compiled
        ~~~~~ ^
1 error generated.

Apparently, foo() cannot be assigned a value because it returns a temporary int by value rather than a reference to an object. The call expression is therefore an rvalue. As a simplification, I personally interpret rvalues as expressions that can only appear on the right-hand side of =, whereas lvalues can appear on either side.

A more formal definition, quoted from here, states that:

An lvalue (locator value) represents an object that occupies some identifiable location in memory (i.e. has an address). rvalues are expressions that are not lvalue.

Therefore, from the above definition of lvalue, an rvalue is an expression that does not represent an object occupying some identifiable location in memory.

Instead, rvalues are typically temporary results (for built-in types they often live only in a register) that are discarded at the end of the full expression that creates them. Thus, you cannot assign to an rvalue of a built-in type such as int, but you can assign to an lvalue (for example, by assigning an rvalue to it).

For more details, I recommend reading this great article by Eli, "Understanding lvalues and rvalues in C and C++" [1].

Why do we need move semantics?

Since lvalues reside in memory, they can be modified (assigned to, updated, destroyed). However, modifying large objects, and in particular copying them, can be expensive. Consider the following example:

// StringVec.h

#include <cstddef>
#include <iostream>
#include <string>
#include <utility>

using namespace std;
class StringVec
{
    public:
        const size_t TEST_SIZE = 1000;
        
        StringVec() : ptr(new string[TEST_SIZE]), size(TEST_SIZE){
            cout<<"default constructor invoked"<<endl;
        }
        
        ~StringVec() { delete[] ptr; }

        // Copy constructor:
        StringVec(const StringVec &rhs) :
                ptr(new string[rhs.size]), size(rhs.size) {
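            // Copy element by element; memcpy must not be used on std::string,
            // which is not trivially copyable.
            for (size_t i = 0; i < size; ++i) ptr[i] = rhs.ptr[i];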
            cout<<"copy constructor invoked"<<endl;
        }
        
        // Copy assignment operator:
        StringVec& operator=(const StringVec &rhs) {
            StringVec tmp(rhs);
            _swap(tmp);
            cout<<"copy assignment invoked"<<endl;
            return *this;
        }
        
        void _swap(StringVec &rhs) {
            swap(size, rhs.size);
            swap(ptr, rhs.ptr);
        }

    private:
        string *ptr;
        size_t size;
};

It has two constructors (default and copy). Notice that the copy assignment operator must not modify its input object. Thus, we create a temporary object from rhs (using the copy constructor) and then swap the temporary with our own object.

Then, if we have the following driver code:

// main.cpp

#include "StringVec.h"

int main()
{
  StringVec v1;
  StringVec v2;

  v2 = v1;

  return 0;
}

The output will be like:

default constructor invoked
default constructor invoked
copy constructor invoked
copy assignment invoked

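The first line comes from constructing v1 and the second from constructing v2; lines 3 and 4 come from v2 = v1. To perform v2 = v1, we copied the entire content of v1 into v2. When the source object is no longer needed afterwards, this deep copy is wasted work. Move semantics lets us take over the source's buffer instead, by adding a move constructor and a move assignment operator to StringVec: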

    // Move constructor
    StringVec(StringVec &&rhs) noexcept : ptr(rhs.ptr), size(rhs.size) {
        rhs.ptr = nullptr; rhs.size = 0;
    }

    // Move assignment operator
    StringVec &operator=(StringVec &&rhs) noexcept {
        StringVec tmp(std::move(rhs));
        _swap(tmp);
        return *this;
    }



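Notice that && in the parameter list means the move constructor takes an rvalue reference, so it is selected when the argument is an rvalue. Also, there is a std::move in the move assignment operator: rhs has a name, so inside the function it is an lvalue and must be cast back to an rvalue to select the move constructor. In fact, std::move does not move anything, and std::forward does not move anything either. Neither contains any logic of its own; they only perform casts.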

std::move

Applying std::move to an object tells the compiler that the object is eligible to be moved from. That’s why std::move has the name it does: to make it easy to designate objects that may be moved from.

std::forward

Unlike std::move, which unconditionally casts its argument to an rvalue, std::forward casts only when a certain condition is met. The cast happens if and only if the argument was initialized with (bound to) an rvalue.

Incorrect usage

class Annotation {
   public:
     explicit Annotation(const std::string text)
     : value(std::move(text))   // "move" text into value; this code
     { ... }                    // doesn't do what it seems to!
     ...
   private:
     std::string value;
};

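This brings up a key point about move semantics: you cannot move from a const object. Moving modifies the source object, which would violate its const-ness. Here std::move(text) yields an rvalue of type const std::string, which cannot bind to the std::string&& parameter of the move constructor. It can, however, bind to the const std::string& parameter of the copy constructor, so text is silently copied into value. The two constructors of std::string make this clear: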

class string {               // std::string is actually a
   public:                   // typedef for std::basic_string<char>
  ...
  string(const string& rhs);
  string(string&& rhs);
  ...
};

Benchmark

This blog post was inspired by Leor Zolman’s "2 major reasons why modern C++ is a performance beast". In that article, he provided code for benchmarking constructors with and without move semantics.

Extension

Move semantics also enables move-only types, such as std::unique_ptr (only one owner of the managed object can exist at a time), std::future, and std::thread.

Reference

  1. Meyers, S. (2014). Effective modern C++. 1st ed. O’Reilly Media.


  1. Understanding lvalues and rvalues in C and C++

Ziji SHI(史子骥)
Ph.D. candidate

My research interests include distributed machine learning and high-performance computing.