push_back vs. emplace_back

What are the differences between push_back and emplace_back?

Intro

Let's see an example in C++98.

push_back

Suppose there is a class A, and we want to use a vector to store some instances of class A.

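#include <cstdio>   // puts
#include <cstring>  // memcpy
#include <vector>
using namespace std;

class A {
  protected:
    int *ptr;
    int size;

  public:
    A(int n = 16) {
        ptr = new int[n];
        size = n;
        puts("A(int)");
    }
    A(const A &a) {
        size = a.size;
        ptr = new int[size];
        memcpy(ptr, a.ptr, size * sizeof(int));
        puts("A(const A&)");
    }
    virtual ~A() {
        // nullptr does not exist in C++98; delete[] on a null pointer is a no-op anyway
        delete[] ptr;
    }
};

int main() {
    vector<A> vec;
    vec.push_back(A(10));  // builds a temporary A, then copies it into the vector
}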

Compile this code with the command:

clang++ -std=c++98 push_back.cpp

And it will output:

A(int)
A(const A&)

We can see that storing an instance of A in the vector constructs at least two instances. One is a temporary, and the other one is copy-constructed from it into the heap storage of the vector.

If A were a very heavy class, creating and copying the temporary would slow down our program. This is what emplace_back is meant to optimize: it avoids the temporary instance and the copy.

emplace_back

In template class vector, push_back is defined as:

void push_back (const value_type& val); // C++98
void push_back (value_type&& val);      // since C++11, where && denotes an rvalue reference

However, emplace_back is defined as:

template <class... Args>
  void emplace_back (Args&&... args);       // since C++11; && denotes a universal reference, explained later
template <class... Args>
  reference emplace_back (Args&&... args);  // since C++17; returns a reference to the new element

The arguments of emplace_back are variadic, which is similar to printf. What's more, Args&& here is not an rvalue reference but a universal reference, because Args is a deduced type. We will explain the difference between the two below.
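To make the variadic signature more concrete, here is a small sketch. The Point type and the function names are made up for this illustration; the point is only that emplace_back receives the constructor arguments themselves, while push_back receives an already built object.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for illustration: its constructor takes several arguments.
struct Point {
    std::string label;
    int x;
    int y;
    Point(std::string l, int px, int py) : label(std::move(l)), x(px), y(py) {}
};

std::vector<Point> make_points() {
    std::vector<Point> pts;
    pts.emplace_back("origin", 0, 0);    // arguments are forwarded to Point's constructor
    pts.push_back(Point("unit", 1, 1));  // push_back needs a Point that already exists
    return pts;
}

void demo_variadic() {
    std::vector<Point> pts = make_points();
    assert(pts.size() == 2);
    assert(pts[0].label == "origin" && pts[0].x == 0);
    assert(pts[1].label == "unit" && pts[1].y == 1);
}
```

Calling demo_variadic() from main should pass all assertions.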

Since C++11, the C++ standard has "move semantics" and "perfect forwarding". Along with them comes a new kind of constructor, called the "move constructor".

#include <cstdio>   // puts
#include <cstring>  // memcpy
#include <iostream>
#include <vector>
using namespace std;
class A {
  protected:
    int *ptr;
    int size;

  public:
    A(int n = 16) {
        ptr = new int[n];
        size = n;
        puts("A(int)");
    }
    A(const A &a) {
        size = a.size;
        ptr = new int[size];
        memcpy(ptr, a.ptr, size * sizeof(int));
        puts("A(const A&)");
    }
    A(A &&a) {
        size = a.size;
        ptr = a.ptr;
        a.ptr = nullptr;
        puts("A(const A&&)");
    }
    virtual ~A() {
        if (ptr != nullptr)
            delete[] ptr;
    }
};

int main() {
    vector<A> vec;
    vec.emplace_back(10);
}

Compile it with clang++ -std=c++17 push.cpp. Then the program will output:

A(int)

Now, we can see the differences between push_back and emplace_back.

What will happen if we call emplace_back(A(10)) ? Actually, it will output:

A(int)
A(const A&&)

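So we can see that a temporary object is still constructed here, but it is moved into the vector rather than copied. The move constructor (the one that prints "A(const A&&)" in this example) only takes over the pointer, so the underlying buffer is not duplicated.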
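The three ways of inserting can also be compared by counting constructor calls instead of reading printed output. The following is a minimal sketch; the Counted type and its counters are invented for this purpose and are not part of the example class A above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instrumented type: counts how each instance was created.
struct Counted {
    static int ctor, copy, move;
    explicit Counted(int) { ++ctor; }
    Counted(const Counted&) { ++copy; }
    Counted(Counted&&) noexcept { ++move; }
    static void reset() { ctor = copy = move = 0; }
};
int Counted::ctor = 0;
int Counted::copy = 0;
int Counted::move = 0;

void compare_insertions() {
    std::vector<Counted> v;
    v.reserve(3);  // no reallocation, so only the insertions themselves are counted

    Counted::reset();
    v.push_back(Counted(1));     // temporary, then a move into the vector
    assert(Counted::ctor == 1 && Counted::move == 1 && Counted::copy == 0);

    Counted::reset();
    v.emplace_back(2);           // constructed in place, no temporary
    assert(Counted::ctor == 1 && Counted::move == 0 && Counted::copy == 0);

    Counted::reset();
    v.emplace_back(Counted(3));  // temporary, then a move, same as push_back
    assert(Counted::ctor == 1 && Counted::move == 1 && Counted::copy == 0);
}
```

Only emplace_back(2) avoids the extra object; passing an already constructed temporary to emplace_back gains nothing over push_back.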

In the next sections, we will explain what a "universal reference" is, and introduce the differences among lvalue references, rvalue references and universal references.

lvalue, rvalue and xvalue

Please refer to

for more details.

Generally speaking,

  • lvalue - Roughly, a value that can appear on the left-hand side of an assignment expression. An lvalue always has an identity, usually a name.

    • Please note that "assignment" is not the same as declaration and initialization.
    • For example, int x = 1; declares an lvalue x and initializes it with 1.
    • int &y = x; declares an lvalue reference y and initializes it with the lvalue x.
  • rvalue - Roughly, a value that can only appear on the right-hand side of an assignment expression. An rvalue is usually a temporary object.

    • e.g. string s = string("hello"), where string("hello") is an rvalue.
    • Such a temporary has no name.
  • xvalue - "eXpiring value": it refers to an object, usually near the end of its lifetime, whose resources may be moved from. Every xvalue is also an rvalue.

    • e.g. suppose we have a string str; then std::move(str) is an xvalue. In contrast, given a function auto f() { return string("hello"); }, the call f() in str += f() is a prvalue ("pure" rvalue), which is the other kind of rvalue.
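The value category of an expression can be checked with decltype, because decltype((expr)) yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue. The helper names below (Cat, category, CATEGORY_OF, make_string) are assumptions made for this sketch only.

```cpp
#include <cassert>
#include <string>
#include <utility>

enum class Cat { lvalue, xvalue, prvalue };

// Map the type produced by decltype((expr)) to a value category.
template <class T> struct category      { static constexpr Cat value = Cat::prvalue; };
template <class T> struct category<T&>  { static constexpr Cat value = Cat::lvalue; };
template <class T> struct category<T&&> { static constexpr Cat value = Cat::xvalue; };

// The double parentheses matter: decltype((x)) inspects the expression, not the declaration.
#define CATEGORY_OF(expr) (category<decltype((expr))>::value)

std::string make_string() { return std::string("hello"); }

void classify() {
    std::string s = "hi";
    assert(CATEGORY_OF(s) == Cat::lvalue);                   // a named variable
    assert(CATEGORY_OF(std::move(s)) == Cat::xvalue);        // an expiring value
    assert(CATEGORY_OF(make_string()) == Cat::prvalue);      // a function returning by value
    assert(CATEGORY_OF(std::string("tmp")) == Cat::prvalue); // a temporary
}
```

Nothing is actually moved in classify(), since the operand of decltype is never evaluated.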

Universal Reference

In C++, there are two common reference types: lvalue reference and rvalue reference. In addition,

  • a non-const lvalue reference must be bound to an lvalue,
  • a const lvalue reference can be bound to either an lvalue or an rvalue,
    • e.g. if we have a function void f(const string &str);, then f(string("ABC")) is valid.
  • an rvalue reference must be bound to an rvalue.
void f1(vector<int>& vec) {}
void f2(vector<int>&& vec) {}

In the above code, vector<int>& means vec is an lvalue reference, and vector<int>&& means vec is an rvalue reference.
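These binding rules can be observed through overload resolution. In the sketch below, which_ref and length_of are hypothetical helpers: each which_ref overload simply reports which kind of reference the argument was bound to.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

std::string which_ref(std::string&)       { return "non-const lvalue reference"; }
std::string which_ref(const std::string&) { return "const lvalue reference"; }
std::string which_ref(std::string&&)      { return "rvalue reference"; }

// With only a const lvalue reference available, rvalues bind to it as well.
std::size_t length_of(const std::string& str) { return str.size(); }

void demo_binding() {
    std::string s = "abc";
    const std::string cs = "abc";
    assert(which_ref(s) == "non-const lvalue reference");
    assert(which_ref(cs) == "const lvalue reference");
    assert(which_ref(std::string("abc")) == "rvalue reference");
    assert(which_ref(std::move(s)) == "rvalue reference");  // s itself is left untouched
    assert(length_of(std::string("ABC")) == 3);
}
```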

Actually, there is a third kind of reference, called a "universal reference" (the C++ standard calls it a "forwarding reference"). A universal reference is a reference that may resolve to either an lvalue reference or an rvalue reference.

Now, let us see another example, which is about template.

template<class T> void f1(T &val);          // lvalue reference
template<class T> void f2(T &&val);         // universal reference
template<class T> void f3(vector<T> &&val); // rvalue reference
template<class T> void f4(const T&& param); // rvalue reference
  • T & is the most common reference type, an lvalue reference, which must be bound to an lvalue.
  • T && is actually a universal reference.
  • vector<T> && and const T && are rvalue references.

So we can see that it's easy to recognize an lvalue reference: it has only one &.

But how can we distinguish an rvalue reference from a universal reference, when both of them have two &?

Refer to this blog: Universal References in C++11

  • "Universal references can only occur in the form T&&!"
  • More specifically, universal references always have the form T&& for some deduced type T.
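What "deduced" means here can be checked directly. In the following sketch (the function names are made up for illustration), T is deduced as X& when the argument is an lvalue and as X when it is an rvalue, which is why T&& can bind to both.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// T&& with a deduced T is a universal reference.
template <class T>
bool deduced_as_lvalue_ref(T&&) {
    return std::is_lvalue_reference<T>::value;
}

// const T&& is not of the plain form T&&, so it is an rvalue reference.
template <class T>
bool takes_const_rvalue(const T&&) { return true; }

void demo_deduction() {
    int x = 0;
    assert(deduced_as_lvalue_ref(x));              // T = int&, parameter type collapses to int&
    assert(!deduced_as_lvalue_ref(42));            // T = int,  parameter type is int&&
    assert(!deduced_as_lvalue_ref(std::move(x)));  // T = int,  parameter type is int&&
    assert(takes_const_rvalue(42));
    // takes_const_rvalue(x);  // would not compile: an lvalue cannot bind to const T&&
}
```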

Let's revisit the push_back and emplace_back.

template <class T, class Allocator = allocator<T> >
class vector {
public:
    ...
    void push_back(T&& x);       // fully specified parameter type => no type deduction;
    ...                          // && is rvalue reference
};

Actually, the declaration for push_back is:

template <class T>
void vector<T>::push_back(T&& x);

push_back can't exist without the class std::vector<T> that contains it. But if we have a class std::vector<T>, we already know what T is, so there's no need to deduce it. Hence T is not deduced in T&&, and the parameter is a plain rvalue reference.

The case is different in emplace_back.

template <class T, class Allocator = allocator<T> >
class vector {
public:
    ...
    template <class... Args>
    void emplace_back(Args&&... args); // deduced parameter types => type deduction;
    ...                                // && is a universal reference
};

And the declaration of emplace_back is:

template<class T>
template<class... Args>
void std::vector<T>::emplace_back(Args&&... args);

Here Args is clearly a deduced type. Hence Args&& is a universal reference.

move

std::move is used to "cast an lvalue to an rvalue".

std::move is used to indicate that an object t may be "moved from", i.e. allowing the efficient transfer of resources from t to another object.

In particular, std::move produces an xvalue expression that identifies its argument t. It is exactly equivalent to a static_cast to an rvalue reference type.

std::move is defined as:

template<class T>
constexpr std::remove_reference_t<T>&& move( T&& t ) noexcept;    // since C++14

Here T&& t is a universal reference, since T is a deduced type.

The implementation of move is very simple: all it does is a type cast with static_cast.

template<class T>
constexpr std::remove_reference_t<T>&& move( T&& t ) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(t);
}

The effect of remove_reference is to remove the reference qualifier from a type T.

template<class T> struct remove_reference      {typedef T type;};
template<class T> struct remove_reference<T&>  {typedef T type;};
template<class T> struct remove_reference<T&&> {typedef T type;};
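To see these pieces work together, here is a sketch of a hand-written move. It is named my_move (a made-up name) to avoid clashing with std::move, and it is written in C++11 style with std::remove_reference<T>::type.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

template <class T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& t) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(t);
}

// remove_reference strips both kinds of reference and leaves other types alone.
static_assert(std::is_same<std::remove_reference<int>::type, int>::value, "");
static_assert(std::is_same<std::remove_reference<int&>::type, int>::value, "");
static_assert(std::is_same<std::remove_reference<int&&>::type, int>::value, "");

void demo_my_move() {
    std::string a = "payload";
    // Called with an lvalue, T is deduced as std::string&, yet the result is std::string&&.
    static_assert(std::is_same<decltype(my_move(a)), std::string&&>::value, "");
    std::string b = my_move(a);  // selects std::string's move constructor
    assert(b == "payload");
    // a is now in a valid but unspecified state, so nothing is asserted about it.
}
```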

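It is tempting to make this code simpler by dropping remove_reference, but the following version is NOT equivalent to std::move:

template<class T>
constexpr T&& broken_move(T&& t) noexcept {
    // Called with an lvalue of type X, T is deduced as X&, and T&& collapses
    // to X&. The function then returns an lvalue reference, not an rvalue.
    return static_cast<T &&>(t);
}

So remove_reference is exactly what guarantees that the result is always an rvalue reference.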

forward

std::forward is defined as:

template< class T >
constexpr T&& forward( std::remove_reference_t<T>& t ) noexcept {
    return static_cast<T&&>(t);
}

template< class T >
constexpr T&& forward(std::remove_reference_t<T>&& t) noexcept {
    static_assert(!std::is_lvalue_reference<T>::value,
                  "can not forward an rvalue as an lvalue");
    return static_cast<T&&>(t);
}
  • The 1st overload forwards an lvalue t as either an lvalue or an rvalue, depending on T.
    • std::forward<string &>(str) will produce an lvalue reference. (Actually, it does nothing here.)
    • std::forward<string &&>(str) will produce an rvalue reference. It forwards str (an lvalue) as an rvalue. Here we can see that this version of forward can replace move. See Usage of std::forward vs std::move.
  • The 2nd overload forwards an rvalue t as an rvalue and prohibits forwarding an rvalue as an lvalue.
    • e.g. std::forward<string &>(string("")) will cause a compile error, since it attempts to forward the rvalue string("") as an lvalue.

std::forward makes it possible to forward the result of an expression (such as a function call), which may be an rvalue or an lvalue, with the original value category of a forwarding reference argument.

The forward operation preserves the value category of t while forwarding it, hence it is called "perfect forwarding".
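A short sketch may help to see what is preserved. The functions sink, relay and relay_without_forward are hypothetical names; sink has one overload per value category and only reports which one was chosen.

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string sink(const std::string&) { return "copied"; }
std::string sink(std::string&&)      { return "moved"; }

// Forwards its argument with the original value category.
template <class T>
std::string relay(T&& arg) {
    return sink(std::forward<T>(arg));
}

// Without std::forward, the named parameter arg is always an lvalue.
template <class T>
std::string relay_without_forward(T&& arg) {
    return sink(arg);
}

void demo_forward() {
    std::string s = "abc";
    assert(relay(s) == "copied");                   // lvalue stays an lvalue
    assert(relay(std::string("abc")) == "moved");   // rvalue stays an rvalue
    assert(relay(std::move(s)) == "moved");
    assert(relay_without_forward(std::string("abc")) == "copied");  // category is lost
}
```

The last assertion shows the problem that std::forward solves: inside the template, arg has a name, so without forward it would always select the copying overload.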

Implementation of emplace_back

Based on std::forward<>() and std::move(), (after C++11) one of the possible implementations of push_back and emplace_back is:

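#include <cstddef>  // std::size_t
#include <new>      // placement new
#include <utility>  // std::forward, std::move

template<class T>
class Vector {
protected:
    using value_type = T;
    using pointer_type = T*;
    using reference_type = T&;
    pointer_type start;
    std::size_t size;
    std::size_t capacity;
    // ...
public:
    // val has a name, hence it is an lvalue here: std::move turns it back into an rvalue
    void push_back(value_type &&val) { this->emplace_back(std::move(val)); }
    // a const lvalue is passed as is, which selects the copy constructor
    void push_back(const value_type &val) { this->emplace_back(val); }

    template <class... Args>
    reference_type emplace_back (Args&&... args) {
        if (size == capacity) {
            // make vector grow via some strategies
        }
        // placement new: construct the element directly in the vector's storage
        pointer_type p = ::new (static_cast<void*>(start + size)) T(std::forward<Args>(args)...);
        ++size;
        return *p;
    }
};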
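The outline above leaves out allocation and growth. As a rough, runnable sketch of the same idea, the following FixedVec (a made-up name) allocates a fixed capacity up front with std::allocator and never grows, which is an assumption made to keep the example short. It is not how any real standard library implements vector.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>

template <class T>
class FixedVec {
    std::allocator<T> alloc_;
    std::size_t capacity_;
    std::size_t size_;
    T* start_;  // raw, uninitialized storage for capacity_ elements

public:
    explicit FixedVec(std::size_t capacity)
        : capacity_(capacity), size_(0), start_(alloc_.allocate(capacity)) {}
    FixedVec(const FixedVec&) = delete;
    FixedVec& operator=(const FixedVec&) = delete;
    ~FixedVec() {
        while (size_ > 0) start_[--size_].~T();  // destroy elements in reverse order
        alloc_.deallocate(start_, capacity_);
    }

    template <class... Args>
    T& emplace_back(Args&&... args) {
        assert(size_ < capacity_ && "this sketch never grows");
        // placement new: construct the element in the already allocated slot
        T* p = ::new (static_cast<void*>(start_ + size_)) T(std::forward<Args>(args)...);
        ++size_;
        return *p;
    }
    void push_back(const T& val) { emplace_back(val); }
    void push_back(T&& val) { emplace_back(std::move(val)); }

    std::size_t size() const { return size_; }
    T& operator[](std::size_t i) { return start_[i]; }
};

void demo_fixed_vec() {
    FixedVec<std::string> v(3);
    std::string s = "lvalue";
    v.push_back(s);                              // copy-constructs from s
    v.push_back(std::string("rvalue"));          // move-constructs from the temporary
    std::string& last = v.emplace_back(3, 'x');  // builds std::string(3, 'x') in place
    assert(s == "lvalue");
    assert(v.size() == 3 && v[0] == "lvalue" && v[1] == "rvalue" && last == "xxx");
}
```

Both push_back overloads are thin wrappers around emplace_back; the only difference between them is whether the argument is passed on as an lvalue or as an rvalue.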

In C++98 (before C++11), the implementation of push_back may be:

void push_back(const value_type &val) {
    if (size == capacity) {
        // ...
    }
    new (start + size) T(val);  // placement new; this calls the copy constructor
    ++size;
}

