C++ remove elements from a vector
The Problem
Given below vector v containing 8 integers, remove all elements that are greater than 5.
std::vector<int> v{3, 6, 1, 2, 7, 0, 9, 4};
The Solution
I was actually asked this question during an interview long time ago. At first glance this question seems simple. But to do it right and efficiently, you need to have some familiarity with the STL.
Before diving into the solutions, I want to highlight that a vector in C++ is a continuous container, meaning that the elements are adjacent to each other. This gives you the constant time complexity to get an element at any index. This also means you need to make sure the vector is still continuous after one element is removed.
1. The naive way
The naive way is to loop through each element in the vector, and check if the element is greater than 5. If yes, erase it.
std::vector<int> v{3, 6, 1, 2, 7, 0, 9, 4};
for (auto iter = v.begin(); iter != v.end();) {
if (*iter > 5) {
iter = v.erase(iter);
} else {
++iter;
}
}
You should always be careful when you modify an array in-place while looping through it, in any languages. Fortunately in this case our code works correctly. Here is the first knowledge point:
v.erase(iter) returns the iterator to the next element following the last removed element.
Effectively this advances the iterator to the next in the case when the current element is erased.
Though this solution works, it’s not efficient. Every time we erase an element, we need to move all subsequent elements. The time complexity is O(n²). Can we do it in a faster way?
2. Erase-remove idiom
The faster way is called erase-remove idiom:
std::vector<int> v{3, 6, 1, 2, 7, 0, 9, 4};
v.erase(std::remove_if(v.begin(), v.end(), [](int i) { return i > 5; }), v.end());
This is both syntactically concise and efficient. Let’s break down the statement and understand what it does.
std::remove_if
std::remove_if takes the beginning and end iterators of the range you want to work on, and each element in the range is passed to the third parameter, which is a UnaryPredicate. The UnaryPredicate is basically a function that takes a single element and returns a bool. If the return value is false, std::remove_if will move the element to the front of the range. At the end, it returns an iterator to the location immediately after the last element that should be kept (UnaryPredicate false).
A visual demo may better help this: Suppose we print out the result of v after calling
std::remove_if(v.begin(), v.end(), [](int i) { return i > 5; })
vector::erase
The std::remove_if does not actually remove anything. It merely cluster all elements that should be kept together, and give you an iterator, after which the range are meaningless rubbish and should be erased. Erasing the leftover is straightforward:
v.erase(returned_iter, v.end())
It erases all the elements between the returned_iter and the end of the vector.
The time complexity is linear: you only need one linear traversal to finish std::remove_if(), and v.erase() is a constant time operation.
3. Post-C++20 solution
C++20 makes the solution easier by introducing a new algorithm: std::erase_if. So all you need to do is
std::erase_if(v, [](int i) { return i > 5; })
That simple and concise! The the time complexity is also linear.
Final notes
If you checked the reference of std::remove_if and std::erase_if, you’ve probably noticed they each have another version respectively: std::remove and std::erase. This is just a specialized version of std::remove_if or erase_if — the predicate is: return true if the element in the vector equals the input value.