A few days ago, I saw Jonathan
Boccara’s post on std::for_each. While I haven’t
decided yet on whether I agree with everything said there, he makes a
really nice point about the levels of abstraction and that even if they
look similar enough, the std::for_each
algorithm and the
range-based for loop are not really meant for the same thing. And I
couldn’t agree more.
One of the slides that I often had in my talks is exactly about this
distinction. I used to ask “What is the difference between these
two?”:
for (item: items) {
// do something
}
for_each(items, [] (item) {
// do something
});
The answers I usually got at first were about how each of these
handles references, and about the for_each algorithm being
able to process just a part of a collection, or to process the elements
in reverse order, since it receives a pair of iterators and we can
pass in the reverse iterators rbegin and rend.
Or that it can use const iterators instead of the normal ones. And more
along those lines.
These answers are all correct, but all of these differences can be
worked around one way or another.
THE ABSTRACT for_each CHRONICLES
Abstractions
Now, this slide intentionally contained pseudo-code – no type
declaration for the item variable in the range-based for
loop nor for the lambda argument, and a for_each receiving a
collection (like in Boost) instead of an iterator pair. The
idea was to get people to focus not on these details but on the
bigger picture.
While the range-based for loop is quite useful, it is nothing more
than an ordinary for loop that operates on iterators. Every time we
write a range-based for loop, the compiler sees this in C++17:
{
auto && __range = range_expression ;
auto __begin = begin_expr ;
auto __end = end_expr ;
for ( ; __begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}
Whatever you do, you cannot change this (unless, of course, you
write a proposal for changing the C++ standard, it gets voted in, and
the compiler vendors implement this change).
But what happens when we call for_each?
for_each(begin, end, function);
There is one thing intentionally missing from this code snippet – we
are calling for_each without specifying its namespace. If
begin and end are iterators to a collection
from the standard library, the above code will call the
std::for_each algorithm thanks to ADL,
argument-dependent lookup (formerly known as Koenig lookup).
ADL specifies that, when looking up an unqualified function name, the
compiler should also search the namespaces associated with the types of
its arguments, in addition to the normal name lookup. In the case
where begin and end are iterators to a
std::vector<T>, the iterator type is associated with the
std namespace in the mainstream standard library implementations,
so the compiler will find the std::for_each algorithm.
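To make the lookup rule concrete, here is a minimal sketch of ADL at work. The shapes namespace, the polygon type and the helper functions are invented for this illustration; only std::for_each and std::vector are real. The unqualified for_each call relies on the vector iterator type being associated with the std namespace, which holds for libstdc++, libc++ and the MSVC library but is not strictly guaranteed by the standard.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

namespace shapes {  // hypothetical namespace, for illustration only
    struct polygon { int corners; };

    // A free function living next to the type it operates on.
    int count_corners(const polygon& p) { return p.corners; }
}

// Sums a vector through an unqualified call to for_each.
int sum_of(const std::vector<int>& values) {
    int total = 0;
    // No std:: prefix: the iterator type is associated with namespace std
    // on the mainstream implementations, so ADL finds std::for_each.
    for_each(values.begin(), values.end(), [&total](int v) { total += v; });
    return total;
}

int corners_of_triangle() {
    shapes::polygon triangle{3};
    // No shapes:: prefix either: ADL searches namespace shapes because
    // the argument is a shapes::polygon.
    return count_corners(triangle);
}

void demo_adl() {
    assert(sum_of({1, 2, 3}) == 6);
    assert(corners_of_triangle() == 3);
}
```

Both calls compile without a namespace qualifier because the compiler picks the namespace from the argument types, which is exactly the hook that lets for_each mean something different for different types.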
What is for_each?
This all means that, unlike the range-based for loop, which has
fixed semantics (always iterate over a given sequence and execute the
body of the loop for each element in that sequence), the call to
for_each can mean absolutely anything. It depends on the
types of the arguments passed to it.
You can be anything you want to be.
Just turn yourself into
anything
you think you could ever be.
~ Innuendo by Queen
Imagine the following – we want to create a UI for drawing graphs,
and we want to make it create a new graph node every time the user
clicks somewhere on the canvas component.
We want to create a new graph node for each mouse click.
Mouse clicks are not stored in a vector or a similar collection; they
just appear one by one, asynchronously, when the user clicks the mouse. We
cannot use iterators to go through all the clicks, and therefore we
cannot use the range-based for loop to iterate through them and create graph
nodes. It is some kind of sequence, but it is an asynchronous sequence
and is not iterable.
But again, we want to create a new graph node for each mouse
click. And the code should be able to express that.
Consider the following code snippet:
for_each(begin(canvas.click_events),
end(canvas.click_events),
[] (point2d coordinate) {
// create a new graph node
}
);
Looking at it, it is obvious what it does – for each mouse click
event that happens on the canvas object, it creates a new graph
node.
In the same way a plus operator can mean different things depending
on the type it operates on – number addition, string concatenation, set
union, etc. – for_each can mean different things for
different types. It should always be used to define a function to be
executed for each element, like the plus operator should always do some
sort of addition, but what “for each element”
means for a given type can vary significantly.
In the mouse click example, it would mean “register this lambda as
the mouse click event handler” (or “connect the clicked signal
to the lambda” if you’re into Qt).
This idea can work on other types as well – for a
std::optional<T>, it would mean “execute on the value
stored inside the optional, if the value exists”; for a
std::future<T>, it would mean “when the future value
arrives, execute the given function on it”; and so on.
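As a rough sketch of the std::optional case, the following shows one way such a for_each could look. The sketch namespace and the visits helper are made up for this example and are not part of any library. Since std::optional lives in std and we are not allowed to add overloads there, this version takes the optional directly (like the collection-style for_each from the slide) and is called with its namespace spelled out.

```cpp
#include <cassert>
#include <optional>
#include <string>

namespace sketch {  // hypothetical namespace, not part of any library

    // "For each value in the optional": call fn once if a value is
    // present, and not at all if the optional is empty.
    template <typename T, typename F>
    void for_each(const std::optional<T>& maybe, F fn) {
        if (maybe) {
            fn(*maybe);
        }
    }
}

// Counts how many times the handler runs for a given optional.
int visits(const std::optional<std::string>& name) {
    int calls = 0;
    sketch::for_each(name, [&calls](const std::string&) { ++calls; });
    return calls;
}

void demo_optional() {
    assert(visits(std::string("node")) == 1);  // value present: one call
    assert(visits(std::nullopt) == 0);         // empty: no calls
}
```

The body of the lambda does not need to know whether it is being run over a container, an optional, or something else entirely; only the for_each overload does.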
What’s the point?
Now, you might be wondering what would be the point of using
for_each for these things. Why not just use the regular
event handler registration mechanisms to listen to mouse clicks, use
.then on the future values, etc.?
Just like iterators are abstractions over pointers that allow us to
write generic code which works on any sane collection,
for_each is a common abstraction that can work on anything
for which “for each” has a meaning. It is generic programming and
polymorphism at its finest.
One of the immediate benefits of using abstractions like this in your
code is that it becomes much easier to test.
Do you want to test whether the program is creating the graph nodes
in the right places? Just run the same code that uses
for_each to listen to mouse clicks, but replace the event
stream with a pre-populated std::vector<point2d>
containing any coordinates you like. No need for mocking and no need for
complex testing systems.
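A test along those lines could be sketched as follows. Everything here except the standard library is an assumption made for the example: the point2d and graph structs are minimal stand-ins, and create_nodes_for is a hypothetical name for "the code that uses for_each". The using-declarations are the usual fallback so that standard containers work even where ADL alone would not be enough.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

struct point2d { int x; int y; };  // stand-in for the article's point2d

struct graph {                     // hypothetical, minimal graph model
    std::vector<point2d> nodes;
};

// The code under test: creates one node per element of `clicks`.
// It relies only on begin, end and for_each, so the same template accepts
// any source for which those three calls are meaningful.
template <typename ClickSource>
void create_nodes_for(ClickSource& clicks, graph& g) {
    using std::begin;     // fallbacks for standard containers;
    using std::end;       // ADL can still select overloads that live
    using std::for_each;  // next to a custom click source type
    for_each(begin(clicks), end(clicks),
             [&g](point2d coordinate) { g.nodes.push_back(coordinate); });
}

// The test itself: a plain vector plays the role of the event stream.
void demo_vector_as_click_source() {
    std::vector<point2d> fake_clicks{{10, 20}, {30, 40}};
    graph g;
    create_nodes_for(fake_clicks, g);
    assert(g.nodes.size() == 2);
    assert(g.nodes[0].x == 10 && g.nodes[0].y == 20);
    assert(g.nodes[1].x == 30 && g.nodes[1].y == 40);
}
```

In production the same template would be instantiated with the real click source; in the test it is instantiated with a vector, and no mocking framework is involved.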
How to implement?
The for_each function can easily be implemented for various
types. We just need to implement the begin and
end functions for our types so that their return type
lives in a custom namespace (this return type does not need to have
anything in common with iterators if the types we are implementing
begin and end for are not iterable),
and implement the for_each function for these types in that
custom namespace.
For the mouse events, the begin function could return a
value of a ui::event_definition type (which would, for
example, contain the object we are listening for events on and the type
of events we are interested in), while the ui::for_each
function would do the event handler registration. It does not matter
what end returns, as it is not used in this case: the
mouse event streams are potentially infinite (the user can click on the
canvas object for as long as that object exists).
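Here is a minimal sketch of that scheme, under heavy assumptions. Only the names ui::event_definition and ui::for_each come from the text; click_stream, event_end, emit and canvas_widget are invented for the example, and a real UI toolkit would register the handler with its own event system instead of storing it in a vector.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

struct point2d { int x; int y; };

namespace ui {  // the types below are hypothetical, following the text

    // A source of click events; handlers are stored and called per click.
    struct click_stream {
        std::vector<std::function<void(point2d)>> handlers;

        // Simulates the user clicking at `where`.
        void emit(point2d where) {
            for (auto& handler : handlers) handler(where);
        }
    };

    // What begin() returns: not an iterator, just a description of
    // the stream we want to listen to.
    struct event_definition { click_stream* source; };

    // What end() returns: an empty tag, since the stream has no end.
    struct event_end {};

    event_definition begin(click_stream& stream) { return {&stream}; }
    event_end end(click_stream&) { return {}; }

    // "For each click": register fn as a handler on the stream.
    template <typename F>
    void for_each(event_definition events, event_end, F fn) {
        events.source->handlers.emplace_back(std::move(fn));
    }
}

struct canvas_widget { ui::click_stream click_events; };

int nodes_created_after_two_clicks() {
    canvas_widget canvas;
    int nodes_created = 0;
    // Same shape as the earlier snippet; ADL finds ui::begin, ui::end
    // and ui::for_each through the ui types involved.
    for_each(begin(canvas.click_events), end(canvas.click_events),
             [&nodes_created](point2d) { ++nodes_created; });
    canvas.click_events.emit({1, 2});
    canvas.click_events.emit({3, 4});
    assert(nodes_created == 2);
    return nodes_created;
}
```

Note that for_each returns immediately here; the lambda runs later, once per emitted click, which is the asynchronous reading of "for each" described above.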