Ivan Čukić

dr Ivan Čukić

Author of "Functional Programming in C++", KDE developer and Free/Libre software enthusiast

For inquiries about KDE projects (requesting features, submitting bugs, asking general questions etc.), you can use:

KDE Community Forum
Bug reporting site
#plasma IRC channel

You need to contact me directly?

KDE related:
ivan.cukic [at] kde.org

Non KDE related things:
ivan [at] this domain

IRC Network: libera.chat
Nick: ivan|home
Channel: #plasma

Projects

Blog categories

Functional Programming in C++

If you like C++, you might be interested in my book. It contains quite a few gems of modern C++ programming.

It is available for purchase at:

Patronage

C++ Russia 2018 and template meta-programming in modern C++

Published in the Prog C++ section, on 15 April 2018

Tomorrow, I’ll be heading to Piter (St. Petersburg) for acclimatization training before the C++ Russia 2018 conference starts.

While most talks at C++ Russia are in Russian, this year will have more than a few amazing talks in English – from famous names like Andrei Alexandrescu, Jon Kalb and Herb Sutter, to a few less famous ones including yours truly. :)

My talk will be a bit out of my comfort zone – it will not be about monads nor any other aspect of functional programming. I’ll be talking about modern meta-programming for C++17 and 20 – the void_t meta-function and the detection idiom.

If you happen to be close to Piter, join us (the tickets are available at cppconf.ru) or if you just want to grab a beer, send me an e-mail.

Plasma Vault with KDE Connect, and more

Published in the KDE development: Plasma Vault section, on 14 April 2018

There have been a few smaller improvements to the Plasma Vault pushed to master in the past few days, scheduled for release in Plasma 5.13.

KDE Connect

Imagine you left your computer unattentded, unlocked with a few Vaults open. And you have a few spies from a competing company roaming around.

Thanks to the new feature of KDE Connect, it is possible to define commands that you can execute remotely from you phone.

This amazing feature can be used to lock your screen, but also to close the open vaults thanks to the new D-Bus commands of the Vault system.

The commands are defined in the KDE Connect Settings module. Select the phone you want to be able to run the commands from and go to configure the ‘Run commands’ plugin. There you can add any commands you wish, and they will automatically appear on your phone in the KDE Connect application.

For locking the session, you can create a new command that executes

qdbus org.kde.screensaver /ScreenSaver Lock

For closing the vaults, you have two options:

qdbus org.kde.kded5 /modules/plasmavault closeAllVaults
qdbus org.kde.kded5 /modules/plasmavault forceCloseAllVaults

The difference between these two commands is that closeAllVaults will close only the vaults that are not used by a running application, while forceCloseAllVaults will try to kill all those applications and close all vaults.

UI polish

A few things got more polished. The offline vaults I wrote about the last time got a nifty emblem to differentiate them from ordinary vaults:

Also, error messages in the password dialogue got more prominent:

Next steps

The next task will be to rewrite the CryFS backend to use the features added in the 0.9.9 version. This will provide much better error handling and will finally allow CryFS to be the default encryption mechanism for Plasma Vault.

Offline Vaults for an extra layer of protection

Published in the KDE development: Plasma Vault section, on 10 April 2018

I’m slowly returning to KDE development after a few months of being mostly in bugfix mode due to my other-life obligations (more on that later), so I decided to implement a new feature for my youngest project – the Plasma Vault.

One of the possible attack vectors to your Plasma Vaults is that people could potentially have access to your computer while the vault is open.

This is not a problem if we consider direct access because it is something that is easily controlled – you see everyone who approaches your computer, but the problem can be remote access.

We have recently seen that even CPUs can sometimes be used as attack vectors, it is common for the web browsers to be, and obviously, through social engineering, the user can also be used to gain remote access to the system.

For this reason, starting with the next Plasma release, if you have an extremely sensitive data you can mark a vault as offline-only.

This means that networking will be shut down as soon as you open the offline-only vault so that any potential remote access is inhibited.

The networking will be restored as soon as you close all offline-only vaults.

For this to work, you will need to use Network Manager as it is the supported networking interface in Plasma. I know it might be a burden for some of you (it is for me), but there is no alternative at the moment.

There is much more to std::for_each than meets the eye

Published in the Prog C++ section, on 3 April 2018

A few days ago, I saw Jonathan Boccara’s post on std::for_each. While I haven’t decided yet on whether I agree with everything said there, he makes a really nice point about the levels of abstraction and that even if they look similar enough, the std::for_each algorithm and the range-based for loop are not really meant for the same thing. And I couldn’t agree more.

One of the slides that I often had in my talks is exactly about this distinction. I used to ask “What is the difference between these two?”:

for (item: items) {
    // do something
}

for_each(items, [] (item) {
    // do something
});

The answers I usually got at first were about how each of these handle references, and about the for_each algorithm being able to process just a part of a collection, or to process the elements in a reversed order since it receives a pair of iterators and we can pass in the reverse iterators rbegin and rend. Or that it can use const iterators instead of the normal ones. And more along those lines.

These answers are all correct, but all of these changes can be bypassed one way or the other.

             THE ABSTRACT
                 ___                                  _
                / __)                                | |
              _| |__ ___   ____     _____ _____  ____| |__
             (_   __) _ \ / ___)   | ___ (____ |/ ___)  _ \
               | | | |_| | |_______| ____/ ___ ( (___| | | |
               |_|  \___/|_(_______)_____)_____|\____)_| |_|

                                                  CHRONICLES

Abstractions

Now, this slide intentionally contained pseudo-code – no type declaration for the item variable in the range-based for loop nor for the lambda argument and a for_each receiving a collection (like in boost) instead of an iterator pair. The idea was to try to make people not to focus on these things but on the bigger picture.

While the range-based for loop is quite useful, it is nothing more than an ordinary for loop that operates on iterators. Every time we write a range-based for loop, the compiler sees this in C++17:

{
    auto && __range = range_expression ;
    auto __begin = begin_expr ;
    auto __end = end_expr ;
    for ( ; __begin != __end; ++__begin) {
        range_declaration = *__begin;
        loop_statement
    }
}

Whatever you do, you can not change this (unless, of course, you write a proposal for changing the C++ standard, it gets voted in, and the compiler vendors implement this change).

But, what happens when we call for_each?

for_each(begin, end, function);

There is one thing intentionally missing from this code snippet – we are calling for_each without specifying its namespace. If begin and end are iterators to a collection from the standard library, the above code will call the std::for_each algorithm thanks to the ADL - argument-dependent lookup (formerly known as Koenig lookup).

ADL specifies that, when looking up an unqualified function name, the compiler should try to find that function in the namespace of its arguments’ types in addition to the normal name lookup. In the case where begin and end are iterators to a std::vector<T>, the iterators live in the std:: namespace, and the compiler will find the std::for_each algorithm.

What is for_each?

This all means that, unlike the range-based for loop which has a fixed semantics to always iterate over a given sequence and execute the body of the loop for each element in that sequence, the call to for_each can mean absolutely anything. It depends on the type passed to it.

You can be anything you want to be.
Just turn yourself into anything
you think you could ever be.

~ Innuendo by Queen

Imagine the following – we want to create a UI for drawing graphs, and we want to make it create a new graph node every time the user clicks somewhere on the canvas component.

We want to create a new graph node for each mouse click.

Mouse clicks are not stored in a vector or a similar collection, they just appear one by one asynchronously when the user clicks the mouse. We can not use iterators to go through all the clicks and therefore we can not use the range-based for loop to iterate trough them and create graph nodes. It is a some kind of sequence, but it is an asynchronous sequence and is not iterable.

But again, we want to create a new graph node for each mouse click. And the code should be able to express that.

Consider the following code snippet:

for_each(begin(canvas.click_events),
         end(canvas.click_events),
         [] (point2d coordinate) {
            // create a new graph node
         }
    );

Looking at it, it is obvious what it does – for each mouse click event that happens on the canvas object, it creates a new graph node.

In the same way a plus operator can mean different things depending on the type it operates on – number addition, string concatenation, set union, etc. – for_each can mean different things for different types. It should always be used to define a function to be executed for each element, like the plus operator should always do some sort of addition, but the meaning of what for each element means for a given type can vary significantly.

In the mouse click example, it would mean “register this lambda as the mouse click event handler” (or “connect the clicked signal to the lambda” if you’re into Qt).

This idea can work on other types as well – for a std::optional<T>, it would mean “execute on the value stored inside the optional, if the value exists”; for a std::future<T>, it would mean “when the future value arrives, execute the given function on it”; and so on.

What’s the point

Now, you might be wondering what would be the point of using for_each for these things. Why not just use the regular event handler registration mechanisms to listen to mouse clicks, use the .then on the future values, etc.?

Just like iterators are abstractions over pointers that allow us to write generic code which works on any sane collection, for_each is a common abstraction that can work on anything for each has a meaning for. It is generic programming and polymorphism at its finest.

One of the immediate benefits of using abstractions like this in your code is that it becomes much more easy to test.

Do you want to test whether the program is creating the graph nodes in the right places? Just run the same code that uses for_each to listen to mouse clicks, but replace the event stream with a pre-populated std::vector<point2d> containing any coordinates you like. No need for mocking and no need for complex testing systems or anything.

How to implement?

The for_each can easily be implemented for various types. We just need to implement the begin and end functions for our types to have a return type that lives in a custom namespace (this return type does not need to have anything in common with iterators if the types we are implementing begin and end functions for are not iterable), and implement the for_each function for these types in that custom namespace.

For the mouse events, The begin function could return a value of a ui::event_definition type (which would, for example, contain the object we are listening the events on and the type of events we are interested in) while the ui::for_each function would do the event handler registration. It does not matter what end returns as it is not used in this case since the mouse event streams are potentially infinite (the user can click on the canvas object for as long as that object exists).

C++17 and parallel algorithms in STL - setting up

Published in the Prog C++ section, on 29 March 2018

C++17 has introduced parallelized versions of many standard algorithms like count, count_if, copy and many others. Some algorithms like accumulate and inner_product, got parallelized siblings reduce and transform_reduce which do not process the collections in-order.

While GCC and Clang have a really great support for other parts of the C++17 standard, the implementation of parallelized algorithms is missing in both libstdc++ and libc++ (it is being worked on).

Fortunately, there are 3rd-party libraries like HPX and Intel’s PSTL library that can be used until GCC and Clang standard library developers catch up.

Setting up PSTL

The latest release of PSTL can be downloaded from intel/parallelstl. You can extract it anywhere you like, I keep it in /usr/local/intel-pstl (you can choose whichever install prefix you want, I tend to keep everything that is not installed through APT in separate directories in /usr/local). Also, you will need Intel TBB2018. If your distribution does not provide an up-to-date version (Debian, for example, has TBB2017 which is not sufficient to run PSTL) you can get it from p01orgu/tbb.

If you downloaded TBB from github, do not forget to set the LD_LIBRARY_PATH environment variable to include the directory where the .so files are located. In my case, the directory is /usr/local/intel-tbb/lib/intel64/gcc4.7 (it works with new versions of GCC, not only 4.x).

The last step is to add a few compiler and linker flags for your project, and you are ready to speed up your computations. For the compiler flags, you need to add the include dirs and to enable OpenMP SIMD. For GCC, it means doing this (just change the paths to the include directories):

CXXFLAGS = -std=c++17 -I/usr/local/intel-tbb/include \
           -I/usr/local/intel-pstl/include -fopenmp-simd

As far as the linker goes, it just needs to know it needs to link the tbb library:

LDFLAGS = -L/usr/local/intel-tbb/lib/intel64/gcc4.7 -ltbb

Setting up HPX

Setting up the HPX library is a bit simpler. You can get it library by cloning git://github.com/STEllAR-GROUP/hpx.git (you are advised to do a --depth 1 clone if you do not plan to contribute to HPX). After that, just do an out-of-the-source build using cmake. The only prerequisites for building HPX are boost and hwloc libraries, so make sure you have their development packages installed. Optionally, you can install tcmalloc or jemalloc development packages to provide efficient multi-threading malloc for HPX to use, or, alternatively, you can pass the -DHPX_WITH_MALLOC=system flag to CMake.

mkdir build
cd !$
cmake .. -DCMAKE_INSTALL_PREFIX=/usr/local/hpx -DHPX_WITH_MALLOC=jemalloc
make -j4

When compiling a program, you just need to link against the hpx library (and set the appropriate include and library paths like it was the case with PSTL).

Usage

Both libraries provide a similar API – based on the C++17 standard. They differ in the namespace they use and in which headers you need to include. The PSTL library is meant to be used as a part of the standard library, so it uses std:: namespace while HPX is a proper 3rd-party library and defines the parallel algorithms and policies in the hpx::parallel namespace. These differences can be easily remedied by defining a std_par macro to point to the right namespace for the library you want to use.

#include <iostream>
#include <vector>

#ifdef USE_HPX

    #include <hpx/hpx_init.hpp>
    #include <hpx/hpx.hpp>
    #include <hpx/include/parallel_numeric.hpp>
    #include <hpx/include/parallel_algorithm.hpp>
    #include <hpx/parallel/algorithms/fill.hpp>

    #define std_par hpx::parallel

#elif USE_INTEL_PSTL

    #include <pstl/execution>
    #include <pstl/numeric>
    #include <pstl/algorithm>

    #define std_par std

#else

    // Use the standard library implementation
    #include <execution>
    #include <numeric>
    #include <algorithm>

    #define std_par std

#endif

int main(int argc, char *argv[])
{
    using std_par::execution::par;

    std::vector<int> xs(100000000);

    std_par::fill(par, std::begin(xs), std::end(xs), 42);
    std::cout
        << std_par::reduce(par, std::begin(xs), std::end(xs), 0)
        << std::endl;
}

Speed

It is good that we have two separate implementations of parallel STL algorithms to choose from. If you use only the algorithms defined by the C++17 standard, it will be easy to switch from one implementation to another to find the one most efficient for your particular use-case.

The HPX library seems to be significantly faster on my system (not Intel-based) with GCC than Intel’s PSTL, but your mileage might vary.

Newer posts Older posts