mirror of https://github.com/boostorg/unordered.git, synced 2025-07-30 03:17:15 +02:00

added tutorial on boost::concurrent_flat_map

@@ -14,6 +14,7 @@ include::unordered/intro.adoc[]
 include::unordered/buckets.adoc[]
 include::unordered/hash_equality.adoc[]
 include::unordered/comparison.adoc[]
+include::unordered/concurrent_flat_map_intro.adoc[]
 include::unordered/compliance.adoc[]
 include::unordered/benchmarks.adoc[]
 include::unordered/rationale.adoc[]

doc/unordered/concurrent_flat_map_intro.adoc (new file, 180 lines)

@@ -0,0 +1,180 @@

[#concurrent_flat_map_intro]
= An introduction to boost::concurrent_flat_map

:idprefix: concurrent_flat_map_intro_

`boost::concurrent_flat_map` is a hash table that allows concurrent write/read access from
different threads without having to implement any synchronization mechanism on the user's side.

[source,c++]
----
std::vector<int> input;
boost::concurrent_flat_map<int, int> m;

...

// process input in parallel
const int num_threads = 8;
std::vector<std::jthread> threads;
std::size_t chunk = input.size() / num_threads; // how many elements per thread

for (int i = 0; i < num_threads; ++i) {
  threads.emplace_back([&, i] {
    // calculate the portion of input this thread takes care of
    std::size_t start = i * chunk;
    std::size_t end = (i == num_threads - 1) ? input.size() : (i + 1) * chunk;

    for (std::size_t n = start; n < end; ++n) {
      m.emplace(input[n], calculation(input[n]));
    }
  });
}
----

In the example above, threads access `m` without synchronization, just as we'd do in a
single-threaded scenario. In an ideal setting, if a given workload is distributed among
_N_ threads, execution is _N_ times faster than with one thread. This limit is
never attained in practice due to synchronization overheads and _contention_ (one thread
waiting for another to leave a locked portion of the map), but `boost::concurrent_flat_map`
is designed to perform with very little overhead and typically achieves _linear scaling_
(that is, performance is proportional to the number of threads, up to the number of
logical cores in the CPU).

== Visitation-based API

The first thing a new user of `boost::concurrent_flat_map` will notice is that this
class _does not provide iterators_ (which makes it technically
not a https://en.cppreference.com/w/cpp/named_req/Container[Container^]
in the C++ standard sense). The reason for this is that iterators are inherently
thread-unsafe. Consider this hypothetical code:

[source,c++]
----
auto it = m.find(k);  // A: get an iterator pointing to the element with key k
if (it != m.end()) {
  some_function(*it); // B: use the value of the element
}
----

In a multithreaded scenario, the iterator `it` may be invalid at point B if some other
thread issues an `m.erase(k)` operation between A and B. There are designs that
can remedy this by making iterators lock the element they point to, but this
approach lends itself to high contention and can easily produce deadlocks.
`operator[]` has similar concurrency issues, and is not provided by
`boost::concurrent_flat_map` either. Instead, element access is done through
so-called _visitation functions_:

[source,c++]
----
m.visit(k, [](const auto& x) { // x is the element with key k (if it exists)
  some_function(x);            // use it
});
----

The visitation function passed by the user (in this case, a lambda function)
is executed internally by `boost::concurrent_flat_map` in
a thread-safe manner, so it can access the element without worrying about other
threads interfering in the process. On the other hand, a
visitation function can _not_ access the container itself:

[source,c++]
----
m.visit(k, [&](const auto& x) {
  some_function(x, m.size()); // forbidden: m can't be accessed inside visitation
});
----

Access to a different container is allowed, though:

[source,c++]
----
m.visit(k, [&](const auto& x) {
  if (some_function(x)) {
    m2.insert(x); // OK, m2 is a different boost::concurrent_flat_map
  }
});
----

But, in general, visitation functions should be as lightweight as possible to
reduce contention and increase parallelization. In some cases, moving heavy work
outside of visitation may be beneficial:

[source,c++]
----
std::optional<value_type> o; // value_type is the map's element (key/mapped pair) type
bool found = m.visit(k, [&](const auto& x) {
  o = x; // copy the element out while it is locked
});
if (found) {
  some_heavy_duty_function(*o); // heavy work happens outside visitation
}
----

Visitation is pervasive in the API provided by `boost::concurrent_flat_map`, and
many classical operations have visitation-enabled variations:

[source,c++]
----
m.insert_or_visit(x, [](auto& y) {
  // if insertion failed because of an equivalent element y,
  // do something with it, for instance:
  ++y.second; // increment the mapped part of the element
});
----

Note that in this last example the visitation function could actually _modify_
the element: as a general rule, operations on a `boost::concurrent_flat_map` `m`
will grant visitation functions const/non-const access to the element depending on whether
`m` is const/non-const. Const access can always be explicitly requested
by using `cvisit` overloads (for instance, `insert_or_cvisit`) and may result
in higher parallelization. Consult the xref:#concurrent_flat_map[reference]
for a complete list of available operations.

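For illustration, a minimal sketch of requesting const access explicitly, assuming the non-const map `m` from the previous examples:

[source,c++]
----
// cvisit passes the element as const even though m is not const
m.cvisit(k, [](const auto& x) {
  some_function(x); // read-only access
});

// insert_or_cvisit: as insert_or_visit, but the pre-existing
// element, if any, is passed as const
m.insert_or_cvisit(x, [](const auto& y) {
  some_function(y);
});
----
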
== Whole-table visitation

In the absence of iterators, `boost::concurrent_flat_map` provides `visit_all`
as an alternative way to process all the elements in the map:

[source,c++]
----
m.visit_all([](auto& x) {
  x.second = 0; // reset the mapped part of the element
});
----

With C++17 compilers that implement the standard parallel algorithms, whole-table
visitation can be parallelized:

[source,c++]
----
// requires #include <execution>
m.visit_all(std::execution::par, [](auto& x) { // run in parallel
  x.second = 0; // reset the mapped part of the element
});
----

There is another whole-table visitation operation, `erase_if`:

[source,c++]
----
m.erase_if([](auto& x) {
  return x.second == 0; // erase the elements whose mapped value is zero
});
----

`erase_if` can also be parallelized. Note that, in order to increase efficiency,
these operations do not block the table during execution: this implies that elements
may be inserted, modified or erased by other threads during visitation. It is
advisable not to assume too much about the exact global state of a `boost::concurrent_flat_map`
at any point in your program.

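As a sketch, the parallel form mirrors parallel `visit_all` (again assuming the `<execution>` header is available):

[source,c++]
----
m.erase_if(std::execution::par, [](auto& x) {
  return x.second == 0; // erase, in parallel, the elements whose mapped value is zero
});
----
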
== Blocking operations

``boost::concurrent_flat_map``s can be copied, assigned, cleared and merged just like any
Boost.Unordered container. Unlike most other operations, these are _blocking_,
that is, all other threads are prevented from accessing the tables involved while a copy, assignment,
clear or merge operation is in progress. Blocking is taken care of automatically by the library
and the user need not take any special precautions, but overall performance may be affected.

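A quick sketch of these blocking operations, assuming the maps `m` and `m2` from the previous examples:

[source,c++]
----
auto snapshot = m; // copy construction: m is blocked while it is copied
m2.clear();        // m2 is blocked while it is emptied
m.merge(m2);       // m and m2 are blocked while elements move from m2 to m
----
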
Another blocking operation is _rehashing_, which happens explicitly via `rehash`/`reserve`
or during insertion when the table's load hits `max_load()`. As with non-concurrent hashmaps,
reserving space in advance of bulk insertions will generally speed up the process.
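
For instance, a sketch assuming the number of elements to insert is known up front:

[source,c++]
----
boost::concurrent_flat_map<int, int> m;
m.reserve(input.size()); // one rehash now rather than several during insertion
----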

@@ -216,3 +216,13 @@ for more details.

There are other differences, which are listed in the
<<comparison,Comparison with Associative Containers>> section.

== A concurrent hashmap

Starting in Boost 1.83, Boost.Unordered provides `boost::concurrent_flat_map`,
a thread-safe hash table for high-performance multithreaded scenarios. Although
it shares the internal data structure and most of the algorithms with Boost.Unordered's
open-addressing container `boost::unordered_flat_map`, ``boost::concurrent_flat_map``'s API departs significantly
from that of C++ unordered associative containers to make this table suitable for
concurrent usage. Consult the xref:#concurrent_flat_map_intro[dedicated tutorial]
for more information.
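
As a minimal sketch of that departure (hypothetical values; element access goes through visitation rather than iterators or `operator[]`):

[source,c++]
----
#include <boost/unordered/concurrent_flat_map.hpp>

boost::concurrent_flat_map<int, int> m;
m.insert({1, 100});            // returns bool rather than an iterator/bool pair
m.visit(1, [](const auto& x) { // element access happens via visitation
  // use x
});
----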