mirror of
https://github.com/boostorg/unordered.git
synced 2026-05-19 23:24:44 +02:00
Revert "update documentation to use antora"
This reverts commit 3c452f93c5.
@@ -1,725 +0,0 @@
[#benchmarks]
:idprefix: benchmarks_

= Benchmarks

== boost::unordered_[multi]set

All benchmarks were created using `unordered_set<unsigned int>` (non-duplicate) and `unordered_multiset<unsigned int>` (duplicate). The source code can be https://github.com/boostorg/boost_unordered_benchmarks/tree/boost_unordered_set[found here^].

The insertion benchmarks insert `n` random values, where `n` is between 10,000 and 3 million. For the duplicated benchmarks, the same random values are repeated an average of 5 times.
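As a rough illustration of the insertion workload described above, the following sketch generates `n` pseudo-random values and inserts them one by one. It is not the benchmark code from the linked repository: `make_values` and `insert_all` are hypothetical helpers, the standard containers stand in for the Boost ones, and drawing from a pool of about `n/5` distinct values is only one assumed way to get values that repeat roughly 5 times on average.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns n pseudo-random values. When `duplicates` is
// true, values are drawn from a pool of about n/5 distinct numbers, so each
// value appears roughly 5 times on average.
std::vector<unsigned int> make_values(std::size_t n, bool duplicates, unsigned int seed)
{
    assert(n > 0);
    std::mt19937 rng(seed);
    std::vector<unsigned int> values;
    values.reserve(n);
    if (!duplicates) {
        for (std::size_t i = 0; i < n; ++i)
            values.push_back(static_cast<unsigned int>(rng()));
        return values;
    }
    std::vector<unsigned int> pool(n / 5 + 1);
    for (auto& x : pool) x = static_cast<unsigned int>(rng());
    std::uniform_int_distribution<std::size_t> pick(0, pool.size() - 1);
    for (std::size_t i = 0; i < n; ++i) values.push_back(pool[pick(rng)]);
    return values;
}

// Hypothetical helper: inserts every value, one at a time, into a fresh container.
template <class Container>
Container insert_all(const std::vector<unsigned int>& values)
{
    Container c;
    for (unsigned int v : values) c.insert(v);
    return c;
}
```

For example, `insert_all<std::unordered_multiset<unsigned int>>(make_values(10000, true, 42))` keeps all 10,000 elements, while the same call with `std::unordered_set` keeps only the distinct ones.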

The erasure benchmarks erase all `n` elements in random order until the container is empty. Erasure by key uses `erase(const key_type&)` to remove entire groups of equivalent elements in each operation.
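The difference between erasing single elements and erasing by key can be sketched as follows. This is an illustration only, using `std::unordered_multiset` as a stand-in for `boost::unordered_multiset`; `erase_by_key` is a hypothetical helper name, not part of the benchmark sources.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Erases by key: each call to erase(const key_type&) removes the whole group
// of elements equivalent to `k`. Returns the number of calls that removed at
// least one element.
std::size_t erase_by_key(std::unordered_multiset<unsigned int>& c,
                         const std::vector<unsigned int>& keys)
{
    std::size_t groups = 0;
    for (unsigned int k : keys) {
        std::size_t const before = c.size();
        std::size_t const removed = c.erase(k);  // returns how many elements were erased
        assert(c.size() + removed == before);
        if (removed > 0) ++groups;
    }
    return groups;
}
```

With a container holding `{1, 1, 1, 2, 2, 3}`, three successful calls are enough to empty it, whereas erasing element by element would take six.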

The successful lookup benchmarks look up all `n` values in their original insertion order.

The unsuccessful lookup benchmarks look up `n` randomly generated integers produced with a different seed value.
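A minimal sketch of the two lookup workloads, assuming a `std::mt19937` generator and `std::unordered_set` as a stand-in container (the actual benchmark sources may use different generators and helpers; `random_values` and `count_hits` are hypothetical names):

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical helper: n pseudo-random values from the given seed.
std::vector<unsigned int> random_values(std::size_t n, unsigned int seed)
{
    std::mt19937 rng(seed);
    std::vector<unsigned int> v(n);
    for (auto& x : v) x = static_cast<unsigned int>(rng());
    return v;
}

// Looks up every query in order and counts how many are found.
std::size_t count_hits(const std::unordered_set<unsigned int>& c,
                       const std::vector<unsigned int>& queries)
{
    assert(!queries.empty());
    std::size_t hits = 0;
    for (unsigned int q : queries)
        if (c.find(q) != c.end()) ++hits;
    return hits;
}
```

Querying with the inserted values in insertion order finds every one of them. Querying with values from a different seed misses almost always, although an occasional coincidental hit is possible with random 32-bit values.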
|
||||
|
||||
=== GCC 12 + libstdc++-v3, x64
|
||||
|
||||
==== Insertion
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/gcc/running insertion.xlsx.practice.png[width=250,link=_images/benchmarks-set/gcc/running insertion.xlsx.practice.png,window=_blank]
|
||||
|image::benchmarks-set/gcc/running insertion.xlsx.practice non-unique.png[width=250,link=_images/benchmarks-set/gcc/running insertion.xlsx.practice non-unique.png,window=_blank]
|
||||
|image::benchmarks-set/gcc/running insertion.xlsx.practice non-unique 5.png[width=250,link=_images/benchmarks-set/gcc/running insertion.xlsx.practice non-unique 5.png,window=_blank]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/gcc/running insertion.xlsx.practice norehash.png[width=250,link= _images/benchmarks-set/gcc/running insertion.xlsx.practice norehash.png,window=_blank]
|
||||
|image::benchmarks-set/gcc/running insertion.xlsx.practice norehash non-unique.png[width=250,link= _images/benchmarks-set/gcc/running insertion.xlsx.practice norehash non-unique.png,window=_blank]
|
||||
|image::benchmarks-set/gcc/running insertion.xlsx.practice norehash non-unique 5.png[width=250,link= _images/benchmarks-set/gcc/running insertion.xlsx.practice norehash non-unique 5.png,window=_blank]
|
||||
|
||||
h|non-duplicate elements, +
|
||||
prior `reserve`
|
||||
h|duplicate elements, +
|
||||
prior `reserve`
|
||||
h|duplicate elements, +
|
||||
max load factor 5, +
|
||||
prior `reserve`
|
||||
|
||||
|===
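The "max load factor 5" and "prior `reserve`" configurations named in the captions above could be set up along these lines. This is a sketch with `std::unordered_multiset` standing in for the Boost container, and `make_reserved` is a hypothetical helper rather than code taken from the benchmark repository.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Builds an empty multiset with the given maximum load factor and with room
// reserved for n elements, so that inserting up to n elements does not rehash.
std::unordered_multiset<unsigned int> make_reserved(std::size_t n, float mlf)
{
    assert(mlf > 0.0f);
    std::unordered_multiset<unsigned int> c;
    c.max_load_factor(mlf);  // set first, so reserve sizes the bucket array for it
    c.reserve(n);            // enough buckets for n elements at this load factor
    return c;
}
```

Calling `make_reserved(n, 5.0f)` before the insertion loop corresponds to the right-hand plot; leaving out the `reserve` call gives the plain "max load factor 5" variant.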
|
||||
|
||||
==== Erasure
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/gcc/scattered erasure.xlsx.practice.png[width=250,link= _images/benchmarks-set/gcc/scattered erasure.xlsx.practice.png,window=_blank]
|
||||
|image::benchmarks-set/gcc/scattered erasure.xlsx.practice non-unique.png[width=250,link= _images/benchmarks-set/gcc/scattered erasure.xlsx.practice non-unique.png,window=_blank]
|
||||
|image::benchmarks-set/gcc/scattered erasure.xlsx.practice non-unique 5.png[width=250,link= _images/benchmarks-set/gcc/scattered erasure.xlsx.practice non-unique 5.png,window=_blank]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|
|
||||
|image::benchmarks-set/gcc/scattered erasure by key.xlsx.practice non-unique.png[width=250,link= _images/benchmarks-set/gcc/scattered erasure by key.xlsx.practice non-unique.png,window=_blank]
|
||||
|image::benchmarks-set/gcc/scattered erasure by key.xlsx.practice non-unique 5.png[width=250,link= _images/benchmarks-set/gcc/scattered erasure by key.xlsx.practice non-unique 5.png,window=_blank]
|
||||
|
||||
|
|
||||
h|by key, duplicate elements
|
||||
h|by key, duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
==== Successful lookup
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/gcc/scattered successful looukp.xlsx.practice.png[width=250,window=_blank,link= _images/benchmarks-set/gcc/scattered successful looukp.xlsx.practice.png]
|
||||
|image::benchmarks-set/gcc/scattered successful looukp.xlsx.practice non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/gcc/scattered successful looukp.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/gcc/scattered successful looukp.xlsx.practice non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/gcc/scattered successful looukp.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
==== Unsuccessful lookup
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/gcc/scattered unsuccessful looukp.xlsx.practice.png[width=250,window=_blank,link= _images/benchmarks-set/gcc/scattered unsuccessful looukp.xlsx.practice.png]
|
||||
|image::benchmarks-set/gcc/scattered unsuccessful looukp.xlsx.practice non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/gcc/scattered unsuccessful looukp.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/gcc/scattered unsuccessful looukp.xlsx.practice non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/gcc/scattered unsuccessful looukp.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
=== Clang 15 + libc++, x64
|
||||
|
||||
==== Insertion
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/clang_libcpp/running insertion.xlsx.practice.png[width=250, window=_blank,link= _images/benchmarks-set/clang_libcpp/running insertion.xlsx.practice.png]
|
||||
|image::benchmarks-set/clang_libcpp/running insertion.xlsx.practice non-unique.png[width=250, window=_blank,link= _images/benchmarks-set/clang_libcpp/running insertion.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/clang_libcpp/running insertion.xlsx.practice non-unique 5.png[width=250, window=_blank,link= _images/benchmarks-set/clang_libcpp/running insertion.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/clang_libcpp/running insertion.xlsx.practice norehash.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/running insertion.xlsx.practice norehash.png]
|
||||
|image::benchmarks-set/clang_libcpp/running insertion.xlsx.practice norehash non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/running insertion.xlsx.practice norehash non-unique.png]
|
||||
|image::benchmarks-set/clang_libcpp/running insertion.xlsx.practice norehash non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/running insertion.xlsx.practice norehash non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements, +
|
||||
prior `reserve`
|
||||
h|duplicate elements, +
|
||||
prior `reserve`
|
||||
h|duplicate elements, +
|
||||
max load factor 5, +
|
||||
prior `reserve`
|
||||
|
||||
|===
|
||||
|
||||
==== Erasure
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/clang_libcpp/scattered erasure.xlsx.practice.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/scattered erasure.xlsx.practice.png]
|
||||
|image::benchmarks-set/clang_libcpp/scattered erasure.xlsx.practice non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/scattered erasure.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/clang_libcpp/scattered erasure.xlsx.practice non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/scattered erasure.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|
|
||||
|image::benchmarks-set/clang_libcpp/scattered erasure by key.xlsx.practice non-unique.png[width=250,link= _images/benchmarks-set/clang_libcpp/scattered erasure by key.xlsx.practice non-unique.png,window=_blank]
|
||||
|image::benchmarks-set/clang_libcpp/scattered erasure by key.xlsx.practice non-unique 5.png[width=250,link= _images/benchmarks-set/clang_libcpp/scattered erasure by key.xlsx.practice non-unique 5.png,window=_blank]
|
||||
|
||||
|
|
||||
h|by key, duplicate elements
|
||||
h|by key, duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
==== Successful lookup
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/clang_libcpp/scattered successful looukp.xlsx.practice.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/scattered successful looukp.xlsx.practice.png]
|
||||
|image::benchmarks-set/clang_libcpp/scattered successful looukp.xlsx.practice non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/scattered successful looukp.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/clang_libcpp/scattered successful looukp.xlsx.practice non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/scattered successful looukp.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
==== Unsuccessful lookup
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/clang_libcpp/scattered unsuccessful looukp.xlsx.practice.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/scattered unsuccessful looukp.xlsx.practice.png]
|
||||
|image::benchmarks-set/clang_libcpp/scattered unsuccessful looukp.xlsx.practice non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/scattered unsuccessful looukp.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/clang_libcpp/scattered unsuccessful looukp.xlsx.practice non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/clang_libcpp/scattered unsuccessful looukp.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
=== Visual Studio 2022 + Dinkumware, x64
|
||||
|
||||
==== Insertion
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/vs/running insertion.xlsx.practice.png[width=250,window=_blank,link= _images/benchmarks-set/vs/running insertion.xlsx.practice.png]
|
||||
|image::benchmarks-set/vs/running insertion.xlsx.practice non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/vs/running insertion.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/vs/running insertion.xlsx.practice non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/vs/running insertion.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/vs/running insertion.xlsx.practice norehash.png[width=250,window=_blank,link= _images/benchmarks-set/vs/running insertion.xlsx.practice norehash.png]
|
||||
|image::benchmarks-set/vs/running insertion.xlsx.practice norehash non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/vs/running insertion.xlsx.practice norehash non-unique.png]
|
||||
|image::benchmarks-set/vs/running insertion.xlsx.practice norehash non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/vs/running insertion.xlsx.practice norehash non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements, +
|
||||
prior `reserve`
|
||||
h|duplicate elements, +
|
||||
prior `reserve`
|
||||
h|duplicate elements, +
|
||||
max load factor 5, +
|
||||
prior `reserve`
|
||||
|
||||
|===
|
||||
|
||||
==== Erasure
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/vs/scattered erasure.xlsx.practice.png[width=250,window=_blank,link= _images/benchmarks-set/vs/scattered erasure.xlsx.practice.png]
|
||||
|image::benchmarks-set/vs/scattered erasure.xlsx.practice non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/vs/scattered erasure.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/vs/scattered erasure.xlsx.practice non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/vs/scattered erasure.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|
|
||||
|image::benchmarks-set/vs/scattered erasure by key.xlsx.practice non-unique.png[width=250,link= _images/benchmarks-set/vs/scattered erasure by key.xlsx.practice non-unique.png,window=_blank]
|
||||
|image::benchmarks-set/vs/scattered erasure by key.xlsx.practice non-unique 5.png[width=250,link= _images/benchmarks-set/vs/scattered erasure by key.xlsx.practice non-unique 5.png,window=_blank]
|
||||
|
||||
|
|
||||
h|by key, duplicate elements
|
||||
h|by key, duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
==== Successful lookup
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/vs/scattered successful looukp.xlsx.practice.png[width=250,window=_blank,link= _images/benchmarks-set/vs/scattered successful looukp.xlsx.practice.png]
|
||||
|image::benchmarks-set/vs/scattered successful looukp.xlsx.practice non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/vs/scattered successful looukp.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/vs/scattered successful looukp.xlsx.practice non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/vs/scattered successful looukp.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
==== Unsuccessful lookup
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-set/vs/scattered unsuccessful looukp.xlsx.practice.png[width=250,window=_blank,link= _images/benchmarks-set/vs/scattered unsuccessful looukp.xlsx.practice.png]
|
||||
|image::benchmarks-set/vs/scattered unsuccessful looukp.xlsx.practice non-unique.png[width=250,window=_blank,link= _images/benchmarks-set/vs/scattered unsuccessful looukp.xlsx.practice non-unique.png]
|
||||
|image::benchmarks-set/vs/scattered unsuccessful looukp.xlsx.practice non-unique 5.png[width=250,window=_blank,link= _images/benchmarks-set/vs/scattered unsuccessful looukp.xlsx.practice non-unique 5.png]
|
||||
|
||||
h|non-duplicate elements
|
||||
h|duplicate elements
|
||||
h|duplicate elements, +
|
||||
max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
== boost::unordered_(flat|node)_map

All benchmarks were created using:

* `https://abseil.io/docs/cpp/guides/container[absl::flat_hash_map^]<uint64_t, uint64_t>`
* `boost::unordered_map<uint64_t, uint64_t>`
* `boost::unordered_flat_map<uint64_t, uint64_t>`
* `boost::unordered_node_map<uint64_t, uint64_t>`

The source code can be https://github.com/boostorg/boost_unordered_benchmarks/tree/boost_unordered_flat_map[found here^].

The insertion benchmarks insert `n` random values, where `n` is between 10,000 and 10 million.

The erasure benchmarks traverse the `n` elements and erase those with odd keys (50% on average).
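The traverse-and-erase pattern described above can be sketched as below. The code is written against `std::unordered_map` as a stand-in; the return type of `erase(iterator)` in the Boost flat containers may differ between Boost versions, so the loop shape here is an assumption to check against the reference. `erase_odd_keys` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Traverses the map once and erases every element whose key is odd.
// Returns the number of erased elements.
std::size_t erase_odd_keys(std::unordered_map<std::uint64_t, std::uint64_t>& m)
{
    std::size_t const before = m.size();
    for (auto it = m.begin(); it != m.end();) {
        if (it->first % 2 == 1)
            it = m.erase(it);  // erase returns the iterator following the erased element
        else
            ++it;
    }
    assert(m.size() <= before);
    return before - m.size();
}
```

With uniformly random 64-bit keys, about half of the elements are removed, which matches the "50% on average" figure.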

The successful lookup benchmarks look up all `n` values in their original insertion order.

The unsuccessful lookup benchmarks look up `n` randomly generated integers produced with a different seed value.
|
||||
|
||||
|
||||
=== GCC 12, x64
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="4*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-flat_map/gcc-x64/Running insertion.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/gcc-x64/Running insertion.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/gcc-x64/Running erasure.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/gcc-x64/Running erasure.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/gcc-x64/Scattered successful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/gcc-x64/Scattered successful looukp.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/gcc-x64/Scattered unsuccessful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/gcc-x64/Scattered unsuccessful looukp.xlsx.plot.png]
|
||||
|
||||
h|running insertion
|
||||
h|running erasure
|
||||
h|successful lookup
|
||||
h|unsuccessful lookup
|
||||
|
||||
|===
|
||||
|
||||
=== Clang 15, x64
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="4*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-flat_map/clang-x64/Running insertion.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-x64/Running insertion.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/clang-x64/Running erasure.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-x64/Running erasure.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/clang-x64/Scattered successful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-x64/Scattered successful looukp.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/clang-x64/Scattered unsuccessful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-x64/Scattered unsuccessful looukp.xlsx.plot.png]
|
||||
|
||||
h|running insertion
|
||||
h|running erasure
|
||||
h|successful lookup
|
||||
h|unsuccessful lookup
|
||||
|
||||
|===
|
||||
|
||||
=== Visual Studio 2022, x64
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="4*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-flat_map/vs-x64/Running insertion.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/vs-x64/Running insertion.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/vs-x64/Running erasure.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/vs-x64/Running erasure.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/vs-x64/Scattered successful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/vs-x64/Scattered successful looukp.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/vs-x64/Scattered unsuccessful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/vs-x64/Scattered unsuccessful looukp.xlsx.plot.png]
|
||||
|
||||
h|running insertion
|
||||
h|running erasure
|
||||
h|successful lookup
|
||||
h|unsuccessful lookup
|
||||
|
||||
|===
|
||||
|
||||
=== Clang 12, ARM64
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="4*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-flat_map/clang-arm64/Running insertion.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-arm64/Running insertion.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/clang-arm64/Running erasure.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-arm64/Running erasure.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/clang-arm64/Scattered successful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-arm64/Scattered successful looukp.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/clang-arm64/Scattered unsuccessful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-arm64/Scattered unsuccessful looukp.xlsx.plot.png]
|
||||
|
||||
h|running insertion
|
||||
h|running erasure
|
||||
h|successful lookup
|
||||
h|unsuccessful lookup
|
||||
|
||||
|===
|
||||
|
||||
=== GCC 12, x86
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="4*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-flat_map/gcc-x86/Running insertion.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/gcc-x86/Running insertion.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/gcc-x86/Running erasure.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/gcc-x86/Running erasure.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/gcc-x86/Scattered successful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/gcc-x86/Scattered successful looukp.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/gcc-x86/Scattered unsuccessful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/gcc-x86/Scattered unsuccessful looukp.xlsx.plot.png]
|
||||
|
||||
h|running insertion
|
||||
h|running erasure
|
||||
h|successful lookup
|
||||
h|unsuccessful lookup
|
||||
|
||||
|===
|
||||
|
||||
=== Clang 15, x86
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="4*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-flat_map/clang-x86/Running insertion.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-x86/Running insertion.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/clang-x86/Running erasure.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-x86/Running erasure.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/clang-x86/Scattered successful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-x86/Scattered successful looukp.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/clang-x86/Scattered unsuccessful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/clang-x86/Scattered unsuccessful looukp.xlsx.plot.png]
|
||||
|
||||
h|running insertion
|
||||
h|running erasure
|
||||
h|successful lookup
|
||||
h|unsuccessful lookup
|
||||
|
||||
|===
|
||||
|
||||
=== Visual Studio 2022, x86
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="4*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-flat_map/vs-x86/Running insertion.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/vs-x86/Running insertion.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/vs-x86/Running erasure.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/vs-x86/Running erasure.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/vs-x86/Scattered successful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/vs-x86/Scattered successful looukp.xlsx.plot.png]
|
||||
|image::benchmarks-flat_map/vs-x86/Scattered unsuccessful looukp.xlsx.plot.png[width=250,window=_blank,link= _images/benchmarks-flat_map/vs-x86/Scattered unsuccessful looukp.xlsx.plot.png]
|
||||
|
||||
h|running insertion
|
||||
h|running erasure
|
||||
h|successful lookup
|
||||
h|unsuccessful lookup
|
||||
|
||||
|===
|
||||
|
||||
== boost::concurrent_(flat|node)_map

All benchmarks were created using:

* `https://spec.oneapi.io/versions/latest/elements/oneTBB/source/containers/concurrent_hash_map_cls.html[oneapi::tbb::concurrent_hash_map^]<int, int>`
* `https://github.com/greg7mdp/gtl/blob/main/docs/phmap.md[gtl::parallel_flat_hash_map^]<int, int>` with 64 submaps
* `boost::concurrent_flat_map<int, int>`
* `boost::concurrent_node_map<int, int>`

The source code can be https://github.com/boostorg/boost_unordered_benchmarks/tree/boost_concurrent_flat_map[found here^].

The benchmarks exercise a number of threads _T_ (between 1 and 16) concurrently performing operations
randomly chosen among **update**, **successful lookup** and **unsuccessful lookup**. The keys used in the
operations follow a https://en.wikipedia.org/wiki/Zipf%27s_law#Formal_definition[Zipf distribution^]
with different _skew_ parameters: the higher the skew, the more concentrated the keys are in the lower values
of the covered range.
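To make the effect of the skew parameter concrete, here is a small sketch that builds a Zipf-like key distribution with `std::discrete_distribution`. It is an illustration under stated assumptions, not the generator used by the benchmarks: `make_zipf` is a hypothetical helper, and keys are taken as `0..n-1` with probability proportional to `1/(k+1)^skew`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Zipf-like distribution over keys 0..n-1: P(k) is proportional to 1/(k+1)^skew.
// A skew close to 0 is nearly uniform; a higher skew favors the low keys.
std::discrete_distribution<int> make_zipf(std::size_t n, double skew)
{
    assert(n > 0 && skew >= 0.0);
    std::vector<double> weights(n);
    for (std::size_t k = 0; k < n; ++k)
        weights[k] = 1.0 / std::pow(static_cast<double>(k + 1), skew);
    return std::discrete_distribution<int>(weights.begin(), weights.end());
}
```

Sampling is then `int key = dist(rng);` with any standard engine such as `std::mt19937`. Under this construction the lowest key is far more likely with skew 0.99 than with skew 0.01, which is what raises contention on a few hot keys in the high-skew plots.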

`boost::concurrent_flat_map` and `boost::concurrent_node_map` are exercised using both regular and xref:#concurrent_bulk_visitation[bulk visitation]:
in the latter case, lookup keys are buffered in a local array and then processed at
once each time the buffer reaches xref:#concurrent_flat_map_constants[`bulk_visit_size`].
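The buffering scheme described above can be sketched independently of the concurrent containers. In the following illustration, `std::unordered_map` stands in for the concurrent map, `kBulkSize` is a hypothetical constant (its value of 16 is an assumption, not the actual `bulk_visit_size`), and the loop inside `flush` marks the place where a single bulk visitation call over the buffered keys would go.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for the container's bulk_visit_size constant.
constexpr std::size_t kBulkSize = 16;

// Buffers lookup keys in a local array and processes them in batches.
struct BulkLookup {
    const std::unordered_map<int, int>& map;  // stand-in for the concurrent map
    std::array<int, kBulkSize> buffer{};
    std::size_t pending = 0;  // number of buffered, not yet processed keys
    std::size_t hits = 0;     // number of processed keys found in the map

    explicit BulkLookup(const std::unordered_map<int, int>& m) : map(m) {}

    // Buffers one key; processes the whole buffer once it is full.
    void lookup(int key)
    {
        assert(pending < kBulkSize);
        buffer[pending++] = key;
        if (pending == kBulkSize) flush();
    }

    // Processes all buffered keys at once and empties the buffer.
    void flush()
    {
        for (std::size_t i = 0; i < pending; ++i)
            if (map.find(buffer[i]) != map.end()) ++hits;
        pending = 0;
    }
};
```

A caller issues `lookup(key)` for each operation and calls `flush()` once at the end so that a partially filled buffer is not lost.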
|
||||
|
||||
=== GCC 12, x64
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.500k, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.500k, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.500k, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.500k, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.500k, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.500k, 0.99.png]
|
||||
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.01
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.5
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.5M, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.5M, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.5M, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.5M, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.5M, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x64/Parallel workload.xlsx.5M, 0.99.png]
|
||||
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.01
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.5
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
=== Clang 15, x64
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.500k, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.500k, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.500k, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.500k, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.500k, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.500k, 0.99.png]
|
||||
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.01
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.5
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.5M, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.5M, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.5M, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.5M, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.5M, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x64/Parallel workload.xlsx.5M, 0.99.png]
|
||||
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.01
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.5
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
=== Visual Studio 2022, x64
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.500k, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.500k, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.500k, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.500k, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.500k, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.500k, 0.99.png]
|
||||
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.01
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.5
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.5M, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.5M, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.5M, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.5M, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.5M, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x64/Parallel workload.xlsx.5M, 0.99.png]
|
||||
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.01
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.5
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
=== Clang 12, ARM64
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.500k, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.500k, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.500k, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.500k, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.500k, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.500k, 0.99.png]
|
||||
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.01
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.5
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.5M, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.5M, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.5M, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.5M, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.5M, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-arm64/Parallel workload.xlsx.5M, 0.99.png]
|
||||
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.01
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.5
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
=== GCC 12, x86
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.500k, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.500k, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.500k, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.500k, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.500k, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.500k, 0.99.png]
|
||||
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.01
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.5
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.5M, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.5M, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.5M, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.5M, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.5M, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/gcc-x86/Parallel workload.xlsx.5M, 0.99.png]
|
||||
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.01
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.5
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
=== Clang 15, x86
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.500k, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.500k, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.500k, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.500k, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.500k, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.500k, 0.99.png]
|
||||
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.01
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.5
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.5M, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.5M, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.5M, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.5M, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.5M, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/clang-x86/Parallel workload.xlsx.5M, 0.99.png]
|
||||
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.01
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.5
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
=== Visual Studio 2022, x86
|
||||
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.500k, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.500k, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.500k, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.500k, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.500k, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.500k, 0.99.png]
|
||||
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.01
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.5
|
||||
h|500k updates, 4.5M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
|
||||
[caption=]
|
||||
[cols="3*^.^a", frame=all, grid=all]
|
||||
|===
|
||||
|
||||
|image::benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.5M, 0.01.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.5M, 0.01.png]
|
||||
|image::benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.5M, 0.5.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.5M, 0.5.png]
|
||||
|image::benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.5M, 0.99.png[width=250,window=_blank,link= _images/benchmarks-concurrent_map/vs-x86/Parallel workload.xlsx.5M, 0.99.png]
|
||||
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.01
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.5
|
||||
h|5M updates, 45M lookups +
|
||||
skew=0.99
|
||||
|===
|
||||
@@ -1,12 +0,0 @@
|
||||
[#bibliography]
|
||||
|
||||
:idprefix: bibliography_
|
||||
|
||||
= Bibliography
|
||||
|
||||
* _C/C++ Users Journal_. February, 2006. Pete Becker. http://www.ddj.com/cpp/184402066[STL and TR1: Part III - Unordered containers^]. +
|
||||
An introduction to the standard unordered containers.
|
||||
* _Wikipedia_. https://en.wikipedia.org/wiki/Hash_table[Hash table^]. +
|
||||
An introduction to hash table implementations. Discusses the differences between closed-addressing and open-addressing approaches.
|
||||
* Peter Dimov, 2022. https://pdimov.github.io/articles/unordered_dev_plan.html[Development Plan for Boost.Unordered^].
|
||||
|
||||
@@ -1,147 +0,0 @@
|
||||
[#buckets]
|
||||
:idprefix: buckets_
|
||||
|
||||
= Basics of Hash Tables
|
||||
|
||||
The containers are made up of a number of _buckets_, each of which can contain
|
||||
any number of elements. For example, the following diagram shows a <<unordered_set,`boost::unordered_set`>> with 7 buckets containing 5 elements, `A`,
|
||||
`B`, `C`, `D` and `E` (this is just for illustration, containers will typically
|
||||
have more buckets).
|
||||
|
||||
image::buckets.png[]
|
||||
|
||||
In order to decide which bucket to place an element in, the container applies
|
||||
the hash function, `Hash`, to the element's key (for sets the key is the whole element, but is referred to as the key
|
||||
so that the same terminology can be used for sets and maps). This returns a
|
||||
value of type `std::size_t`. `std::size_t` has a much greater range of values
|
||||
than the number of buckets, so the container applies another transformation to
|
||||
that value to choose a bucket to place the element in.
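
The two-step mapping described above can be sketched as follows. This is only an illustration: `bucket_index` is a hypothetical helper, and the modulo reduction is an assumption standing in for whatever transformation a real container applies to the hash value.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: maps a key to a bucket index in two steps.
// Step 1 applies the hash function, which yields a std::size_t.
// Step 2 reduces that value to the range [0, bucket_count).
// The modulo used in step 2 is an assumption made for illustration;
// real containers may use a different reduction.
template <class Key, class Hash = std::hash<Key>>
std::size_t bucket_index(const Key& key, std::size_t bucket_count,
                         Hash hash = Hash())
{
    assert(bucket_count != 0);
    std::size_t h = hash(key);   // step 1: hash the key
    return h % bucket_count;     // step 2: choose a bucket
}
```

For the 7-bucket diagram, `bucket_index(std::string("A"), 7)` always yields a value between 0 and 6, and the same key always maps to the same bucket.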
|
||||
|
||||
Retrieving the elements for a given key is simple. The same process is applied
|
||||
to the key to find the correct bucket. Then the key is compared with the
|
||||
elements in the bucket to find any elements that match (using the equality
|
||||
predicate `Pred`). If the hash function has worked well the elements will be
|
||||
evenly distributed amongst the buckets so only a small number of elements will
|
||||
need to be examined.
|
||||
|
||||
There is <<hash_equality, more information on hash functions and
|
||||
equality predicates in the next section>>.
|
||||
|
||||
You can see in the diagram that `A` & `D` have been placed in the same bucket.
|
||||
When looking for elements in this bucket, up to 2 comparisons are made, making
|
||||
the search slower. This is known as a *collision*. To keep things fast we try to
|
||||
keep collisions to a minimum.
|
||||
|
||||
If instead of `boost::unordered_set` we had used <<unordered_flat_set,`boost::unordered_flat_set`>>, the
|
||||
diagram would look as follows:
|
||||
|
||||
image::buckets-oa.png[]
|
||||
|
||||
In open-addressing containers, buckets can hold at most one element; if a collision happens
|
||||
(as is the case for `D` in the example), the element uses some other available bucket in
|
||||
the vicinity of the original position. Given this simpler scenario, Boost.Unordered
|
||||
open-addressing containers offer a very limited API for accessing buckets.
|
||||
|
||||
[caption=, title='Table {counter:table-counter}. Methods for Accessing Buckets']
|
||||
[cols="1,.^1", frame=all, grid=rows]
|
||||
|===
|
||||
2+^h| *All containers*
|
||||
h|*Method* h|*Description*
|
||||
|
||||
|`size_type bucket_count() const`
|
||||
|The number of buckets.
|
||||
|
||||
2+^h| *Closed-addressing containers only*
|
||||
h|*Method* h|*Description*
|
||||
|
||||
|`size_type max_bucket_count() const`
|
||||
|An upper bound on the number of buckets.
|
||||
|`size_type bucket_size(size_type n) const`
|
||||
|The number of elements in bucket `n`.
|
||||
|
||||
|`size_type bucket(key_type const& k) const`
|
||||
|Returns the index of the bucket which would contain `k`.
|
||||
|
||||
|`local_iterator begin(size_type n)`
|
||||
1.6+|Return begin and end iterators for bucket `n`.
|
||||
|
||||
|`local_iterator end(size_type n)`
|
||||
|
||||
|`const_local_iterator begin(size_type n) const`
|
||||
|
||||
|`const_local_iterator end(size_type n) const`
|
||||
|
||||
|`const_local_iterator cbegin(size_type n) const`
|
||||
|
||||
|`const_local_iterator cend(size_type n) const`
|
||||
|
||||
|===
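
The closed-addressing members listed in the table can be exercised with a short sketch. It uses `std::unordered_set` as a stand-in so that it compiles without Boost, on the assumption that the members named in the table behave the same way; `count_via_buckets` and `found_in_own_bucket` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Sums bucket_size(n) over every bucket; the total equals size().
template <class Set>
std::size_t count_via_buckets(const Set& s)
{
    std::size_t total = 0;
    for (std::size_t n = 0; n < s.bucket_count(); ++n) {
        total += s.bucket_size(n);
    }
    return total;
}

// Looks for `key` by scanning only the bucket that the key maps to,
// comparing with the container's own equality predicate.
template <class Set>
bool found_in_own_bucket(const Set& s, const typename Set::key_type& key)
{
    if (s.bucket_count() == 0) return false;
    std::size_t n = s.bucket(key);
    for (auto it = s.cbegin(n); it != s.cend(n); ++it) {
        if (s.key_eq()(*it, key)) return true;
    }
    return false;
}
```

`found_in_own_bucket` mirrors the lookup described earlier: the key selects one bucket, and only the elements in that bucket are compared.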
|
||||
|
||||
== Controlling the Number of Buckets
|
||||
|
||||
As more elements are added to an unordered associative container, the number
|
||||
of collisions will increase, causing performance to degrade.
|
||||
To combat this, the containers increase the bucket count as elements are inserted.
|
||||
You can also tell the container to change the bucket count (if required) by
|
||||
calling `rehash`.
|
||||
|
||||
The standard leaves a lot of freedom to the implementer to decide how the
|
||||
number of buckets is chosen, but it does make some requirements based on the
|
||||
container's _load factor_, the number of elements divided by the number of buckets.
|
||||
Containers also have a _maximum load factor_ which they should try to keep the
|
||||
load factor below.
|
||||
|
||||
You can't control the bucket count directly but there are two ways to
|
||||
influence it:
|
||||
|
||||
* Specify the minimum number of buckets when constructing a container or when calling `rehash`.
|
||||
* Suggest a maximum load factor by calling `max_load_factor`.
|
||||
|
||||
`max_load_factor` doesn't let you set the maximum load factor yourself; it just
|
||||
lets you give a _hint_. And even then, the standard doesn't actually
|
||||
require the container to pay much attention to this value. The only time the
|
||||
load factor is _required_ to be less than the maximum is following a call to
|
||||
`rehash`. But most implementations will try to keep the load factor
|
||||
below the max load factor, and set the maximum load factor to be the same as
|
||||
or close to the hint - unless your hint is unreasonably small or large.
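
As a rough sketch of the two ways of influencing the bucket count, the function below sets a maximum load factor hint, calls `rehash`, and checks the guarantee stated above. It uses `std::unordered_set` as a stand-in so that it compiles without Boost; `rehash_guarantees_hold` is a hypothetical name, and the hint value `0.5f` is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Fills a set, then asks for at least `min_buckets` buckets.
// Returns true if, right after rehash, the bucket count is at least
// the requested minimum and the load factor does not exceed the maximum.
inline bool rehash_guarantees_hold(std::size_t elements, std::size_t min_buckets)
{
    std::unordered_set<std::size_t> s;
    for (std::size_t i = 0; i < elements; ++i) s.insert(i);

    s.max_load_factor(0.5f);   // a hint; the container may adjust it
    s.rehash(min_buckets);     // request at least min_buckets buckets

    return s.bucket_count() >= min_buckets
        && s.load_factor() <= s.max_load_factor();
}
```

The check compares against `s.max_load_factor()` rather than the literal `0.5f` because the container is free to settle on a value other than the hint.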
|
||||
|
||||
[caption=, title='Table {counter:table-counter}. Methods for Controlling Bucket Size']
|
||||
[cols="1,.^1", frame=all, grid=rows]
|
||||
|===
|
||||
2+^h| *All containers*
|
||||
h|*Method* h|*Description*
|
||||
|
||||
|`X(size_type n)`
|
||||
|Construct an empty container with at least `n` buckets (`X` is the container type).
|
||||
|
||||
|`X(InputIterator i, InputIterator j, size_type n)`
|
||||
|Construct an empty container with at least `n` buckets and insert elements from the range `[i, j)` (`X` is the container type).
|
||||
|
||||
|`float load_factor() const`
|
||||
|The average number of elements per bucket.
|
||||
|
||||
|`float max_load_factor() const`
|
||||
|Returns the current maximum load factor.
|
||||
|
||||
|`float max_load_factor(float z)`
|
||||
|Changes the container's maximum load factor, using `z` as a hint. +
|
||||
**Open-addressing and concurrent containers:** this function does nothing: users are not allowed to change the maximum load factor.
|
||||
|
||||
|`void rehash(size_type n)`
|
||||
|Changes the number of buckets so that there are at least `n` buckets, and so that the load factor is less than the maximum load factor.
|
||||
|
||||
2+^h| *Open-addressing and concurrent containers only*
|
||||
h|*Method* h|*Description*
|
||||
|
||||
|`size_type max_load() const`
|
||||
|Returns the maximum number of allowed elements in the container before rehash.
|
||||
|
||||
|===
|
||||
|
||||
A note on `max_load` for open-addressing and concurrent containers: the maximum load will be
|
||||
(`max_load_factor() * bucket_count()`) right after `rehash` or on container creation, but may
|
||||
slightly decrease when erasing elements in high-load situations. For instance, if we
|
||||
have a <<unordered_flat_map,`boost::unordered_flat_map`>> with `size()` almost
|
||||
at `max_load()` level and then erase 1,000 elements, `max_load()` may decrease by around a
|
||||
few dozen elements. This is done internally by Boost.Unordered in order
|
||||
to keep its performance stable, and must be taken into account when planning for rehash-free insertions.
|
||||
@@ -1,458 +0,0 @@
|
||||
[#changes]
|
||||
= Change Log
|
||||
|
||||
:idprefix: changes_
|
||||
:svn-ticket-url: https://svn.boost.org/trac/boost/ticket
|
||||
:github-pr-url: https://github.com/boostorg/unordered/pull
|
||||
:cpp: C++
|
||||
|
||||
== Release 1.87.0 - Major update
|
||||
|
||||
* Added concurrent, node-based containers `boost::concurrent_node_map` and `boost::concurrent_node_set`.
|
||||
* Added `insert_and_visit(x, f1, f2)` and similar operations to concurrent containers, which
|
||||
allow for visitation of an element right after insertion (by contrast, `insert_or_visit(x, f)` only
|
||||
visits the element if insertion did _not_ take place).
|
||||
* Made visitation exclusive-locked within certain
|
||||
`boost::concurrent_flat_set` operations to allow for safe mutable modification of elements
|
||||
({github-pr-url}/265[PR#265^]).
|
||||
* In Visual Studio Natvis, supported any container with an allocator that uses fancy pointers. This applies to any fancy pointer type, as long as the proper Natvis customization point "Intrinsic" functions are written for the fancy pointer type.
|
||||
* Added GDB pretty-printers for all containers and iterators. For a container with an allocator that uses fancy pointers, these only work if the proper pretty-printer is written for the fancy pointer type itself.
|
||||
* Fixed `std::initializer_list` assignment issues for open-addressing containers
|
||||
({github-pr-url}/277[PR#277^]).
|
||||
* Allowed non-copyable callables to be passed to the `std::initializer_list` overloads of `insert_{and|or}_[c]visit` for concurrent containers, by internally passing a `std::reference_wrapper` of the callable to the iterator-pair overloads.
|
||||
|
||||
|
||||
== Release 1.86.0
|
||||
|
||||
* Added container `pmr` aliases when header `<memory_resource>` is available. The alias `boost::unordered::pmr::[container]` refers to `boost::unordered::[container]` with a `std::pmr::polymorphic_allocator` allocator type.
|
||||
* Equipped open-addressing and concurrent containers to internally calculate and provide statistical metrics affected by the quality of the hash function. This functionality is enabled by the global macro `BOOST_UNORDERED_ENABLE_STATS`.
|
||||
* Avalanching hash functions must now be marked via an `is_avalanching` typedef with an embedded `value` constant set to `true` (typically, defining `is_avalanching` as `std::true_type`). `using is_avalanching = void` is deprecated but allowed for backwards compatibility.
|
||||
* Added Visual Studio Natvis framework custom visualizations for containers and iterators. This works for all containers with an allocator using raw pointers. In this release, containers and iterators are not supported if their allocator uses fancy pointers. This may be addressed in later releases.
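
The `is_avalanching` marking mentioned in the list above might look like the sketch below. `mixing_hash` is a hypothetical hash for 64-bit integer keys; its mixing steps are the finalizer of the splitmix64 generator, picked here as an assumption of a function with good bit mixing, and are not taken from Boost.Unordered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical hash function object for 64-bit integer keys.
struct mixing_hash
{
    // Marks the hash as avalanching in the form the change log describes:
    // a typedef with an embedded `value` constant set to true.
    using is_avalanching = std::true_type;

    std::size_t operator()(std::uint64_t x) const noexcept
    {
        // splitmix64 finalizer (an assumption for illustration).
        x ^= x >> 30;
        x *= 0xbf58476d1ce4e5b9ULL;
        x ^= x >> 27;
        x *= 0x94d049bb133111ebULL;
        x ^= x >> 31;
        return static_cast<std::size_t>(x);
    }
};
```

Only hash functions whose output bits really do depend on all input bits should carry this marking, since it describes a property of the function rather than requesting one.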
|
||||
|
||||
== Release 1.85.0
|
||||
|
||||
* Optimized `emplace()` for a `value_type` or `init_type` (if applicable) argument to bypass creating an intermediate object. The argument is already the same type as the would-be intermediate object.
|
||||
* Optimized `emplace()` for `k,v` arguments on map containers to delay constructing the object until it is certain that an element should be inserted. This optimization happens when the map's `key_type` is move constructible or when the `k` argument is a `key_type`.
|
||||
* Fixed support for allocators with `explicit` copy constructors ({github-pr-url}/234[PR#234^]).
|
||||
* Fixed bug in the `const` version of `unordered_multimap::find(k, hash, eq)` ({github-pr-url}/238[PR#238^]).
|
||||
|
||||
== Release 1.84.0 - Major update
|
||||
|
||||
* Added `boost::concurrent_flat_set`.
|
||||
* Added `[c]visit_while` operations to concurrent containers,
|
||||
with serial and parallel variants.
|
||||
* Added efficient move construction of `boost::unordered_flat_(map|set)` from
|
||||
`boost::concurrent_flat_(map|set)` and vice versa.
|
||||
* Added bulk visitation to concurrent containers for increased lookup performance.
|
||||
* Added debug-mode mechanisms for detecting illegal reentrancies into
|
||||
a concurrent container from user code.
|
||||
* Added Boost.Serialization support to all containers and their (non-local) iterator types.
|
||||
* Added support for fancy pointers to open-addressing and concurrent containers.
|
||||
This enables scenarios like the use of Boost.Interprocess allocators to construct containers in shared memory.
|
||||
* Fixed bug in member of pointer operator for local iterators of closed-addressing
|
||||
containers ({github-pr-url}/221[PR#221^], credit goes to GitHub user vslashg for finding
|
||||
and fixing this issue).
|
||||
* Starting with this release, `boost::unordered_[multi]set` and `boost::unordered_[multi]map`
|
||||
only work with C++11 onwards.
|
||||
|
||||
== Release 1.83.0 - Major update
|
||||
|
||||
* Added `boost::concurrent_flat_map`, a fast, thread-safe hashmap based on open addressing.
|
||||
* Sped up iteration of open-addressing containers.
|
||||
* In open-addressing containers, `erase(iterator)`, which previously returned nothing, now
|
||||
returns a proxy object convertible to an iterator to the next element.
|
||||
This enables the typical `it = c.erase(it)` idiom without incurring any performance penalty
|
||||
when the returned proxy is not used.
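
The `it = c.erase(it)` idiom referred to above can be sketched as follows. The example uses `std::unordered_map` as a stand-in so that it compiles without Boost; `erase_odd_values` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Erases every element whose mapped value is odd, using the
// `it = c.erase(it)` idiom. Returns the number of erased elements.
template <class Map>
std::size_t erase_odd_values(Map& m)
{
    std::size_t erased = 0;
    for (auto it = m.begin(); it != m.end(); ) {
        if (it->second % 2 != 0) {
            it = m.erase(it);   // continue from the element after the erased one
            ++erased;
        } else {
            ++it;
        }
    }
    return erased;
}
```

Per the entry above, the open-addressing containers accept the same loop because the proxy returned by `erase(iterator)` converts to an iterator.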
|
||||
|
||||
== Release 1.82.0 - Major update
|
||||
|
||||
* {cpp}03 support is planned for deprecation. Boost 1.84.0 will no longer support
|
||||
{cpp}03 mode and {cpp}11 will become the new minimum for using the library.
|
||||
* Added node-based, open-addressing containers
|
||||
`boost::unordered_node_map` and `boost::unordered_node_set`.
|
||||
* Extended heterogeneous lookup to more member functions as specified in
|
||||
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2363r5.html[P2363].
|
||||
* Replaced the previous post-mixing process for open-addressing containers with
|
||||
a new algorithm based on extended multiplication by a constant.
|
||||
* Fixed bug in the internal `emplace()` implementation where stack-local types were not properly
|
||||
constructed using the container's allocator, which broke uses-allocator
|
||||
construction.
|
||||
|
||||
== Release 1.81.0 - Major update
|
||||
|
||||
* Added fast containers `boost::unordered_flat_map` and `boost::unordered_flat_set`
|
||||
based on open addressing.
|
||||
* Added CTAD deduction guides for all containers.
|
||||
* Added missing constructors as specified in https://cplusplus.github.io/LWG/issue2713[LWG issue 2713].
|
||||
|
||||
== Release 1.80.0 - Major update
|
||||
|
||||
* Refactor internal implementation to be dramatically faster
|
||||
* Allow `final` Hasher and KeyEqual objects
|
||||
* Update documentation, adding benchmark graphs and notes on the new internal
|
||||
data structures
|
||||
|
||||
== Release 1.79.0
|
||||
|
||||
* Improved {cpp}20 support:
|
||||
** All containers have been updated to support
|
||||
heterogeneous `count`, `equal_range` and `find`.
|
||||
** All containers now implement the member function `contains`.
|
||||
** `erase_if` has been implemented for all containers.
|
||||
* Improved {cpp}23 support:
|
||||
** All containers have been updated to support
|
||||
heterogeneous `erase` and `extract`.
|
||||
* Changed behavior of `reserve` to eagerly
|
||||
allocate ({github-pr-url}/59[PR#59^]).
|
||||
* Various warning fixes in the test suite.
|
||||
* Update code to internally use `boost::allocator_traits`.
|
||||
* Switch to Fibonacci hashing.
|
||||
* Update documentation to be written in AsciiDoc instead of QuickBook.
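
The change log only states that the library switched to Fibonacci hashing. The general technique can be sketched as below; this is not Boost.Unordered's code, `fibonacci_index` is a hypothetical name, and the sketch assumes a bucket count that is a power of two, `2^bits`.

```cpp
#include <cassert>
#include <cstdint>

// Maps a 64-bit hash value to a bucket index in [0, 2^bits).
// The hash is multiplied by 2^64 divided by the golden ratio (rounded to
// an odd integer), and the top `bits` bits of the product are kept.
inline std::uint64_t fibonacci_index(std::uint64_t hash, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    const std::uint64_t multiplier = 0x9E3779B97F4A7C15ULL; // ~ 2^64 / phi
    return (hash * multiplier) >> (64 - bits);
}
```

Taking the high bits of the product means every bit of the hash value influences the chosen bucket, unlike a plain mask of the low bits.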
|
||||
|
||||
== Release 1.67.0
|
||||
|
||||
* Improved {cpp}17 support:
|
||||
** Add template deduction guides from the standard.
|
||||
** Use a simple implementation of `optional` in node handles, so
|
||||
that they're closer to the standard.
|
||||
** Add missing `noexcept` specifications to `swap`, `operator=`
|
||||
and node handles, and change the implementation to match.
|
||||
Using `std::allocator_traits::is_always_equal`, or our own
|
||||
implementation when not available, and
|
||||
`boost::is_nothrow_swappable` in the implementation.
|
||||
* Improved {cpp}20 support:
|
||||
** Use `boost::to_address`, which has the proposed {cpp}20 semantics,
|
||||
rather than the old custom implementation.
|
||||
* Add `element_type` to iterators, so that `std::pointer_traits`
|
||||
will work.
|
||||
* Use `std::piecewise_construct` on recent versions of Visual {cpp},
|
||||
and other uses of the Dinkumware standard library,
|
||||
now using Boost.Predef to check compiler and library versions.
|
||||
* Use `std::iterator_traits` rather than the boost iterator traits
|
||||
in order to remove dependency on Boost.Iterator.
|
||||
* Remove iterators' inheritance from `std::iterator`, which is
|
||||
deprecated in {cpp}17, thanks to Daniela Engert
|
||||
({github-pr-url}/7[PR#7^]).
|
||||
* Stop using `BOOST_DEDUCED_TYPENAME`.
|
||||
* Update some Boost include paths.
|
||||
* Rename some internal methods, and variables.
|
||||
* Various testing improvements.
|
||||
* Miscellaneous internal changes.
|
||||
|
||||
== Release 1.66.0
|
||||
|
||||
* Simpler move construction implementation.
|
||||
* Documentation fixes ({github-pr-url}/6[GitHub #6^]).
|
||||
|
||||
== Release 1.65.0
|
||||
|
||||
* Add deprecated attributes to `quick_erase` and `erase_return_void`.
|
||||
I really will remove them in a future version this time.
|
||||
* Small standards compliance fixes:
|
||||
** `noexcept` specs for `swap` free functions.
|
||||
** Add missing `insert(P&&)` methods.
|
||||
|
||||
== Release 1.64.0
|
||||
|
||||
* Initial support for new {cpp}17 member functions:
|
||||
`insert_or_assign` and `try_emplace` in `unordered_map`.
|
||||
* Initial support for `merge` and `extract`.
|
||||
Does not include transferring nodes between
|
||||
`unordered_map` and `unordered_multimap` or between `unordered_set` and
|
||||
`unordered_multiset` yet. That will hopefully be in the next version of
|
||||
Boost.
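
The difference between the two {cpp}17 member functions named above can be sketched as follows, assuming a {cpp}17 compiler and using `std::unordered_map` as a stand-in so that the example compiles without Boost. `record` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Records a score for an id: keeps the first name seen for that id,
// but always stores the latest score.
inline void record(std::unordered_map<int, std::string>& names,
                   std::unordered_map<int, int>& scores,
                   int id, const std::string& name, int score)
{
    names.try_emplace(id, name);          // no effect if id is already present
    scores.insert_or_assign(id, score);   // overwrites any existing score
}
```

Calling `record` twice with the same id leaves the first name in place and replaces the score with the second value.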
|
||||
|
||||
== Release 1.63.0
|
||||
|
||||
* Check hint iterator in `insert`/`emplace_hint`.
|
||||
* Fix some warnings, mostly in the tests.
|
||||
* Manually write out `emplace_args` for small numbers of arguments -
|
||||
should make template error messages a little more bearable.
|
||||
* Remove superfluous use of `boost::forward` in emplace arguments,
|
||||
which fixes emplacing string literals in old versions of Visual {cpp}.
|
||||
* Fix an exception safety issue in assignment. If bucket allocation
|
||||
throws an exception, it can overwrite the hash and equality functions while
|
||||
leaving the existing elements in place. This would mean that the function
|
||||
objects wouldn't match the container elements, so elements might be in the
|
||||
wrong bucket and equivalent elements would be incorrectly handled.
|
||||
* Various reference documentation improvements.
|
||||
* Better allocator support ({svn-ticket-url}/12459[#12459^]).
|
||||
* Make the no argument constructors implicit.
|
||||
* Implement missing allocator aware constructors.
|
||||
* Fix assigning the hash/key equality functions for empty containers.
|
||||
* Remove unary/binary_function from the examples in the documentation.
|
||||
They are removed in {cpp}17.
|
||||
* Support 10 constructor arguments in emplace. It was meant to support up to 10
|
||||
arguments, but an off by one error in the preprocessor code meant it only
|
||||
supported up to 9.
|
||||
|
||||
== Release 1.62.0
|
||||
|
||||
* Remove use of deprecated `boost::iterator`.
|
||||
* Remove `BOOST_NO_STD_DISTANCE` workaround.
|
||||
* Remove `BOOST_UNORDERED_DEPRECATED_EQUALITY` warning.
|
||||
* Simpler implementation of assignment, fixes an exception safety issue
|
||||
for `unordered_multiset` and `unordered_multimap`. Might be a little slower.
|
||||
* Stop using return value SFINAE which some older compilers have issues
|
||||
with.
|
||||
|
||||
== Release 1.58.0
|
||||
|
||||
* Remove unnecessary template parameter from const iterators.
|
||||
* Rename private `iterator` typedef in some iterator classes, as it
|
||||
confuses some traits classes.
|
||||
* Fix move assignment with stateful, `propagate_on_container_move_assignment`
|
||||
allocators ({svn-ticket-url}/10777[#10777^]).
|
||||
* Fix rare exception safety issue in move assignment.
|
||||
* Fix potential overflow when calculating number of buckets to allocate
|
||||
({github-pr-url}/4[GitHub #4^]).
|
||||
|
||||
== Release 1.57.0
|
||||
|
||||
* Fix the `pointer` typedef in iterators ({svn-ticket-url}/10672[#10672^]).
|
||||
* Fix Coverity warning
|
||||
({github-pr-url}/2[GitHub #2^]).
|
||||
|
||||
== Release 1.56.0
|
||||
|
||||
* Fix some shadowed variable warnings ({svn-ticket-url}/9377[#9377^]).
|
||||
* Fix allocator use in documentation ({svn-ticket-url}/9719[#9719^]).
|
||||
* Always use prime number of buckets for integers. Fixes performance
|
||||
regression when inserting consecutive integers, although makes other
|
||||
uses slower ({svn-ticket-url}/9282[#9282^]).
|
||||
* Only construct elements using allocators, as specified in {cpp}11 standard.
|
||||
|
||||
== Release 1.55.0
|
||||
|
||||
* Avoid some warnings ({svn-ticket-url}/8851[#8851^], {svn-ticket-url}/8874[#8874^]).
|
||||
* Avoid exposing some detail functions via ADL on the iterators.
|
||||
* Follow the standard by only using the allocators' construct and destroy
|
||||
methods to construct and destroy stored elements. Don't use them for internal
|
||||
data like pointers.
|
||||
|
||||
== Release 1.54.0
|
||||
|
||||
* Mark methods specified in standard as `noexcept`. More to come in the next
|
||||
release.
|
||||
* If the hash function and equality predicate are known to both have nothrow
|
||||
move assignment or construction then use them.
|
||||
|
||||
== Release 1.53.0
|
||||
|
||||
* Remove support for the old pre-standard variadic pair constructors, and
|
||||
equality implementation. Both have been deprecated since Boost 1.48.
|
||||
* Remove use of deprecated config macros.
|
||||
* More internal implementation changes, including a much simpler
|
||||
implementation of `erase`.
|
||||
|
||||
== Release 1.52.0
|
||||
|
||||
* Faster assign, which assigns to existing nodes where possible, rather than
|
||||
creating entirely new nodes and copy constructing.
|
||||
* Fixed bug in `erase_range` ({svn-ticket-url}/7471[#7471^]).
|
||||
* Reverted some of the internal changes to how nodes are created, especially
|
||||
for {cpp}11 compilers. 'construct' and 'destroy' should work a little better
|
||||
for {cpp}11 allocators.
|
||||
* Simplified the implementation a bit. Hopefully more robust.
|
||||
|
||||
== Release 1.51.0
|
||||
|
||||
* Fix construction/destruction issue when using a {cpp}11 compiler with a
|
||||
{cpp}03 allocator ({svn-ticket-url}/7100[#7100^]).
|
||||
* Remove a `try..catch` to support compiling without exceptions.
|
||||
* Adjust SFINAE use to try to support g++ 3.4 ({svn-ticket-url}/7175[#7175^]).
|
||||
* Updated to use the new config macros.
|
||||
|
||||
== Release 1.50.0
|
||||
|
||||
* Fix equality for `unordered_multiset` and `unordered_multimap`.
|
||||
* {svn-ticket-url}/6857[Ticket 6857^]:
|
||||
Implement `reserve`.
|
||||
* {svn-ticket-url}/6771[Ticket 6771^]:
|
||||
Avoid gcc's `-Wfloat-equal` warning.
|
||||
* {svn-ticket-url}/6784[Ticket 6784^]:
|
||||
Fix some Sun specific code.
|
||||
* {svn-ticket-url}/6190[Ticket 6190^]:
|
||||
Avoid gcc's `-Wshadow` warning.
|
||||
* {svn-ticket-url}/6905[Ticket 6905^]:
|
||||
Make namespaces in macros compatible with `bcp` custom namespaces.
|
||||
Fixed by Luke Elliott.
|
||||
* Remove some of the smaller prime number of buckets, as they may make
|
||||
collisions quite probable (e.g. multiples of 5 are very common because
|
||||
we used base 10).
|
||||
* On old versions of Visual {cpp}, use the container library's implementation
|
||||
of `allocator_traits`, as it's more likely to work.
|
||||
* On machines with 64 bit std::size_t, use power of 2 buckets, with Thomas
|
||||
Wang's hash function to pick which one to use, as modulus is very slow
|
||||
for 64 bit values.
|
||||
* Some internal changes.
|
||||
|
||||
== Release 1.49.0
|
||||
|
||||
* Fix warning due to accidental odd assignment.
|
||||
* Slightly better error messages.
|
||||
|
||||
== Release 1.48.0 - Major update
|
||||
|
||||
This is a major change: the library has been converted to use Boost.Move's move
|
||||
emulation, and be more compliant with the {cpp}11 standard. See the
|
||||
xref:compliance.adoc[compliance section] for details.
|
||||
|
||||
The container now meets {cpp}11's complexity requirements, but to do so
|
||||
uses a little more memory. This means that `quick_erase` and
|
||||
`erase_return_void` are no longer required; they'll be removed in a
|
||||
future version.
|
||||
|
||||
{cpp}11 support has resulted in some breaking changes:
|
||||
|
||||
* Equality comparison has been changed to the {cpp}11 specification.
|
||||
In a container with equivalent keys, elements in a group with equal
|
||||
keys used to have to be in the same order to be considered equal,
|
||||
now they can be a permutation of each other. To use the old
|
||||
behavior define the macro `BOOST_UNORDERED_DEPRECATED_EQUALITY`.
|
||||
|
||||
* The behaviour of swap is different when the two containers to be
|
||||
swapped have unequal allocators. It used to allocate new nodes using
|
||||
the appropriate allocators, it now swaps the allocators if
|
||||
the allocator has a member structure `propagate_on_container_swap`,
|
||||
such that `propagate_on_container_swap::value` is true.
|
||||
|
||||
* Allocator's `construct` and `destroy` functions are called with raw
|
||||
pointers, rather than the allocator's `pointer` type.
|
||||
|
||||
* `emplace` used to emulate the variadic pair constructors that
|
||||
appeared in early {cpp}0x drafts. Since they were removed it no
|
||||
longer does so. It does emulate the new `piecewise_construct`
|
||||
pair constructors - only you need to use
|
||||
`boost::piecewise_construct`. To use the old emulation of
|
||||
the variadic constructors define
|
||||
`BOOST_UNORDERED_DEPRECATED_PAIR_CONSTRUCT`.
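
The {cpp}11 equality rule from the first item of the list above can be illustrated with a small sketch. It uses `std::unordered_multimap`, which follows the same {cpp}11 specification, as a stand-in so that the example compiles without Boost; `permuted_groups_compare_equal` is a hypothetical name.

```cpp
#include <cassert>
#include <unordered_map>

// Builds two multimaps holding the same (key, value) pairs, inserted in
// opposite orders, and reports whether they compare equal.
inline bool permuted_groups_compare_equal()
{
    std::unordered_multimap<int, char> a;
    std::unordered_multimap<int, char> b;

    a.emplace(1, 'x');
    a.emplace(1, 'y');

    b.emplace(1, 'y');   // same group of equal keys, different order
    b.emplace(1, 'x');

    return a == b;       // groups only need to be permutations of each other
}
```

Under the older behavior described in that item, the two containers could compare unequal if the elements with key `1` ended up stored in different orders.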
|
||||
|
||||
== Release 1.45.0
|
||||
|
||||
* Fix a bug when inserting into an `unordered_map` or `unordered_set` using
|
||||
iterators which return `value_type` by copy.
|
||||
|
||||
== Release 1.43.0
|
||||
|
||||
* {svn-ticket-url}/3966[Ticket 3966^]:
|
||||
`erase_return_void` is now `quick_erase`, which is the
|
||||
http://home.roadrunner.com/~hinnant/issue_review/lwg-active.html#579[
|
||||
current forerunner for resolving the slow erase by iterator^], although
|
||||
there's a strong possibility that this may change in the future. The old
|
||||
method name remains for backwards compatibility but is considered deprecated
|
||||
and will be removed in a future release.
|
||||
* Use Boost.Exception.
|
||||
* Stop using deprecated `BOOST_HAS_*` macros.
|
||||
|
||||
== Release 1.42.0
|
||||
|
||||
* Support instantiating the containers with incomplete value types.
|
||||
* Reduced the number of warnings (mostly in tests).
|
||||
* Improved codegear compatibility.
|
||||
* {svn-ticket-url}/3693[Ticket 3693^]:
|
||||
Add `erase_return_void` as a temporary workaround for the current
|
||||
`erase` which can be inefficient because it has to find the next
|
||||
element to return an iterator.
|
||||
* Add templated find overload for compatible keys.
|
||||
* {svn-ticket-url}/3773[Ticket 3773^]:
|
||||
Add missing `std` qualifier to `ptrdiff_t`.
|
||||
* Some code formatting changes to fit almost all lines into 80 characters.
|
||||
|
||||
== Release 1.41.0 - Major update
|
||||
|
||||
* The original version made heavy use of macros to sidestep some of the older
|
||||
compilers' poor template support. But since I no longer support those
|
||||
compilers and the macro use was starting to become a maintenance burden it
|
||||
has been rewritten to use templates instead of macros for the implementation
|
||||
classes.
|
||||
|
||||
* The container object is now smaller thanks to using `boost::compressed_pair`
|
||||
for EBO and a slightly different function buffer - now using a bool instead
|
||||
of a member pointer.
|
||||
|
||||
* Buckets are allocated lazily which means that constructing an empty container
|
||||
will not allocate any memory.
|
||||
|
||||
== Release 1.40.0
|
||||
|
||||
* {svn-ticket-url}/2975[Ticket 2975^]:
|
||||
Store the prime list as a preprocessor sequence - so that it will always get
|
||||
the length right if it changes again in the future.
|
||||
* {svn-ticket-url}/1978[Ticket 1978^]:
|
||||
Implement `emplace` for all compilers.
|
||||
* {svn-ticket-url}/2908[Ticket 2908^],
|
||||
{svn-ticket-url}/3096[Ticket 3096^]:
|
||||
Some workarounds for old versions of borland, including adding explicit
|
||||
destructors to all containers.
|
||||
* {svn-ticket-url}/3082[Ticket 3082^]:
|
||||
Disable incorrect Visual {cpp} warnings.
|
||||
* Better configuration for {cpp}0x features when the headers aren't available.
|
||||
* Create fewer buckets by default.
|
||||
|
||||
== Release 1.39.0
|
||||
|
||||
* {svn-ticket-url}/2756[Ticket 2756^]: Avoid a warning
|
||||
on Visual {cpp} 2009.
|
||||
* Some other minor internal changes to the implementation, tests and
|
||||
documentation.
|
||||
* Avoid an unnecessary copy in `operator[]`.
|
||||
* {svn-ticket-url}/2975[Ticket 2975^]: Fix length of
|
||||
prime number list.
|
||||
|
||||
== Release 1.38.0
|
||||
|
||||
* Use link:../../../core/swap.html[`boost::swap`^].
|
||||
* {svn-ticket-url}/2237[Ticket 2237^]:
|
||||
Document that the equality and inequality operators are undefined for two
|
||||
objects if their equality predicates aren't equivalent. Thanks to Daniel
|
||||
Krügler.
|
||||
* {svn-ticket-url}/1710[Ticket 1710^]:
|
||||
Use a larger prime number list. Thanks to Thorsten Ottosen and Hervé
|
||||
Brönnimann.
|
||||
* Use
|
||||
link:../../../type_traits/index.html[aligned storage^] to store the types.
|
||||
This changes the way the allocator is used to construct nodes. It used to
|
||||
construct the node with two calls to the allocator's `construct`
|
||||
method - once for the pointers and once for the value. It now constructs
|
||||
the node with a single call to construct and then constructs the value using
|
||||
in place construction.
|
||||
* Add support for {cpp}0x initializer lists where they're available (currently
|
||||
only g++ 4.4 in {cpp}0x mode).
|
||||
|
||||
== Release 1.37.0
|
||||
|
||||
* Rename overload of `emplace` with hint, to `emplace_hint` as specified in
|
||||
http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2008/n2691.pdf[n2691^].
|
||||
* Provide forwarding headers at `<boost/unordered/unordered_map_fwd.hpp>` and
|
||||
`<boost/unordered/unordered_set_fwd.hpp>`.
|
||||
* Move all the implementation inside `boost/unordered`, to assist
|
||||
modularization and hopefully make it easier to track Release subversion.
|
||||
|
||||
== Release 1.36.0
|
||||
|
||||
First official release.
|
||||
|
||||
* Rearrange the internals.
|
||||
* Move semantics - full support when rvalue references are available, emulated
|
||||
using a cut down version of the Adobe move library when they are not.
|
||||
* Emplace support when rvalue references and variadic templates are available.
|
||||
* More efficient node allocation when rvalue references and variadic templates
|
||||
are available.
|
||||
* Added equality operators.
|
||||
|
||||
== Boost 1.35.0 Add-on - 31st March 2008
|
||||
|
||||
Unofficial release uploaded to vault, to be used with Boost 1.35.0. Incorporated
|
||||
many of the suggestions from the review.
|
||||
|
||||
* Improved portability thanks to Boost regression testing.
|
||||
* Fix lots of typos, and clearer text in the documentation.
|
||||
* Fix floating point to `std::size_t` conversion when calculating sizes from
|
||||
the max load factor, and use `double` in the calculation for greater accuracy.
|
||||
* Fix some errors in the examples.
|
||||
|
||||
== Review Version
|
||||
|
||||
Initial review version, for the review conducted from 7th December 2007 to
|
||||
16th December 2007.
|
||||
@@ -1,150 +0,0 @@
|
||||
[#compliance]
|
||||
= Standard Compliance
|
||||
|
||||
:idprefix: compliance_
|
||||
|
||||
:cpp: C++
|
||||
|
||||
== Closed-addressing Containers
|
||||
|
||||
`boost::unordered_[multi]set` and `boost::unordered_[multi]map` provide a conformant
|
||||
implementation for {cpp}11 (or later) compilers of the latest standard revision of
|
||||
{cpp} unordered associative containers, with very minor deviations as noted.
|
||||
The containers are fully https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^]
|
||||
and support https://en.cppreference.com/w/cpp/named_req/Allocator#Fancy_pointers[fancy pointers^].
|
||||
|
||||
=== Deduction Guides
|
||||
|
||||
Deduction guides for
|
||||
https://en.cppreference.com/w/cpp/language/class_template_argument_deduction[class template argument deduction (CTAD)^]
|
||||
are only available on {cpp}17 (or later) compilers.
|
||||
|
||||
=== Piecewise Pair Emplacement
|
||||
|
||||
In accordance with the standard specification,
|
||||
`boost::unordered_[multi]map::emplace` supports piecewise pair construction:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
boost::unordered_multimap<std::string, std::complex<double>> x;
|
||||
|
||||
x.emplace(
|
||||
std::piecewise_construct,
|
||||
std::make_tuple("key"), std::make_tuple(1, 2));
|
||||
----
|
||||
|
||||
Additionally, the same
|
||||
functionality is provided via non-standard `boost::unordered::piecewise_construct`
|
||||
and Boost.Tuple:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
x.emplace(
|
||||
boost::unordered::piecewise_construct,
|
||||
boost::make_tuple("key"), boost::make_tuple(1, 2));
|
||||
----
|
||||
|
||||
This feature has been retained for backwards compatibility with
|
||||
previous versions of Boost.Unordered: users are encouraged to
|
||||
update their code to use `std::piecewise_construct` and
|
||||
``std::tuple``s instead.
|
||||
|
||||
=== Swap
|
||||
|
||||
When swapping, `Pred` and `Hash` are not currently swapped by calling
|
||||
`swap`, their copy constructors are used. As a consequence, when swapping
|
||||
an exception may be thrown from their copy constructor.
|
||||
|
||||
== Open-addressing Containers
|
||||
|
||||
The C++ standard does not currently provide any open-addressing container
|
||||
specification to adhere to, so `boost::unordered_flat_set`/`unordered_node_set` and
|
||||
`boost::unordered_flat_map`/`unordered_node_map` take inspiration from `std::unordered_set` and
|
||||
`std::unordered_map`, respectively, and depart from their interface where
|
||||
convenient or as dictated by their internal data structure, which is
|
||||
radically different from that imposed by the standard (closed addressing).
|
||||
|
||||
Open-addressing containers provided by Boost.Unordered only work with reasonably
|
||||
compliant C++11 (or later) compilers. Language-level features such as move semantics
|
||||
and variadic template parameters are then not emulated.
|
||||
The containers are fully https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^]
|
||||
and support https://en.cppreference.com/w/cpp/named_req/Allocator#Fancy_pointers[fancy pointers^].
|
||||
|
||||
|
||||
The main differences with C++ unordered associative containers are:
|
||||
|
||||
* In general:
|
||||
** `begin()` is not constant-time.
|
||||
** `erase(iterator)` does not return an iterator to the following element, but
|
||||
a proxy object that converts to that iterator if requested; this avoids
|
||||
a potentially costly iterator increment operation when not needed.
|
||||
** There is no API for bucket handling (except `bucket_count`).
|
||||
** The maximum load factor of the container is managed internally and can't be set by the user. The maximum load,
|
||||
exposed through the public function `max_load`, may decrease on erasure under high-load conditions.
|
||||
* Flat containers (`boost::unordered_flat_set` and `boost::unordered_flat_map`):
|
||||
** `value_type` must be move-constructible.
|
||||
** Pointer stability is not kept under rehashing.
|
||||
** There is no API for node extraction/insertion.
|
||||
|
||||
== Concurrent Containers
|
||||
|
||||
There is currently no specification in the C++ standard for this or any other type of concurrent
|
||||
data structure. The APIs of `boost::concurrent_flat_set`/`boost::concurrent_node_set` and
|
||||
`boost::concurrent_flat_map`/`boost::concurrent_node_map`
|
||||
are modelled after `boost::unordered_flat_set` and `boost::unordered_flat_map`, respectively,
|
||||
with the crucial difference that iterators are not provided
|
||||
due to their inherent problems in concurrent scenarios (high contention, prone to deadlocking):
|
||||
so, Boost.Unordered concurrent containers are technically not models of
|
||||
https://en.cppreference.com/w/cpp/named_req/Container[Container^], although
|
||||
they meet all the requirements of https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^]
|
||||
containers (including
|
||||
https://en.cppreference.com/w/cpp/named_req/Allocator#Fancy_pointers[fancy pointer^] support)
|
||||
except those implying iterators.
|
||||
|
||||
In a non-concurrent unordered container, iterators serve two main purposes:
|
||||
|
||||
* Access to an element previously located via lookup.
|
||||
* Container traversal.
|
||||
|
||||
In place of iterators, Boost.Unordered concurrent containers use _internal visitation_
|
||||
facilities as a thread-safe substitute. Classical operations returning an iterator to an
|
||||
element already existing in the container, like for instance:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
iterator find(const key_type& k);
|
||||
std::pair<iterator, bool> insert(const value_type& obj);
|
||||
----
|
||||
|
||||
are transformed to accept a _visitation function_ that is passed such element:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
template<class F> size_t visit(const key_type& k, F f);
|
||||
template<class F> bool insert_or_visit(const value_type& obj, F f);
|
||||
----
|
||||
|
||||
(In the second case `f` is only invoked if there's an equivalent element
|
||||
to `obj` in the table, not if insertion is successful). Container traversal
|
||||
is served by:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
template<class F> size_t visit_all(F f);
|
||||
----
|
||||
|
||||
of which there are parallelized versions in C++17 compilers with parallel
|
||||
algorithm support. In general, the interface of concurrent containers
|
||||
is derived from that of their non-concurrent counterparts by a fairly straightforward
|
||||
process of replacing iterators with visitation where applicable. If for
|
||||
regular maps `iterator` and `const_iterator` provide mutable and const access to elements,
|
||||
respectively, here visitation is granted mutable or const access depending on
|
||||
the constness of the member function used (there are also `*cvisit` overloads for
|
||||
explicit const visitation); in the case of `boost::concurrent_flat_set`, visitation is always const.
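
To make the transformation from iterators to visitation concrete, here is a deliberately simplified toy. It is not how Boost.Unordered is implemented: the class `toy_visit_map` and the function `toy_visit_map_demo` are hypothetical names, and a single `std::mutex` guarding a `std::unordered_map` is assumed purely for illustration. The member signatures mirror the `visit`, `insert_or_visit` and `visit_all` declarations shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>
#include <utility>

// Toy illustration only: one mutex serializes every operation.
template<class Key, class T>
class toy_visit_map
{
public:
  using key_type = Key;
  using value_type = std::pair<const Key, T>;

  // Calls f on the element with key k, if present.
  // Returns the number of elements visited (0 or 1).
  template<class F> std::size_t visit(const key_type& k, F f)
  {
    std::lock_guard<std::mutex> lock(mtx_);
    auto it = map_.find(k);
    if (it == map_.end()) return 0;
    f(*it); // the element is only touched while the lock is held
    return 1;
  }

  // Inserts obj if there is no equivalent element; otherwise calls f
  // on the existing element. Returns true only if insertion happened.
  template<class F> bool insert_or_visit(const value_type& obj, F f)
  {
    std::lock_guard<std::mutex> lock(mtx_);
    auto res = map_.insert(obj);
    if (!res.second) f(*res.first);
    return res.second;
  }

  // Calls f on every element; returns the number of elements visited.
  template<class F> std::size_t visit_all(F f)
  {
    std::lock_guard<std::mutex> lock(mtx_);
    for (auto& x : map_) f(x);
    return map_.size();
  }

private:
  std::mutex mtx_;
  std::unordered_map<Key, T> map_;
};

inline void toy_visit_map_demo()
{
  using element = toy_visit_map<int, int>::value_type;
  toy_visit_map<int, int> m;
  auto increment = [](element& x) { ++x.second; };

  bool inserted = m.insert_or_visit(element(1, 10), increment);
  assert(inserted);  // new key: f is not called
  inserted = m.insert_or_visit(element(1, 99), increment);
  assert(!inserted); // existing key: f increments (1, 10)

  int seen = 0;
  std::size_t n = m.visit(1, [&](element& x) { seen = x.second; });
  assert(n == 1 && seen == 11);
  (void)inserted; (void)n;
}
```

No reference or iterator to an element ever escapes the locked region, which is the property the visitation interface is built around. In this toy, a visitation function that called back into the same map would try to lock `mtx_` a second time, so the sketch also hints at why re-entering the container from inside a visitation function is problematic.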
|
||||
|
||||
One notable operation not provided by `boost::concurrent_flat_map`/`boost::concurrent_node_map`
|
||||
is `operator[]`/`at`, which can be
|
||||
replaced, if in a more convoluted manner, by
|
||||
xref:#concurrent_flat_map_try_emplace_or_cvisit[`try_emplace_or_visit`].
|
||||
|
||||
//-
|
||||
@@ -1,320 +0,0 @@
|
||||
[#concurrent]
|
||||
= Concurrent Containers
|
||||
|
||||
:idprefix: concurrent_
|
||||
|
||||
Boost.Unordered provides `boost::concurrent_node_set`, `boost::concurrent_node_map`,
|
||||
`boost::concurrent_flat_set` and `boost::concurrent_flat_map`,
|
||||
hash tables that allow concurrent write/read access from
|
||||
different threads without having to implement any synchronization mechanism on the user's side.
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
std::vector<int> input;
|
||||
boost::concurrent_flat_map<int,int> m;
|
||||
|
||||
...
|
||||
|
||||
// process input in parallel
|
||||
const int num_threads = 8;
|
||||
std::vector<std::jthread> threads;
|
||||
std::size_t chunk = input.size() / num_threads; // how many elements per thread
|
||||
|
||||
for (int i = 0; i < num_threads; ++i) {
|
||||
threads.emplace_back([&,i] {
|
||||
// calculate the portion of input this thread takes care of
|
||||
std::size_t start = i * chunk;
|
||||
std::size_t end = (i == num_threads - 1)? input.size(): (i + 1) * chunk;
|
||||
|
||||
for (std::size_t n = start; n < end; ++n) {
|
||||
m.emplace(input[n], calculation(input[n]));
|
||||
}
|
||||
});
|
||||
}
|
||||
----
|
||||
|
||||
In the example above, threads access `m` without synchronization, just as we'd do in a
|
||||
single-threaded scenario. In an ideal setting, if a given workload is distributed among
|
||||
_N_ threads, execution is _N_ times faster than with one thread —this limit is
|
||||
never attained in practice due to synchronization overheads and _contention_ (one thread
|
||||
waiting for another to leave a locked portion of the map), but Boost.Unordered concurrent containers
|
||||
are designed to perform with very little overhead and typically achieve _linear scaling_
|
||||
(that is, performance is proportional to the number of threads up to the number of
|
||||
logical cores in the CPU).
|
||||
|
||||
== Visitation-based API
|
||||
|
||||
The first thing a new user of Boost.Unordered concurrent containers
|
||||
will notice is that these classes _do not provide iterators_ (which makes them technically
|
||||
not https://en.cppreference.com/w/cpp/named_req/Container[Containers^]
|
||||
in the C++ standard sense). The reason for this is that iterators are inherently
|
||||
thread-unsafe. Consider this hypothetical code:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
auto it = m.find(k); // A: get an iterator pointing to the element with key k
|
||||
if (it != m.end() ) {
|
||||
some_function(*it); // B: use the value of the element
|
||||
}
|
||||
----
|
||||
|
||||
In a multithreaded scenario, the iterator `it` may be invalid at point B if some other
|
||||
thread issues an `m.erase(k)` operation between A and B. There are designs that
|
||||
can remedy this by making iterators lock the element they point to, but this
|
||||
approach lends itself to high contention and can easily produce deadlocks in a program.
|
||||
`operator[]` has similar concurrency issues, and is not provided by
|
||||
`boost::concurrent_flat_map`/`boost::concurrent_node_map` either. Instead, element access is done through
|
||||
so-called _visitation functions_:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit(k, [](const auto& x) { // x is the element with key k (if it exists)
|
||||
some_function(x); // use it
|
||||
});
|
||||
----
|
||||
|
||||
The visitation function passed by the user (in this case, a lambda function)
|
||||
is executed internally by Boost.Unordered in
|
||||
a thread-safe manner, so it can access the element without worrying about other
|
||||
threads interfering in the process.
|
||||
|
||||
On the other hand, a visitation function can _not_ access the container itself:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit(k, [&](const auto& x) {
|
||||
some_function(x, m.size()); // forbidden: m can't be accessed inside visitation
|
||||
});
|
||||
----
|
||||
|
||||
Access to a different container is allowed, though:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit(k, [&](const auto& x) {
|
||||
if (some_function(x)) {
|
||||
m2.insert(x); // OK, m2 is a different boost::concurrent_flat_map
|
||||
}
|
||||
});
|
||||
----
|
||||
|
||||
But, in general, visitation functions should be as lightweight as possible to
|
||||
reduce contention and increase parallelization. In some cases, moving heavy work
|
||||
outside of visitation may be beneficial:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
std::optional<value_type> o;
|
||||
bool found = m.visit(k, [&](const auto& x) {
|
||||
o.emplace(x); // copy the element out; emplace also works for map elements, which are not assignable
|
||||
});
|
||||
if (found) {
|
||||
some_heavy_duty_function(*o);
|
||||
}
|
||||
----
|
||||
|
||||
Visitation is prominent in the API provided by concurrent containers, and
|
||||
many classical operations have visitation-enabled variations:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.insert_or_visit(x, [](auto& y) {
|
||||
// if insertion failed because of an equivalent element y,
|
||||
// do something with it, for instance:
|
||||
++y.second; // increment the mapped part of the element
|
||||
});
|
||||
----
|
||||
|
||||
Note that in this last example the visitation function could actually _modify_
|
||||
the element: as a general rule, operations on a concurrent map `m`
|
||||
will grant visitation functions const/non-const access to the element depending on whether
|
||||
`m` is const/non-const. Const access can always be explicitly requested
|
||||
by using `cvisit` overloads (for instance, `insert_or_cvisit`) and may result
|
||||
in higher parallelization. For concurrent sets, on the other hand,
|
||||
visitation is always const access.
|
||||
|
||||
Although expected to be used much less frequently, concurrent containers
|
||||
also provide insertion operations where an element can be visited right after
|
||||
element creation (in addition to the usual visitation when an equivalent
|
||||
element already exists):
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.insert_and_cvisit(x,
|
||||
[](const auto& y) {
|
||||
std::cout<< "(" << y.first << ", " << y.second <<") inserted\n";
|
||||
},
|
||||
[](const auto& y) {
|
||||
std::cout<< "(" << y.first << ", " << y.second << ") already exists\n";
|
||||
});
|
||||
----
|
||||
|
||||
Consult the references of
|
||||
xref:#concurrent_node_set[`boost::concurrent_node_set`],
|
||||
xref:#concurrent_node_map[`boost::concurrent_node_map`],
|
||||
xref:#concurrent_flat_set[`boost::concurrent_flat_set`] and
|
||||
xref:#concurrent_flat_map[`boost::concurrent_flat_map`]
|
||||
for the complete list of visitation-enabled operations.
|
||||
|
||||
== Whole-Table Visitation
|
||||
|
||||
In the absence of iterators, `visit_all` is provided
|
||||
as an alternative way to process all the elements in the container:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit_all([](auto& x) {
|
||||
x.second = 0; // reset the mapped part of the element
|
||||
});
|
||||
----
|
||||
|
||||
In C++17 compilers implementing standard parallel algorithms, whole-table
|
||||
visitation can be parallelized:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit_all(std::execution::par, [](auto& x) { // run in parallel
|
||||
x.second = 0; // reset the mapped part of the element
|
||||
});
|
||||
----
|
||||
|
||||
Traversal can be interrupted midway:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
// finds the key to a given (unique) value
|
||||
|
||||
int key = 0;
|
||||
int value = ...;
|
||||
bool found = !m.visit_while([&](const auto& x) {
|
||||
if(x.second == value) {
|
||||
key = x.first;
|
||||
return false; // finish
|
||||
}
|
||||
else {
|
||||
return true; // keep on visiting
|
||||
}
|
||||
});
|
||||
|
||||
if(found) { ... }
|
||||
----
|
||||
|
||||
There is one last whole-table visitation operation, `erase_if`:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.erase_if([](auto& x) {
|
||||
return x.second == 0; // erase the elements whose mapped value is zero
|
||||
});
|
||||
----
|
||||
|
||||
`visit_while` and `erase_if` can also be parallelized. Note that, in order to increase efficiency,
|
||||
whole-table visitation operations do not block the table during execution: this implies that elements
|
||||
may be inserted, modified or erased by other threads during visitation. It is
|
||||
advisable not to assume too much about the exact global state of a concurrent container
|
||||
at any point in your program.
|
||||
|
||||
== Bulk visitation
|
||||
|
||||
Suppose you have an `std::array` of keys you want to look up in a concurrent map:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
std::array<int, N> keys;
|
||||
...
|
||||
for(const auto& key: keys) {
|
||||
m.visit(key, [](auto& x) { ++x.second; });
|
||||
}
|
||||
----
|
||||
|
||||
_Bulk visitation_ allows us to pass all the keys in one operation:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit(keys.begin(), keys.end(), [](auto& x) { ++x.second; });
|
||||
----
|
||||
|
||||
This functionality is not provided for mere syntactic convenience, though: by processing all the
|
||||
keys at once, some internal optimizations can be applied that increase
|
||||
performance over the regular, one-at-a-time case (consult the
|
||||
xref:#benchmarks_boostconcurrent_flat_map[benchmarks]). In fact, it may be beneficial
|
||||
to buffer incoming keys so that they can be bulk visited in chunks:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
static constexpr auto bulk_visit_size = boost::concurrent_flat_map<int,int>::bulk_visit_size;
|
||||
std::array<int, bulk_visit_size> buffer;
|
||||
std::size_t i=0;
|
||||
while(...) { // processing loop
|
||||
...
|
||||
buffer[i++] = k;
|
||||
if(i == bulk_visit_size) {
|
||||
map.visit(buffer.begin(), buffer.end(), [](auto& x) { ++x.second; });
|
||||
i = 0;
|
||||
}
|
||||
...
|
||||
}
|
||||
// flush remaining keys
|
||||
map.visit(buffer.begin(), buffer.begin() + i, [](auto& x) { ++x.second; });
|
||||
----
|
||||
|
||||
There's a latency/throughput tradeoff here: it will take longer for incoming keys to
|
||||
be processed (since they are buffered), but the number of processed keys per second
|
||||
is higher. `bulk_visit_size` is the recommended chunk size —smaller buffers
|
||||
may yield worse performance.
|
||||
|
||||
== Blocking Operations
|
||||
|
||||
Concurrent containers can be copied, assigned, cleared and merged just like any other
|
||||
Boost.Unordered container. Unlike most other operations, these are _blocking_,
|
||||
that is, all other threads are prevented from accessing the tables involved while a copy, assignment,
|
||||
clear or merge operation is in progress. Blocking is taken care of automatically by the library
|
||||
and the user need not take any special precaution, but overall performance may be affected.
|
||||
|
||||
Another blocking operation is _rehashing_, which happens explicitly via `rehash`/`reserve`
|
||||
or during insertion when the table's load hits `max_load()`. As with non-concurrent containers,
|
||||
reserving space in advance of bulk insertions will generally speed up the process.
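
The effect of reserving in advance can be observed with a small experiment. The sketch below uses `std::unordered_map` as a stand-in and counts how often the bucket array changes during insertion; the helper name `count_rehashes` is hypothetical, and treating a change in `bucket_count()` as a sign of rehashing is an assumption of this example. In a concurrent container each such rehash would additionally block the other threads.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: inserts n distinct keys and returns how many times
// bucket_count() changed along the way, optionally reserving space first.
inline std::size_t count_rehashes(std::size_t n, bool reserve_first)
{
  std::unordered_map<std::size_t, std::size_t> m;
  if (reserve_first) m.reserve(n); // size the table for n elements up front

  std::size_t rehashes = 0;
  std::size_t buckets = m.bucket_count();
  for (std::size_t i = 0; i < n; ++i) {
    m.emplace(i, i);
    if (m.bucket_count() != buckets) { // the table grew: a rehash took place
      ++rehashes;
      buckets = m.bucket_count();
    }
  }
  assert(m.size() == n); // all keys were distinct
  return rehashes;
}
```

Comparing `count_rehashes(n, true)` with `count_rehashes(n, false)` for a large `n` shows the rehashes that an up-front `reserve` avoids.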
|
||||
|
||||
== Interoperability with non-concurrent containers
|
||||
|
||||
As open-addressing and concurrent containers are based on the same internal data structure,
|
||||
they can be efficiently move-constructed from their non-concurrent counterpart, and vice versa.
|
||||
|
||||
[caption=, title='Table {counter:table-counter}. Concurrent/non-concurrent interoperability']
|
||||
[cols="1,1", frame=all, grid=all]
|
||||
|===
|
||||
^|`boost::concurrent_node_set`
|
||||
^|`boost::unordered_node_set`
|
||||
|
||||
^|`boost::concurrent_node_map`
|
||||
^|`boost::unordered_node_map`
|
||||
|
||||
^|`boost::concurrent_flat_set`
|
||||
^|`boost::unordered_flat_set`
|
||||
|
||||
^|`boost::concurrent_flat_map`
|
||||
^|`boost::unordered_flat_map`
|
||||
|
||||
|===
|
||||
|
||||
This interoperability comes in handy in multistage scenarios where parts of the data processing happen
|
||||
in parallel whereas other steps are non-concurrent (or non-modifying). In the following example,
|
||||
we want to construct a histogram from a huge input vector of words:
|
||||
the population phase can be done in parallel with `boost::concurrent_flat_map` and results
|
||||
then transferred to the final container.
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
std::vector<std::string> words = ...;
|
||||
|
||||
// Insert words in parallel
|
||||
boost::concurrent_flat_map<std::string_view, std::size_t> m0;
|
||||
std::for_each(
|
||||
std::execution::par, words.begin(), words.end(),
|
||||
[&](const auto& word) {
|
||||
m0.try_emplace_or_visit(word, 1, [](auto& x) { ++x.second; });
|
||||
});
|
||||
|
||||
// Transfer to a regular unordered_flat_map
|
||||
boost::unordered_flat_map m=std::move(m0);
|
||||
----
|
||||
@@ -1,20 +0,0 @@
|
||||
[#copyright]
|
||||
= Copyright and License
|
||||
|
||||
:idprefix: copyright_
|
||||
|
||||
*Daniel James*
|
||||
|
||||
Copyright (C) 2003, 2004 Jeremy B. Maitin-Shepard
|
||||
|
||||
Copyright (C) 2005-2008 Daniel James
|
||||
|
||||
Copyright (C) 2022-2023 Christian Mazakas
|
||||
|
||||
Copyright (C) 2022-2024 Joaquín M López Muñoz
|
||||
|
||||
Copyright (C) 2022-2023 Peter Dimov
|
||||
|
||||
Copyright (C) 2024 Braden Ganetsky
|
||||
|
||||
Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
|
||||
@@ -1,90 +0,0 @@
|
||||
[#debuggability]
|
||||
:idprefix: debuggability_
|
||||
|
||||
= Debuggability
|
||||
|
||||
== Visual Studio Natvis
|
||||
|
||||
All containers and iterators have custom visualizations in the Natvis framework.
|
||||
|
||||
=== Using in your project
|
||||
|
||||
To visualize Boost.Unordered containers in the Natvis framework in your project, simply add the file link:https://github.com/boostorg/unordered/blob/develop/extra/boost_unordered.natvis[/extra/boost_unordered.natvis] to your Visual Studio project as an "Existing Item".
|
||||
|
||||
=== Visualization structure
|
||||
|
||||
The visualizations mirror those for the standard unordered containers. A container has a maximum of 100 elements displayed at once. Each set element has its item name listed as `[i]`, where `i` is the index in the display, starting at `0`. Each map element has its item name listed as `[\{key-display}]` by default. For example, if the first element is the pair `("abc", 1)`, the item name will be `["abc"]`. This behaviour can be overridden by using the view "ShowElementsByIndex", which switches the map display behaviour to name the elements by index. This same view name is used in the standard unordered containers.
|
||||
|
||||
By default, the closed-addressing containers will show the `[hash_function]` and `[key_eq]`, the `[spare_hash_function]` and `[spare_key_eq]` if applicable, the `[allocator]`, and the elements. Using the view "detailed" adds the `[bucket_count]` and `[max_load_factor]`. Conversely, using the view "simple" shows only the elements, with no other items present.
|
||||
|
||||
By default, the open-addressing containers will show the `[hash_function]`, `[key_eq]`, `[allocator]`, and the elements. Using the view "simple" shows only the elements, with no other items present. Both the SIMD and the non-SIMD implementations are viewable through the Natvis framework.
|
||||
|
||||
Iterators are displayed similarly to their standard counterparts. An iterator is displayed as though it were the element that it points to. An end iterator is simply displayed as `{ end iterator }`.
|
||||
|
||||
=== Fancy pointers
|
||||
|
||||
The container visualizations also work if you are using fancy pointers in your allocator, such as `boost::interprocess::offset_ptr`. While this is rare, Boost.Unordered has natvis customization points to support any type of fancy pointer. `boost::interprocess::offset_ptr` has support already defined in the Boost.Interprocess library, and you can add support to your own type by following the instructions contained in a comment near the end of the file link:https://github.com/boostorg/unordered/blob/develop/extra/boost_unordered.natvis[/extra/boost_unordered.natvis].
|
||||
|
||||
== GDB Pretty-Printers
|
||||
|
||||
All containers and iterators have a custom GDB pretty-printer.
|
||||
|
||||
=== Using in your project
|
||||
|
||||
When using pretty-printers, you must first enable pretty-printing as shown below. This is typically a one-time setup.
|
||||
|
||||
```plaintext
|
||||
(gdb) set print pretty on
|
||||
```
|
||||
|
||||
By default, if you compile into an ELF binary format, your binary will contain the Boost.Unordered pretty-printers. To use the embedded pretty-printers, ensure you allow auto-loading as shown below. This must be done every time you load GDB; alternatively, add the command to a ".gdbinit" file.
|
||||
|
||||
```plaintext
|
||||
(gdb) add-auto-load-safe-path [/path/to/executable]
|
||||
```
|
||||
|
||||
You can choose to compile your binary _without_ embedding the pretty-printers by defining `BOOST_ALL_NO_EMBEDDED_GDB_SCRIPTS`, which disables the embedded GDB pretty-printers for all Boost libraries that have this feature.
|
||||
|
||||
You can load the pretty-printers externally from the non-embedded Python script. Add the script, link:https://github.com/boostorg/unordered/blob/develop/extra/boost_unordered_printers.py[/extra/boost_unordered_printers.py], using the `source` command as shown below.
|
||||
|
||||
```plaintext
|
||||
(gdb) source [/path/to/boost]/libs/unordered/extra/boost_unordered_printers.py
|
||||
```
|
||||
|
||||
=== Visualization structure
|
||||
|
||||
The visualizations mirror the standard unordered containers. The map containers display an association from key to mapped value. The set containers display an association from index to value. An iterator is either displayed with its item, or as an end iterator. Here is what may be shown for an example `boost::unordered_map`, an example `boost::unordered_set`, and their respective begin and end iterators.
|
||||
|
||||
```plaintext
|
||||
(gdb) print example_unordered_map
|
||||
$1 = boost::unordered_map with 3 elements = {["C"] = "c", ["B"] = "b", ["A"] = "a"}
|
||||
(gdb) print example_unordered_map_begin
|
||||
$2 = iterator = { {first = "C", second = "c"} }
|
||||
(gdb) print example_unordered_map_end
|
||||
$3 = iterator = { end iterator }
|
||||
(gdb) print example_unordered_set
|
||||
$4 = boost::unordered_set with 3 elements = {[0] = "c", [1] = "b", [2] = "a"}
|
||||
(gdb) print example_unordered_set_begin
|
||||
$5 = iterator = { "c" }
|
||||
(gdb) print example_unordered_set_end
|
||||
$6 = iterator = { end iterator }
|
||||
```
|
||||
|
||||
The other containers are identical other than replacing "`boost::unordered_{map|set}`" with the appropriate template name when displaying the container itself. Note that each sub-element (i.e. the key, the mapped value, or the value) is displayed based on its own printing settings which may include its own pretty-printer.
|
||||
|
||||
Both the SIMD and the non-SIMD implementations are viewable through the GDB pretty-printers.
|
||||
|
||||
For open-addressing containers where xref:#hash_quality_container_statistics[container statistics] are enabled, you can obtain these statistics by calling `get_stats()` on the container, from within GDB. This is overridden in GDB as an link:https://sourceware.org/gdb/current/onlinedocs/gdb.html/Xmethod-API.html[xmethod], so it will not invoke any C++ synchronization code. See the following printout as an example for the expected format.
|
||||
|
||||
```plaintext
|
||||
(gdb) print example_flat_map.get_stats()
|
||||
$1 = [stats] = {[insertion] = {[count] = 5, [probe_length] = {avg = 1.0, var = 0.0, dev = 0.0}},
|
||||
[successful_lookup] = {[count] = 0, [probe_length] = {avg = 0.0, var = 0.0, dev = 0.0},
|
||||
[num_comparisons] = {avg = 0.0, var = 0.0, dev = 0.0}}, [unsuccessful_lookup] = {[count] = 5,
|
||||
[probe_length] = {avg = 1.0, var = 0.0, dev = 0.0},
|
||||
[num_comparisons] = {avg = 0.0, var = 0.0, dev = 0.0}}}
|
||||
```
|
||||
|
||||
=== Fancy pointers
|
||||
|
||||
The pretty-printers also work if you are using fancy pointers in your allocator, such as `boost::interprocess::offset_ptr`. While this is rare, Boost.Unordered has GDB pretty-printer customization points to support any type of fancy pointer. `boost::interprocess::offset_ptr` has support already defined in the Boost.Interprocess library, and you can add support to your own type by following the instructions contained in a comment near the end of the file link:https://github.com/boostorg/unordered/blob/develop/extra/boost_unordered_printers.py[/extra/boost_unordered_printers.py].
|
||||
@@ -1,149 +0,0 @@
|
||||
[#hash_equality]
|
||||
|
||||
:idprefix: hash_equality_
|
||||
|
||||
= Equality Predicates and Hash Functions
|
||||
|
||||
While the associative containers use an ordering relation to specify how the
|
||||
elements are stored, the unordered associative containers use an equality
|
||||
predicate and a hash function. For example, <<unordered_map,boost::unordered_map>>
|
||||
is declared as:
|
||||
|
||||
```cpp
|
||||
template <
|
||||
class Key, class Mapped,
|
||||
class Hash = boost::hash<Key>,
|
||||
class Pred = std::equal_to<Key>,
|
||||
class Alloc = std::allocator<std::pair<Key const, Mapped> > >
|
||||
class unordered_map;
|
||||
```
|
||||
|
||||
The hash function comes first as you might want to change the hash function
|
||||
but not the equality predicate. For example, if you wanted to use the
|
||||
https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1a_hash[FNV-1a hash^] you could write:
|
||||
|
||||
```cpp
|
||||
boost::unordered_map<std::string, int, hash::fnv_1a>
|
||||
dictionary;
|
||||
```
|
||||
|
||||
There is an link:../../examples/fnv1.hpp[implementation of FNV-1a^] in the examples directory.
|
||||
|
||||
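For readers who want to see what such a hash function object looks like, the following is a minimal sketch of 64-bit FNV-1a written against the standard library only. The names `fnv1a_64` and `fnv1a_hash_sketch` are illustrative and are not the `hash::fnv_1a` class from the examples directory; the offset basis and prime are the published 64-bit FNV parameters.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// 64-bit FNV-1a: for each byte, XOR it into the state,
// then multiply the state by the FNV prime.
inline std::uint64_t fnv1a_64(std::string const& s)
{
    std::uint64_t h = 14695981039346656037ULL;   // FNV offset basis
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        h ^= static_cast<unsigned char>(s[i]);
        h *= 1099511628211ULL;                   // FNV prime
    }
    return h;
}

// Illustrative function object usable as the Hash template argument.
struct fnv1a_hash_sketch
{
    std::size_t operator()(std::string const& s) const
    {
        return static_cast<std::size_t>(fnv1a_64(s));
    }
};
```

A container such as `boost::unordered_map<std::string, int, fnv1a_hash_sketch>` would then be declared in the same way as `dictionary` above.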
If you wish to use a different equality function, you will also need to use a matching hash function. For example, to implement a case-insensitive dictionary you need to define a case-insensitive equality predicate and hash function:
|
||||
|
||||
```cpp
struct iequal_to
{
    bool operator()(std::string const& x,
        std::string const& y) const
    {
        return boost::algorithm::iequals(x, y, std::locale());
    }
};

struct ihash
{
    std::size_t operator()(std::string const& x) const
    {
        std::size_t seed = 0;
        std::locale locale;

        for(std::string::const_iterator it = x.begin();
            it != x.end(); ++it)
        {
            boost::hash_combine(seed, std::toupper(*it, locale));
        }

        return seed;
    }
};
```
|
||||
|
||||
You can then use both function objects in a case-insensitive dictionary:

```cpp
boost::unordered_map<std::string, int, ihash, iequal_to>
    idictionary;
```
|
||||
|
||||
This is a simplified version of the example at
|
||||
link:../../examples/case_insensitive.hpp[/libs/unordered/examples/case_insensitive.hpp^] which supports other locales and string types.
|
||||
|
||||
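The requirement that the hash function match the equality predicate can be stated precisely: whenever the predicate reports two keys as equal, the hash function must return the same value for both. The sketch below checks this for a standard-library-only analogue of `iequal_to` and `ihash`. The names ending in `_sketch` are hypothetical, the code relies on `std::toupper` from `<cctype>` (so it follows the current C locale only), and `std::unordered_map` stands in for `boost::unordered_map`, which accepts the same template arguments.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>

// Upper-cases one character; the cast avoids passing a negative value
// to std::toupper.
inline char ci_upper(char c)
{
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

struct ci_equal_sketch
{
    bool operator()(std::string const& x, std::string const& y) const
    {
        if (x.size() != y.size()) return false;
        for (std::size_t i = 0; i < x.size(); ++i)
            if (ci_upper(x[i]) != ci_upper(y[i])) return false;
        return true;
    }
};

struct ci_hash_sketch
{
    // Hashes the upper-cased characters, so keys that compare equal
    // under ci_equal_sketch always produce the same hash value.
    std::size_t operator()(std::string const& x) const
    {
        std::size_t seed = 0;
        for (std::size_t i = 0; i < x.size(); ++i)
            seed = seed * 31 + static_cast<unsigned char>(ci_upper(x[i]));
        return seed;
    }
};

typedef std::unordered_map<std::string, int, ci_hash_sketch, ci_equal_sketch>
    ci_dictionary_sketch;
```

Pairing `ci_equal_sketch` with a case-sensitive hash would break the requirement: "Boost" and "BOOST" would compare equal but could land in different buckets, so lookups might miss existing elements.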
CAUTION: Be careful when comparing containers with the equality (`==`) operator if they use custom equality predicates. If you compare two containers whose equality predicates differ, the result is undefined. With most stateless function objects this cannot happen: both containers have the same predicate type, so the predicates are necessarily equivalent. With function pointers or a stateful equality predicate (e.g. `boost::function`), however, two containers of the same type can hold different predicates, and comparing them can get you into trouble.
|
||||
|
||||
== Custom Types
|
||||
|
||||
Similarly, a custom hash function can be used for custom types:
|
||||
|
||||
```cpp
struct point {
    int x;
    int y;
};

bool operator==(point const& p1, point const& p2)
{
    return p1.x == p2.x && p1.y == p2.y;
}

struct point_hash
{
    std::size_t operator()(point const& p) const
    {
        std::size_t seed = 0;
        boost::hash_combine(seed, p.x);
        boost::hash_combine(seed, p.y);
        return seed;
    }
};

boost::unordered_multiset<point, point_hash> points;
```
|
||||
|
||||
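The same pattern can be written with the standard library alone, which may help when experimenting without Boost.Hash. The following is only a sketch under stated assumptions: `combine_sketch` uses the classic shift-and-add mixing formula, which is not necessarily what current versions of `boost::hash_combine` do, and every name ending in `_sketch` is made up for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

struct point_sketch {
    int x;
    int y;
};

inline bool operator==(point_sketch const& p1, point_sketch const& p2)
{
    return p1.x == p2.x && p1.y == p2.y;
}

// Folds the hash of v into seed (classic formulation, assumed here).
inline void combine_sketch(std::size_t& seed, int v)
{
    seed ^= std::hash<int>()(v) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
}

struct point_hash_sketch
{
    std::size_t operator()(point_sketch const& p) const
    {
        std::size_t seed = 0;
        combine_sketch(seed, p.x);   // order matters: (1, 2) and (2, 1)
        combine_sketch(seed, p.y);   // are combined in different sequences
        return seed;
    }
};
```

Equal points always hash equally because the hash depends only on `x` and `y`, the same members that `operator==` compares.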
Since the default hash function is link:../../../container_hash/index.html[Boost.Hash^],
|
||||
we can extend it to support the type so that the hash function doesn't need to be explicitly given:
|
||||
|
||||
```cpp
struct point {
    int x;
    int y;
};

bool operator==(point const& p1, point const& p2)
{
    return p1.x == p2.x && p1.y == p2.y;
}

std::size_t hash_value(point const& p) {
    std::size_t seed = 0;
    boost::hash_combine(seed, p.x);
    boost::hash_combine(seed, p.y);
    return seed;
}

// Now the default function objects work.
boost::unordered_multiset<point> points;
```
|
||||
|
||||
See the link:../../../container_hash/index.html[Boost.Hash documentation^] for more detail on how to
do this. Remember that it relies on extensions to the standard, so it
won't work with other implementations of the unordered associative containers;
for those, you'll need to explicitly use Boost.Hash.
|
||||
|
||||
[caption=, title='Table {counter:table-counter}. Methods for accessing the hash and equality functions']
|
||||
[cols="1,.^1", frame=all, grid=rows]
|
||||
|===
|
||||
|Method |Description
|
||||
|
||||
|`hasher hash_function() const`
|
||||
|Returns the container's hash function.
|
||||
|
||||
|`key_equal key_eq() const`
|
||||
|Returns the container's key equality function.
|
||||
|
||||
|===
|
||||
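As a small illustration of the two accessors in the table, the sketch below asks a container for its own function objects and uses them to decide whether two keys would be treated as the same key. `treated_as_same_key` and `demo_same_key` are hypothetical helpers; `std::unordered_set` is used so that the example compiles without Boost, on the assumption that the Boost.Unordered containers expose `hash_function()` and `key_eq()` with the signatures shown above.

```cpp
#include <string>
#include <unordered_set>

// Returns true if a and b hash equally and compare equal according to
// the function objects stored in container c.
template <class Container, class Key>
bool treated_as_same_key(Container const& c, Key const& a, Key const& b)
{
    typename Container::hasher h = c.hash_function();
    typename Container::key_equal eq = c.key_eq();
    return h(a) == h(b) && eq(a, b);
}

inline bool demo_same_key()
{
    std::unordered_set<std::string> s;
    return treated_as_same_key(s, std::string("x"), std::string("x"));
}
```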
@@ -1,145 +0,0 @@
|
||||
[#hash_quality]
|
||||
= Hash Quality
|
||||
|
||||
:idprefix: hash_quality_
|
||||
|
||||
In order to work properly, hash tables require that the supplied hash function
|
||||
be of __good quality__, roughly meaning that it uses its `std::size_t` output
|
||||
space as uniformly as possible, much like a random number generator would do
|
||||
—except, of course, that the value of a hash function is not random but strictly determined
|
||||
by its input argument.
|
||||
|
||||
Closed-addressing containers in Boost.Unordered are fairly robust against
|
||||
hash functions with less-than-ideal quality, but open-addressing and concurrent
|
||||
containers are much more sensitive to this factor, and their performance can
|
||||
degrade dramatically if the hash function is not appropriate. In general, if
|
||||
you're using functions provided by or generated with link:../../../container_hash/index.html[Boost.Hash^],
|
||||
the quality will be adequate, but you have to be careful when using alternative
|
||||
hash algorithms.
|
||||
|
||||
The rest of this section applies only to open-addressing and concurrent containers.
|
||||
|
||||
== Hash Post-mixing and the Avalanching Property
|
||||
|
||||
Even if your supplied hash function does not conform to the uniform behavior
|
||||
required by open addressing, chances are that
|
||||
the performance of Boost.Unordered containers will be acceptable, because the library
|
||||
executes an internal __post-mixing__ step that improves the statistical
|
||||
properties of the calculated hash values. This comes with an extra computational
|
||||
cost; if you'd like to opt out of post-mixing, annotate your hash function as
|
||||
follows:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
struct my_string_hash_function
|
||||
{
|
||||
using is_avalanching = std::true_type; // instruct Boost.Unordered to not use post-mixing
|
||||
|
||||
std::size_t operator()(const std::string& x) const
|
||||
{
|
||||
...
|
||||
}
|
||||
};
|
||||
----
|
||||
|
||||
By setting the
|
||||
xref:#hash_traits_hash_is_avalanching[hash_is_avalanching] trait, we inform Boost.Unordered
|
||||
that `my_string_hash_function` is of sufficient quality to be used directly without
|
||||
any post-mixing safety net. This comes at the risk of degraded performance in the
|
||||
cases where the hash function is not as well-behaved as we've declared.
|
||||
|
||||
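Before declaring a hash function avalanching, it can help to measure how it reacts to small input changes. The sketch below is an informal check and not part of Boost.Unordered: it flips each input bit in turn and reports how many of the 64 output bits change on average. A value near 32 is what a well-mixed function would show. `identity_hash_sketch` and `mixing_hash_sketch` (the latter borrowing the SplitMix64 finalizer) are assumptions chosen purely for illustration.

```cpp
#include <cstdint>

// Number of set bits in x.
inline int popcount64(std::uint64_t x)
{
    int n = 0;
    while (x) { x &= x - 1; ++n; }
    return n;
}

// Average number of output bits that change when one input bit is
// flipped, taken over the inputs 0 .. num_inputs - 1.
template <class Hash>
double average_flipped_bits(Hash h, std::uint64_t num_inputs)
{
    std::uint64_t total = 0;
    for (std::uint64_t x = 0; x < num_inputs; ++x) {
        for (int bit = 0; bit < 64; ++bit) {
            std::uint64_t y = x ^ (std::uint64_t(1) << bit);
            total += popcount64(h(x) ^ h(y));
        }
    }
    return double(total) / (double(num_inputs) * 64.0);
}

// Changes exactly one output bit per flipped input bit: not avalanching.
struct identity_hash_sketch
{
    std::uint64_t operator()(std::uint64_t x) const { return x; }
};

// SplitMix64 finalizer, used here as an example of a mixing function.
struct mixing_hash_sketch
{
    std::uint64_t operator()(std::uint64_t x) const
    {
        x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
        x ^= x >> 27; x *= 0x94d049bb133111ebULL;
        x ^= x >> 31;
        return x;
    }
};
```

A hash function scoring close to 1, like the identity, is a poor candidate for the `is_avalanching` annotation.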
== Container Statistics
|
||||
|
||||
If we globally define the macro `BOOST_UNORDERED_ENABLE_STATS`, open-addressing and
|
||||
concurrent containers will calculate some internal statistics directly correlated to the
|
||||
quality of the hash function:
|
||||
|
||||
[source,c++]
----
#define BOOST_UNORDERED_ENABLE_STATS
#include <boost/unordered/unordered_flat_map.hpp>

...

int main()
{
  boost::unordered_flat_map<std::string, int, my_string_hash> m;
  ... // use m

  auto stats = m.get_stats();
  ... // inspect stats
}
----
|
||||
|
||||
The `stats` object provides the following information:
|
||||
|
||||
[source,subs=+quotes]
----
stats
  .insertion                // *Insertion operations*
    .count                  // Number of operations
    .probe_length           // Probe length per operation
      .average
      .variance
      .deviation
  .successful_lookup        // *Lookup operations (element found)*
    .count                  // Number of operations
    .probe_length           // Probe length per operation
      .average
      .variance
      .deviation
    .num_comparisons        // Elements compared per operation
      .average
      .variance
      .deviation
  .unsuccessful_lookup      // *Lookup operations (element not found)*
    .count                  // Number of operations
    .probe_length           // Probe length per operation
      .average
      .variance
      .deviation
    .num_comparisons        // Elements compared per operation
      .average
      .variance
      .deviation
----
|
||||
|
||||
Statistics for three internal operations are maintained: insertions (without considering
|
||||
the previous lookup to determine that the key is not present yet), successful lookups,
|
||||
and unsuccessful lookups (including those issued internally when inserting elements).
|
||||
_Probe length_ is the number of
|
||||
xref:#structures_open_addressing_containers[bucket groups] accessed per operation.
|
||||
If the hash function behaves properly:
|
||||
|
||||
* Average probe lengths should be close to 1.0.
|
||||
* The average number of comparisons per successful lookup should be close to 1.0 (that is,
|
||||
just the element found is checked).
|
||||
* The average number of comparisons per unsuccessful lookup should be close to 0.0.
|
||||
|
||||
An link:../../benchmark/string_stats.cpp[example^] is provided that displays container
|
||||
statistics for `boost::hash<std::string>`, an implementation of the
|
||||
https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1a_hash[FNV-1a hash^]
|
||||
and two ill-behaved custom hash functions that have been incorrectly marked as avalanching:
|
||||
|
||||
[listing]
|
||||
----
|
||||
boost::unordered_flat_map: 319 ms
|
||||
insertion: probe length 1.08771
|
||||
successful lookup: probe length 1.06206, num comparisons 1.02121
|
||||
unsuccessful lookup: probe length 1.12301, num comparisons 0.0388251
|
||||
|
||||
boost::unordered_flat_map, FNV-1a: 301 ms
|
||||
insertion: probe length 1.09567
|
||||
successful lookup: probe length 1.06202, num comparisons 1.0227
|
||||
unsuccessful lookup: probe length 1.12195, num comparisons 0.040527
|
||||
|
||||
boost::unordered_flat_map, slightly_bad_hash: 654 ms
|
||||
insertion: probe length 1.03443
|
||||
successful lookup: probe length 1.04137, num comparisons 6.22152
|
||||
unsuccessful lookup: probe length 1.29334, num comparisons 11.0335
|
||||
|
||||
boost::unordered_flat_map, bad_hash: 12216 ms
|
||||
insertion: probe length 699.218
|
||||
successful lookup: probe length 590.183, num comparisons 43.4886
|
||||
unsuccessful lookup: probe length 1361.65, num comparisons 75.238
|
||||
----
|
||||
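The expectations for a well-behaved hash function can be turned into a simple automated check. The sketch below is one possible way to do it: `hash_looks_well_behaved` and its `tolerance` parameter are hypothetical, and the `*_sketch` structs merely mirror the `stats` member names documented earlier in this section so that the example compiles without Boost. With Boost available, the result of `get_stats()` could presumably be passed instead, assuming its members are spelled as documented.

```cpp
#include <cstddef>

// Stand-ins mirroring the documented member names of the stats object.
struct summary_sketch { double average; double variance; double deviation; };
struct insertion_sketch { std::size_t count; summary_sketch probe_length; };
struct lookup_sketch
{
    std::size_t count;
    summary_sketch probe_length;
    summary_sketch num_comparisons;
};
struct stats_sketch
{
    insertion_sketch insertion;
    lookup_sketch successful_lookup;
    lookup_sketch unsuccessful_lookup;
};

// True if probe lengths and successful-lookup comparisons stay within
// 1.0 + tolerance and unsuccessful-lookup comparisons within tolerance.
template <class Stats>
bool hash_looks_well_behaved(Stats const& s, double tolerance)
{
    return s.insertion.probe_length.average <= 1.0 + tolerance
        && s.successful_lookup.probe_length.average <= 1.0 + tolerance
        && s.successful_lookup.num_comparisons.average <= 1.0 + tolerance
        && s.unsuccessful_lookup.probe_length.average <= 1.0 + tolerance
        && s.unsuccessful_lookup.num_comparisons.average <= tolerance;
}

// Builds a stats_sketch from the five averages; everything else is zero.
inline stats_sketch make_stats_sketch(
    double insertion_probe, double hit_probe, double hit_comparisons,
    double miss_probe, double miss_comparisons)
{
    stats_sketch s = stats_sketch();
    s.insertion.probe_length.average = insertion_probe;
    s.successful_lookup.probe_length.average = hit_probe;
    s.successful_lookup.num_comparisons.average = hit_comparisons;
    s.unsuccessful_lookup.probe_length.average = miss_probe;
    s.unsuccessful_lookup.num_comparisons.average = miss_comparisons;
    return s;
}
```

Fed with the figures printed above and a tolerance of 0.5, the check accepts the `boost::hash` and FNV-1a rows and rejects both `slightly_bad_hash` (too many comparisons) and `bad_hash` (very long probe sequences).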
@@ -1,12 +0,0 @@
|
||||
= Boost.Unordered
|
||||
|
||||
:toc: left
|
||||
:toclevels: 3
|
||||
:idprefix:
|
||||
:docinfo: private-footer
|
||||
:source-highlighter: rouge
|
||||
:source-language: c++
|
||||
:nofooter:
|
||||
:sectlinks:
|
||||
|
||||
:leveloffset: +1
|
||||
@@ -1,100 +0,0 @@
|
||||
[#intro]
|
||||
= Introduction
|
||||
|
||||
:idprefix: intro_
|
||||
:cpp: C++
|
||||
|
||||
link:https://en.wikipedia.org/wiki/Hash_table[Hash tables^] are extremely popular
|
||||
computer data structures and can be found under one form or another in virtually any programming
|
||||
language. Whereas other associative structures such as rb-trees (used in {cpp} by `std::set` and `std::map`)
|
||||
have logarithmic-time complexity for insertion and lookup, hash tables, if configured properly,
|
||||
perform these operations in constant time on average, and are generally much faster.
|
||||
|
||||
{cpp} introduced __unordered associative containers__ `std::unordered_set`, `std::unordered_map`,
|
||||
`std::unordered_multiset` and `std::unordered_multimap` in {cpp}11, but research on hash tables
|
||||
hasn't stopped since: advances in CPU architectures such as
|
||||
more powerful caches, link:https://en.wikipedia.org/wiki/Single_instruction,_multiple_data[SIMD] operations
|
||||
and increasingly available link:https://en.wikipedia.org/wiki/Multi-core_processor[multicore processors]
|
||||
open up possibilities for improved hash-based data structures and new use cases that
|
||||
are simply beyond reach of unordered associative containers as specified in 2011.
|
||||
|
||||
Boost.Unordered offers a catalog of hash containers with different standards compliance levels,
|
||||
performance and intended usage scenarios:
|
||||
|
||||
[caption=, title='Table {counter:table-counter}. Boost.Unordered containers']
|
||||
[cols="1,1,.^1", frame=all, grid=all]
|
||||
|===
|
||||
^h|
|
||||
^h|*Node-based*
|
||||
^h|*Flat*
|
||||
|
||||
^.^h|*Closed addressing*
|
||||
^m|
|
||||
boost::unordered_set +
|
||||
boost::unordered_map +
|
||||
boost::unordered_multiset +
|
||||
boost::unordered_multimap
|
||||
^|
|
||||
|
||||
^.^h|*Open addressing*
|
||||
^m| boost::unordered_node_set +
|
||||
boost::unordered_node_map
|
||||
^m| boost::unordered_flat_set +
|
||||
boost::unordered_flat_map
|
||||
|
||||
^.^h|*Concurrent*
|
||||
^| `boost::concurrent_node_set` +
|
||||
`boost::concurrent_node_map`
|
||||
^| `boost::concurrent_flat_set` +
|
||||
`boost::concurrent_flat_map`
|
||||
|
||||
|===
|
||||
|
||||
* **Closed-addressing containers** are fully compliant with the C++ specification
|
||||
for unordered associative containers and feature one of the fastest implementations
|
||||
in the market within the technical constraints imposed by the required standard interface.
|
||||
* **Open-addressing containers** rely on much faster data structures and algorithms
|
||||
(more than 2 times faster in typical scenarios) while slightly diverging from the standard
|
||||
interface to accommodate the implementation.
|
||||
There are two variants: **flat** (the fastest) and **node-based**, which
provides pointer stability under rehashing at the expense of being slower.
|
||||
* Finally, **concurrent containers** are designed and implemented to be used in high-performance
|
||||
multithreaded scenarios. Their interface is radically different from that of regular C++ containers.
|
||||
Flat and node-based variants are provided.
|
||||
|
||||
All sets and maps in Boost.Unordered are instantiated similarly to
|
||||
`std::unordered_set` and `std::unordered_map`, respectively:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
namespace boost {
|
||||
template <
|
||||
class Key,
|
||||
class Hash = boost::hash<Key>,
|
||||
class Pred = std::equal_to<Key>,
|
||||
class Alloc = std::allocator<Key> >
|
||||
class unordered_set;
|
||||
// same for unordered_multiset, unordered_flat_set, unordered_node_set,
|
||||
// concurrent_flat_set and concurrent_node_set
|
||||
|
||||
template <
|
||||
class Key, class Mapped,
|
||||
class Hash = boost::hash<Key>,
|
||||
class Pred = std::equal_to<Key>,
|
||||
class Alloc = std::allocator<std::pair<Key const, Mapped> > >
|
||||
class unordered_map;
|
||||
// same for unordered_multimap, unordered_flat_map, unordered_node_map,
|
||||
// concurrent_flat_map and concurrent_node_map
|
||||
}
|
||||
----
|
||||
|
||||
Storing an object in an unordered associative container requires both a
|
||||
key equality function and a hash function. The default function objects in
|
||||
the standard containers support a few basic types including integer types,
|
||||
floating point types, pointer types, and the standard strings. Since
|
||||
Boost.Unordered uses link:../../../container_hash/index.html[boost::hash^] it also supports some other types,
|
||||
including standard containers. To use any types not supported by these methods
|
||||
you have to extend Boost.Hash to support the type or use
|
||||
your own custom equality predicates and hash functions. See the
|
||||
<<hash_equality,Equality Predicates and Hash Functions>> section
|
||||
for more details.
|
||||
@@ -1,143 +0,0 @@
|
||||
[#rationale]
|
||||
|
||||
:idprefix: rationale_
|
||||
|
||||
= Implementation Rationale
|
||||
|
||||
== Closed-addressing Containers
|
||||
|
||||
`boost::unordered_[multi]set` and `boost::unordered_[multi]map`
|
||||
adhere to the standard requirements for unordered associative
|
||||
containers, so the interface was fixed. But there are
|
||||
still some implementation decisions to make. The priorities are
|
||||
conformance to the standard and portability.
|
||||
|
||||
The http://en.wikipedia.org/wiki/Hash_table[Wikipedia article on hash tables^]
|
||||
has a good summary of the implementation issues for hash tables in general.
|
||||
|
||||
=== Data Structure
|
||||
|
||||
By specifying an interface for accessing the buckets of the container the
|
||||
standard pretty much requires that the hash table uses closed addressing.
|
||||
|
||||
It would be conceivable to write a hash table that uses another method. For
|
||||
example, it could use open addressing, and use the lookup chain to act as a
|
||||
bucket but there are some serious problems with this:
|
||||
|
||||
* The standard requires that pointers to elements aren't invalidated, so
|
||||
the elements can't be stored in one array, but will need a layer of
|
||||
indirection instead - losing the efficiency and most of the memory gain,
|
||||
the main advantages of open addressing.
|
||||
* Local iterators would be very inefficient and may not be able to
|
||||
meet the complexity requirements.
|
||||
* There are also the restrictions on when iterators can be invalidated. Since
|
||||
open addressing degrades badly when there are a high number of collisions the
|
||||
restrictions could prevent a rehash when it's really needed. The maximum load
|
||||
factor could be set to a fairly low value to work around this - but the
|
||||
standard requires that it is initially set to 1.0.
|
||||
* And since the standard is written with an eye towards closed
|
||||
addressing, users will be surprised if the performance doesn't reflect that.
|
||||
|
||||
So closed addressing is used.
|
||||
|
||||
=== Number of Buckets
|
||||
|
||||
There are two popular methods for choosing the number of buckets in a hash
|
||||
table. One is to have a prime number of buckets, another is to use a power
|
||||
of 2.
|
||||
|
||||
Using a prime number of buckets, and choosing a bucket by using the modulus
|
||||
of the hash function's result will usually give a good result. The downside
|
||||
is that the required modulus operation is fairly expensive. This is what the
|
||||
containers used to do in most cases.
|
||||
|
||||
Using a power of 2 allows for much quicker selection of the bucket to use,
|
||||
but at the expense of losing the upper bits of the hash value. For some
|
||||
specially designed hash functions it is possible to do this and still get a
|
||||
good result but as the containers can take arbitrary hash functions this can't
|
||||
be relied on.
|
||||
|
||||
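The trade-off between the two strategies can be made concrete with a small sketch. `mask_bucket` and `modulo_bucket` are hypothetical helper names, and the bucket counts 16 and 13 are arbitrary; the point is only that masking discards the upper bits of the hash value while a prime modulus lets every bit contribute.

```cpp
#include <cstdint>

// Power-of-two bucket count: keep only the low bits of h.
// n must be a power of 2.
inline std::uint64_t mask_bucket(std::uint64_t h, std::uint64_t n)
{
    return h & (n - 1);
}

// Prime bucket count: every bit of h influences the result,
// at the cost of a division.
inline std::uint64_t modulo_bucket(std::uint64_t h, std::uint64_t p)
{
    return h % p;
}
```

For instance, the hash values `0x10000` and `0x20000` differ only in their upper bits: with 16 buckets both are masked to bucket 0, whereas with 13 buckets they map to buckets 3 and 6.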
To avoid this a transformation could be applied to the hash function, for an
|
||||
example see
|
||||
http://web.archive.org/web/20121102023700/http://www.concentric.net/~Ttwang/tech/inthash.htm[Thomas Wang's article on integer hash functions^].
|
||||
Unfortunately, a transformation like Wang's requires knowledge of the number
|
||||
of bits in the hash value, so it was only used when `size_t` was 64 bit.
|
||||
|
||||
Since release 1.79.0, https://en.wikipedia.org/wiki/Hash_function#Fibonacci_hashing[Fibonacci hashing]
|
||||
is used instead. With this implementation, the bucket number is determined
|
||||
by using `(h * m) >> (w - k)`, where `h` is the hash value, `m` is `2^w` divided
|
||||
by the golden ratio, `w` is the word size (32 or 64), and `2^k` is the
|
||||
number of buckets. This provides a good compromise between speed and
|
||||
distribution.
|
||||
|
||||
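The formula `(h * m) >> (w - k)` can be sketched as follows for `w` = 64. This is an illustration of the technique and not necessarily the code shipped in release 1.79.0; `fibonacci_bucket` is a hypothetical name, and the constant is 2^64 divided by the golden ratio, truncated to an integer.

```cpp
#include <cassert>
#include <cstdint>

// Bucket index in a table with 2^k buckets, for 64-bit hash values.
inline std::uint64_t fibonacci_bucket(std::uint64_t h, unsigned k)
{
    assert(k >= 1 && k <= 64);   // a shift by 64 bits would be undefined
    const std::uint64_t m = 0x9E3779B97F4A7C15ULL;   // 2^64 / golden ratio
    return (h * m) >> (64 - k);  // multiplication wraps modulo 2^64
}
```

Because the result is taken from the top `k` bits of the product, all bits of `h` affect the chosen bucket, unlike a plain mask of the low bits.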
Since release 1.80.0, prime numbers are chosen for the number of buckets in
|
||||
tandem with sophisticated modulo arithmetic. This removes the need for "mixing"
|
||||
the result of the user's hash function as was used for release 1.79.0.
|
||||
|
||||
== Open-addressing Containers
|
||||
|
||||
The C++ standard specification of unordered associative containers imposes
|
||||
severe limitations on permissible implementations, the most important being
|
||||
that closed addressing is implicitly assumed. Slightly relaxing this specification
|
||||
opens up the possibility of providing container variations taking full
|
||||
advantage of open-addressing techniques.
|
||||
|
||||
The design of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` has been
|
||||
guided by Peter Dimov's https://pdimov.github.io/articles/unordered_dev_plan.html[Development Plan for Boost.Unordered^].
|
||||
We discuss here the most relevant principles.
|
||||
|
||||
=== Hash Function
|
||||
|
||||
Given its rich functionality and cross-platform interoperability,
|
||||
`boost::hash` remains the default hash function of open-addressing containers.
|
||||
As it happens, `boost::hash` for integral and other basic types does not possess
|
||||
the statistical properties required by open addressing; to cope with this,
|
||||
we implement a post-mixing stage:
|
||||
|
||||
{nbsp}{nbsp}{nbsp}{nbsp} _a_ <- _h_ *mulx* _C_, +
|
||||
{nbsp}{nbsp}{nbsp}{nbsp} _h_ <- *high*(_a_) *xor* *low*(_a_),
|
||||
|
||||
where *mulx* is an _extended multiplication_ (128 bits in 64-bit architectures, 64 bits in 32-bit environments),
|
||||
and *high* and *low* are the upper and lower halves of an extended word, respectively.
|
||||
In 64-bit architectures, _C_ is the integer part of 2^64^∕https://en.wikipedia.org/wiki/Golden_ratio[_φ_],
|
||||
whereas in 32 bits _C_ = 0xE817FB2Du has been obtained from https://arxiv.org/abs/2001.05304[Steele and Vigna (2021)^].
|
||||
|
||||
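The post-mixing stage is easiest to follow in its 32-bit flavor, where the extended multiplication fits in a standard 64-bit integer. The sketch below follows the two steps given above with the 32-bit constant _C_ = 0xE817FB2D; `post_mix32` is a hypothetical name and the code is an illustration, not the library's internal implementation. The 64-bit case is analogous but needs a 128-bit product.

```cpp
#include <cstdint>

// 32-bit post-mixing: a <- h mulx C, then h <- high(a) xor low(a).
inline std::uint32_t post_mix32(std::uint32_t h)
{
    const std::uint64_t C = 0xE817FB2Du;
    std::uint64_t a = static_cast<std::uint64_t>(h) * C;   // 32x32 -> 64 bits
    std::uint32_t high = static_cast<std::uint32_t>(a >> 32);
    std::uint32_t low = static_cast<std::uint32_t>(a & 0xFFFFFFFFu);
    return high ^ low;
}
```

As a worked example, `post_mix32(2)` computes _a_ = 0x1D02FF65A, whose halves 0x1 and 0xD02FF65A are combined into 0xD02FF65B.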
When using a hash function directly suitable for open addressing, post-mixing can be opted out of via a dedicated <<hash_traits_hash_is_avalanching,`hash_is_avalanching`>> trait.
|
||||
`boost::hash` specializations for string types are marked as avalanching.
|
||||
|
||||
=== Platform Interoperability
|
||||
|
||||
The observable behavior of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` is deterministically
|
||||
identical across different compilers as long as their ``std::size_t``s are the same size and the user-provided
|
||||
hash function and equality predicate are also interoperable
|
||||
—this includes elements being ordered in exactly the same way for the same sequence of
|
||||
operations.
|
||||
|
||||
Although the implementation internally uses SIMD technologies, such as https://en.wikipedia.org/wiki/SSE2[SSE2^]
|
||||
and https://en.wikipedia.org/wiki/ARM_architecture_family#Advanced_SIMD_(NEON)[Neon^], when available,
|
||||
this does not affect interoperability. For instance, the behavior is the same
|
||||
for Visual Studio on an x64-mode Intel CPU with SSE2 and for GCC on an IBM s390x without any supported SIMD technology.
|
||||
|
||||
== Concurrent Containers
|
||||
|
||||
The same data structure used by Boost.Unordered open-addressing containers has been chosen
|
||||
also as the foundation of `boost::concurrent_flat_set`/`boost::concurrent_node_set` and
|
||||
`boost::concurrent_flat_map`/`boost::concurrent_node_map`:
|
||||
|
||||
* Open-addressing is faster than closed-addressing alternatives, both in non-concurrent and
|
||||
concurrent scenarios.
|
||||
* Open-addressing layouts are eminently suitable for concurrent access and modification
|
||||
with minimal locking. In particular, the metadata array can be used for implementations of
|
||||
lookup that are lock-free up to the last step of actual element comparison.
|
||||
* Layout compatibility with Boost.Unordered flat containers allows for
|
||||
xref:#concurrent_interoperability_with_non_concurrent_containers[fast transfer]
|
||||
of all elements between a concurrent container and its non-concurrent counterpart,
|
||||
and vice versa.
|
||||
|
||||
=== Hash Function and Platform Interoperability
|
||||
|
||||
Concurrent containers make the same decisions and provide the same guarantees
|
||||
as Boost.Unordered open-addressing containers with regard to
|
||||
xref:#rationale_hash_function[hash function defaults] and
|
||||
xref:#rationale_platform_interoperability[platform interoperability].
|
||||
|
||||
@@ -1,17 +0,0 @@
|
||||
[#reference]
|
||||
= Reference
|
||||
|
||||
* xref:reference/unordered_map.adoc[unordered_map]
|
||||
* xref:reference/unordered_multimap.adoc[unordered_multimap]
|
||||
* xref:reference/unordered_set.adoc[unordered_set]
|
||||
* xref:reference/unordered_multiset.adoc[unordered_multiset]
|
||||
* xref:reference/hash_traits.adoc[hash_traits]
|
||||
* xref:reference/stats.adoc[stats]
|
||||
* xref:reference/unordered_flat_map.adoc[unordered_flat_map]
|
||||
* xref:reference/unordered_flat_set.adoc[unordered_flat_set]
|
||||
* xref:reference/unordered_node_map.adoc[unordered_node_map]
|
||||
* xref:reference/unordered_node_set.adoc[unordered_node_set]
|
||||
* xref:reference/concurrent_flat_map.adoc[concurrent_flat_map]
|
||||
* xref:reference/concurrent_flat_set.adoc[concurrent_flat_set]
|
||||
* xref:reference/concurrent_node_map.adoc[concurrent_node_map]
|
||||
* xref:reference/concurrent_node_set.adoc[concurrent_node_set]
|
||||
@@ -1,320 +0,0 @@
|
||||
[#concurrent]
|
||||
= Concurrent Containers
|
||||
|
||||
:idprefix: concurrent_
|
||||
|
||||
Boost.Unordered provides `boost::concurrent_node_set`, `boost::concurrent_node_map`,
|
||||
`boost::concurrent_flat_set` and `boost::concurrent_flat_map`,
|
||||
hash tables that allow concurrent write/read access from
|
||||
different threads without having to implement any synchronization mechanism on the user's side.
|
||||
|
||||
[source,c++]
----
std::vector<int> input;
boost::concurrent_flat_map<int,int> m;

...

// process input in parallel
const int num_threads = 8;
std::vector<std::jthread> threads;
std::size_t chunk = input.size() / num_threads; // how many elements per thread

for (int i = 0; i < num_threads; ++i) {
  threads.emplace_back([&,i] {
    // calculate the portion of input this thread takes care of
    std::size_t start = i * chunk;
    std::size_t end = (i == num_threads - 1)? input.size(): (i + 1) * chunk;

    for (std::size_t n = start; n < end; ++n) {
      m.emplace(input[n], calculation(input[n]));
    }
  });
}
----
|
||||
|
||||
In the example above, threads access `m` without synchronization, just as we'd do in a
|
||||
single-threaded scenario. In an ideal setting, if a given workload is distributed among
|
||||
_N_ threads, execution is _N_ times faster than with one thread —this limit is
|
||||
never attained in practice due to synchronization overheads and _contention_ (one thread
|
||||
waiting for another to leave a locked portion of the map), but Boost.Unordered concurrent containers
|
||||
are designed to perform with very little overhead and typically achieve _linear scaling_
|
||||
(that is, performance is proportional to the number of threads up to the number of
|
||||
logical cores in the CPU).
|
||||
|
||||
== Visitation-based API
|
||||
|
||||
The first thing a new user of Boost.Unordered concurrent containers
|
||||
will notice is that these classes _do not provide iterators_ (which makes them technically
|
||||
not https://en.cppreference.com/w/cpp/named_req/Container[Containers^]
|
||||
in the C++ standard sense). The reason for this is that iterators are inherently
|
||||
thread-unsafe. Consider this hypothetical code:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
auto it = m.find(k); // A: get an iterator pointing to the element with key k
|
||||
if (it != m.end() ) {
|
||||
some_function(*it); // B: use the value of the element
|
||||
}
|
||||
----
|
||||
|
||||
In a multithreaded scenario, the iterator `it` may be invalid at point B if some other
|
||||
thread issues an `m.erase(k)` operation between A and B. There are designs that
|
||||
can remedy this by making iterators lock the element they point to, but this
|
||||
approach lends itself to high contention and can easily produce deadlocks in a program.
|
||||
`operator[]` has similar concurrency issues, and is not provided by
|
||||
`boost::concurrent_flat_map`/`boost::concurrent_node_map` either. Instead, element access is done through
|
||||
so-called _visitation functions_:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit(k, [](const auto& x) { // x is the element with key k (if it exists)
|
||||
some_function(x); // use it
|
||||
});
|
||||
----
|
||||
|
||||
The visitation function passed by the user (in this case, a lambda function)
|
||||
is executed internally by Boost.Unordered in
|
||||
a thread-safe manner, so it can access the element without worrying about other
|
||||
threads interfering in the process.
|
||||
|
||||
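To see why visitation sidesteps the iterator problem, consider the deliberately naive sketch below: a hypothetical `toy_visit_map` that guards a `std::unordered_map` with a single mutex. It is not how Boost.Unordered is implemented (the real containers lock only a portion of the map at a time), but it shows the essential idea that the user's function runs while the element is protected, so no reference to the element ever escapes unguarded.

```cpp
#include <cstddef>
#include <mutex>
#include <unordered_map>
#include <utility>

template <class Key, class T>
class toy_visit_map
{
public:
    // Inserts (k, v) if k is not present; returns whether insertion happened.
    bool emplace(Key const& k, T const& v)
    {
        std::lock_guard<std::mutex> guard(mutex_);
        return map_.emplace(k, v).second;
    }

    // Calls f on the element with key k while the lock is held.
    // Returns the number of elements visited (0 or 1).
    template <class F>
    std::size_t visit(Key const& k, F f)
    {
        std::lock_guard<std::mutex> guard(mutex_);
        typename std::unordered_map<Key, T>::iterator it = map_.find(k);
        if (it == map_.end()) return 0;
        f(*it);
        return 1;
    }

private:
    std::mutex mutex_;
    std::unordered_map<Key, T> map_;
};
```

The sketch also hints at why a visitation function should not call back into the same container: in this toy version, doing so would try to lock the same mutex twice.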
On the other hand, a visitation function can _not_ access the container itself:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit(k, [&](const auto& x) {
|
||||
some_function(x, m.size()); // forbidden: m can't be accessed inside visitation
|
||||
});
|
||||
----
|
||||
|
||||
Access to a different container is allowed, though:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit(k, [&](const auto& x) {
|
||||
if (some_function(x)) {
|
||||
m2.insert(x); // OK, m2 is a different boost::concurrent_flat_map
|
||||
}
|
||||
});
|
||||
----
|
||||
|
||||
But, in general, visitation functions should be as lightweight as possible to
|
||||
reduce contention and increase parallelization. In some cases, moving heavy work
|
||||
outside of visitation may be beneficial:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
std::optional<value_type> o;
|
||||
bool found = m.visit(k, [&](const auto& x) {
|
||||
o = x;
|
||||
});
|
||||
if (found) {
|
||||
some_heavy_duty_function(*o);
|
||||
}
|
||||
----
|
||||
|
||||
Visitation is prominent in the API provided by concurrent containers, and
|
||||
many classical operations have visitation-enabled variations:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.insert_or_visit(x, [](auto& y) {
|
||||
// if insertion failed because of an equivalent element y,
|
||||
// do something with it, for instance:
|
||||
++y.second; // increment the mapped part of the element
|
||||
});
|
||||
----
|
||||
|
||||
Note that in this last example the visitation function could actually _modify_
|
||||
the element: as a general rule, operations on a concurrent map `m`
|
||||
will grant visitation functions const/non-const access to the element depending on whether
|
||||
`m` is const/non-const. Const access can always be explicitly requested
|
||||
by using `cvisit` overloads (for instance, `insert_or_cvisit`) and may result
|
||||
in higher parallelization. For concurrent sets, on the other hand,
|
||||
visitation is always const access.
|
||||
|
||||
Although expected to be used much less frequently, concurrent containers
|
||||
also provide insertion operations where an element can be visited right after
|
||||
element creation (in addition to the usual visitation when an equivalent
|
||||
element already exists):
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.insert_and_cvisit(x,
|
||||
[](const auto& y) {
|
||||
std::cout<< "(" << y.first << ", " << y.second <<") inserted\n";
|
||||
},
|
||||
[](const auto& y) {
|
||||
std::cout<< "(" << y.first << ", " << y.second << ") already exists\n";
|
||||
});
|
||||
----
|
||||
|
||||
Consult the references of
|
||||
xref:#concurrent_node_set[`boost::concurrent_node_set`],
|
||||
xref:#concurrent_node_map[`boost::concurrent_node_map`],
|
||||
xref:#concurrent_flat_set[`boost::concurrent_flat_set`] and
|
||||
xref:#concurrent_flat_map[`boost::concurrent_flat_map`]
|
||||
for the complete list of visitation-enabled operations.
|
||||
|
||||
== Whole-Table Visitation
|
||||
|
||||
In the absence of iterators, `visit_all` is provided
|
||||
as an alternative way to process all the elements in the container:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit_all([](auto& x) {
|
||||
x.second = 0; // reset the mapped part of the element
|
||||
});
|
||||
----
|
||||
|
||||
In C++17 compilers implementing standard parallel algorithms, whole-table
|
||||
visitation can be parallelized:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit_all(std::execution::par, [](auto& x) { // run in parallel
|
||||
x.second = 0; // reset the mapped part of the element
|
||||
});
|
||||
----
|
||||
|
||||
Traversal can be interrupted midway:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
// finds the key to a given (unique) value
|
||||
|
||||
int key = 0;
|
||||
int value = ...;
|
||||
bool found = !m.visit_while([&](const auto& x) {
|
||||
if(x.second == value) {
|
||||
key = x.first;
|
||||
return false; // finish
|
||||
}
|
||||
else {
|
||||
return true; // keep on visiting
|
||||
}
|
||||
});
|
||||
|
||||
if(found) { ... }
|
||||
----
|
||||
|
||||
There is one last whole-table visitation operation, `erase_if`:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.erase_if([](auto& x) {
|
||||
return x.second == 0; // erase the elements whose mapped value is zero
|
||||
});
|
||||
----
|
||||
|
||||
`visit_while` and `erase_if` can also be parallelized. Note that, in order to increase efficiency,
|
||||
whole-table visitation operations do not block the table during execution: this implies that elements
|
||||
may be inserted, modified or erased by other threads during visitation. It is
|
||||
advisable not to assume too much about the exact global state of a concurrent container
|
||||
at any point in your program.
|
||||
|
||||
== Bulk visitation
|
||||
|
||||
Suppose you have an `std::array` of keys you want to look up in a concurrent map:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
std::array<int, N> keys;
|
||||
...
|
||||
for(const auto& key: keys) {
|
||||
m.visit(key, [](auto& x) { ++x.second; });
|
||||
}
|
||||
----
|
||||
|
||||
_Bulk visitation_ allows us to pass all the keys in one operation:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
m.visit(keys.begin(), keys.end(), [](auto& x) { ++x.second; });
|
||||
----
|
||||
|
||||
This functionality is not provided for mere syntactic convenience, though: by processing all the
|
||||
keys at once, some internal optimizations can be applied that increase
|
||||
performance over the regular, one-at-a-time case (consult the
|
||||
xref:#benchmarks_boostconcurrent_flat_map[benchmarks]). In fact, it may be beneficial
|
||||
to buffer incoming keys so that they can be bulk visited in chunks:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
static constexpr auto bulk_visit_size = boost::concurrent_flat_map<int,int>::bulk_visit_size;
|
||||
std::array<int, bulk_visit_size> buffer;
|
||||
std::size_t i=0;
|
||||
while(...) { // processing loop
|
||||
...
|
||||
buffer[i++] = k;
|
||||
if(i == bulk_visit_size) {
|
||||
map.visit(buffer.begin(), buffer.end(), [](auto& x) { ++x.second; });
|
||||
i = 0;
|
||||
}
|
||||
...
|
||||
}
|
||||
// flush remaining keys
|
||||
map.visit(buffer.begin(), buffer.begin() + i, [](auto& x) { ++x.second; });
|
||||
----
|
||||
|
||||
There's a latency/throughput tradeoff here: it will take longer for incoming keys to
|
||||
be processed (since they are buffered), but the number of processed keys per second
|
||||
is higher. `bulk_visit_size` is the recommended chunk size —smaller buffers
|
||||
may yield worse performance.
|
||||
|
||||
== Blocking Operations
|
||||
|
||||
Concurrent containers can be copied, assigned, cleared and merged just like any other
|
||||
Boost.Unordered container. Unlike most other operations, these are _blocking_,
|
||||
that is, all other threads are prevented from accessing the tables involved while a copy, assignment,
|
||||
clear or merge operation is in progress. Blocking is taken care of automatically by the library
|
||||
and the user need not take any special precaution, but overall performance may be affected.
|
||||
|
||||
Another blocking operation is _rehashing_, which happens explicitly via `rehash`/`reserve`
|
||||
or during insertion when the table's load hits `max_load()`. As with non-concurrent containers,
|
||||
reserving space in advance of bulk insertions will generally speed up the process.
|
||||
|
||||
== Interoperability with non-concurrent containers
|
||||
|
||||
As open-addressing and concurrent containers are based on the same internal data structure,
|
||||
they can be efficiently move-constructed from their non-concurrent counterpart, and vice versa.
|
||||
|
||||
[caption=, title='Table {counter:table-counter}. Concurrent/non-concurrent interoperability']
|
||||
[cols="1,1", frame=all, grid=all]
|
||||
|===
|
||||
^|`boost::concurrent_node_set`
|
||||
^|`boost::unordered_node_set`
|
||||
|
||||
^|`boost::concurrent_node_map`
|
||||
^|`boost::unordered_node_map`
|
||||
|
||||
^|`boost::concurrent_flat_set`
|
||||
^|`boost::unordered_flat_set`
|
||||
|
||||
^|`boost::concurrent_flat_map`
|
||||
^|`boost::unordered_flat_map`
|
||||
|
||||
|===
|
||||
|
||||
This interoperability comes in handy in multistage scenarios where parts of the data processing happen
|
||||
in parallel whereas other steps are non-concurrent (or non-modifying). In the following example,
|
||||
we want to construct a histogram from a huge input vector of words:
|
||||
the population phase can be done in parallel with `boost::concurrent_flat_map` and results
|
||||
then transferred to the final container.
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
std::vector<std::string> words = ...;
|
||||
|
||||
// Insert words in parallel
|
||||
boost::concurrent_flat_map<std::string_view, std::size_t> m0;
|
||||
std::for_each(
|
||||
std::execution::par, words.begin(), words.end(),
|
||||
[&](const auto& word) {
|
||||
m0.try_emplace_or_visit(word, 1, [](auto& x) { ++x.second; });
|
||||
});
|
||||
|
||||
// Transfer to a regular unordered_flat_map
|
||||
boost::unordered_flat_map m=std::move(m0);
|
||||
----
|
||||
@@ -1,51 +0,0 @@
|
||||
[#hash_traits]
|
||||
== Hash traits
|
||||
|
||||
:idprefix: hash_traits_
|
||||
|
||||
=== Synopsis
|
||||
|
||||
[listing,subs="+macros,+quotes"]
|
||||
-----
|
||||
// #include <boost/unordered/hash_traits.hpp>
|
||||
|
||||
namespace boost {
|
||||
namespace unordered {
|
||||
|
||||
template<typename Hash>
|
||||
struct xref:#hash_traits_hash_is_avalanching[hash_is_avalanching];
|
||||
|
||||
} // namespace unordered
|
||||
} // namespace boost
|
||||
-----
|
||||
|
||||
---
|
||||
|
||||
=== hash_is_avalanching
|
||||
```c++
|
||||
template<typename Hash>
|
||||
struct hash_is_avalanching;
|
||||
```
|
||||
|
||||
A hash function is said to have the _avalanching property_ if small changes in the input translate to
|
||||
large changes in the returned hash code —ideally, flipping one bit in the representation of
|
||||
the input value results in each bit of the hash code flipping with probability 50%. Approaching
|
||||
this property is critical for the proper behavior of open-addressing hash containers.
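
As a rough illustration of the 50% criterion, the sketch below counts how many output bits change when a single input bit is flipped. The names `identity_hash`, `mix_hash` and `average_flipped_bits` are hypothetical, and `mix_hash` uses the finalizer of the public-domain splitmix64 generator purely as an example of a well-mixing function; it is not claimed to be what Boost.Unordered uses.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Identity "hash": performs no mixing at all.
inline std::uint64_t identity_hash(std::uint64_t x) { return x; }

// splitmix64 finalizer, used here only as an example of a mixing function.
inline std::uint64_t mix_hash(std::uint64_t x) {
  x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27; x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

// Average number of output bits that change when one input bit of x is
// flipped, taken over all 64 single-bit flips. An ideally avalanching
// 64-bit hash would give a value around 32.
template<typename Hash>
double average_flipped_bits(Hash h, std::uint64_t x) {
  const std::uint64_t base = h(x);
  std::size_t total = 0;
  for (int bit = 0; bit < 64; ++bit) {
    const std::uint64_t diff = base ^ h(x ^ (std::uint64_t(1) << bit));
    total += std::bitset<64>(diff).count();
  }
  return static_cast<double>(total) / 64.0;
}
```

For the identity function each flip changes exactly one output bit, which is about as far from avalanching as a hash can be.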
|
||||
|
||||
`hash_is_avalanching<Hash>::value` is:
|
||||
|
||||
* `false` if `Hash::is_avalanching` is not present,
|
||||
* `Hash::is_avalanching::value` if this is present and convertible at compile time to a `bool`,
|
||||
* `true` if `Hash::is_avalanching` is `void` (this usage is deprecated),
|
||||
* ill-formed otherwise.
|
||||
|
||||
Users can then declare a hash function `Hash` as avalanching either by embedding an appropriate `is_avalanching` typedef
|
||||
into the definition of `Hash`, or directly by specializing `hash_is_avalanching<Hash>` to a class with
|
||||
an embedded compile-time constant `value` set to `true`.
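
The rules above can be sketched with a small stand-alone trait. Everything in namespace `sketch` is a hypothetical re-implementation written for illustration, not the library's own code, and the three hash classes are made-up examples of the two ways of opting in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

namespace sketch {

// C++11-compatible equivalent of std::void_t.
template<typename... Ts> struct make_void { using type = void; };
template<typename... Ts> using void_t = typename make_void<Ts...>::type;

// Maps the nested is_avalanching type to a bool. The primary template is
// left undefined, so unsupported types make the trait ill-formed.
template<typename Flag, typename = void>
struct flag_value;

template<>
struct flag_value<void, void> : std::true_type {};  // deprecated `void` form

template<typename Flag>
struct flag_value<Flag, void_t<decltype(Flag::value)>>
  : std::integral_constant<bool, static_cast<bool>(Flag::value)> {};

// false when Hash has no nested is_avalanching type.
template<typename Hash, typename = void>
struct hash_is_avalanching : std::false_type {};

template<typename Hash>
struct hash_is_avalanching<Hash, void_t<typename Hash::is_avalanching>>
  : flag_value<typename Hash::is_avalanching> {};

} // namespace sketch

// No opt-in: treated as non-avalanching.
struct plain_hash {
  std::size_t operator()(std::uint64_t x) const { return static_cast<std::size_t>(x); }
};

// Opt-in through an embedded typedef. The trait is a promise made by the
// author of the hash; nothing checks the mixing quality.
struct mixing_hash {
  using is_avalanching = std::true_type;
  std::size_t operator()(std::uint64_t x) const {
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return static_cast<std::size_t>(x);
  }
};

// Opt-in from the outside by specializing the trait.
struct external_hash {
  std::size_t operator()(std::uint64_t x) const { return mixing_hash{}(x); }
};

namespace sketch {
template<> struct hash_is_avalanching<external_hash> : std::true_type {};
}
```

Specialization is the option to reach for when the hash class comes from third-party code and cannot be edited.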
|
||||
|
||||
Open-addressing and concurrent containers
|
||||
use the provided hash function `Hash` as-is if `hash_is_avalanching<Hash>::value` is `true`; otherwise, they
|
||||
implement a bit-mixing post-processing stage to increase the quality of hashing at the expense of
|
||||
extra computational cost.
|
||||
|
||||
---
|
||||
@@ -1,71 +0,0 @@
|
||||
[#stats]
|
||||
== Statistics
|
||||
|
||||
:idprefix: stats_
|
||||
|
||||
Open-addressing and concurrent containers can be configured to keep running statistics
|
||||
of some internal operations affected by the quality of the supplied hash function.
|
||||
|
||||
=== Synopsis
|
||||
|
||||
[listing,subs="+macros,+quotes"]
|
||||
-----
|
||||
struct xref:#stats_stats_summary_type[__stats-summary-type__]
|
||||
{
|
||||
double average;
|
||||
double variance;
|
||||
double deviation;
|
||||
};
|
||||
|
||||
struct xref:#stats_insertion_stats_type[__insertion-stats-type__]
|
||||
{
|
||||
std::size_t count;
|
||||
xref:#stats_stats_summary_type[__stats-summary-type__] probe_length;
|
||||
};
|
||||
|
||||
struct xref:stats_lookup_stats_type[__lookup-stats-type__]
|
||||
{
|
||||
std::size_t count;
|
||||
xref:#stats_stats_summary_type[__stats-summary-type__] probe_length;
|
||||
xref:#stats_stats_summary_type[__stats-summary-type__] num_comparisons;
|
||||
};
|
||||
|
||||
struct xref:stats_stats_type[__stats-type__]
|
||||
{
|
||||
xref:#stats_insertion_stats_type[__insertion-stats-type__] insertion;
|
||||
xref:stats_lookup_stats_type[__lookup-stats-type__] successful_lookup,
|
||||
unsuccessful_lookup;
|
||||
};
|
||||
-----
|
||||
|
||||
==== __stats-summary-type__
|
||||
|
||||
Provides the average value, variance and standard deviation of a sequence of numerical values.
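
For reference, the three fields relate to each other as in the following sketch. `summary_sketch` and `summarize` are hypothetical names, and the use of the population variance (dividing by the number of samples) is an assumption; the text does not say which definition the library applies or how it accumulates the values.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct summary_sketch {
  double average;
  double variance;
  double deviation;
};

// Two-pass computation of mean, population variance and standard deviation.
inline summary_sketch summarize(const std::vector<double>& xs) {
  summary_sketch s{0.0, 0.0, 0.0};
  if (xs.empty()) return s;

  double sum = 0.0;
  for (double x : xs) sum += x;
  s.average = sum / static_cast<double>(xs.size());

  double squared = 0.0;
  for (double x : xs) squared += (x - s.average) * (x - s.average);
  s.variance = squared / static_cast<double>(xs.size());  // population variance (assumption)
  s.deviation = std::sqrt(s.variance);
  return s;
}
```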
|
||||
|
||||
==== __insertion-stats-type__
|
||||
|
||||
Provides the number of insertion operations performed by a container and
|
||||
statistics on the associated __probe length__ (number of
|
||||
xref:#structures_open_addressing_containers[bucket groups] accessed per operation).
|
||||
|
||||
==== __lookup-stats-type__
|
||||
|
||||
For successful (element found) or unsuccessful (not found) lookup,
|
||||
provides the number of operations performed by a container and
|
||||
statistics on the associated __probe length__ (number of
|
||||
xref:#structures_open_addressing_containers[bucket groups] accessed)
|
||||
and number of element comparisons per operation.
|
||||
|
||||
==== __stats-type__
|
||||
|
||||
Provides statistics on insertion, successful and unsuccessful lookups performed by a container.
|
||||
If the supplied hash function has good quality, then:
|
||||
|
||||
* Average probe lengths should be close to 1.0.
|
||||
* For successful lookups, the average number of element comparisons should be close to 1.0.
|
||||
* For unsuccessful lookups, the average number of element comparisons should be close to 0.0.
|
||||
|
||||
These statistics can be used to determine if a given hash function
|
||||
can be marked as xref:hash_traits_hash_is_avalanching[__avalanching__].
|
||||
|
||||
---
|
||||
@@ -1,203 +0,0 @@
|
||||
[#regular]
|
||||
= Regular Containers
|
||||
|
||||
:idprefix: regular_
|
||||
|
||||
Boost.Unordered closed-addressing containers (`boost::unordered_set`, `boost::unordered_map`,
|
||||
`boost::unordered_multiset` and `boost::unordered_multimap`) are fully conformant with the
|
||||
C++ specification for unordered associative containers, so for those who know how to use
|
||||
`std::unordered_set`, `std::unordered_map`, etc., their homonyms in Boost.Unordered are
|
||||
drop-in replacements. The interface of open-addressing containers (`boost::unordered_node_set`,
|
||||
`boost::unordered_node_map`, `boost::unordered_flat_set` and `boost::unordered_flat_map`)
|
||||
is very similar, but they present some minor differences listed in the dedicated
|
||||
xref:#compliance_open_addressing_containers[standard compliance section].
|
||||
|
||||
|
||||
For readers without previous experience with hash containers but familiar
|
||||
with normal associative containers (`std::set`, `std::map`,
|
||||
`std::multiset` and `std::multimap`), Boost.Unordered containers are used in a similar manner:
|
||||
|
||||
[source,cpp]
|
||||
----
|
||||
typedef boost::unordered_map<std::string, int> map;
|
||||
map x;
|
||||
x["one"] = 1;
|
||||
x["two"] = 2;
|
||||
x["three"] = 3;
|
||||
|
||||
assert(x.at("one") == 1);
|
||||
assert(x.find("missing") == x.end());
|
||||
----
|
||||
|
||||
But since the elements aren't ordered, the output of:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
for(const map::value_type& i: x) {
|
||||
std::cout<<i.first<<","<<i.second<<"\n";
|
||||
}
|
||||
----
|
||||
|
||||
can be in any order. For example, it might be:
|
||||
|
||||
[source]
|
||||
----
|
||||
two,2
|
||||
one,1
|
||||
three,3
|
||||
----
|
||||
|
||||
There are other differences, which are listed in the
|
||||
<<comparison,Comparison with Associative Containers>> section.
|
||||
|
||||
== Iterator Invalidation
|
||||
|
||||
It is not specified how member functions other than `rehash` and `reserve` affect
|
||||
the bucket count, although `insert` can only invalidate iterators
|
||||
when the insertion causes the container's load to be greater than the maximum allowed.
|
||||
For most implementations this means that `insert` will only
|
||||
change the number of buckets when this happens. Iterators can be
|
||||
invalidated by calls to `insert`, `rehash` and `reserve`.
|
||||
|
||||
As for pointers and references,
|
||||
they are never invalidated for node-based containers
|
||||
(`boost::unordered_[multi]set`, `boost::unordered_[multi]map`, `boost::unordered_node_set`, `boost::unordered_node_map`),
|
||||
but they will be when rehashing occurs for
|
||||
`boost::unordered_flat_set` and `boost::unordered_flat_map`: this is because
|
||||
these containers store elements directly into their holding buckets, so
|
||||
when allocating a new bucket array the elements must be transferred by means of move construction.
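
The node-based half of this guarantee can be checked with a short experiment. The sketch uses `std::unordered_map` as a stand-in for the node-based containers (the C++ standard guarantees that rehashing does not invalidate references to elements); `reference_survives_rehash` is a hypothetical helper name. The same check would not be expected to hold for `boost::unordered_flat_map`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Returns true if the element with key 0 stays at the same address after
// enough insertions to force at least one rehash.
inline bool reference_survives_rehash() {
  std::unordered_map<int, int> m;
  m[0] = 42;
  const int* before = &m.at(0);
  const std::size_t buckets_before = m.bucket_count();

  for (int i = 1; i <= 10000; ++i) m[i] = i;  // grows the bucket array

  const bool rehashed = m.bucket_count() != buckets_before;
  return rehashed && before == &m.at(0) && *before == 42;
}
```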
|
||||
|
||||
In a similar manner to using `reserve` for ``vector``s, it can be a good idea
|
||||
to call `reserve` before inserting a large number of elements. This will get
|
||||
the expensive rehashing out of the way and let you store iterators, safe in
|
||||
the knowledge that they won't be invalidated. If you are inserting `n`
|
||||
elements into container `x`, you could first call:
|
||||
|
||||
```
|
||||
x.reserve(n);
|
||||
```
|
||||
|
||||
Note:: `reserve(n)` reserves space for at least `n` elements, allocating enough buckets
|
||||
so as to not exceed the maximum load factor.
|
||||
+
|
||||
Because the maximum load factor is defined as the number of elements divided by the total
|
||||
number of available buckets, this function is logically equivalent to:
|
||||
+
|
||||
```
|
||||
x.rehash(std::ceil(n / x.max_load_factor()))
|
||||
```
|
||||
+
|
||||
See the <<unordered_map_rehash,reference for more details>> on the `rehash` function.
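
The effect of `reserve` can be observed directly. The sketch below uses `std::unordered_set` as a stand-in, since the closed-addressing containers follow the same specification; `no_rehash_after_reserve` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <unordered_set>

// Reserves room for n elements, inserts n distinct values and reports whether
// the bucket count stayed fixed, that is, whether no rehash took place.
inline bool no_rehash_after_reserve(std::size_t n) {
  std::unordered_set<std::size_t> x;
  x.reserve(n);

  const std::size_t buckets = x.bucket_count();
  // reserve(n) behaves like rehash(ceil(n / max_load_factor())).
  const bool enough =
    buckets >= static_cast<std::size_t>(std::ceil(n / x.max_load_factor()));

  for (std::size_t i = 0; i < n; ++i) x.insert(i);
  return enough && x.bucket_count() == buckets;
}
```

Because the bucket array never changes in this scenario, iterators obtained during the insertion loop remain valid throughout.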
|
||||
|
||||
[#comparison]
|
||||
|
||||
:idprefix: comparison_
|
||||
|
||||
== Comparison with Associative Containers
|
||||
|
||||
[caption=, title='Table {counter:table-counter} Interface differences']
|
||||
[cols="1,1", frame=all, grid=rows]
|
||||
|===
|
||||
|Associative Containers |Unordered Associative Containers
|
||||
|
||||
|Parameterized by an ordering relation `Compare`
|
||||
|Parameterized by a function object `Hash` and an equivalence relation `Pred`
|
||||
|
||||
|Keys can be compared using `key_compare` which is accessed by member function `key_comp()`, values can be compared using `value_compare` which is accessed by member function `value_comp()`.
|
||||
|Keys can be hashed using `hasher` which is accessed by member function `hash_function()`, and checked for equality using `key_equal` which is accessed by member function `key_eq()`. There is no function object for comparing or hashing values.
|
||||
|
||||
|Constructors have optional extra parameters for the comparison object.
|
||||
|Constructors have optional extra parameters for the initial minimum number of buckets, a hash function and an equality object.
|
||||
|
||||
|Keys `k1`, `k2` are considered equivalent if `!Compare(k1, k2) && !Compare(k2, k1)`.
|
||||
|Keys `k1`, `k2` are considered equivalent if `Pred(k1, k2)`
|
||||
|
||||
|Member function `lower_bound(k)` and `upper_bound(k)`
|
||||
|No equivalent. Since the elements aren't ordered, `lower_bound` and `upper_bound` would be meaningless.
|
||||
|
||||
|`equal_range(k)` returns an empty range at the position that `k` would be inserted if `k` isn't present in the container.
|
||||
|`equal_range(k)` returns a range at the end of the container if `k` isn't present in the container. It can't return a positioned range as `k` could be inserted into multiple places. +
|
||||
**Closed-addressing containers:** To find out the bucket that `k` would be inserted into use `bucket(k)`. But remember that an insert can cause the container to rehash - meaning that the element can be inserted into a different bucket.
|
||||
|
||||
|`iterator`, `const_iterator` are of the bidirectional category.
|
||||
|`iterator`, `const_iterator` are of at least the forward category.
|
||||
|
||||
|Iterators, pointers and references to the container's elements are never invalidated.
|
||||
|<<regular_iterator_invalidation,Iterators can be invalidated by calls to insert or rehash>>. +
|
||||
**Node-based containers:** Pointers and references to the container's elements are never invalidated. +
|
||||
**Flat containers:** Pointers and references to the container's elements are invalidated when rehashing occurs.
|
||||
|
||||
|Iterators iterate through the container in the order defined by the comparison object.
|
||||
|Iterators iterate through the container in an arbitrary order, that can change as elements are inserted, although equivalent elements are always adjacent.
|
||||
|
||||
|No equivalent
|
||||
|**Closed-addressing containers:** Local iterators can be used to iterate through individual buckets. (The orders of local iterators and regular iterators aren't required to have any correspondence.)
|
||||
|
||||
|Can be compared using the `==`, `!=`, `<`, `\<=`, `>`, `>=` operators.
|
||||
|Can be compared using the `==` and `!=` operators.
|
||||
|
||||
|
|
||||
|When inserting with a hint, implementations are permitted to ignore the hint.
|
||||
|
||||
|===
|
||||
|
||||
---
|
||||
|
||||
[caption=, title='Table {counter:table-counter} Complexity Guarantees']
|
||||
[cols="1,1,1", frame=all, grid=rows]
|
||||
|===
|
||||
|Operation |Associative Containers |Unordered Associative Containers
|
||||
|
||||
|Construction of empty container
|
||||
|constant
|
||||
|O(_n_) where _n_ is the minimum number of buckets.
|
||||
|
||||
|Construction of container from a range of _N_ elements
|
||||
|O(_N log N_), O(_N_) if the range is sorted with `value_comp()`
|
||||
|Average case O(_N_), worst case O(_N^2^_)
|
||||
|
||||
|Insert a single element
|
||||
|logarithmic
|
||||
|Average case constant, worst case linear
|
||||
|
||||
|Insert a single element with a hint
|
||||
|Amortized constant if `t` elements inserted right after hint, logarithmic otherwise
|
||||
|Average case constant, worst case linear (i.e. the same as a normal insert).
|
||||
|
||||
|Inserting a range of _N_ elements
|
||||
|_N_ log(`size()` + _N_)
|
||||
|Average case O(_N_), worst case O(_N_ * `size()`)
|
||||
|
||||
|Erase by key, `k`
|
||||
|O(log(`size()`) + `count(k)`)
|
||||
|Average case: O(`count(k)`), Worst case: O(`size()`)
|
||||
|
||||
|Erase a single element by iterator
|
||||
|Amortized constant
|
||||
|Average case: O(1), Worst case: O(`size()`)
|
||||
|
||||
|Erase a range of _N_ elements
|
||||
|O(log(`size()`) + _N_)
|
||||
|Average case: O(_N_), Worst case: O(`size()`)
|
||||
|
||||
|Clearing the container
|
||||
|O(`size()`)
|
||||
|O(`size()`)
|
||||
|
||||
|Find
|
||||
|logarithmic
|
||||
|Average case: O(1), Worst case: O(`size()`)
|
||||
|
||||
|Count
|
||||
|O(log(`size()`) + `count(k)`)
|
||||
|Average case: O(1), Worst case: O(`size()`)
|
||||
|
||||
|`equal_range(k)`
|
||||
|logarithmic
|
||||
|Average case: O(`count(k)`), Worst case: O(`size()`)
|
||||
|
||||
|`lower_bound`,`upper_bound`
|
||||
|logarithmic
|
||||
|n/a
|
||||
|
||||
|===
|
||||
@@ -1,180 +0,0 @@
|
||||
[#structures]
|
||||
= Data Structures
|
||||
|
||||
:idprefix: structures_
|
||||
|
||||
== Closed-addressing Containers
|
||||
|
||||
++++
|
||||
<style>
|
||||
.imageblock > .title {
|
||||
text-align: inherit;
|
||||
}
|
||||
</style>
|
||||
++++
|
||||
|
||||
Boost.Unordered sports one of the fastest implementations of closed addressing, also commonly known as https://en.wikipedia.org/wiki/Hash_table#Separate_chaining[separate chaining]. An example figure representing the data structure is below:
|
||||
|
||||
[#img-bucket-groups,.text-center]
|
||||
.A simple bucket group approach
|
||||
image::bucket-groups.png[align=center]
|
||||
|
||||
An array of "buckets" is allocated and each bucket in turn points to its own individual linked list. This makes meeting the standard requirements of bucket iteration straightforward. Unfortunately, iteration of the entire container is often slow with this layout, as each bucket must be examined for occupancy, yielding a time complexity of `O(bucket_count() + size())` when the standard requires complexity to be `O(size())`.
|
||||
|
||||
Canonical standard implementations will wind up looking like the diagram below:
|
||||
|
||||
[.text-center]
|
||||
.The canonical standard approach
|
||||
image::singly-linked.png[align=center,link=_images/singly-linked.png,window=_blank]
|
||||
|
||||
It's worth noting that this approach is only used by pass:[libc++] and pass:[libstdc++]; the MSVC Dinkumware implementation uses a different one. A more detailed analysis of the standard containers can be found http://bannalia.blogspot.com/2013/10/implementation-of-c-unordered.html[here].
|
||||
|
||||
This unusually laid out data structure is chosen to make iteration of the entire container efficient by inter-connecting all of the nodes into a singly-linked list. One might also notice that buckets point to the node _before_ the start of the bucket's elements. This is done so that removing elements from the list can be done efficiently without introducing the need for a doubly-linked list. Unfortunately, this data structure introduces a guaranteed extra indirection. For example, to access the first element of a bucket, something like this must be done:
|
||||
|
||||
```c++
|
||||
auto const idx = get_bucket_idx(hash_function(key));
|
||||
node* p = buckets[idx]; // first load
|
||||
node* n = p->next; // second load
|
||||
if (n && is_in_bucket(n, idx)) {
|
||||
value_type const& v = *n; // third load
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
With a simple bucket group layout, this is all that must be done:
|
||||
```c++
|
||||
auto const idx = get_bucket_idx(hash_function(key));
|
||||
node* n = buckets[idx]; // first load
|
||||
if (n) {
|
||||
value_type const& v = *n; // second load
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
In practice, the extra indirection can have a dramatic performance impact on common operations such as `insert`, `find` and `erase`. But to keep iteration of the container fast, Boost.Unordered introduces a novel data structure, a "bucket group". A bucket group is a fixed-width view of a subsection of the buckets array. It contains a bitmask (a `std::size_t`) which it uses to track occupancy of buckets and contains two pointers so that it can form a doubly-linked list with non-empty groups. An example diagram is below:
|
||||
|
||||
[#img-fca-layout]
|
||||
.The new layout used by Boost
|
||||
image::fca.png[align=center]
|
||||
|
||||
Thus container-wide iteration is turned into traversing the non-empty bucket groups (an operation with constant time complexity), which reduces the time complexity back to `O(size())`. In total, a bucket group is only 4 words in size and it views `sizeof(std::size_t) * CHAR_BIT` buckets, meaning that for all common implementations there are only 4 bits of space overhead per bucket introduced by the bucket groups.
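
The overhead figure can be reproduced with a little arithmetic. The struct below is a hypothetical sketch of a bucket group: the bitmask and the two list pointers come from the description above, while the `buckets` pointer is an assumption made to account for the fourth word. `next_occupied` shows, with a plain loop, how the bitmask lets iteration skip empty buckets; a real implementation would likely use a count-trailing-zeros instruction instead.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical layout of a bucket group (4 machine words).
struct bucket_group_sketch {
  void**               buckets;  // start of the viewed slice (assumed field)
  std::size_t          bitmask;  // bit i set <=> bucket i of the slice is non-empty
  bucket_group_sketch* next;     // links to other non-empty groups
  bucket_group_sketch* prev;
};

// Number of buckets viewed by one group.
constexpr std::size_t buckets_per_group = sizeof(std::size_t) * CHAR_BIT;

// Size of a group in bits divided by the number of buckets it views.
constexpr std::size_t overhead_bits_per_bucket() {
  return sizeof(bucket_group_sketch) * CHAR_BIT / buckets_per_group;
}

// Index of the first occupied bucket at or after position `from`,
// or -1 if there is none in this group.
inline int next_occupied(std::size_t bitmask, unsigned from) {
  for (unsigned i = from; i < buckets_per_group; ++i) {
    if ((bitmask >> i) & 1u) return static_cast<int>(i);
  }
  return -1;
}
```

On a platform where pointers and `std::size_t` have the same width, the group occupies 4 words and views one word's worth of buckets, hence 4 bits per bucket.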
|
||||
|
||||
A more detailed description of Boost.Unordered's closed-addressing implementation is
|
||||
given in an
|
||||
https://bannalia.blogspot.com/2022/06/advancing-state-of-art-for.html[external article].
|
||||
For more information on implementation rationale, read the
|
||||
xref:rationale.adoc#rationale_open_addresing_containers[corresponding section].
|
||||
|
||||
== Open-addressing Containers
|
||||
|
||||
The diagram shows the basic internal layout of `boost::unordered_flat_set`/`unordered_node_set` and
|
||||
`boost::unordered_flat_map`/`unordered_node_map`.
|
||||
|
||||
|
||||
[#img-foa-layout]
|
||||
.Open-addressing layout used by Boost.Unordered.
|
||||
image::foa.png[align=center]
|
||||
|
||||
As with all open-addressing containers, elements (or pointers to the element nodes in the case of
|
||||
`boost::unordered_node_set` and `boost::unordered_node_map`) are stored directly in the bucket array.
|
||||
This array is logically divided into 2^_n_^ _groups_ of 15 elements each.
|
||||
In addition to the bucket array, there is an associated _metadata array_ with 2^_n_^
|
||||
16-byte words.
|
||||
|
||||
[#img-foa-metadata]
|
||||
.Breakdown of a metadata word.
|
||||
image::foa-metadata.png[align=center]
|
||||
|
||||
A metadata word is divided into 15 _h_~_i_~ bytes (one for each associated
|
||||
bucket), and an _overflow byte_ (_ofw_ in the diagram). The value of _h_~_i_~ is:
|
||||
|
||||
- 0 if the corresponding bucket is empty.
|
||||
- 1 to encode a special empty bucket called a _sentinel_, which is used internally to
|
||||
stop iteration when the container has been fully traversed.
|
||||
- If the bucket is occupied, a _reduced hash value_ obtained from the hash value of
|
||||
the element.
|
||||
|
||||
When looking for an element with hash value _h_, SIMD technologies such as
|
||||
https://en.wikipedia.org/wiki/SSE2[SSE2] and
|
||||
https://en.wikipedia.org/wiki/ARM_architecture_family#Advanced_SIMD_(Neon)[Neon] allow us
|
||||
to very quickly inspect the full metadata word and look for the reduced value of _h_ among all the
|
||||
15 buckets with just a handful of CPU instructions: non-matching buckets can be
|
||||
readily discarded, and those whose reduced hash value matches need be inspected via full
|
||||
comparison with the corresponding element. If the looked-for element is not present,
|
||||
the overflow byte is inspected:
|
||||
|
||||
- If the bit in the position _h_ mod 8 is zero, lookup terminates (and the
|
||||
element is not present).
|
||||
- If the bit is set to 1 (the group has been _overflowed_), further groups are
|
||||
checked using https://en.wikipedia.org/wiki/Quadratic_probing[_quadratic probing_], and
|
||||
the process is repeated.
|
||||
|
||||
Insertion is algorithmically similar: empty buckets are located using SIMD,
|
||||
and when going past a full group its corresponding overflow bit is set to 1.
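
The lookup and insertion steps on a single metadata word can be sketched in scalar code as follows. All names are hypothetical, and `reduced_hash` in particular is an assumed mapping (low byte of the hash, avoiding the reserved values 0 and 1): the text does not specify how the library derives the reduced value. The loop in `match` stands in for what SIMD does in a few instructions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One 16-byte metadata word: 15 per-bucket bytes plus the overflow byte.
struct group_metadata_sketch {
  std::array<std::uint8_t, 15> h{};  // 0 = empty, 1 = sentinel, else reduced hash
  std::uint8_t overflow = 0;         // one bit per value of (hash mod 8)
};

// Assumed reduction: low byte of the hash, remapped so it is never 0 or 1.
inline std::uint8_t reduced_hash(std::size_t hash) {
  const std::uint8_t low = static_cast<std::uint8_t>(hash & 0xffu);
  return low < 2 ? static_cast<std::uint8_t>(low + 2) : low;
}

// Bit i of the result is set if bucket i holds the same reduced hash value.
// Only those buckets need a full comparison with the stored element.
inline std::uint16_t match(const group_metadata_sketch& g, std::size_t hash) {
  const std::uint8_t r = reduced_hash(hash);
  std::uint16_t mask = 0;
  for (std::size_t i = 0; i < g.h.size(); ++i) {
    if (g.h[i] == r) mask = static_cast<std::uint16_t>(mask | (1u << i));
  }
  return mask;
}

// Lookup continues to the next group only if this returns true.
inline bool is_overflowed(const group_metadata_sketch& g, std::size_t hash) {
  return ((g.overflow >> (hash % 8)) & 1u) != 0;
}

// Called by insertion when it has to go past a full group.
inline void mark_overflow(group_metadata_sketch& g, std::size_t hash) {
  g.overflow = static_cast<std::uint8_t>(g.overflow | (1u << (hash % 8)));
}
```

Because empty buckets are encoded as 0 and the reduced hash is never 0, an empty bucket can never produce a false match.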
|
||||
|
||||
In architectures without SIMD support, the logical layout stays the same, but the metadata
|
||||
word is codified using a technique we call _bit interleaving_: this layout allows us
|
||||
to emulate SIMD with reasonably good performance using only standard arithmetic and
|
||||
logical operations.
|
||||
|
||||
[#img-foa-metadata-interleaving]
|
||||
.Bit-interleaved metadata word.
|
||||
image::foa-metadata-interleaving.png[align=center]
|
||||
|
||||
A more detailed description of Boost.Unordered's open-addressing implementation is
|
||||
given in an
|
||||
https://bannalia.blogspot.com/2022/11/inside-boostunorderedflatmap.html[external article].
|
||||
For more information on implementation rationale, read the
|
||||
xref:#rationale_open_addresing_containers[corresponding section].
|
||||
|
||||
== Concurrent Containers
|
||||
|
||||
`boost::concurrent_flat_set`/`boost::concurrent_node_set` and
|
||||
`boost::concurrent_flat_map`/`boost::concurrent_node_map` use the basic
|
||||
xref:#structures_open_addressing_containers[open-addressing layout] described above
|
||||
augmented with synchronization mechanisms.
|
||||
|
||||
|
||||
[#img-cfoa-layout]
|
||||
.Concurrent open-addressing layout used by Boost.Unordered.
|
||||
image::cfoa.png[align=center]
|
||||
|
||||
Two levels of synchronization are used:
|
||||
|
||||
* Container level: A read-write mutex is used to control access from any operation
|
||||
to the container. Typically, such access is in read mode (that is, concurrent) even
|
||||
for modifying operations, so for most practical purposes there is no thread
|
||||
contention at this level. Access is only in write mode (blocking) when rehashing or
|
||||
performing container-wide operations such as swapping or assignment.
|
||||
* Group level: Each 15-slot group is equipped with an 8-byte word containing:
|
||||
** A read-write spinlock for synchronized access to any element in the group.
|
||||
** An atomic _insertion counter_ used for optimistic insertion as described
|
||||
below.
|
||||
|
||||
By using atomic operations to access the group metadata, lookup is (group-level)
|
||||
lock-free up to the point where an actual comparison needs to be done with an element
|
||||
that has been previously SIMD-matched: only then is the group's spinlock used.
|
||||
|
||||
Insertion uses the following _optimistic algorithm_:
|
||||
|
||||
* The value of the insertion counter for the initial group in the probe
|
||||
sequence is locally recorded (let's call this value `c0`).
|
||||
* Lookup is as described above. If lookup finds no equivalent element,
|
||||
search for an available slot for insertion successively locks/unlocks
|
||||
each group in the probing sequence.
|
||||
* When an available slot is located, it is preemptively occupied (its
|
||||
reduced hash value is set) and the insertion counter is atomically
|
||||
incremented: if no other thread has incremented the counter during the
|
||||
whole operation (which is checked by comparing with `c0`), then we're
|
||||
good to go and complete the insertion, otherwise we roll back and start
|
||||
over.
|
||||
|
||||
This algorithm has very low contention both at the lookup and actual
|
||||
insertion phases in exchange for the possibility that computations have
|
||||
to be started over if some other thread interferes in the process by
|
||||
performing a successful insertion beginning at the same group. In
|
||||
practice, the start-over frequency is extremely small, measured in the range
|
||||
of parts per million for some of our benchmarks.
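
The counter check at the heart of the optimistic algorithm can be sketched in a few lines. `begin_attempt` and `try_commit` are hypothetical names, and the counter width and memory orderings are assumptions; slot reservation and rollback are left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// First step: record the insertion counter of the initial group (c0).
inline std::uint32_t begin_attempt(const std::atomic<std::uint32_t>& counter) {
  return counter.load(std::memory_order_acquire);
}

// Last step: atomically increment the counter and compare the value it held
// just before the increment with c0. A mismatch means another thread has
// incremented it in the meantime, so the caller must roll back and retry.
inline bool try_commit(std::atomic<std::uint32_t>& counter, std::uint32_t c0) {
  return counter.fetch_add(1, std::memory_order_acq_rel) == c0;
}
```

In this sketch the counter is incremented even when the check fails, so a retrying thread has to call `begin_attempt` again to obtain a fresh `c0`.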
|
||||
|
||||
For more information on implementation rationale, read the
|
||||
xref:#rationale_concurrent_containers[corresponding section].
|
||||