forked from boostorg/unordered
added unordered_node_[map|set] containers to the tutorial
@ -134,7 +134,8 @@ h|*Method* h|*Description*
|Changes the number of buckets so that there are at least `n` buckets, and so that the load factor is less than the maximum load factor.

2+^h| *Open-addressing containers only* +
`boost::unordered_flat_set`, `boost::unordered_flat_map`
`boost::unordered_flat_set`, `boost::unordered_flat_map` +
`boost::unordered_node_set`, `boost::unordered_node_map` +
h|*Method* h|*Description*

|`size_type max_load() const`
@ -160,8 +161,9 @@ change the number of buckets when this happens. Iterators can be
invalidated by calls to `insert`, `rehash` and `reserve`.

As for pointers and references,
they are never invalidated for closed-addressing containers (`boost::unordered_[multi]set`, `boost::unordered_[multi]map`),
but they are invalidated when rehashing occurs for open-addressing
they are never invalidated for node-based containers
(`boost::unordered_[multi]set`, `boost::unordered_[multi]map`, `boost::unordered_node_set`, `boost::unordered_node_map`),
but they are invalidated when rehashing occurs for
`boost::unordered_flat_set` and `boost::unordered_flat_map`: this is because
these containers store elements directly into their holding buckets, so
when allocating a new bucket array the elements must be transferred by means of move construction.
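
The difference in pointer stability can be sketched with standard-library stand-ins rather than the Boost containers themselves. This is an illustration under stated assumptions: `std::unordered_set` plays the role of a node-based container, whose element addresses survive a rehash, and `std::vector` plays the role of flat storage, where growing the array moves the elements. The helper names are hypothetical and are not part of Boost.Unordered.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Node-based stand-in: each element lives in its own node, so a rehash
// only relinks nodes and the element's address stays the same.
inline bool node_address_survives_rehash()
{
    std::unordered_set<int> s{42};
    const int* before = &*s.find(42);
    s.rehash(1024);                      // forces a new, larger bucket array
    return before == &*s.find(42);
}

// Flat stand-in: elements live directly in the array, so allocating a
// larger array moves them to new addresses.
inline bool flat_address_changes_on_growth()
{
    std::vector<int> v{42};
    // Keep the old address as an integer so no dangling pointer is compared.
    const auto before = reinterpret_cast<std::uintptr_t>(v.data());
    v.reserve(v.capacity() + 1024);      // forces reallocation
    const auto after = reinterpret_cast<std::uintptr_t>(v.data());
    return before != after;
}
```

Both helpers are expected to return `true`. The same reasoning is why a pointer obtained from a flat container should be re-acquired after any operation that may rehash.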
@ -252,15 +254,16 @@ xref:#rationale_boostunordered_multiset_and_boostunordered_multimap[correspondin

== Open Addressing Implementation

The diagram shows the basic internal layout of `boost::unordered_flat_map` and
`boost::unordered_flat_set`.
The diagram shows the basic internal layout of `boost::unordered_flat_map`/`unordered_node_map` and
`boost::unordered_flat_set`/`unordered_node_set`.


[#img-foa-layout]
.Open-addressing layout used by Boost.Unordered.
image::foa.png[align=center]

As with all open-addressing containers, elements are stored directly in the bucket array.
As with all open-addressing containers, elements (or element nodes in the case of
`boost::unordered_node_map` and `boost::unordered_node_set`) are stored directly in the bucket array.
This array is logically divided into 2^_n_^ _groups_ of 15 elements each.
In addition to the bucket array, there is an associated _metadata array_ with 2^_n_^
16-byte words.
@ -33,8 +33,8 @@

|Iterators, pointers and references to the container's elements are never invalidated.
|<<buckets_iterator_invalidation,Iterators can be invalidated by calls to insert or rehash>>. +
**Closed-addressing containers:** Pointers and references to the container's elements are never invalidated. +
**Open-addressing containers:** Pointers and references to the container's elements are invalidated when rehashing occurs.
**Node-based containers:** Pointers and references to the container's elements are never invalidated. +
**Flat containers:** Pointers and references to the container's elements are invalidated when rehashing occurs.

|Iterators iterate through the container in the order defined by the comparison object.
|Iterators iterate through the container in an arbitrary order that can change as elements are inserted, although equivalent elements are always adjacent.
@ -117,27 +117,29 @@ Variadic constructor arguments for `emplace` are only used when both
rvalue references and variadic template parameters are available.
Otherwise `emplace` can only take up to 10 constructor arguments.

== Open-addressing containers: unordered_flat_set, unordered_flat_map
== Open-addressing containers: unordered_flat_set/unordered_node_set, unordered_flat_map/unordered_node_map

The C++ standard does not currently provide any open-addressing container
specification to adhere to, so `boost::unordered_flat_set` and
`boost::unordered_flat_map` take inspiration from `std::unordered_set` and
specification to adhere to, so `boost::unordered_flat_set`/`unordered_node_set` and
`boost::unordered_flat_map`/`unordered_node_map` take inspiration from `std::unordered_set` and
`std::unordered_map`, respectively, and depart from their interface where
convenient or as dictated by their internal data structure, which is
radically different from that imposed by the standard (closed addressing, node based).
radically different from that imposed by the standard (closed addressing).

`unordered_flat_set` and `unordered_flat_map` only work with reasonably
Open-addressing containers provided by Boost.Unordered only work with reasonably
compliant C++11 (or later) compilers. Language-level features such as move semantics
and variadic template parameters are then not emulated.
`unordered_flat_set` and `unordered_flat_map` are fully https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^].
The containers are fully https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^].

The main differences with C++ unordered associative containers are:

* `value_type` must be move-constructible.
* Pointer stability is not kept under rehashing.
* `begin()` is not constant-time.
* `erase(iterator)` returns `void` instead of an iterator to the following element.
* There is no API for bucket handling (except `bucket_count`) or node extraction/insertion.
* The maximum load factor of the container is managed internally and can't be set by the user. The maximum load,
exposed through the public function `max_load`, may decrease on erasure under high-load conditions.

* In general:
** `begin()` is not constant-time.
** `erase(iterator)` returns `void` instead of an iterator to the following element.
** There is no API for bucket handling (except `bucket_count`).
** The maximum load factor of the container is managed internally and can't be set by the user. The maximum load,
exposed through the public function `max_load`, may decrease on erasure under high-load conditions.
* Flat containers (`boost::unordered_flat_set` and `boost::unordered_flat_map`):
** `value_type` must be move-constructible.
** Pointer stability is not kept under rehashing.
** There is no API for node extraction/insertion.
@ -106,8 +106,69 @@ namespace boost {
}
----

`boost::unordered_flat_set` and `boost::unordered_flat_map` require a
reasonably compliant C++11 compiler.
Starting in Boost 1.82, the containers `boost::unordered_node_set` and `boost::unordered_node_map`
are introduced: they use open addressing like `boost::unordered_flat_set` and `boost::unordered_flat_map`,
but internally store element _nodes_, like `boost::unordered_set` and `boost::unordered_map`,
which provide stability of pointers and references to the elements:

[source,c++]
----
// #include <boost/unordered/unordered_node_set.hpp>
//
// Note: no multiset version

namespace boost {
  template <
    class Key,
    class Hash = boost::hash<Key>,
    class Pred = std::equal_to<Key>,
    class Alloc = std::allocator<Key> >
  class unordered_node_set;
}
----

[source,c++]
----
// #include <boost/unordered/unordered_node_map.hpp>
//
// Note: no multimap version

namespace boost {
  template <
    class Key, class Mapped,
    class Hash = boost::hash<Key>,
    class Pred = std::equal_to<Key>,
    class Alloc = std::allocator<std::pair<Key const, Mapped> > >
  class unordered_node_map;
}
----

These are all the containers provided by Boost.Unordered:

[caption=, title='Table {counter:table-counter}. Boost.Unordered containers']
[cols="1,1,.^1", frame=all, grid=rows]
|===
^h|
^h|*Node-based*
^h|*Flat*

^.^h|*Closed addressing*
^| `boost::unordered_set` +
`boost::unordered_map` +
`boost::unordered_multiset` +
`boost::unordered_multimap`
^|

^.^h|*Open addressing*
^| `boost::unordered_node_set` +
`boost::unordered_node_map`
^| `boost::unordered_flat_set` +
`boost::unordered_flat_map`

|===

Closed-addressing containers are pass:[C++]98-compatible. Open-addressing containers require a
reasonably compliant pass:[C++]11 compiler.

Boost.Unordered containers are used in a similar manner to the normal associative
containers:
@ -64,8 +64,8 @@ of bits in the hash value, so it was only used when `size_t` was 64 bit.

Since release 1.79.0, https://en.wikipedia.org/wiki/Hash_function#Fibonacci_hashing[Fibonacci hashing]
is used instead. With this implementation, the bucket number is determined
by using `(h * m) >> (w - k)`, where `h` is the hash value, `m` is the golden
ratio multiplied by `2^w`, `w` is the word size (32 or 64), and `2^k` is the
by using `(h * m) >> (w - k)`, where `h` is the hash value, `m` is `2^w` divided
by the golden ratio, `w` is the word size (32 or 64), and `2^k` is the
number of buckets. This provides a good compromise between speed and
distribution.
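
The formula `(h * m) >> (w - k)` can be sketched as follows for `w` = 64. This is a minimal illustration, not the library's actual code: the function and constant names are hypothetical, and the multiplier is the integer part of 2^64^ divided by the golden ratio.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// m = floor(2^64 / phi), the usual 64-bit Fibonacci hashing multiplier.
constexpr std::uint64_t fib_multiplier = 0x9E3779B97F4A7C15ull;

// Maps the hash value h to one of 2^k buckets using (h * m) >> (w - k)
// with w = 64. The product wraps modulo 2^64, and the top k bits of the
// result select the bucket. Requires 1 <= k <= 63.
inline std::size_t fibonacci_bucket(std::uint64_t h, unsigned k)
{
    return static_cast<std::size_t>((h * fib_multiplier) >> (64 - k));
}
```

With `k` = 4 (16 buckets), consecutive hash values 0, 1 and 2 land in buckets 0, 9 and 3 respectively, which shows how the multiplication spreads nearby inputs across the bucket range.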
@ -73,7 +73,7 @@ Since release 1.80.0, prime numbers are chosen for the number of buckets in
tandem with sophisticated modulo arithmetic. This removes the need for "mixing"
the result of the user's hash function as was used for release 1.79.0.

== boost::unordered_flat_set and boost::unordered_flat_map
== boost::unordered_flat_set/unordered_node_set and boost::unordered_flat_map/unordered_node_map

The C++ standard specification of unordered associative containers imposes
severe limitations on permissible implementations, the most important being
@ -81,14 +81,14 @@ that closed addressing is implicitly assumed. Slightly relaxing this specificati
opens up the possibility of providing container variations taking full
advantage of open-addressing techniques.

The design of `boost::unordered_flat_set` and `boost::unordered_flat_map` has been
The design of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` has been
guided by Peter Dimov's https://pdimov.github.io/articles/unordered_dev_plan.html[Development Plan for Boost.Unordered^].
We discuss here the most relevant principles.

=== Hash function

Given its rich functionality and cross-platform interoperability,
`boost::hash` remains the default hash function of `boost::unordered_flat_set` and `boost::unordered_flat_map`.
`boost::hash` remains the default hash function of open-addressing containers.
As it happens, `boost::hash` for integral and other basic types does not possess
the statistical properties required by open addressing; to cope with this,
we implement a post-mixing stage:
@ -98,17 +98,15 @@ we implement a post-mixing stage:

where *mulx* is an _extended multiplication_ (128 bits in 64-bit architectures, 64 bits in 32-bit environments),
and *high* and *low* are the upper and lower halves of an extended word, respectively.
In 64-bit architectures, _C_ is the integer part of
(1 − https://en.wikipedia.org/wiki/Golden_ratio[_φ_])·2^64^,
whereas in 32 bits _C_ = 0xE817FB2Du has been obtained from
https://arxiv.org/abs/2001.05304[Steele and Vigna (2021)^].
In 64-bit architectures, _C_ is the integer part of 2^64^∕https://en.wikipedia.org/wiki/Golden_ratio[_φ_],
whereas in 32 bits _C_ = 0xE817FB2Du has been obtained from https://arxiv.org/abs/2001.05304[Steele and Vigna (2021)^].
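
A post-mixing stage of this kind can be sketched for the 32-bit case, where the extended multiplication fits in a standard 64-bit integer. The way the two halves are combined is not shown in this excerpt, so the XOR of *high* and *low* used below is an assumption, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit constant C quoted in the text above.
constexpr std::uint32_t mix_constant_32 = 0xE817FB2Du;

// Sketch of a 32-bit post-mixing step:
//   a = h mulx C   (32 x 32 -> 64-bit extended multiplication)
//   result = high(a) xor low(a)   (assumed way of combining the halves)
inline std::uint32_t post_mix_32(std::uint32_t h)
{
    const std::uint64_t a = static_cast<std::uint64_t>(h) * mix_constant_32;
    const std::uint32_t high = static_cast<std::uint32_t>(a >> 32);
    const std::uint32_t low = static_cast<std::uint32_t>(a);
    return high ^ low;
}
```

For example, `post_mix_32(1)` yields `0xE817FB2D` and `post_mix_32(2)` yields `0xD02FF65B`: the upper half of the product is folded back into the lower half, so high-order bits of the product influence the final value.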

When using a hash function directly suitable for open addressing, post-mixing can be opted out of via a dedicated <<hash_traits_hash_is_avalanching,`hash_is_avalanching`>> trait.
`boost::hash` specializations for string types are marked as avalanching.

=== Platform interoperability

The observable behavior of `boost::unordered_flat_set` and `boost::unordered_flat_map` is deterministically
The observable behavior of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` is deterministically
identical across different compilers as long as their ``std::size_t``s are the same size and the user-provided
hash function and equality predicate are also interoperable
—this includes elements being ordered in exactly the same way for the same sequence of