Merge pull request #182 from boostorg/feature/unordered_node_map_docs
Feature/unordered node map docs
@ -278,13 +278,14 @@ max load factor 5
|
||||
|
||||
|===
|
||||
|
||||
== boost::unordered_flat_map
|
||||
== boost::unordered_(flat|node)_map
|
||||
|
||||
All benchmarks were created using:
|
||||
|
||||
* `https://abseil.io/docs/cpp/guides/container[absl::flat_hash_map^]<uint64_t, uint64_t>`
|
||||
* `boost::unordered_flat_map<uint64_t, uint64_t>`
|
||||
* `boost::unordered_map<uint64_t, uint64_t>`
|
||||
* `boost::unordered_flat_map<uint64_t, uint64_t>`
|
||||
* `boost::unordered_node_map<uint64_t, uint64_t>`
|
||||
|
||||
The source code can be https://github.com/boostorg/boost_unordered_benchmarks/tree/boost_unordered_flat_map[found here^].
|
||||
|
||||
|
@ -134,7 +134,8 @@ h|*Method* h|*Description*
|
||||
|Changes the number of buckets so that there are at least `n` buckets, and so that the load factor is less than the maximum load factor.
|
||||
|
||||
2+^h| *Open-addressing containers only* +
|
||||
`boost::unordered_flat_set`, `boost::unordered_flat_map`
|
||||
`boost::unordered_flat_set`, `boost::unordered_flat_map` +
|
||||
`boost::unordered_node_set`, `boost::unordered_node_map` +
|
||||
h|*Method* h|*Description*
|
||||
|
||||
|`size_type max_load() const`
|
||||
@ -160,8 +161,9 @@ change the number of buckets when this happens. Iterators can be
|
||||
invalidated by calls to `insert`, `rehash` and `reserve`.
|
||||
|
||||
As for pointers and references,
|
||||
they are never invalidated for closed-addressing containers (`boost::unordered_[multi]set`, `boost::unordered_[multi]map`),
|
||||
but they will when rehashing occurs for open-addressing
|
||||
they are never invalidated for node-based containers
|
||||
(`boost::unordered_[multi]set`, `boost::unordered_[multi]map`, `boost::unordered_node_set`, `boost::unordered_node_map`),
|
||||
but they are invalidated when rehashing occurs for
|
||||
`boost::unordered_flat_set` and `boost::unordered_flat_map`: this is because
|
||||
these containers store elements directly into their holding buckets, so
|
||||
when allocating a new bucket array the elements must be transferred by means of move construction.
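
The difference between flat and node-based storage can be illustrated with a toy model. The sketch below is not Boost.Unordered code: `FlatStore`, `NodeStore` and the helper functions are hypothetical names, and a `std::vector` stands in for the bucket array. It only shows why moving elements into a new array changes their addresses, while moving pointers to nodes does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Toy model (hypothetical, not the Boost implementation):
// elements live directly inside the "bucket array".
struct FlatStore {
    std::vector<int> slots;
    void insert(int v) { slots.push_back(v); }
    // "Rehash": allocate a new array and move-construct the elements into it.
    void rehash(std::size_t new_capacity) {
        assert(new_capacity >= slots.size());
        std::vector<int> bigger;
        bigger.reserve(new_capacity);
        for (int& v : slots) bigger.push_back(std::move(v));
        slots.swap(bigger);  // the old array is released when `bigger` dies
    }
};

// Toy model: the "bucket array" holds pointers to separately allocated nodes.
struct NodeStore {
    std::vector<std::unique_ptr<int>> slots;
    void insert(int v) { slots.push_back(std::unique_ptr<int>(new int(v))); }
    // "Rehash": only the pointers are moved; the nodes stay where they are.
    void rehash(std::size_t new_capacity) {
        assert(new_capacity >= slots.size());
        std::vector<std::unique_ptr<int>> bigger;
        bigger.reserve(new_capacity);
        for (auto& p : slots) bigger.push_back(std::move(p));
        slots.swap(bigger);
    }
};

// Addresses are recorded as integers so that no stale pointer is ever compared.
inline std::uintptr_t address_of(const int& x) {
    return reinterpret_cast<std::uintptr_t>(&x);
}

inline bool flat_element_moves_on_rehash() {
    FlatStore s;
    s.insert(42);
    const std::uintptr_t before = address_of(s.slots[0]);
    s.rehash(64);
    return address_of(s.slots[0]) != before;
}

inline bool node_element_stays_on_rehash() {
    NodeStore s;
    s.insert(42);
    const std::uintptr_t before = address_of(*s.slots[0]);
    s.rehash(64);
    return address_of(*s.slots[0]) == before && *s.slots[0] == 42;
}
```

In this model the extra indirection of `NodeStore` is the price paid for stable addresses, which mirrors the trade-off between the node-based and the flat containers described above.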
|
||||
@ -252,15 +254,16 @@ xref:#rationale_boostunordered_multiset_and_boostunordered_multimap[correspondin
|
||||
|
||||
== Open Addressing Implementation
|
||||
|
||||
The diagram shows the basic internal layout of `boost::unordered_flat_map` and
|
||||
`boost::unordered_flat_set`.
|
||||
The diagram shows the basic internal layout of `boost::unordered_flat_map`/`unordered_node_map` and
|
||||
`boost::unordered_flat_set`/`unordered_node_set`.
|
||||
|
||||
|
||||
[#img-foa-layout]
|
||||
.Open-addressing layout used by Boost.Unordered.
|
||||
image::foa.png[align=center]
|
||||
|
||||
As with all open-addressing containers, elements are stored directly in the bucket array.
|
||||
As with all open-addressing containers, elements (or pointers to the element nodes in the case of
|
||||
`boost::unordered_node_map` and `boost::unordered_node_set`) are stored directly in the bucket array.
|
||||
This array is logically divided into 2^_n_^ _groups_ of 15 elements each.
|
||||
In addition to the bucket array, there is an associated _metadata array_ with 2^_n_^
|
||||
16-byte words.
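
As a quick check of the arithmetic implied by this layout, the following sketch computes the number of element slots and the size of the metadata array for a given _n_. Only the constants 15 and 16 come from the text; the function and constant names are assumptions made for illustration and do not appear in the library.

```cpp
#include <cassert>
#include <cstddef>

// Constants taken from the description above; names are hypothetical.
constexpr std::size_t kSlotsPerGroup = 15;      // elements per group
constexpr std::size_t kMetadataWordBytes = 16;  // one metadata word per group

// Number of groups for a given n (2^n).
constexpr std::size_t group_count(unsigned n) {
    return std::size_t(1) << n;
}

// Total element slots in the bucket array.
constexpr std::size_t slot_capacity(unsigned n) {
    return kSlotsPerGroup * group_count(n);
}

// Total size in bytes of the associated metadata array.
constexpr std::size_t metadata_bytes(unsigned n) {
    return kMetadataWordBytes * group_count(n);
}

static_assert(slot_capacity(3) == 120, "8 groups of 15 slots");
static_assert(metadata_bytes(3) == 128, "8 words of 16 bytes");
```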
|
||||
|
@ -6,10 +6,12 @@
|
||||
:github-pr-url: https://github.com/boostorg/unordered/pull
|
||||
:cpp: C++
|
||||
|
||||
== Release 1.82.0
|
||||
== Release 1.82.0 - Major update
|
||||
|
||||
* Added node-based, open-addressing containers
|
||||
`boost::unordered_node_map` and `boost::unordered_node_set`.
|
||||
* Extended heterogeneous lookup to more member functions as specified in
|
||||
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2363r3.html[P2363].
|
||||
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2363r5.html[P2363].
|
||||
* Replaced the previous post-mixing process for open-addressing containers with
|
||||
a new algorithm based on extended multiplication by a constant.
|
||||
|
||||
|
@ -33,8 +33,8 @@
|
||||
|
||||
|Iterators, pointers and references to the container's elements are never invalidated.
|
||||
|<<buckets_iterator_invalidation,Iterators can be invalidated by calls to insert or rehash>>. +
|
||||
**Closed-addressing containers:** Pointers and references to the container's elements are never invalidated. +
|
||||
**Open-addressing containers:** Pointers and references to the container's elements are invalidated when rehashing occurs.
|
||||
**Node-based containers:** Pointers and references to the container's elements are never invalidated. +
|
||||
**Flat containers:** Pointers and references to the container's elements are invalidated when rehashing occurs.
|
||||
|
||||
|Iterators iterate through the container in the order defined by the comparison object.
|
||||
|Iterators iterate through the container in an arbitrary order that can change as elements are inserted, although equivalent elements are always adjacent.
|
||||
|
@ -5,9 +5,9 @@
|
||||
|
||||
:cpp: C++
|
||||
|
||||
== Closed-addressing containers: unordered_[multi]set, unordered_[multi]map
|
||||
== Closed-addressing containers
|
||||
|
||||
The intent of Boost.Unordered is to provide a conformant
|
||||
`unordered_[multi]set` and `unordered_[multi]map` are intended to provide a conformant
|
||||
implementation of the {cpp}20 standard that will work with {cpp}98 upwards.
|
||||
This wide compatibility does mean some compromises have to be made.
|
||||
With a compiler and library that fully support {cpp}11, the differences should
|
||||
@ -117,27 +117,29 @@ Variadic constructor arguments for `emplace` are only used when both
|
||||
rvalue references and variadic template parameters are available.
|
||||
Otherwise `emplace` can only take up to 10 constructor arguments.
|
||||
|
||||
== Open-addressing containers: unordered_flat_set, unordered_flat_map
|
||||
== Open-addressing containers
|
||||
|
||||
The C++ standard does not currently provide any open-addressing container
|
||||
specification to adhere to, so `boost::unordered_flat_set` and
|
||||
`boost::unordered_flat_map` take inspiration from `std::unordered_set` and
|
||||
specification to adhere to, so `boost::unordered_flat_set`/`unordered_node_set` and
|
||||
`boost::unordered_flat_map`/`unordered_node_map` take inspiration from `std::unordered_set` and
|
||||
`std::unordered_map`, respectively, and depart from their interface where
|
||||
convenient or as dictated by their internal data structure, which is
|
||||
radically different from that imposed by the standard (closed addressing, node based).
|
||||
radically different from that imposed by the standard (closed addressing).
|
||||
|
||||
`unordered_flat_set` and `unordered_flat_map` only work with reasonably
|
||||
Open-addressing containers provided by Boost.Unordered only work with reasonably
|
||||
compliant C++11 (or later) compilers. Language-level features such as move semantics
|
||||
and variadic template parameters are then not emulated.
|
||||
`unordered_flat_set` and `unordered_flat_map` are fully https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^].
|
||||
The containers are fully https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^].
|
||||
|
||||
The main differences from the C++ standard unordered associative containers are:
|
||||
|
||||
* `value_type` must be move-constructible.
|
||||
* Pointer stability is not kept under rehashing.
|
||||
* `begin()` is not constant-time.
|
||||
* `erase(iterator)` returns `void` instead of an iterator to the following element.
|
||||
* There is no API for bucket handling (except `bucket_count`) or node extraction/insertion.
|
||||
* The maximum load factor of the container is managed internally and can't be set by the user. The maximum load,
|
||||
exposed through the public function `max_load`, may decrease on erasure under high-load conditions.
|
||||
|
||||
* In general:
|
||||
** `begin()` is not constant-time.
|
||||
** `erase(iterator)` returns `void` instead of an iterator to the following element.
|
||||
** There is no API for bucket handling (except `bucket_count`).
|
||||
** The maximum load factor of the container is managed internally and can't be set by the user. The maximum load,
|
||||
exposed through the public function `max_load`, may decrease on erasure under high-load conditions.
|
||||
* Flat containers (`boost::unordered_flat_set` and `boost::unordered_flat_map`):
|
||||
** `value_type` must be move-constructible.
|
||||
** Pointer stability is not kept under rehashing.
|
||||
** There is no API for node extraction/insertion.
|
||||
|
@ -9,10 +9,10 @@ Copyright (C) 2003, 2004 Jeremy B. Maitin-Shepard
|
||||
|
||||
Copyright (C) 2005-2008 Daniel James
|
||||
|
||||
Copyright (C) 2022 Christian Mazakas
|
||||
Copyright (C) 2022-2023 Christian Mazakas
|
||||
|
||||
Copyright (C) 2022 Joaquín M López Muñoz
|
||||
Copyright (C) 2022-2023 Joaquín M López Muñoz
|
||||
|
||||
Copyright (C) 2022 Peter Dimov
|
||||
Copyright (C) 2022-2023 Peter Dimov
|
||||
|
||||
Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
|
||||
|
@ -106,8 +106,69 @@ namespace boost {
|
||||
}
|
||||
----
|
||||
|
||||
`boost::unordered_flat_set` and `boost::unordered_flat_map` require a
|
||||
reasonably compliant C++11 compiler.
|
||||
Starting in Boost 1.82, the containers `boost::unordered_node_set` and `boost::unordered_node_map`
|
||||
are introduced: they use open addressing like `boost::unordered_flat_set` and `boost::unordered_flat_map`,
|
||||
but internally store element _nodes_, like `boost::unordered_set` and `boost::unordered_map`,
|
||||
which provides stability of pointers and references to the elements:
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
// #include <boost/unordered/unordered_node_set.hpp>
|
||||
//
|
||||
// Note: no multiset version
|
||||
|
||||
namespace boost {
|
||||
template <
|
||||
class Key,
|
||||
class Hash = boost::hash<Key>,
|
||||
class Pred = std::equal_to<Key>,
|
||||
class Alloc = std::allocator<Key> >
|
||||
class unordered_node_set;
|
||||
}
|
||||
----
|
||||
|
||||
[source,c++]
|
||||
----
|
||||
// #include <boost/unordered/unordered_node_map.hpp>
|
||||
//
|
||||
// Note: no multimap version
|
||||
|
||||
namespace boost {
|
||||
template <
|
||||
class Key, class Mapped,
|
||||
class Hash = boost::hash<Key>,
|
||||
class Pred = std::equal_to<Key>,
|
||||
class Alloc = std::allocator<std::pair<Key const, Mapped> > >
|
||||
class unordered_node_map;
|
||||
}
|
||||
----
|
||||
|
||||
These are all the containers provided by Boost.Unordered:
|
||||
|
||||
[caption=, title='Table {counter:table-counter}. Boost.Unordered containers']
|
||||
[cols="1,1,.^1", frame=all, grid=rows]
|
||||
|===
|
||||
^h|
|
||||
^h|*Node-based*
|
||||
^h|*Flat*
|
||||
|
||||
^.^h|*Closed addressing*
|
||||
^| `boost::unordered_set` +
|
||||
`boost::unordered_map` +
|
||||
`boost::unordered_multiset` +
|
||||
`boost::unordered_multimap`
|
||||
^|
|
||||
|
||||
^.^h|*Open addressing*
|
||||
^| `boost::unordered_node_set` +
|
||||
`boost::unordered_node_map`
|
||||
^| `boost::unordered_flat_set` +
|
||||
`boost::unordered_flat_map`
|
||||
|
||||
|===
|
||||
|
||||
Closed-addressing containers are pass:[C++]98-compatible. Open-addressing containers require a
|
||||
reasonably compliant pass:[C++]11 compiler.
|
||||
|
||||
Boost.Unordered containers are used in a similar manner to the normal associative
|
||||
containers:
|
||||
|
@ -4,9 +4,10 @@
|
||||
|
||||
= Implementation Rationale
|
||||
|
||||
== boost::unordered_[multi]set and boost::unordered_[multi]map
|
||||
== Closed-addressing containers
|
||||
|
||||
These containers adhere to the standard requirements for unordered associative
|
||||
`boost::unordered_[multi]set` and `boost::unordered_[multi]map`
|
||||
adhere to the standard requirements for unordered associative
|
||||
containers, so the interface was fixed. But there are
|
||||
still some implementation decisions to make. The priorities are
|
||||
conformance to the standard and portability.
|
||||
@ -64,8 +65,8 @@ of bits in the hash value, so it was only used when `size_t` was 64 bit.
|
||||
|
||||
Since release 1.79.0, https://en.wikipedia.org/wiki/Hash_function#Fibonacci_hashing[Fibonacci hashing]
|
||||
is used instead. With this implementation, the bucket number is determined
|
||||
by using `(h * m) >> (w - k)`, where `h` is the hash value, `m` is the golden
|
||||
ratio multiplied by `2^w`, `w` is the word size (32 or 64), and `2^k` is the
|
||||
by using `(h * m) >> (w - k)`, where `h` is the hash value, `m` is `2^w` divided
|
||||
by the golden ratio, `w` is the word size (32 or 64), and `2^k` is the
|
||||
number of buckets. This provides a good compromise between speed and
|
||||
distribution.
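
The bucket formula above can be sketched for `w` = 64 as follows. This is an illustrative stand-alone function, not the library's internal code: the name `fibonacci_bucket` is hypothetical, and the multiplier is the well-known value of 2^64 divided by the golden ratio, rounded down.

```cpp
#include <cassert>
#include <cstdint>

// floor(2^64 / phi): the usual 64-bit Fibonacci hashing multiplier.
constexpr std::uint64_t kFibMultiplier = 0x9E3779B97F4A7C15ull;

// Maps hash value h to one of 2^k buckets using (h * m) >> (w - k), w = 64.
// The multiplication wraps modulo 2^64; the top k bits select the bucket.
// k must be in [1, 64] so that the shift amount stays below 64.
inline std::uint64_t fibonacci_bucket(std::uint64_t h, unsigned k) {
    assert(k >= 1 && k <= 64);
    return (h * kFibMultiplier) >> (64 - k);
}
```

For example, with `k = 4` (16 buckets) the hash value 1 lands in bucket 9, since the top four bits of the multiplier are `0x9`.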
|
||||
|
||||
@ -73,7 +74,7 @@ Since release 1.80.0, prime numbers are chosen for the number of buckets in
|
||||
tandem with sophisticated modulo arithmetic. This removes the need for "mixing"
|
||||
the result of the user's hash function as was used for release 1.79.0.
|
||||
|
||||
== boost::unordered_flat_set and boost::unordered_flat_map
|
||||
== Open-addressing containers
|
||||
|
||||
The C++ standard specification of unordered associative containers imposes
|
||||
severe limitations on permissible implementations, the most important being
|
||||
@ -81,14 +82,14 @@ that closed addressing is implicitly assumed. Slightly relaxing this specificati
|
||||
opens up the possibility of providing container variations taking full
|
||||
advantage of open-addressing techniques.
|
||||
|
||||
The design of `boost::unordered_flat_set` and `boost::unordered_flat_map` has been
|
||||
The design of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` has been
|
||||
guided by Peter Dimov's https://pdimov.github.io/articles/unordered_dev_plan.html[Development Plan for Boost.Unordered^].
|
||||
We discuss here the most relevant principles.
|
||||
|
||||
=== Hash function
|
||||
|
||||
Given its rich functionality and cross-platform interoperability,
|
||||
`boost::hash` remains the default hash function of `boost::unordered_flat_set` and `boost::unordered_flat_map`.
|
||||
`boost::hash` remains the default hash function of open-addressing containers.
|
||||
As it happens, `boost::hash` for integral and other basic types does not possess
|
||||
the statistical properties required by open addressing; to cope with this,
|
||||
we implement a post-mixing stage:
|
||||
@ -98,17 +99,15 @@ we implement a post-mixing stage:
|
||||
|
||||
where *mulx* is an _extended multiplication_ (128 bits in 64-bit architectures, 64 bits in 32-bit environments),
|
||||
and *high* and *low* are the upper and lower halves of an extended word, respectively.
|
||||
In 64-bit architectures, _C_ is the integer part of
|
||||
(1 − https://en.wikipedia.org/wiki/Golden_ratio[_φ_])·2^64^,
|
||||
whereas in 32 bits _C_ = 0xE817FB2Du has been obtained from
|
||||
https://arxiv.org/abs/2001.05304[Steele and Vigna (2021)^].
|
||||
In 64-bit architectures, _C_ is the integer part of 2^64^∕https://en.wikipedia.org/wiki/Golden_ratio[_φ_],
|
||||
whereas in 32 bits _C_ = 0xE817FB2Du has been obtained from https://arxiv.org/abs/2001.05304[Steele and Vigna (2021)^].
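
A minimal sketch of the 64-bit post-mixing step follows. The mixing formula itself is not reproduced in this excerpt, so the way the two halves are combined (xor) is an assumption here, as are the names `U128`, `mulx64` and `post_mix`. The extended multiplication is written portably with 32-bit pieces rather than with compiler-specific 128-bit types.

```cpp
#include <cstdint>

// Result of a 64x64 -> 128-bit multiplication (hypothetical helper type).
struct U128 {
    std::uint64_t high;
    std::uint64_t low;
};

// Portable extended multiplication built from four 32x32 -> 64 products.
inline U128 mulx64(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    const std::uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;

    const std::uint64_t p0 = a_lo * b_lo;  // contributes to bits 0..63
    const std::uint64_t p1 = a_lo * b_hi;  // contributes to bits 32..95
    const std::uint64_t p2 = a_hi * b_lo;  // contributes to bits 32..95
    const std::uint64_t p3 = a_hi * b_hi;  // contributes to bits 64..127

    // Sum of the pieces that land in bits 32..63, including carries.
    const std::uint64_t mid =
        (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    U128 r;
    r.low = (mid << 32) | (p0 & 0xFFFFFFFFu);
    r.high = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}

// C = integer part of 2^64 / phi, as stated in the text.
constexpr std::uint64_t kMixConstant = 0x9E3779B97F4A7C15ull;

// Assumed post-mixing: multiply by C and fold the halves together with xor.
inline std::uint64_t post_mix(std::uint64_t h) {
    const U128 a = mulx64(h, kMixConstant);
    return a.high ^ a.low;
}
```

Folding the upper half back into the lower half lets the high-order bits of the product, which depend on every input bit, influence the final value.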
|
||||
|
||||
When using a hash function directly suitable for open addressing, post-mixing can be opted out of via the dedicated <<hash_traits_hash_is_avalanching,`hash_is_avalanching`>> trait.
|
||||
`boost::hash` specializations for string types are marked as avalanching.
|
||||
|
||||
=== Platform interoperability
|
||||
|
||||
The observable behavior of `boost::unordered_flat_set` and `boost::unordered_flat_map` is deterministically
|
||||
The observable behavior of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` is deterministically
|
||||
identical across different compilers as long as their ``std::size_t``s are the same size and the user-provided
|
||||
hash function and equality predicate are also interoperable
|
||||
—this includes elements being ordered in exactly the same way for the same sequence of
|
||||
|
@ -8,3 +8,5 @@ include::unordered_multiset.adoc[]
|
||||
include::hash_traits.adoc[]
|
||||
include::unordered_flat_map.adoc[]
|
||||
include::unordered_flat_set.adoc[]
|
||||
include::unordered_node_map.adoc[]
|
||||
include::unordered_node_set.adoc[]
|
||||
|