Merge pull request #182 from boostorg/feature/unordered_node_map_docs

Feature/unordered node map docs
joaquintides
2023-02-25 10:20:05 +01:00
committed by GitHub
39 changed files with 2960 additions and 45 deletions

28 binary image files changed (contents not shown). Sizes before the change range from 35 KiB to 49 KiB; sizes after range from 44 KiB to 62 KiB.

@@ -278,13 +278,14 @@ max load factor 5
 |===
-== boost::unordered_flat_map
+== boost::unordered_(flat|node)_map
 All benchmarks were created using:
 * `https://abseil.io/docs/cpp/guides/container[absl::flat_hash_map^]<uint64_t, uint64_t>`
-* `boost::unordered_flat_map<uint64_t, uint64_t>`
 * `boost::unordered_map<uint64_t, uint64_t>`
+* `boost::unordered_flat_map<uint64_t, uint64_t>`
+* `boost::unordered_node_map<uint64_t, uint64_t>`
 The source code can be https://github.com/boostorg/boost_unordered_benchmarks/tree/boost_unordered_flat_map[found here^].

@@ -134,7 +134,8 @@ h|*Method* h|*Description*
 |Changes the number of buckets so that there are at least `n` buckets, and so that the load factor is less than the maximum load factor.
 2+^h| *Open-addressing containers only* +
-`boost::unordered_flat_set`, `boost::unordered_flat_map`
+`boost::unordered_flat_set`, `boost::unordered_flat_map` +
+`boost::unordered_node_set`, `boost::unordered_node_map` +
 h|*Method* h|*Description*
 |`size_type max_load() const`
@@ -160,8 +161,9 @@ change the number of buckets when this happens. Iterators can be
 invalidated by calls to `insert`, `rehash` and `reserve`.
 As for pointers and references,
-they are never invalidated for closed-addressing containers (`boost::unordered_[multi]set`, `boost::unordered_[multi]map`),
-but they will when rehashing occurs for open-addressing
+they are never invalidated for node-based containers
+(`boost::unordered_[multi]set`, `boost::unordered_[multi]map`, `boost::unordered_node_set`, `boost::unordered_node_map`),
+but they will be when rehashing occurs for
 `boost::unordered_flat_set` and `boost::unordered_flat_map`: this is because
 these containers store elements directly into their holding buckets, so
 when allocating a new bucket array the elements must be transferred by means of move construction.
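
The difference between node-based and flat storage can be illustrated with standard-library stand-ins. This is only a sketch and does not use Boost.Unordered: `std::unordered_map` plays the role of a node-based container, `std::vector` plays the role of flat storage, and the two helper function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Node-based stand-in: std::unordered_map keeps each element in its own node,
// so a rehash relinks nodes without moving the elements themselves.
inline bool node_address_survives_rehash() {
    std::unordered_map<int, std::string> m;
    m.emplace(1, "one");
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(&m.at(1));
    const std::size_t old_buckets = m.bucket_count();
    m.rehash(old_buckets * 8 + 1);  // request a strictly larger bucket array
    assert(m.bucket_count() > old_buckets);
    return before == reinterpret_cast<std::uintptr_t>(&m.at(1));
}

// Flat stand-in: std::vector keeps elements inside the array itself, so a new
// array means every element is move-constructed at a new address.
inline bool flat_address_survives_growth() {
    std::vector<std::string> v(1, "one");
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(&v[0]);
    v.reserve(v.capacity() * 8 + 1);  // force a reallocation
    return before == reinterpret_cast<std::uintptr_t>(&v[0]);
}
```

Addresses are compared as integers so that the stale pointer is never dereferenced or compared as a pointer after the reallocation.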
@@ -252,15 +254,16 @@ xref:#rationale_boostunordered_multiset_and_boostunordered_multimap[correspondin
 == Open Addressing Implementation
-The diagram shows the basic internal layout of `boost::unordered_flat_map` and
-`boost:unordered_flat_set`.
+The diagram shows the basic internal layout of `boost::unordered_flat_map`/`unordered_node_map` and
+`boost::unordered_flat_set`/`unordered_node_set`.
 [#img-foa-layout]
 .Open-addressing layout used by Boost.Unordered.
 image::foa.png[align=center]
-As with all open-addressing containers, elements are stored directly in the bucket array.
+As with all open-addressing containers, elements (or pointers to the element nodes in the case of
+`boost::unordered_node_map` and `boost::unordered_node_set`) are stored directly in the bucket array.
 This array is logically divided into 2^_n_^ _groups_ of 15 elements each.
 In addition to the bucket array, there is an associated _metadata array_ with 2^_n_^
 16-byte words.
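
The sizes implied by this layout can be worked out with a few lines of arithmetic. The sketch below uses only the figures stated above (15 slots per group, one 16-byte metadata word per group); the function names are hypothetical, and how the 16 metadata bytes are used is not modelled.

```cpp
#include <cassert>
#include <cstddef>

// Figures from the text: 15 slots per group, one 16-byte metadata word per group.
const std::size_t slots_per_group = 15;
const std::size_t metadata_word_bytes = 16;

// Number of groups for a given n, that is, 2^n.
inline std::size_t group_count(unsigned n) {
    assert(n < sizeof(std::size_t) * 8);  // keep the shift well defined
    return std::size_t(1) << n;
}

// Total number of element slots in the bucket array.
inline std::size_t slot_capacity(unsigned n) {
    return slots_per_group * group_count(n);
}

// Total size in bytes of the associated metadata array.
inline std::size_t metadata_bytes(unsigned n) {
    return metadata_word_bytes * group_count(n);
}
```

For example, with n = 3 there are 8 groups, 120 element slots and 128 bytes of metadata.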

@@ -6,10 +6,12 @@
 :github-pr-url: https://github.com/boostorg/unordered/pull
 :cpp: C++
-== Release 1.82.0
+== Release 1.82.0 - Major update
+* Added node-based, open-addressing containers
+`boost::unordered_node_map` and `boost::unordered_node_set`.
 * Extended heterogeneous lookup to more member functions as specified in
-https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2363r3.html[P2363].
+https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2363r5.html[P2363].
 * Replaced the previous post-mixing process for open-addressing containers with
 a new algorithm based on extended multiplication by a constant.

@@ -33,8 +33,8 @@
 |Iterators, pointers and references to the container's elements are never invalidated.
 |<<buckets_iterator_invalidation,Iterators can be invalidated by calls to insert or rehash>>. +
-**Closed-addressing containers:** Pointers and references to the container's elements are never invalidated. +
-**Open-addressing containers:** Pointers and references to the container's elements are invalidated when rehashing occurs.
+**Node-based containers:** Pointers and references to the container's elements are never invalidated. +
+**Flat containers:** Pointers and references to the container's elements are invalidated when rehashing occurs.
 |Iterators iterate through the container in the order defined by the comparison object.
 |Iterators iterate through the container in an arbitrary order, that can change as elements are inserted, although equivalent elements are always adjacent.

@@ -5,9 +5,9 @@
 :cpp: C++
-== Closed-addressing containers: unordered_[multi]set, unordered_[multi]map
+== Closed-addressing containers
-The intent of Boost.Unordered is to provide a conformant
+`unordered_[multi]set` and `unordered_[multi]map` are intended to provide a conformant
 implementation of the {cpp}20 standard that will work with {cpp}98 upwards.
 This wide compatibility does mean some compromises have to be made.
 With a compiler and library that fully support {cpp}11, the differences should
@@ -117,27 +117,29 @@ Variadic constructor arguments for `emplace` are only used when both
 rvalue references and variadic template parameters are available.
 Otherwise `emplace` can only take up to 10 constructor arguments.
-== Open-addressing containers: unordered_flat_set, unordered_flat_map
+== Open-addressing containers
 The C++ standard does not currently provide any open-addressing container
-specification to adhere to, so `boost::unordered_flat_set` and
-`boost::unordered_flat_map` take inspiration from `std::unordered_set` and
+specification to adhere to, so `boost::unordered_flat_set`/`unordered_node_set` and
+`boost::unordered_flat_map`/`unordered_node_map` take inspiration from `std::unordered_set` and
 `std::unordered_map`, respectively, and depart from their interface where
 convenient or as dictated by their internal data structure, which is
-radically different from that imposed by the standard (closed addressing, node based).
+radically different from that imposed by the standard (closed addressing).
-`unordered_flat_set` and `unordered_flat_map` only work with reasonably
+Open-addressing containers provided by Boost.Unordered only work with reasonably
 compliant C++11 (or later) compilers. Language-level features such as move semantics
 and variadic template parameters are then not emulated.
-`unordered_flat_set` and `unordered_flat_map` are fully https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^].
+The containers are fully https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^].
 The main differences with C++ unordered associative containers are:
-* `value_type` must be move-constructible.
-* Pointer stability is not kept under rehashing.
-* `begin()` is not constant-time.
-* `erase(iterator)` returns `void` instead of an iterator to the following element.
-* There is no API for bucket handling (except `bucket_count`) or node extraction/insertion.
-* The maximum load factor of the container is managed internally and can't be set by the user. The maximum load,
-exposed through the public function `max_load`, may decrease on erasure under high-load conditions.
+* In general:
+** `begin()` is not constant-time.
+** `erase(iterator)` returns `void` instead of an iterator to the following element.
+** There is no API for bucket handling (except `bucket_count`).
+** The maximum load factor of the container is managed internally and can't be set by the user. The maximum load,
+exposed through the public function `max_load`, may decrease on erasure under high-load conditions.
+* Flat containers (`boost::unordered_flat_set` and `boost::unordered_flat_map`):
+** `value_type` must be move-constructible.
+** Pointer stability is not kept under rehashing.
+** There is no API for node extraction/insertion.
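
Because `erase(iterator)` returns `void` in these containers, the familiar `it = m.erase(it)` loop is not available. One possible way to erase while iterating without the return value is sketched below. It assumes that erasing an element invalidates only iterators to that element, which holds for `std::unordered_map` (used here as a stand-in) and should be checked against the reference for any other container. The function name `erase_even_keys` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Erases every element with an even key and returns how many were removed.
// The return value of erase(iterator) is never used, so the loop also fits
// containers whose erase(iterator) returns void.
template <class Map>
std::size_t erase_even_keys(Map& m) {
    const std::size_t before = m.size();
    std::size_t removed = 0;
    for (auto it = m.begin(); it != m.end();) {
        if (it->first % 2 == 0) {
            m.erase(it++);  // step past the element first, then erase the old position
            ++removed;
        } else {
            ++it;
        }
    }
    assert(m.size() + removed == before);
    return removed;
}
```

The post-increment hands a copy of the old position to `erase` after `it` has already advanced, so the loop never touches an invalidated iterator.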

@@ -9,10 +9,10 @@ Copyright (C) 2003, 2004 Jeremy B. Maitin-Shepard
 Copyright (C) 2005-2008 Daniel James
-Copyright (C) 2022 Christian Mazakas
+Copyright (C) 2022-2023 Christian Mazakas
-Copyright (C) 2022 Joaqu&iacute;n M L&oacute;pez Mu&ntilde;oz
+Copyright (C) 2022-2023 Joaqu&iacute;n M L&oacute;pez Mu&ntilde;oz
-Copyright (C) 2022 Peter Dimov
+Copyright (C) 2022-2023 Peter Dimov
 Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

@@ -106,8 +106,69 @@ namespace boost {
 }
 ----
-`boost::unordered_flat_set` and `boost::unordered_flat_map` require a
-reasonably compliant C++11 compiler.
+Starting in Boost 1.82, the containers `boost::unordered_node_set` and `boost::unordered_node_map`
+are introduced: they use open addressing like `boost::unordered_flat_set` and `boost::unordered_flat_map`,
+but internally store element _nodes_, like `boost::unordered_set` and `boost::unordered_map`,
+which provide stability of pointers and references to the elements:
+[source,c++]
+----
+// #include <boost/unordered/unordered_node_set.hpp>
+//
+// Note: no multiset version
+namespace boost {
+  template <
+    class Key,
+    class Hash = boost::hash<Key>,
+    class Pred = std::equal_to<Key>,
+    class Alloc = std::allocator<Key> >
+  class unordered_node_set;
+}
+----
+[source,c++]
+----
+// #include <boost/unordered/unordered_node_map.hpp>
+//
+// Note: no multimap version
+namespace boost {
+  template <
+    class Key, class Mapped,
+    class Hash = boost::hash<Key>,
+    class Pred = std::equal_to<Key>,
+    class Alloc = std::allocator<std::pair<Key const, Mapped> > >
+  class unordered_node_map;
+}
+----
+These are all the containers provided by Boost.Unordered:
+[caption=, title='Table {counter:table-counter}. Boost.Unordered containers']
+[cols="1,1,.^1", frame=all, grid=rows]
+|===
+^h|
+^h|*Node-based*
+^h|*Flat*
+^.^h|*Closed addressing*
+^| `boost::unordered_set` +
+`boost::unordered_map` +
+`boost::unordered_multiset` +
+`boost::unordered_multimap`
+^|
+^.^h|*Open addressing*
+^| `boost::unordered_node_set` +
+`boost::unordered_node_map`
+^| `boost::unordered_flat_set` +
+`boost::unordered_flat_map`
+|===
+Closed-addressing containers are pass:[C++]98-compatible. Open-addressing containers require a
+reasonably compliant pass:[C++]11 compiler.
 Boost.Unordered containers are used in a similar manner to the normal associative
 containers:
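
As a hedged illustration of that similarity, the sketch below counts words with `boost::unordered_node_map` when its header is available and falls back to `std::unordered_map` otherwise. The alias `word_map`, the macro `SKETCH_USE_BOOST_NODE_MAP` and the function `count_words` are hypothetical names introduced for this example only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Use the Boost container only if its header can be found (Boost 1.82 or later).
#if defined(__has_include)
#  if __has_include(<boost/unordered/unordered_node_map.hpp>)
#    include <boost/unordered/unordered_node_map.hpp>
#    define SKETCH_USE_BOOST_NODE_MAP 1
#  endif
#endif

#ifdef SKETCH_USE_BOOST_NODE_MAP
template <class K, class V>
using word_map = boost::unordered_node_map<K, V>;
#else
#include <unordered_map>
template <class K, class V>
using word_map = std::unordered_map<K, V>;  // fallback when the Boost header is absent
#endif

// Counts how often each word occurs, using the same operator[] idiom that
// works with the standard associative containers.
inline word_map<std::string, int> count_words(const std::vector<std::string>& words) {
    word_map<std::string, int> counts;
    for (const std::string& w : words) ++counts[w];
    assert(counts.size() <= words.size());
    return counts;
}
```

The calling code is the same for either container; only the alias changes.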

@@ -4,9 +4,10 @@
 = Implementation Rationale
-== boost::unordered_[multi]set and boost::unordered_[multi]map
+== Closed-addressing containers
-These containers adhere to the standard requirements for unordered associative
+`boost::unordered_[multi]set` and `boost::unordered_[multi]map`
+adhere to the standard requirements for unordered associative
 containers, so the interface was fixed. But there are
 still some implementation decisions to make. The priorities are
 conformance to the standard and portability.
@@ -64,8 +65,8 @@ of bits in the hash value, so it was only used when `size_t` was 64 bit.
 Since release 1.79.0, https://en.wikipedia.org/wiki/Hash_function#Fibonacci_hashing[Fibonacci hashing]
 is used instead. With this implementation, the bucket number is determined
-by using `(h * m) >> (w - k)`, where `h` is the hash value, `m` is the golden
-ratio multiplied by `2^w`, `w` is the word size (32 or 64), and `2^k` is the
+by using `(h * m) >> (w - k)`, where `h` is the hash value, `m` is `2^w` divided
+by the golden ratio, `w` is the word size (32 or 64), and `2^k` is the
 number of buckets. This provides a good compromise between speed and
 distribution.
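
The formula `(h * m) >> (w - k)` can be sketched for `w` = 64 as follows. This is a minimal illustration rather than the library's code: the names `fibonacci_multiplier` and `fibonacci_bucket` are hypothetical, and the constant is the integer part of 2^64 divided by the golden ratio.

```cpp
#include <cassert>
#include <cstdint>

// m for w = 64: the integer part of 2^64 divided by the golden ratio.
const std::uint64_t fibonacci_multiplier = 0x9E3779B97F4A7C15ull;

// Returns the bucket number (h * m) >> (w - k) for a table of 2^k buckets.
// The multiplication wraps modulo 2^64, which is what the formula relies on.
inline std::uint64_t fibonacci_bucket(std::uint64_t h, unsigned k) {
    assert(k >= 1 && k <= 63);  // a shift by 64 would be undefined
    return (h * fibonacci_multiplier) >> (64 - k);
}
```

The shift keeps the top `k` bits of the product, so consecutive hash values such as 1 and 2 land in distant buckets (9 and 3 of 16 when `k` is 4).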
@@ -73,7 +74,7 @@ Since release 1.80.0, prime numbers are chosen for the number of buckets in
 tandem with sophisticated modulo arithmetic. This removes the need for "mixing"
 the result of the user's hash function as was used for release 1.79.0.
-== boost::unordered_flat_set and boost::unordered_flat_map
+== Open-addressing containers
 The C++ standard specification of unordered associative containers imposes
 severe limitations on permissible implementations, the most important being
@@ -81,14 +82,14 @@ that closed addressing is implicitly assumed. Slightly relaxing this specificati
 opens up the possibility of providing container variations taking full
 advantage of open-addressing techniques.
-The design of `boost::unordered_flat_set` and `boost::unordered_flat_map` has been
+The design of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` has been
 guided by Peter Dimov's https://pdimov.github.io/articles/unordered_dev_plan.html[Development Plan for Boost.Unordered^].
 We discuss here the most relevant principles.
 === Hash function
 Given its rich functionality and cross-platform interoperability,
-`boost::hash` remains the default hash function of `boost::unordered_flat_set` and `boost::unordered_flat_map`.
+`boost::hash` remains the default hash function of open-addressing containers.
 As it happens, `boost::hash` for integral and other basic types does not possess
 the statistical properties required by open addressing; to cope with this,
 we implement a post-mixing stage:
@@ -98,17 +99,15 @@ we implement a post-mixing stage:
 where *mulx* is an _extended multiplication_ (128 bits in 64-bit architectures, 64 bits in 32-bit environments),
 and *high* and *low* are the upper and lower halves of an extended word, respectively.
-In 64-bit architectures, _C_ is the integer part of
-(1 &minus; https://en.wikipedia.org/wiki/Golden_ratio[_&phi;_])&middot;2^64^,
-whereas in 32 bits _C_ = 0xE817FB2Du has been obtained from
-https://arxiv.org/abs/2001.05304[Steele and Vigna (2021)^].
+In 64-bit architectures, _C_ is the integer part of 2^64^&#8725;https://en.wikipedia.org/wiki/Golden_ratio[_&phi;_],
+whereas in 32 bits _C_ = 0xE817FB2Du has been obtained from https://arxiv.org/abs/2001.05304[Steele and Vigna (2021)^].
 When using a hash function directly suitable for open addressing, post-mixing can be opted out of via a dedicated <<hash_traits_hash_is_avalanching,`hash_is_avalanching`>> trait.
 `boost::hash` specializations for string types are marked as avalanching.
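
The post-mixing formula itself falls outside the lines shown in this hunk, so the following is only a sketch of a 64-bit stage built from the pieces described above: an extended multiplication by _C_ followed by a combination of the high and low halves. Combining the halves with xor is an assumption, and `u128`, `mulx` and `post_mix` are hypothetical names. The multiplication is written portably from 32-bit halves rather than with a compiler-specific 128-bit type.

```cpp
#include <cassert>
#include <cstdint>

// Result of an extended (64 x 64 -> 128 bit) multiplication.
struct u128 {
    std::uint64_t high;
    std::uint64_t low;
};

// Portable extended multiplication assembled from four 32 x 32 -> 64 products.
inline u128 mulx(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t mask = 0xFFFFFFFFull;
    const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
    const std::uint64_t b_lo = b & mask, b_hi = b >> 32;
    const std::uint64_t ll = a_lo * b_lo;
    const std::uint64_t lh = a_lo * b_hi;
    const std::uint64_t hl = a_hi * b_lo;
    const std::uint64_t hh = a_hi * b_hi;
    // Sum of the three terms that contribute to bits 32..63 (cannot overflow).
    const std::uint64_t mid = (ll >> 32) + (lh & mask) + (hl & mask);
    u128 r;
    r.low = (mid << 32) | (ll & mask);
    r.high = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return r;
}

// C for 64-bit architectures: the integer part of 2^64 divided by the golden ratio.
const std::uint64_t post_mix_constant = 0x9E3779B97F4A7C15ull;

// Assumed combination step: xor of the high and low halves of h mulx C.
inline std::uint64_t post_mix(std::uint64_t h) {
    const u128 a = mulx(h, post_mix_constant);
    return a.high ^ a.low;
}
```

Folding both halves back into one word lets every input bit influence the result, which is the property the text says `boost::hash` lacks for integral types.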
 === Platform interoperability
-The observable behavior of `boost::unordered_flat_set` and `boost::unordered_flat_map` is deterministically
+The observable behavior of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` is deterministically
 identical across different compilers as long as their ``std::size_t``s are the same size and the user-provided
 hash function and equality predicate are also interoperable
 &#8212;this includes elements being ordered in exactly the same way for the same sequence of

@@ -8,3 +8,5 @@ include::unordered_multiset.adoc[]
 include::hash_traits.adoc[]
 include::unordered_flat_map.adoc[]
 include::unordered_flat_set.adoc[]
+include::unordered_node_map.adoc[]
+include::unordered_node_set.adoc[]

File diff suppressed because it is too large.

File diff suppressed because it is too large.