uploaded current status

2022-10-30 19:16:43 +01:00
parent 90f2f0f67d
commit 2068cf8d5b
7 changed files with 196 additions and 43 deletions
--- a/doc/unordered/buckets.adoc
+++ b/doc/unordered/buckets.adoc
@@ -5,7 +5,7 @@
 = The Data Structure

 The containers are made up of a number of 'buckets', each of which can contain
-any number of elements. For example, the following diagram shows an <<unordered_set,unordered_set>> with 7 buckets containing 5 elements, `A`,
+any number of elements. For example, the following diagram shows a <<unordered_set,`boost::unordered_set`>> with 7 buckets containing 5 elements, `A`,
 `B`, `C`, `D` and `E` (this is just for illustration, containers will typically
 have more buckets).

@@ -31,20 +31,34 @@ equality predicates in the next section>>.

 You can see in the diagram that `A` & `D` have been placed in the same bucket.
 When looking for elements in this bucket up to 2 comparisons are made, making
-the search slower. This is known as a collision. To keep things fast we try to
+the search slower. This is known as a *collision*. To keep things fast we try to
 keep collisions to a minimum.

+If instead of `boost::unordered_set` we had used <<unordered_flat_set,`boost::unordered_flat_set`>>, the
+diagram would look as follows:
+
+image::buckets oa.png[]
+
+In open-addressing containers, buckets can hold at most one element; if a collision happens
+(as is the case for `D` in the example), the element uses some other available bucket in
+the vicinity of the original position. Given this simpler scenario, Boost.Unordered
+open-addressing containers offer a very limited API for accessing buckets.
+
 [caption=, title='Table {counter:table-counter}. Methods for Accessing Buckets']
 [cols="1,.^1", frame=all, grid=rows]
 |===
-|Method |Description
+2+^h| *All containers*
+h|*Method* h|*Description*

 |`size_type bucket_count() const` 
 |The number of buckets.

+2+^h| *Closed-addressing containers only* +
+`boost::unordered_[multi]set`, `boost::unordered_[multi]map` 
+h|*Method* h|*Description*
+
 |`size_type max_bucket_count() const` 
 |An upper bound on the number of buckets.
-
 |`size_type bucket_size(size_type n) const` 
 |The number of elements in bucket `n`.

@@ -69,14 +83,14 @@ keep collisions to a minimum.
 == Controlling the number of buckets

 As more elements are added to an unordered associative container, the number
-of elements in the buckets will increase causing performance to degrade.
+of collisions will increase causing performance to degrade.
 To combat this the containers increase the bucket count as elements are inserted.
 You can also tell the container to change the bucket count (if required) by
 calling `rehash`.

 The standard leaves a lot of freedom to the implementer to decide how the
 number of buckets is chosen, but it does make some requirements based on the
-container's 'load factor', the average number of elements per bucket.
+container's 'load factor', the number of elements divided by the number of buckets.
 Containers also have a 'maximum load factor' which they should try to keep the
 load factor below.

@@ -97,7 +111,8 @@ or close to the hint - unless your hint is unreasonably small or large.
 [caption=, title='Table {counter:table-counter}. Methods for Controlling Bucket Size']
 [cols="1,.^1", frame=all, grid=rows]
 |===
-|Method |Description
+2+^h| *All containers*
+h|*Method* h|*Description*

 |`X(size_type n)` 
 |Construct an empty container with at least `n` buckets (`X` is the container type).
@@ -112,22 +127,45 @@ or close to the hint - unless your hint is unreasonably small or large.
 |Returns the current maximum load factor.

 |`float max_load_factor(float z)`
-|Changes the container's maximum load factor, using `z` as a hint.
+|Changes the container's maximum load factor, using `z` as a hint. +
+**Open-addressing containers:** this function does nothing: users are not allowed to change the maximum load factor.

 |`void rehash(size_type n)`
 |Changes the number of buckets so that there are at least `n` buckets, and so that the load factor is less than the maximum load factor.

+2+^h| *Open-addressing containers only* +
+`boost::unordered_flat_set`, `boost::unordered_flat_map` 
+h|*Method* h|*Description*
+
+|`size_type max_load() const`
+|Returns the maximum number of allowed elements in the container before rehash.
+
 |===

+A note on `max_load` for open-addressing containers: the maximum load will naturally decrease when
+new insertions are performed, but _won't_ increase at the same rate when erasing: for instance,
+adding 1,000 elements to a <<unordered_flat_map,`boost::unordered_flat_map`>> and then
+erasing those 1,000 elements will typically reduce the maximum load by around 160 rather
+than restoring it to its original value. This is done internally by Boost.Unordered in order
+to keep its performance stable, and must be taken into account when planning for rehash-free insertions.
+The maximum load will be reset to its theoretical maximum
+(`max_load_factor() * bucket_count()`) right after `rehash`.
+
 == Iterator Invalidation

 It is not specified how member functions other than `rehash` and `reserve` affect
-the bucket count, although `insert` is only allowed to invalidate iterators
-when the insertion causes the load factor to be greater than or equal to the
-maximum load factor. For most implementations this means that `insert` will only
-change the number of buckets when this happens. While iterators can be
-invalidated by calls to `insert`, `rehash` and `reserve`, pointers and references to the
-container's elements are never invalidated.
+the bucket count, although `insert` can only invalidate iterators
+when the insertion causes the container's load to be greater than the maximum allowed.
+For most implementations this means that `insert` will only
+change the number of buckets when this happens. Iterators can be
+invalidated by calls to `insert`, `rehash` and `reserve`.
+
+As for pointers and references,
+they are never invalidated for closed-addressing containers (`boost::unordered_[multi]set`, `boost::unordered_[multi]map`),
+but they are invalidated when rehashing occurs for open-addressing
+`boost::unordered_flat_set` and `boost::unordered_flat_map`: this is because
+these containers store elements directly into their holding buckets, so
+when allocating a new bucket array the elements must be transferred by means of move construction.

 In a similar manner to using `reserve` for ``vector``s, it can be a good idea
 to call `reserve` before inserting a large number of elements. This will get
--- a/doc/unordered/comparison.adoc
+++ b/doc/unordered/comparison.adoc
@@ -25,19 +25,22 @@
 |No equivalent. Since the elements aren't ordered `lower_bound` and `upper_bound` would be meaningless.

 |`equal_range(k)` returns an empty range at the position that `k` would be inserted if `k` isn't present in the container.
-|`equal_range(k)` returns a range at the end of the container if `k` isn't present in the container. It can't return a positioned range as `k` could be inserted into multiple place. To find out the bucket that `k` would be inserted into use `bucket(k)`. But remember that an insert can cause the container to rehash - meaning that the element can be inserted into a different bucket.
+|`equal_range(k)` returns a range at the end of the container if `k` isn't present in the container. It can't return a positioned range as `k` could be inserted into multiple places. +
+**Closed-addressing containers:** To find out the bucket that `k` would be inserted into use `bucket(k)`. But remember that an insert can cause the container to rehash - meaning that the element can be inserted into a different bucket.

 |`iterator`, `const_iterator` are of the bidirectional category.
 |`iterator`, `const_iterator` are of at least the forward category.

 |Iterators, pointers and references to the container's elements are never invalidated.
-|<<buckets_iterator_invalidation,Iterators can be invalidated by calls to insert or rehash>>. Pointers and references to the container's elements are never invalidated.
+|<<buckets_iterator_invalidation,Iterators can be invalidated by calls to insert or rehash>>. +
+**Closed-addressing containers:** Pointers and references to the container's elements are never invalidated. +
+**Open-addressing containers:** Pointers and references to the container's elements are invalidated when rehashing occurs.

 |Iterators iterate through the container in the order defined by the comparison object.
 |Iterators iterate through the container in an arbitrary order, that can change as elements are inserted, although equivalent elements are always adjacent.

 |No equivalent
-|Local iterators can be used to iterate through individual buckets. (The order of local iterators and iterators aren't required to have any correspondence.)
+|**Closed-addressing containers:** Local iterators can be used to iterate through individual buckets. (The order of local iterators and iterators aren't required to have any correspondence.)

 |Can be compared using the `==`, `!=`, `<`, `\<=`, `>`, `>=` operators.
 |Can be compared using the `==` and `!=` operators.
@@ -45,9 +48,6 @@
 |
 |When inserting with a hint, implementations are permitted to ignore the hint.

-|`erase` never throws an exception
-|The containers' hash or predicate function can throw exceptions from `erase`.
-
 |===

 ---
--- a/doc/unordered/compliance.adoc
+++ b/doc/unordered/compliance.adoc
@@ -5,13 +5,15 @@

 :cpp: C++

+== Closed-addressing containers: unordered_[multi]set, unordered_[multi]map
+
 The intent of Boost.Unordered is to implement a close (but imperfect)
 implementation of the {cpp}17 standard, that will work with {cpp}98 upwards.
 The wide compatibility does mean some compromises have to be made.
 With a compiler and library that fully support {cpp}11, the differences should
 be minor.

-== Move emulation
+=== Move emulation

 Support for move semantics is implemented using Boost.Move. If rvalue
 references are available it will use them, but if not it uses a close,
@@ -23,7 +25,7 @@ but imperfect emulation. On such compilers:
 * The containers themselves are not movable.
 * Argument forwarding is not perfect.

-== Use of allocators
+=== Use of allocators

 {cpp}11 introduced a new allocator system. It's backwards compatible due to
 the lax requirements for allocators in the old standard, but might need
@@ -56,7 +58,7 @@ Due to imperfect move emulation, some assignments might check
 `propagate_on_container_copy_assignment` on some compilers and
 `propagate_on_container_move_assignment` on others.

-== Construction/Destruction using allocators
+=== Construction/Destruction using allocators

 The following support is required for full use of {cpp}11 style
 construction/destruction:
@@ -76,7 +78,7 @@ constructing a `std::pair` using `boost::tuple` (see <<compliance_pairs,below>>)
 When support is not available `allocator_traits::construct` and
 `allocator_traits::destroy` are never called.

-== Pointer Traits
+=== Pointer Traits

 `pointer_traits` aren't used. Instead, pointer types are obtained from
 rebound allocators, this can cause problems if the allocator can't be
@@ -84,7 +86,7 @@ used with incomplete types. If `const_pointer` is not defined in the
 allocator, `boost::pointer_to_other<pointer, const value_type>::type`
 is used to obtain a const pointer.

-== Pairs
+=== Pairs

 Since the containers use `std::pair` they're limited to the version
 from the current standard library. But since {cpp}11 ``std::pair``'s
@@ -105,7 +107,7 @@ Older drafts of the standard also supported variadic constructors
 for `std::pair`, where the first argument would be used for the
 first part of the pair, and the remaining for the second part.

-== Miscellaneous
+=== Miscellaneous

 When swapping, `Pred` and `Hash` are not currently swapped by calling
 `swap`, their copy constructors are used. As a consequence when swapping
@@ -114,3 +116,28 @@ an exception may be thrown from their copy constructor.
 Variadic constructor arguments for `emplace` are only used when both
 rvalue references and variadic template parameters are available.
 Otherwise `emplace` can only take up to 10 constructor arguments.
+
+== Open-addressing containers: unordered_flat_set, unordered_flat_map
+
+The C++ standard does not currently provide any open-addressing container
+specification to adhere to, so `boost::unordered_flat_set` and
+`boost::unordered_flat_map` take inspiration from `std::unordered_set` and
+`std::unordered_map`, respectively, and depart from their interface where
+convenient or as dictated by their internal data structure, which is
+radically different from that imposed by the standard (closed addressing, node based).
+
+`unordered_flat_set` and `unordered_flat_map` only work with reasonably
+compliant C++11 (or later) compilers. Language-level features such as move semantics
+and variadic template parameters are then not emulated. 
+`unordered_flat_set` and `unordered_flat_map` are fully https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer[AllocatorAware^].
+
+The main differences with C++ unordered associative containers are:
+
+* `value_type` must be move-constructible.
+* Pointer stability is not kept under rehashing.
+* `begin()` is not constant-time.
+* `erase(iterator)` returns `void` instead of an iterator to the following element.
+* There is no API for bucket handling (except `bucket_count`) or node extraction/insertion.
+* The maximum load factor of the container is managed internally and can't be set by the user. The maximum load,
+exposed through the public function `max_load`, may decrease after erasures instead of returning to its previous value.
+
--- a/doc/unordered/copyright.adoc
+++ b/doc/unordered/copyright.adoc
@@ -11,4 +11,8 @@ Copyright (C) 2005-2008 Daniel James

 Copyright (C) 2022 Christian Mazakas

+Copyright (C) 2022 Joaqu&iacute;n M L&oacute;pez Mu&ntilde;oz
+
+Copyright (C) 2022 Peter Dimov
+
 Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
--- a/doc/unordered/hash_traits.adoc
+++ b/doc/unordered/hash_traits.adoc
@@ -29,14 +29,14 @@ struct hash_is_avalanching;

 A hash function is said to have the _avalanching property_ if small changes in the input translate to
 large changes in the returned hash code &#8212;ideally, flipping one bit in the representation of
-the input value results in each bit of the hash code flipping with probability 50%. This property is
-critical for the proper behavior of open-addressing hash containers.
+the input value results in each bit of the hash code flipping with probability 50%. Approaching
+this property is critical for the proper behavior of open-addressing hash containers.

-`hash_is_avalanching<Hash>` derives from `std::true_type` if `Hash::is_avalanching` is a valid type,
-and derives from `std::false_type` otherwise.
+`hash_is_avalanching<Hash>::value` is `true` if `Hash::is_avalanching` is a valid type,
+and `false` otherwise.
 Users can then declare a hash function `Hash` as avalanching either by embedding an `is_avalanching` typedef
-into the definition of `Hash`, or directly by specializing `hash_is_avalanching<Hash>` to derive from
-`std::true_type`.
+into the definition of `Hash`, or directly by specializing `hash_is_avalanching<Hash>` to a class with
+an embedded compile-time constant `value` set to `true`.

 xref:unordered_flat_set[`boost::unordered_flat_set`] and xref:unordered_flat_map[`boost::unordered_flat_map`]
 use the provided hash function `Hash` as-is if `hash_is_avalanching<Hash>::value` is `true`; otherwise, they
--- a/doc/unordered/intro.adoc
+++ b/doc/unordered/intro.adoc
@@ -18,12 +18,12 @@ or isn't practical. In contrast, a hash table only needs an equality function
 and a hash function for the key.

 With this in mind, unordered associative containers were added to the {cpp}
-standard. This is an implementation of the containers described in {cpp}11,
+standard. Boost.Unordered provides an implementation of the containers described in {cpp}11,
 with some <<compliance,deviations from the standard>> in
 order to work with non-{cpp}11 compilers and libraries.

 `unordered_set` and `unordered_multiset` are defined in the header
-`<boost/unordered_set.hpp>`
+`<boost/unordered/unordered_set.hpp>`
 [source,c++]
 ----  
 namespace boost {
@@ -44,7 +44,7 @@ namespace boost {
 ----

 `unordered_map` and `unordered_multimap` are defined in the header
-`<boost/unordered_map.hpp>`
+`<boost/unordered/unordered_map.hpp>`

 [source,c++]
 ----
@@ -65,10 +65,51 @@ namespace boost {
 }
 ----

-When using Boost.TR1, these classes are included from `<unordered_set>` and
-`<unordered_map>`, with the classes added to the `std::tr1` namespace.
+These containers, and all other implementations of standard unordered associative
+containers, use an approach to their internal data structure design called
+*closed addressing*. Starting in Boost 1.81, Boost.Unordered also provides containers
+`boost::unordered_flat_set` and `boost::unordered_flat_map`, which use a
+different data structure strategy commonly known as *open addressing* and depart in
+a small number of ways from the standard so as to offer much better performance
+in exchange (more than 2 times faster in typical scenarios):

-The containers are used in a similar manner to the normal associative
+
+[source,c++]
+----
+// #include <boost/unordered/unordered_flat_set.hpp>
+//
+// Note: no multiset version
+
+namespace boost {
+    template <
+        class Key,
+        class Hash = boost::hash<Key>,
+        class Pred = std::equal_to<Key>,
+        class Alloc = std::allocator<Key> >
+    class unordered_flat_set;
+}
+----
+
+[source,c++]
+----
+// #include <boost/unordered/unordered_flat_map.hpp>
+//
+// Note: no multimap version
+
+namespace boost {
+    template <
+        class Key, class Mapped,
+        class Hash = boost::hash<Key>,
+        class Pred = std::equal_to<Key>,
+        class Alloc = std::allocator<std::pair<Key const, Mapped> > >
+    class unordered_flat_map;
+}
+----
+
+`boost::unordered_flat_set` and `boost::unordered_flat_map` require a
+reasonably compliant C++11 compiler.
+
+Boost.Unordered containers are used in a similar manner to the normal associative
 containers:

 [source,cpp]
@@ -87,7 +128,7 @@ But since the elements aren't ordered, the output of:

 [source,c++]
 ----
-BOOST_FOREACH(map::value_type i, x) {
+for(const map::value_type& i: x) {
    std::cout<<i.first<<","<<i.second<<"\n";
 }
 ----
--- a/doc/unordered/rationale.adoc
+++ b/doc/unordered/rationale.adoc
@@ -4,15 +4,17 @@

 = Implementation Rationale

-The intent of this library is to implement the unordered
-containers in the standard, so the interface was fixed. But there are
+== boost::unordered_[multi]set and boost::unordered_[multi]map
+
+These containers adhere to the standard requirements for unordered associative
+containers, so the interface was fixed. But there are
 still some implementation decisions to make. The priorities are
 conformance to the standard and portability.

 The http://en.wikipedia.org/wiki/Hash_table[Wikipedia article on hash tables^]
 has a good summary of the implementation issues for hash tables in general.

-== Data Structure
+=== Data Structure

 By specifying an interface for accessing the buckets of the container the
 standard pretty much requires that the hash table uses chained addressing.
@@ -37,7 +39,7 @@ bucket but there are some serious problems with this:

 So chained addressing is used.

-== Number of Buckets
+=== Number of Buckets

 There are two popular methods for choosing the number of buckets in a hash
 table. One is to have a prime number of buckets, another is to use a power
@@ -70,3 +72,44 @@ distribution.
 Since release 1.80.0, prime numbers are chosen for the number of buckets in
 tandem with sophisticated modulo arithmetic. This removes the need for "mixing"
 the result of the user's hash function as was used for release 1.79.0.
+
+== boost::unordered_flat_set and boost::unordered_flat_map
+
+The C++ standard specification of unordered associative containers imposes
+severe limitations on permissible implementations, the most important being
+that closed addressing is implicitly assumed. Slightly relaxing this specification
+opens up the possibility of providing container variations taking full
+advantage of open-addressing techniques.
+
+The design of `boost::unordered_flat_set` and `boost::unordered_flat_map` has been
+guided by Peter Dimov's https://pdimov.github.io/articles/unordered_dev_plan.html[Development Plan for Boost.Unordered^].
+We discuss here the most relevant principles.
+
+=== Hash function
+
+Given its rich functionality and cross-platform interoperability,
+`boost::hash` remains the default hash function of `boost::unordered_flat_set` and `boost::unordered_flat_map`.
+As it happens, `boost::hash` for integral and other basic types does not provide
+the good statistical properties required by open addressing; to cope with this,
+we implement a post-mixing stage:
+
+*  64-bit architectures: we use the `xmx` function defined in
+Jon Maiga's http://jonkagstrom.com/bit-mixer-construction/index.html[The construct of a bit mixer^].
+*  32-bit architectures: the mixer used was selected from a set generated with https://github.com/skeeto/hash-prospector[Hash Function Prospector^]
+as the best overall performer in our internal benchmarks. Score assigned by Hash Prospector is 333.7934929677524.
+
+When using a hash function directly suitable for open addressing, post-mixing can be opted out of via a dedicated <<hash_traits_hash_is_avalanching,`hash_is_avalanching`>> trait.
+`boost::hash` specializations for string types are marked as avalanching.
+
+=== Platform interoperability
+
+The observable behavior of `boost::unordered_flat_set` and `boost::unordered_flat_map` is deterministically
+identical across different compilers as long as their ``std::size_t``s are the same size and the user-provided
+hash function and equality predicate are also interoperable
+&#8212;this includes elements being ordered in exactly the same way for the same sequence of
+operations.
+
+Although the implementation internally uses SIMD technologies, such as https://en.wikipedia.org/wiki/SSE2[SSE2^]
+and https://en.wikipedia.org/wiki/ARM_architecture_family#Advanced_SIMD_(NEON)[Neon^], when available,
+this does not affect interoperability. For instance, the behavior is the same
+for Visual Studio on an Intel CPU with SSE2 in x64 and for GCC on an IBM s390x without any supported SIMD technology.