documented the new mulx-based mixing algorithm

joaquintides
2023-02-08 20:07:23 +01:00
parent a74962bc3c
commit 14d80725eb
2 changed files with 11 additions and 4 deletions

@@ -10,6 +10,8 @@
* Extended heterogeneous lookup to more member functions as specified in
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2363r3.html[P2363].
+* Replaced the previous post-mixing process for open-addressing containers with
+a new algorithm based on extended multiplication by a constant.
== Release 1.81.0 - Major update

@@ -93,10 +93,15 @@ As it happens, `boost::hash` for integral and other basic types does not possess
the statistical properties required by open addressing; to cope with this,
we implement a post-mixing stage:
-* 64-bit architectures: we use the `xmx` function defined in
-Jon Maiga's http://jonkagstrom.com/bit-mixer-construction/index.html[The construct of a bit mixer^].
-* 32-bit architectures: the mixer used was selected from a set generated with https://github.com/skeeto/hash-prospector[Hash Function Prospector^]
-as the best overall performer in our internal benchmarks. Score assigned by Hash Prospector is 333.7934929677524.
+{nbsp}{nbsp}{nbsp}{nbsp} _a_ <- _h_ *mulx* _C_, +
+{nbsp}{nbsp}{nbsp}{nbsp} _h_ <- *high*(_a_) *xor* *low*(_a_),
+where *mulx* is an _extended multiplication_ (128 bits in 64-bit architectures, 64 bits in 32-bit environments),
+and *high* and *low* are the upper and lower halves of an extended word, respectively.
+In 64-bit architectures, _C_ is the integer part of
+(https://en.wikipedia.org/wiki/Golden_ratio[_&phi;_] &minus; 1)&middot;2^64^,
+whereas in 32 bits _C_ = 0xE817FB2Du has been obtained from
+https://arxiv.org/abs/2001.05304[Steele and Vigna (2021)^].
When using a hash function directly suitable for open addressing, post-mixing can be opted out of via a dedicated <<hash_traits_hash_is_avalanching,`hash_is_avalanching`>> trait.
`boost::hash` specializations for string types are marked as avalanching.
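
For readers of this change, the mulx-based post-mixing step and the avalanching opt-out can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's actual code: the names `mulx64`, `mulx32`, `C64`, `C32`, `is_avalanching_sketch`, `identity_hash`, `premixed_hash` and `mixed_hash` are hypothetical, the 64-bit constant is assumed to be the commonly cited golden-ratio value 0x9E3779B97F4A7C15 (the integer part of 2^64/φ), and the opt-out is modeled by a nested `is_avalanching` type, which may differ from how the real `hash_is_avalanching` trait is detected. The sketch assumes C++17 for `std::void_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Assumed 64-bit constant: integer part of 2^64 / phi (golden ratio).
constexpr std::uint64_t C64 = 0x9E3779B97F4A7C15ull;
// 32-bit constant quoted in the documentation text.
constexpr std::uint32_t C32 = 0xE817FB2Du;

// a <- x mulx y (full 128-bit product), result <- high(a) xor low(a).
// Portable version: the product is assembled from four 32x32->64 partial
// products instead of relying on a compiler-specific 128-bit type.
inline std::uint64_t mulx64(std::uint64_t x, std::uint64_t y)
{
  const std::uint64_t mask = 0xFFFFFFFFu;
  std::uint64_t x_lo = x & mask, x_hi = x >> 32;
  std::uint64_t y_lo = y & mask, y_hi = y >> 32;

  std::uint64_t ll = x_lo * y_lo;
  std::uint64_t lh = x_lo * y_hi;
  std::uint64_t hl = x_hi * y_lo;
  std::uint64_t hh = x_hi * y_hi;

  // Sum of everything that lands in bits 32..63, including its carry.
  std::uint64_t mid = (ll >> 32) + (lh & mask) + (hl & mask);

  std::uint64_t low  = (mid << 32) | (ll & mask);
  std::uint64_t high = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
  return high ^ low;
}

// Same construction for 32-bit words: the extended product fits in 64 bits.
inline std::uint32_t mulx32(std::uint32_t x, std::uint32_t y)
{
  std::uint64_t a = static_cast<std::uint64_t>(x) * y;
  return static_cast<std::uint32_t>(a >> 32) ^ static_cast<std::uint32_t>(a);
}

// Hypothetical stand-in for a hash_is_avalanching-style trait: a hash opts
// out of post-mixing by declaring a nested `is_avalanching` type.
template<class Hash, class = void>
struct is_avalanching_sketch : std::false_type {};

template<class Hash>
struct is_avalanching_sketch<Hash, std::void_t<typename Hash::is_avalanching>>
  : std::true_type {};

// Weak hash: needs post-mixing.
struct identity_hash
{
  std::uint64_t operator()(std::uint64_t k) const { return k; }
};

// Hash that already mixes its output and says so.
struct premixed_hash
{
  using is_avalanching = void;
  std::uint64_t operator()(std::uint64_t k) const { return mulx64(k, C64); }
};

// Applies post-mixing only when the hash has not opted out.
template<class Hash>
std::uint64_t mixed_hash(const Hash& h, std::uint64_t key)
{
  std::uint64_t v = h(key);
  return is_avalanching_sketch<Hash>::value ? v : mulx64(v, C64);
}

// Small sanity checks: multiplying 1 by C leaves high(a) = 0, low(a) = C.
inline void mulx_sketch_self_check()
{
  assert(mulx64(1, C64) == C64);
  assert(mulx32(1, C32) == C32);
  assert(mixed_hash(identity_hash{}, 2) == mixed_hash(premixed_hash{}, 2));
}
```

In this sketch both hashes yield the same final value for a given key, because mixing is applied exactly once: by `mixed_hash` for `identity_hash`, and by the hash itself for `premixed_hash`. On compilers that provide a 128-bit integer type or a multiply-high intrinsic, `mulx64` would typically reduce to a single wide multiplication.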