mirror of
https://github.com/boostorg/functional.git
synced 2025-07-30 04:27:18 +02:00
Hash: Improve rationale slightly.
[SVN r75542]
@@ -5,10 +5,10 @@

[section:rationale Rationale]

The rationale for the design can be found in the original design
[footnote issue 6.18 of the __issues__ (page 63)], but an issue that
occasionally comes up is the quality of the hash function, so that
demands some more attention.
The rationale can be found in the original design
[footnote issue 6.18 of the __issues__ (page 63)].

[heading:quality Quality of the hash function]

Many hash functions strive to have little correlation between the input
and output values. They attempt to uniformly distribute the output
@@ -16,23 +16,23 @@ values for very similar inputs. This hash function makes no such
attempt. In fact, for integers, the result of the hash function is often
just the input value. So similar but different input values will often
result in similar but different output values.

This means that it is not appropriate as a general hash function. For
example, a hash table may discard bits from the hash function resulting
in likely collisions, or might have poor collision resolution when hash
values are clustered together. In such cases this hash function will
perform poorly.
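
The clustering problem described above can be illustrated with a minimal sketch. The names `identity_hash`, `bucket_of` and `distinct_buckets` are hypothetical and are not part of `boost::hash` or of any standard container; the sketch only assumes a hash that returns its integer input unchanged and a table that keeps the low bits of the hash value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a hash that returns its integer input unchanged.
inline std::size_t identity_hash(std::uint32_t key) { return key; }

// Hypothetical table policy: keep only the low bits of the hash value.
// The bucket count is assumed to be a power of two.
inline std::size_t bucket_of(std::size_t hash, std::size_t bucket_count_pow2) {
    assert(bucket_count_pow2 != 0 &&
           (bucket_count_pow2 & (bucket_count_pow2 - 1)) == 0);
    return hash & (bucket_count_pow2 - 1);
}

// Counts how many distinct buckets are occupied by `count` keys
// spaced `stride` apart, starting at zero.
inline std::size_t distinct_buckets(std::uint32_t stride, std::uint32_t count,
                                    std::size_t bucket_count_pow2) {
    std::set<std::size_t> used;
    for (std::uint32_t i = 0; i < count; ++i)
        used.insert(bucket_of(identity_hash(i * stride), bucket_count_pow2));
    return used.size();
}
```

With 16 buckets, `distinct_buckets(16, 100, 16)` reports a single occupied bucket, because every key is a multiple of 16 and the discarded high bits are the only ones that differ. Consecutive keys, as in `distinct_buckets(1, 16, 16)`, fill all 16 buckets.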

So why not implement a higher quality hash function? Well, the standard
makes no such guarantee, it just requires that the hashes of two
different values are unlikely to collide. Containers or algorithms
But the standard has no such requirement for the hash function,
it just requires that the hashes of two different values are unlikely
to collide. Containers or algorithms
designed to work with the standard hash function will have to be
implemented to work well when the hash function's output is correlated
to its input. Since they are paying that cost a higher quality hash function
would be wasteful.

For other use cases, if you do need a higher quality hash function,
there are several options
then neither the standard hash function nor `boost::hash` is appropriate.
There are several options
available. One is to use a second hash on the output of this hash
function, such as [@http://www.concentric.net/~ttwang/tech/inthash.htm
Thomas Wang's hash function]. This may not work as
@@ -47,4 +47,4 @@ your data - providing that all equal values have an equal
representation, which is not always the case (e.g. for floating point
values).
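
The floating point caveat can be checked with a short sketch. It assumes IEEE 754 doubles, which is typical but not guaranteed by the language, and the helper names `same_bytes` and `check_signed_zero` are hypothetical. Positive and negative zero compare equal, yet their stored bytes differ, so hashing the raw bytes would give equal values different hashes.

```cpp
#include <cassert>
#include <cstring>
#include <limits>

// The sketch assumes IEEE 754 doubles; refuse to compile otherwise.
static_assert(std::numeric_limits<double>::is_iec559,
              "this sketch assumes IEEE 754 doubles");

// Returns true when two doubles have byte-for-byte identical storage.
inline bool same_bytes(double a, double b) {
    return std::memcmp(&a, &b, sizeof(double)) == 0;
}

// Equal values, different representations: +0.0 and -0.0.
inline void check_signed_zero() {
    const double positive_zero = 0.0;
    const double negative_zero = -0.0;
    assert(positive_zero == negative_zero);            // equal as values
    assert(!same_bytes(positive_zero, negative_zero)); // sign bit differs
}
```

A hash over the binary representation would therefore need to normalise such values first, for example by mapping negative zero to positive zero before hashing.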

[endsect]
[endsect]
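
The option of applying a second hash to the output of the hash function, described in the section above, might look like the following sketch. It does not use Thomas Wang's function; the mixer shown is the 32-bit finalizer from MurmurHash3, swapped in for illustration. The names `mix32` and `mixed_hash` are hypothetical, `std::hash` stands in for the underlying hasher to keep the example self-contained, and truncating the inner hash to 32 bits is a simplification.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// MurmurHash3's 32-bit finalizer, used here as the second hash.
// Each step is invertible, so distinct inputs give distinct outputs.
inline std::uint32_t mix32(std::uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

// Hypothetical wrapper that applies the mixer on top of an existing hasher.
// Note: the inner hash is truncated to 32 bits, which discards information
// when std::size_t is wider than that.
template <class Hasher>
struct mixed_hash {
    Hasher inner;

    template <class T>
    std::size_t operator()(const T& value) const {
        return mix32(static_cast<std::uint32_t>(inner(value)));
    }
};
```

A container could then be declared with `mixed_hash<std::hash<int>>` as its hasher. As the text notes, this may not work as well as a hash algorithm tailored to the input.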