Rename Rationale section to Design and Implementation Notes

2025-09-27 08:50:56 +02:00 · 2022-09-19 12:29:38 +03:00
parent adcf81c732
commit 478730107d
3 changed files with 46 additions and 17 deletions
--- a/doc/hash.adoc
+++ b/doc/hash.adoc
@@ -22,7 +22,7 @@ include::hash/tutorial.adoc[]
 include::hash/user.adoc[]
 include::hash/combine.adoc[]
 include::hash/reference.adoc[]
-include::hash/rationale.adoc[]
+include::hash/notes.adoc[]
 include::hash/links.adoc[]
 include::hash/thanks.adoc[]
 include::hash/changes.adoc[]
--- a/doc/hash/notes.adoc
+++ b/doc/hash/notes.adoc
@@ -0,0 +1,45 @@
+////
+Copyright 2005-2008 Daniel James
+Copyright 2022 Christian Mazakas
+Copyright 2022 Peter Dimov
+Distributed under the Boost Software License, Version 1.0.
+https://www.boost.org/LICENSE_1_0.txt
+////
+
+[#notes]
+= Design and Implementation Notes
+:idprefix: notes_
+
+== Quality of the Hash Function
+
+Many hash functions strive to have little correlation between the input and
+output values. They attempt to uniformly distribute the output values for
+very similar inputs. This hash function makes no such attempt. In fact, for
+integers, the result of the hash function is often just the input value. So
+similar but different input values will often result in similar but different
+output values. This means that it is not appropriate as a general hash
+function. For example, a hash table may discard bits from the hash function
+resulting in likely collisions, or might have poor collision resolution when
+hash values are clustered together. In such cases this hash function will
+perform poorly.
+
+But the standard has no such requirement for the hash function; it just
+requires that the hashes of two different values are unlikely to collide.
+Containers or algorithms designed to work with the standard hash function will
+have to be implemented to work well when the hash function's output is
+correlated to its input. Since they are paying that cost, a higher quality hash
+function would be wasteful.
+
+For other use cases, if you do need a higher quality hash function, then
+neither the standard hash function nor `boost::hash` is appropriate. There are
+several options available. One is to use a second hash on the output of this
+hash function, such as
+http://web.archive.org/web/20121102023700/http://www.concentric.net/~Ttwang/tech/inthash.htm[Thomas Wang's hash function].
+This may not work as well as a hash algorithm tailored for the input.
+
+For strings there are several fast, high quality hash functions available
+(for example http://code.google.com/p/smhasher/[MurmurHash3] and
+http://code.google.com/p/cityhash/[Google's CityHash]), although they tend to
+be more machine specific. These may also be appropriate for hashing a binary
+representation of your data - providing that all equal values have an equal
+representation, which is not always the case (e.g. for floating point values).
--- a/doc/hash/rationale.adoc
+++ b/doc/hash/rationale.adoc
@@ -1,16 +0,0 @@
-[#rationale]
-= Rationale
-
-:idprefix: rationale_
-
-The rationale can be found in the original designfootnote:[issue 6.18 of the http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1837.pdf[Library Extension Technical Report Issues List] (page 63)].
-
-== Quality of the hash function
-
-Many hash functions strive to have little correlation between the input and output values. They attempt to uniformally distribute the output values for very similar inputs. This hash function makes no such attempt. In fact, for integers, the result of the hash function is often just the input value. So similar but different input values will often result in similar but different output values. This means that it is not appropriate as a general hash function. For example, a hash table may discard bits from the hash function resulting in likely collisions, or might have poor collision resolution when hash values are clustered together. In such cases this hash function will preform poorly.
-
-But the standard has no such requirement for the hash function, it just requires that the hashes of two different values are unlikely to collide. Containers or algorithms designed to work with the standard hash function will have to be implemented to work well when the hash function's output is correlated to its input. Since they are paying that cost a higher quality hash function would be wasteful.
-
-For other use cases, if you do need a higher quality hash function, then neither the standard hash function or `boost::hash` are appropriate. There are several options available. One is to use a second hash on the output of this hash function, such as http://web.archive.org/web/20121102023700/http://www.concentric.net/~Ttwang/tech/inthash.htm[Thomas Wang's hash function]. This this may not work as well as a hash algorithm tailored for the input.
-
-For strings there are several fast, high quality hash functions available (for example http://code.google.com/p/smhasher/[MurmurHash3] and http://code.google.com/p/cityhash/[Google's CityHash]), although they tend to be more machine specific. These may also be appropriate for hashing a binary representation of your data - providing that all equal values have an equal representation, which is not always the case (e.g. for floating point values).
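
The notes suggest running a second hash over the output of the hash function when a higher quality result is needed. A minimal sketch of that idea follows. It is an illustration only and is not part of this commit: it uses the 64-bit finalizer from MurmurHash3 as the second hash rather than Thomas Wang's function, it wraps `std::hash` so that the example has no Boost dependency, and the names `mix64` and `mixed_hash` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical helper: the 64-bit finalizer ("fmix64") from MurmurHash3.
// It spreads small differences in the input across all output bits.
inline std::uint64_t mix64(std::uint64_t k) noexcept {
    k ^= k >> 33;
    k *= 0xff51afd7ed558ccdULL;
    k ^= k >> 33;
    k *= 0xc4ceb9fe1a85ec53ULL;
    k ^= k >> 33;
    return k;
}

// Hypothetical wrapper: applies mix64 to the result of an underlying hasher.
// Hash defaults to std::hash<T>; another hasher for T could be substituted.
template <class T, class Hash = std::hash<T>>
struct mixed_hash {
    std::size_t operator()(const T& value) const {
        return static_cast<std::size_t>(mix64(Hash{}(value)));
    }
};
```

Such a wrapper could be passed as the hasher of a container, for example `std::unordered_set<int, mixed_hash<int>>`. This mixer maps 0 to 0, and, as the notes say, a generic second hash may still do worse than an algorithm tailored to the input.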