Merge unordered+hash documentation updates.

[SVN r75015]
2011-10-17 20:23:27 +00:00
parent 15bc3339e2
commit 58e42260d5
4 changed files with 62 additions and 15 deletions
--- a/doc/hash.qbk
+++ b/doc/hash.qbk
@@ -14,11 +14,16 @@
    ]
 ]

+[def __issues__
+    [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1837.pdf
+    Library Extension Technical Report Issues List]]
+
 [include:hash intro.qbk]
 [include:hash tutorial.qbk]
 [include:hash portability.qbk]
 [include:hash disable.qbk]
 [include:hash changes.qbk]
+[include:hash rationale.qbk]
 [xinclude ref.xml]
 [include:hash links.qbk]
 [include:hash thanks.qbk]
--- a/doc/intro.qbk
+++ b/doc/intro.qbk
@@ -18,9 +18,6 @@
 [def __multi-index-short__ [@boost:/libs/multi_index/doc/index.html
    Boost.MultiIndex]]
 [def __bimap__ [@boost:/libs/bimap/index.html Boost.Bimap]]
-[def __issues__
-    [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1837.pdf
-    Library Extension Technical Report Issues List]]
 [def __hash-function__ [@http://en.wikipedia.org/wiki/Hash_function hash function]]
 [def __hash-table__ [@http://en.wikipedia.org/wiki/Hash_table hash table]]

@@ -44,5 +41,12 @@ __issues__ (page 63), this adds support for:
 * the standard containers.
 * extending [classref boost::hash] for custom types.

+[note
+This hash function is designed to be used in containers based on
+the STL and is not suitable as a general purpose hash function.
+For more details see the [link hash.rationale rationale].
+]
+
+
 [endsect]

--- a/doc/portability.qbk
+++ b/doc/portability.qbk
@@ -90,16 +90,4 @@ boost namespace:
 Full code for this example is at
 [@boost:/libs/functional/hash/examples/portable.cpp /libs/functional/hash/examples/portable.cpp].

-[h2 Other Issues]
-
-On Visual C++ versions 6.5 and 7.0, `hash_value` isn't overloaded for built in
-arrays. __boost_hash__, [funcref boost::hash_combine] and [funcref boost::hash_range] all use a workaround to
-support built in arrays so this shouldn't be a problem in most cases.
-
-On Visual C++ versions 6.5 and 7.0, function pointers aren't currently supported.
-
-When using GCC on Solaris, `boost::hash_value(long double)` treats
-`long double`s as `double`s - so the hash function doesn't take into account the
-full range of values.
-
 [endsect]
--- a/doc/rationale.qbk
+++ b/doc/rationale.qbk
@@ -0,0 +1,50 @@
+
+[/ Copyright 2011 Daniel James.
+ / Distributed under the Boost Software License, Version 1.0. (See accompanying
+ / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) ]
+
+[section:rationale Rationale]
+
+The rationale for the design can be found in the original design
+[footnote issue 6.18 of the __issues__ (page 63)], but an issue that
+occasionally comes up is the quality of the hash function, so that
+demands some more attention.
+
+Many hash functions strive to have little correlation between the input
+and output values. They attempt to uniformly distribute the output
+values for very similar inputs. This hash function makes no such
+attempt. In fact, for integers, the result of the hash function is often
+just the input value. So similar but different input values will often
+result in similar but different output values.
+
+This means that it is not appropriate as a general hash function. For
+example, a hash table may discard bits from the hash function resulting
+in likely collisions, or might have poor collision resolution when hash
+values are clustered together. In such cases this hash function will
+perform poorly.
+
+So why not implement a higher quality hash function? Well, the standard
+makes no such guarantee; it just requires that the hashes of two
+different values are unlikely to collide. Containers or algorithms
+designed to work with the standard hash function will have to be
+implemented to work well when the hash function's output is correlated
+to its input. Since they are paying that cost, a higher quality hash function
+would be wasteful.
+
+For other use cases, if you do need a higher quality hash function,
+there are several options
+available. One is to use a second hash on the output of this hash
+function, such as [@http://www.concentric.net/~ttwang/tech/inthash.htm
+Thomas Wang's hash function]. This may not work as
+well as a hash algorithm tailored for the input.
+
+For strings there are several fast, high quality hash functions
+available (for example [@http://code.google.com/p/smhasher/ MurmurHash3]
+and [@http://code.google.com/p/cityhash/ Google's CityHash]),
+although they tend to be more machine specific.
+These may also be appropriate for hashing a binary representation of
+your data - providing that all equal values have an equal
+representation, which is not always the case (e.g. for floating point
+values).
+
+[endsect]
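
The rationale's two main points (a table that discards hash bits collides heavily on clustered values, and a second hash on the output can help) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not part of the commit: `identity_hash` is a hypothetical stand-in for a hash that returns integers unchanged, `bucket_for` models a table that keeps only the low bits, `buckets_used` is a helper invented for this example, and `wang_mix32` is Thomas Wang's 32-bit integer mix as commonly reproduced, which should be checked against the linked page before being relied on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a hash that returns integers unchanged,
// as the rationale describes. This is not Boost's actual implementation.
inline std::uint32_t identity_hash(std::uint32_t key) { return key; }

// Thomas Wang's 32-bit integer mix, as commonly reproduced (assumption:
// verify the constants and shifts against the linked page).
inline std::uint32_t wang_mix32(std::uint32_t key) {
    key = ~key + (key << 15);
    key ^= key >> 12;
    key += key << 2;
    key ^= key >> 4;
    key *= 2057u;
    key ^= key >> 16;
    return key;
}

// Models a table that discards all but the low bits of the hash.
// bucket_count must be a non-zero power of two.
inline std::size_t bucket_for(std::uint32_t hash, std::size_t bucket_count) {
    assert(bucket_count != 0 && (bucket_count & (bucket_count - 1)) == 0);
    return hash & (bucket_count - 1);
}

// Counts how many distinct buckets the keys 0, stride, 2*stride, ...
// (n keys in total) occupy under the given hash.
template <typename Hash>
std::size_t buckets_used(Hash hash, std::uint32_t stride, std::size_t n,
                         std::size_t bucket_count) {
    std::set<std::size_t> used;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint32_t key = static_cast<std::uint32_t>(i) * stride;
        used.insert(bucket_for(hash(key), bucket_count));
    }
    return used.size();
}
```

With 16 buckets and keys that are multiples of 16, `buckets_used(identity_hash, 16u, 64, 16)` reports a single bucket, because the low four bits of every key are zero. Passing `wang_mix32` instead spreads the same keys over more than one bucket. A container designed for the standard hash function would avoid the problem differently, for example by using a prime bucket count, which is the cost the rationale refers to.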