Add the hash documentation.

[SVN r28135]
Daniel James
2005-04-11 22:07:45 +00:00
parent 8b08528611
commit 763e59741a
2 changed files with 778 additions and 0 deletions

doc/Jamfile.v2 Normal file

@@ -0,0 +1,9 @@
# Copyright Daniel James 2005. Use, modification, and distribution are
# subject to the Boost Software License, Version 1.0. (See accompanying
# file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
using quickbook ;
xml hash : hash.qbk ;
boostbook standalone : hash ;

doc/hash.qbk Normal file

@@ -0,0 +1,769 @@
[library Boost.Hash
[authors [James, Daniel]]
[copyright 2005 Daniel James]
[purpose std::tr1 compliant hash function object]
[license
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
<ulink url="http://www.boost.org/LICENSE_1_0.txt">
http://www.boost.org/LICENSE_1_0.txt
</ulink>)
]
]
[/ QuickBook Document version 1.0 ]
[/ Feb 8, 2005 ]
[def __note__ [$images/note.png]]
[def __alert__ [$images/alert.png]]
[def __tip__ [$images/tip.png]]
[section:intro Introduction]
[def __tr1__ [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1745.pdf
C++ Standard Library Technical Report]]
[def __tr1-short__ [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1745.pdf
Technical Report]]
[def __multi-index__ [@../../libs/multi_index/doc/index.html
Boost Multi-Index Containers Library]]
[def __multi-index-short__ [@../../libs/multi_index/doc/index.html
Boost.MultiIndex]]
[def __issues__ [@http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1756.pdf
Library Extension Technical Report Issues List]]
[def __hash-function__ [@http://en.wikipedia.org/wiki/Hash_function hash function]]
[def __hash-table__ [@http://en.wikipedia.org/wiki/Hash_table hash table]]
[classref boost::hash] is an implementation of the __hash-function__ object
specified by the __tr1__. It is intended for use as the default hash function
for unordered associative containers, and the __multi-index__'s hash indexes.
As it is compliant with the __tr1-short__, it will work with:
* integers
* floats
* pointers
* strings
It also implements the extension proposed by Peter Dimov in issue 6.18 of the
__issues__, which adds support for:
* `std::pair`
* the standard containers.
The extension also allows the user to customise the hash function for
user-defined types, so you don't need to supply a custom hash
function object to a container that uses [classref boost::hash] by default.
[endsect]
[section:tutorial Tutorial]
When using __multi-index-short__, you don't need to do anything, as it uses
[classref boost::hash] by default. To find out how to use a user-defined type,
read the [link section.tutorial.custom section on Extending Boost.Hash].
If your standard library supplies its own implementation of the unordered
associative containers and you wish to use
[classref boost::hash] you just supply the extra template parameter:
std::unordered_multiset<int, boost::hash<int> >
set_of_ints;
std::unordered_set<std::pair<int, int>, boost::hash<std::pair<int, int> > >
set_of_pairs;
std::unordered_map<int, std::string, boost::hash<int> > map_int_to_string;
To use [classref boost::hash] directly, create an instance and call it as a function:
#include <boost/functional/hash/hash.hpp>
#include <string>
int main()
{
boost::hash<std::string> string_hash;
std::size_t h = string_hash("Hash me");
}
If you wish to make use of the extensions, you will need to include the
appropriate header (see the
[link hash.reference.specification reference documentation] for the full list).
#include <boost/functional/hash/pair.hpp>
#include <utility>
int main()
{
boost::hash<std::pair<int, int> > pair_hash;
std::size_t h = pair_hash(std::make_pair(1, 2));
}
Alternatively, include `<boost/functional/hash.hpp>` for the full library.
For an example of generic use, here is a function to generate a vector
containing the hashes of the elements of a container:
template <class Container>
std::vector<std::size_t> get_hashes(Container const& x)
{
std::vector<std::size_t> hashes;
std::transform(x.begin(), x.end(), std::back_inserter(hashes),
boost::hash<typename Container::value_type>());
return hashes;
}
[section:custom Extending Boost.Hash for a custom data type]
[classref boost::hash] is implemented by calling the function
[funcref boost::hash_value hash_value].
The call is not namespace-qualified, so that overloads can be found via argument
dependent lookup (ADL). If there is a free function `hash_value` in the same
namespace as a custom type, it will be called.
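The lookup rule can be illustrated without Boost. The following is only a minimal sketch, not the library's implementation: the namespaces `sketch` and `shop`, the type `item` and the function `check_adl` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch // hypothetical stand-in for the library namespace
{
    // Overload for a built-in type, visible where the template is defined.
    inline std::size_t hash_value(int v)
    {
        return static_cast<std::size_t>(v);
    }

    template <class T>
    struct hash
    {
        std::size_t operator()(T const& val) const
        {
            // Unqualified call: overloads declared in the namespace of T
            // are also considered (argument dependent lookup).
            return hash_value(val);
        }
    };
}

namespace shop // hypothetical user namespace
{
    struct item
    {
        int code;
    };

    // Found by ADL because it lives in the same namespace as item.
    inline std::size_t hash_value(item const& i)
    {
        return static_cast<std::size_t>(i.code) * 31u;
    }
}

// Usage: the generic function object picks up shop::hash_value.
inline void check_adl()
{
    shop::item it;
    it.code = 7;
    assert(sketch::hash<shop::item>()(it) == 217u);
}
```

Because the call inside `operator()` carries no namespace qualification, the overload in `shop` is selected even though `sketch::hash` was written without any knowledge of `shop::item`.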
If you have a structure `library::book`, where each `book` is uniquely
identified by its member `id`:
namespace library
{
struct book
{
int id;
std::string author;
std::string title;
// ....
};
bool operator==(book const& a, book const& b)
{
return a.id == b.id;
}
}
Then all you would need to do is write the function `library::hash_value`:
namespace library
{
std::size_t hash_value(book const& b)
{
return boost::hash_value(b.id);
}
}
You can now use [classref boost::hash] with `library::book`:
library::book knife(3458, "Zane Grey", "The Hash Knife Outfit");
library::book dandelion(1354, "Paul J. Shanley",
"Hash & Dandelion Greens");
boost::hash<library::book> book_hasher;
std::size_t knife_hash_value = book_hasher(knife);
boost::unordered_set<library::book> books;
books.insert(knife);
books.insert(library::book(2443, "Lindgren, Torgny", "Hash"));
books.insert(library::book(1953, "Snyder, Bernadette M.",
"Heavenly Hash: A Tasty Mix of a Mother's Meditations"));
assert(books.find(knife) != books.end());
assert(books.find(dandelion) == books.end());
Full code for this example is at
[@../../libs/functional/hash/examples/books.cpp /libs/functional/hash/examples/books.cpp].
[blurb
When writing a hash function, first look at how the equality function works.
Objects that are equal must generate the same hash value.
When objects are not equal they should generate different hash values.
In this example equality is based only on the id. If it were based
on the book's title and author, the hash function would have to take them into account
(how to do this is discussed in the next section).
]
[endsect]
[section:combine Combining hash values]
Say you have a point class, representing a two dimensional location:
class point
{
int x;
int y;
public:
point() : x(0), y(0) {}
point(int x, int y) : x(x), y(y) {}
bool operator==(point const& other) const
{
return x == other.x && y == other.y;
}
};
and you wish to use it as the key for an `unordered_map`. You need to
customise the hash for this structure. To do this we need to combine
the hash values for `x` and `y`. Boost.Hash supplies the function
[funcref boost::hash_combine] for this purpose:
class point
{
...
friend std::size_t hash_value(point const& p)
{
std::size_t seed = 0;
boost::hash_combine(seed, p.x);
boost::hash_combine(seed, p.y);
return seed;
}
...
};
Calls to `hash_combine` incrementally build the hash from the different members
of `point`; it can be called repeatedly for any number of elements. Each call invokes
[funcref boost::hash_value hash_value] on the supplied element and combines
the result with the seed.
Full code for this example is at
[@../../libs/functional/hash/examples/point.cpp /libs/functional/hash/examples/point.cpp].
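To make the mixing step concrete, here is a minimal standalone sketch that applies the formula given in the reference section for `hash_combine`. It is an illustration under stated assumptions rather than the library source: the namespace `sketch`, the plain `point` struct and `check_point_hash` are hypothetical, and integers are assumed to hash to their own value, as the reference table states.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch // hypothetical namespace, not part of Boost.Hash
{
    // Assumption: an int hashes to its own value.
    inline std::size_t hash_value(int v)
    {
        return static_cast<std::size_t>(v);
    }

    // Mix the hash of v into seed, using the formula from the reference.
    template <class T>
    void hash_combine(std::size_t& seed, T const& v)
    {
        seed ^= hash_value(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }

    struct point
    {
        int x;
        int y;
    };

    // Build the hash of a point from the hashes of its members.
    inline std::size_t hash_value(point const& p)
    {
        std::size_t seed = 0;
        hash_combine(seed, p.x);
        hash_combine(seed, p.y);
        return seed;
    }
}

// Usage: equal points hash equally, swapped coordinates do not.
inline void check_point_hash()
{
    sketch::point a = {1, 2};
    sketch::point b = {1, 2};
    sketch::point c = {2, 1};
    assert(sketch::hash_value(a) == sketch::hash_value(b));
    assert(sketch::hash_value(a) != sketch::hash_value(c));
}
```

Each call folds the previous seed back into the new value through the shifts, which is why the result depends on every member that was combined.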
[blurb
'''
When using <functionname>boost::hash_combine</functionname> the order of the
calls matters.
<programlisting>
std::size_t seed = 0;
boost::hash_combine(seed, 1);
boost::hash_combine(seed, 2);
</programlisting>
results in a different seed to:
<programlisting>
std::size_t seed = 0;
boost::hash_combine(seed, 2);
boost::hash_combine(seed, 1);
</programlisting>
If you are calculating a hash value for data where the order of the data
doesn't matter in comparisons (e.g. a set) you will have to ensure that the
data is always supplied in the same order.
'''
]
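One way to supply the data in a fixed order is to sort a copy of it before combining. The sketch below assumes the elements are integers that hash to their own value; `sketch::combine`, `sketch::hash_unordered` and `check_order_independent` are hypothetical helpers, not part of Boost.Hash.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch // hypothetical namespace, not part of Boost.Hash
{
    // Same mixing formula as the one documented for hash_combine.
    inline void combine(std::size_t& seed, std::size_t h)
    {
        seed ^= h + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }

    // Hash a collection of ints so that the order of the elements does
    // not affect the result. The argument is taken by value so that the
    // caller's data is left untouched by the sort.
    inline std::size_t hash_unordered(std::vector<int> values)
    {
        std::sort(values.begin(), values.end());
        std::size_t seed = 0;
        for (std::vector<int>::const_iterator it = values.begin();
             it != values.end(); ++it)
        {
            combine(seed, static_cast<std::size_t>(*it));
        }
        return seed;
    }
}

// Usage: two permutations of the same elements give the same hash.
inline void check_order_independent()
{
    std::vector<int> a;
    a.push_back(1); a.push_back(2); a.push_back(3);
    std::vector<int> b;
    b.push_back(3); b.push_back(1); b.push_back(2);
    assert(sketch::hash_unordered(a) == sketch::hash_unordered(b));
}
```

Sorting costs O(n log n) per hash, so for large collections it may be preferable to keep the data in a container that is already ordered.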
To calculate the hash of an iterator range you can use
[funcref boost::hash_range]:
std::vector<std::string> some_strings;
std::size_t hash = boost::hash_range(some_strings.begin(), some_strings.end());
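The reference section describes `hash_range` as a loop over `hash_combine`. A minimal standalone sketch of that loop follows, restricted to ranges of `int` for brevity; the namespace `sketch` and `check_hash_range` are hypothetical, and the assumption that an `int` hashes to its own value is taken from the reference table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch // hypothetical namespace, not part of Boost.Hash
{
    // Assumption: an int hashes to its own value.
    inline std::size_t hash_value(int v)
    {
        return static_cast<std::size_t>(v);
    }

    template <class T>
    void hash_combine(std::size_t& seed, T const& v)
    {
        seed ^= hash_value(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }

    // Combine the hashes of all elements in [first, last), in order.
    template <class It>
    std::size_t hash_range(It first, It last)
    {
        std::size_t seed = 0;
        for (; first != last; ++first)
        {
            hash_combine(seed, *first);
        }
        return seed;
    }
}

// Usage: the range hash equals combining the elements one by one.
inline void check_hash_range()
{
    std::vector<int> v;
    v.push_back(1); v.push_back(2);
    std::size_t seed = 0;
    sketch::hash_combine(seed, 1);
    sketch::hash_combine(seed, 2);
    assert(sketch::hash_range(v.begin(), v.end()) == seed);
}
```

An empty range leaves the seed at zero, and because the loop visits the elements in sequence the result is sensitive to their order.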
[endsect]
[endsect]
[section:portability Portability]
Boost.Hash is written to be as portable as possible, but unfortunately, several
older compilers don't support argument dependent lookup (ADL) - the mechanism
used for customization. On those compilers custom overloads for hash_value
need to be declared in the boost namespace.
On a strictly standards compliant compiler, an overload defined in the
boost namespace won't be found when [classref boost::hash] is instantiated,
so for these compilers the overload should only be declared in the same
namespace as the class.
Let's say we have a simple custom type:
namespace foo
{
struct custom_type
{
int value;
friend inline std::size_t hash_value(custom_type x)
{
boost::hash<int> hasher;
return hasher(x.value);
}
};
}
On a compliant compiler, when `hash_value` is called for this type,
ADL searches the namespace and class associated with the type and finds `hash_value`,
but on a compiler which doesn't support ADL `hash_value` won't be found.
So on these compilers define a member function:
#ifndef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP
friend inline std::size_t hash_value(custom_type x)
{
boost::hash<int> hasher;
return hasher(x.value);
}
#else
std::size_t hash() const
{
boost::hash<int> hasher;
return hasher(value);
}
#endif
which will be called from the `boost` namespace:
#ifdef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP
namespace boost
{
std::size_t hash_value(foo::custom_type x)
{
return x.hash();
}
}
#endif
Full code for this example is at
[@../../libs/functional/hash/examples/portable.cpp /libs/functional/hash/examples/portable.cpp].
[endsect]
[/ Quickbook insists on putting a paragraph around the escaped code, which
messes up the library section, so I wrap it in another section. This is no good,
but there you go. ]
[section:reference_ Reference]
'''
<library-reference>
<section id="hash.reference.specification">
<para>For the full specification, see section 6.3 of the
<ulink url="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1745.pdf">C++ Standard Library Technical Report</ulink>
and issue 6.18 of the
<ulink url="http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1756.pdf">Library Extension Technical Report Issues List</ulink>.
</para>
</section>
<header name="boost/functional/hash.hpp">
<para>Includes all the following headers.</para>
</header>
<header name="boost/functional/hash/hash.hpp">
<para>
Defines <code><classname>boost::hash</classname></code>,
the implementation for built-in types and
<code>std::string</code>, and the customisation functions.
</para>
<namespace name="boost">
<!--
boost::hash
-->
<struct name="hash">
<template>
<template-type-parameter name="T"/>
</template>
<inherit access="public">
<classname>std::unary_function&lt;T, std::size_t&gt;</classname>
</inherit>
<purpose>An STL compliant hash function object.</purpose>
<method name="operator()" cv="const">
<type>std::size_t</type>
<parameter name="val">
<paramtype>T const&amp;</paramtype>
</parameter>
<returns><para>
<programlisting><functionname>hash_value</functionname>(val)</programlisting>
</para></returns>
<notes><para>
The call to <code><functionname>hash_value</functionname></code>
is always unqualified, so that custom overloads can be
found via argument dependent lookup.
</para></notes>
<throws><para>
Only throws if
<code><functionname>hash_value</functionname>(T)</code> throws.
</para></throws>
</method>
</struct>
<!--
boost::hash_value
-->
<overloaded-function name="hash_value">
<purpose>
Implementation of a hash function for various types.
</purpose>
<description><para>
Should never be called directly by users. Instead, they should use
<classname>boost::hash</classname>, <functionname>boost::hash_range</functionname>
or <functionname>boost::hash_combine</functionname>, which
call hash_value without namespace qualification so that overloads
for custom types are found via ADL.
</para></description>
<notes>
<para>Overloads for other types supplied in other headers.</para>
<para>This is an extension to TR1</para>
</notes>
<signature>
<type>std::size_t</type>
<parameter name="val"><paramtype>int</paramtype></parameter>
</signature>
<signature>
<type>std::size_t</type>
<parameter name="val"><paramtype>unsigned int</paramtype></parameter>
</signature>
<signature>
<type>std::size_t</type>
<parameter name="val"><paramtype>long</paramtype></parameter>
</signature>
<signature>
<type>std::size_t</type>
<parameter name="val"><paramtype>unsigned long</paramtype></parameter>
</signature>
<signature>
<type>std::size_t</type>
<parameter name="val"><paramtype>float</paramtype></parameter>
</signature>
<signature>
<type>std::size_t</type>
<parameter name="val"><paramtype>double</paramtype></parameter>
</signature>
<signature>
<type>std::size_t</type>
<parameter name="val"><paramtype>long double</paramtype></parameter>
</signature>
<signature>
<template><template-type-parameter name="T"/></template>
<type>std::size_t</type>
<parameter name="val"><paramtype>T*</paramtype></parameter>
</signature>
<signature>
<template>
<template-type-parameter name="Ch"/>
<template-type-parameter name="A"/>
</template>
<type>std::size_t</type>
<parameter name="val">
<paramtype>std::basic_string&lt;Ch, std::char_traits&lt;Ch&gt;, A&gt; const&amp;</paramtype>
</parameter>
</signature>
<returns>
<para>
<informaltable>
<tgroup cols="2">
<tbody>
<row>
<entry>int, unsigned int, long, unsigned long</entry>
<entry>val</entry>
</row>
<row>
<entry>float, double, long double, T*</entry>
<entry>
An unspecified value, except that equal arguments shall yield the same
result
</entry>
</row>
<row>
<entry>std::basic_string&lt;Ch, std::char_traits&lt;Ch&gt;, A&gt; const&amp;</entry>
<entry>hash_range(val.begin(), val.end());</entry>
</row>
</tbody>
</tgroup>
</informaltable>
</para>
</returns>
</overloaded-function>
<!--
boost::hash_combine
-->
<function name="hash_combine">
<template>
<template-type-parameter name="T"/>
</template>
<type>void</type>
<parameter name="seed"><paramtype>std::size_t &amp;</paramtype></parameter>
<parameter name="v"><paramtype>T const &amp;</paramtype></parameter>
<purpose>
Called repeatedly to incrementally create a hash value from
several variables.
</purpose>
<effects><programlisting>seed ^= <functionname>hash_value</functionname>(v) + 0x9e3779b9 + (seed &lt;&lt; 6) + (seed &gt;&gt; 2);</programlisting></effects>
<notes>
<para><functionname>hash_value</functionname> is called without
qualification, so that overloads can be found via ADL.</para>
<para>This is an extension to TR1</para>
</notes>
<throws>
Only throws if <functionname>hash_value</functionname>(T) throws.
Strong exception safety, as long as <functionname>hash_value</functionname>(T)
also has strong exception safety.
</throws>
</function>
<!--
boost::hash_range
-->
<overloaded-function name="hash_range">
<signature>
<template>
<template-type-parameter name="It"/>
</template>
<type>std::size_t</type>
<parameter name="first"><paramtype>It</paramtype></parameter>
<parameter name="last"><paramtype>It</paramtype></parameter>
</signature>
<signature>
<template>
<template-type-parameter name="It"/>
</template>
<type>void</type>
<parameter name="seed"><paramtype>std::size_t&amp;</paramtype></parameter>
<parameter name="first"><paramtype>It</paramtype></parameter>
<parameter name="last"><paramtype>It</paramtype></parameter>
</signature>
<purpose>
Calculate the combined hash value of the elements of an iterator
range.
</purpose>
<effects><programlisting>
size_t seed = 0;
for(; first != last; ++first)
{
<functionname>hash_combine</functionname>(seed, *first);
}
return seed;
</programlisting></effects>
<notes>
<para>
<code>hash_range</code> is sensitive to the order of the elements
so it wouldn't be appropriate to use this with an unordered
container.
</para>
<para>This is an extension to TR1</para>
</notes>
<throws><para>
Only throws if <code><functionname>hash_value</functionname>(std::iterator_traits&lt;It&gt;::value_type)</code>
throws. <code>hash_range(std::size_t&amp;, It, It)</code> has basic exception safety as long as
<code><functionname>hash_value</functionname>(std::iterator_traits&lt;It&gt;::value_type)</code>
has basic exception safety.
</para></throws>
</overloaded-function>
</namespace>
</header>
<header name="boost/functional/hash/pair.hpp">
<para>
Hash implementation for <code>std::pair</code>.
</para>
<namespace name="boost">
<overloaded-function name="hash_value">
<signature>
<template>
<template-type-parameter name="A"/>
<template-type-parameter name="B"/>
</template>
<type>std::size_t</type>
<parameter name="val"><paramtype>std::pair&lt;A, B&gt; const &amp;</paramtype></parameter>
</signature>
<effects><programlisting>
size_t seed = 0;
<functionname>hash_combine</functionname>(seed, val.first);
<functionname>hash_combine</functionname>(seed, val.second);
return seed;
</programlisting></effects>
<throws>
Only throws if <code><functionname>hash_value</functionname>(A)</code>
or <code><functionname>hash_value</functionname>(B)</code> throws.
</throws>
<notes><para>This is an extension to TR1</para></notes>
</overloaded-function>
</namespace>
</header>
<header name="boost/functional/hash/vector.hpp">
<para>
Hash implementation for <code>std::vector</code>.
</para>
<namespace name="boost">
<overloaded-function name="hash_value">
<signature>
<template>
<template-type-parameter name="T"/>
<template-type-parameter name="A"/>
</template>
<type>std::size_t</type>
<parameter name="val"><paramtype>std::vector&lt;T, A&gt; const &amp;</paramtype></parameter>
</signature>
<returns>
<code><functionname>hash_range</functionname>(val.begin(), val.end());</code>
</returns>
<throws>
Only throws if <code><functionname>hash_value</functionname>(T)</code> throws.
</throws>
<notes><para>This is an extension to TR1</para></notes>
</overloaded-function>
</namespace>
</header>
<header name="boost/functional/hash/list.hpp">
<para>
Hash implementation for <code>std::list</code>.
</para>
<namespace name="boost">
<overloaded-function name="hash_value">
<signature>
<template>
<template-type-parameter name="T"/>
<template-type-parameter name="A"/>
</template>
<type>std::size_t</type>
<parameter name="val"><paramtype>std::list&lt;T, A&gt; const &amp;</paramtype></parameter>
</signature>
<returns>
<code><functionname>hash_range</functionname>(val.begin(), val.end());</code>
</returns>
<throws>
Only throws if <code><functionname>hash_value</functionname>(T)</code> throws.
</throws>
<notes><para>This is an extension to TR1</para></notes>
</overloaded-function>
</namespace>
</header>
<header name="boost/functional/hash/deque.hpp">
<para>
Hash implementation for <code>std::deque</code>.
</para>
<namespace name="boost">
<overloaded-function name="hash_value">
<signature>
<template>
<template-type-parameter name="T"/>
<template-type-parameter name="A"/>
</template>
<type>std::size_t</type>
<parameter name="val"><paramtype>std::deque&lt;T, A&gt; const &amp;</paramtype></parameter>
</signature>
<returns>
<code><functionname>hash_range</functionname>(val.begin(), val.end());</code>
</returns>
<throws>
Only throws if <code><functionname>hash_value</functionname>(T)</code> throws.
</throws>
<notes><para>This is an extension to TR1</para></notes>
</overloaded-function>
</namespace>
</header>
<header name="boost/functional/hash/set.hpp">
<para>
Hash implementation for <code>std::set</code> and <code>std::multiset</code>.
</para>
<namespace name="boost">
<overloaded-function name="hash_value">
<signature>
<template>
<template-type-parameter name="K"/>
<template-type-parameter name="C"/>
<template-type-parameter name="A"/>
</template>
<type>std::size_t</type>
<parameter name="val"><paramtype>std::set&lt;K, C, A&gt; const &amp;</paramtype></parameter>
</signature>
<signature>
<template>
<template-type-parameter name="K"/>
<template-type-parameter name="C"/>
<template-type-parameter name="A"/>
</template>
<type>std::size_t</type>
<parameter name="val"><paramtype>std::multiset&lt;K, C, A&gt; const &amp;</paramtype></parameter>
</signature>
<returns>
<code><functionname>hash_range</functionname>(val.begin(), val.end());</code>
</returns>
<throws>
Only throws if <code><functionname>hash_value</functionname>(K)</code> throws.
</throws>
<notes><para>This is an extension to TR1</para></notes>
</overloaded-function>
</namespace>
</header>
<header name="boost/functional/hash/map.hpp">
<para>
Hash implementation for <code>std::map</code> and <code>std::multimap</code>.
</para>
<namespace name="boost">
<overloaded-function name="hash_value">
<signature>
<template>
<template-type-parameter name="K"/>
<template-type-parameter name="T"/>
<template-type-parameter name="C"/>
<template-type-parameter name="A"/>
</template>
<type>std::size_t</type>
<parameter name="val"><paramtype>std::map&lt;K, T, C, A&gt; const &amp;</paramtype></parameter>
</signature>
<signature>
<template>
<template-type-parameter name="K"/>
<template-type-parameter name="T"/>
<template-type-parameter name="C"/>
<template-type-parameter name="A"/>
</template>
<type>std::size_t</type>
<parameter name="val"><paramtype>std::multimap&lt;K, T, C, A&gt; const &amp;</paramtype></parameter>
</signature>
<returns>
<code><functionname>hash_range</functionname>(val.begin(), val.end());</code>
</returns>
<throws>
Only throws if
<code><functionname>hash_value</functionname>(std::pair&lt;K const, T&gt;)</code>
throws.
</throws>
<notes><para>This is an extension to TR1</para></notes>
</overloaded-function>
</namespace>
</header>
</library-reference>
'''
[/ Remove this comment and everything goes wrong! ;) ]
[endsect]
[section:acknowledgements Acknowledgements]
This library is based on the design by Peter Dimov. During the initial development
Joaquín M López Muñoz made many useful suggestions and contributed fixes.
The original implementation came from Jeremy B. Maitin-Shepard's hash table
library, although this is a complete rewrite.
[endsect]