Utility - Lexicographic
The class boost::lexicographic provides an easy way
to avoid complex and error-prone if-else cascades when doing lexicographic
comparisons on several different criteria. The class is in the header
boost/utility/lexicographic.hpp and depends on no other headers.
The test code is in
lexicographic_test.cpp.
Often one has to write comparisons which define an ordering between
various kinds of data. When such a comparison looks, in a certain
specified order, at one relation between two data items at a time and
the result is a lexicographic comparison of all these relations, the
programmer often has to write long if-else cascades. These cascades
are often complex and difficult to maintain. The class
boost::lexicographic helps in this scenario. Its constructor
and function call operator each take the two data items to be
compared as arguments and perform the comparison. The order in which
the function call operators are called determines the lexicographic order
of the relations. Once a step has decided the result, the comparisons of
all further steps are no longer needed and are not executed.
The logic of the class assumes an ascending order as implied by the
operator <. If a descending order is needed,
one can simply swap the order of the arguments. Additionally, both the
constructor and the function call operator provide a three-argument
form which takes a functor for comparisons as the third argument.
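The chaining, the argument swap for descending order, and the three-argument form can be sketched as follows. Since the real header may not be at hand, the example uses a minimal stand-in class, sketch::lexicographic, written for illustration only. It assumes that a step does nothing once the result is no longer equivalent, and that the functor behaves like operator < (it returns true when its first argument orders before its second). The names sketch, record, ascending, priority_descending and name_descending are hypothetical.

```cpp
#include <functional>
#include <string>

namespace sketch
{
  // Minimal stand-in for boost::lexicographic (illustration only).
  class lexicographic
  {
  public:
    enum result_type { minus = -1, equivalent = 0, plus = +1 };

    template <typename T1, typename T2>
    lexicographic (T1 const &a, T2 const &b) : m_state (equivalent)
    { (*this) (a, b); }

    template <typename T1, typename T2, typename Cmp>
    lexicographic (T1 const &a, T2 const &b, Cmp cmp) : m_state (equivalent)
    { (*this) (a, b, cmp); }

    // Two-argument step: compares with operator <.
    template <typename T1, typename T2>
    lexicographic &operator () (T1 const &a, T2 const &b)
    {
      if (m_state == equivalent)      // earlier steps were undecided
      {
        if (a < b) m_state = minus;
        else if (b < a) m_state = plus;
      }
      return *this;
    }

    // Three-argument step: compares with the functor (assumed "less"-like).
    template <typename T1, typename T2, typename Cmp>
    lexicographic &operator () (T1 const &a, T2 const &b, Cmp cmp)
    {
      if (m_state == equivalent)
      {
        if (cmp (a, b)) m_state = minus;
        else if (cmp (b, a)) m_state = plus;
      }
      return *this;
    }

    result_type result () const { return m_state; }

  private:
    result_type m_state;
  };
}

struct record { int priority; std::string name; };

// Ascending by priority, then ascending by name.
bool ascending (record const &r1, record const &r2)
{
  return sketch::lexicographic (r1.priority, r2.priority)
                               (r1.name, r2.name).result ()
         == sketch::lexicographic::minus;
}

// Descending by priority (arguments swapped), then ascending by name.
bool priority_descending (record const &r1, record const &r2)
{
  return sketch::lexicographic (r2.priority, r1.priority)
                               (r1.name, r2.name).result ()
         == sketch::lexicographic::minus;
}

// Three-argument form: std::greater turns the step into a descending one.
bool name_descending (record const &r1, record const &r2)
{
  return sketch::lexicographic (r1.name, r2.name,
                                std::greater<std::string> ()).result ()
         == sketch::lexicographic::minus;
}
```

Swapping the arguments affects only the step in which they are swapped, so ascending and descending criteria can be mixed freely within one chain.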
std::lexicographical_compare
The standard C++ algorithm std::lexicographical_compare
does essentially the same thing but in a different situation. It compares
sequences of data items of equal type, whereas boost::lexicographic
compares individual data items of different types, and every comparison
must be specified explicitly by using the function call operator of the class.
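For contrast, a minimal sketch of the standard algorithm applied to two sequences of the same element type. The wrapper name sequence_less is hypothetical and exists only to make the example easy to call.

```cpp
#include <algorithm>
#include <vector>

// Returns true when sequence a orders before sequence b.
// Elements are compared pairwise with operator <; if one sequence is a
// proper prefix of the other, the shorter one orders first.
bool sequence_less (std::vector<int> const &a, std::vector<int> const &b)
{
  return std::lexicographical_compare (a.begin (), a.end (),
                                       b.begin (), b.end ());
}
```

Here the criteria are the elements themselves, taken in sequence order, so there is no place to state a different comparison per position.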
Advantages
boost::lexicographic as a functor.
boost::lexicographic cmp (complex_computation (a), complex_computation (b));
if (cmp.result () == boost::lexicographic::equivalent)
{
  cmp (complex_computation (c), complex_computation (d));
  if (cmp.result () == boost::lexicographic::equivalent)
  {
    cmp (complex_computation (e), complex_computation (f));
  }
}
// do something with cmp
But this construct eats up many of the advantages of using
boost::lexicographic.
The performance overhead of boost::lexicographic, besides
the lack of short-circuiting, is not negligible.
Tests with gcc 3.2.2 showed that the algorithmic overhead
is about 40% compared with the corresponding if-else cascades.
Additionally, gcc failed to inline everything properly, so that the
resulting performance overhead was about a factor of two.
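The lack of short-circuiting can be made visible by counting how often an argument expression is evaluated. The sketch below again uses a reduced, hypothetical stand-in class (sketch::lexicographic, two-argument form only) rather than the real header. The counter calls and the functions expensive, chained_calls and guarded_calls are likewise made up for illustration.

```cpp
namespace sketch
{
  // Reduced stand-in for boost::lexicographic (illustration only).
  class lexicographic
  {
  public:
    enum result_type { minus = -1, equivalent = 0, plus = +1 };

    template <typename T1, typename T2>
    lexicographic (T1 const &a, T2 const &b) : m_state (equivalent)
    { (*this) (a, b); }

    template <typename T1, typename T2>
    lexicographic &operator () (T1 const &a, T2 const &b)
    {
      if (m_state == equivalent)
      {
        if (a < b) m_state = minus;
        else if (b < a) m_state = plus;
      }
      return *this;
    }

    result_type result () const { return m_state; }

  private:
    result_type m_state;
  };
}

int calls = 0;                       // number of expensive () invocations

int expensive (int v) { ++calls; return v; }

// Chained form: all four arguments are evaluated, although the first
// step (1 < 2) already decides the result.
int chained_calls ()
{
  calls = 0;
  sketch::lexicographic::result_type r =
    sketch::lexicographic (expensive (1), expensive (2))
                          (expensive (3), expensive (4)).result ();
  (void) r;
  return calls;
}

// Guarded form: the second pair is evaluated only if still undecided.
int guarded_calls ()
{
  calls = 0;
  sketch::lexicographic cmp (expensive (1), expensive (2));
  if (cmp.result () == sketch::lexicographic::equivalent)
    cmp (expensive (3), expensive (4));
  return calls;
}
```

With these inputs the chained form evaluates expensive () four times and the guarded form twice. The comparison itself is skipped in both cases; only the argument evaluation differs.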
A typical use is a special sorting operator, such as the lexicographic ordering of tuples:
struct position
{
  double x, y, z;
};

bool operator < (position const &p1, position const &p2)
{
  return boost::lexicographic (p1.x, p2.x)
                              (p1.y, p2.y)
                              (p1.z, p2.z);
}
An alternative form of writing this without boost::lexicographic
would be this:
bool operator < (position const &p1, position const &p2)
{
  if (p1.x == p2.x)
    if (p1.y == p2.y)
      return p1.z < p2.z;
    else
      return p1.y < p2.y;
  else
    return p1.x < p2.x;
}
It is also easy to use a different functor, such as the case-insensitive
comparison function object in the next example.
struct person
{
  std::string firstname, lastname;
};

bool operator < (person const &p1, person const &p2)
{
  return boost::lexicographic
    (p1.lastname, p2.lastname, cmp_case_insensitive)
    (p1.firstname, p2.firstname, cmp_case_insensitive);
}
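The example does not show how cmp_case_insensitive is defined. One possible definition is sketched below. It is an assumption, not the definition the example relies on: it takes the comparison to be a "less than" style predicate that returns true when the first string orders before the second, ignoring case.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical definition; the contract (a "less than" predicate) is assumed.
// Compares the two strings character by character after lower-casing.
bool cmp_case_insensitive (std::string const &a, std::string const &b)
{
  return std::lexicographical_compare (
    a.begin (), a.end (), b.begin (), b.end (),
    // unsigned char keeps std::tolower well defined for all char values
    [] (unsigned char c1, unsigned char c2)
    { return std::tolower (c1) < std::tolower (c2); });
}
```

Under this definition two strings that differ only in case order neither before nor after each other, so the comparison moves on to the next criterion.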
namespace boost
{
  class lexicographic
  {
  public:
    enum result_type { minus = -1, equivalent, plus };

    template <typename T1, typename T2>
    lexicographic (T1 const &a, T2 const &b);
    template <typename T1, typename T2, typename Cmp>
    lexicographic (T1 const &a, T2 const &b, Cmp cmp);

    template <typename T1, typename T2>
    lexicographic &operator () (T1 const &a, T2 const &b);
    template <typename T1, typename T2, typename Cmp>
    lexicographic &operator () (T1 const &a, T2 const &b, Cmp cmp);

    result_type result () const;
    operator unspecified_bool_type () const;
  };

  bool operator == (lexicographic l1, lexicographic l2);
  bool operator != (lexicographic l1, lexicographic l2);
}
enum result_type { minus = -1, equivalent = 0, plus = +1 };
Defines the result type of the class. It is kept as internal state and is returned by
result (). Its integer representation follows the sign convention of the value returned by std::strcmp (negative, zero, positive).
minus - the sequence of the first arguments of the constructor and function call operators is lexicographically less than the corresponding sequence of the second arguments.
equivalent - all elements of the sequences of the first and the second arguments are equivalent under the comparisons used.
plus - the sequence of the first arguments of the constructor and function call operators is lexicographically greater than the corresponding sequence of the second arguments.
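The relation to std::strcmp can be illustrated with a small sketch. std::strcmp only guarantees the sign of its return value, so the hypothetical helper from_strcmp below normalizes that sign to the three enumerators. The enum is repeated at namespace scope here only to keep the example self-contained.

```cpp
#include <cstring>

enum result_type { minus = -1, equivalent = 0, plus = +1 };

// Maps the sign of std::strcmp (a, b) to minus, equivalent or plus.
result_type from_strcmp (char const *a, char const *b)
{
  int const r = std::strcmp (a, b);
  if (r < 0) return minus;
  if (r > 0) return plus;
  return equivalent;
}
```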
template <typename T1, typename T2>
lexicographic (T1 const &a, T2 const &b);
Constructs a new object and performs the first comparison step between
a and b. It uses operator < for comparisons.
template <typename T1, typename T2, typename Cmp>
lexicographic (T1 const &a, T2 const &b, Cmp cmp);
Constructs a new object and performs the first comparison step between
a and b. It uses cmp for comparisons.
template <typename T1, typename T2>
lexicographic &operator () (T1 const &a, T2 const &b);
Performs the next comparison step on the object between
a and b. It uses operator < for comparisons.
template <typename T1, typename T2, typename Cmp>
lexicographic &operator () (T1 const &a, T2 const &b, Cmp cmp);
Performs the next comparison step on the object between
a and b. It uses cmp for comparisons.
result_type result () const;
Returns the result of the comparison steps performed so far.
operator unspecified_bool_type () const;
This conversion operator allows objects to be used in boolean contexts, like
if (lexicographic (a, b)) {}. The actual target type is typically a pointer to a member function, avoiding many of the implicit conversion pitfalls.
It evaluates to true if result () == minus, otherwise to false.
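The pointer-to-member-function technique mentioned above is commonly known as the safe-bool idiom. The following is a minimal sketch of it under stated assumptions, not the class's actual implementation: the class lexicographic_state, the member true_tag and the helper is_less are hypothetical, and the state is set directly through the constructor to keep the example short.

```cpp
namespace sketch
{
  class lexicographic_state
  {
    // Private member function whose address serves as the "true" value.
    void true_tag () const {}
    typedef void (lexicographic_state::*unspecified_bool_type) () const;

  public:
    enum result_type { minus = -1, equivalent = 0, plus = +1 };

    explicit lexicographic_state (result_type r) : m_state (r) {}

    // Non-null member pointer when the state is minus, null otherwise.
    operator unspecified_bool_type () const
    {
      return m_state == minus ? &lexicographic_state::true_tag : 0;
    }

  private:
    result_type m_state;
  };
}

// Uses the object in a boolean context.
bool is_less (sketch::lexicographic_state const &s)
{
  if (s)
    return true;
  return false;
}
```

Because a pointer to a member function does not convert to an integer, a statement such as int i = s; fails to compile, which a plain operator bool would have allowed.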
bool operator == (lexicographic l1, lexicographic l2);
Returns
l1.result () == l2.result (). That means it returns true if both objects are in the same state.
bool operator != (lexicographic l1, lexicographic l2);
Returns
l1.result () != l2.result (). That means it returns true if the two objects are in different states.
The author of boost::lexicographic is Jan Langer (jan@langernetz.de).
Ideas and suggestions from Steve Cleary, David Abrahams, Gennaro Proata, Paul Bristow, Daniel Frey, Daryle Walker and Brian McNamara were used.
October 5, 2003
© Copyright Jan Langer 2003
Use, modification, and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file
LICENSE_1_0.txt or copy at
www.boost.org/LICENSE_1_0.txt)