diff --git a/doc/Jamfile b/doc/Jamfile index 727d6c8..36104cb 100644 --- a/doc/Jamfile +++ b/doc/Jamfile @@ -11,12 +11,36 @@ import asciidoctor ; html mp11.html : mp11.adoc ; -install html_ : mp11.html : html ; +html simple_cxx11_metaprogramming.html : + article/simple_cxx11_metaprogramming.adoc ; + +html simple_cxx11_metaprogramming_2.html : + article/simple_cxx11_metaprogramming_2.adoc ; + +install html_ : + mp11.html + simple_cxx11_metaprogramming.html + simple_cxx11_metaprogramming_2.html : + html ; pdf mp11.pdf : mp11.adoc ; -explicit mp11.pdf ; -install pdf_ : mp11.pdf : pdf ; +pdf simple_cxx11_metaprogramming.pdf : + article/simple_cxx11_metaprogramming.adoc ; + +pdf simple_cxx11_metaprogramming_2.pdf : + article/simple_cxx11_metaprogramming_2.adoc ; + +explicit mp11.pdf + simple_cxx11_metaprogramming.pdf + simple_cxx11_metaprogramming_2.pdf ; + +install pdf_ : + mp11.pdf + simple_cxx11_metaprogramming.pdf + simple_cxx11_metaprogramming_2.pdf : + pdf ; + explicit pdf_ ; ############################################################################### diff --git a/doc/article/simple_cxx11_metaprogramming.adoc b/doc/article/simple_cxx11_metaprogramming.adoc new file mode 100644 index 0000000..c87b22e --- /dev/null +++ b/doc/article/simple_cxx11_metaprogramming.adoc @@ -0,0 +1,1215 @@ +//// +Copyright 2015-2017 Peter Dimov + +Distributed under the Boost Software License, Version 1.0. + +See accompanying file LICENSE_1_0.txt or copy at +http://www.boost.org/LICENSE_1_0.txt +//// + +# Simple {cpp}11 metaprogramming +Peter Dimov +2015-05-26 + +[.lead] +__With variadic templates, parameter packs and template aliases__ + +NOTE: I was motivated to write this after I read Eric Niebler's +thought-provoking +http://ericniebler.com/2014/11/13/tiny-metaprogramming-library/[Tiny +Metaprogramming] Library article. Thanks Eric. + +## {cpp}11 changes the playing field + +The wide acceptance of http://www.boost.org/libs/mpl[Boost.MPL] made {cpp} +metaprogramming seem a solved problem. Perhaps MPL wasn't ideal, but it was +good enough to the point that there wasn't really a need to seek or produce +alternatives. + +{cpp}11 changed the playing field. The addition of variadic templates with +their associated parameter packs added a compile-time list of types structure +directly into the language. Whereas before every metaprogramming library +defined its own type list, and MPL defined several, in {cpp}11, type lists are +as easy as +``` +// C++11 +template struct type_list {}; +``` +and there is hardly a reason to use anything else. + +Template aliases are another game changer. Previously, "metafunctions", that +is, templates that took one type and produced another, looked like +``` +// C++03 +template struct add_pointer { typedef T* type; }; +``` +and were used in the following manner: +``` +// C++03 +typedef typename add_pointer::type Xp; +``` +In {cpp}11, metafunctions can be template aliases, instead of class templates: +``` +// C++11 +template using add_pointer = T*; +``` +The above example use then becomes +``` +// C++11 +typedef add_pointer Xp; +``` +or, if you prefer to be seen as {cpp}11-savvy, +``` +// C++11 +using Xp = add_pointer; +``` +This is a considerable improvement in more complex expressions: +``` +// C++03 +typedef + typename add_reference< + typename add_const< + typename add_pointer::type + >::type + >::type Xpcr; +``` +``` +// C++11 +using Xpcr = add_reference>>; +``` +(The example also takes advantage of another {cpp}11 feature - you can now use +`>>` to close templates without it being interpreted as a right shift.) + +In addition, template aliases can be passed to template template parameters: +``` +// C++11 +template class F> struct X +{ +}; + +X; // works! +``` +These language improvements allow for {cpp}11 metaprogramming that is +substantially different than its idomatic {cpp}03 equivalent. Boost.MPL is no +longer good enough, and __something must be done__. But what? + +## Type lists and `mp_rename` + +Let's start with the basics. Our basic data structure will be the type list: +``` +template struct mp_list {}; +``` +Why the `mp_` prefix? mp obviously stands for metaprogramming, but could we not +have used a namespace? + +Indeed we could have. Past experience with Boost.MPL however indicates that +name conflicts between our metaprogramming primitives and standard identifiers +(such as `list`) and keywords (such as `if`, `int` or `true`) will be common +and will be a source of problems. With a prefix, we avoid all that trouble. + +So we have our type list and can put things into it: +``` +using list = mp_list; +``` +but can't do anything else with it yet. We'll need a library of primitives that +operate on ``mp_list``s. But before we get into that, let's consider another +interesting question first. + +Suppose we have our library of primitives that can do things with a `mp_list`, +but some other code hands us a type list that is not an `mp_list`, such as for +example an `std::tuple`, or +``http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4115.html[std::packer]``. + +Suppose we need to modify this external list of types in some manner (change +the types into pointers, perhaps) and give back the transformed result in the +form it was given to us, `std::tuple` in the first +case and `std::packer` in the second. + +To do that, we need to first convert `std::tuple` to +`mp_list`, apply `add_pointer` to each element obtaining +`mp_list`, then convert that back to `std::tuple`. + +These conversion steps are a quite common occurence, and we'll write a +primitive that helps us perform them, called `mp_rename`. We want +``` +mp_rename, mp_list> +``` +to give us +``` +mp_list +``` +and conversely, +``` +mp_rename, std::tuple> +``` +to give us +``` +std::tuple +``` +Here is the implementation of `mp_rename`: +``` +template class B> struct mp_rename_impl; + +template class A, class... T, template class B> + struct mp_rename_impl, B> +{ + using type = B; +}; + +template class B> + using mp_rename = typename mp_rename_impl::type; +``` +(This pattern of a template alias forwarding to a class template doing the +actual work is common; class templates can be specialized, whereas template +aliases cannot.) + +Note that `mp_rename` does not treat any list type as special, not even +`mp_list`; it can rename any variadic class template into any other. You could +use it to rename `std::packer` to `std::tuple` to `std::variant` (once there is +such a thing) and it will happily oblige. + +In fact, it can even rename non-variadic class templates, as in the following +examples: +``` +mp_rename, std::tuple> // -> std::tuple +mp_rename, std::pair> // -> std::pair +mp_rename, std::unique_ptr> // -> std::unique_ptr +``` +There is a limit to the magic; `unique_ptr` can't be renamed to `shared_ptr`: +``` +mp_rename, std::shared_ptr> // error +``` +because `unique_ptr` is actually `unique_ptr>` and `mp_rename` renames it to `shared_ptr>`, which doesn't compile. But it still works in many +more cases than one would naively expect at first. + +With conversions no longer a problem, let's move on to primitives and define a +simple one, `mp_size`, for practice. We want `mp_size>` to +give us the number of elements in the list, that is, the value of the +expression `sizeof$$...$$(T)`. +``` +template struct mp_size_impl; + +template struct mp_size_impl> +{ + using type = std::integral_constant; +}; + +template using mp_size = typename mp_size_impl::type; +``` +This is relatively straightforward, except for the `std::integral_constant`. +What is it and why do we need it? + +`std::integral_constant` is a standard {cpp}11 type that wraps an integral +constant (that is, a compile-time constant integer value) into a type. + +Since metaprogramming operates on type lists, which can only hold types, it's +convenient to represent compile-time constants as types. This allows us to +treat lists of types and lists of values in a uniform manner. It is therefore +idiomatic in metaprogramming to take and return types instead of values, and +this is what we have done. If at some later point we want the actual value, we +can use the expression `mp_size::value` to retrieve it. + +We now have our `mp_size`, but you may have noticed that there's an interesting +difference between `mp_size` and `mp_rename`. Whereas I made a point of +`mp_rename` not treating `mp_list` as a special case, `mp_size` very much does: +``` +template struct mp_size_impl> +``` +Is this really necessary? Can we not use the same technique in the +implementation of `mp_size` as we did in mp_rename? +``` +template struct mp_size_impl; + +template class L, class... T> struct mp_size_impl> +{ + using type = std::integral_constant; +}; + +template using mp_size = typename mp_size_impl::type; +``` +Yes, we very much can, and this improvement allows us to use `mp_size` on any +other type lists, such as `std::tuple`. It turns `mp_size` into a truly generic +primitive. + +This is nice. It is so nice that I'd argue that all our metaprogramming +primitives ought to have this property. If someone hands us a type list in the +form of an `std::tuple`, we should be able to operate on it directly, avoiding +the conversions to and from `mp_list`. + +So do we no longer have any need for `mp_rename`? Not quite. Apart from the +fact that sometimes we really do need to rename type lists, there is another +surprising task for which `mp_rename` is useful. + +To illustrate it, let me introduce the primitive `mp_length`. It's similar to +`mp_size`, but while `mp_size` takes a type list as an argument, `mp_length` +takes a variadic parameter pack and returns its length; or, stated differently, +it returns its number of arguments: +``` +template using mp_length = std::integral_constant; +``` +How would we implement `mp_size` in terms of `mp_length`? One option is to just +substitute the implementation of the latter into the former: +``` +template class L, class... T> struct mp_size_impl> +{ + using type = mp_length; +}; +``` +but there is another way, much less mundane. Think about what `mp_size` does. +It takes the argument +[subs=+quotes] +``` +**mp_list** +``` +and returns +[subs=+quotes] +``` +**mp_length** +``` +Do we already have a primitive that does a similar thing? + +(Not much of a choice, is there?) + +Indeed we have, and it's called `mp_rename`. +``` +template using mp_size = mp_rename; +``` +I don't know about you, but I find this technique fascinating. It exploits the +structural similarity between a list, `L`, and a metafunction "call", +`F`, and the fact that the language sees the things the same way and +allows us to pass the template alias `mp_length` to `mp_rename` as if it were +an ordinary class template such as `mp_list`. + +(Other metaprogramming libraries provide a dedicated `apply` primitive for +this job. `apply` calls the metafunction `F` with the contents of the +list `L`. We'll add an alias `mp_apply` that calls `mp_rename` for +readability.) +``` +template class F, class L> using mp_apply = mp_rename; +``` + +## `mp_transform` + +Let's revisit the example I gave earlier - someone hands us `std::tuple` and we need to compute `std::tuple`. We already have +`add_pointer`: +``` +template using add_pointer = T*; +``` +so we just need to apply it to each element of the input tuple. + +The algorithm that takes a function and a list and applies the function to each +element is called `transform` in Boost.MPL and the STL and `map` in functional +languages. We'll use `transform`, for consistency with the established {cpp} +practice (`map` is a data structure in both the STL and Boost.MPL.) + +We'll call our algorithm `mp_transform`, and `mp_transform` will apply +`F` to each element of `L` and return the result. Usually, the argument order +is reversed and the function comes last. Our reasons to put it at the front +will become evident later. + +There are many ways to implement `mp_transform`; the one we'll pick will make +use of another primitive, `mp_push_front`. `mp_push_front`, as its name +implies, adds `T` as a first element in `L`: +``` +template struct mp_push_front_impl; + +template class L, class... U, class T> + struct mp_push_front_impl, T> +{ + using type = L; +}; + +template + using mp_push_front = typename mp_push_front_impl::type; +``` +There is no reason to constrain `mp_push_front` to a single element though. In +{cpp}11, variadic templates should be our default choice, and the +implementation of `mp_push_front` that can take an arbitrary number of elements +is almost identical: +``` +template struct mp_push_front_impl; + +template class L, class... U, class... T> + struct mp_push_front_impl, T...> +{ + using type = L; +}; + +template + using mp_push_front = typename mp_push_front_impl::type; +``` +On to `mp_transform`: +``` +template class F, class L> struct mp_transform_impl; + +template class F, class L> + using mp_transform = typename mp_transform_impl::type; + +template class F, template class L> + struct mp_transform_impl> +{ + using type = L<>; +}; + +template class F, template class L, class T1, class... T> + struct mp_transform_impl> +{ + using _first = F; + using _rest = mp_transform>; + + using type = mp_push_front<_rest, _first>; +}; +``` +This is a straightforward recursive implementation that should be familiar to +people with functional programming background. + +Can we do better? It turns out that in {cpp}11, we can. +``` +template class F, class L> struct mp_transform_impl; + +template class F, class L> + using mp_transform = typename mp_transform_impl::type; + +template class F, template class L, class... T> + struct mp_transform_impl> +{ + using type = L...>; +}; +``` +Here we take advantage of the fact that pack expansion is built into the +language, so the `F$$...$$` part does all the iteration work for us. + +We can now solve our original challenge: given an `std::tuple` of types, return +an `std::tuple` of pointers to these types: +``` +using input = std::tuple; +using expected = std::tuple; + +using result = mp_transform; + +static_assert( std::is_same::value, "" ); +``` + +## `mp_transform`, part two + +What if we had a pair of tuples as input, and had to produce the corresponding +tuple of pairs? For example, given +``` +using input = std::pair, std::tuple>; +``` +we had to produce +``` +using expected = std::tuple, std::pair, std::pair>; +``` +We need to take the two lists, represented by tuples in the input, and combine +them pairwise by using `std::pair`. If we think of `std::pair` as a function +`F`, this task appears very similar to `mp_transform`, except we need to use a +binary function and two lists. + +Changing our unary transform algorithm into a binary one isn't hard: +``` +template class F, class L1, class L2> + struct mp_transform2_impl; + +template class F, class L1, class L2> + using mp_transform2 = typename mp_transform2_impl::type; + +template class F, + template class L1, class... T1, + template class L2, class... T2> + struct mp_transform2_impl, L2> +{ + static_assert( sizeof...(T1) == sizeof...(T2), + "The arguments of mp_transform2 should be of the same size" ); + + using type = L1...>; +}; +``` +and we can now do +``` +using input = std::pair, std::tuple>; +using expected = std::tuple, std::pair, std::pair>; + +using result = mp_transform2; + +static_assert( std::is_same::value, "" ); +``` +again exploiting the similarity between metafunctions and ordinary class +templates such as `std::pair`, this time in the other direction; we pass +`std::pair` where `mp_transform2` expects a metafunction. + +Do we _have_ to use separate transform algorithms for each arity though? If we +need a transform algorithm that takes a ternary function and three lists, +should we name it `mp_transform3`? No, this is exactly why we put the function +first. We just have to change `mp_transform` to be variadic: +``` +template class F, class... L> struct mp_transform_impl; + +template class F, class... L> + using mp_transform = typename mp_transform_impl::type; +``` +and then add the unary and binary specializations: +``` +template class F, template class L, class... T> + struct mp_transform_impl> +{ + using type = L...>; +}; + +template class F, + template class L1, class... T1, + template class L2, class... T2> + struct mp_transform_impl, L2> +{ + static_assert( sizeof...(T1) == sizeof...(T2), + "The arguments of mp_transform should be of the same size" ); + + using type = L1...>; +}; +``` +We can also add ternary and further specializations. + +Is it possible to implement the truly variadic `mp_transform`, one that works +with an arbitrary number of lists? It is in principle, and I'll show one +possible abridged implementation here for completeness: +``` +template class F, class E, class... L> + struct mp_transform_impl; + +template class F, class... L> + using mp_transform = typename mp_transform_impl, L...>::type; + +template class F, class L1, class... L> + struct mp_transform_impl +{ + using type = mp_clear; +}; + +template class F, class... L> + struct mp_transform_impl +{ + using _first = F< typename mp_front_impl::type... >; + using _rest = mp_transform< F, typename mp_pop_front_impl::type... >; + + using type = mp_push_front<_rest, _first>; +}; +``` +but will omit the primitives that it uses. These are + +* `mp_true` -- an alias for `std::integral_constant`. +* `mp_false` -- an alias for `std::integral_constant`. +* `mp_empty` -- returns `mp_true` if all lists are empty, `mp_false` + otherwise. +* `mp_clear` -- returns an empty list of the same type as `L`. +* `mp_front` -- returns the first element of `L`. +* `mp_pop_front` -- returns `L` without its first element. + +There is one interesting difference between the recursive `mp_transform` +implementation and the language-based one. `mp_transform>` works with the `F$$...$$` implementation and fails +with the recursive one, because `std::pair` is not a real type list and can +only hold exactly two types. + +## The infamous tuple_cat challenge + +Eric Niebler, in his +http://ericniebler.com/2014/11/13/tiny-metaprogramming-library/[Tiny +Metaprogramming Library] article, gives the function +http://en.cppreference.com/w/cpp/utility/tuple/tuple_cat[`std::tuple_cat`] as a +kind of a metaprogramming challenge. `tuple_cat` is a variadic template +function that takes a number of tuples and concatenates them into another +`std::tuple`. This is Eric's solution: +``` +namespace detail +{ + template + Ret tuple_cat_(typelist, typelist, + Tuples tpls) + { + return Ret{std::get( + std::get(tpls))...}; + } +} + +template, + typelist_cat_t...> > > > +Res tuple_cat(Tuples &&... tpls) +{ + static constexpr std::size_t N = sizeof...(Tuples); + // E.g. [0,0,0,2,2,2,3,3] + using inner = + typelist_cat_t< + typelist_transform_t< + typelist...>, + typelist_transform_t< + as_typelist_t >, + meta_quote >, + meta_quote > >; + // E.g. [0,1,2,0,1,2,0,1] + using outer = + typelist_cat_t< + typelist_transform_t< + typelist...>, + meta_compose< + meta_quote, + meta_quote_i, + meta_quote > > >; + return detail::tuple_cat_( + inner{}, + outer{}, + std::forward_as_tuple(std::forward(tpls)...)); +} +``` +All right, challenge accepted. Let's see what we can do. + +As Eric explains, this implementation relies on the clever trick of packing the +input tuples into a tuple, creating two arrays of indices, `inner` and `outer`, +then indexing the outer tuple with the outer indices and the result, which is +one of our input tuples, with the inner indices. + +So, for example, if tuple_cat is invoked as +``` +std::tuple t1; +std::tuple<> t2; +std::tuple t3; +std::tuple t4; + +auto res = tuple_cat(t1, t2, t3, t4); +``` +we'll create the tuple +``` +std::tuple, std::tuple<>, + std::tuple, std::tuple> t{t1, t2, t3, t4}; +``` +and then extract the elements of t via +``` +std::get<0>(std::get<0>(t)), // t1[0] +std::get<1>(std::get<0>(t)), // t1[1] +std::get<2>(std::get<0>(t)), // t1[2] +std::get<0>(std::get<2>(t)), // t3[0] +std::get<1>(std::get<2>(t)), // t3[1] +std::get<2>(std::get<2>(t)), // t3[2] +std::get<0>(std::get<3>(t)), // t4[0] +std::get<1>(std::get<3>(t)), // t4[1] +``` +(`t2` is empty, so we take nothing from it.) + +The first column of integers is the `outer` array, the second one - the `inner` +array, and these are what we need to compute. But first, let's deal with the +return type of `tuple_cat`. + +The return type of `tuple_cat` is just the concatenation of the arguments, +viewed as type lists. The metaprogramming algorithm that concatenates lists is +called +https://ericniebler.github.io/meta/group__transformation.html[`meta::concat`] +in Eric Niebler's https://github.com/ericniebler/meta[Meta] library, but I'll +call it `mp_append`, after its classic Lisp name. + +(Lisp is today's equivalent of Latin. Educated people are supposed to have +studied and forgotten it.) +``` +template struct mp_append_impl; + +template using mp_append = typename mp_append_impl::type; + +template<> struct mp_append_impl<> +{ + using type = mp_list<>; +}; + +template class L, class... T> struct mp_append_impl> +{ + using type = L; +}; + +template class L1, class... T1, + template class L2, class... T2, class... Lr> + struct mp_append_impl, L2, Lr...> +{ + using type = mp_append, Lr...>; +}; +``` +That was fairly easy. There are other ways to implement `mp_append`, but this +one demonstrates how the language does most of the work for us via pack +expansion. This is a common theme in {cpp}11. + +Note how `mp_append` returns the same list type as its first argument. Of +course, in the case in which no arguments are given, there is no first argument +from which to take the type, so I've arbitrarily chosen to return an empty +`mp_list`. + +We're now ready with the declaration of `tuple_cat`: +``` +template::type...>> + R tuple_cat( Tp &&... tp ); +``` +The reason we need `remove_reference` is because of the rvalue reference +parameters, used to implement perfect forwarding. If the argument is an lvalue, +such as for example `t1` above, its corresponding type will be a reference to a +tuple -- `std::tuple&` in ``t1``'s case. Our primitives do +not recognize references to tuples as type lists, so we need to strip them off. + +There are two problems with our return type computation though. One, what if +`tuple_cat` is called without any arguments? We return `mp_list<>` in that +case, but the correct result is `std::tuple<>`. + +Two, what if we call `tuple_cat` with a first argument that is a `std::pair`? +We'll try to append more elements to `std::pair`, and it will fail. + +We can solve both our problems by using an empty tuple as the first argument to +`mp_append`: +``` +template, typename std::remove_reference::type...>> + R tuple_cat( Tp &&... tp ); +``` +With the return type taken care of, let's now move on to computing inner. We +have +``` +[x1, x2, x3], [], [y1, y2, y3], [z1, z2] +``` +as input and we need to output +``` +[0, 0, 0, 2, 2, 2, 3, 3] +``` +which is the concatenation of +``` +[0, 0, 0], [], [2, 2, 2], [3, 3] +``` +Here each tuple is the same size as the input, but is filled with a constant +that represents its index in the argument list. The first tuple is filled with +0, the second with 1, the third with 2, and so on. + +We can achieve this result if we first compute a list of indices, in our case +`[0, 1, 2, 3]`, then use binary `mp_transform` on the two lists +``` +[[x1, x2, x3], [], [y1, y2, y3], [z1, z2]] +[0, 1, 2, 3] +``` +and a function which takes a list and an integer (in the form of an +`std::integral_constant`) and returns a list that is the same size as the +original, but filled with the second argument. + +We'll call this function `mp_fill`, after `std::fill`. + +Functional programmers will immediately realize that `mp_fill` is +`mp_transform` with a function that returns a constant, and here's an +implementation along these lines: +``` +template struct mp_constant +{ + template using apply = V; +}; + +template + using mp_fill = mp_transform::template apply, L>; +``` +Here's an alternate implementation: +``` +template struct mp_fill_impl; + +template class L, class... T, class V> + struct mp_fill_impl, V> +{ + template using _fv = V; + using type = L<_fv...>; +}; + +template using mp_fill = typename mp_fill_impl::type; +``` +These demonstrate different styles and choosing one over the other is largely a +matter of taste here. In the first case, we combine existing primitives; in the +second case, we "inline" `mp_const` and even `mp_transform` in the body of +`mp_fill_impl`. + +Most {cpp}11 programmers will probably find the second implementation easier to +read. + +We can now `mp_fill`, but we still need the `[0, 1, 2, 3]` index sequence. We +could write an algorithm `mp_iota` for that (named after +http://en.cppreference.com/w/cpp/algorithm/iota[`std::iota`]), but it so +happens that {cpp}14 already has a standard way of generating an index +sequence, called +http://en.cppreference.com/w/cpp/utility/integer_sequence[`std::make_index_sequence`]. +Since Eric's original solution makes use of `make_index_sequence`, let's follow +his lead. + +Technically, this takes us outside of {cpp}11, but `make_index_sequence` is not +hard to implement (if efficiency is not a concern): +``` +template struct integer_sequence +{ +}; + +template struct next_integer_sequence; + +template struct next_integer_sequence> +{ + using type = integer_sequence; +}; + +template struct make_int_seq_impl; + +template + using make_integer_sequence = typename make_int_seq_impl::type; + +template struct make_int_seq_impl +{ + using type = typename next_integer_sequence< + typename make_int_seq_impl::type>::type; +}; + +template struct make_int_seq_impl +{ + using type = integer_sequence; +}; + +template + using index_sequence = integer_sequence; + +template + using make_index_sequence = make_integer_sequence; +``` +We can now obtain an `index_sequence<0, 1, 2, 3>`: +``` +template, typename std::remove_reference::type...>> + R tuple_cat( Tp &&... tp ) +{ + std::size_t const N = sizeof...(Tp); + + // inner + + using seq = make_index_sequence; +} +``` +but `make_index_sequence<4>` returns `integer_sequence`, which is not a type list. In order to work with it, we need to convert it +to a type list, so we'll introduce a function `mp_from_sequence` that does +that. +``` +template struct mp_from_sequence_impl; + +template class S, class U, U... J> + struct mp_from_sequence_impl> +{ + using type = mp_list...>; +}; + +template using mp_from_sequence = typename mp_from_sequence_impl::type; +``` +We can now compute the two lists that we wanted to transform with `mp_fill`: +``` +template, typename std::remove_reference::type...>> + R tuple_cat( Tp &&... tp ) +{ + std::size_t const N = sizeof...(Tp); + + // inner + + using list1 = mp_list::type...>; + using list2 = mp_from_sequence>; + + // list1: [[x1, x2, x3], [], [y1, y2, y3], [z1, z2]] + // list2: [0, 1, 2, 3] + + return R{}; +} +``` +and finish the job of computing `inner`: +``` +template, typename std::remove_reference::type...>> + R tuple_cat( Tp &&... tp ) +{ + std::size_t const N = sizeof...(Tp); + + // inner + + using list1 = mp_list::type...>; + using list2 = mp_from_sequence>; + + // list1: [[x1, x2, x3], [], [y1, y2, y3], [z1, z2]] + // list2: [0, 1, 2, 3] + + using list3 = mp_transform; + + // list3: [[0, 0, 0], [], [2, 2, 2], [3, 3]] + + using inner = mp_rename; // or mp_apply + + // inner: [0, 0, 0, 2, 2, 2, 3, 3] + + return R{}; +} +``` +For `outer`, we again have +``` +[x1, x2, x3], [], [y1, y2, y3], [z1, z2] +``` +as input and we need to output +``` +[0, 1, 2, 0, 1, 2, 0, 1] +``` +which is the concatenation of +``` +[0, 1, 2], [], [0, 1, 2], [0, 1] +``` +The difference here is that instead of filling the tuple with a constant value, +we need to fill it with increasing values, starting from 0, that is, with the +result of `make_index_sequence`, where `N` is the number of elements. + +The straightforward way to do that is to just define a metafunction `F` that +does what we want, then use `mp_transform` to apply it to the input: +``` +template using mp_iota = mp_from_sequence>; + +template using F = mp_iota>; + +template, typename std::remove_reference::type...>> + R tuple_cat( Tp &&... tp ) +{ + std::size_t const N = sizeof...(Tp); + + // outer + + using list1 = mp_list::type...>; + using list2 = mp_transform; + + // list2: [[0, 1, 2], [], [0, 1, 2], [0, 1]] + + using outer = mp_rename; + + // outer: [0, 1, 2, 0, 1, 2, 0, 1] + + return R{}; +} +``` +Well that was easy. Surprisingly easy. The one small annoyance is that we can't +define `F` inside `tuple_cat` - templates can't be defined in functions. + +Let's put everything together. +``` +template using mp_iota = mp_from_sequence>; + +template using F = mp_iota>; + +template +R tuple_cat_( mp_list, mp_list, Tp tp ) +{ + return R{ std::get(std::get(tp))... }; +} + +template, typename std::remove_reference::type...>> + R tuple_cat( Tp &&... tp ) +{ + std::size_t const N = sizeof...(Tp); + + // inner + + using list1 = mp_list::type...>; + using list2 = mp_from_sequence>; + + // list1: [[x1, x2, x3], [], [y1, y2, y3], [z1, z2]] + // list2: [0, 1, 2, 3] + + using list3 = mp_transform; + + // list3: [[0, 0, 0], [], [2, 2, 2], [3, 3]] + + using inner = mp_rename; // or mp_apply + + // inner: [0, 0, 0, 2, 2, 2, 3, 3] + + // outer + + using list4 = mp_transform; + + // list4: [[0, 1, 2], [], [0, 1, 2], [0, 1]] + + using outer = mp_rename; + + // outer: [0, 1, 2, 0, 1, 2, 0, 1] + + return tuple_cat_( inner(), outer(), + std::forward_as_tuple( std::forward(tp)... ) ); +} +``` +This almost compiles, except that our `inner` happens to be a `std::tuple`, but +our helper function expects an `mp_list`. (`outer` is already an `mp_list`, by +sheer luck.) We can fix that easily enough. +``` +return tuple_cat_( mp_rename(), outer(), + std::forward_as_tuple( std::forward(tp)... ) ); +``` +Let's define a `print_tuple` function and see if everything checks out. +``` +template struct print_tuple_ +{ + void operator()( std::tuple const & tp ) const + { + using Tp = typename std::tuple_element>::type; + + print_type( " ", ": " ); + + std::cout << std::get( tp ) << ";"; + + print_tuple_< I+1, N, T... >()( tp ); + } +}; + +template struct print_tuple_ +{ + void operator()( std::tuple const & ) const + { + } +}; + +template void print_tuple( std::tuple const & tp ) +{ + std::cout << "{"; + print_tuple_<0, sizeof...(T), T...>()( tp ); + std::cout << " }\n"; +} + +int main() +{ + std::tuple t1{ 1, 2 }; + std::tuple<> t2; + std::tuple t3{ 3, 4, 5 }; + std::pair t4{ "pv", "test" }; + + using expected = std::tuple; + + auto result = ::tuple_cat( t1, t2, t3, t4 ); + + static_assert( std::is_same::value, "" ); + + print_tuple( result ); +} +``` +Output: +``` +{ int: 1; long: 2; float: 3; double: 4; long double: 5; void const*: 0x407086; + char const*: test; } +``` +Seems to work. But there's at least one error left. To see why, replace the +first tuple +``` +std::tuple t1{ 1, 2 }; +``` +with a pair: +``` +std::pair t1{ 1, 2 }; +``` +We now get an error at +``` +using inner = mp_rename; +``` +because the first element of `list3` is an `std::pair`, which `mp_append` tries +and fails to use as its return type. + +There are two ways to fix that. The first one is to apply the same trick we +used for the return type, and insert an empty `mp_list` at the front of +`list3`, which `mp_append` will use as a return type: +``` +using inner = mp_rename>, mp_append>; +``` +The second way is to just convert all inputs to mp_list: +``` +using list1 = mp_list< + mp_rename::type, mp_list>...>; +``` +In both cases, inner will now be an `mp_list`, so we can omit the `mp_rename` +in the call to `tuple_cat_`. + +We're done. The results hopefully speak for themselves. + +## Higher order metaprogramming, or lack thereof + +Perhaps by now you're wondering why this article is called "Simple {cpp}11 +metaprogramming", since what we covered so far wasn't particularly simple. + +The _relative_ simplicity of our approach stems from the fact that we've not +been doing any higher order metaprogramming, that is, we haven't introduced any +primitives that return metafunctions, such as `compose`, `bind`, or a lambda +library. + +I posit that such higher order metaprogramming is, in the majority of cases, +not necessary in {cpp}11. Consider, for example, Eric Niebler's solution given +above: +``` +using outer = + typelist_cat_t< + typelist_transform_t< + typelist...>, + meta_compose< + meta_quote, + meta_quote_i, + meta_quote > > >; +``` +The `meta_compose` expression takes three other ("quoted") metafunctions and +creates a new metafunction that applies them in order. Eric uses this example +as motivation to introduce the concept of a "metafunction class" and then to +supply various primitives that operate on metafunction classes. + +But when we have metafunctions `F`, `G` and `H`, instead of using +`meta_compose`, in {cpp}11 we can just do +``` +template using Fgh = F>>; +``` +and that's it. The language makes defining composite functions easy, and there +is no need for library support. If the functions to be composed are +`as_typelist_t`, `std::make_index_sequence` and `typelist_size_t`, we just +define +``` +template + using F = as_typelist_t::value>>; +``` +Similarly, if we need a metafunction that will return `sizeof(T) < sizeof(U)`, +there is no need to enlist a metaprogramming lambda library as in +``` +lambda<_a, _b, less, sizeof_<_b>>>> +``` +We could just define it inline: +``` +template using sizeof_less = mp_bool<(sizeof(T) < sizeof(U))>; +``` + +## One more thing + +Finally, let me show the implementations of `mp_count` and `mp_count_if`, for +no reason other than I find them interesting. `mp_count` returns the +number of occurences of the type `V` in the list `L`; `mp_count_if` +counts the number of types in `L` for which `P` is `true`. + +As a first step, I'll implement `mp_plus`. `mp_plus` is a variadic (not just +binary) metafunction that returns the sum of its arguments. +``` +template struct mp_plus_impl; + +template using mp_plus = typename mp_plus_impl::type; + +template<> struct mp_plus_impl<> +{ + using type = std::integral_constant; +}; + +template struct mp_plus_impl +{ + static constexpr auto _v = T1::value + mp_plus::value; + + using type = std::integral_constant< + typename std::remove_const::type, _v>; +}; +``` +Now that we have `mp_plus`, `mp_count` is just +``` +template struct mp_count_impl; + +template class L, class... T, class V> + struct mp_count_impl, V> +{ + using type = mp_plus...>; +}; + +template using mp_count = typename mp_count_impl::type; +``` +This is another illustration of the power of parameter pack expansion. It's a +pity that we can't use pack expansion in `mp_plus` as well, to obtain +``` +T1::value + T2::value + T3::value + T4::value + ... +``` +directly. It would have been nice for `T::value + $$...$$` to have been +supported, and it appears that in {cpp}17, it will be. + +`mp_count_if` is similarly straightforward: +``` +template class P> struct mp_count_if_impl; + +template class L, class... T, template class P> + struct mp_count_if_impl, P> +{ + using type = mp_plus...>; +}; + +template class P> + using mp_count_if = typename mp_count_if_impl::type; +``` +at least if we require `P` to return `bool`. If not, we'll have to coerce +`P::value` to 0 or 1, or the count will not be correct. +``` +template using mp_bool = std::integral_constant; + +template class P> struct mp_count_if_impl; + +template class L, class... T, template class P> + struct mp_count_if_impl, P> +{ + using type = mp_plus::value != 0>...>; +}; + +template class P> + using mp_count_if = typename mp_count_if_impl::type; +``` +The last primitive I'll show is `mp_contains`. `mp_contains` returns +whether the list `L` contains the type `V`: +``` +template using mp_contains = mp_bool::value != 0>; +``` +At first sight, this implementation appears horribly naive and inefficient -- +why would we need to count all the occurences just to throw that away if we're +only interested in a boolean result -- but it's actually pretty competitive and +perfectly usable. We just need to add one slight optimization to `mp_plus`, the +engine behind `mp_count` and `mp_contains`: +``` +template + struct mp_plus_impl +{ + static constexpr auto _v = T1::value + T2::value + T3::value + T4::value + + T5::value + T6::value + T7::value + T8::value + T9::value + T10::value + + mp_plus::value; + + using type = std::integral_constant< + typename std::remove_const::type, _v>; +}; +``` +This cuts the number of template instantiations approximately tenfold. + +## Conclusion + +I have outlined an approach to metaprogramming in {cpp}11 that + +* takes advantage of variadic templates, parameter pack expansion, and template + aliases; +* operates on any variadic template `L`, treating it as its + fundamental data structure, without mandating a specific type list + representation; +* uses template aliases as its metafunctions, with the expression `F` + serving as the equivalent of a function call; +* exploits the structural similarity between the data structure `L` + and the metafunction call `F`; +* leverages parameter pack expansion as much as possible, instead of using the + traditional recursive implementations; +* relies on inline definitions of template aliases for function composition, + instead of providing library support for this task. + +## Further reading + +<>, in which I +show algorithms that allow us to treat type lists as sets, maps, and vectors, +and demonstrate various {cpp}11 implementation techniques in the process. diff --git a/doc/article/simple_cxx11_metaprogramming_2.adoc b/doc/article/simple_cxx11_metaprogramming_2.adoc new file mode 100644 index 0000000..c176818 --- /dev/null +++ b/doc/article/simple_cxx11_metaprogramming_2.adoc @@ -0,0 +1,981 @@ +//// +Copyright 2015-2017 Peter Dimov + +Distributed under the Boost Software License, Version 1.0. + +See accompanying file LICENSE_1_0.txt or copy at +http://www.boost.org/LICENSE_1_0.txt +//// + +# Simple {cpp}11 metaprogramming, part 2 +Peter Dimov +2015-06-20 + +[.lead] +__Efficient algorithms for membership testing, random access, and retrieval by +key__ + +NOTE: Being late to the metaprogramming party, I make no claim of having +invented the techniques in this article. A quick look at the implementations +of, for example, Louis Dionne's https://github.com/ldionne/mpl11[mpl11] and +Eric Niebler's https://github.com/ericniebler/meta[meta], shows that most of +these tricks are already known. Dave Abrahams +https://github.com/dabrahams/mpl11[has experimented] along these lines in 2012. +The original inventor of the multiple inheritance trick and the `void*` +arguments trick is probably Richard Smith, who has posted +https://llvm.org/bugs/attachment.cgi?id=8825[two] +https://llvm.org/bugs/attachment.cgi?id=8838[examples] in response to +https://llvm.org/bugs/show_bug.cgi?id=13263[a Clang bug report]. + +## Vectors, sets, and maps + +<>, I outlined a style of +metaprogramming that operated on type lists -- variadic class templates: +``` +template struct mp_list {}; +``` +Classic Lisp uses lists as its only data structure, but operating on a list is +usually linear in the number of its elements. + +In addition to `list`, the STL has `vector`, `set`, and `map`. `vector` +supports random access by index; `set` has efficient test for membership; `map` +associates keys with values and has efficient lookup based on key. + +Instead of introducing separate data structure such as `mp_vector`, `mp_set`, +`mp_map`, we'll keep our data in a list form, and attempt to provide efficient +algorithms for random access, membership testing, and lookup. + +## mp_contains + +Let's starts with sets. A set is just a list with unique elements. To obtain a +set from an arbitrary list, we'll need an algorithm that removes the +duplicates. Let's call it `mp_unique`: +[subs=+quotes] +``` +// mp_if + +template struct mp_if_c_impl; + +template struct mp_if_c_impl +{ + using type = T; +}; + +template struct mp_if_c_impl +{ + using type = E; +}; + +template + using mp_if_c = typename mp_if_c_impl::type; + +template + using mp_if = typename mp_if_c_impl::type; + +// mp_unique + +template struct mp_unique_impl; + +template using mp_unique = typename mp_unique_impl::type; + +template class L> struct mp_unique_impl> +{ + using type = L<>; +}; + +template class L, class T1, class... T> + struct mp_unique_impl> +{ + using _rest = mp_unique>; + using type = mp_if<**mp_contains**<_rest, T1>, _rest, mp_push_front<_rest, T1>>; +}; +``` +For membership testing, we've introduced an algorithm `mp_contains` that +returns `true` when `L` contains `V`. The straightforward recursive +implementation of `mp_contains` is: +``` +template struct mp_contains_impl; + +template using mp_contains = typename mp_contains_impl::type; + +template class L, class V> struct mp_contains_impl, V> +{ + using type = std::false_type; +}; + +template class L, class... T, class V> + struct mp_contains_impl, V> +{ + using type = std::true_type; +}; + +template class L, class T1, class... T, class V> + struct mp_contains_impl, V>: mp_contains_impl, V> +{ +}; +``` +Note that `mp_unique` makes `N` calls to `mp_contains`, where `N` is the +length of the list `L`. This means that `mp_contains` needs to be as fast as +possible, which the above implementation, well, isn't. + +Here are the compile times in seconds for invoking `mp_unique` on a list with +`N` (distinct) elements: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, recursive |2.1 |DNF |||||| + +|clang$$++$$ 3.5.1, recursive |0.9 |4.5 |13.2 |30.2 |DNF ||| + +|g$$++$$ 4.9.2, recursive |0.7 |3.6 |10.4 |23.2 |DNF ||| +|=== +(Tests done under Windows/Cygwin. All compilers are 32 bit. No optimizations. +DNF stands for "did not finish", which usually means that the compiler ran out +of heap space or crashed.) + +We clearly need a better alternative. + +I ended the previous article with an implementation of `mp_contains` that +relied on `mp_count`, which in turn relied on `mp_plus`. Let's see how it +fares: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, mp_count/mp_plus |1.1 |9.8 |50.5 |DNF |||| + +|clang$$++$$ 3.5.1, mp_count/mp_plus |0.5 |1.4 |3.1 |6.1 |DNF ||| + +|g$$++$$ 4.9.2, mp_count/mp_plus |0.5 |1.3 |2.9 |5.8 |9.7 |15.6 |22.4 |32.3 +|=== +Not _that_ bad, at least if your compiler happens to be `g$$++$$`. Still, there +ought to be room for improvement here. + +To do better, we have to somehow leverage the language features, such as pack +expansion, to do more of the work for us. For inspiration, let's turn to +section 14.5.3 paragraph 4 of the {cpp}11 standard, which explains that pack +expansions can occur in the following contexts: + +* **In a function parameter pack (8.3.5); the pattern is the + __parameter-declaration__ without the ellipsis.** +* In a template parameter pack that is a pack expansion (14.1): +* **In an __initializer-list__ (8.5); the pattern is an + __initializer-clause__.** +* **In a __base-specifier-list__ (Clause 10); the pattern is a + __base-specifier__.** +* In a __mem-initializer-list__ (12.6.2); the pattern is a + __mem-initializer__. +* In a __template-argument-list__ (14.3); the pattern is a + __template-argument__. +* In a __dynamic-exception-specification__ (15.4); the pattern is a + __type-id__. +* In an __attribute-list__ (7.6.1); the pattern is an __attribute__. +* In an __alignment-specifier__ (7.6.2); the pattern is the + __alignment-specifier__ without the ellipsis. +* In a __capture-list__ (5.1.2); the pattern is a __capture__. +* In a `sizeof$$...$$` expression (5.3.3); the pattern is an __identifier__. + +The **emphasis** is mine and indicates possible leads. + +Our first option is to expand the parameter pack into arguments for a function +call. Since we're interested in operations that occur at compile time, calling +a function may not appear useful; but {cpp}11 functions can be `constexpr`, and +`constexpr` function "calls" do occur at compile time. + +Recall our `mp_count`: +``` +template struct mp_count_impl; + +template class L, class... T, class V> + struct mp_count_impl, V> +{ + using type = mp_plus...>; +}; + +template using mp_count = typename mp_count_impl::type; +``` +Instead of using the template alias `mp_plus` to sum the `is_same` expressions, +we can use a `constexpr` function: +``` +constexpr std::size_t cx_plus() +{ + return 0; +} + +template constexpr std::size_t cx_plus(T1 t1, T... t) +{ + return t1 + cx_plus(t...); +} + +// mp_size_t + +template using mp_size_t = std::integral_constant; + +// mp_count + +template struct mp_count_impl; + +template class L, class... T, class V> + struct mp_count_impl, V> +{ + using type = mp_size_t::value...)>; +}; + +template using mp_count = typename mp_count_impl::type; +``` +with the following results: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|clang$$++$$ 3.5.1, mp_count/cx_plus |0.4 |1.1 |2.5 |5.0 |DNF ||| + +|g$$++$$ 4.9.2, mp_count/cx_plus |0.4 |0.9 |1.7 |2.9 |4.7 |6.7 |9.2 |11.8 +|=== +We've improved the times, but lost VC$$++$$ 2013 due to its not implementing +`constexpr`. + +Let's try pack expansion into an __initializer-list__. Instead of passing the +`is_same` expressions to a function, we can build a constant array out of them, +then sum the array with a `constexpr` function: +``` +constexpr std::size_t cx_plus2(bool const * first, bool const * last) +{ + return first == last? 0: *first + cx_plus2(first + 1, last); +} + +// mp_count + +template struct mp_count_impl; + +template class L, class... T, class V> + struct mp_count_impl, V> +{ + static constexpr bool _v[] = { std::is_same::value... }; + using type = mp_size_t; +}; + +template using mp_count = typename mp_count_impl::type; +``` +This is a neat trick, but is it fast? +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|clang$$++$$ 3.5.1, mp_count/cx_plus2 |0.4 |0.9 |1.8 |DNF |||| + +|g$$++$$ 4.9.2, mp_count/cx_plus2 |0.4 |0.9 |1.9 |3.4 |5.4 |7.8 |11.0 |14.7 +|=== +That's a bit disappointing. Let's see what can we do with expanding a parameter +pack into a base-specifier-list. We would be able to define a class that +derives from every element of the pack: +``` +struct U: T... {}; +``` +We can then use `std::is_base_of` to test whether a type `V` is a base of +`U`, that is, whether it's one of the elements of the parameter pack. Which is +exactly what we need. + +Arbitrary types such as `void`, `int`, or `void(int)` can't be used as base +classes, but we'll wrap the types in an empty class template, which we'll call +`mp_identity`. +``` +template struct mp_identity +{ + using type = T; +}; + +template struct mp_contains_impl; + +template using mp_contains = typename mp_contains_impl::type; + +template class L, class... T, class V> + struct mp_contains_impl, V> +{ + struct U: mp_identity... {}; + using type = std::is_base_of, U>; +}; +``` +Performance? +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, is_base_of |0.3 |0.6 |1.3 |2.5 |DNF ||| + +|clang$$++$$ 3.5.1, is_base_of |0.3 |0.4 |0.6 |0.8 |DNF ||| + +|g$$++$$ 4.9.2, is_base_of |0.3 |0.4 |0.6 |0.9 |1.3 |1.7 |2.3 |3.0 +|=== +This implementation is a clear winner. + +In fairness, we ought to note that the first four implementations of +`mp_contains` do not rely on the list elements being unique. This makes +`mp_contains` an algorithm that supports arbitrary lists, not just sets. + +The `is_base_of` implementation, however, does not support lists that contain +duplicates, because it's not possible to inherit directly from the same type +twice. So it does not implement the general `mp_contains`, but something that +should probably be named `mp_set_contains`. + +We can avoid the "no duplicates" requirement by modifying the implementation to +inherit from `mp_identity` indirectly, via an intermediate base class: +[subs=+macros] +``` +// indirect_inherit + +template struct inherit_second: T {}; + +template struct indirect_inherit_impl; + +template class L, class... T, std::size_t... J> + struct indirect_inherit_impl, http://en.cppreference.com/w/cpp/utility/integer_sequence[integer_sequence]>: + inherit_second>... {}; + +template using indirect_inherit = + indirect_inherit_impl::value>>; + +// mp_contains + +template struct mp_contains_impl +{ + using U = indirect_inherit; + using type = std::is_base_of, U>; +}; + +template using mp_contains = typename mp_contains_impl::type; +``` +This, however, pretty much nullifies the spectacular performance gains we've +observed with the original `is_base_of`-based implementation: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, recursive |2.1 |DNF |||||| + +|VC$$++$$ 2013, mp_count/mp_plus |1.1 |9.8 |50.5 |DNF |||| + +|VC$$++$$ 2013, is_base_of |0.3 |0.6 |1.3 |2.5 |DNF ||| + +|VC$$++$$ 2013, is_base_of/indirect |1.0 |9.3 |49.5 |153.8 |DNF ||| +|=== +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|clang$$++$$ 3.5.1, recursive |0.9 |4.5 |13.2 |30.2 |DNF ||| + +|clang$$++$$ 3.5.1, mp_count/mp_plus |0.5 |1.4 |3.1 |6.1 |DNF ||| + +|clang$$++$$ 3.5.1, mp_count/cx_plus |0.4 |1.1 |2.5 |5.0 |DNF ||| + +|clang$$++$$ 3.5.1, mp_count/cx_plus2 |0.4 |0.9 |1.8 |DNF |||| + +|clang$$++$$ 3.5.1, is_base_of |0.3 |0.4 |0.6 |0.8 |DNF ||| + +|clang$$++$$ 3.5.1, is_base_of/indirect |0.4 |0.9 |1.6 |2.5 |DNF ||| +|=== +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|g$$++$$ 4.9.2, recursive |0.7 |3.6 |10.4 |23.2 |DNF ||| + +|g$$++$$ 4.9.2, mp_count/mp_plus |0.5 |1.3 |2.9 |5.8 |9.7 |15.6 |22.4 |32.3 + +|g$$++$$ 4.9.2, mp_count/cx_plus |0.4 |0.9 |1.7 |2.9 |4.7 |6.7 |9.2 |11.8 + +|g$$++$$ 4.9.2, mp_count/cx_plus2 |0.4 |0.9 |1.9 |3.4 |5.4 |7.8 |11.0 |14.7 + +|g$$++$$ 4.9.2, is_base_of |0.3 |0.4 |0.6 |0.9 |1.3 |1.7 |2.3 |3.0 + +|g$$++$$ 4.9.2, is_base_of/indirect |0.5 |1.1 |2.3 |4.0 |6.6 |9.8 |13.6 |18.2 +|=== + +## mp_map_find + +A map, in the STL sense, is a data structure that associates keys with values +and can efficiently retrieve, given a key, its associated value. For our +purposes, a map will be any list of lists for which the inner lists have at +least one element, the key; the rest of the elements we'll consider to be the +associated value. For example, the list +``` +[[A, B], [C, D, E], [F], [G, H]] +``` +is a map with keys `A`, `C`, `F`, and `G`, with associated values `[B]`, +`[D, E]`, `[]`, and `[H]`, respectively. We'll require unique keys, for reasons +that'll become evident later. + +I'll show two other examples of maps, this time using real {cpp} code: +``` +using Map = mp_list, mp_list, mp_list>; +``` +``` +using Map2 = std::tuple, std::pair>; +``` +The Lisp name of the algorithm that performs retrieval based on key is `ASSOC`, +but I'll call it `mp_map_find`. `mp_map_find` returns the element of `M` +whose first element is `K`. For example, `mp_map_find` would return +`std::pair`. If there's no such key, it returns `void`. + +There's almost no need to implement and benchmark the recursive version of +`mp_map_find` -- we can be pretty sure it will perform horribly. Still, +``` +template struct mp_map_find_impl; + +template using mp_map_find = typename mp_map_find_impl::type; + +template class M, class K> struct mp_map_find_impl, K> +{ + using type = void; +}; + +template class M, class T1, class... T, class K> + struct mp_map_find_impl, K> +{ + using type = mp_if, K>, T1, mp_map_find, K>>; +}; +``` +The compile time, in seconds, for `N` lookups into a map of size `N`, is as +follows: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, recursive |38.2 |DNF |||||| + +|clang$$++$$ 3.5.1, recursive |2.5 |13.7 |DNF ||||| + +|g$$++$$ 4.9.2, recursive |1.9 |10.2 |28.8 |DNF |||| +|=== +I told you there was no point. + +But, I hear some of you say, you're evaluating the else branch even if the +condition is true, and that's horribly inefficient! + +Well, this would only improve the performance by a factor of approximately two +on average, and only if the element is present, but fine, let's try it. The +element happens to be present in the benchmark, so let's see. +``` +// mp_eval_if + +template class E, class... A> + struct mp_eval_if_c_impl; + +template class E, class... A> + struct mp_eval_if_c_impl +{ + using type = T; +}; + +template class E, class... A> + struct mp_eval_if_c_impl +{ + using type = E; +}; + +template class E, class... A> + using mp_eval_if_c = typename mp_eval_if_c_impl::type; + +template class E, class... A> + using mp_eval_if = typename mp_eval_if_c_impl::type; + +// mp_map_find + +template struct mp_map_find_impl; + +template using mp_map_find = typename mp_map_find_impl::type; + +template class M, class K> struct mp_map_find_impl, K> +{ + using type = void; +}; + +template class M, class T1, class... T, class K> + struct mp_map_find_impl, K> +{ + using type = mp_eval_if, K>, T1, mp_map_find, M, K>; +}; +``` +There you go: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, recursive |15.6 |DNF |||||| + +|clang$$++$$ 3.5.1, recursive |1.8 |9.5 |DNF ||||| + +|g$$++$$ 4.9.2, recursive |1.4 |7.0 |19.7 |DNF |||| +|=== +I told you there was no point. + +Point or no, to establish that the recursive implementation is inefficient is +not the same as to come up with an efficient one. There are two things that +make the `mp_contains` techniques inapplicable to our present case: first, +`mp_contains` only had to return true or false, whereas `mp_map_find` returns a +type, and second, in `mp_contains` we knew the exact type of the element for +which we were looking, whereas here, we only know its `mp_front`. + +Fortunately, there does exist a language feature that can solve both: {cpp} can +deduce the template parameters of base classes when passed a derived class. In +this example, +``` +struct K1 {}; +struct V1 {}; + +struct X: std::pair {}; + +template void f(std::pair const & p); + +int main() +{ + f(X()); +} +``` +the call to `f(X())` deduces `A` as `K1` and `B` as `V1`. If we have more than +one `std::pair` base class, we can fix `A` to be `K1`: +``` +struct K1 {}; +struct V1 {}; + +struct K2 {}; +struct V2 {}; + +struct X: std::pair, std::pair {}; + +template void f(std::pair const & p); + +int main() +{ + f(X()); +} +``` +and `B` will be deduced as `V1`. + +We can retrieve the results of the deduction by returning the type we want: +``` +template std::pair f(std::pair const & p); +``` +and then using `decltype(f(X()))` to obtain this return type. + +What if `X` doesn't have a base of type `std::pair`? The deduction will +fail and we'll get an error that `f(X())` cannot be called. To avoid it, we can +add an overload of `f` that takes anything and returns `void`. But in this +case, what will happen if `X` has two bases of the form that match the first +`f` overload, such as for example `std::pair` and `std::pair`? + +The deduction will fail, the second overload will again be chosen and we'll get +`void`. This is why we require maps to have unique keys. + +Here's an implementation of `mp_map_find` based on this technique: +``` +template struct mp_map_find_impl; + +template + using mp_map_find = typename mp_map_find_impl::type; + +template class M, class... T, class K> + struct mp_map_find_impl, K> +{ + struct U: mp_identity... {}; + + template class L, class... U> + static mp_identity> + f( mp_identity>* ); + + static mp_identity f( ... ); + + using V = decltype( f((U*)0) ); + + using type = typename V::type; +}; +``` +and its corresponding compile times: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, deduction |0.3 |0.7 |1.8 |3.6 |6.4 |10.4 |16.2 |DNF + +|clang$$++$$ 3.5.1, deduction |0.3 |0.4 |0.6 |0.9 |1.2 |1.6 |2.2 |2.7 + +|g$$++$$ 4.9.2, deduction |0.3 |0.5 |0.9 |1.6 |2.3 |3.4 |4.7 |6.3 +|=== +This looks ready to ship. + +The implementation contains one inefficiency though. If we evaluate +`mp_map_find`, then `mp_map_find`, the two nested `U` types are +the same as they only depend on `M`, but the compiler doesn't know that and +will instantiate each one separately. We should move this type outside +`mp_map_find_impl` so that it can be reused: +[subs=+quotes] +``` +template struct **mp_inherit**: T... {}; + +template struct mp_map_find_impl; + +template + using mp_map_find = typename mp_map_find_impl::type; + +template class M, class... T, class K> + struct mp_map_find_impl, K> +{ + **using U = mp_inherit...>;** + + template class L, class... U> + static mp_identity> + f( mp_identity>* ); + + static mp_identity f( ... ); + + using V = decltype( f((U*)0) ); + + using type = typename V::type; +}; +``` +(This same optimization, by the way, applies to our `is_base_of` implementation +of `mp_contains`.) + +The improvement in compile times on our benchmark is measurable: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, deduction+mp_inherit |0.3 |0.6 |1.4 |2.6 |4.5 |7.1 |10.7 |DNF + +|clang$$++$$ 3.5.1, deduction+mp_inherit |0.3 |0.4 |0.6 |0.8 |1.0 |1.4 |1.8 |2.2 + +|g$$++$$ 4.9.2, deduction+mp_inherit |0.3 |0.4 |0.6 |0.9 |1.3 |1.8 |2.3 |2.9 +|=== + +## mp_at + +With sets and maps covered, it's time to tackle vectors. Vectors for us are +just lists, to which we'll need to add the ability to efficiently access an +element based on its index. The customary name for this accessor is +`mp_at`, where `L` is a list and `I` is an `integral_constant` that +represents the index. We'll also follow the Boost.MPL convention and add +`mp_at_c`, where `I` is the index of type `size_t`. + +The recursive implementation of `mp_at` is: +``` +template struct mp_at_c_impl; + +template using mp_at_c = typename mp_at_c_impl::type; + +template using mp_at = typename mp_at_c_impl::type; + +template class L, class T1, class... T> + struct mp_at_c_impl, 0> +{ + using type = T1; +}; + +template class L, class T1, class... T, std::size_t I> + struct mp_at_c_impl, I> +{ + using type = mp_at_c, I-1>; +}; +``` +and the compile times for making `N` calls to `mp_at` with a list of size `N` +as the first argument are: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, recursive |3.6 |DNF |||||| + +|clang$$++$$ 3.5.1, recursive |1.0 |5.1 |15.3 |DNF |||| + +|g$$++$$ 4.9.2, recursive |0.9 |4.7 |14.2 |32.4 |DNF ||| +|=== +To improve upon this apalling result, we'll again exploit pack expansion into a +function call, but in a novel way. Let's suppose that we need to access the +fourth element (`I = 3`). We'll generate the function signature +``` +template W f( void*, void*, void*, W*, ... ); +``` +and then, given a list `L`, we'll evaluate the +expression +``` +decltype( f( (T1*)0, (T2*)0, (T3*)0, (T4*)0, (T5*)0, (T6*)0, (T7*)0 ) ) +``` +The three `void*` parameters will eat the first three elements, `W` will be +deduced as the fourth, and the ellipsis will take care of the rest. + +A working implementation based on this technique is shown below: +``` +// mp_repeat_c + +template struct mp_repeat_c_impl +{ + using _l1 = typename mp_repeat_c_impl::type; + using _l2 = typename mp_repeat_c_impl::type; + + using type = mp_append<_l1, _l1, _l2>; +}; + +template struct mp_repeat_c_impl<0, T...> +{ + using type = mp_list<>; +}; + +template struct mp_repeat_c_impl<1, T...> +{ + using type = mp_list; +}; + +template using mp_repeat_c = + typename mp_repeat_c_impl::type; + +// mp_at + +template struct mp_at_c_impl; + +template class L, class... T, + template class L2, class... U> + struct mp_at_c_impl, L2> +{ + template static W f( U*..., W*, ... ); + + using R = decltype( f( (mp_identity*)0 ... ) ); + + using type = typename R::type; +}; + +template using mp_at_c = + typename mp_at_c_impl>::type; + +template using mp_at = mp_at_c; +``` +and the compile times in the following table show it to be good enough for most +practical purposes. +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, void* |0.4 |1.1 |2.4 |4.7 |DNF ||| + +|clang$$++$$ 3.5.1, void* |0.4 |0.7 |1.2 |1.9 |2.7 |3.8 |5.0 |6.6 + +|g$$++$$ 4.9.2, void* |0.3 |0.5 |0.9 |1.3 |2.1 |3.0 |4.2 |5.5 +|=== +Are we done with `mp_at`, then? + +Let's try something else -- transform the input list `[T1, T2, T3]` into a map +`[[0, T1], [1, T2], [2, T3]]`, then use `mp_map_find` for the lookup: +[subs=+macros] +``` +// mp_map_from_list + +template struct mp_map_from_list_impl; + +template class L, class... T, std::size_t... J> + struct mp_map_from_list_impl, http://en.cppreference.com/w/cpp/utility/integer_sequence[integer_sequence]> +{ + using type = mp_list, T>...>; +}; + +template using mp_map_from_list = typename mp_map_from_list_impl::value>>::type; + +// mp_at + +template struct mp_at_c_impl +{ + using map = mp_map_from_list; + using type = mp_second>>; +}; + +template using mp_at_c = typename mp_at_c_impl::type; + +template using mp_at = typename mp_at_c_impl::type; +``` +At first sight, this looks ridiculous, but metaprogramming has its own rules. +Let's measure: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, map |0.3 |0.7 |1.5 |2.9 |5.0 |7.8 |11.9 |DNF + +|clang$$++$$ 3.5.1, map |0.3 |0.4 |0.6 |0.8 |1.1 |1.5 |1.8 |2.3 + +|g$$++$$ 4.9.2, map |0.3 |0.4 |0.7 |1.0 |1.4 |1.9 |2.5 |3.2 +|=== +Surprise, this is the best implementation. + +## mp_drop + +It turned out that we didn't need the `void*` trick for `mp_at`, but I'll show +an example where we do: `mp_drop`. `mp_drop` returns the list `L` without +its first `N` elements; or, in other words, it drops the first `N` elements +(presumably on the cutting room floor) and returns what's left. + +To implement `mp_drop`, we just need to change +``` +template W f( void*, void*, void*, W*, ... ); +``` +from above to return the rest of the elements, rather than just one: +``` +template mp_list f( void*, void*, void*, W*... ); +``` +Adding the necessary `mp_identity` seasoning produces the following working +implementation: +``` +template struct mp_drop_c_impl; + +template class L, class... T, + template class L2, class... U> + struct mp_drop_c_impl, L2> +{ + template static mp_identity> f( U*..., mp_identity*... ); + + using R = decltype( f( (mp_identity*)0 ... ) ); + + using type = typename R::type; +}; + +template using mp_drop_c = + typename mp_drop_c_impl>::type; + +template using mp_drop = mp_drop_c; +``` +I'll skip the recursive implementation and the performance comparison for this +one. We can pretty much tell who's going to win, and by how much. + +## mp_find_index + +The final algorithm that I'll bring to your attention is `mp_find_index`. +`mp_find_index` returns an integral constant of type `size_t` with a +value that is the index of the first occurence of `V` in `L`. If `V` is not in +`L`, the return value is the size of `L`. + +We'll start with the recursive implementation, as usual: +``` +template struct mp_find_index_impl; + +template using mp_find_index = typename mp_find_index_impl::type; + +template class L, class V> struct mp_find_index_impl, V> +{ + using type = mp_size_t<0>; +}; + +template class L, class... T, class V> + struct mp_find_index_impl, V> +{ + using type = mp_size_t<0>; +}; + +template class L, class T1, class... T, class V> + struct mp_find_index_impl, V> +{ + using type = mp_size_t<1 + mp_find_index, V>::value>; +}; +``` +and will continue with the compile times for `N` calls to `mp_find_index` on a +list with `N` elements, as usual: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, recursive |3.5 |DNF |||||| + +|clang$$++$$ 3.5.1, recursive |1.1 |5.5 |DNF ||||| + +|g$$++$$ 4.9.2, recursive |0.8 |4.6 |13.6 |DNF |||| +|=== +What can we do here? + +Let's go back to `mp_contains` and look at the "mp_count/cx_plus2" +implementation which we rejected in favor of the inheritance-based one. It +built a `constexpr` array of booleans and summed them in a `constexpr` +function. We can do the same here, except instead of summing the array, we can +find the index of the first true value: +``` +template struct mp_find_index_impl; + +template using mp_find_index = typename mp_find_index_impl::type; + +template class L, class V> struct mp_find_index_impl, V> +{ + using type = mp_size_t<0>; +}; + +constexpr std::size_t cx_find_index( bool const * first, bool const * last ) +{ + return first == last || *first? 0: 1 + cx_find_index( first + 1, last ); +} + +template class L, class... T, class V> + struct mp_find_index_impl, V> +{ + static constexpr bool _v[] = { std::is_same::value... }; + + using type = mp_size_t< cx_find_index( _v, _v + sizeof...(T) ) >; +}; +``` +The performance of this version is: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|clang$$++$$ 3.5.1, constexpr |0.5 |1.3 |2.9 |DNF |||| + +|g$$++$$ 4.9.2, constexpr |0.5 |1.4 |3.1 |5.5 |8.7 |13.0 |18.0 |DNF +|=== +which, while not ideal, is significantly better than our recursive punching +bag. But if our compiler of choice is VC$$++$$ 2013, we can't use `constexpr`. + +We may attempt an implementation along the same lines, but with the `constexpr` +function replaced with ordinary metaprogramming: +``` +template struct mp_find_index_impl; + +template using mp_find_index = typename mp_find_index_impl::type; + +template class L, class V> struct mp_find_index_impl, V> +{ + using type = mp_size_t<0>; +}; + +template struct find_index_impl_; + +template<> struct find_index_impl_<> +{ + static const std::size_t value = 0; +}; + +template struct find_index_impl_ +{ + static const std::size_t value = B1? 0: 1 + find_index_impl_::value; +}; + +template + struct find_index_impl_ +{ + static const std::size_t value = B1? 0: B2? 1: B3? 2: B4? 3: B5? 4: + B6? 5: B7? 6: B8? 7: B9? 8: B10? 9: 10 + find_index_impl_::value; +}; + +template class L, class... T, class V> + struct mp_find_index_impl, V> +{ + using type = mp_size_t::value...>::value>; +}; +``` +This is still recursive, so we don't expect miracles, but it wouldn't hurt to +measure: +|=== +||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800 + +|VC$$++$$ 2013, bool... |4.7 |94.5 |488.3 |XFA |||| + +|clang$$++$$ 3.5.1, bool... |0.6 |2.2 |5.8 |12.0 |21.7 |35.2 |DNF | + +|g$$++$$ 4.9.2, bool... |0.6 |2.4 |6.5 |13.2 |23.8 |39.1 |59.0 |DNF +|=== +(where XFA stands for "experimenter fell asleep".) + +This is an interesting tradeoff for VC$$++$$ 2013 and Clang. On the one hand, +this implementation is slower; on the other, it doesn't crash the compiler as +easily. Which to prefer is a matter of taste and of stern evaluation of one's +needs to manipulate type lists of length 300. + +Note that once we have `mp_drop` and `mp_find_index`, we can derive the +`mp_find` algorithm, which returns the suffix of `L` starting with the +first occurence of `V`, if any, and an empty list otherwise, by using +`mp_drop>`. + +## Conclusion + +In this article, I have shown efficient algorithms that allow us to treat type +lists as sets, maps and vectors, demonstrating various {cpp}11 implementation +techniques in the process. diff --git a/doc/mp11/overview.adoc b/doc/mp11/overview.adoc index 284d33a..ef8eb36 100644 --- a/doc/mp11/overview.adoc +++ b/doc/mp11/overview.adoc @@ -13,8 +13,8 @@ http://www.boost.org/LICENSE_1_0.txt Mp11 is a C++11 metaprogramming library for compile-time manipulation of data structures that contain types. It's based on template aliases and variadic templates and implements the approach outlined in the article -http://pdimov.com/cpp2/simple_cxx11_metaprogramming.html["Simple {cpp} metaprogramming"] -and http://pdimov.com/cpp2/simple_cxx11_metaprogramming_2.html[its sequel]. Reading these +<> +and <>. Reading these articles before proceeding with this documentation is _highly_ recommended. The general principles upon which Mp11 is built are that algorithms and metafunctions are