diff --git a/doc/compliant/typeof_internals.htm b/doc/compliant/typeof_internals.htm new file mode 100755 index 0000000..4562b20 --- /dev/null +++ b/doc/compliant/typeof_internals.htm @@ -0,0 +1,1566 @@ + + +
+ + + + + +
Encoding and decoding templates
Different kinds of template parameters or polymorphism with macros
Handling unused sequence elements
+ + + +
This document describes the internals of the TYPEOF macro implementation. It covers the so-called “compliant” implementation, which uses partial template specializations to encode and decode types. It should not be confused with the other two implementations that currently exist (or will soon exist) under the umbrella of the proposed BOOST_TYPEOF macro: Peder Holt’s “vintage” implementation, which trades partial template specialization for function overloading and compile-time constants, and the MSVC-specific typeof trick recently invented by Igor Chesnokov.
+ +
The code in this document is provided for explanatory purposes only. While it reflects the actual code fairly closely, it differs in a number of ways. First, the BOOST_TYPEOF prefix has been omitted from all the macros to make the code smaller. The namespaces have been omitted for the same reason. Second, the code fragments were entered by hand and were not compiled, so I apologize in advance for any typos. I hope these typos will not prevent the reader from understanding the material, and I would be happy to correct them as they are found and reported.
+ +
It has to be stressed that the idea of breaking a type into multiple compile-time integers by using partial template specializations is not new. To the best of my knowledge it belongs to Steve Dewhurst, who described it in his famous CUJ article “A BIT-Wise Typeof Operator”. The idea of applying MPL to this problem belongs to David Abrahams; see http://thread.gmane.org/gmane.comp.lib.boost.devel/76208.
+ +
The main thing that distinguishes this implementation from the others available is how easy it is to define new specializations for complicated templates. For example:
+ +
template<class T, int n, template<class, unsigned int> class Tpl>
class foo; /* a template with rather involved template id */
+ +
REGISTER_TEMPLATE(foo, (class)(int)(TEMPLATE((class)(unsigned int)))) /* now foo can be handled by TYPEOF */
+ +
The implementation of this REGISTER_TEMPLATE macro, as well as of many other useful specializations (for functions, arrays, etc.), was made possible by extensive use of the Boost Preprocessor Library.

Let’s say we have an expression “expr”. The first step is to pass it to a function template, thus using the compiler’s built-in type deduction:
+ +
template<class T>
+ +unspecified foo(const T&);
+ +
foo(expr);
+ +
Inside foo() the type of the expression is known (T), so the return type can be constructed in such a way that its size depends on T. One possible way of doing this is to return a reference to a character array:
+ +
template<class T>
+ +char(& foo(const T&) )[
integral-const-depends-on-T
];
+ +
sizeof( foo(expr) );
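
To make the trick concrete, here is a minimal self-contained sketch. The type_code trait and the code_of function are hypothetical names invented for this illustration; they are not part of the library, and the numbers assigned to int and double are arbitrary.

```cpp
#include <cstddef>

// Hypothetical mapping from a type to a small positive integer.
template<class T> struct type_code;  // primary template left undefined
template<> struct type_code<int>    { static const std::size_t value = 1; };
template<> struct type_code<double> { static const std::size_t value = 2; };

// Declared only, never defined: the return type is a reference to a
// character array whose length depends on the deduced type T.
template<class T>
char (&code_of(const T&))[type_code<T>::value];

// sizeof turns the deduced type into a compile-time integer.
static_assert(sizeof(code_of(42)) == 1, "an int expression maps to 1");
static_assert(sizeof(code_of(1.5 + 2)) == 2, "a double expression maps to 2");
```

Note that code_of needs no definition: sizeof does not evaluate its operand, so the compiler only has to work out the return type.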
+ +
Now let’s assume that a type can be encoded into a sequence of integers. We will explore how to do this later. For now, let’s just say that it can be done, and that it looks like the following:
+ +
template<class T>
+ +struct encode_type
+ +{
    typedef unspecified type; // sequence of integer numbers
+ +};
+ +
Since sizeof(foo(expr)) is just one integer, we cannot handle the whole sequence at once. Let’s instead return the Nth element of the sequence. Accordingly, we add a parameter to “foo”, and rename it to the more descriptive “at”:
+ +
template<class T, class N>
+ +char(& at(const T&, const N&) )[
    mpl::at<typename encode_type<T>::type, N>::type::value
];
+ +
We can now reconstruct the sequence like this:
+ +
mpl::vector<
    mpl::int_<sizeof(at(expr, mpl::int_<0>()))>,
    mpl::int_<sizeof(at(expr, mpl::int_<1>()))>,
    mpl::int_<sizeof(at(expr, mpl::int_<2>()))>,
    …
    mpl::int_<sizeof(at(expr, mpl::int_<N>()))>
>
If we choose a big enough N, we can hope that our type will fit. We will also leave aside for now the issue of how unused elements are handled. Assuming it is possible to decode this sequence back into the original type, we can write:
#define TYPEOF(expr)\
+ +decode_type<mpl::vector<\
+ +mpl::int_<sizeof(at(expr, mpl::int_<0>()))>,\
+ +mpl::int_<sizeof(at(expr, mpl::int_<1>()))>,\
+ +mpl::int_<sizeof(at(expr, mpl::int_<2>()))>,\
+ +…\
+ +mpl::int_<sizeof(at(expr, mpl::int_<N>()))>\
+ +> >::type
+ +
Let’s take stock of where we are. We have just implemented a simplified typeof facility, assuming the following:
+ +
1. It’s possible to encode a type into a compile-time sequence of integer numbers;
2. It’s possible to decode it back;
3. It’s possible to gracefully handle the unused elements of the sequence.
+ +
Let’s now explore these three issues in more detail.
+ +Let’s consider the following type:
+ +
const std::pair<int*, std::string>*
+ +
This type can be represented as a tree where each node is either a type, a template, or a modifier of the original type:
+ +
                                +-- pointer -- int
pointer -- const -- std::pair --+
                                +-- std::string
+ +
Let’s assign unique integer identifiers as follows:
+ +
pointer 1
+ +const 2
+ +std::pair 3
+ +int 4
+ +std::string 5
+ +
Now the above type can be encoded as:
+ +
1 2 3 1 4 5
+ +
Once identifiers are assigned, any type containing these items can be encoded, such as:

std::pair<std::string, const std::string*> | 3 5 1 2 5
const std::string* const | 2 1 2 5
std::pair<std::pair<int, int>, std::pair<std::string, std::string> > | 3 3 4 4 3 5 5
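
The encoding scheme can be tried out with a small sketch. It is only an illustration under stated assumptions: it uses a hand-rolled variadic seq (C++11) in place of the mpl::vector the real implementation uses, and the names seq, concat and encode are invented here. The identifiers are the ones assigned above.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a compile-time sequence of integers.
template<int... Is> struct seq {};

// Concatenates two sequences.
template<class A, class B> struct concat;
template<int... As, int... Bs>
struct concat<seq<As...>, seq<Bs...> > { typedef seq<As..., Bs...> type; };

// encode<T>::type is the pre-order walk of the type tree of T.
template<class T> struct encode;  // primary template left undefined
template<> struct encode<int>         { typedef seq<4> type; };
template<> struct encode<std::string> { typedef seq<5> type; };

template<class T> struct encode<T*>        // pointer: 1, then the pointee
    : concat<seq<1>, typename encode<T>::type> {};

template<class T> struct encode<const T>   // const: 2, then the unqualified type
    : concat<seq<2>, typename encode<T>::type> {};

template<class A, class B>                 // std::pair: 3, then both parameters
struct encode<std::pair<A, B> >
    : concat<seq<3>, typename concat<typename encode<A>::type,
                                     typename encode<B>::type>::type> {};

static_assert(std::is_same<
    encode<const std::pair<int*, std::string>*>::type,
    seq<1, 2, 3, 1, 4, 5> >::value, "matches the encoding worked out in the text");
```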
Decoding is also simple. Let’s decode the following sequence: 1 3 4 1 2 5
+ +
decode(1 3 4 1 2 5)
+ +
The first item, 1, tells us that this is a pointer:
+ +
decode(3 4 1 2 5)*
+ +
3 is std::pair, which is a template with two parameters:
+ +
std::pair<decode-2(4 1 2 5)>*
+ +
4 is int:
+ +
std::pair<int, decode(1 2 5)>*
+ +
1 is a pointer:
+ +
std::pair<int, decode(2 5)*>*
+ +
2 is const:
+ +
std::pair<int, const decode(5)*>*
+ +
5 is std::string:
+ +
std::pair<int, const std::string*>*
+ +
We are done.
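
The same walk can be written as a small sketch. As before, this is an assumption-laden illustration rather than the library code: seq is a hand-rolled variadic sequence (C++11) standing in for an MPL sequence, and decode is a name invented here. Each specialization exposes the decoded type and the part of the sequence it did not consume.

```cpp
#include <string>
#include <type_traits>
#include <utility>

template<int... Is> struct seq {};  // hypothetical compile-time integer sequence

// decode<S>::type is the decoded type, decode<S>::rest the unconsumed tail.
template<class S> struct decode;    // primary template left undefined

template<int... Rest> struct decode<seq<4, Rest...> > {  // 4 is int
    typedef int type;
    typedef seq<Rest...> rest;
};
template<int... Rest> struct decode<seq<5, Rest...> > {  // 5 is std::string
    typedef std::string type;
    typedef seq<Rest...> rest;
};
template<int... Rest> struct decode<seq<1, Rest...> > {  // 1 is a pointer
    typedef decode<seq<Rest...> > inner;
    typedef typename inner::type* type;
    typedef typename inner::rest rest;
};
template<int... Rest> struct decode<seq<2, Rest...> > {  // 2 is const
    typedef decode<seq<Rest...> > inner;
    typedef const typename inner::type type;
    typedef typename inner::rest rest;
};
template<int... Rest> struct decode<seq<3, Rest...> > {  // 3 is std::pair
    typedef decode<seq<Rest...> > first;           // decode the first parameter
    typedef decode<typename first::rest> second;   // continue where it stopped
    typedef std::pair<typename first::type, typename second::type> type;
    typedef typename second::rest rest;
};

static_assert(std::is_same<
    decode<seq<1, 3, 4, 1, 2, 5> >::type,
    std::pair<int, const std::string*>*>::value, "matches the walk-through above");
```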
+ +
Having figured out how types can be encoded into a sequence of integers and then decoded back, let’s now see how all this can be implemented.

The type encoding described above can be implemented with partial template specialization. For now, let’s ignore the issue of generating unique identifiers and assume we have a UNIQUE_ID() macro that does the job. Also, for the sake of compile-time performance, it makes sense to append the encoding to a given sequence (which we’ll denote by “V”, since it is an mpl::vector):
+ +
template<class V, class T>
+ +struct encode_type; //not implemented
+ +
We can encode a type, for instance int, with the following specialization:
+ +
template<class V>
+ +struct encode_type<V, int> : mpl::push_back<
+ +V,
+ +mpl::int_<4>
+ +>
+ +{};
+ +
When decoding a type, we will accept an iterator into the original sequence, extract the first identifier, use it to select a partial template specialization, and forward the rest of the sequence to this specialization:
+ +
template<class ID, class Iter>
struct decode_type_impl; //not implemented; declared first so decode_type can name it

template<class Iter>
struct decode_type : decode_type_impl<
    typename mpl::deref<Iter>::type,
    typename mpl::next<Iter>::type
>
{};
+ +
Each specialization of decode_type_impl returns the decoded type and the position in the original sequence where decoding stopped. Again, for int, it looks like this:
+ +
template<class Iter>
+ +struct decode_type_impl<mpl::int_<4>, Iter>
+ +{
+ +typedef int type;
+ +typedef Iter iter;
+ +};
+ +
Both specializations for the same type can be combined into a single macro:
+ +
#define REGISTER_TYPE_IMPL(Name, ID) \
    template<class V> \
    struct encode_type<V, Name> : mpl::push_back< \
        V, \
        mpl::int_<ID> \
    > \
    {}; \
    template<class Iter> \
    struct decode_type_impl<mpl::int_<ID>, Iter> \
    { \
        typedef Name type; \
        typedef Iter iter; \
    };
+ +
#define REGISTER_TYPE(Name)\
+REGISTER_TYPE_IMPL(Name, UNIQUE_ID())
+ +
REGISTER_TYPE(int)
+ +REGISTER_TYPE(char)
+ +REGISTER_TYPE(short)
+ +REGISTER_TYPE(long)
+ +...
+ +
Let’s consider the std::pair class template. Its encoding puts its ID, 3, into the vector, and then forwards to the encoding of its first and then its second template parameter:
+ +
template<class V, class P0, class P1>
struct encode_type<V, std::pair<P0, P1> >
{
    typedef typename mpl::push_back<
        V,
        mpl::int_<3>
    >::type v0;

    typedef typename encode_type<
        v0,
        P0
    >::type v1;

    typedef typename encode_type<
        v1,
        P1
    >::type v2;

    typedef v2 type;
};
+ +
Decoding will decode the parameters, and re-construct the +pair:
+ +
template<class Iter>
struct decode_type_impl<mpl::int_<3>, Iter>
{
    typedef decode_type<Iter> d0;
    typedef decode_type<typename d0::iter> d1;

    typedef std::pair<
        typename d0::type,
        typename d1::type
    > type;

    typedef typename d1::iter iter;
};
+ +
With a little bit of preprocessor magic, these two can be combined into a single macro that can be used like this:
+ +
REGISTER_TEMPLATE(std::pair, 2)
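
For illustration, here is what such registration macros might look like in a stripped-down setting. This is a sketch under several assumptions: it uses a hand-rolled variadic seq (C++11) rather than MPL, the IDs are passed explicitly rather than generated, the template macro is fixed at two type parameters instead of taking the arity as an argument, and all the SKETCH_ names are invented here.

```cpp
#include <string>
#include <type_traits>
#include <utility>

template<int... Is> struct seq {};  // hypothetical compile-time integer sequence

template<class A, class B> struct concat;
template<int... As, int... Bs>
struct concat<seq<As...>, seq<Bs...> > { typedef seq<As..., Bs...> type; };

template<class T> struct encode;  // type -> sequence
template<class S> struct decode;  // sequence -> type plus unconsumed rest

// Registers a plain type: one encode and one decode specialization.
#define SKETCH_REGISTER_TYPE(Name, ID)                           \
    template<> struct encode<Name> { typedef seq<ID> type; };    \
    template<int... Rest> struct decode<seq<ID, Rest...> > {     \
        typedef Name type;                                       \
        typedef seq<Rest...> rest;                               \
    };

// Registers a class template with exactly two type parameters.
#define SKETCH_REGISTER_TEMPLATE_2(Name, ID)                             \
    template<class P0, class P1> struct encode<Name<P0, P1> >            \
        : concat<seq<ID>,                                                \
                 typename concat<typename encode<P0>::type,              \
                                 typename encode<P1>::type>::type> {};   \
    template<int... Rest> struct decode<seq<ID, Rest...> > {             \
        typedef decode<seq<Rest...> > d0;                                \
        typedef decode<typename d0::rest> d1;                            \
        typedef Name<typename d0::type, typename d1::type> type;         \
        typedef typename d1::rest rest;                                  \
    };

SKETCH_REGISTER_TYPE(int, 4)
SKETCH_REGISTER_TYPE(std::string, 5)
SKETCH_REGISTER_TEMPLATE_2(std::pair, 3)

// Round trip: encoding and then decoding gives back the original type.
typedef std::pair<int, std::pair<std::string, int> > sketch_original;
static_assert(std::is_same<
    decode<encode<sketch_original>::type>::type,
    sketch_original>::value, "round trip preserves the type");
```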
+ +
That is all there is to say about templates as long as they have only type parameters. Things get more interesting, however, once we consider integral and template template parameters.
+ +Let’s say we have the following class template:
+ +
template<class T, unsigned int n> class x;
+ +
First, how do we describe such a template to the preprocessor? This can be done with a preprocessor sequence:
+ +
REGISTER_TEMPLATE(x, (class)(unsigned int))
+ +
(Note that this is the same REGISTER_TEMPLATE macro, only now the second macro parameter describes what template parameters are used, rather than just providing their number. The macro is overloaded using some preprocessor magic.)
+ +
We have already discussed how a type template parameter is encoded. Simplifying things for the sake of clarity, we can assume that an integral template parameter is placed into the vector as is. This is not exactly true, because the range of integers that can be returned via sizeof(character-array) is limited, which forces us to use two vector elements in some cases.
+ +
The encoding now might look like this (assuming ID of 21):
+ +
template<class V, class P0, unsigned int P1>
struct encode_type<V, x<P0, P1> >
{
    typedef typename mpl::push_back<
        V,
        mpl::int_<21>
    >::type v0;

    typedef typename encode_type<
        v0,
        P0
    >::type v1;

    typedef typename mpl::push_back<
        v1,
        mpl::int_<P1>
    >::type v2;

    typedef v2 type;
};
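
The same steps can be checked in a reduced sketch. It assumes a hand-rolled variadic seq (C++11) in place of mpl::vector and places the integral value directly into the sequence, ignoring the range limitation mentioned above. The names seq, concat and encode are invented for the illustration; x and the ID 21 come from the text.

```cpp
#include <type_traits>

template<int... Is> struct seq {};  // hypothetical compile-time integer sequence

template<class A, class B> struct concat;
template<int... As, int... Bs>
struct concat<seq<As...>, seq<Bs...> > { typedef seq<As..., Bs...> type; };

template<class T> struct encode;  // primary template left undefined
template<> struct encode<int> { typedef seq<4> type; };

// The class template from the text; a declaration is all we need.
template<class T, unsigned int n> class x;

template<class P0, unsigned int P1>
struct encode<x<P0, P1> >
{
    typedef seq<21> v0;                                               // the template's ID
    typedef typename concat<v0, typename encode<P0>::type>::type v1;  // type parameter
    typedef typename concat<v1, seq<static_cast<int>(P1)> >::type v2; // integral value as is
    typedef v2 type;
};

static_assert(std::is_same<encode<x<int, 7> >::type,
                           seq<21, 4, 7> >::value, "ID, then int, then the value 7");
```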
+ +
This really begins to look like polymorphism! But first we need objects.
+ +
An object is a combination of properties. When we are talking about the preprocessor, we can use sequences to hold them. Besides regular properties, we need type information inside the objects. This type information can later be used for dispatching:
+ +
#define TYPE_PARAM (TYPE_PARAM)
#define INTEGRAL_PARAM(Type) (INTEGRAL_PARAM)(Type)
Let’s now define “virtual functions”:
+ +
#define TYPE_PARAM_TYPE(This) class
#define TYPE_PARAM_ENCODE(This, n)\
    typedef typename encode_type<v ## n, P ## n>::type\
    BOOST_PP_CAT(v, BOOST_PP_INC(n))
#define INTEGRAL_PARAM_TYPE(This) BOOST_PP_SEQ_ELEM(1, This)
#define INTEGRAL_PARAM_ENCODE(This, n)\
    typedef typename mpl::push_back<v ## n, mpl::int_<P ## n> >::type\
    BOOST_PP_CAT(v, BOOST_PP_INC(n))
Now we need a way to call a “virtual function”:
+ +
#define VIRTUAL(Fname, This)\
    BOOST_PP_SEQ_CAT((BOOST_PP_SEQ_HEAD(This))(_)(Fname))
+ +
As you can see, the head of the object (sequence) is used for dispatching.
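
The dispatch idea does not depend on Boost and can be sketched with the bare preprocessor. The sketch below is an assumption-heavy simplification: objects are parenthesized (tag, data) pairs rather than Boost.Preprocessor sequences, only the TYPE “virtual function” is shown, and every SK_ name, as well as holder, is invented for the illustration.

```cpp
#include <type_traits>

// Token pasting with one level of indirection, so arguments expand first.
#define SK_CAT_I(a, b) a##b
#define SK_CAT(a, b) SK_CAT_I(a, b)

// An "object" is a parenthesized pair: (tag, data).
#define SK_HEAD_I(tag, data) tag
#define SK_HEAD(obj) SK_HEAD_I obj
#define SK_DATA_I(tag, data) data
#define SK_DATA(obj) SK_DATA_I obj

// "Constructors": each object carries its own name as the tag.
#define SK_TYPE_PARAM (SK_TYPE_PARAM, ~)
#define SK_INTEGRAL_PARAM(Type) (SK_INTEGRAL_PARAM, Type)

// One "virtual function", TYPE, implemented once per kind of object.
#define SK_TYPE_PARAM_TYPE(This) class
#define SK_INTEGRAL_PARAM_TYPE(This) SK_DATA(This)

// Dispatch: paste <tag>_<function name> to select the implementation.
#define SK_VIRTUAL(Fname, This) SK_CAT(SK_CAT(SK_HEAD(This), _), Fname)

// Declares one template parameter described by an object.
#define SK_PARAM_DECL(obj, name) SK_VIRTUAL(TYPE, obj)(obj) name

// Expands to: template<class P0, unsigned int P1>
template<SK_PARAM_DECL(SK_TYPE_PARAM, P0),
         SK_PARAM_DECL(SK_INTEGRAL_PARAM(unsigned int), P1)>
struct holder
{
    typedef P0 type;
    static const unsigned int value = P1;
};

static_assert(std::is_same<holder<char, 7u>::type, char>::value, "P0 is a type parameter");
static_assert(holder<char, 7u>::value == 7u, "P1 is an unsigned int parameter");
```

The caller never names SK_TYPE_PARAM_TYPE or SK_INTEGRAL_PARAM_TYPE directly; the tag stored in the object picks the right one, which is what makes adding a new kind of parameter a local change.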
+ +
Before we can finish converting our encode_type specialization, we need to transform
+ +
(class)(unsigned int)
+ +
into
+ +
(TYPE_PARAM)(INTEGRAL_PARAM(unsigned int))
+ +
Without going into too much detail, here is an example of the transformation sequence:

unsigned int → PREFIX_unsigned int_SUFFIX → (unsigned)(int) → MACRO_unsigned_int → INTEGRAL_PARAM(unsigned int)

class → PREFIX_class_SUFFIX → (class) → MACRO_class → TYPE_PARAM
+ +
Assuming this transformation is done with a macro called TRANSFORM_PARAMS, we can define our encoding specialization like this:
+ +
#define REGISTER_TEMPLATE_PARAM_PAIR(z, n, elem) \
    VIRTUAL(TYPE, elem)(elem) BOOST_PP_CAT(P, n)

#define REGISTER_TEMPLATE_ENCODE_PARAM(r, data, n, elem)\
    VIRTUAL(ENCODE, elem)(elem, n);

#define REGISTER_TEMPLATE_IMPL(Name, ID, Params, Size)\
    . . .
    template<class V\
        SEQ_ENUM_TRAILING(Params, REGISTER_TEMPLATE_PARAM_PAIR)\
    >\
    struct encode_type_impl<V, Name<BOOST_PP_ENUM_PARAMS(Size, P)> >\
    {\
        typedef typename mpl::push_back<V, mpl::int_<ID> >::type v0;\
        BOOST_PP_SEQ_FOR_EACH_I(REGISTER_TEMPLATE_ENCODE_PARAM, ~, Params)\
        typedef BOOST_PP_CAT(v, Size) type;\
    };\
    . . .
+ +
#define REGISTER_TEMPLATE(Name, Params)\
    REGISTER_TEMPLATE_IMPL(\
        Name,\
        UNIQUE_ID(),\
        TRANSFORM_PARAMS(Params),\
        BOOST_PP_SEQ_SIZE(Params))
+ +
(SEQ_ENUM_TRAILING is our own macro with a hopefully obvious meaning.)
+ +
It’s worth noting here that we also support a third kind of template parameter: template template parameters. With three different kinds of parameters and half a dozen “virtual functions”, such a polymorphic approach really pays off.
+ +Let’s revisit our TYPEOF macro implementation. We left it in the following state:
+ +
template<class T, class N>
+ +char(& at(const T&, const N&) )[
    mpl::at<typename encode_type<T>::type, N>::type::value
+ +];
+ +
#define TYPEOF(expr)\
+ +decode_type<mpl::vector<\
+ +mpl::int_<sizeof(at(expr, mpl::int_<0>()))>,\
+ +mpl::int_<sizeof(at(expr, mpl::int_<1>()))>,\
+ +mpl::int_<sizeof(at(expr, mpl::int_<2>()))>,\
+ +…\
+ +mpl::int_<sizeof(at(expr, mpl::int_<N>()))>\
+ +> >::type
+ +
Considering a few things discussed in the previous section, we should now rewrite it like this:
+ +
template<class T, class N>
+ +char(& at(const T&, const N&) )[
    mpl::at<typename encode_type<mpl::vector0<>, T>::type, N>::type::value
+ +];
+ +
#define TYPEOF(expr)\
    decode_type<mpl::begin<mpl::vector<\
        mpl::int_<sizeof(at(expr, mpl::int_<0>()))>,\
        mpl::int_<sizeof(at(expr, mpl::int_<1>()))>,\
        mpl::int_<sizeof(at(expr, mpl::int_<2>()))>,\
        …\
        mpl::int_<sizeof(at(expr, mpl::int_<N>()))>\
    > >::type>::type
+ +
We don’t want the function template at() to be instantiated for N at or beyond the size of the encoded vector, for at least two reasons:
+ +
1. Unnecessary template instantiations have a negative effect on compile-time performance;
2. mpl::at<> will fail.
+ +
So, let’s start by determining the size of the encoded vector:
template<class T>
+ +char(& size(const T&) )[
    mpl::size<typename encode_type<mpl::vector0<>, T>::type>::type::value
+ +];
+ +
Now, for any N at or beyond the size of the encoded vector, we simply substitute zero for N, thus reusing the instantiation of at() that returns the first element of the encoded sequence:
+ +
mpl::int_<sizeof(at(expr, mpl::int_<(i < sizeof(size(expr)) ? i : 0)>()))>
+ +
Let’s define the encoded vector size limit, and put everything together:
+ +
#ifndef BOOST_TYPEOF_LIMIT_SIZE
#   define BOOST_TYPEOF_LIMIT_SIZE 50
#endif
template<class T, class N>
+ +char(& at(const T&, const N&) )[
    mpl::at<typename encode_type<mpl::vector0<>, T>::type, N>::type::value
+ +];
+ +
template<class T>
+ +char(& size(const T&) )[
    mpl::size<typename encode_type<mpl::vector0<>, T>::type>::type::value
+ +];
+ +
#define TYPEOF(expr)\
    decode_type<mpl::begin<mpl::vector<\
        mpl::int_<sizeof(at(expr, mpl::int_<(\
            0 < sizeof(size(expr)) ? 0 : 0\
        )>()))>,\
        mpl::int_<sizeof(at(expr, mpl::int_<(\
            1 < sizeof(size(expr)) ? 1 : 0\
        )>()))>,\
        mpl::int_<sizeof(at(expr, mpl::int_<(\
            2 < sizeof(size(expr)) ? 2 : 0\
        )>()))>,\
        . . .\
        mpl::int_<sizeof(at(expr, mpl::int_<(\
            BOOST_TYPEOF_LIMIT_SIZE < sizeof(size(expr)) ?\
            BOOST_TYPEOF_LIMIT_SIZE : 0\
        )>()))>\
    > >::type>::type
+ +
It’s now trivial for anybody familiar with the Boost Preprocessor Library to rewrite this nicely, so let’s omit this step. You can always see the result in boost/typeof/compliant/typeof_impl.hpp.
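
To see all the pieces work together, here is a complete miniature of the scheme. It is a sketch under stated assumptions, not the library code: it uses a hand-rolled variadic seq (C++11) and std::integral_constant instead of MPL, registers only int, pointers and const, hard-codes a limit of four elements, and uses invented names (seq, nth, encode, decode, sketch_at, sketch_size, SKETCH_AT, SKETCH_TYPEOF).

```cpp
#include <cstddef>
#include <type_traits>

template<int... Is> struct seq { static const std::size_t size = sizeof...(Is); };

template<class A, class B> struct concat;
template<int... As, int... Bs>
struct concat<seq<As...>, seq<Bs...> > { typedef seq<As..., Bs...> type; };

// N-th element of a seq; an out-of-range N is a compile-time error.
template<class S, std::size_t N> struct nth;
template<int H, int... T>
struct nth<seq<H, T...>, 0> { static const int value = H; };
template<std::size_t N, int H, int... T>
struct nth<seq<H, T...>, N> : nth<seq<T...>, N - 1> {};

// Encoding: pointer is 1, const is 2, int is 4.
template<class T> struct encode;
template<> struct encode<int> { typedef seq<4> type; };
template<class T> struct encode<T*> : concat<seq<1>, typename encode<T>::type> {};
template<class T> struct encode<const T> : concat<seq<2>, typename encode<T>::type> {};

// Decoding: each step yields the type and the unconsumed rest.
template<class S> struct decode;
template<int... R> struct decode<seq<4, R...> > { typedef int type; typedef seq<R...> rest; };
template<int... R> struct decode<seq<1, R...> > {
    typedef decode<seq<R...> > inner;
    typedef typename inner::type* type;
    typedef typename inner::rest rest;
};
template<int... R> struct decode<seq<2, R...> > {
    typedef decode<seq<R...> > inner;
    typedef const typename inner::type type;
    typedef typename inner::rest rest;
};

// Declared only: the sizes of the returned arrays carry the information.
template<class T, std::size_t N>
char (&sketch_at(const T&, std::integral_constant<std::size_t, N>))
    [nth<typename encode<T>::type, N>::value];
template<class T>
char (&sketch_size(const T&))[encode<T>::type::size];

// Indices at or beyond the encoded size are replaced by 0.
#define SKETCH_AT(expr, I)                                                  \
    static_cast<int>(sizeof(sketch_at(expr,                                 \
        std::integral_constant<std::size_t,                                 \
            ((I) < sizeof(sketch_size(expr)) ? (I) : 0)>())))
#define SKETCH_TYPEOF(expr)                                                 \
    decode<seq<SKETCH_AT(expr, 0), SKETCH_AT(expr, 1),                      \
               SKETCH_AT(expr, 2), SKETCH_AT(expr, 3)> >::type

static_assert(std::is_same<SKETCH_TYPEOF(1 + 2), int>::value, "int expression");
static_assert(std::is_same<SKETCH_TYPEOF(static_cast<const int*>(0)),
                           const int*>::value, "pointer to const int");
```

For the second expression the encoded vector is 1 2 4, so the fourth slot is filled by reusing element 0; the decoder stops after consuming three elements and never looks at the padding.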
Looking at the resulting TYPEOF macro, it may seem that the type of our expression is encoded many times, since the functions size() and at() are each mentioned BOOST_TYPEOF_LIMIT_SIZE times. However, the template encode_type<mpl::vector0<>, T> is always the same for the same expression, so it is instantiated only once and then just looked up. Hence, we can roughly state that the compile-time complexity of our TYPEOF is O(m), where m is the size of the encoded vector. In practice this means that TYPEOF compiles more slowly for complicated types than for simple ones.
+ +
Copyright © Arkadiy Vertleyb, 2005
+ ++ +