Runtime Type Checking in C++ without RTTI

A technique I always seem to forget is how to map C++ types to integers without relying upon RTTI. A variation on this is used in <locale> in the standard library, for std::use_facet<>. But let’s take a much simpler, and highly contrived, example.

Let’s say we’ve got some values of different types, and we want to give those types to a library to store somewhere, and then we later want to get them back again. Crucially, the library itself doesn’t know anything about the types in question. So, for a very simple case:

#include <vector>
#include <iostream>
#include <string>

int main(int, char *[])
{
    std::vector<Something> things = { std::string("foo"), 123 };
    /* ... */
    std::cout << things[0].as<std::string>() << " " << things[1].as<int>() << std::endl;
}

Note the gratuitous use of C++0x (now C++11) initialiser lists, just because we can.

Those familiar with Boost might think that Something is like boost::any. However, boost::any uses RTTI, which is slow and completely unnecessary.

A first implementation of Something might look like this:

#include <memory>

class Something
{
    private:
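        struct SomethingValueBase
        {
            // Deliberately empty for now. std::shared_ptr remembers how to
            // delete the SomethingValue<T_> it was constructed from, so
            // destruction is still correct without a virtual destructor.
        };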

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            return static_cast<const SomethingValue<T_> &>(*_value).value;
        }
};

This works, but has a major flaw: if you get the types wrong when calling Something::as<>, the static_cast gives you undefined behaviour, which in practice means a segfault or something similarly horrible. We’d like to replace that with something safer.

One way to do it is to use runtime type information. The simplest variation on this is to replace the static_cast with a dynamic_cast. However, we can only do this if SomethingValueBase is a polymorphic type, which it isn’t. We can make it so by adding in a virtual destructor:

#include <memory>

class Something
{
    private:
        struct SomethingValueBase
        {
            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            return dynamic_cast<const SomethingValue<T_> &>(*_value).value;
        }
};

Now, if we get the types wrong, a std::bad_cast will be thrown. Alternatively, we can use our own exception type:

class SomethingIsSomethingElse
{
};

class Something
{
    /* snip */

    public:
        template <typename T_>
        const T_ & as() const
        {
            auto value_casted(dynamic_cast<const SomethingValue<T_> *>(_value.get()));
            if (! value_casted)
                throw SomethingIsSomethingElse();
            return value_casted->value;
        }
};

We can also make use of std::dynamic_pointer_cast, which is possibly slightly less ugly syntactically:

class Something
{
    /* snip */

    public:
        template <typename T_>
        const T_ & as() const
        {
            auto value_casted(std::dynamic_pointer_cast<const SomethingValue<T_> >(_value));
            if (! value_casted)
                throw SomethingIsSomethingElse();
            return value_casted->value;
        }
};
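To see the checked version from the caller’s side, here is a minimal sketch that puts the pieces above into one listing. The holds<> helper is hypothetical (it isn’t part of Something); it just turns the exception into a bool so the behaviour is easy to poke at.

```cpp
#include <memory>
#include <string>

class SomethingIsSomethingElse
{
};

class Something
{
    private:
        struct SomethingValueBase
        {
            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            // dynamic_cast on a pointer yields a null pointer on mismatch
            auto value_casted(dynamic_cast<const SomethingValue<T_> *>(_value.get()));
            if (! value_casted)
                throw SomethingIsSomethingElse();
            return value_casted->value;
        }
};

// Hypothetical helper, not part of the class above: returns true if
// thing holds exactly a T_, using the exception as the only signal.
template <typename T_>
bool holds(const Something & thing)
{
    try
    {
        thing.as<T_>();
        return true;
    }
    catch (const SomethingIsSomethingElse &)
    {
        return false;
    }
}
```

Note that the match is exact: a Something built from an int does not count as holding a long, because SomethingValue<int> and SomethingValue<long> are unrelated types.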

All of this is using RTTI, though, and RTTI is a huge amount of overkill for what we need. Before eliminating the RTTI, though, we’ll switch to using it in a different way:

#include <memory>
#include <string>
#include <typeinfo>

class Something
{
    private:
        // Empty tag type: used only to give typeid something to name.
        template <typename T_>
        struct SomethingValueType
        {
        };

        struct SomethingValueBase
        {
            std::string type_info_name;

            SomethingValueBase(const std::string & t) :
                type_info_name(t)
            {
            }

            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                SomethingValueBase(typeid(SomethingValueType<T_>()).name()),
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            if (typeid(SomethingValueType<T_>()).name() != _value->type_info_name)
                throw SomethingIsSomethingElse();
            return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
        }
};

Here we make use of typeid explicitly, which is widely considered to be about on par with use of goto. However, it paves the way for our next step. Can we replace typeid(SomethingValueType<T_>()).name() with a different, non-evil expression? Let’s think about what properties the result of that expression must have:

  • We must be able to store it, so it needs to be a regular type.
  • We must be able to compare values of it, and the comparison must be true if and only if the two types used to create the values are the same. (Note that RTTI doesn’t even provide this guarantee: the standard doesn’t require std::type_info::name() to return a different string for every type.)

Let’s try this:

#include <memory>
#include <string>

class SomethingIsSomethingElse
{
};

template <typename T_>
struct SomethingTypeTraits;

class Something
{
    private:
        struct SomethingValueBase
        {
            int magic_number;

            SomethingValueBase(const int m) :
                magic_number(m)
            {
            }

            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                SomethingValueBase(SomethingTypeTraits<T_>::magic_number),
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            if (SomethingTypeTraits<T_>::magic_number != _value->magic_number)
                throw SomethingIsSomethingElse();
            return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
        }
};

Now, our library user has to provide specialisations of SomethingTypeTraits for every type they wish to use:

#include <string>
#include <iostream>
#include <vector>

template <>
struct SomethingTypeTraits<int>
{
    enum { magic_number = 1 };
};

template <>
struct SomethingTypeTraits<std::string>
{
    enum { magic_number = 2 };
};

int main(int, char *[])
{
    std::vector<Something> things = { std::string("foo"), 123 };
    std::cout << things[0].as<std::string>() << " " << things[1].as<int>() << std::endl;
}

No RTTI at all there, and it is type safe, but it relies upon a lot of boilerplate from the library user, and that boilerplate is very easy to screw up: give two types the same magic number and the check silently passes for the wrong type. So, we’ll allocate magic numbers automatically instead:

#include <memory>

class SomethingIsSomethingElse
{
};

class Something
{
    private:
        static int next_magic_number()
        {
            static int magic(0);
            return magic++;
        }

        template <typename T_>
        static int magic_number_for()
        {
            static int result(next_magic_number());
            return result;
        }

        struct SomethingValueBase
        {
            int magic_number;

            SomethingValueBase(const int m) :
                magic_number(m)
            {
            }

            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                SomethingValueBase(magic_number_for<T_>()),
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            if (magic_number_for<T_>() != _value->magic_number)
                throw SomethingIsSomethingElse();
            return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
        }
};

How does this work? Each instantiation of the magic_number_for<T_> function needs to return the same magic number every time it is called. The first time any particular instantiation is called, its static int result requests the next magic number. On subsequent calls, the allocated number is remembered. (Note that static values inside a template are not shared between different instantiations of that template.) Finally, next_magic_number just returns a new magic number every time it is called.
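The allocation trick is easier to see in isolation. Here is a sketch of just the two functions, lifted out of the class as free functions (that change is mine, purely so they can be called directly):

```cpp
// Hands out 0, 1, 2, ... in the order it is called.
inline int next_magic_number()
{
    static int magic(0);
    return magic++;
}

template <typename T_>
int magic_number_for()
{
    // One static per instantiation, initialised on that
    // instantiation's first call and never changed afterwards.
    static int result(next_magic_number());
    return result;
}
```

Calling magic_number_for<int>() twice gives the same number both times, and magic_number_for<double>() gives a different one. A few caveats follow from the design. Which number a type gets depends on the order of first calls, so the numbers mean nothing outside the running program and shouldn’t be stored anywhere. int and const int are different instantiations, so they get different numbers. And although C++11 makes the initialisation of each static thread safe, two different instantiations being initialised at the same time would both touch the plain int counter; if that matters, something like std::atomic<int> for the counter would be one way around it.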

And there we have it: fast runtime type checking with no boilerplate and no RTTI. What we’ve done here is more or less useless, but the techniques do have other applications. For the curious, std::use_facet<> is probably the most common, and anyone brave enough to delve into its design will eventually see why this isn’t either pointless wankery or reinventing the wheel. For the rest, if you think that using RTTI can solve your problem adequately, then it probably can, and you don’t need to go into the kind of devious trickery the standard library uses internally.
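For a taste of what the library side looks like to a user, here is a small sketch of a custom facet. GreetingFacet and greeting_from are made up for illustration; the only real library pieces are std::locale, std::locale::facet, std::locale::id, std::has_facet and std::use_facet. The static id member plays roughly the role that magic_number_for<T_>() plays above: there is one per facet type, and implementations commonly turn it into an index the first time it is used (how exactly is up to the implementation).

```cpp
#include <locale>
#include <string>

// Hypothetical facet, for illustration only.
class GreetingFacet :
    public std::locale::facet
{
    public:
        // One id object per facet type. The library uses it as the key
        // under which this facet is stored in, and looked up from, a locale.
        static std::locale::id id;

        explicit GreetingFacet(const std::string & g) :
            std::locale::facet(0), // refs == 0: the locale owns the facet
            _greeting(g)
        {
        }

        const std::string & greeting() const
        {
            return _greeting;
        }

    private:
        std::string _greeting;
};

std::locale::id GreetingFacet::id;

// Returns the greeting stored in loc, or an empty string if loc has
// no GreetingFacet installed.
std::string greeting_from(const std::locale & loc)
{
    if (! std::has_facet<GreetingFacet>(loc))
        return "";
    return std::use_facet<GreetingFacet>(loc).greeting();
}
```

A locale built with std::locale(std::locale(), new GreetingFacet("hello")) will hand the facet back through std::use_facet<GreetingFacet>, with no dynamic_cast needed in user code.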