Ciaran McCreesh’s Blag

Now with 17% more caffeine

Category Archives: c++

C++ Explicit Template Instantiation Hate

In part two of a never-ending series on why I hate C++ but have to use it anyway because there’s nothing else, we come to explicit template instantiation.

Explicit template instantiation is a nuisance that only exists because all build systems suck. Unfortunately, doing everything implicitly makes compiles take way too long, so explicit instantiation is a pragmatic nuisance. So let’s have a header that doesn’t slow down the compiler too much:

#include <string>

template <typename Item_>
struct ItemMaker
{
    Item_ make_me_an_item() const;
};

typedef ItemMaker<int> IntMaker;
typedef ItemMaker<std::string> StringMaker;

And then an implementation:

template <typename Item_>
Item_
ItemMaker<Item_>::make_me_an_item() const
{
    return Item_();
}

But then we need explicit instantiation. Should be no problem, right?

template class IntMaker;
template class StringMaker;

Wrong! An explicit instantiation has to name the template and spell out its arguments; a typedef name for the specialisation isn’t accepted, so we have to copy things out all over again:

template class ItemMaker<int>;
template class ItemMaker<std::string>;

Yes, thanks for that, Standard guys.
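
For reference, here is a minimal single-file sketch of how the pieces fit together. The comments mark where each piece would live in a real header / source split. The extern template lines are an assumption on my part (C++0x explicit instantiation declarations, not something the header above uses), and check_item_makers is a made-up name for illustration. Note that the typedefs are fine everywhere except in the explicit instantiations themselves.

```cpp
#include <cassert>
#include <string>

// --- would live in the header ---
template <typename Item_>
struct ItemMaker
{
    Item_ make_me_an_item() const;
};

typedef ItemMaker<int> IntMaker;
typedef ItemMaker<std::string> StringMaker;

// Assumption: C++0x explicit instantiation declarations, telling users of
// the header not to instantiate these specialisations implicitly.
extern template class ItemMaker<int>;
extern template class ItemMaker<std::string>;

// --- would live in the implementation file ---
template <typename Item_>
Item_
ItemMaker<Item_>::make_me_an_item() const
{
    return Item_();
}

// The template name and arguments are spelled out in full here; the
// typedef names are not accepted in an explicit instantiation.
template class ItemMaker<int>;
template class ItemMaker<std::string>;

// --- client code: the typedefs work for everything else ---
inline void check_item_makers()
{
    assert(IntMaker().make_me_an_item() == 0);
    assert(StringMaker().make_me_an_item().empty());
}
```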

C++ Overload Resolution Hate

Sometimes, C++’s overload resolution rules are a pain in the ass.

Let’s say we have the following:

#include <tr1/memory>

struct Bar
{
    Bar()
    {
    }
};

struct Baz
{
    Baz()
    {
    }
};

struct Foo
{
    explicit Foo(const std::tr1::shared_ptr<const Bar> &)
    {
    }

    explicit Foo(const std::tr1::shared_ptr<const Baz> &)
    {
    }
};

Then the following is ambiguous, and won’t compile:

Foo foo(std::tr1::shared_ptr<Bar>(new Bar));

Here’s why: Neither constructor exactly matches the argument given, so the compiler falls back to construction and type conversions. std::tr1::shared_ptr<T_> has an implicit constructor template <typename U_> shared_ptr(const shared_ptr<U_> &), which is good because it lets you use a shared pointer to a derived class when a shared pointer to a base class is expected. But that conversion can take place for all U_, which means the compiler doesn’t know whether you want to convert to a shared pointer to const Bar or const Baz — it isn’t until the constructor body is instantiated that the compiler finds that only one of the two conversions will compile successfully.

So, one has to be explicit when creating the shared pointer:

Foo foo(std::tr1::shared_ptr<const Bar>(new Bar));
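
To see the effect in isolation, here is a sketch using a toy pointer class rather than the real TR1 one. NaivePtr is hypothetical and non-owning; all it shares with the shared pointer described above is a converting constructor that is declared for every U_. The std::is_constructible checks are an assumption too (they need a C++0x <type_traits>), but they let the compiler confirm which forms are ambiguous without breaking the build.

```cpp
#include <cassert>
#include <type_traits>

struct Bar { Bar() {} };
struct Baz { Baz() {} };

// Toy, non-owning stand-in for a shared pointer whose converting
// constructor is declared for all U_, convertible or not.
template <typename T_>
class NaivePtr
{
    private:
        T_ * _p;

    public:
        explicit NaivePtr(T_ * p) : _p(p) {}

        // The body would fail to compile for unrelated types, but overload
        // resolution only looks at the declaration.
        template <typename U_>
        NaivePtr(const NaivePtr<U_> & other) : _p(other.get()) {}

        T_ * get() const { return _p; }
};

struct Foo
{
    explicit Foo(const NaivePtr<const Bar> &) {}
    explicit Foo(const NaivePtr<const Baz> &) {}
};

// From a NaivePtr<Bar>, both constructors need one user-defined
// conversion, so neither is better: ambiguous.
static_assert(! std::is_constructible<Foo, NaivePtr<Bar> >::value,
        "pointer to non-const Bar should be ambiguous");

// From a NaivePtr<const Bar>, the first constructor is an exact match.
static_assert(std::is_constructible<Foo, NaivePtr<const Bar> >::value,
        "pointer to const Bar should be unambiguous");

inline bool explicit_form_compiles()
{
    static Bar bar;
    NaivePtr<const Bar> p(&bar);
    Foo foo(p);
    (void) foo;
    return true;
}
```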

Except, usually we create shared pointers using a helper function, to avoid specifying the type name twice:

template <typename T_>
std::tr1::shared_ptr<T_> make_shared_ptr(T_ * const t)
{
    return std::tr1::shared_ptr<T_>(t);
}

So we’re stuck having to use a slightly weird looking allocation:

Foo foo(make_shared_ptr(new const Bar));

Incidentally, C++0x has a std::make_shared which is a lot better than this, but it requires rvalue references and std::forward to work. It would look like this:

Foo foo(std::make_shared<Bar>());

Or, if we still need the const:

Foo foo(std::make_shared<const Bar>());

Why might we not need the const? The current C++0x draft standard includes the wording “[the template] constructor shall not participate in the overload resolution unless U_ * is implicitly convertible to T_ *”, which presumably means implementations have to solve the problem using concepts to restrict the template constructor.
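
Concepts aren’t the only way to get that behaviour. A library can also meet the wording with std::enable_if, so that the converting constructor simply isn’t a candidate for unrelated types. The sketch below shows the idea on a toy class. ConstrainedPtr, StdFoo and picks_bar are hypothetical names, and the use of std::shared_ptr and <type_traits> assumes a C++0x standard library.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct Bar { Bar() {} };
struct Baz { Baz() {} };

// Toy, non-owning pointer whose converting constructor is constrained.
template <typename T_>
class ConstrainedPtr
{
    private:
        T_ * _p;

    public:
        explicit ConstrainedPtr(T_ * p) : _p(p) {}

        // SFINAE: this constructor drops out of overload resolution
        // unless U_ * is implicitly convertible to T_ *.
        template <typename U_,
                 typename = typename std::enable_if<
                     std::is_convertible<U_ *, T_ *>::value>::type>
        ConstrainedPtr(const ConstrainedPtr<U_> & other) : _p(other.get()) {}

        T_ * get() const { return _p; }
};

struct Foo
{
    bool took_bar;
    explicit Foo(const ConstrainedPtr<const Bar> &) : took_bar(true) {}
    explicit Foo(const ConstrainedPtr<const Baz> &) : took_bar(false) {}
};

// Same shape as Foo, but using the standard shared pointer.
struct StdFoo
{
    explicit StdFoo(const std::shared_ptr<const Bar> &) {}
    explicit StdFoo(const std::shared_ptr<const Baz> &) {}
};

static_assert(std::is_constructible<Foo, ConstrainedPtr<Bar> >::value,
        "only the const Bar overload should be viable");
static_assert(std::is_constructible<StdFoo, std::shared_ptr<Bar> >::value,
        "std::shared_ptr's converting constructor is constrained");

inline bool picks_bar()
{
    static Bar bar;
    ConstrainedPtr<Bar> p(&bar);
    Foo foo(p);

    // No const needed with the constrained standard pointer either.
    StdFoo std_foo(std::make_shared<Bar>());
    (void) std_foo;

    return foo.took_bar;
}
```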

And, of course, there’s one final gotcha. The new const Bar form, with no initialiser, is only legal if Bar has a user-declared default constructor. I have no idea why, but C++03 explicitly says so.

Making Paludis Compile with C++0x

I managed to get gcc 4.4 svn to compile, so I decided to see just how badly the experimental C++0x support would break Paludis. Turns out, not too badly. Firstly, things caught by increased strictness or general rearrangement of headers:

  • We had a few extra semicolons lying around. These now generate warnings, so we might as well shut them up. [fix]
  • We weren’t including <stdint.h> to get uintptr_t. Things were working by fluke because other headers were including it. [fix]
  • We were using ::rename rather than std::rename. [fix]

Then, the real issues:

  • n2246 adds a std::next. Paludis has a paludis::next. ADL means this sometimes causes confusion. To keep compatibility with non-0x compilers, we use using to get std::next into paludis:: where necessary. [fix]
  • std::list<>::push_back is now overloaded on rvalue references, so we can no longer easily get a PMF. If we were only interested in 0x, we’d use a lambda, but for backwards compatibility we write a wrapper function instead. (Or we could use the static_cast hack, but that’s horribly unreadable.) [fix]
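
For the push_back item, the three options look roughly like this. It’s a sketch rather than the actual Paludis change: push_back_wrapper and the copy_with_* functions are made-up names, and std::bind stands in for whatever binder the real code uses. The cast variant also relies on the library declaring push_back(const value_type &) exactly that way, which is true of the implementations I know of but isn’t something portable code should lean on.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <list>
#include <vector>

// Hypothetical wrapper: a single, non-overloaded function, so taking its
// address is unambiguous.
template <typename C_>
void push_back_wrapper(C_ & c, const typename C_::value_type & v)
{
    c.push_back(v);
}

inline std::list<int> copy_with_wrapper(const std::vector<int> & in)
{
    using namespace std::placeholders;
    std::list<int> out;
    std::for_each(in.begin(), in.end(),
            std::bind(&push_back_wrapper<std::list<int> >, std::ref(out), _1));
    return out;
}

// The static_cast hack: name the overload we want by its full type.
inline std::list<int> copy_with_cast(const std::vector<int> & in)
{
    using namespace std::placeholders;
    std::list<int> out;
    std::for_each(in.begin(), in.end(),
            std::bind(static_cast<void (std::list<int>::*)(const int &)>(
                    &std::list<int>::push_back), &out, _1));
    return out;
}

// The 0x-only version: a lambda, with no member function pointer at all.
inline std::list<int> copy_with_lambda(const std::vector<int> & in)
{
    std::list<int> out;
    std::for_each(in.begin(), in.end(), [&out] (int v) { out.push_back(v); });
    return out;
}
```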

All in all, not too bad. I suspect things will get a bit messier if a concept-enabled standard library makes it into the final proposal, but that can be dealt with later…

Bedtime Reading III

On-Demand Loading using Smart Pointers

Previously, I explained how to implement something like the Active Object thread pattern using smart pointers. Next we’ll use the same trick to implement on-demand, lazy construction.

There’s nothing difficult here, once we realise we can reuse the ‘return a different pointer’ trick. We make use of std::tr1::function rather than a raw function pointer so that parameter values can be bound at pointer-construction time. Again, we parameterise on pointer type, not raw type.

#include <tr1/functional>

template <typename T_>
class DeferredConstructionPtr
{
    private:
        mutable T_ _ptr;
        std::tr1::function<T_ ()> _f;
        mutable bool _done;

    public:
        DeferredConstructionPtr(const std::tr1::function<T_ ()> & f) :
            _ptr(),
            _f(f),
            _done(false)
        {
        }

        DeferredConstructionPtr(const DeferredConstructionPtr & other) :
            _ptr(other._ptr),
            _f(other._f),
            _done(other._done)
        {
        }

        DeferredConstructionPtr &
        operator= (const DeferredConstructionPtr & other)
        {
            if (this != &other)
            {
                _ptr = other._ptr;
                _f = other._f;
                _done = other._done;
            }
            return *this;
        }

        T_ operator-> () const
        {
            if (! _done)
            {
                _ptr = _f();
                _done = true;
            }

            return _ptr;
        }
};

Again, some caveats:

  • Dealing with const is left to the reader.
  • Dealing with thread safety is left to the next article.
  • If the constructor in question can throw exceptions, the exception will be thrown at what could be a rather non-obvious place. This may or may not be a problem.
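
A usage sketch, to show the construction really is deferred and happens only once. The class is a condensed restatement of the one above, using std::function and std::shared_ptr instead of the tr1 versions so that it builds with a plain C++0x library. Config, load_config and the file name are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Condensed restatement of DeferredConstructionPtr (std:: rather than
// std::tr1::, compiler-generated copy operations).
template <typename T_>
class DeferredConstructionPtr
{
    private:
        mutable T_ _ptr;
        std::function<T_ ()> _f;
        mutable bool _done;

    public:
        DeferredConstructionPtr(const std::function<T_ ()> & f) :
            _ptr(),
            _f(f),
            _done(false)
        {
        }

        T_ operator-> () const
        {
            if (! _done)
            {
                _ptr = _f();
                _done = true;
            }

            return _ptr;
        }
};

// Hypothetical expensive-to-build class that counts its constructions.
struct Config
{
    std::string path;

    explicit Config(const std::string & p) : path(p) { ++constructions(); }

    static int & constructions() { static int n = 0; return n; }
};

inline std::shared_ptr<Config> load_config(const std::string & path)
{
    return std::make_shared<Config>(path);
}

inline bool constructed_lazily_and_once()
{
    const int before = Config::constructions();

    // The parameter value is bound now; nothing is constructed yet.
    DeferredConstructionPtr<std::shared_ptr<Config> > cfg(
            std::bind(&load_config, std::string("example.conf")));
    const bool nothing_yet = (Config::constructions() == before);

    // First use triggers construction; second use reuses the result.
    const bool path_ok = (cfg->path == "example.conf");
    const std::string again = cfg->path;

    return nothing_yet && path_ok && again == "example.conf"
        && (Config::constructions() == before + 1);
}
```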

Bedtime Reading II

Implementing Active Objects using Smart Pointers

An Active Object is, essentially, a threaded design pattern where an object is always called from the same thread, and where a proxy provides synchronised access to that object. It’s useful in cases where there’s little or no parallelism possible due to the underlying object’s state, and where implementing manual locking would be a nuisance.

The problem in C++, generally, is implementing the proxy. If the underlying object’s class has twenty methods, you have to implement twenty trivial wrapper methods for the proxy that obtain the lock and then forward. Whilst not difficult to do, it’s rather tedious.

But what if the proxy is a smart pointer? Then you could do proxy->method() for any method the underlying class has, and you wouldn’t have to worry about writing wrapper methods. The question then is how to write Proxy<UnderlyingClass>::operator-> ().

It can’t simply return the underlying instance, since it needs to do locking. And it can’t obtain a lock and then return the underlying instance, since the lock would never be released.

But all is not lost. The standard has some rather interesting wording:

An expression x->m is interpreted as (x.operator->())->m for a class object x of type T if T::operator-> () exists and if the operator is selected as the best match function by the overload resolution mechanism.

Usually, implementations of operator-> () simply return SomeType *. But with the way the standard is worded, they could instead return a second class instance, so long as that new class has operator-> () defined.
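
A tiny sketch of that chaining, before any locking gets involved. Outer, Inner and Widget are hypothetical names. Outer’s operator-> returns an Inner by value, and the compiler keeps applying operator-> until it gets a raw pointer.

```cpp
#include <cassert>

struct Widget
{
    int value;

    int get() const { return value; }
};

class Inner
{
    private:
        Widget * _w;

    public:
        explicit Inner(Widget * w) : _w(w) {}

        // End of the chain: a raw pointer.
        Widget * operator-> () const { return _w; }
};

class Outer
{
    private:
        Widget * _w;

    public:
        explicit Outer(Widget * w) : _w(w) {}

        // Not a pointer, but a class that itself has operator->.
        Inner operator-> () const { return Inner(_w); }
};

inline int through_two_levels()
{
    Widget w = { 42 };
    Outer o(&w);

    // Interpreted as ((o.operator->()).operator->())->get()
    return o->get();
}
```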

So we make Proxy<UnderlyingClass>::operator-> () return a temporary that, upon construction, obtains the shared lock, and upon destruction releases it. Then we rely upon the temporary’s operator-> () returning a pointer to the underlying object to get the actual method call.

Except… It’s not that simple. With current C++, the temporary has to be copyable (even though the compiler optimises out the copy), but typically mutex locks are noncopyable. With C++0x it will be enough for the temporary to be movable, so it can hold a movable lock directly, but until then we have to store the lock in a shared pointer instead.

We’re going to start making use of this in Paludis to cut down on boilerplate code. A working implementation follows. We have a couple of refinements: the class is called ActiveObjectPtr, and it’s parameterised by a pointer to the underlying class (we’ll see why in a later post — generally it’ll just be a std::tr1::shared_ptr, but leaving it as a parameter lets us compose smart pointer types).

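#include <tr1/memory>

// Mutex and Lock are not defined here: any mutex class, plus a scoped lock
// that acquires it on construction and releases it on destruction, will do.
// make_shared_ptr is the helper from the overload resolution post.
template <typename T_>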
class ActiveObjectPtr
{
    private:
        T_ _ptr;
        std::tr1::shared_ptr<Mutex> _mutex;

        class Deref
        {
            private:
                const ActiveObjectPtr * _ptr;
                std::tr1::shared_ptr<Lock> _lock;

            public:
                Deref(const ActiveObjectPtr * p) :
                    _ptr(p),
                    _lock(make_shared_ptr(new Lock(*p->_mutex)))
                {
                }

                const T_ & operator-> () const
                {
                    return _ptr->_ptr;
                }
        };

        friend class Deref;

    public:
        ActiveObjectPtr(const T_ & t) :
            _ptr(t),
            _mutex(new Mutex)
        {
        }

        ActiveObjectPtr(const ActiveObjectPtr & other) :
            _ptr(other._ptr),
            _mutex(other._mutex)
        {
        }

        ~ActiveObjectPtr()
        {
        }

        ActiveObjectPtr &
        operator= (const ActiveObjectPtr & other)
        {
            if (this != &other)
            {
                _ptr = other._ptr;
                _mutex = other._mutex;
            }
            return *this;
        }

        Deref operator-> () const
        {
            return Deref(this);
        }
};

At this stage, it’s not really a proper smart pointer. It doesn’t (and probably shouldn’t) have operator*, and it ignores the const issue. Handling these is left as an easy exercise for the reader.
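
For comparison, here is roughly what the C++0x version could look like once the temporary only needs to be movable. This is a sketch under assumptions, not the Paludis code: LockingPtr is a made-up name, std::mutex and std::unique_lock stand in for Mutex and Lock, and the mutex type is a template parameter purely so that a hypothetical TracingMutex can show when the lock is held.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

template <typename T_, typename Mutex_ = std::mutex>
class LockingPtr
{
    private:
        T_ _ptr;
        std::shared_ptr<Mutex_> _mutex;

        class Deref
        {
            private:
                const T_ & _target;
                std::unique_lock<Mutex_> _lock;

            public:
                // Acquires the mutex for as long as this object lives.
                Deref(const T_ & t, Mutex_ & m) :
                    _target(t),
                    _lock(m)
                {
                }

                // Movable, not copyable: the lock is held directly.
                Deref(Deref &&) = default;

                const T_ & operator-> () const
                {
                    return _target;
                }
        };

    public:
        explicit LockingPtr(const T_ & t) :
            _ptr(t),
            _mutex(std::make_shared<Mutex_>())
        {
        }

        Deref operator-> () const
        {
            return Deref(_ptr, *_mutex);
        }
};

// Hypothetical instrumented mutex: records whether it is currently held.
struct TracingMutex
{
    static bool & locked() { static bool l = false; return l; }

    void lock() { locked() = true; }
    void unlock() { locked() = false; }
};

struct Counter
{
    int value;
    bool saw_lock;

    Counter() : value(0), saw_lock(false) {}

    void increment() { ++value; saw_lock = TracingMutex::locked(); }
};

inline bool lock_spans_exactly_the_call()
{
    std::shared_ptr<Counter> counter = std::make_shared<Counter>();
    LockingPtr<std::shared_ptr<Counter>, TracingMutex> p(counter);

    // The Deref temporary lives until the end of this full expression.
    p->increment();

    return counter->value == 1 && counter->saw_lock && ! TracingMutex::locked();
}
```

With the default Mutex_ this uses a real std::mutex, so LockingPtr<std::shared_ptr<Counter> > is the form you’d actually share between threads.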

Highly Evil C++

“Elements of Programming” and the ↦ symbol

I’m working my way through the draft of Elements of Programming. And I have a gripe. A small gripe, admittedly, but a gripe nonetheless.

The book uses a lot of symbols. I don’t have a problem with that, and in general it’s more readable that way than spelling everything out. Most of the symbols are defined in an appendix as well, which is also good. But there is one that is not: the ↦ symbol (or in ASCII art, |-->). Here’s an example of its use, taken from page 123:

Definition
abstraction(Op : BinaryOperation)
associative : Op → bool
    op ↦ (∀a, b, c ∈ Domain(op)) op(a, op(b, c)) = op(op(a, b), c)

Whilst the meaning is obvious, I can’t work out how the symbol is supposed to be read. I can’t find that symbol used in reference literature either. The best I can come up with is “Is such that”, but that’s rather clumsy…

Update: Looks like it’s “maps to”. Essentially the associative : line defines the type signature for the associative property, and the last line defines it in terms of a lambda. Looks like mathematicians came up with yet another way of defining functions that I managed to miss. Give me back my λ!
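
Read that way, the definition translates fairly directly into code: the first line is the signature, and everything after the ↦ is the function body. Here’s a sketch, with the caveat that the book quantifies over the whole domain whereas this can only test a finite sample passed in by the caller. The names associative and small_domain are mine, and restricting things to int is an assumption to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// associative : Op -> bool
//     op |--> (for all a, b, c in domain) op(a, op(b, c)) = op(op(a, b), c)
template <typename Op_>
bool associative(Op_ op, const std::vector<int> & domain)
{
    for (std::size_t i = 0; i < domain.size(); ++i)
        for (std::size_t j = 0; j < domain.size(); ++j)
            for (std::size_t k = 0; k < domain.size(); ++k)
            {
                const int a = domain[i], b = domain[j], c = domain[k];
                if (op(a, op(b, c)) != op(op(a, b), c))
                    return false;
            }

    return true;
}

// A small sample of the integers to test against.
inline std::vector<int> small_domain()
{
    std::vector<int> d;
    for (int i = -3; i <= 3; ++i)
        d.push_back(i);
    return d;
}
```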

C++0x Concepts

Some interesting material on C++0x concepts:
