Ciaran McCreesh’s Blag

Now with 17% more caffeine

Archive for November, 2009

C++ Explicit Template Instantiation Hate Redux

Posted by Ciaran McCreesh on November 27, 2009

Today’s hatred of C++ is brought to you by the section [temp.explicit]:

A definition of a class template or class member template shall be in scope at the point of the explicit instantiation of the class template or class member template.

Unfortunately, it doesn’t say “explicit instantiation definition” there, so you can’t do an explicit instantiation declaration when you only have a class declaration available. I can’t figure out what changing this would break, or whether it’s just an omission (explicit instantiation declarations are new in C++0x, but explicit instantiations are not) or a deliberate restriction.

Whilst we’re on the subject, not being able to use typedef names when explicitly instantiating is still a pain in the arse too, although the implications of allowing that are almost certainly moderately icky.

Posted in hate

C++ Template Specialisation Hate

Posted by Ciaran McCreesh on November 26, 2009

Today’s annoying C++ feature is that partial specialisations of a nested type of a template class don’t work:

template <typename T_>
struct S;

template <typename T_>
struct T
{
    struct U;
};

template <typename T_>
struct S<typename T<T_>::U>
{
};

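Depending upon your compiler, the specialisation will either be rejected with a highly cryptic error message, or accepted but ignored. I don’t seem to be able to find the part of the standard that bans doing this, either, but that doesn’t necessarily mean it’s legal… The likely culprit is template argument deduction: T_ appears only to the left of the :: in a qualified type name, which is a non-deduced context, so no argument to S can ever be matched against the pattern.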

The solution, in any case, is to hoist the nested class out of the template, and use a typedef instead:

template <typename T_>
struct S;

template <typename T_>
struct T_U;

template <typename T_>
struct T
{
    typedef T_U<T_> U;
};

template <typename T_>
struct S<T_U<T_> >
{
};

I’ve been of the opinion that nested classes are generally far more pain than they’re worth for a while now (they also can’t be forward-declared); I’m highly tempted to just stop using them anywhere at all, and switch exclusively to using typedefs.

Posted in hate

This Week in Python Stupidity: os.stat, os.utime and Sub-Second Timestamps

Posted by Ciaran McCreesh on November 15, 2009

The primary design principle behind the Python programming language is to take everything that’s horrible and wrong with Perl and get it horrible and wrong in a completely different and even more hideous way. Today, however, we shall be looking at a particularly egregious case of stupidity the likes of which not even PHP has managed to replicate.

On Unix, timestamps have traditionally been held as an integer number of seconds since the epoch. The modification time for a file is one place such a timestamp has been used. Two groups of system calls are of interest to us here.

First, stat (and its fstat and lstat variants). The stat system call places information about a file into a struct also named stat (which is possible thanks to a lesser case of brain damage in C’s design). To get the mtime of a file, historically we would have used the st_mtime field, which is of type time_t, which is an integer of some kind:

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char * argv[])
{
    struct stat s;
    if (-1 == stat("timmy", &s))
        return EXIT_FAILURE;

    printf("stat.st_mtime for timmy is %ld\n", (long) s.st_mtime);
    return EXIT_SUCCESS;
}

Sometimes we might want to modify a file, but not affect its mtime. Thus, we need a way to set a file’s mtime to a given value, and to do this we would historically have used a function from the utime family:

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
#include <utime.h>
#include <fcntl.h>

int main(int argc, char * argv[])
{
    struct stat s;
    if (-1 == stat("timmy", &s))
        return EXIT_FAILURE;

    int fd;
    fd = open("timmy", O_WRONLY | O_TRUNC | O_CREAT, 0644);
    if (-1 == fd)
        return EXIT_FAILURE;
    if (0 != close(fd))
        return EXIT_FAILURE;

    struct utimbuf times = { .actime = s.st_atime, .modtime = s.st_mtime };
    if (-1 == utime("timmy", &times))
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}

(Sidenote: the above almost certainly should be using fstat and futimes instead to avoid race conditions, but this is irrelevant for our examples.)

But all of this operates only on a second-precision basis. For many applications this is no longer sufficient. Fortunately, some kernels and filesystems now support nanosecond-resolution timestamps.

First, for the stat family: rather than using st_mtime, we now use st_mtim, which is a struct timespec:

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char * argv[])
{
    struct stat s;
    if (-1 == stat("timmy", &s))
        return EXIT_FAILURE;

    printf("stat.st_mtim for timmy is %lds %ldns\n",
            (long) s.st_mtim.tv_sec, s.st_mtim.tv_nsec);
    return EXIT_SUCCESS;
}

And if our filesystem supports it, we get something like:

$ ./mtimens
stat.st_mtim for timmy is 1258321672s 173919603ns

As we can see, running our old utime-using code preserves the seconds but not the nanoseconds:

$ touch timmy
$ ./mtimens
stat.st_mtim for timmy is 1258321978s 62671870ns
$ ./utime
$ ./mtimens
stat.st_mtim for timmy is 1258321978s 0ns

To modify the file while preserving nanoseconds, we use either utimensat or futimens:

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
#include <utime.h>
#include <fcntl.h>

int main(int argc, char * argv[])
{
    struct stat s;
    if (-1 == stat("timmy", &s))
        return EXIT_FAILURE;

    int fd;
    fd = open("timmy", O_WRONLY | O_TRUNC | O_CREAT, 0644);
    if (-1 == fd)
        return EXIT_FAILURE;
    if (0 != close(fd))
        return EXIT_FAILURE;

    struct timespec times[2] = { s.st_atim, s.st_mtim };
    if (-1 == utimensat(AT_FDCWD, "timmy", times, 0))
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}

And now it works as expected:

$ touch timmy
$ ./mtimens
stat.st_mtim for timmy is 1258322326s 852774523ns
$ ./utimens
$ ./mtimens
stat.st_mtim for timmy is 1258322326s 852774523ns

Incidentally, POSIX.1-2008 considers the non-nanosecond-resolution functions and members to be deprecated, although since the nanosecond resolution functions aren’t universally available yet, a certain amount of autovoodoo is generally required…

Now we shall look at some Python. First, the old way:

import os

s = os.stat("timmy")

f = open("timmy", "w+")
f.close()

os.utime("timmy", (s.st_atime, s.st_mtime))

Now, to see if we can guess how the new way works:

>>> import os
>>> os.stat("timmy").st_mtim
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'posix.stat_result' object has no attribute 'st_mtim'

Mmm, nope. Time to consult the documentation. Nothing under stat, but there’s something interesting called stat_float_times:

stat_float_times([newvalue])

Determine whether stat_result represents time stamps as float objects. If newvalue is True, future calls to stat() return floats, if it is False, future calls return ints. If newvalue is omitted, return the current setting.

Uh oh. This can’t be good. Let’s look more closely at what happens when we run our code that uses stat.st_mtime and os.utime:

$ touch timmy
$ ./mtimens
stat.st_mtim for timmy is 1258324320s 762942258ns
$ python utime.py
$ ./mtimens
stat.st_mtim for timmy is 1258324320s 762942000ns
$ ./utimens 12345678901 111111111
$ ./mtimens
stat.st_mtim for timmy is 12345678901s 111111111ns
$ python utime.py
$ ./mtimens
stat.st_mtim for timmy is 12345678901s 111110000ns

What’s that, Lassie? Timmy has lost several significant digits of its sub-second mtime? Oh noes!

Yup, that’s right, Python’s underlying type for floats is an IEEE 754 double, which is only good for about sixteen decimal digits. With ten digits before the decimal point, that leaves six for sub-second resolutions, which is three short of the range required to preserve POSIX nanosecond-resolution timestamps. With dates after the year 2300 or so, that leaves only five accurate digits, which isn’t even enough to deal with microseconds correctly. Brilliant.
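To put numbers on that, here is a rough sketch in modern Python 3 (not the interpreter used in the session above). The helper names float_spacing and round_trip_ns are made up for illustration. It works out the gap between adjacent doubles near the two timestamps used earlier, and shows what is left of the nanoseconds after a trip through a float. It models only the float representation, not whatever os.utime does with the value afterwards.

```python
import math

NS_PER_S = 10**9


def float_spacing(x):
    """Gap between adjacent doubles around a positive, finite x."""
    # frexp gives x = m * 2**e with 0.5 <= m < 1; a double has 53 bits.
    _, exponent = math.frexp(x)
    return math.ldexp(1.0, exponent - 53)


def round_trip_ns(sec, nsec):
    """Store sec + nsec in one float, then split it back apart."""
    as_float = sec + nsec / NS_PER_S
    whole = int(as_float)
    return whole, round((as_float - whole) * NS_PER_S)


if __name__ == "__main__":
    # The two timestamps from the shell session above.
    for sec, nsec in [(1258324320, 762942258), (12345678901, 111111111)]:
        got_sec, got_nsec = round_trip_ns(sec, nsec)
        print("in: ", sec, nsec)
        print("out:", got_sec, got_nsec)
        print("spacing (ns):", float_spacing(float(sec)) * NS_PER_S)
```

Near the 2009 timestamp, adjacent doubles are 2^-22 seconds (roughly 238ns) apart; near the eleven-digit one they are 2^-19 seconds (roughly 1.9 microseconds) apart, so even microseconds are out of reach.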

Posted in python

How Round Is Your Circle?

Posted by Ciaran McCreesh on November 12, 2009

As everyone knows, a point is that which has no part, a line is a breadthless length, and it is possible to cut a sphere up into a finite number of pieces, move them around and put them back together again to get two spheres identical in every way to the original sphere.

Unfortunately, in reality, points have a size, lines have a width, spheres aren’t spheres and they can’t be cut up into infinitely complicated pieces. How Round Is Your Circle? Where Engineering and Mathematics Meet, by John Bryant and Chris Sangwin, is an attempt by engineers to convince mathematicians that caring about real world issues can be interesting.

The book covers various practical issues, such as:

  • How to draw a straight line, and how to make a ruler
  • How to test how circular a circle is
  • How to measure area

In the process, it discusses all kinds of cunning gadgetry used in the olden days before digital computers and mass production, from steam engine linkages and pistons to slide rules and draughting devices. It also covers various physical demonstrations of geometric problems, and illustrates what happens when physical inaccuracies are ignored:

64 = 65
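Assuming that equation refers to the well-known dissection puzzle (an 8×8 square cut into four pieces and apparently reassembled into a 5×13 rectangle), the missing unit of area is easy to find once you stop trusting the drawing. The sketch below is mine rather than the book's: the coordinates are those of the thin sliver left along the rectangle's diagonal, and shoelace_area is just the standard polygon area formula.

```python
from fractions import Fraction


def shoelace_area(points):
    """Area of a simple polygon given its vertices in order."""
    twice_area = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return Fraction(abs(twice_area), 2)


# Corners of the sliver along the diagonal of the 5 x 13 rectangle.
gap = [(0, 0), (8, 3), (13, 5), (5, 2)]

# Three slopes that a thick pencil line pretends are equal.
slopes = [Fraction(3, 8), Fraction(5, 13), Fraction(2, 5)]

if __name__ == "__main__":
    print("slopes:", ", ".join(str(s) for s in slopes))
    print("gap area:", shoelace_area(gap))
    print("8*8 + gap == 5*13:", 8 * 8 + shoelace_area(gap) == 5 * 13)
```

The pieces' edges have slopes 3/8 and 2/5 rather than the diagonal's 5/13, so they leave a long, thin parallelogram of area exactly one, which a drawn line of any real width hides nicely.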

It’s certainly an interesting read, although I would have preferred more emphasis on the tools and gadgets used than on demonstrations of things they can be used to make. The maths isn’t particularly heavy, and shouldn’t put too many people off. Similarly, nothing on the engineering side should be inaccessible to readers without an engineering background.

Posted in hardware

Automake and Parallel Tests

Posted by Ciaran McCreesh on November 5, 2009

One complaint occasionally encountered is that package test suites take too long to run. One of the packages for which that complaint is sometimes encountered is Paludis. The complainers rarely mention the difficulty of recovering a system from an error that should have been caught before installation, but still, running test suites faster cannot be a bad thing. One way to go about that is to make better use of parallelism.

There are a number of issues involved here. The first of these doesn’t apply to Paludis, since we use our own src_test in ebuilds and exhereses (for unrelated reasons), but other package maintainers may find it interesting: the default src_test in Gentoo EAPIs calls emake check -j1.

The reasons for this are historical: when Nick and I worked out the original src_test, a good number of the packages upon which we tested it hated parallel tests. Admittedly, it didn’t help that nearly all of the packages in question were using hand-rolled test suite runners… Also, those were the bad old days when nearly everyone was using a single core x86 CPU, and very few people cared enough to make sure their packages built in parallel.

Even now, the -j1 isn’t something we could just remove arbitrarily on Gentoo. When looking at it for Exherbo’s Exheres format, Ingmar found a non-trivial number of packages that hated not having -j1. For Gentoo, removing the -j1 would certainly require EAPI control.

But here’s the problem: the -j1 doesn’t just affect running the tests. It also affects building the tests. For Paludis, over half the compile time is spent building tests. Were that done with -j1 on a typical quad core box, it would double the compile time. In contrast to running the tests, building usually is parallel-safe, so the -j1 has considerable impact. Alas, most build systems don’t provide a target for building tests without running them.

Even without the -j1, the second issue looms large. The Automake test runner doesn’t parallelise test execution. Tests are run one after another in strict sequence regardless of the number of jobs make is allowed to use.

Fortunately, Automake 1.11 includes a new test runner that does support parallel execution. Alas, this test runner is only used by packages that explicitly request it, and it isn’t something that can be shoved into packages externally by the package manager.

Since every Paludis test is already safely parallelisable (no test does any work outside its individual temporary test directory), I’ve switched us over to using the new test runner. Doing so consisted of the following:

  • Adding AUTOMAKE_OPTIONS = parallel-tests to every Makefile.am. Note that this option cannot simply be set in the top level makefile.
  • Switching from using TESTS_ENVIRONMENT to LOG_COMPILER. The former still works, but is now reserved for use by the end user.
  • Working around an annoying ‘feature’ that prevents rules being generated for running tests for TESTS that include a variable set by configure.ac.
  • Doing some ungodly hacks with file descriptors to be able to output to stdout from the test script runner.
  • Splitting up some of the larger tests into multiple smaller tests, to avoid having no output for several minutes, and to increase parallelisability.

The LOG_COMPILER deserves further comment. To avoid massive confusion, when running tests in parallel, output is automatically redirected to a .log file. That’s all very well, but Automake is excessively quiet on this, and it looks a lot like nothing is happening whilst tests run. To work around this, the following ungodly hack appears to work:

LOG_COMPILER = \
    test "x$$BASH_VERSION" = x || \
        eval "exec 3<&1 ; export PALUDIS_TESTS_REAL_STDOUT_FD=3" ; \
    env \
        VARS_NEEDED_FOR_TESTS="whatever" \
        sh $(top_srcdir)/test/run_test.sh

Then, the test runner can output progress messages to stdout by using the $PALUDIS_TESTS_REAL_STDOUT_FD file descriptor (if the environment variable is set; unfortunately, some shells won’t let you do this), while leaving more verbose information to be logged as normal. Since POSIX guarantees that a write to a pipe of no more than PIPE_BUF bytes (at least 512) is atomic, we don’t have to worry about intermingled output so long as we keep our lines short.
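Our runner is a shell script, but the receiving end of this trick is easy to sketch in Python. This is an illustration only, not the actual run_test.sh: report_progress is a hypothetical helper, and only the environment variable name comes from the Makefile fragment above. It assumes a Unix system, where select.PIPE_BUF gives the size up to which pipe writes are atomic.

```python
import os
import select

# Name taken from the LOG_COMPILER fragment; everything else is made up.
FD_VARIABLE = "PALUDIS_TESTS_REAL_STDOUT_FD"


def report_progress(message, environ=os.environ):
    """Write one short line to the real stdout, if we were handed one.

    Returns True if the line was written, False if it was skipped.
    """
    fd_text = environ.get(FD_VARIABLE)
    if not fd_text:
        # The shell could not set up the extra descriptor: stay quiet.
        return False
    line = (message + "\n").encode("utf-8")
    if len(line) > select.PIPE_BUF:
        # Too long to be written atomically: leave it for the .log file.
        return False
    # A single write call, so lines from parallel tests cannot interleave.
    os.write(int(fd_text), line)
    return True


if __name__ == "__main__":
    report_progress("some_test: starting")
```

Anything that doesn't fit in a single short line simply goes to the normal output, which the new test runner has already redirected to the .log file.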

As for the annoying ‘feature’, things like this work fine with the old test runner, but not the new one:

TESTS = $(variable_set_by_configure)
LOG_COMPILER = blah

For Paludis, we use this kind of construct to run tidy on HTML files, where the list of HTML files is taken from some configure.ac voodoo. Fortunately, we can work around it, so long as all of the tests have a common file extension:

TESTS = $(variable_set_by_configure)
TEST_EXTENSIONS = .html
HTML_LOG_COMPILER = blah

Note the arbitrary and annoying case change for HTML.

The end result of this tinkering is that Paludis tests on a quad core box now take around two minutes to run rather than six.

I understand it is traditional when writing long, rambling and largely pointless blog posts about parallel builds to end with shameless whoring of an Amazon wishlist.

Posted in build systems, paludis internals

Exherbo Development Workflow

Posted by Ciaran McCreesh on November 3, 2009

In answer to what appears to be becoming a frequently asked question in #exherbo, my development workflow (and by extension, the one true workflow, any deviation from which is clearly heresy) is as follows:

For any repository I consider interesting, I have a local copy (but not a clone, because git clone is Satan’s work) in my home directory.

For Paludis, every repository it sees has its location under /var somewhere. Paludis is never pointed at a repository that is modified by anything other than itself.

For syncing, any repository I consider interesting is synced using sync = git+file:///home/users/ciaranm/repos/blah, with sync_options = --reset (as previously described). Others are synced as normal.

Before syncing normally, I pull all of the interesting repositories I have checked out in my home directory, so that Paludis ends up with everything up to date. The shell one-liner to do this is in my history, so it’s no additional work thanks to the wonder that is reverse-i-search.

On those rare occasions when I have to do some work on Exherbo that I can’t either just yell about until someone fixes it for me or force to be fixed by making Paludis reject it, I work as follows:

  • Changes are made and committed in my home directory copy of the repository.
  • Paludis is synced, picking up those changes.
  • Testing is done.
  • More changes are made and committed, since things never work as expected the first time.
  • Paludis is synced, picking up those changes.
  • And so on.
  • When things finally work, git rebase -i is used to turn all my messy work-in-progress commits into something suitable for pushing. Given that other people are often working on the repositories in question, this also rebases my changes against current master.
  • Things are pushed.
  • When syncing again, the --reset ensures that Paludis ends up with the history-rewritten result, not some horrible automatic merge of the end result and previous works in progress.

Fortunately, Git is easily powerful enough to handle this kind of thing, meaning Exherbo development workflows are designed around what works best, not around what is possible.

On a related note, I am still strongly considering making --reset the default one of these days. Anyone using paludis --sync on a repository they themselves modify should quickly justify their iniquity or risk being horribly surprised when the default changes.

Posted in exherbo

Paludis 0.42.2 Released

Posted by Ciaran McCreesh on November 3, 2009

Paludis 0.42.2 has been released:

  • An obscure resolver bug that results in ‘evolution’ trying to downgrade ‘gnupg’ on Gentoo has been fixed.
  • paludis -i foo::installed now gives an explanation of why it doesn’t do what some people seem to expect.
  • Assorted documentation tweaks.

Posted in paludis releases