Ciaran McCreesh’s Blag

Now with 17% more caffeine

Posts Tagged ‘paludis’

Automake and Parallel Tests

Posted by Ciaran McCreesh on November 5, 2009

One complaint occasionally encountered is that package test suites take too long to run. One of the packages for which that complaint is sometimes encountered is Paludis. The complainers rarely mention the difficulty of recovering a system from an error that should have been caught before installation, but still, running test suites faster cannot be a bad thing. One way to go about that is to make better use of parallelism.

There are a number of issues involved here. The first of these doesn’t apply to Paludis, since we use our own src_test in ebuilds and exhereses (for unrelated reasons), but other package maintainers may find it interesting: the default src_test in Gentoo EAPIs calls emake check -j1.

The reasons for this are historical: when Nick and I worked out the original src_test, a good number of the packages upon which we tested it hated parallel tests. Admittedly, it didn’t help that nearly all of the packages in question were using hand-rolled test suite runners… Also, those were the bad old days when nearly everyone was using a single core x86 CPU, and very few people cared enough to make sure their packages built in parallel.

Even now, the -j1 isn’t something we could just remove arbitrarily on Gentoo. When looking at it for Exherbo’s Exheres format, Ingmar found a non-trivial number of packages that hated not having -j1. For Gentoo, removing the -j1 would certainly require EAPI control.

But here’s the problem: the -j1 doesn’t just affect running the tests. It also affects building the tests. For Paludis, over half the compile time is spent building tests. Were that done with -j1 on a typical quad core box, it would double the compile time. In contrast to running the tests, building usually is parallel-safe, so the -j1 has considerable impact. Alas, most build systems don’t provide a target for building tests without running them.

Even without the -j1, the second issue looms large. The Automake test runner doesn’t parallelise test execution. Tests are run one after another in strict sequence regardless of the number of jobs make is allowed to use.

Fortunately, Automake 1.11 includes a new test runner that does support parallel execution. Alas, this test runner is only used by packages that explicitly request it, and it isn’t something that can be shoved into packages externally by the package manager.

Since every Paludis test is already safely parallelisable (no test does any work outside its individual temporary test directory), I’ve switched us over to using the new test runner. Doing so consisted of the following:

  • Adding AUTOMAKE_OPTIONS = parallel-tests to every Makefile.am. Note that this option cannot simply be set in the top level makefile.
  • Switching from using TESTS_ENVIRONMENT to LOG_COMPILER. The former still works, but is reserved for end users.
  • Working around an annoying ‘feature’ that prevents rules being generated for running tests for TESTS that include a variable set by configure.ac.
  • Doing some ungodly hacks with file descriptors to be able to output to stdout from the test script runner.
  • Splitting up some of the larger tests into multiple smaller tests, to avoid having no output for several minutes, and to increase parallelisability.

The LOG_COMPILER deserves further comment. To avoid massive confusion, when running tests in parallel, output is automatically redirected to a .log file. That’s all very well, but Automake is excessively quiet on this, and it looks a lot like nothing is happening whilst tests run. To work around this, the following ungodly hack appears to work:

LOG_COMPILER = \
    test "x$$BASH_VERSION" = x || \
        eval "exec 3<&1 ; export PALUDIS_TESTS_REAL_STDOUT_FD=3" ; \
    env \
        VARS_NEEDED_FOR_TESTS="whatever" \
        sh $(top_srcdir)/test/run_test.sh

Then, the test runner can output progress messages to stdout by using the $PALUDIS_TESTS_REAL_STDOUT_FD file descriptor (if the environment variable is set; unfortunately, some shells won’t let you do this), while leaving more verbose information to be logged as normal. Since POSIX guarantees that writes to a pipe of no more than PIPE_BUF bytes (at least 512) are atomic, we don’t have to worry about intermingled output so long as we keep our lines short.
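As a rough sketch of the receiving side, a test script could wrap its progress output in a helper like the one below. The helper name progress and the messages are hypothetical; only the PALUDIS_TESTS_REAL_STDOUT_FD variable comes from the LOG_COMPILER fragment above, and the real run_test.sh may well look different.

```shell
# Hypothetical helper: send a short progress line to the real stdout if the
# LOG_COMPILER hack exported a descriptor for it; otherwise fall back to
# ordinary stdout, which ends up in the .log file.
progress() {
    if [ -n "${PALUDIS_TESTS_REAL_STDOUT_FD:-}" ]; then
        # Keep lines short so that each write to the pipe stays atomic.
        echo "$*" >&"${PALUDIS_TESTS_REAL_STDOUT_FD}"
    else
        echo "$*"
    fi
}

progress ">>> starting test"   # progress message for the user
echo "verbose detail"          # ordinary output, logged as normal
```

Falling back to plain stdout means the progress lines are not lost on shells where the descriptor trick is unavailable; they simply land in the log with everything else.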

As for the annoying ‘feature’, things like this work fine with the old test runner, but not the new one:

TESTS = $(variable_set_by_configure)
LOG_COMPILER = blah

For Paludis, we use this kind of construct to run tidy on HTML files, where the list of HTML files is taken from some configure.ac voodoo. Fortunately, we can work around it, so long as all of the tests have a common file extension:

TESTS = $(variable_set_by_configure)
TEST_EXTENSIONS = .html
HTML_LOG_COMPILER = blah

Note the arbitrary and annoying case change for HTML.

The end result of this tinkering is that Paludis tests on a quad core box now take around two minutes to run rather than six.

I understand it is traditional when writing long, rambling and largely pointless blog posts about parallel builds to end with shameless whoring of an Amazon wishlist.

Posted in build systems, paludis internals

Exherbo Development Workflow

Posted by Ciaran McCreesh on November 3, 2009

In answer to what appears to be becoming a frequently asked question in #exherbo, my development workflow (and by extension, the one true workflow, any deviation from which is clearly heresy) is as follows:

For any repository I consider interesting, I have a local copy (but not a clone, because git clone is Satan’s work) in my home directory.

For Paludis, every repository it sees has its location under /var somewhere. Paludis is never pointed at a repository that is modified by anything other than itself.

For syncing, any repository I consider interesting is synced using sync = git+file:///home/users/ciaranm/repos/blah, with sync_options = --reset (as previously described). Others are synced as normal.

Before syncing normally, I pull all of the interesting repositories I have checked out in my home directory, so that Paludis ends up with everything up to date. The shell one-liner to do this is in my history, so it’s no additional work thanks to the wonder that is reverse-i-search.
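The one-liner itself isn’t reproduced here, but something along these lines would do the job. This is a sketch only: the function name pull_all and the idea of keeping every checkout directly below one directory are assumptions for illustration.

```shell
# Hypothetical sketch: run "git pull" in every Git checkout directly
# below the directory given as $1.
pull_all() {
    for repo in "$1"/*/; do
        # Skip anything that isn't a Git working copy.
        [ -d "${repo}.git" ] || continue
        ( cd "$repo" && git pull --quiet ) || echo "pull failed: $repo" >&2
    done
}

# Usage: pull_all "$HOME/repos"
```

Running each pull in a subshell keeps the caller’s working directory untouched, and a failure in one repository doesn’t stop the rest from being updated.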

On those rare occasions when I have to do some work on Exherbo that I can’t either just yell about until someone fixes it for me or force to be fixed by making Paludis reject it, I work as follows:

  • Changes are made and committed in my home directory copy of the repository.
  • Paludis is synced, picking up those changes.
  • Testing is done.
  • More changes are made and committed, since things never work as expected the first time.
  • Paludis is synced, picking up those changes.
  • And so on.
  • When things finally work, git rebase -i is used to turn all my messy work-in-progress commits into something suitable for pushing. Given that other people are often working on the repositories in question, this also rebases my changes against current master.
  • Things are pushed.
  • When syncing again, the --reset ensures that Paludis ends up with the history-rewritten result, not some horrible automatic merge of the end result and previous works in progress.
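In Git terms, the steps above might look roughly like this. It is a sketch under assumptions, not a transcript of what I type: the function names are made up, origin and master are assumed names for the remote and branch, and the argument to paludis --sync is whatever the repository is called in your configuration.

```shell
# Hypothetical: commit work in progress in the home directory copy, then
# sync so that Paludis picks the change up for testing.
try_change() {
    git commit -a -m wip || return
    paludis --sync "$1"
}

# Hypothetical: once things finally work, tidy up and publish.
publish_work() {
    # Pick up whatever other people have pushed in the meantime.
    git fetch origin || return
    # Squash the work-in-progress commits and replay them on current master.
    git rebase -i origin/master || return
    git push origin HEAD:master
}
```

After publish_work, the next sync with --reset leaves the Paludis copy equal to the rewritten history.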

Fortunately, Git is easily powerful enough to handle this kind of thing, meaning Exherbo development workflows are designed around what works best, not around what is possible.

On a related note, I am still strongly considering making --reset the default one of these days. Anyone using paludis --sync on a repository they themselves modify should quickly justify their iniquity or risk being horribly surprised when the default changes.

Posted in exherbo

Paludis 0.42.2 Released

Posted by Ciaran McCreesh on November 3, 2009

Paludis 0.42.2 has been released:

  • An obscure resolver bug that results in ‘evolution’ trying to downgrade ‘gnupg’ on Gentoo has been fixed.
  • paludis -i foo::installed now gives an explanation of why it doesn’t do what some people seem to expect.
  • Assorted documentation tweaks.

Posted in paludis releases

Paludis 0.42.1 Released

Posted by Ciaran McCreesh on October 31, 2009

Paludis 0.42.1 has been released:

  • Various improvements to error handling.
  • Syncing svn repositories no longer breaks when using weird locales.
  • GNU info handling will now work even if Paludis is run via sudo with env_reset enabled.

Posted in paludis releases

Paludis 0.42.0 Released

Posted by Ciaran McCreesh on October 27, 2009

Paludis 0.42.0 has been released:

  • Qualudis has been removed.
  • Various cached values are now forcibly discarded when they’re not going to be used any more, leading to paludis --owner and reconcilio using considerably less RAM.
  • Various bug, style and performance fixes related to updates (package moves etc). By default updates will still only be displayed but not carried out; consult the FAQ for details.

Posted in paludis releases

Paludis 0.42.0_alpha1 Released

Posted by Ciaran McCreesh on October 17, 2009

Paludis 0.42.0_alpha1 has been released:

  • The Git syncer now has various options for dealing with branches.
  • Experimental support for profile updates (package moves etc). By default updates will be displayed but not carried out; consult the FAQ for details.
  • Support for the recently changed EAPI 3 is present and used during tests but excluded from the install target.

Posted in paludis releases

Changes to the Paludis Git Syncer

Posted by Ciaran McCreesh on September 26, 2009

I’ve just committed two changes to the Paludis Git syncer.

The first allows you to do sync_options = --branch=foo to specify a particular branch. I don’t expect this to be widely used, since branching for repositories usually means you’re not taking advantage of the better facilities available for that kind of thing. Still, it has its uses.

The second is sync_options = --reset. Unlike with, say, rsync, syncing via Git will merge any changes you’ve made to the repository with new upstream changes (it uses git pull). With --reset, it will instead discard any local changes and just become equal to whatever you’re syncing against (using instead git fetch and git reset --hard).
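The difference between the two behaviours can be sketched as follows. This is not the syncer’s actual code: the function name sync_checkout is hypothetical, and resetting to '@{upstream}' is an assumption about which commit to reset to, which only works when the checkout’s branch tracks an upstream branch.

```shell
# Hypothetical sketch of the two sync behaviours, run inside the checkout.
sync_checkout() {
    if [ "$1" = "--reset" ]; then
        # Discard local commits and uncommitted changes alike, and become
        # equal to whatever we're syncing against.
        git fetch || return
        git reset --hard '@{upstream}'
    else
        # Default behaviour: merge upstream changes with local ones.
        git pull
    fi
}
```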

It’s a matter of considerable debate as to whether the reset behaviour is the right thing to do.

On the one hand, some people like working directly on a checkout to which Paludis is pointed, but still want cave sync to bring in updates.

On the other hand, doing that is horrible and evil, and a much better workflow is this:

  • Have a local checkout for development work. Commit your changes to it.
  • Have a separate checkout that is synced against your local checkout for Paludis use. After making changes, commit them to your local checkout and sync.
  • When happy with your changes, rebase and squash them in your local checkout, and then push them upstream.

Doing that requires --reset, since otherwise the checkout Paludis sees will end up as some horrible auto-merged mess that includes changes you thought you’d discarded ages ago.

I’m strongly considering making --reset the default sometime in the future. However, this will make syncing a destructive operation for anyone who hates puppies enough to be doing work on a Paludis-synced checkout (in the same way that it’s already destructive for rsync).

Posted in paludis for users

Paludis 0.40.1 Released

Posted by Ciaran McCreesh on September 17, 2009

Paludis 0.40.1 has been released:

  • Bugfix: sometimes fetch failures would not register as errors.

Posted in paludis releases

Ten Ways PMS Raped your Baby

Posted by Ciaran McCreesh on September 15, 2009

Since hating PMS seems to be back in fashion again this week, I thought I’d list ten of the stupidest claims that I’ve seen of late in the hope that some of the FUD might die down:

1. PMS slows down new features and prevents innovation

Actually, once new ebuild-usable features end up in Portage, they very quickly end up in a published EAPI. The reason you can’t use all the fancy new features in EAPI 3 that you’ve all been waiting for for so long is that Portage still hasn’t implemented them. In addition, we’ve gone from “EAPI 3 will be ready for Portage within a month” three months ago to “there’s an 80% chance EAPI 3 support will be ready in Portage by the end of the year”.

Time-wise, EAPI 3 has been waiting for Portage for six months and before that, for the Council to come to decisions for two months. The total overhead imposed by PMS was around four days, and those four days weren’t holding up anything else anyway.

Still, at least there’s a long way to go before EAPI 3 takes as long as it took Portage to get use dependencies.

Similarly, PMS isn’t to blame for profiles not being able to make use of new features. People who are telling you this are probably thinking about an undocumented Portage feature that isn’t in PMS and that isn’t supported by other package managers. This feature could very easily be in PMS, but there’s been no interest from the Council in retroactively adding it to EAPI 3 or in doing an EAPI 2.1 just to include that feature. The feature almost certainly will be in EAPI 4, but work on EAPI 4 isn’t going to start until Portage is done with EAPI 3. So again, PMS isn’t the reason you can’t use it.

2. PMS or EAPI is about ebuilds, not profiles

This one’s from people who haven’t bothered to read the opening of PMS, or who haven’t been paying attention to the Council.

PMS covers ebuilds and the tree format, including things like profiles. The aim is to cover everything necessary to produce a package manager that can use ebuilds (except possibly VDB, which probably shouldn’t be necessary for ebuild support but currently is…).

EAPI is used to indicate the rules used to handle ebuilds, and also profiles following the Council accepting Zac’s proposal last year.

3. PMS imposes an Exherbo agenda upon Gentoo

Exherbo doesn’t use PMS or Gentoo EAPIs.

4. PMS imposes a Paludis agenda upon Gentoo

Again, no. There’s no feature in any EAPI that’s there because of Paludis. Every feature in EAPIs 1, 2 and 3 was either requested by a Gentoo developer or included to make things easier for Portage. To get into an EAPI, features merely have to be vetted by the Portage team and the Council.

5. PMS is only used by Paludis anyway

Nope. PMS is used by both Portage and Paludis, as well as a number of third party libraries and utilities which don’t support full package management operations (things that need to compare two versions, for example, need to use PMS), and it was also used by Pkgcore.

Saying that PMS is only of use for third party package managers is like saying that the HTTP specification is irrelevant for Internet Explorer.

6. PMS stops other distributions from doing things

Again, no. Other distributions can ignore PMS entirely. Doing so would of course be a horrible idea, as all the people who wrote websites to work on Internet Explorer 5 found out, but that’s their decision to make. A much better option would be for those distributions to roll their own derived EAPIs, which, as happened for the Gentoo KDE project’s kdebuild-1, could be added to PMS. That way they are guaranteed that any undocumented features they rely upon won’t break with the next release, and they avoid complaints from users who want to use a different package manager. In doing so they avoid the problems encountered by people who wrote Internet Explorer 5 specific code rather than following the HTML specification.

7. PMS stops Gentoo from deciding its own features

PMS is run by a Gentoo developer and approval of new EAPI features is handled by the Portage team and the Gentoo council. For that matter, the PMS team has never rejected a single feature for inclusion in a future EAPI.

8. PMS introduces red tape

No, the previous Council introduced red tape, primarily because a couple of Council members refused to read any submissions more than five minutes before a meeting. Had the Council used the two weeks between meetings to read over and state their opinions on the EAPI 3 feature list, EAPI 3 would have been approved within two meetings rather than dragging on for months.

Unfortunately the current Council doesn’t seem to have improved, with at least one member showing up to a meeting having not read the agenda beforehand.

9. PMS imposed stupid ordering on metadata files

There’s a tendency amongst certain people to blame PMS for stupid or arbitrary rules. A typical example is to moan about PMS because it says that EAPI has to be on line 15 of metadata cache files. Quite how PMS is to blame for a decision that was made before PMS existed, and that was made because line 15 was the first available cache line when EAPI was introduced as a metadata key, is completely beyond me. Similarly, the rules for handling leading 0s in version numbers are Portage’s fault (although ultimately it’s Perl’s fault), as is any other format gripe you care to name.

10. PMS will stop me from using my favourite feature in configuration files

PMS doesn’t discuss user configuration at all. Handling of user configuration is left entirely to the package manager.

Posted in gentoo

New Resolver Data Structure Pictures, or, Why I Need Lots of Pens

Posted by Ciaran McCreesh on September 7, 2009

As some people may or may not have heard, one of the big Paludis projects we’ve been discussing for the past couple of years has been to come up with a super-amazing dependency resolver that can handle ABIs, binaries and chroots perfectly, provide complete customisation so people can do stupid things like “update everything except glibc”, cure cancer, be adapted to support arbitrary new features with no difficulty and explain all of its decisions in an easy to understand manner. Obviously, doing all of that at once is rather ambitious, so in the interests of it ever being finished, I’ve instead been working on a stupid but incrementally expandable resolver designed around:

  • Doing only the basics initially, but having a simple design that cleanly splits apart things like ID selection, dependency selection and ordering, even if doing so prevents certain short cuts from being taken. That way, when we add things in later, we don’t have to rely upon lots of subtle interactions between all the different components.
  • Making sure that we can explain exactly why we’ve done a particular thing, even if this means not including clever trickery.
  • Having easily accessible innards, meaning if people still insist upon having an “upgrade everything except glibc” option, we can easily move a very small amount of code out into a std::tr1::function and let clients handle it that way without having to pollute the resolver.

The basic features all now pretty much work, and cave resolve is usable on Exherbo (although not Gentoo at present, since I haven’t implemented virtuals handling). There’s no sensible error handling, several obvious optimisations haven’t been made, the UI is highly crude and there are no bells, whistles or cookies. Still, being able to do this is rather fun:

$ cave resolve gnome --explain libbonoboui:2
[snip]

Explaining requested decisions:

For gnome-platform/libbonoboui:2:
    The following constraints were in action:
      * >=gnome-platform/libbonoboui-2.1.1, use installed if possible, installing to /
        because of dependency >=gnome-platform/libbonoboui-2.1.1 from gnome-desktop/gnome-panel-2.26.3:0::gnome
      * >=gnome-platform/libbonoboui-2.1.1, use installed if possible, installing to /
        because of dependency >=gnome-platform/libbonoboui-2.1.1 from gnome-desktop/gnome-panel-2.26.3:0::gnome
      * >=gnome-platform/libbonoboui-2.13.1:2, use installed if possible, installing to /
        because of dependency >=gnome-platform/libbonoboui-2.13.1:2 from gnome-platform/libgnomeui-2.24.0:2::gnome
      * >=gnome-platform/libbonoboui-2.13.1:2, use installed if possible, installing to /
        because of dependency >=gnome-platform/libbonoboui-2.13.1:2 from gnome-platform/libgnomeui-2.24.0:2::gnome
    The decision made was:
        Use gnome-platform/libbonoboui-2.24.0:2::gnome
        Install to / using repository installed

Now to the important part: the pretty pictures!

Regular visitors to #exherbo may have noticed me moaning that I don’t have enough pens to implement their feature of choice. Here’s why:

Resolver Design 7

Since I can’t keep track of more than around five classes at once in my head, I have to have summaries written out on paper. Furthermore, each class summary has to be in a different colour (although my scanner’s done a fairly good job of hiding that in the picture above…), which means I need a pen (a proper fountain pen, or I can’t write with it) for each class. This in turn means that any new feature will likely require one or more additional pens, and I am more or less at my limit.

I also need a couple of colours spare to be able to scribble all over the diagrams, draw lines, change things and generally make a huge mess of things. An earlier design page now looks like this (and note that this is the most readable of the earlier design pages):

Resolver Design 5

On top of that, any problem too complicated to be solved in my head gets its own highly weird picture drawn out. Unfortunately the only example of this that I have handy (working out a circular dependency breaking algorithm) is on A3 paper, which I can’t easily scan…

I’ve found that working on paper for this kind of thing is much faster than working on a computer (writing’s as fast as typing, but the layout’s much quicker on paper, and scribbling over computerised designs doesn’t work). I don’t use a formal design system at this stage because it’s more pain than it’s worth, especially when there’s no need for other people to be able to read the design without being able to ask questions, although in some ways what I do is close to CRC cards with all the bits I don’t need ripped out.

I do not claim that my system is sane; merely that it works.

Posted in paludis internals