He's not dead, he's resting

Category Archives: exherbo

Classifying Repository Masks

In the olden days, when carpaskis roamed the earth and people still used CVS, Gentoo’s package.mask looked like this:


Then people started putting in comments:

# S. Lacker <> (1 Apr 2001)
# Randomly makes giant space monkeys attack you with
# pointy sticks on startup.

Since the comments were in a standard format, Portage then started parsing the comments to be able to show the text to users. Paludis also supports this, and Exherbo originally copied Gentoo’s convention.

This is of course disgusting. Recently Exherbo has switched to a new mask format:

) [[
    *author = [ S. Lacker <> ]
    *date = [ 29 Feb 2011 ]
    *token = [ testing ]
    *description = [ Seems to be a bit iffy. Needs more testing. ]

Which is consistent with how annotations are used in exheres-0. Of particular note is the token field.

When unmasking a package, it’s very easy to accidentally unmask more than what you were after. For example, you might be wanting to unmask testing releases for something, but not also unmask scm or insecure versions. Since masks on Gentoo aren’t classified, there’s no way of doing that. But on Exherbo it’s now possible — masks are now marked with a token (such as scm, testing, broken or insecure), and in package_unmask.conf users can do this:

# Unmask only testing versions (and not scm)
x11-drivers/xf86-video-nouveau testing
x11-dri/libdrm testing

# Ignore security masks for PHP, since otherwise we'd never be
# able to use it at all. But don't unmask scm or broken versions.
dev-lang/php security

If we can encourage users to make use of this, it should lead to a large reduction in people breaking their systems or getting horrible resolutions by overly broad unmasking.

Exherbo Development Workflow, Version 2

My original Exherbo Development Workflow post seems to have become the standard way of doing things. However, it does rather assume that you are developing on most repositories most of the time. When that’s not the case, a new feature named “sync suffixes” may be of use. With sync suffixes, a typical workflow now looks like this:

Repositories are configured as normal, with their sync set to point to the usual remote location. In addition, for any repository you are going to work on, you use a sync suffix to specify a local path too. For example:

sync = git:// local: git+file:///home/users/ciaranm/repos/arbor

where /home/users/ciaranm/repos/arbor is a personal copy of the repository that is entirely unrelated to the checkout Paludis uses.

Normally, when you sync, you’ll be syncing against upstream. But when you want to do some work:

  • Update your personal copy of the repository.
  • Work on and commit your changes.
  • Use cave sync --suffix local arbor to sync just that repository, and against your local checkout rather than upstream.
  • Test your changes.
  • Make fixes, commit, sync using the suffix etc until everything works.
  • Use the wonders of git rebase -i to tidy up your work into nice friendly pushable commits.
  • Push or submit a git format-patch for your changes.
  • Go back to syncing without the suffix.

Some things to note:

  • This only really works with Git, and only when using the default ‘reset’ sync mode.
  • You’re never manually modifying any repository which Paludis also modifies.
  • Unlike the original version of this workflow, you only need to keep your personal copies of repositories up to date when you work on them.
  • The suffix things work on sync_options too, if you need it. Thus, for branches, you can use sync_options = --branch=branch-on-upstream local: --branch=my-local-branch.

Configuring Repositories Automatically via RepositoryRepository

Paludis is aware of packages that are in repositories you don’t have configured thanks to the unavailable repository. However, once Paludis has shown you that the package you want is in a repository you don’t have configured, you need to set up a configuration file for that repository (and any repositories it requires) and then sync. This is more work than is really necessary.

Enter RepositoryRepository, also known as r^2. Conceptually, it works as follows:

As well as providing special packages for packages in unavailable repositories, the unavailable repository also now provides packages named ‘repository/blah’ for repositories you don’t have configured. The metadata for these packages includes dependency information etc, along with useful things like the repository’s sync URI.

A new repository, using format = repository, provides special packages for repositories you do have configured.

Repository packages in unavailable repositories can be ‘installed’ to repository repositories. ‘Installing’ a repository creates a configuration file for it, and then syncs the newly created repository.

The configuration file it creates is controlled by a simple template, so it can contain anything you want it to contain.

Exherbo users can follow the setup instructions to start using this. On Gentoo this functionality is not yet available, since we won’t be switching the generated unavailable data to the new format until we’re reasonably sure that everyone is using a Paludis release that supports it.

Exherbo Development Workflow

Update: also see Exherbo Development Workflow, Version 2.

In answer to what appears to be becoming a frequently asked question in #exherbo, my development workflow (and by extension, the one true workflow, any deviation from which is clearly heresy) is as follows:

For any repository I consider interesting, I have a local copy (but not a clone, because git clone is Satan’s work) in my home directory.

For Paludis, every repository it sees has location under /var somewhere. Paludis is never pointed at a repository that is modified by anything other than itself.

For syncing, any repository I consider interesting is synced using sync = git+file:///home/users/ciaranm/repos/blah, with sync_options = --reset (as previously described). Others are synced as normal.

Before syncing normally, I pull all of the interesting repositories I have checked out in my home directory, so that Paludis ends up with everything up to date. The shell one-liner to do this is in my history, so it’s no additional work thanks to the wonder that is reverse-i-search.

On those rare occasions when I have to do some work on Exherbo that I can’t either just yell about until someone fixes it for me or force to be fixed by making Paludis reject it, I work as follows:

  • Changes are made and committed in my home directory copy of the repository.
  • Paludis is synced, picking up those changes.
  • Testing is done.
  • More changes are made and committed, since things never work as expected the first time.
  • Paludis is synced, picking up those changes.
  • And so on.
  • When things finally work, git rebase -i is used to turn all my messy work-in-progress commits into something suitable for pushing. Given that other people are often working on the repositories in question, this also rebases my changes against current master.
  • Things are pushed.
  • When syncing again, the --reset ensures that Paludis ends up with the history-rewritten result, not some horrible automatic merge of the end result and previous works in progress.

Fortunately, Git is easily powerful enough to handle this kind of thing, meaning Exherbo development workflows are designed around what works best, not around what is possible.

On a related note, I am still strongly considering making --reset the default one of these days. Anyone using paludis --sync on a repository they themselves modify should quickly justify their iniquity or risk being horribly surprised when the default changes.

Introducing e4r, a Script to Make Vim Useless

As some of you may be aware, there exist a few dark heretics who have yet to embrace the One True Editor and instead go around peddling their evil ways (via m-x peddle-evil-ways) or their primitive superstitions (yes, ^O to save makes perfect sense!).

Although Exherbo would ordinarily go out of its way to smite those wicked sinners, unfortunately it seems that some of them are so set in their ways that even removing their satanic instruments from the basic install will not quell their iniquity. Thus, in a rare and display of compromise that risks turning Exherbo into… uh, no, wait, I can’t think of any distributions capable of sensible compromises. Anyway:

e4r is a small script that turns Vim into a very crude, minimally functional non-modal editor that approximately resembles Nano. It has no dependencies other than Vim, and is extremely tiny, making it practical to include it in stages for people whose time isn’t valuable enough for them to learn how to use an effective text editor.

Some screenshots:

Editing a file using e4r

Editing a file using e4r

Loading a file with e4r

Loading a file with e4r

e4r Help Menu

e4r Help Menu

Git format-patches welcome.

Feeding ERB Useful Variables: A Horrible Hack Involving Bindings

I’ve been playing around with Ruby to create Summer, a simple web packages thing for Exherbo. Originally I was hand-creating HTML output simply because it’s easy, but that started getting very very messy. Mike convinced me to give ERB a shot.

The problem with template engines with inline code is that they look suspiciously like the braindead PHP model. Content and logic end up getting munged together in a horrid, unmaintainable mess, and the only people who’re prepared to work with it are the kind of people who think PHP isn’t more horrible than an aborted Jacqui Smith clone foetus boiled with rotten lutefisk and served over a bed of raw sewage with a garnish of Dan Brown and Patricia Cornwell novels. So does ERB let us combine easy page layouts with proper separation of code?

Well, sort of. ERB lets you pass it a binding to use for evaluating any code it encounters. On the surface of it, this lets you select between the top level binding, which can only see global symbols, or the caller’s binding, which sees everything in scope at the time. Not ideal; what we want is to provide only a carefully controlled set of symbols.

There are three ways of getting a binding in Ruby: a global TOPLEVEL_BINDING constant, which we clearly don’t want, the Kernel#binding method which returns a binding for the point of call, and the Proc#binding method which returns a binding for the context of a given Proc.

At first glance, the third of these looks most promising. What if we define the names we want to pass through in a lambda, and give it that?

require 'erb'

puts"foo <%= bar %>").result(lambda do
    bar = "bar"

Mmm, no, that won’t work:

(erb):1: undefined local variable or method `bar' for main:Object (NameError)

Because the lambda’s symbols aren’t visible to the outside world. What we want is a lambda that has those symbols already defined in its binding:

require 'erb'

puts"foo <%= bar %>").result(lambda do
    bar = "bar"
    lambda { }

Which is all well and good, but it lets symbols leak through from the outside world, which we’d rather avoid. If we don’t explicitly say “make foo available to ERB”, we don’t want to use the foo that our calling class happens to have defined. We also can’t pass functions through in this way, except by abusing lambdas — and we don’t want to make the ERB code use rather than make_pretty(item). Back to the drawing board.

There is something that lets us define a (mostly) closed set of names, including functions: a Module. It sounds like we want to pass through a binding saying “execute in the context of this Module” somehow, but there’s no Module#binding_for_stuff_in_us. Looks like we’re screwed.

Except we’re not, because we can make one:

module ThingsForERB

puts"foo <%= bar %>").result(ThingsForERB.instance_eval { binding })

Now all that remains is to provide a way to dynamically construct a Module on the fly with methods that map onto (possibly differently-named) methods in the calling context, which is relatively straight-forward, then we can do this in our templates:

<% if summary %>
    <p><%=h summary %>.</p>
<% end %>


<table class="metadata">
    <% metadata_keys.each do | key | %>
            <th><%=h key.human_name %></th>
            <td><%=key_value key %></td>
    <% end %>


<table class="packages">
    <% package_names.each do | package_name | %>
            <th><a href="<%=h package_href(package_name) %>"><%=h package_name %></a></th>
            <td><%=h package_summary(package_name) %></td>
    <% end %>

Which gives us a good clean layout that’s easy to maintain, but lets us keep all the non-trivial code in the controlling class.

Exherbo Over Twice as Stable as Gentoo: A Totally Objective Study

Potential users often ask whether Exherbo is stable. To test this, I decided to reinstall everything on my Gentoo desktop and my Exherbo laptop. The results are as follows:

For Exherbo:

Summary of failures:

* net-misc/neon-0.28.3:0::arbor: failure
* dev-perl/IO-Socket-SSL-1.17:0::arbor: failure
* sys-apps/upstart-0.3.9:0::arbor: failure

Total: 390 packages, 387 successes, 0 skipped, 3 failures, 0 unreached

For Gentoo:

Summary of failures:

* sys-devel/flex-2.5.35:0::gentoo: failure
* sys-apps/coreutils-6.10-r2:0::gentoo: failure
* sys-libs/glibc-2.9_p20081201-r2:2.2::gentoo: failure
* dev-util/dejagnu-1.4.4-r1:0::gentoo: failure
* sys-devel/automake-1.9.6-r2:1.9::gentoo: failure
* dev-python/numeric-24.2-r6:0::gentoo: failure
* app-misc/g15daemon- failure
* app-misc/g15mpd-1.0.0:0::gentoo: skipped (dependency '>=app-misc/g15daemon-1.9.0' unsatisfied)
* dev-libs/boost-1.35.0-r2:0::gentoo: failure
* dev-libs/xerces-c-3.0.1:0::gentoo: failure
* dev-libs/xqilla-2.2.0:0::gentoo: skipped (dependency '>=dev-libs/xerces-c-3.0.1' unsatisfied)
* dev-util/bzr-1.12:0::gentoo: failure
* dev-util/mercurial-1.0.2:0::gentoo: failure
* media-video/mplayer-20090226.28734:0::gentoo: failure
* www-servers/lighttpd-1.4.20:0::gentoo: failure
* x11-wm/compiz-0.7.8-r2:0::gentoo: failure

Total: 833 packages, 817 successes, 2 skipped, 14 failures, 0 unreached

From this highly objective and totally fair study, we can conclude that Exherbo ~arch is 99.2% stable, whereas Gentoo ‘stable’ is merely 98.1% stable. This, alongside Exherbo’s worryingly disappearing lack of documentation, is an unfortunate trend that must be corrected before things start to get out of hand. I look forward to breaking everything at the earliest available opportunity.

Managing Accounts with the Package Manager

Paludis is a multi-format package manager. One beneficial side effect of this is that the core code is sufficiently flexible to make handling things that aren’t really ‘packages’ in the conventional sense very easy; in the past, this has been used to deliver unavailable, unwritten and unpackaged repositories.

One of the things Exherbo inherited from Gentoo without modification was user and group management. In Gentoo, this is done by functions called enewuser and enewgroup from eutils.eclass; a package that needs a user or group ID must call these functions from pkg_setup. Although usable, this is moderately icky; Exherbo can do better than that.

Really, user and group accounts are just resources. A package that needs a particular user ID can be thought of as depending upon that ID — the only disconnect is that currently dependencies are for packages, not resources. Can we find a way of representing resources as something like packages, in a way that makes sense?

Fortunately, the obvious solution works. Having user/paludisbuild and group/paludisbuild as packages makes sense; adding the user or group is equivalent to installing the appropriate package, and if the user or group is present on the system, it shows up as installed. Then, instead of calling functions, the exheres can just do:


What about defaults? Different users need different shells, home directories, groups and so on. We could represent these a bit like options, but there’s a better way.

If two or more ebuilds need the same user, they all have to do the useradd call. This means duplicating things like home directory information and preferred uid over lots of different ebuilds, which is bad. It would be better to place the users somewhere else. For Exherbo, we’ve gone with metadata/accounts/{users,groups}/*.conf. A user’s settings look something like this (the username is taken from a filename, so this would be metadata/accounts/users/paludisbuild.conf):

shell = /bin/bash
gecos = Used by Paludis for operations that require an unprivileged user
home = /var/tmp/paludis
primary_group = paludisbuild
extra_groups =
preferred_uid =

And a group, metadata/accounts/groups/paludisbuild.conf:

preferred_gid =

We only specify ’empty’ keys for demonstration purposes; ordinarily they would be omitted.

We automatically make users depend upon the groups they use. The existing dependency abstractions are sufficient for this. There’s a bit of trickery in Paludis to allow supplemental repositories to override user defaults found in their masters; details are in the source for those who care.

One more thing to note: with accounts specified this way, we can be sure that the package manager only manages relevant accounts. There’s no danger of having the package manager accidentally start messing with your user accounts.

So what are the implications?

  • We’re no longer tied to a particular method of adding users. If a user doesn’t want to use useradd and groupadd, they can write their own handler for the package manager to update users via LDAP or whatever. Paludis supports multiple handlers here.
  • Users who would rather manage a particular account manually can add it themselves, and the package manager will treat it as being already installed and won’t try to mess with it.
  • User and group defaults are in one place, not everywhere that uses them.
  • It’s much more obvious when an account is going to be added.
  • Accounts that are no longer required can be purged using the usual uninstall-unused mechanism.

And what does it look like?

$ paludis -pi test-pkg
Building target list...
Building dependency list...   

These packages will be installed:

* group/alsogroupdemo [N 0]
    Reasons: *user/accountsdemo-0:0::accounts
* group/groupdemo [N 0]
    Reasons: *user/accountsdemo-0:0::accounts
* group/thirdgroupdemo [N 0]
    Reasons: *user/accountsdemo-0:0::accounts
* user/accountsdemo [N 0]
    Reasons: *test-cat/test-pkg-2:2::ciaranm_exheres_test
    "A demo account"
* test-cat/test-pkg::ciaranm_exheres_test :2 [N 2] <target>
    -foo build_options: recommended_tests split strip
    "Dummy test package"

We can have a look at the accounts before they’re installed:

$ paludis -q accountsdemo groupdemo
* user/accountsdemo
    accounts:                0* {:0} 
    Username:                accountsdemo
    Description:             A demo account
    Default Group:           groupdemo
    Extra Groups:            alsogroupdemo thirdgroupdemo
    Shell:                   /sbin/nologin
    Home Directory:          /dev/null

* group/groupdemo
    accounts:                0* {:0} 
    Groupname:               groupdemo
    Preferred GID:           123

Note the dependencies:

$ paludis -qDM accountsdemo
* user/accountsdemo
    accounts:                0* {:0} 
    username:                accountsdemo
    gecos:                   A demo account
    default_group:           groupdemo
    extra_groups:            alsogroupdemo thirdgroupdemo
    shell:                   /sbin/nologin
    home:                    /dev/null
    dependencies:            group/alsogroupdemo, group/groupdemo, group/thirdgroupdemo
    location:                /var/db/paludis/repositories/ciaranm_exheres_test/metadata/accounts/users/accountsdemo.conf
    defined_by:              ciaranm_exheres_test

The install is fairly boring:

(4 of 5) Installing user/accountsdemo-0:0::accounts

* Executing phase 'merge' as instructed
>>> Installing user/accountsdemo-0:0::accounts using passwd handler
useradd -r accountsdemo -c 'A demo account' -G 'alsogroupdemo,thirdgroupdemo' -s '/sbin/nologin' -d '/dev/null'
>>> Finished installing user/accountsdemo-0:0::accounts

And once they’re installed:

$ paludis -q accountsdemo groupdemo
* user/accountsdemo
    installed-accounts:      0* {:0} 

* group/groupdemo
    installed-accounts:      0* {:0} 

Exherbo will be migrating to this new mechanism shortly — package manager support is already there (it was only a few hours’ work), so it’s just a case of gradually hunting down and killing those enew* function calls.

MYOPTIONS: It’s like IUSE with Candy

The exheres-0 package format, used primarily by Exherbo, calls what Gentoo calls ‘USE flags’ ‘options’. What PMS EAPIs call IUSE, exheres-0 calls MYOPTIONS.

Up until recently, this has just been a differently named variable, minus support for IUSE defaults because we hateses them. But now that Paludis has the choices API we’re not stuck using that format. The first extension is fairly simple:

MYOPTIONS="foo bar baz linguas: en en_GB fr"

This is much nicer than having to write out linguas:en linguas:en_GB etc., and is especially important for exheres-0 because SUBOPTIONS (USE_EXPAND) values must be explicitly listed.

The next step is flag descriptions. use.local.desc is rather crude, and XML is horrible, so we thought about re-using annotations:

    X [[ description = [ Build a graphical user interface ] ]]
    python [[ description = [ Build Python language bindings ] ]]
    linguas: en en_GB fr"

Any undescribed flag falls back to the global description — general consensus is to keep those, because they make it easier for a user to set up a fresh options.conf.

Whilst we’re at it, we might as well solve the conflicting options problem. In the good old days, use flags were used only when something was optional — that is, if support for foo also needed support for bar, USE="-foo bar" would just compile without bar. Unfortunately, a few people didn’t really like that, and even more unfortunately developers started doing horrible die calls in pkg_setup rather than coming up with a proper solution.

With half of the pkg_setup die calls eliminated by use dependencies, it seems a shame not to fix the other not-quite-half-because-of-a-few-obscure-things. There’s already pkg_pretend for exheres-0, which is a big improvement, but moving handling of the common cases into the package manager is cleaner.

So, we start with the simplest case: flags requiring other flags.

        gtk [[ requires = [ X python ] ]]
        qt [[ requires = X ]]
        motif [[ requires = X ]]

We might like SUBOPTIONS and negatives too:

        python [[ requires = [ -minimal ] ]]
            en_GB [[ requires = [ linguas: en ] ]]

There might be a case for “if blah is not enabled then …” requirements:

        -ncurses [[ requires = [ X ] ]]

although we have a nicer solution this particular case. Note that it’s ok to list the same flag multiple times, so the above can be written as:

        -ncurses [[ requires = [ X ] ]]

For convenience, we’d like to be able to apply the same requires annotation to multiple items:

            gtk [[ requires = python ]]
        ) [[ requires = X ]]

Here, gtk requires both X and python (although excessive mixing of things is discouraged for style reasons).

Sometimes, you have to select one of a number of flags:

        ) [[ number-selected = exactly-one ]]

Also allowed are at-least-one and at-most-one.

Sometimes requirements are conditional, too:

        X? (
                gtk [[ requires = python ]]
            ) [[
                number-selected = exactly-one
                requires = X

Although, for style reasons, this would end up looking more like:

        X [[ description = [ Include a GUI ] ]]
        python [[ description = [ Build Python language bindings ] ]]

        gtk [[ requires = python ]]
        ( gtk qt motif ) [[ requires = X ]]
        X? ( ( gtk qt motif ) [[ number-selected = exactly-one ]] )

As for how these are verified… They’re checked at --pretend --install time, right before pkg_pretend is run. Even if a requirement fails, though, pkg_pretend is still run, allowing us to show as many notices as necessary at the same time.

The failure is indicated to the user by the pkg_bad_options function. This probably won’t be overridden by very many packages, and those that do will almost certainly call default. The default output looks like this:

These packages will be installed:

* test-cat/test-pkg::ciaranm :1 [N 1] <target>
    X gtk -motif -python qt build_options: recommended_tests split strip
    "Dummy test package"

Total: 1 package (1 new)

 * The following option requirements are unmet for test-cat/test-pkg-1:
 *     Enabling option 'gtk' requires option 'python' to be enabled
 *     Exactly one of options ( gtk, qt, motif ) must be met

* Cannot continue with install due to the errors indicated above

And just think, all that without resorting to convoluted and incomprehensible set theory.

Dealing with Unwritten Packages

A while ago Exherbo came up with a solution for the “large number of repositories” problem. It turns out this solution works rather well, and I’m now convinced that anyone rushing off and assuming that Git on its own will magically solve anything other than CVS sucking is going in the wrong direction.

Paludis also has ways of tracking packages installed by hand. Whilst useful on Gentoo, this really comes into play on Exherbo, where we don’t want to end up with a massive unmaintainable tree full of packages used by only two people.

The logical next step is tracking packages that don’t exist yet.

An unimaginative person might think that the way to solve this would be to have a bug for each requested package. But that gets messy — it’s not integrated with the package manager, and requires considerable extra effort from the user. A while ago Bernd suggested something more interesting: tracking unwritten packages in a repository in a similar way to how we do unavailable packages.

The implications:

  • Most obviously, we can keep track of things we want that haven’t been written yet in a way that doesn’t involve leaving the package manager to look things up.
  • We can also use it to track version bumps that we know will take a while. Doing paludis --query '>=hugescarypackage-4', say, would show that it’s being worked upon.
  • We can also depend upon things that don’t exist, rather than leaving incomplete dependency strings around. This is fine in at least two cases — if a dependency is conditional upon an option that should probably be implemented but isn’t yet, we can add the option and make it unusable. And we can handle obscure suggested dependencies (e.g. git has lots of optional dependencies upon weird perl modules, so we can say “if you want support for git-send-email with SSL, you need to write such and such a package”).
  • Bored developers could simply paludis --query '*/*::unwritten' and get ideas for what to do.

Adding support to Paludis for this only took a couple of hours. So now Exherbo developers can use this:

format = unwritten
location = /var/db/paludis/repositories/unwritten
sync = git://
importance = -100

And see this:

$ paludis -q genesis
* sys-apps/genesis
    unwritten:               (1.0)X* {:0} 
    Description:             The daddy of all init systems
    Comment:                 We need an init system.
    Masked by unwritten:     Package has not been written yet

As with unavailable, installing behaves sensibly:

$ paludis -pi genesis
Building target list... 
Building dependency list...
Query error:
  * In program paludis -pi genesis:
  * When performing install action from command line:
  * When executing install task:
  * When building dependency list:
  * When adding PackageDepSpec 'sys-apps/genesis':
  * All versions of 'sys-apps/genesis' are masked. Candidates are:
    * sys-apps/genesis-1.0:0::unwritten: Masked by unwritten (Package has not been written yet)

Those interested in the repository format can browse the tree. The observant might notice that the file format is quite similar to the one used by unavailable repositories, and wonder whether I was feeling lazy and swiped a load of code rather than implementing a new repository from scratch.

In the spirit of silly buzzwords, one could argue that this increases distributivity and democratisation of the repository, since there’s nothing to stop motivated users from creating their own wishlist repositories. It wouldn’t be hard to make a simple web interface that lets users request and vote on packages and version bumps, automatically generating a repository that anyone can use. But fortunately Exherbo is a dictatorship and has no users, so we don’t have to put up with that kind of nonsense.