Blag

He's not dead, he's resting

Tag Archives: exheres-0

Classifying Repository Masks

In the olden days, when carpaskis roamed the earth and people still used CVS, Gentoo’s package.mask looked like this:

=some-cat/some-pkg-1.23

Then people started putting in comments:

# S. Lacker <slacker@gentoo.org> (1 Apr 2001)
# Randomly makes giant space monkeys attack you with
# pointy sticks on startup.
=some-cat/some-pkg-1.23
=some-cat/some-pkg-data-1.23

Since the comments were in a standard format, Portage then started parsing the comments to be able to show the text to users. Paludis also supports this, and Exherbo originally copied Gentoo’s convention.

This is of course disgusting. Recently Exherbo has switched to a new mask format:

(
    some-cat/some-pkg[=2.34]
    some-cat/some-pkg-data[=2.34]
) [[
    *author = [ S. Lacker <ingmar@exherbo.org> ]
    *date = [ 29 Feb 2011 ]
    *token = [ testing ]
    *description = [ Seems to be a bit iffy. Needs more testing. ]
]]

Which is consistent with how annotations are used in exheres-0. Of particular note is the token field.

When unmasking a package, it’s very easy to accidentally unmask more than what you were after. For example, you might be wanting to unmask testing releases for something, but not also unmask scm or insecure versions. Since masks on Gentoo aren’t classified, there’s no way of doing that. But on Exherbo it’s now possible — masks are now marked with a token (such as scm, testing, broken or insecure), and in package_unmask.conf users can do this:

# Unmask only testing versions (and not scm)
x11-drivers/xf86-video-nouveau testing
x11-dri/libdrm testing

# Ignore security masks for PHP, since otherwise we'd never be
# able to use it at all. But don't unmask scm or broken versions.
dev-lang/php security

If we can encourage users to make use of this, it should lead to a large reduction in people breaking their systems or getting horrible resolutions by overly broad unmasking.

Advertisements

Intuitive Packaging is Doing It Wrong

Donnie has taken time out of his busy schedule of managing Gentoo to comment on some possible design issues for EAPI 3. He believes that adding support for exheres-0 style DEFAULT_ parameters to ebuilds would result in a less intuitive packaging system, which he considers bad.

Unfortunately, both the term ‘intuitive’ and the conclusion are nonsense. Ebuilds are not intuitive, intuitiveness would not be a useful property for them to have, and allowing parametrisation of default_ functions would not alter any of this.

The only truly intuitive interface is the nipple.
– Jay Vollmer

Let’s look at what intuitive means:

in⋅tu⋅i⋅tive /ɪnˈtuɪtɪv, -ˈtyu-/ [in-too-i-tiv, -tyoo-]
-adjective

  1. perceiving by intuition, as a person or the mind.
  2. perceived by, resulting from, or involving intuition: intuitive knowledge.
  3. having or possessing intuition: an intuitive person.
  4. capable of being perceived or known by intuition.

Ok, let’s look at intuition:

in⋅tu⋅i⋅tion /ˌɪntuˈɪʃən, -tyu-/ [in-too-ish-uhn, -tyoo-]
–noun

  1. direct perception of truth, fact, etc., independent of any reasoning process; immediate apprehension.
  2. a fact, truth, etc., perceived in this way.
  3. a keen and quick insight.
  4. the quality or ability of having such direct perception or quick insight.

So apparently Donnie wants people to be able to write ebuilds without requiring rational thought. Whilst that would go some way towards explaining the state of the tree, it’s evident that ebuilds are not currently intuitive and should not be made intuitive.

What qualities, then, should ebuild design aspire to? Let’s start with these:

  1. Ebuilds should be as obvious as reasonably possible, given the complications of the underlying packaging system and the overall design requirements, to a person with an appropriate level of skill and access to the documentation.
  2. Ebuilds should work to reduce the amount of boilerplate and cut-and-paste duplication required.
  3. Ebuilds should take steps to catch and prevent common errors.

Looking at the first point, one may think it is too weak a requirement — why not “ebuilds should be accessible to your average user”? But then, why should it be?

If you think the average user should have to write ebuilds to be able to get their package manager to track a package they can build by hand — why? Why not simply improve the package manager to be able to track hand-built packages without ebuilds?

If you think the average user should be able to modify ebuilds to add in patches — why? Why not simply improve the package manager to make it easy for the user to add in patches to existing packages?

If you think it will help solve the developer shortage problem — why? There’s no shortage of badly written ebuilds sitting around in bugzilla; making it easier to create more badly written ebuilds won’t fix this. The problem Gentoo faces is how to get more high quality ebuilds, and doing that requires skilled developers who have read and understood the documentation.

Introducing DEFAULT_ parameters has no major effect either way on the first point.

The second and third points are where DEFAULT_ parameters kick in. The reason the default src_configure does something as opposed to nothing is that the something it does is enough for many ebuilds. If instead it were a no-op, a typical simple ebuild would be considerably longer.

Except, these days a lot of ebuilds have a few simple configure options controlled by use flags, so the default src_configure in EAPI 2 (or src_compile in EAPIs 0 and 1) is no good. DEFAULT_ parameters bring this proportion way down.

This brings us to why the default src_install is a no-op. For most packages, something along the lines of “if there’s a Makefile, make DESTDIR="${D}" install” is not enough. For a good proportion of packages, though, that plus an ebuild-supplied list of doc files would suffice.

Donnie claims that specifying things in variables this way is a major change in how ebuilds work. But there are already plenty of examples of things done in this style:

  • The S variable is a declarative parameter to the package manager’s “before we run a phase” functions.
  • Lots of eclasses make use of a DOCS variable.
  • Indeed, nearly all parameterisation of eclasses is done through variables. It could just as easily be done by callback or overridable functions, but developers haven’t opted to do so.

A perfect example of that last point: Donnie’s own x-modular eclass has a variable named PATCHES, which ebuilds set in global scope. If x-modular were using EAPI 3 with a src_prepare and exheres-0 style declarative patches lists, the package manager would already be providing exactly what he’s gone out of his way to implement.

So what gives, Donnie? Do you think your use of PATCHES was a design mistake that you will be correcting? And do you think all those other developers who have been doing the same kind of thing for years are fundamentally wrong?

Managing Accounts with the Package Manager

Paludis is a multi-format package manager. One beneficial side effect of this is that the core code is sufficiently flexible to make handling things that aren’t really ‘packages’ in the conventional sense very easy; in the past, this has been used to deliver unavailable, unwritten and unpackaged repositories.

One of the things Exherbo inherited from Gentoo without modification was user and group management. In Gentoo, this is done by functions called enewuser and enewgroup from eutils.eclass; a package that needs a user or group ID must call these functions from pkg_setup. Although usable, this is moderately icky; Exherbo can do better than that.

Really, user and group accounts are just resources. A package that needs a particular user ID can be thought of as depending upon that ID — the only disconnect is that currently dependencies are for packages, not resources. Can we find a way of representing resources as something like packages, in a way that makes sense?

Fortunately, the obvious solution works. Having user/paludisbuild and group/paludisbuild as packages makes sense; adding the user or group is equivalent to installing the appropriate package, and if the user or group is present on the system, it shows up as installed. Then, instead of calling functions, the exheres can just do:

DEPENDENCIES="
        build+run:
            user/paludisbuild
            group/paludisbuild
    "

What about defaults? Different users need different shells, home directories, groups and so on. We could represent these a bit like options, but there’s a better way.

If two or more ebuilds need the same user, they all have to do the useradd call. This means duplicating things like home directory information and preferred uid over lots of different ebuilds, which is bad. It would be better to place the users somewhere else. For Exherbo, we’ve gone with metadata/accounts/{users,groups}/*.conf. A user’s settings look something like this (the username is taken from a filename, so this would be metadata/accounts/users/paludisbuild.conf):

shell = /bin/bash
gecos = Used by Paludis for operations that require an unprivileged user
home = /var/tmp/paludis
primary_group = paludisbuild
extra_groups =
preferred_uid =

And a group, metadata/accounts/groups/paludisbuild.conf:

preferred_gid =

We only specify ’empty’ keys for demonstration purposes; ordinarily they would be omitted.

We automatically make users depend upon the groups they use. The existing dependency abstractions are sufficient for this. There’s a bit of trickery in Paludis to allow supplemental repositories to override user defaults found in their masters; details are in the source for those who care.

One more thing to note: with accounts specified this way, we can be sure that the package manager only manages relevant accounts. There’s no danger of having the package manager accidentally start messing with your user accounts.

So what are the implications?

  • We’re no longer tied to a particular method of adding users. If a user doesn’t want to use useradd and groupadd, they can write their own handler for the package manager to update users via LDAP or whatever. Paludis supports multiple handlers here.
  • Users who would rather manage a particular account manually can add it themselves, and the package manager will treat it as being already installed and won’t try to mess with it.
  • User and group defaults are in one place, not everywhere that uses them.
  • It’s much more obvious when an account is going to be added.
  • Accounts that are no longer required can be purged using the usual uninstall-unused mechanism.

And what does it look like?

$ paludis -pi test-pkg
Building target list...
Building dependency list...   

These packages will be installed:

* group/alsogroupdemo [N 0]
    Reasons: *user/accountsdemo-0:0::accounts
    "alsogroupdemo"
* group/groupdemo [N 0]
    Reasons: *user/accountsdemo-0:0::accounts
    "groupdemo"
* group/thirdgroupdemo [N 0]
    Reasons: *user/accountsdemo-0:0::accounts
    "thirdgroupdemo"
* user/accountsdemo [N 0]
    Reasons: *test-cat/test-pkg-2:2::ciaranm_exheres_test
    "A demo account"
* test-cat/test-pkg::ciaranm_exheres_test :2 [N 2] <target>
    -foo build_options: recommended_tests split strip
    "Dummy test package"

We can have a look at the accounts before they’re installed:

$ paludis -q accountsdemo groupdemo
* user/accountsdemo
    accounts:                0* {:0} 
    Username:                accountsdemo
    Description:             A demo account
    Default Group:           groupdemo
    Extra Groups:            alsogroupdemo thirdgroupdemo
    Shell:                   /sbin/nologin
    Home Directory:          /dev/null

* group/groupdemo
    accounts:                0* {:0} 
    Groupname:               groupdemo
    Preferred GID:           123

Note the dependencies:

$ paludis -qDM accountsdemo
* user/accountsdemo
    accounts:                0* {:0} 
    username:                accountsdemo
    gecos:                   A demo account
    default_group:           groupdemo
    extra_groups:            alsogroupdemo thirdgroupdemo
    shell:                   /sbin/nologin
    home:                    /dev/null
    dependencies:            group/alsogroupdemo, group/groupdemo, group/thirdgroupdemo
    location:                /var/db/paludis/repositories/ciaranm_exheres_test/metadata/accounts/users/accountsdemo.conf
    defined_by:              ciaranm_exheres_test

The install is fairly boring:

(4 of 5) Installing user/accountsdemo-0:0::accounts

* Executing phase 'merge' as instructed
>>> Installing user/accountsdemo-0:0::accounts using passwd handler
useradd -r accountsdemo -c 'A demo account' -G 'alsogroupdemo,thirdgroupdemo' -s '/sbin/nologin' -d '/dev/null'
>>> Finished installing user/accountsdemo-0:0::accounts

And once they’re installed:

$ paludis -q accountsdemo groupdemo
* user/accountsdemo
    installed-accounts:      0* {:0} 

* group/groupdemo
    installed-accounts:      0* {:0} 

Exherbo will be migrating to this new mechanism shortly — package manager support is already there (it was only a few hours’ work), so it’s just a case of gradually hunting down and killing those enew* function calls.

MYOPTIONS: It’s like IUSE with Candy

The exheres-0 package format, used primarily by Exherbo, calls what Gentoo calls ‘USE flags’ ‘options’. What PMS EAPIs call IUSE, exheres-0 calls MYOPTIONS.

Up until recently, this has just been a differently named variable, minus support for IUSE defaults because we hateses them. But now that Paludis has the choices API we’re not stuck using that format. The first extension is fairly simple:

MYOPTIONS="foo bar baz linguas: en en_GB fr"

This is much nicer than having to write out linguas:en linguas:en_GB etc., and is especially important for exheres-0 because SUBOPTIONS (USE_EXPAND) values must be explicitly listed.

The next step is flag descriptions. use.local.desc is rather crude, and XML is horrible, so we thought about re-using annotations:

MYOPTIONS="
    X [[ description = [ Build a graphical user interface ] ]]
    python [[ description = [ Build Python language bindings ] ]]
    nls
    linguas: en en_GB fr"

Any undescribed flag falls back to the global description — general consensus is to keep those, because they make it easier for a user to set up a fresh options.conf.

Whilst we’re at it, we might as well solve the conflicting options problem. In the good old days, use flags were used only when something was optional — that is, if support for foo also needed support for bar, USE="-foo bar" would just compile without bar. Unfortunately, a few people didn’t really like that, and even more unfortunately developers started doing horrible die calls in pkg_setup rather than coming up with a proper solution.

With half of the pkg_setup die calls eliminated by use dependencies, it seems a shame not to fix the other not-quite-half-because-of-a-few-obscure-things. There’s already pkg_pretend for exheres-0, which is a big improvement, but moving handling of the common cases into the package manager is cleaner.

So, we start with the simplest case: flags requiring other flags.

MYOPTIONS="
        X
        python
        gtk [[ requires = [ X python ] ]]
        qt [[ requires = X ]]
        motif [[ requires = X ]]
        "

We might like SUBOPTIONS and negatives too:

MYOPTIONS="
        minimal
        python [[ requires = [ -minimal ] ]]
        linguas:
            en
            en_GB [[ requires = [ linguas: en ] ]]
        "

There might be a case for “if blah is not enabled then …” requirements:

MYOPTIONS="
        X
        -ncurses [[ requires = [ X ] ]]
        "

although we have a nicer solution this particular case. Note that it’s ok to list the same flag multiple times, so the above can be written as:

MYOPTIONS="
        X
        ncurses
        -ncurses [[ requires = [ X ] ]]
        "

For convenience, we’d like to be able to apply the same requires annotation to multiple items:

MYOPTIONS="
        X
        python
        (
            gtk [[ requires = python ]]
            qt
            motif
        ) [[ requires = X ]]
        "

Here, gtk requires both X and python (although excessive mixing of things is discouraged for style reasons).

Sometimes, you have to select one of a number of flags:

MYOPTIONS="
        (
            gtk
            qt
            motif
        ) [[ number-selected = exactly-one ]]
        "

Also allowed are at-least-one and at-most-one.

Sometimes requirements are conditional, too:

MYOPTIONS="
        X
        python
        X? (
            (
                gtk [[ requires = python ]]
                qt
                motif
            ) [[
                number-selected = exactly-one
                requires = X
            ]]
        )
        "

Although, for style reasons, this would end up looking more like:

MYOPTIONS="
        X [[ description = [ Include a GUI ] ]]
        python [[ description = [ Build Python language bindings ] ]]
        gtk
        qt
        motif

        gtk [[ requires = python ]]
        ( gtk qt motif ) [[ requires = X ]]
        X? ( ( gtk qt motif ) [[ number-selected = exactly-one ]] )
        "

As for how these are verified… They’re checked at --pretend --install time, right before pkg_pretend is run. Even if a requirement fails, though, pkg_pretend is still run, allowing us to show as many notices as necessary at the same time.

The failure is indicated to the user by the pkg_bad_options function. This probably won’t be overridden by very many packages, and those that do will almost certainly call default. The default output looks like this:

These packages will be installed:

* test-cat/test-pkg::ciaranm :1 [N 1] <target>
    X gtk -motif -python qt build_options: recommended_tests split strip
    "Dummy test package"

Total: 1 package (1 new)

 * The following option requirements are unmet for test-cat/test-pkg-1:
 *     Enabling option 'gtk' requires option 'python' to be enabled
 *     Exactly one of options ( gtk, qt, motif ) must be met

* Cannot continue with install due to the errors indicated above

And just think, all that without resorting to convoluted and incomprehensible set theory.