Blag

He's not dead, he's resting

EAPI 2: SRC_URI Arrows

This is the first item in a series of posts describing EAPI 2.

Some upstreams use annoyingly named tarballs. Most commonly, they don’t include either the package name or the version in the filename. Because DISTDIR is a flat directory, this causes problems — the tree must not use two different tarballs with the same name. Previously, the solution to horrible upstream naming was to manually mirror the tarball with a new filename; this was considered excessively icky.

There have been two sane solutions proposed for this over time. The one we didn’t use was to define a DISTDIR_SUBDIR variable and download everything into that subdirectory. This would have made the A variable quite a bit messier, and would have complicated sharing certain tarballs between packages.

The arrows solution was something I came up with for early Paludis experimental EAPIs. It was adopted for kdebuild-1 and from there into EAPI 2; it’s also always been present in exheres-0. It works like this:

SRC_URI="http://example.com/stupid-named/1.23/stupid.tar.bz2 -> stupid-1.23.tar.bz2"

or using variables:

SRC_URI="http://example.com/stupid-named/${PV}/${PN}.tar.bz2 -> ${P}.tar.bz2"

This tells the package manager to look at the URL on the left of the arrow, but save to the filename on the right.
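To make the left/right split concrete, here is a minimal sketch of how a package manager might turn a SRC_URI string into (URL, filename) pairs. It is not taken from Portage or Paludis: the function name parse_src_uri is made up for illustration, and it assumes a flat SRC_URI with no USE-conditional groups.

```python
from typing import List, Tuple


def parse_src_uri(src_uri: str) -> List[Tuple[str, str]]:
    """Return (url, save_as) pairs for a flat SRC_URI string.

    Hypothetical helper for illustration only. Simplification:
    USE-conditional groups such as "flag? ( ... )" are not handled.
    """
    tokens = src_uri.split()
    pairs = []
    i = 0
    while i < len(tokens):
        url = tokens[i]
        if url == "->":
            raise ValueError("arrow with nothing on its left")
        # Peek one token ahead: is this URL followed by an arrow?
        if i + 1 < len(tokens) and tokens[i + 1] == "->":
            if i + 2 >= len(tokens) or tokens[i + 2] == "->":
                raise ValueError("arrow with nothing on its right")
            # Fetch from the left-hand URL, save under the right-hand name.
            pairs.append((url, tokens[i + 2]))
            i += 3
        else:
            # No arrow: the saved name is the last path component of the URL.
            pairs.append((url, url.rsplit("/", 1)[-1]))
            i += 1
    return pairs
```

Entries without an arrow keep working as before: the saved filename is just whatever follows the last slash in the URL.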

Mirroring effects are slightly subtle. Consider:

SRC_URI="mirror://foo/${PN}/${PV}.tar.bz2 -> ${P}.tar.bz2"

The package manager will look both on mirror://foo/ and mirror://gentoo/ for the download. When looking on foo, the raw filename must be used, but when looking on gentoo, the rewritten filename is used.

Anyone using arrows on mirror://gentoo/ URIs gets stabbed.
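The mirror rule can be sketched in the same way. This is an illustration under assumptions, not how any real package manager does it: fetch_candidates and the mirrors mapping are hypothetical, the base URLs in the usage note are placeholders, and the order in which locations are tried is arbitrary here.

```python
from typing import Dict, List

MIRROR_PREFIX = "mirror://"


def fetch_candidates(url: str, save_as: str,
                     mirrors: Dict[str, List[str]]) -> List[str]:
    """List the full URLs to try for one SRC_URI entry.

    Hypothetical helper: `mirrors` maps a mirror name to its base URLs.
    """
    candidates = []
    if url.startswith(MIRROR_PREFIX):
        name, _, path = url[len(MIRROR_PREFIX):].partition("/")
        if name == "gentoo":
            # An arrow on a mirror://gentoo/ URI would rename the file.
            if path != save_as:
                raise ValueError("arrows on mirror://gentoo/ URIs are not allowed")
        else:
            # Third-party mirror: use upstream's raw path and filename.
            for base in mirrors.get(name, []):
                candidates.append(base.rstrip("/") + "/" + path)
    else:
        candidates.append(url)
    # mirror://gentoo/ is a flat namespace, so use the renamed filename.
    for base in mirrors.get("gentoo", []):
        candidates.append(base.rstrip("/") + "/" + save_as)
    return candidates
```

With a made-up mirrors mapping of {"foo": ["http://foo.example.org/pub"], "gentoo": ["http://distfiles.example.org"]}, the entry mirror://foo/stupid/1.23.tar.bz2 -> stupid-1.23.tar.bz2 is looked up as stupid/1.23.tar.bz2 on foo but as stupid-1.23.tar.bz2 on gentoo.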

Arrows make another proposed but rejected EAPI feature irrelevant: there was a proposal floating around (I think it originated with drobbins, but I can’t find an original source) to make unpack ignore ;sf=tbz2 and ;sf=tgz suffixes on filenames, for interoperability with gitweb. Arrows are a more general solution.

Implementation-wise, anyone still using a lexer-based parser will need a single token of lookahead for this. Apparently this causes minor inconveniences in some broken programming languages that only support what C++ calls input iterators; I consider this a good thing, because it might make people either use a better iterator model or stop using lexers.

4 responses to “EAPI 2: SRC_URI Arrows”

  1. Pingback: What’s in EAPI 2? « Ciaran McCreesh’s Blag

  2. steev September 28, 2008 at 4:18 pm

    So what happens once the file is mirrored on Gentoo’s or whoever’s mirrors? For example, Foo releases Bar.tar.gz:

    SRC_URI="mirror://sf/${PN}/${PN}.tar.bz2 -> ${P}.tar.bz2"

    What happens when the file hits the mirrors? Does portage (or paludis) attempt to grab the name after the arrow or the former?

  3. Ciaran McCreesh September 28, 2008 at 4:21 pm

    The mirror script uses the renamed name when copying to mirror://gentoo/, and the package manager uses the renamed name when consulting mirror://gentoo/ but not when consulting mirror://sf/. That way you don’t end up with colliding distfile names on mirror://gentoo/.

  4. iaindb September 29, 2008 at 4:27 am

    Good stuff, thanks for the details. Your blag is always interesting!
