Blag
He's not dead, he's resting
EAPI 2: SRC_URI Arrows
September 28, 2008
This is the first item in a series of posts describing EAPI 2.
Some upstreams use annoyingly named tarballs. Most commonly, they don’t include either the package name or the version in the filename. Because DISTDIR is a flat directory, this causes problems: the tree must not use two different tarballs with the same name. Previously, the solution to horrible upstream naming was to manually mirror the tarball with a new filename; this was considered excessively icky.
There have been two sane solutions proposed for this over time. The one we didn’t use was to define a DISTDIR_SUBDIR variable, and do all downloads into there. This would have made the A variable quite a bit messier, and complicated sharing certain tarballs between packages.
The arrows solution was something I came up with for early Paludis experimental EAPIs, and was adopted for kdebuild-1 and from there into EAPI 2; it’s also always been present in exheres-0. It works like this:
SRC_URI="http://example.com/stupid-named/1.23/stupid.tar.bz2 -> stupid-1.23.tar.bz2"
or using variables:
SRC_URI="http://example.com/stupid-named/${PV}/${PN}.tar.bz2 -> ${P}.tar.bz2"
This tells the package manager to look at the URL on the left of the arrow, but save to the filename on the right.
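As a rough sketch of that rule (not how any particular package manager implements it), the Python below splits a SRC_URI string into (URL, filename) pairs. It assumes the variables have already been expanded and that the string contains no use-conditional blocks; the function name parse_src_uri is made up for illustration.

```python
def parse_src_uri(src_uri):
    """Return (url, filename) pairs for a flat, already-expanded SRC_URI string.

    Simplifying assumption: no 'flag? ( ... )' conditionals are present.
    """
    tokens = src_uri.split()
    pairs = []
    i = 0
    while i < len(tokens):
        url = tokens[i]
        if url == "->":
            raise ValueError("arrow without a URL on its left")
        # Look one token ahead: an arrow means the token after it is the filename.
        if i + 1 < len(tokens) and tokens[i + 1] == "->":
            if i + 2 >= len(tokens) or tokens[i + 2] == "->":
                raise ValueError("arrow without a filename on its right")
            pairs.append((url, tokens[i + 2]))
            i += 3
        else:
            # No arrow: the filename is the last path component of the URL.
            pairs.append((url, url.rsplit("/", 1)[-1]))
            i += 1
    return pairs
```

Entries without an arrow keep the old behaviour, so a SRC_URI can mix renamed and unrenamed downloads.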
Mirroring effects are slightly subtle. Consider:
SRC_URI="mirror://foo/${PN}/${PV}.tar.bz2 -> ${P}.tar.bz2"
The package manager will look both on mirror://foo/ and mirror://gentoo/ for the download. When looking on foo, the raw filename must be used, but when looking on gentoo, the rewritten filename is used.
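To make the two lookups concrete, here is a small Python sketch that lists the URLs a package manager might try for one entry. The mirror table, the host names in it, and the function name candidate_urls are all invented for illustration, and the order in which the candidates are tried is an arbitrary choice here, not something the text above specifies.

```python
# Hypothetical mirror table; real lists come from the package manager's configuration.
MIRRORS = {
    "foo": ["http://mirror1.example.org/foo"],
    "gentoo": ["http://distfiles.example.org/distfiles"],
}

def candidate_urls(uri, filename, mirrors=MIRRORS):
    """List the URLs to try for one 'uri -> filename' entry."""
    candidates = []
    prefix = "mirror://"
    if uri.startswith(prefix):
        name, _, path = uri[len(prefix):].partition("/")
        # Upstream mirrors still serve the file under its raw upstream path.
        candidates += [f"{base}/{path}" for base in mirrors[name]]
    else:
        candidates.append(uri)
    # mirror://gentoo/ is a flat directory, so it is keyed by the rewritten filename.
    candidates += [f"{base}/{filename}" for base in mirrors["gentoo"]]
    return candidates
```

With the example above expanded for a package called pkg at version 1.23, candidate_urls("mirror://foo/pkg/1.23.tar.bz2", "pkg-1.23.tar.bz2") gives the raw path on the foo mirror and the renamed file on the gentoo mirror.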
Anyone using arrows on mirror://gentoo/ URIs gets stabbed.
Arrows make another proposed but rejected EAPI feature irrelevant: there was a proposal floating around (I think it originated with drobbins, but I can’t find an original source) to make unpack ignore ;sf=tbz2 and ;sf=tgz suffixes on filenames, for interoperability with gitweb. Arrows are a more general solution.
Implementation-wise, anyone still using a lexer-based parser will need a single token of lookahead for this. Apparently this causes minor inconveniences in some broken programming languages that only support what C++ calls input iterators; I consider this a good thing, because it might make people either use a better iterator model or stop using lexers.
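For illustration only, this is roughly what a single token of lookahead looks like over a strictly one-pass token stream, written in Python. The function name pairs_from_stream is hypothetical, the input is assumed to be already-expanded tokens with no use-conditionals, and a stray leading arrow is not diagnosed.

```python
def pairs_from_stream(tokens):
    """Yield (url, filename) pairs from a single-pass iterator of tokens."""
    it = iter(tokens)
    pending = next(it, None)  # the one buffered lookahead token
    while pending is not None:
        # Take the buffered token as the URL and peek at the one after it.
        url, pending = pending, next(it, None)
        if pending == "->":
            filename = next(it, None)
            if filename is None:
                raise ValueError("arrow without a filename on its right")
            pending = next(it, None)  # refill the lookahead buffer
        else:
            # No arrow follows, so fall back to the last path component.
            filename = url.rsplit("/", 1)[-1]
        yield url, filename
```

The parser never rewinds the stream; it only holds back one token until it knows whether an arrow follows.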
So what happens once the file is mirrored on Gentoo’s or whoever’s mirrors? For example, Foo releases Bar.tar.gz:
SRC_URI="mirror://sf/${PN}/${PN}.tar.bz2 -> ${P}.tar.bz2"
What happens when the file hits the mirrors? Does portage (or paludis) attempt to grab the name after the arrow or the former?
The mirror script uses the renamed name when copying to mirror://gentoo/, and the package manager uses the renamed name when consulting mirror://gentoo/ but not when consulting mirror://sf/. That way you don’t end up with colliding distfile names on mirror://gentoo/.
Good stuff, thanks for the details. Your blag is always interesting!