Archive for June, 2008
Bedtime Reading III
Posted by Ciaran McCreesh on June 29, 2008
Posted in c++ | Tagged: bedtime reading, c++, science | Leave a Comment »
On-Demand Loading using Smart Pointers
Posted by Ciaran McCreesh on June 13, 2008
Previously, I explained how to implement something like the Active Object thread pattern using smart pointers. Next we’ll use the same trick to implement on-demand, lazy construction.
There’s nothing difficult here, once we realise we can reuse the ‘return a different pointer’ trick. We make use of std::tr1::function rather than a raw function pointer so that parameter values can be bound at pointer-construction time. Again, we parameterise on pointer type, not raw type.
template <typename T_> class DeferredConstructionPtr { private: mutable T_ _ptr; std::tr1::function<T_ ()> _f; mutable bool _done; public: DeferredConstructionPtr(const std::tr1::function<T_ ()> & f) : _ptr(), _f(f), _done(false) { } DeferredConstructionPtr(const DeferredConstructionPtr & other) : _ptr(other._ptr), _f(other._f), _done(other._done) { } DeferredConstructionPtr & operator= (const DeferredConstructionPtr & other) { if (this != &other) { _ptr = other._ptr; _f = other._f; _done = other._done; } return *this; } T_ operator-> () const { if (! _done) { _ptr = _f(); _done = true; } return _ptr; } };
Again, some caveats:
- Dealing with const is left to the reader.
- Dealing with thread safety is left to the next article.
- If the constructor in question can throw exceptions, the exception will be thrown at what could be a rather unobvious place. This may or may not be a problem.
Posted in c++ | Tagged: c++ | Leave a Comment »
Building but Not Running Tests with Automake
Posted by Ciaran McCreesh on June 13, 2008
If you’re still stuck using autotools because Quagmire can’t yet do everything you need, you might find this little hack handy…
When making an API change that you know will break lots of your unit tests, it would be nice to have a ‘make check-programs-but-don’t-bother-running-them’ target. Automake doesn’t let you do this — to build check_PROGRAMS, you have to run them (and all the tests in SUBDIRS you already fixed) too. This can get annoying if you’re just wanting the compiler to point out the next place you have to change something. But you can cheat…
make check TESTS_ENVIRONMENT=true
One drawback: if you’re using XFAIL_TESTS, they’ll show up as incorrectly passing. If you don’t have many of these, you can disable them and still gain most of the benefits:
make check TESTS_ENVIRONMENT=true XFAIL_TESTS=
Posted in build systems | Tagged: automake, autotools, testing | Leave a Comment »
Bedtime Reading II
Posted by Ciaran McCreesh on June 12, 2008
- Source Code Optimisation, Felix von Leitner. PDF.
- Open Multi-Methods for C++, Peter Pirkelbauer, Yuriy Solodkyy, and Bjarne Stroustrup. PDF.
- Floating point arithmetic in C++ templates. Heh.
Posted in c++ | Tagged: bedtime reading, books, c++ | Leave a Comment »
Dealing with Lots of Repositories
Posted by Ciaran McCreesh on June 12, 2008
With the explosion of overlays used by Gentoo, finding a package can get quite messy. Most users won’t want to set up lots and lots of repositories, so they won’t necessarily know when a package (or an ebuild for an scm or beta version of a package whose stable versions are in the main tree) is available.
Exherbo has similar issues. It’s likely that not-widely-used applications will remain permanently in individual developers’ personal repositories, with only reasonably important packages making it into Arbor. This has further implications for dependencies — for example, if X11 remains in its own repository, but core Arbor packages have optional dependencies upon X, how will that work?
Enter yesterday’s Paludis project: UnavailableRepository.
The idea is simple:
- Have a repository that contains all the packages in all the repositories you don’t have.
- Make it know enough about those packages to show you that they’re available, but not enough to let you do the install. In Paludis terms, this means making the packages be masked with a non-overridable mask, but still support
InstallAction. - Make it less important than any ‘proper’ repository.
- Do something clever to skip doing unavailable IDs for any repository you have configured. (Not strictly speaking necessary, but a lot nicer.)
Configuration is simple. For Exherbo:
format = unavailable location = /var/db/paludis/repositories/unavailable sync = tar+http://git.exherbo.org/exherbo_repositories.tar.bz2 importance = -100
And for Gentoo:
format = unavailable name = layman location = /var/paludis/repositories/layman sync = tar+http://git.exherbo.org/layman_repositories.tar.bz2 importance = -100
(Both sync URLs will probably change soon.)
Then you can do this:
$ paludis -q firefox
* net-www/firefox
unavailable: (3.0_rc1 (in ::mozilla))X* {:0}
Description: The firefox web browser
Owning repository: mozilla
Repository homepage: http://git.exherbo.org/?p=mozilla.git
Masked by unavailable: In a repository which is unavailable
And this:
$ paludis -pi firefox
Building target list...
Building dependency list...
Query error:
* In program paludis -pi firefox:
* When performing install action from command line:
* When executing install task:
* When building dependency list:
* When adding PackageDepSpec 'net-www/firefox':
* All versions of 'net-www/firefox' are masked. Candidates are:
* net-www/firefox-3.0_rc1:0::unavailable (in ::mozilla): Masked by unavailable (In a repository which is unavailable)
You can even search (on description, at least — searching on other metadata keys won’t find anything):
$ sudo inquisitio browser
* net-www/firefox
unavailable: (3.0_rc1 (in ::mozilla))X* {:0}
Description: The firefox web browser
Owning repository: mozilla
Repository homepage: http://git.exherbo.org/?p=mozilla.git
Masked by unavailable: In a repository which is unavailable
* net-www/w3m
unavailable: (0.5.2 (in ::haskell))X* {:0}
Description: Text based WWW browser, supports tables and frames
Owning repository: haskell
Repository homepage: http://git.exherbo.org/?p=haskell.git
Masked by unavailable: In a repository which is unavailable
So what’s in those magic sync tarballs?
$ ll /var/db/paludis/repositories/unavailable/ total 76K -rw-r--r-- 1 root root 369 2008-06-12 04:47 alsa.repository -rw-r--r-- 1 root root 3.2K 2008-06-12 04:47 haskell.repository -rw-r--r-- 1 root root 195 2008-06-12 04:47 kde.repository -rw-r--r-- 1 root root 666 2008-06-12 04:47 media.repository -rw-r--r-- 1 root root 238 2008-06-12 04:47 mozilla.repository -rw-r--r-- 1 root root 334 2008-06-12 04:47 perl.repository -rw-r--r-- 1 root root 1.2K 2008-06-12 04:47 rbrown.repository -rw-r--r-- 1 root root 338 2008-06-12 04:47 scientific.repository -rw-r--r-- 1 root root 20K 2008-06-12 04:47 x11.repository -rw-r--r-- 1 root root 1.5K 2008-06-12 04:47 xfce.repository
One file per repository, fairly simple. And the files themselves are nice and clean too:
$ cat /var/db/paludis/repositories/unavailable/mozilla.repository
format = unavailable-1
repo_name = mozilla
homepage = http://git.exherbo.org/?p=mozilla.git
dev-libs/
libIDL/
:0 0.8.10 ; IDL parsing and compilation library
net-www/
firefox/
:0 3.0_rc1 ; The firefox web browser
There’s a bit of metadata about the repository in question (not very much — repositories don’t currently have descriptions or anything like that, and even the homepage is a bit of a hack in a lot of cases), and then data about all the versions.
For each package name, we store each version, its slot, and the description of the best version in each slot (all descriptions is probably a waste of space, considering how little descriptions vary between versions). There aren’t any packages with multiple versions or slots in the above example, but when there are they look like:
sys-devel/
automake/
:1.4 1.4_p6 ; Tool used to automatically generate Makefile.in files
:1.5 1.5 ; Tool used to automatically generate Makefile.in files
:1.6 1.6.3 ; Tool used to automatically generate Makefile.in files
:1.7 1.7.9 ; Tool used to automatically generate Makefile.in files
:1.8 1.8.5 ; Tool used to automatically generate Makefile.in files
:1.9 1.9.6 ; Tool used to automatically generate Makefile.in files
:1.10 1.10 1.10.1 ; Tool used to automatically generate Makefile.in files
And that’s all there is to it.
(Well, not entirely. There still has to be a tool to generate the repository content files. Fortunately, dleverton wrote a simple Ruby script using the Paludis bindings that generates them all automatically from the layman master file.)
Posted in paludis for users | Tagged: exherbo, gentoo, paludis, unavailable | 5 Comments »
Implementing Active Objects using Smart Pointers
Posted by Ciaran McCreesh on June 11, 2008
An Active Object is, essentially, a threaded design pattern where an object is always called from the same thread, and where a proxy provides synchronised access to that object. It’s useful in cases where there’s little or no parallelism possible due to the underlying object’s state, and where implementing manual locking would be a nuisance.
The problem in C++, generally, is implementing the proxy. If the underlying object’s class has twenty methods, you have to implement twenty trivial wrapper methods for the proxy that obtain the lock and then forward. Whilst not difficult to do, it’s rather tedious.
But what if the proxy is a smart pointer? Then you could do proxy->method() for any method the underlying class has, and you wouldn’t have to worry about writing wrapper methods. The question then is how to write Proxy<UnderlyingClass>::operator-> ().
It can’t simply return the underlying instance, since it needs to do locking. And it can’t obtain a lock and then return the underlying instance, since the lock would never be released.
But all is not lost. The standard has some rather interesting wording:
An expression
x->mis interpreted as(x.operator->())->mfor a class objectxof typeTifT::operator-> ()exists and if the operator is selected as the best match function by the overload resolution mechanism.
Usually, implementations of operator-> () simply return SomeType *. But with the way the standard is worded, they could instead return a second class instance, so long as that new class has operator-> () defined.
So we make Proxy<UnderlyingClass>::operator-> () return a temporary that, upon construction, obtains the shared lock, and upon destruction releases it. Then we rely upon the temporary’s operator-> () returning a pointer to the underlying object to get the actual method call.
Except… It’s not that simple. With current C++, the temporary has to be copyable (even though the compiler optimises out the copy), but typically mutex locks are noncopyable. With C++0x we’ll be able to return by rvalue reference, but until then we have to store the lock in a shared pointer rather than directly.
We’re going to start making use of this in Paludis to cut down on boilerplate code. A working implementation follows. We have a couple of refinements: the class is called ActiveObjectPtr, and it’s parameterised by a pointer to the underlying class (we’ll see why in a later post — generally it’ll just be a std::tr1::shared_ptr, but leaving it as a parameter lets us compose smart pointer types).
template <typename T_> class ActiveObjectPtr { private: T_ _ptr; std::tr1::shared_ptr<Mutex> _mutex; class Deref { private: const ActiveObjectPtr * _ptr; std::tr1::shared_ptr<Lock> _lock; public: Deref(const ActiveObjectPtr * p) : _ptr(p), _lock(make_shared_ptr(new Lock(*p->_mutex))) { } const T_ & operator-> () const { return _ptr->_ptr; } }; friend class Deref; public: ActiveObjectPtr(const T_ & t) : _ptr(t), _mutex(new Mutex) { } ActiveObjectPtr(const ActiveObjectPtr & other) : _ptr(other._ptr), _mutex(other._mutex) { } ~ActiveObjectPtr() { } ActiveObjectPtr & operator= (const ActiveObjectPtr & other) { if (this != &other) { _ptr = other._ptr; _mutex = other._mutex; } return *this; } Deref operator-> () const { return Deref(this); } };
At this stage, it’s not really a proper smart pointer. It doesn’t (and probably shouldn’t) have operator*, and it ignores the const issue. Handling these is left as an easy exercise for the reader.
Posted in c++ | Tagged: c++, c++0x, paludis, threads | 2 Comments »
Paludis Ruby Bindings and Template Classes
Posted by Ciaran McCreesh on June 6, 2008
A PackageID in Paludis supports various actions. An Action is represented by a subclass instance, such as InstallAction or ConfigAction, some of which carry member data (for example, an InstallAction carries information about the target repository for the install).
To perform an action, the PackageID::perform_action method is used. But not all IDs support all actions — you can’t, for example, uninstall a package that isn’t installed, and not all EAPIs support the ‘pretend’ action. So there’s a second method, PackageID::supports_action, that returns a bool saying whether an action is supported.
That’s all very well, but it would require constructing an Action subclass instance just for querying purposes. There’s not much point in this. So we make PackageID::supports_action take a SupportsActionTest<SomeAction> parameter rather than an Action. Unlike the base Action subclass, the SupportsActionTest<T_> template class carries no member data, so it doesn’t need fancy construction. This lets us do this:
if (my_package_id->supports_action(SupportsActionTest<FetchAction>()))
{
FetchAction fetch_action(_imp->fetch_options);
my_package_id->perform_action(fetch_action);
}
This pattern crops up in two other places. To speed up certain queries, we can ask a Repository whether some of its IDs might support a particular action. The Repository::some_ids_might_support_action method will always return true if any of its IDs support a particular action, and might return false if it’s known for sure that none of them will (this weasel wording is necessary because we might, for example, have a repository full of ebuilds with unsupported EAPIs, and unsupported EAPIs means no actions are possible).
Similarly, the new filter system has filter::SupportsAction<ActionClass>. The implementation of this filter uses the Repository and PackageID methods to be as lazy as possible.
Which is all well and good, in C++, but with bindings things get a bit icky since templates don’t translate naturally. In Ruby, we used to have a bunch of classes. SupportsInstallActionTest.new() would be like SupportsActionTest<InstallAction>(), SupportsConfigActionTest.new() would be like SupportsActionTest<ConfigAction> and so on. This isn’t particularly nice.
It occurred to me that SupportsActionTest.new(InstallAction) is legal syntactically in Ruby. It passes the value InstallAction, which is a variable of class Class, as the parameter. Then we have to screw around a bit in the bindings code, remembering that SomeClass <= OtherClass in Ruby means “SomeClass is or is a subclass of OtherClass“:
/*
* Document-method: SupportsActionTest.new
*
* call-seq:
* SupportsActionTest.new(ActionClass) -> SupportsActionTest
*
* Create new SupportsActionTest object. The ActionClass should be, e.g. InstallAction.
*/
static VALUE
supports_action_test_new(VALUE self, VALUE action_class)
{
std::tr1::shared_ptr<const SupportsActionTestBase> * ptr(0);
try
{
if (Qtrue == rb_funcall2(action_class, rb_intern("<="), 1, install_action_value_ptr()))
ptr = new std::tr1::shared_ptr<const SupportsActionTestBase>(make_shared_ptr(new SupportsActionTest<InstallAction>()));
else if (Qtrue == rb_funcall2(action_class, rb_intern("<="), 1, installed_action_value_ptr()))
ptr = new std::tr1::shared_ptr<const SupportsActionTestBase>(make_shared_ptr(new SupportsActionTest<InstalledAction>()));
else if (Qtrue == rb_funcall2(action_class, rb_intern("<="), 1, uninstall_action_value_ptr()))
ptr = new std::tr1::shared_ptr<const SupportsActionTestBase>(make_shared_ptr(new SupportsActionTest<UninstallAction>()));
else if (Qtrue == rb_funcall2(action_class, rb_intern("<="), 1, pretend_action_value_ptr()))
ptr = new std::tr1::shared_ptr<const SupportsActionTestBase>(make_shared_ptr(new SupportsActionTest<PretendAction>()));
else if (Qtrue == rb_funcall2(action_class, rb_intern("<="), 1, config_action_value_ptr()))
ptr = new std::tr1::shared_ptr<const SupportsActionTestBase>(make_shared_ptr(new SupportsActionTest<ConfigAction>()));
else if (Qtrue == rb_funcall2(action_class, rb_intern("<="), 1, fetch_action_value_ptr()))
ptr = new std::tr1::shared_ptr<const SupportsActionTestBase>(make_shared_ptr(new SupportsActionTest<FetchAction>()));
else if (Qtrue == rb_funcall2(action_class, rb_intern("<="), 1, info_action_value_ptr()))
ptr = new std::tr1::shared_ptr<const SupportsActionTestBase>(make_shared_ptr(new SupportsActionTest<InfoAction>()));
else if (Qtrue == rb_funcall2(action_class, rb_intern("<="), 1, pretend_fetch_action_value_ptr()))
ptr = new std::tr1::shared_ptr<const SupportsActionTestBase>(make_shared_ptr(new SupportsActionTest<PretendFetchAction>()));
else
rb_raise(rb_eTypeError, "Can't convert %s into an Action subclass", rb_obj_classname(action_class));
VALUE tdata(Data_Wrap_Struct(self, 0, &Common<std::tr1::shared_ptr<const SupportsActionTestBase> >::free, ptr));
rb_obj_call_init(tdata, 0, &self);
return tdata;
}
catch (const std::exception & e)
{
delete ptr;
exception_to_ruby_exception(e);
}
}
/*
* Document-class: Paludis::SupportsActionTest
*
* Tests whether a Paludis::PackageID supports a particular action.
*/
c_supports_action_test = rb_define_class_under(paludis_module(), "SupportsActionTest", rb_cObject);
rb_define_singleton_method(c_supports_action_test, "new", RUBY_FUNC_CAST(&supports_action_test_new), 1);
Not exactly pretty, for now. Hopefully when C++0x comes along we’ll be able to invent some obscene hack involving std::initializer_list, lambdas and std::find_if which will make the whole thing somewhat more elegant.
Unfortunately, the Python bindings are still stuck using package_id.supports_action(SupportsFetchActionTest()) etc. So far as I can see there’s no nice way to get the same effect whilst using boost.python, even though Python lets you pass classes around in a similar manner to Ruby.
Posted in paludis internals | Tagged: boost, boost.python, c++0x, paludis, python, ruby | Leave a Comment »
Highly Evil C++
Posted by Ciaran McCreesh on June 5, 2008
- Backyard Hotrodding C++.
- Evil C.
- MetaGene Horror Show. Bonus points for
f<x>::__::res<y>::__::res.
Posted in c++ | Tagged: c++, evil | Leave a Comment »
New Paludis Query Interface (Part III)
Posted by Ciaran McCreesh on June 2, 2008
In the first part, we looked at the new Paludis query interface. In the second part, we looked at available components and how to fit them together. Now we look at how the selection is carried out.
The selection process is split up into four stages:
- Find candidate repositories.
- From those candidate repositories, find candidate categories.
- From those candidate repositories and categories, find candidate packages.
- From those candidate repositories and packages, work out the resulting PackageIDs.
The first three steps are basically the same for all selections. The Generator is queried for candidate repositories, and from the result set, the Filter returns a subset. That subset is then passed to the Generator to find a candidate category set, which goes to the Filter to obtain a subset. That then goes back to the Generator to get the candidate package set, which is again filtered.
At every stage, empty set checks are carried out. For most selections, an empty candidate set at any stage returns an empty result set straight away; for selection::RequireExactlyOne, it instead raises an exception.
The final stage is down to the selection. The overall process resembles the “old subset to generator to filter” flow used earlier. Some selections then re-order and reduce the final result set.
But doing a full filter on a large result set for, say, selection::BestVersionOnly is lots of (potentially very slow, since filters might need metadata) unnecessary work. So instead, the filter might be called repeatedly with a selection-crafted subset of the generator’s output until a single result is found.
Even that can be too much work. If multiple candidate packages are found for selection::SomeArbitraryVersion, asking the generator to provide IDs for every candidate might be too much. So again, a subset of the input is fed into the generator until a result is obtained.
Finally, there’s the question of how multiple generators and multiple filters are handled. When combining generators using operator &, a new generator is returned. For each stage, that generator calls its two child generators and makes a set intersection for its result. For filters combined using operator|, a new filter that feeds the first filter’s output into the second filter’s input and uses the second filter’s output as the result is created.
Posted in paludis internals | Tagged: paludis | 4 Comments »
New Paludis Query Interface (Part II)
Posted by Ciaran McCreesh on June 1, 2008
In the first part, I gave an overview of the new Paludis query interface. Now I’ll go into details about the new components and how they can be combined.
Environment::operator[] takes a Selection instance. Currently available selections are:
selection::AllVersionsSorted, which is like the oldqo_order_by_version.selection::AllVersionsUnsorted, which is like the oldqo_whatever(although fetching all versions is no longer necessary if you don’t need them).selection::AllVersionsGroupedBySlot, which is like the oldqo_group_by_slot. This one is quite expensive.selection::BestVersionOnly, which is like the oldqo_best_version_only, but cheaper.selection::BestVersionInEachSlot, which is likeqo_best_version_in_each_slot.selection::RequireExactlyOne, which is likeqo_require_exactly_one.selection::SomeArbitraryVersion, which is new.
All of the above take a FilteredGenerator as their single parameter. A FilteredGenerator isn’t something you create by hand, though — it’s the underlying type that you get by combining a Generator with a Filter.
You’ll always need a Generator. Your choices are:
generator::All. Matches everything, which is generally slow.generator::Matches(PackageDepSpec). Anything matching the spec.generator::Package(QualifiedPackageName). Anything with that name.generator::Repository(RepositoryName). Anything in the named repository.generator::Category(CategoryNamePart). Anything in the named category.
Generators can be combined to produce a new generator using operator &. In this case, the set intersection of the two generators’ results is used.
Filters are optional. To allow you to pass a Generator straight to a Selection, a Generator can automatically convert to a FilteredGenerator.
A FilteredGenerator can have filters added to it using operator| with the new filter on the right. By way of the automatic conversion, a Generator can be used on the left rather than a FilteredGenerator.
Available filters are:
filter::All. Lets everything through. This is more useful forPackageDatabase::fetch_unique_qualified_package_name, which takes aFilterparameter. It’s also used internally to avoid having to special-case a generator with no filter.filter::NotMasked. Doesn’t let masked packages through.filter::InstalledAtRoot(FSEntry). Lets things that are installed at a given root through.filter::SupportsAction<ActionClass>. Lets things through that support the givenActionClass(e.g.InstallAction).
Putting this together, we get these patterns:
env[SomeSelection(SomeGenerator())] env[SomeSelection(SomeGenerator() & AnotherGenerator())] env[SomeSelection(SomeGenerator() | SomeFilter())] env[SomeSelection(SomeGenerator() | SomeFilter() | AnotherFilter())] env[SomeSelection(SomeGenerator() & AnotherGenerator() | SomeFilter())] env[SomeSelection(SomeGenerator() & AnotherGenerator() | SomeFilter() | AnotherFilter())]
and so on.
In Part III we’ll see how the solving is done.
Posted in paludis internals | Tagged: paludis | 3 Comments »