The great package format debate: why there's no need for distributions to use the same package format

As I mentioned, I spent the weekend at LinuxFest Northwest, a great conference I've been to three times now.

Bryan Lunduke does a recurring talk there which is called "Why Linux Sucks" or something like it. It's a great talk, where Bryan gives his tongue-in-cheek opinion on what's holding back desktop (and, more recently, mobile) Linux adoption. There's a video of this year's talk.

Bryan's awesome, and we get along great. I love his talks (in the last couple of years he's started following the "Why Linux Sucks" talk with a "Why Linux Rocks" talk which uses almost the exact same slide deck, which is an awesome idea). But I think he's wrong on one point he brings up. It's hard to do a good job of explaining why concisely during the talk - you can watch me doing a bad job of it in the video! - because it's a complex issue that's hard to distil. So I thought I'd write it down instead.

The problem Bryan identifies affects third parties providing Linux applications directly to users: Bryan trying to provide his games to users of different distributions, or Google trying to provide Chrome, or Mozilla trying to provide Firefox, and so on and so forth.

If you're one of those third parties, you can put your application in a package, or you can put it in a tarball.

If you put it in a package and provide a repository, your users can take advantage of the features of package management. Your app will show up in their package management applications, it will be updated when they do a system update, and a few other things. But there's a problem: different distributions use different package formats. So if you just do one package (and repository), you won't be covering all your potential users. If you do RPM and DEB you'll cover 90+% of them, but there are still a few who'll be left out.

If you put it in a tarball, you can just make one, and give it to users of all distributions. They'll all be able to extract your app and run it. But it won't show up in their package manager. You'll have to tell them to download a new tarball, or write your own update mechanism.

Bryan wants third parties to be able to get all the neat stuff that comes with using a packaging format, without the inconvenience of doing multiple builds.

If distributions all used the same packaging format, would Bryan's goal be achieved? Absolutely. Is it a worthwhile goal? Sure. I don't think it'd change the world, but we agree it'd be a benefit.

So Bryan's right? I'm an idiot? Well...not so fast there!

Switching package formats for a distribution is a huge pile of work. I can explain why if anyone needs me to, but for now I'm going to assume you're willing to take it on trust.

If all distributions used the same package format, that wouldn't directly help the distributions themselves at all. A lot of people believe that if, say, Fedora, OpenSUSE and Ubuntu all used the same package format, you could just mix and match packages between them, and each distro wouldn't have to package everything separately.

This is not true, we can't ever make it true, and we don't want it to be true. I'll try to explain why as quickly as I can: distributions are very different, and they all are strongly committed to using shared resources.

When you get Chrome from Google, you get a static build: it uses very few of the shared resources the distribution provides, only stuff that's present and compatible on virtually any distro.

When you download a package from your distro vendor, it's not like that. Distros use dynamic builds: distro packages use all the shared resources they can. Just take it on trust that the benefits are huge and distributions will never start using static linking outside of the cases where it's unavoidable (ask me for details if you want). Distros differ in terms of the shared resources they include, and this isn't just being ornery, it's a key part of what differentiates distributions from each other, and why we have them at all. Fedora 18 has openssl 1.0.1e. Ubuntu has 1.0.1c. RHEL 6 has 1.0.0. Debian stable has 0.9.8o. This is because Fedora is bleeding edge, Ubuntu is a bit more conservative, RHEL 6 is a lot more conservative, and Debian stable is even more so. We couldn't all get together and 'agree to compromise on 1.0.0'. Then none of those distributions except RHEL would be serving the needs of its users.

Multiply that by the thousands of packages in a distribution. You can't take a Fedora 18 package and install it on RHEL 6, even though they are both RPM distributions, because the package expects stuff from Fedora 18 that just isn't in RHEL 6.
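To make that concrete, here's a minimal Python sketch using the four OpenSSL versions quoted above. It's a deliberate simplification: real packages depend on library sonames and symbol versions rather than a bare "at least this version" rule, and the `parse_version` and `installable_on` helpers are made up for illustration, not part of RPM or dpkg.

```python
import re

# OpenSSL versions shipped by each distro, as quoted in the post.
DISTRO_OPENSSL = {
    "Fedora 18": "1.0.1e",
    "Ubuntu": "1.0.1c",
    "RHEL 6": "1.0.0",
    "Debian stable": "0.9.8o",
}


def parse_version(text):
    """Split an OpenSSL-style version such as '1.0.1e' into a comparable tuple."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)([a-z]*)", text)
    if match is None:
        raise ValueError(f"unrecognised version: {text!r}")
    major, minor, patch, letter = match.groups()
    return (int(major), int(minor), int(patch), letter)


def installable_on(required_minimum):
    """Hypothetical rule: a package installs wherever OpenSSL is new enough."""
    needed = parse_version(required_minimum)
    return [
        distro
        for distro, shipped in DISTRO_OPENSSL.items()
        if parse_version(shipped) >= needed
    ]


# A package built against Fedora 18's OpenSSL only fits Fedora 18.
print(installable_on("1.0.1e"))
# Targeting the oldest version reaches everyone, but gives up every newer feature.
print(installable_on("0.9.8o"))
```

Under this toy rule the package format never enters into it: the only thing deciding where a build can go is what each distro ships.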

So we've got a cost/benefit. The benefit? It's there. The cost? It's huge, and it falls on groups that don't directly benefit at all. This sucks. But fortunately, we don't need to pay that cost!

We don't need all the distributions to agree on a common package format for their packages so third parties can provide applications with most of the benefits of package management. All we need is for there to be a package management framework that third parties can rely on to be present on all distributions. It doesn't matter at all whether the distribution packages use it or not.

That's a much easier problem to solve. All we'd need to do is agree to all provide support for one existing package format out of the box. (Sidebar: LSB already actually tries this. It requires RPM.) That's one big bunfight at a conference and maybe like a week or so of work. No big deal. If Bryan pushed for this instead of saying 'all distributions should use the same package format!', I'd have no disagreement with him.

You can stop reading here if you like! But if you're thinking 'hey, waitaminute'...

To pre-emptively address one objection: what if a third party wants to provide a package that depends on something in the core distribution?

Bryan's idea would go a bit further towards addressing that than mine would. But it's not so important, Bryan's idea still doesn't totally solve it, and you can modify mine to solve it quite easily.

It's not so important because most significant third party providers just want to make their package as independent as possible. They don't want to worry about making sure their app works with all the possible versions of the shared resources they want to use that are shipped by the various distributions. They usually just link almost everything statically. Here are the requirements of Google's Chrome RPM, for instance:

libatk-1.0.so.0()(64bit)
libc.so.6(GLIBC_2.11)(64bit)
libcurl.so.4()(64bit)
libgconf-2.so.4()(64bit)
libnss3.so(NSS_3.12.3)(64bit)
libbz2.so.1()(64bit)
libXss.so.1()(64bit)
libXcomposite.so.1()(64bit)
libXfixes.so.3()(64bit)
wget
xdg-utils
zlib

Honestly, you really don't need to specify any of those. If someone's running a Linux distribution with a desktop (which they are if they want to install Chrome), it's about 99 to 1 they have all of that installed.

Bryan's idea doesn't fully address this requirement because distributions don't all agree on what different packages should be called, or how library dependencies should be written, etc etc etc. It's slightly more likely that we could all agree on that than that we could all switch to one package format, but it's still a hell of a long shot. Even if we all used the same package format, you couldn't just write "Requires: foo" and be done. One of the distros probably calls it "libfoo". You'd still have problems.

And you could pretty easily achieve this with my idea, too. Better, we don't have to do it all at once. We could just get the format in place first. If that was a success, then 'I want to be able to specify some dependencies!' is an obvious enhancement request, and now we're building on an existing idea, not just throwing impractical proposals around.

Maybe we can't get all the distros to agree on the names of absolutely all their packages, but we could get them all to agree to have their GTK+ 3.0 package provide gtk+-3.0 under the shared packaging system. The names could still differ under the native packaging systems - it wouldn't matter. But it's plausible to see us agreeing on a set of commonly-required core components, and doing the work to have the distro packages express those provides in the shared package manager as well as their native package manager, using an agreed nomenclature. And it could be incremental - we could just start by doing 50 packages, or 10, or just one; however many we could agree on. And then build it out as we went along. And none of this would in any way disturb the functioning of the distro's native package system and repositories.
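Here's a rough Python sketch of what that could look like. Everything in it is an assumption for illustration: the `NativePackage` class, the two example distros and their package lists are invented, and only the shared name gtk+-3.0 comes from the paragraph above. The idea is simply that each native package keeps its own name and additionally declares any agreed shared names it provides, and a third-party package only ever asks for shared names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NativePackage:
    """A distro package: its native name plus any agreed shared names it provides."""
    native_name: str
    shared_provides: frozenset = field(default_factory=frozenset)


# Illustrative package sets: native names differ, the shared provide is identical.
DISTRO_A = [
    NativePackage("gtk3", frozenset({"gtk+-3.0"})),
    NativePackage("foo"),  # no shared name agreed for this one yet
]
DISTRO_B = [
    NativePackage("libgtk-3-0", frozenset({"gtk+-3.0"})),
    NativePackage("libfoo"),
]


def missing_shared_requires(requires, installed):
    """Return the shared names a third-party package needs but the distro lacks."""
    available = set()
    for package in installed:
        available |= package.shared_provides
    return sorted(set(requires) - available)


# The same third-party requirement resolves on both distros.
print(missing_shared_requires(["gtk+-3.0"], DISTRO_A))
print(missing_shared_requires(["gtk+-3.0"], DISTRO_B))
# A name nobody has agreed on yet simply shows up as missing.
print(missing_shared_requires(["gtk+-3.0", "foo"], DISTRO_B))
```

The incremental part falls out naturally: a component only becomes usable as a cross-distro dependency once the distros have agreed on its shared name, and nothing about the native names has to change.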

Comments

Sepero wrote on 2013-04-30 08:36:
A point I think matters which was not addressed is- administration. If I have a package problem with a deb box, then I know all the tools apt*, dpkg*. I know all the locations to look /var/cache/apt/ /var/lib/dpkg etc. I know how to freeze packages and pinning works. I know how ppa's work. I know how to create my own deb packages from source if needed. It's for these reasons that I won't touch another system (rpm, pacman, whatever). I've changed distros once in the past 10 years- from Debian to Ubuntu. I know it inside and out. Could I learn how everything works in another system? Sure, but why? All I really care about is that deb is the most popular format. If rpm was the most popular, then I would support that instead. My time is valuable, and I don't need to be accidentally mixing up different package info in my mind when a problem needs to be solved right now. Most people are like me in this- we don't have "fun" learning how all the innards of a packaging system work, we just want the shit to work. You're right that having the same package manager doesn't mean Debian packages will install on Ubuntu without problems. But having the same package manager does mean that with little difficulty, I can often fix those problems. So I tell clients it doesn't matter much what distro they use, as long as it is _deb_ package format. Like I said, if Redhat was using the most popular package format, then I would be supporting them instead. So in this way, yes, it does _directly_ harm them. (Most of the distros on distro watch are deb and rpm based and they have no commercial backing. They are _directly_ benefitting from the package management system they choose.) Additionally, working with these package intricacies is much more difficult for 3rd party vendors. Unlike the distro makers, 3rd party vendors aren't dedicated to packaging tools everyday. Instead they are just dedicated to improving their product. 
I don't think the distro makers appreciate the costs of 3rd party vendors to produce even 1 good package (especially when these vendors are already catering to other OS's). Aside from all that, I think you make a lot of good points in your blog. I agree that something like LSB is good, and I do not know why Debian is not in complete alignment. My wild guess would be that it has something to do with politics or money (who is funding LSB???). Also, agreed with points on "commonly-required core components" and "agreed nomenclature". Very good points, totally realistic, and I would love to see happen. This would make the package manager much less relevant. I'm sure there are things that I haven't covered, and I think this whole topic could be expanded on a lot more. From what I've said here alone, I think there is more direct and indirect harm to the distribution than you initially account for, but still not enough for Redhat to switch all their rpm's to deb. It would probably be easier for them to entirely switch their distro to debian based, and change only necessary extra packages. Either way, we both know none of this deb/rpm union is going to happen. I enjoy the theoretical discussion. ;)
adamw wrote on 2013-04-30 15:34:
"You’re right that having the same package manager doesn’t mean Debian packages will install on Ubuntu without problems. But having the same package manager does mean that with little difficulty, I can often fix those problems." It's not having the same package manager that means that. The point I'm trying to get across is that the package format matters very little to interoperability between distros. You can often take a Debian sid package and install it on Ubuntu stable (or vice versa) simply because Ubuntu is directly derived from Debian sid and they are still quite closely cross-compatible. The packaging format doesn't really have a lot to do with it. If you somehow had two distros that were as closely cross-compatible as Ubuntu and Debian but used different packaging formats, you can bet your bottom dollar someone would've written some trivial conversion tool which meant it was just as easy to share stuff between them as it is now. It's not the package format that makes it possible to share a lot of stuff between Debian-based distros, it's simply the fact that Debian-derived distros are generally all still pretty similar to each other. This is much more obvious to RPM folks because not all RPM distros are derived from a single common core, and even when you have, you know, Fedora and RHEL, or OpenSUSE and SLES, their different goals mean the distros rapidly diverge from each other. But because most Debian-derived distros are actually pretty similar to each other, this confusion between 'level of cross-compatibility between distros' and 'package format' tends to crop up over and over on the DEB side. "Additionally, working with these package intricacies is much more difficult for 3rd party vendors." This is kinda true, but most of them solve it by the simple tactic of making really fucking terrible packages. 
:P Or, less facetiously - it's much easier to make a package that's basically a tarball of a statically-linked build with a changelog and a version number than it is to make an actual distribution package - a package of a dynamically-linked build with all the correct dependencies that complies with distribution policies. If all you want is something in the right format that installs properly, it's really not that difficult, and there are some hideous, hideous scripts and tools that do this evil thing for you. :P I can see the point you make about comfort/familiarity/package availability, sure; but as you could probably guess, I don't think it's a huge point, and the cost/benefit is still way off to the point where no major RPM distro would seriously consider it. Note that many high-level package management tools (especially GUIs) are at this point fairly strongly abstracted from their underlying frameworks; I don't want to get too far into the castle-building-in-the-sky, but it would not be impossible to envisage support for my 'universal format' being added to GUI tools in distros alongside support for the 'native format'. It's certainly feasible. "My wild guess would be that it has something to do with politics" gee, ya think? :) "or money (who is funding LSB???)" One of the big problems, kinda, is that the answer to that is 'no-one very much'. LSB has been kinda dormant for a long time. I think there've been signs of it waking up lately, but as things stand it's kind of at the 'honourable effort but not quite good enough' stage. "It would probably be easier for them to entirely switch their distro to debian based, and change only necessary extra packages" Congratulations! You have found something even less likely to happen than an RPM-based distro 'just' switching package format. That was quite a feat. =) I mean, yeah, that's just not going to happen. 
I do think there's a serious point here, which is that when people start talking about absurd ideas like this, anyone who's actually significantly involved in building a distribution and knows what that means and what it involves usually just flat out tunes out. If you gave the 'let's all use one package format!' talk to a room of high-level distro maintainers - the kinds of people who actually make these decisions - you'd pretty much be laughed out of the room; you just wouldn't be taken seriously. If you want some kind of practical, achievable improvement, you gotta understand what's reasonable and possible and what isn't.
adamw wrote on 2013-04-30 16:00:
Oh, BONUS THOUGHT TIME: my idea has the other considerable advantage that it's a thing that is currently happening. You know that Steam thing Bryan kept talking about? You know what Steam is? Why, it's a package management platform!
Charles Banas wrote on 2013-04-30 18:28:
I sat across the aisle from you, and I could tell you wanted to make your case, but I have to say I disagree with your premise. We know that every package system has a few key concepts:
1. Every package has a name and a version number.
2. Every package has a set of dependencies, which are typically a name and a version number.
3. Every package manager has a way of deducing the package ownership of a file, and the files owned by a package. A few can do this even without having the package on hand.
4. Every package format has a common set of features: Pre-install, post-install, pre-remove script, post-remove, and a few other scripts; a manifest which describes the package and its dependencies; and a tarball that can be unpacked (usually directly) to /. RPM hides this inside cpio archives, and Debian inside ar archives.
We could, in the spirit of the Alien script, devise a common package format or a package manifest format to which all package managers could subscribe. This new package format would have a standardized manifest format, which would describe the library/package dependencies and the version number *range* that is known to be supported, and a Makefile-like language could describe the pre/post install/remove tasks that would normally be performed by package scripts. In this way, the common format could provide enough information for the package manager to determine which packages are needed for its installation and install the foreign package appropriately. 0install is one such effort for a foreign package system, but it's currently incapable of updating the host package manager's database. What's keeping us from making that possible?
adamw wrote on 2013-04-30 19:12:
Charles: that's not terribly different from my proposal really. I'd be fine with it also. But note that you're focusing a lot on dependencies. I think this distinction between distros - which care crazy lots about dependencies, dynamic builds with as much interdependency as possible are what distros are all about - and third parties - which usually just want to ship a copy of their app that is as independent as possible - is really important, and often missed. Dependencies just aren't at all as important to the use case Bryan cares about as they are to the distribution use case. Take Bryan himself. He writes Linux Tycoon. Okay. Now think about Bryan wanting to distribute Linux Tycoon on his website. He doesn't want it to be a 'good citizen' package that's part of a specific Linux distro, built with as much dynamic linking as possible, following all the packaging policies of one distro and yadda yadda. He wants a mechanism that lets him put out *one* 'package' of Linux Tycoon that all Linux users can use. For that use case, compiling it with a ton of dynamically-linked dependencies just doesn't make sense. It's not what he wants to do at all. But now imagine he makes LT open source and Fedora wants to have an LT package in its repos. Fedora would want to build it dynamically and have it follow all Fedora's rules and so forth. The Fedora package of LT would always be a Fedora package. No matter the package format, it would not be at all appropriate to use it on Ubuntu or OpenSUSE or Debian or whatever. Distributions don't build universal packages. They do not want to build universal packages. Third parties do want to build universal packages, and they want to reduce their dependency count as much as they can. The whole distro paradigm of 'dependencies, dependencies, dependencies!' is not what third party devs want at all. Your mechanism is neat, but it kind of misses the point: we don't need all that clever stuff to try and specify complex distro dependencies. 
Even if we have a mechanism that lets third party packages get really detailed and depend on a whole bunch of libraries, that's kind of pointless, because you're just providing a mechanism for third parties to build non-universal packages, which they don't want to do anyway. Say we put your system into practice, and Bryan builds a Linux Tycoon package in your new format which is able to say "well, I need pango between version 1.30.0 and 1.33.1, a copy of GTK+ between 3.6.0 and 3.6.3, openssl 1.0.0.whatever (but not 0.9.9 or 1.0.1), libsexy 0.1.9 or 0.1.10..." What problem have we solved? We haven't really achieved much of anything, because there's probably only one distribution that actually has the right versions of all those things. This build of LT would still not be universal, even though it would be in a 'universal' package format. As I keep saying, the reasons typical 'distro' builds - builds that use dynamic linking as much as possible - are not cross-compatible between distributions are not to do with the package format, they are simply to do with the fact that different distributions have different components. A universal package format that lets you express complex dependencies is great and all, but it doesn't mean that packages in that 'universal' format with complex dependencies are actually likely to be at all 'universal'. If Bryan was going to go down that road, he'd still need like 10 packages of LT in your 'universal' format, one for Ubuntu 12.04, one for 12.10, one for Debian sid, one for testing, one for stable, one for Fedora 17, one for Fedora 18, one for OpenSUSE 12.whatever, and blah blah blah...so you haven't actually saved him any work!
helsinkiharbour wrote on 2013-04-30 20:09:
Well, I think the solution to all the problems mentioned has already been found, as the advantages of "dynamic packaging" are nowadays negligible for the desktop use case: distro-independent installation formats like portablelinuxapps.org/, www.pgbovine.net/cde.html or http://0install.net/ (or even the "infamous" Autopackage). To put it another way, strong separation between system and apps, a platform (like the one Steam is now building), is a killer feature for ISVs and users alike (if you don't have it, you suffer this: https://bugs.launchpad.net/ubuntu/+source/software-center/+bug/578045)
adamw wrote on 2013-04-30 20:20:
helsinki: the advantages of dynamic linking are certainly not 'negligible' for desktops. There are two huge wins with dynamic linking: resource efficiency and consistency. If you run 10 apps that have a library statically linked, you have ten copies of that library in memory. That's...bad. If you run 10 apps that have the same library dynamically linked, you have one copy in memory. That's much better. If you run 10 apps that have a library statically linked, and there turns out to be a bug in that library, each of the apps must be updated separately to fix the bug. If you run 10 apps that have a library dynamically linked, and there turns out to be a bug in the library, then your distro can ship a fix for the library and all the apps get it 'magically'. The apps don't have to be updated at all. This is, of course, especially important for security issues. So please don't dismiss the advantages of dynamic linking; distros don't do it for fun. But yes, in practice, third party distributors are going to favour static linking, and you are right to say that various things already exist which are quite close to what I'm describing: Steam is one that I mentioned, 0install is one that you mentioned. I think 0install doesn't have a lot of buy-in because it makes some choices that distros aren't happy with in terms of its overall design.
helsinkiharbour wrote on 2013-04-30 20:31:
@adamw: I stand by my point. The memory efficiency argument was maybe valid in the nineties; nowadays, with 100 times more RAM, it's no longer enough of a problem that someone should spend a second of work to save some kBs on libraries in memory (this should be fixed automatically at the system level by reusing memory pages of identical libraries). Whether the security argument is critical enough in the real world to outweigh the disadvantages is debatable; for my taste it is not. It's too rare and theoretical a scenario, and the application providers themselves care as much as (or even more than) distros about being up to date, so I doubt this could be a practical problem that outweighs the disadvantages of tightly binding all apps together with the complete operating system.
adamw wrote on 2013-04-30 20:34:
helsinki: it's hardly a 'theoretical' problem, it happens all the time on Windows, where static linking is much more common. All sorts of unnecessary updates and vulnerabilities on Windows are the result of the fact that Windows apps tend to statically link a lot of stuff, so when a compromise is found, they all need to ship updates, and some app vendors just don't do it, so their apps remain insecure. You'll find people with experience building distros are pretty reluctant to run third party, statically built packages, FWIW. I avoid it like the plague.
Thomas Leonard wrote on 2013-05-01 09:54:
"I think 0install doesn’t have a lot of buy-in because it makes some choices that distros aren’t happy with in terms of its overall design." Could you be more specific on what these problems are? Almost all distributions include 0install in their repositories (Ubuntu, Fedora, OpenSUSE, Debian, Mint, etc), so I guess they aren't that unhappy...
helsinkiharbour wrote on 2013-05-01 11:18:
OK, it could be argued the other way around: an insecure lib which is updated system-wide in parallel makes all apps insecure at once, therefore multiplying the damage potential. It could be called a zero-sum game, trading frequency of security breaches against severity. But let's talk about the known disadvantages of having the apps tightly integrated into the system (by package management): Ubuntu's Matthew Paul Thomas identified this approach as not scaling well enough to fulfil the app needs of the users. http://www.youtube.com/watch?v=GT5fUcMUfYg What about the significant chance that an application breaks because of subtle changes in a library updated system-wide? Regularly something slips through the tests (more often than potential security breaches). What about the lack of portable apps ("stickware/usb-ware") in the Linux ecosystem, as apps rely on specific system-wide libraries? That works great on Mac and Windows. What about the user's limited freedom in selecting app versions? Like the most recent one, an older one, or several versions in parallel, which are not (or not yet, or not anymore) in the repository? (Described in this Ubuntu report: https://bugs.launchpad.net/ubuntu/+source/software-center/+bug/578045) In the end, I think choice and freedom for the users is the crucial argument for a distro-agnostic bundle approach (or, as you called it, "statically linked"). The freedom aspect outweighs all the security concerns raised in favour of distro-centralized package management approaches, which enforce strict system integration of apps and limit choice. Or as Ingo Molnar put it recently: "Users want to be free to install and run untrusted code." (https://plus.google.com/109922199462633401279/posts/VSdDJnscewS) (Years ago Ian Murdock also put it similarly: "And, no, moving everything into the distribution is not a very good option.
Remember that one of the key tenets of open source is decentralization, so if the only solution is to centralize everything, there’s something fundamentally wrong with this picture.)" http://ianmurdock.com/linux/software-installation-on-linux-today-it-sucks-part-1/)
kparal wrote on 2013-05-01 21:52:
adamw: nice hat! :-)
adamw wrote on 2013-05-02 06:33:
kparal: I have a wall full of 'em :)
LinuxLover wrote on 2013-05-03 14:31:
I don't think the package format is as important as distros getting together on a standard package manager. It would be so nice to jump from Fedora to Mageia to Ubuntu to Arch and all use the same package manager. It doesn't matter what type of package it is, as long as installing it is trivial without learning a new GUI or new commands. I know there was an attempt at everyone getting around PackageKit for a front end, but that seems to have died off.
adamw wrote on 2013-05-03 16:50:
LinuxLover: that's again something that doesn't need to be done by the distros, really - you just need a universal third-party platform, one all the third-party distributors agree to use. It could be Steam. It could be something built on OBS (which Bryan and I discussed in our Youtube chat, which seems to have been eaten by the internet unfortunately). But yup, sure, obviously such a system would need a universal front end.