Fedora migration: complete

A bittersweet afternoon: I just completed migration of my mail server from Mandriva 2010.1 to Fedora 15. MDV 2010 is out of support now, and I don't trust MDV 2011. I could've gone with Mageia, but it just seemed simpler to stick to Fedora for everything. So now all of my personal machines and servers are on Fedora. It's a shame to not be running MDV any more after all this time - nearly exactly ten years. But I feel more confident now about an all-Fedora setup and it'll make things easier.

I also took the opportunity to move from courier to dovecot, which seems to be generally better-maintained and have quite a few capabilities courier does not. The conversion went remarkably smoothly, partly thanks to the great documentation and tools on the dovecot site, and it sure seems to be super fast so far. No problems noted as of yet - fingers crossed! But if I don't seem to be replying to your emails, ping me on IRC and sound the alarm...

Fedora 16 Beta RC2 in validation now!

So it's been a long hard slog - see this mini-essay for some details - but we finally have a second release candidate for the Fedora 16 Beta. (Yes, you read that right, a release candidate for a Beta: this is not mass-market stuff). If you want to help us ensure the Beta release goes out on time (well, only a week late...), you can download the candidate images and do some of the validation tests: installation, base, and desktop. The individual tests are small and mostly pretty easy to do, and you can do a lot of the tests in a virtual machine or (for the desktop tests) from a live image.

If all the required tests for a Beta release pass, we will give this candidate the official QA thumbs-up to go out as the Beta: if not, we'll have to get the bugs fixed and roll a third release candidate. If the test coverage is incomplete, we can't safely sign off on the release. So we really need to get all the tests done!

For more details and download locations, see the announcement post. Thanks!

Input method Test Day today

Sorry again for the short notice, but it's Test Day time once more. Today - this very day, 2011-09-22! - is Fedora 16 input method test day. We'll be focusing on the new input method features in Fedora 16, including the ibus extension for GNOME Shell. If your language uses a special input method in Fedora, please come along to the Test Day and help us ensure it'll work smoothly in Fedora 16.

You can join #fedora-test-day on Freenode IRC to chat with developers and QA team members to get help with the testing and any issues you encounter, or you can just follow the instructions on the Wiki page and file your results there. If you're not familiar with using IRC, you can join in with WebIRC. The testing is easy and can be done with a live image, so there's no need to install Fedora 16. Thanks for helping!

Fedora 16 Beta: coming a bit less soon to a PC near you

Well, we're slipping the Fedora 16 Beta release.

Let's take a look at why - as I've done for previous slips.

We tried something new to help avoid slips this time: we released Fedora 16 Beta TC1 (Test Compose 1 - the first full 'dry run' for any given Fedora release or pre-release is a Test Compose, a full set of ISO images built like a regular release by release engineering, for QA to run validation tests on) a week ahead of the original schedule, on 2011-08-31 instead of 2011-09-06. While this obviously didn't succeed in preventing a slip in the end, we feel like it had a positive effect; the extra week wasn't wasted, and we think there's a significant chance the Beta would have been in danger of slipping two weeks without the early start on testing.

The accepted Fedora 16 Beta blocker bugs that were not addressed at the time of the Go/No-Go meeting are:

735866: "boot hangs with udevadm settle - timeout of 120 seconds"

This bug was identified early in the Beta cycle, on 2011-09-05. However, we have not succeeded in identifying a specific cause for it, and it's possible there may be multiple bugs with the same symptom. Although it was initially accepted as a blocker, it has not proven to be a really severe problem in later Beta testing, and it's possible this bug would be revised to non-blocker status on Friday. If we'd had no other blockers, we might have downgraded this one and gone ahead with the Beta release. It manifests itself as a 2 minute delay to system boot which seems to happen intermittently - we have not found a system on which it happens at every boot. On boots when you hit the bug, you would also not be able to run the live installer successfully. The workaround is simply to reboot until you don't hit the bug.

737731: "Bootloader is left in F15 configuration when preupgrading to F16"

This is part of a family of bugs which is related to a major change in Fedora 16: the switch to the grub2 bootloader. Previously, when upgrading Fedora, both preupgrade and 'normal' anaconda-based upgrades would default to updating the existing bootloader configuration with the kernel from the new release, rather than re-installing the bootloader. With Fedora 16, the 'update' option is no longer available, and the intention is that upgrades will write a new grub2 bootloader. preupgrade, however, needs to be patched to actually request that anaconda writes the new bootloader. This bug was recognized quite early, but there is some QA fail associated with it. There was a similar bug for regular anaconda-based upgrades, to change the default action to 'write a new bootloader', and make that action work. We assumed preupgrade would follow the anaconda default and hence be fixed as well, but in actual fact, preupgrade writes a kickstart file and passes it to anaconda, and defines the bootloader action in that, and so will need a separate fix. We should have checked more closely into this bug rather than assuming it would be fixed.

738803: "SELinux denial(s) prevent(s) gnome-shell from starting on F16 Beta RC1"

This bug was identified as soon as the Beta RC1 build was made available, on 2011-09-15. It's been the focus of our work since, as it's a clearly critical bug that proved to be complex to diagnose. The symptom of the bug is that a file in the home directory of the 'liveuser' account on live images had the wrong SELinux context, which caused gnome-settings-daemon to crash when its helper processes could not access it, which caused GNOME itself not to start correctly.

At first we were confused as to why the bug had suddenly appeared in Beta RC1 when neither selinux-policy nor the relevant GNOME components appeared to have changed in a way that would cause the bug. We finally worked out that a different bug in the live image generation tools had been masking this bug: it had caused the /home/liveuser/.local directory to be owned by the root user, which prevented the problematic file from being created at all. gnome-settings-daemon does not crash if the problematic file is not present - only if it is present, but incorrectly labelled - and so previous builds had not exhibited the problem. We were also confused by the fact that the behaviour with a non-live install, or when creating and logging into a new user on an existing F16 install, was different: this turned out to be due to yet a further bug in selinux-policy which did not trigger in the live boot case.

After clearing up both those sources of confusion, and fixing the obscuring bugs, we were finally able to establish that this problem was down to the 'filetrans' mechanism, by which the kernel should monitor newly created files and directories and apply the correct SELinux labels to them, not working correctly. Fixing this issue ultimately proved to be the resolution for this bug. However, we were only able to establish the real problem and build a fix at a time that was too late to allow us to meet the release deadline.

I don't think that either QA or the developers did much wrong in this case; it was just a very tricky bug to pin down, and the process took longer than the time we had available.
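
To make the masking effect a little more concrete, here is a toy model in Python of the chain of conditions described above. It is only a sketch of the reasoning: the function and parameter names are made up for illustration and do not correspond to anything in GNOME, selinux-policy or the live image tools.

```python
# Toy model of the bug-masking described above. All names are hypothetical.

def gsd_starts(file_present, label_correct):
    """gnome-settings-daemon only crashes on a present-but-mislabelled file."""
    return (not file_present) or label_correct

def live_session_works(local_dir_owned_by_root, filetrans_works):
    # If ~/.local is owned by root, liveuser cannot create the file at all.
    file_present = not local_dir_owned_by_root
    # A newly created file only gets the right label if filetrans works.
    label_correct = file_present and filetrans_works
    return gsd_starts(file_present, label_correct)

# Earlier builds: ownership bug present, so the labelling bug stays hidden.
print(live_session_works(local_dir_owned_by_root=True, filetrans_works=False))
# Beta RC1: ownership bug gone, labelling bug exposed.
print(live_session_works(local_dir_owned_by_root=False, filetrans_works=False))
# After the fix: both problems resolved.
print(live_session_works(local_dir_owned_by_root=False, filetrans_works=True))
```

Only the middle case fails, which is why the problem appeared to come out of nowhere when the ownership bug went away.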

738964: "Unable to make system bootable due to bootloader choice"

This was another bug that relates to the introduction of the grub2 bootloader, and in this case, the introduction of GPT disk labels.

The GPT disk label format will replace the MS-DOS disk label format (often referred to simply as the MBR) which has been used on just about all disks in PCs for decades. It permits a more flexible partition table and partitions greater than 2TB in size, among other improvements. However, as with any Shiny New Thing, it introduces complexities.
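
The 2TB figure falls out of simple arithmetic. The sketch below assumes the traditional 512-byte sector size, the 32-bit sector fields of an MS-DOS partition table entry and the 64-bit fields of a GPT entry; the names are just for illustration.

```python
# Back-of-the-envelope check of the disk label size limits.
# Assumes 512-byte logical sectors; larger sectors would shift the limits.
SECTOR_SIZE = 512      # bytes per logical sector (assumed)
MBR_FIELD_BITS = 32    # width of the sector fields in an MS-DOS partition entry
GPT_FIELD_BITS = 64    # width of the sector fields in a GPT partition entry

def max_addressable_bytes(field_bits, sector_size=SECTOR_SIZE):
    """Largest size expressible with a sector count of the given bit width."""
    return (2 ** field_bits) * sector_size

mbr_limit = max_addressable_bytes(MBR_FIELD_BITS)
gpt_limit = max_addressable_bytes(GPT_FIELD_BITS)

print(mbr_limit // 2 ** 40, "TiB")  # MS-DOS label limit
print(gpt_limit // 2 ** 70, "ZiB")  # GPT limit
```

With these assumptions the MS-DOS label tops out at 2 TiB, while the GPT limit comes out at 8 ZiB, far beyond any disk you can buy.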

With the old MS-DOS disk label / MBR system, there was a space behind the MBR on the disk in which a bootloader could expand its second stage. With GPT disk labels, there is no such space.

GPT is associated with the new EFI system firmware specification, which will eventually replace BIOS. On EFI-based systems, bootloaders can be installed into the EFI system partition. On BIOS-based systems, however, there is no such partition. Consequently, if you want to boot from a GPT-labelled disk on a BIOS-based (rather than EFI-based) system, it is necessary to put a special 'BIOS boot partition' on the disk. grub2 will use this partition to contain its second stage.

The upshot of this is that, in Fedora 16, there is a tighter relationship between bootloader installation and partitioning than was previously the case. With grub and MS-DOS disk labels, it was never the case that you needed to partition a drive in a specific way in order to install a bootloader to its MBR. With grub2 and GPT disk labels, this is the case.
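
Put as a rule, the new dependency looks something like the following Python sketch. The function name and the string values are invented for illustration; this is not anaconda code.

```python
# Hypothetical helper expressing the rule described above.

def needs_bios_boot_partition(firmware, disk_label):
    """Return True if booting from this disk needs a BIOS boot partition.

    firmware:   "bios" or "efi"
    disk_label: "msdos" or "gpt"
    """
    return firmware == "bios" and disk_label == "gpt"

# Print the full table of combinations.
for firmware in ("bios", "efi"):
    for label in ("msdos", "gpt"):
        print(firmware, label, needs_bios_boot_partition(firmware, label))
```

Only the BIOS plus GPT combination comes out true, and that is exactly the combination where partitioning and bootloader installation can no longer be treated separately.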

The Fedora installer, anaconda, turns out to have been designed with the assumption that partitioning and bootloader installation do not need to be closely related. This bug essentially encapsulates the problems that emerge when such an installer is used in a situation in which the two are closely related. It's notoriously difficult to work out all the implications of implementing a Shiny New Thing like the new disk label system and bootloader used by Fedora 16 in advance; pre-release testing exists in part to identify these kinds of issues.

Essentially, we discovered various cases in which anaconda would not allow you to install the bootloader in the logical and desired location, or would even install it in an inappropriate location silently and automatically.

anaconda has a test it runs to determine if a device is a valid one on which to place a bootloader's first stage. This test fails if the device in question is the MBR of a GPT-labelled drive, and the system is BIOS-based. Now, anaconda's logic had already been fortified to handle the case where such a drive was the only one being used in the installation: anaconda would realize the drive needed a BIOS boot partition, and either create one automatically (if automatic partitioning was in use) or prompt the user to create one (in manual partitioning).

However, the logic proved to be unequal to the case where multiple drives (of various types) were available to the installer. In the first noted case, anaconda would sometimes consider the USB stick from which it was being installed (if the user had written the live image, or DVD image, to a USB stick) as a potential bootloader target. In another case, the user might have two hard disks, one with a working OS installation and an MS-DOS disk label, and the other being formatted and Fedora 16 installed on it.

In both of these cases, anaconda would run its 'is this a valid bootloader location?' test on both the actual target installation drive, and the 'other' drive - the USB stick in one case, the 'working OS' drive in the other. This test would often pass - after all, the drive would likely have an MS-DOS disk label, and hence wouldn't need any special handling to have a bootloader written to it.

Because it had found one drive which looked like a valid bootloader location, anaconda wouldn't insist on the creation of a BIOS boot partition on the actual target drive for Fedora 16 installation.
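
Here is a simplified Python model of that logic flaw. It is emphatically not anaconda's actual code: the Disk class, the function names and the 'stricter' variant are all invented for illustration, and the stricter variant is not the fix that was eventually chosen.

```python
from dataclasses import dataclass

@dataclass
class Disk:
    name: str
    label: str                   # "msdos" or "gpt"
    has_bios_boot: bool = False  # does a BIOS boot partition exist on the disk?
    is_target: bool = False      # is Fedora being installed to this disk?

def valid_stage1(disk, firmware="bios"):
    """Per-device test: can a bootloader's first stage go on this disk's MBR?"""
    if firmware == "bios" and disk.label == "gpt":
        return disk.has_bios_boot
    return True

def must_create_bios_boot_flawed(disks):
    """Flawed: satisfied as soon as ANY visible disk passes the test."""
    return not any(valid_stage1(d) for d in disks)

def must_create_bios_boot_stricter(disks):
    """Stricter (hypothetical): only the installation target disks count."""
    return not any(valid_stage1(d) for d in disks if d.is_target)

target = Disk("sdb", "gpt", is_target=True)
usb_stick = Disk("sdc", "msdos")

print(must_create_bios_boot_flawed([target]))               # single-disk case
print(must_create_bios_boot_flawed([target, usb_stick]))    # multi-disk case
print(must_create_bios_boot_stricter([target, usb_stick]))
```

In the single-disk case the check correctly demands a BIOS boot partition. As soon as the USB stick is visible, the flawed check is satisfied by the wrong disk and stops demanding one, while the stricter variant still does.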

If you selected manual partitioning, Fedora would prompt you for a bootloader location after package installation. If you encountered this bug, you would find that the MBR of the target drive was not an available choice - because no BIOS boot partition had been created on the target drive. You could only choose to install the bootloader to the root or /boot partition of the target drive, or to the MBR of the 'other' drive. Neither of these was likely to be what you actually wanted to do.

If you selected a form of automatic partitioning, the impact was even worse - anaconda would automatically (and silently) install the bootloader to the MBR of the 'other' drive. So if you installed Fedora 16 Beta RC1 to the second drive of a system whose first drive contained a working Fedora 15 installation, the first drive's bootloader would be overwritten with a new grub2 bootloader, and the second drive (where you installed Fedora 16) would get no bootloader at all. This configuration might well work, but it would not be what you wanted. If you hit the USB key case, Fedora would install the bootloader to the MBR of the USB key, leaving you confused when you booted the system without the key attached and found it failed to boot (or plugged the key in and found it had grown a bootloader you did not expect).

Again, this bug proved somewhat tricky to pin down, as it's easy not to hit it, or to hit it in various circumstances whose results seem quite different. It was reported shortly after the Beta RC1 release, but only completely diagnosed on 2011-09-19, which was already likely too late to meet the release schedule. It's possible that we could have caught and diagnosed this bug earlier, but I'm not confident in stating that we ought to have done. It's also not straightforward to decide how to fix it, and the fix we eventually decided upon is quite drastic and will require extensive testing.

739523: "unable to shut down from gdm greeter"

This bug was only discovered quite recently. It's not as obviously critical a bug as the previous few, but it is a clear violation of the Beta criteria. QA could clearly have performed better here by doing the desktop validation testing more promptly: this bug would easily have been exposed by one of the desktop validation tests. However, a lot of our attention and resources were diverted to the more obvious bugs. On balance, we ought to have identified this bug earlier, but doing so would not actually have made a material difference to the release, as the other bugs would still have been present.

I can definitely see areas in the bugs we hit, and the speed with which we were able to discover and diagnose them, where the QA team can improve our performance. However, I think ultimately the slip would have been very difficult to avoid; the SELinux and bootloader bugs were simply too complex to diagnose and fix in time, even though we did discover them with several days to spare in the schedule. The SELinux bug is one of those perfect storms of a somewhat obscure bug and several complicating circumstances which seem to arise once in a while and seem to be essentially impossible to design out of a six-month release cycle. The bootloader issue is a consequence of the kind of major change which Fedora exists to do: it's unfortunate, but again, quite difficult to avoid. It is very, very difficult for a large and complex project like anaconda to figure out such a consequence of the grub2/GPT change in advance. It would have been possible for QA to discover and diagnose it sooner, though, and we will be extending our validation test case set to include some test cases which ought to aid in the discovery of similar cases in the future.

Well, I hope that helped to clarify some of the considerations that go into making (and delaying) a Fedora release!

As of this month, I am Red Hat's Fedora QA team lead, which means I'm responsible for directing the efforts of the Red Hat staff who form a part of the Fedora QA team. Combined with my existing community manager role, this means I'm probably in the position of being the 'point person' for Fedora QA, responsible for ensuring we do our job of validating releases and pre-releases comprehensively and promptly. I'm still learning in the role and definitely feel that I'm not quite living up to the high standard set by James Laska yet, but I'm hoping to learn the lessons from each pre-release as we go and try to do a better job of ensuring the team gets the necessary testing done efficiently. Any errors or sub-par performance by the QA team in this release should be considered to rest on my shoulders, not those of the other RH staff and community members who make up the team, all of whom did sterling work. In particular I'd be remiss not to thank (if I may presume to do so) Tim Flink, Andre Robatino, Athmane Madjouj, Jóhann Guðmundsson, Mads Killerich, Thomas Gilliard, Dennis Gilmore, and the anaconda and selinux teams for going far above and beyond the call of duty in trying to get the release out on time. I know we all gave it our best shot.

We're pretty confident that we'll be able to get the Beta out with just the one week's delay, so look out for it on 2011-10-04. It should be an exciting release, and thanks to this strict release validation process, it may be a bit late, but it should be a pretty solid Beta.

Site certificate update

Just a quick note that I've updated the SSL cert for happyassassin.net. It's still self-signed and hence you'll have to allow an override if you want to use https to connect to the site for some reason, but at least it's not expired any more, and has a 'correct' CN, so if you're using Convergence, it should work.

Graphics Test Week this week!

Sorry for the short notice everyone, but this week is Graphics Test Week again. It's time to make sure the major graphics drivers are in shape for Fedora 16. Tomorrow, 2011-09-06, is Nouveau Test Day. Wednesday, 2011-09-07, is Radeon Test Day. And Thursday, 2011-09-08, is Intel graphics Test Day. All the events take place in #fedora-test-day on Freenode IRC.

As always, the more testing we can get, the better. Fedora graphics developers and QA team members will be on hand to help with testing and debugging. If you've been to a graphics Test Day before it would be great to check in and see if any bugs you hit have been resolved, or if things are still working well. If you've never done one before, please come along and add your hardware to the repertoire!

The testing is easy and there are full instructions on the wiki pages. You don't need a Fedora account to do the testing or to file the results, but you will need a Bugzilla account to file bugs.

If you can't make the day for your graphics card, don't worry - you can come along on either of the other days too, or do the testing later and add your results to the wiki page. Even filed 'late' they will still be useful!

To join in, just visit the page for your graphics card and follow the test instructions, then add your results to the table at the bottom of the page - there are instructions on how to do it. You can join #fedora-test-day to chat with the QA team and the developers.

Please come along and help us make sure Fedora 16's graphics experience is second to none!

Logic flaw?

From this story about the police arresting some kids with blacked-out pellet guns:

"These toy guns look realistic to us as police officers, especially at night, and especially when the orange muzzles have been blackened out. We can't tell the difference. And if we come across somebody who's carrying a firearm like that, they are going to be challenged."

So...if I buy a real assault rifle and paint an orange muzzle on it, I can carry it around anywhere and you'll assume it's a fake? Score!

Fedora 16 localization / internationalization test week starts tomorrow!

Tomorrow - Tuesday 2011-08-30 - holds the first event of Fedora 16 localization / internationalization (l10n / i18n) Test Week: desktop localization Test Day. This event will mostly be focused on checking that translations are present, working and correct for key Fedora components like the system configuration tools. Please, if you speak any language other than English, come along and help ensure that things are in shape for the Fedora 16 release. It's very easy to test - just grab a live image, boot it up, run one of the commands listed, and check the result! There will be Fedora QA and translation team members around to help you test and report your results. Find all the instructions on the Wiki page and come join us in #fedora-test-day on Freenode IRC - use WebIRC if you're not an IRC regular.

Coming up on Wednesday 2011-08-31 is the installer i18n / l10n Test Day, focusing specifically on the Fedora installer, and then on Thursday 2011-09-01 is the desktop i18n Test Day, focusing on internationalization issues like keyboard layouts, input methods and language-specific fonts. Come help out all week and make sure Fedora 16 rocks no matter what language you speak!