Hi folks! So in the last couple of days a significant issue in all Fedora releases has come to our attention, affecting (so far) several systems that use the Intel ‘Skylake’ hardware platform. Systems that appear to be affected so far – at least with some system firmware versions – include: Lenovo ThinkPad T460, Lenovo ThinkPad X260, Lenovo Yoga 260, ASUS Zenbook UX305CA, ASUS Zenbook UX303UB, Samsung Notebook 9.
The problem appears like this: you install a kernel update, and the new kernel fails to boot, failing very early in the boot process (right after the boot loader). Older kernels boot fine.
For those who don’t want to read much, there are a few workarounds available if this is the bug you’re hitting:
- Boot an older kernel.
- Boot with the kernel parameter dis_ucode_ldr
- Downgrade the microcode_ctl package to version 2.1-11 or earlier and force an initramfs rebuild, either with dracut or by reinstalling the packages for the kernels that don’t boot.
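As a rough sketch, the last two workarounds might look something like this on a Fedora system that uses dnf and grubby. The helper names (run, disable_early_microcode, downgrade_and_rebuild) and the DRY_RUN switch are made up for illustration, and by default the script only prints what it would do. A plain dnf downgrade may not land on 2.1-11 or earlier, so you may need to name an exact older build that is available to you.

```shell
# Sketch only. Prints commands by default; set DRY_RUN=0 and run as root to apply.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Workaround: add dis_ucode_ldr to the boot arguments of every installed kernel.
disable_early_microcode() {
    run grubby --update-kernel=ALL --args=dis_ucode_ldr
}

# Workaround: downgrade microcode_ctl, then rebuild the initramfs for one kernel.
# $1 is the version string of a kernel that fails to boot (its name under /lib/modules).
downgrade_and_rebuild() {
    run dnf downgrade microcode_ctl
    run dracut --force --kver "$1"
}
```

To try it, call downgrade_and_rebuild with the version string of a kernel that won’t boot, check the printed commands, and only then re-run with DRY_RUN=0.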
You may also find that a system firmware update is available from your system manufacturer, and updating the system firmware makes the bug go away. So do please check with your manufacturer’s site and try updating your system firmware if there’s an update available.
We’re sorry for the inconvenience, and we’re looking at better fixes at present.
The story behind this bug is that it’s not actually a kernel bug at all. It’s a bug in microcode_ctl. This is a package which contains both processor ‘microcode’ updates and a loader for such updates, for Intel processors. You can think of processor microcode as being kind of like firmware for your processor; this mechanism lets Intel correct bugs and improve behaviour in processors after they’ve been released and shipped out. It also occasionally lets them break stuff, like in this case. 🙂
The way this mechanism works on most Linux distros (this bug is affecting other distros as well as Fedora, btw) is that if there’s a microcode update for the CPU in your system at the time an initramfs is built, the update and a loader mechanism for it are built into the initramfs in such a way that they load very early during initramfs initialization, which is as early in the boot process as we can manage. If there is no microcode update for your CPU, you get an initramfs without this trickery.
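If you want to see whether a given initramfs carries an early microcode update, something like the following should work. It’s a sketch that assumes dracut’s lsinitrd tool is installed and that the Intel update sits at the usual early-load path, kernel/x86/microcode/GenuineIntel.bin; the function name has_early_microcode is just made up for this example.

```shell
# Reads `lsinitrd` output on stdin.
# Succeeds if the listing contains the early Intel microcode blob.
has_early_microcode() {
    grep -q 'kernel/x86/microcode/GenuineIntel.bin'
}

# Usage (the path is the usual Fedora location for the running kernel's initramfs):
#   lsinitrd "/boot/initramfs-$(uname -r).img" | has_early_microcode \
#       && echo "microcode update included"
```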
Before microcode_ctl-2.1-12, there was no microcode update for the affected CPUs.
microcode_ctl-2.1-12 has added a microcode update for these CPUs, so all initramfs’es built after microcode_ctl is updated to that version will include the update. The bug seems to be that on some system firmwares, the microcode load fails and hangs the system. On other system firmwares the microcode loads fine and the system boots.
The reason the bug appears when you update your kernel – rather than when you update microcode_ctl – is simply that updating microcode_ctl does not trigger any initramfs rebuilds; your existing installed kernels will still have initramfs’es with no microcode update loader mechanism. But when you install a new kernel, a new initramfs is generated for it, and now it will include the microcode update, and thus hit the bug.
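One rough way to guess which installed kernels are at risk is to compare each initramfs’s modification time with the time microcode_ctl was last installed. This is only a heuristic sketch: the function names are invented here, it assumes initramfs images live in /boot as initramfs-*.img, and the rpm install time will be misleading if you have since downgraded or reinstalled the package.

```shell
# Succeeds if the image mtime ($1) is later than the package install time ($2).
# Both arguments are seconds since the epoch.
built_after_update() {
    [ "$1" -gt "$2" ]
}

# Print every initramfs in /boot that was built after microcode_ctl was installed.
list_suspect_initramfs() {
    local pkg_time img
    pkg_time="$(rpm -q --qf '%{INSTALLTIME}\n' microcode_ctl)" || return 1
    for img in /boot/initramfs-*.img; do
        [ -e "$img" ] || continue
        if built_after_update "$(stat -c %Y "$img")" "$pkg_time"; then
            echo "$img"
        fi
    done
}
```

Running list_suspect_initramfs prints the images that were generated after the package update; those are the ones that may contain the new microcode.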
This is why you can work around the issue by downgrading microcode_ctl and then regenerating the initramfs for affected kernels. It also means that if you regenerate the initramfs for a kernel that was working fine after microcode_ctl has been updated, that kernel will stop working.
The dis_ucode_ldr kernel parameter simply disables this microcode loading mechanism, which obviously avoids the bug happening.
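To confirm the parameter actually made it onto the command line of the kernel you booted, a small check along these lines should do. The function name loader_disabled is hypothetical; it reads /proc/cmdline unless you pass it another file.

```shell
# $1: file holding a kernel command line (defaults to the running kernel's).
# Succeeds only if dis_ucode_ldr appears as a whole, separate argument.
loader_disabled() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qx 'dis_ucode_ldr'
}

# Usage:
#   loader_disabled && echo "early microcode loading is disabled for this boot"
```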