X crash during Fedora update when system has hybrid graphics and systemd-udev is in update

Hi folks! This is a PSA about a fairly significant bug we've recently been able to pin down in Fedora 24+.

Here's the short version: especially if your system has hybrid graphics (that is, it has an Intel video adapter and also an AMD or NVIDIA one, and it's supposed to switch to the most appropriate one for what you're currently doing - NVIDIA calls this 'Optimus'), DON'T UPDATE YOUR SYSTEM BY RUNNING DNF FROM THE DESKTOP. The same goes if you have multiple graphics adapters that aren't strictly 'hybrid graphics': the bug can affect any system with more than one graphics adapter.

Here's the slightly longer version. If your system has more than one graphics adapter, and you update the systemd-udev package while X is running, X may well crash. So if the update process was running inside the X session, it will also crash and will not complete. This will leave you in the unfortunate situation where RPM thinks you have two versions of several packages installed at the same time (and also a bunch of package scripts that should have run will not have run).
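If an update did get interrupted this way, one rough way to spot the leftovers is to look for package names that appear more than once in the RPM database. This is only a sketch: find_duplicate_packages is a made-up helper name, and packages that are legitimately installed in several versions at once (kernels, for instance) will show up in the output too, so treat the result as a hint rather than a diagnosis.

```shell
# Hypothetical helper: print every line that occurs more than once on stdin.
# Intended input is one "name.arch" entry per installed package.
find_duplicate_packages() {
    sort | uniq -d
}

# On a real system, feed it the RPM database contents, for example:
#   rpm -qa --qf '%{NAME}.%{ARCH}\n' | find_duplicate_packages
```

An empty result means no name.arch pair is installed twice. Anything listed besides the usual multi-version packages is worth a closer look.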

The bug is actually triggered by restarting systemd-udev-trigger.service; anything which does that will cause X to crash on an affected system. So far only systems with multiple adapters are reported to be affected; not absolutely all such systems are affected, but a good percentage appear to be. It occurs when the systemd-udev package is updated because the package %postun scriptlet - which is run on update when the old version of the package is removed - restarts that service.
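To check whether the package version you have installed still does this, you can have rpm print its scriptlets with rpm -q --scripts and search them for the service name. A minimal sketch, assuming the restart appears in the scriptlet text with the unit name spelled out (the helper name is hypothetical):

```shell
# Hypothetical helper: succeed if the scriptlet text on stdin mentions
# the systemd-udev-trigger unit at all.
scriptlets_touch_udev_trigger() {
    grep -q 'systemd-udev-trigger'
}

# On a real system:
#   rpm -q --scripts systemd-udev | scriptlets_touch_udev_trigger \
#       && echo "installed systemd-udev scriptlets reference the trigger service"
```

This only tells you the unit is mentioned somewhere in the scriptlets, not which scriptlet mentions it or what it does with it, so read the full rpm -q --scripts output before drawing conclusions.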

The safest possible way to update a Fedora system is to use the 'offline updates' mechanism. If you use GNOME, this is how updates work if you just wait for the notifications to appear, the ones that tell you you can reboot to install updates now. What's actually happening there is that the system has downloaded and cached the updates, and when you click 'reboot', it will boot to a special state where very few things are running - just enough to run the package update - run the package update, then reboot back to the normal system. This is the safest way to apply updates. If you don't want to wait for notifications, you can run GNOME Software, click the Updates button, and click the little circular arrow to force a refresh of available updates.

If you don't use GNOME, you can use the offline update system via pkcon, like this:

sudo pkcon refresh force && \
sudo pkcon update --only-download && \
sudo pkcon offline-trigger && \
sudo systemctl reboot

If you don't want to use offline updates, the second safest approach is to run the update from a virtual terminal. That is, instead of opening a terminal window in your desktop, press Ctrl+Alt+F3 and you'll get a console login screen. Log in and run the update from this console. If your system is affected by the bug, and you leave your desktop running during the update, X will still crash, but the update process will complete successfully.
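If you want a guard against absent-mindedly starting an update from a desktop terminal, something like the following could go in your shell startup file. It's a sketch under a simple assumption: a shell inside a graphical session has DISPLAY or WAYLAND_DISPLAY set, and a plain console login does not. That heuristic isn't perfect (an SSH session with X forwarding also sets DISPLAY, for example), and both function names are invented for illustration.

```shell
# Hypothetical helper: succeed if this shell appears to be running inside a
# graphical session, judged only by the usual display environment variables.
in_graphical_session() {
    [ -n "${DISPLAY:-}" ] || [ -n "${WAYLAND_DISPLAY:-}" ]
}

# Hypothetical wrapper: refuse to start the update from a graphical session,
# otherwise run a normal dnf upgrade.
update_from_console() {
    if in_graphical_session; then
        echo "Refusing to update from a graphical session; switch to a virtual terminal first." >&2
        return 1
    fi
    sudo dnf upgrade
}
```

From a console login, update_from_console simply runs the upgrade. From a desktop terminal it prints the warning and returns without touching anything.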

If your system only has a single graphics adapter, this bug should not affect you. However, it's still not a good idea to run system updates from inside your desktop, as any other bug which happens to cause either the terminal app, or the desktop, or X to crash will also kill the update process. Using offline updates or at least installing updates from a VT is much safer.
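If you're not sure how many graphics adapters your system has, counting the relevant lines of lspci output gives a rough answer. This is a sketch, assuming lspci labels graphics devices with one of the three standard PCI class names used in the pattern below. The function name is made up for illustration.

```shell
# Hypothetical helper: count the lines on stdin that look like graphics
# adapters in lspci output, and print that count.
count_graphics_adapters() {
    grep -Ec 'VGA compatible controller|3D controller|Display controller'
}

# On a real system:
#   lspci | count_graphics_adapters
```

A result of 2 or more means you're in the group this bug has been reported against. As the comments below show, a result of 1 is not a guarantee that an update will never take X down.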

The bug reports for this issue are:

  • #1341327 - for the X part of the problem
  • #1378974 - for the systemd part of the problem

Updates for Fedora 24 and Fedora 25 are currently being prepared. However, the nature of the bug actually means that installing the update will trigger the bug, for the last time. The updates will ensure that subsequent updates to systemd-udev will no longer cause the problem. We are aiming to get the fix into Fedora 25 Beta, so that systems installed from Fedora 25 Beta release images will not suffer from the bug at all, but existing Fedora 25 systems will encounter the bug when installing the update.

Comments

Elvis wrote on 2016-10-05 00:19:
I just have to confirm: I had this issue yesterday, except I use a laptop with only Intel graphics. I had just switched from openSUSE Tumbleweed to Fedora, and installed using the regular install media (KDE spin). I then rebooted, switched to a virtual console, and installed all the updates (a ton!). Near the end of the update, it crashed and went to sddm (for some reason). I then rebooted and the newer kernel wasn't showing in GRUB. Applications weren't updated either (Firefox). And when I tried to update again, there were no updates. So this issue messed up my install, and was too much hassle to fix on my own. Earlier today I re-installed Fedora using the netiso. Now I'm not updating until this issue is fixed ;D
Elvis wrote on 2016-10-05 00:19:
I just have to add: I am using Fedora 24.
adamw wrote on 2016-10-06 03:52:

well, that sounds a bit odd, if you ran the update from a VT; X crashing shouldn't have stopped the update process.

Jake from State Farm wrote on 2016-10-05 14:40:
What about running the update in a screen or tmux session? Would that keep the update running even if you were in an xterm?
adamw wrote on 2016-10-06 03:51:

Should work fine, yeah. There's some uncertainty about how the new KillUserProcesses default in systemd would affect that case, but that's not in Fedora 24 or Fedora 25 yet. I didn't want to talk about tmux or screen in the main message because it's a bit advanced for a general audience.

JP wrote on 2016-10-05 16:17:
Updated my NVIDIA Optimus based laptop (running Fedora 24) using a virtual terminal, and the systemd-udev package update went smoothly. I rebooted afterwards from the virtual terminal, and I'm typing this from my X session now. Question: where can I stay up to date on when the fix is pushed to us Fedora 24 users? Should I keep this article in a tab and just check it every so often? I'm thinking I'll just be installing all updates from a non-X tty session for the foreseeable future...
adamw wrote on 2016-10-06 03:58:

Once you have systemd-229-16.fc24 installed, SUBSEQUENT updates to systemd-udev should be safe from this specific issue. When you actually install the 229-16.fc24 update, that update operation will still be vulnerable to the bug, because the offending service restart is in the %postun scriptlet of the old package version. We couldn't really design an update that would prevent the %postun of the old package from running; all we could do was make an update that would fix future updates. Thinking about it now, I suppose we could try giving systemd-udev a %pretrans that masks the service if the existing systemd-udev version is low enough that we know it's vulnerable to the bug, and a %posttrans that unmasks the service, but I'm not sure if there might be other issues with that. I'll have to run it by someone.

But honestly I wouldn't run dnf update direct from a terminal in a graphical desktop ever, it's just a bit too risky. There have been other cases like this before and there might be others again, and it's always possible you could just lose your X session to a driver crash or something while the update happened to be running. Most of the time it's going to be fine, but the occasional time when it isn't can really ruin your afternoon. It's easy enough to do it from a VT, or just from a screen or tmux session instead of direct from the graphical terminal app.

Eugene Regad wrote on 2016-10-06 13:58:
I've been running 'dnf update' every day in the default 'Terminal' icon, as installed by the Fedora installer. It's worked just about every time, for many years. Is it safe(r) to run pkcon from that terminal? Or is there some other access to the command line (other than the 'virtual' terminal?)
adamw wrote on 2016-10-06 22:48:

I don't think it's safer to use pkcon from X than dnf, no. If you don't like switching to a VT, you can use screen or tmux within your desktop terminal; that should allow the dnf process running within the screen or tmux session to survive if the desktop or terminal app crashes.

mt wrote on 2016-10-06 16:37:
This might affect more than just hybrid graphics stuff. On my FC25 machine, Xorg dumped core during today's update having only a single GPU installed (driven by Nouveau, with 2 screens attached). While this might be a different issue on the X side, the trigger seems to be the same: syslog shows udev-related service restarts just prior to the crash and the X log has device removals at its tail.
[ 2011.470] (II) config/udev: removing GPU device /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/drm/card0 /dev/dri/card0
[ 2011.470] xf86: remove device 0 /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/drm/card0
[ 2011.470] failed to find screen to remove
I'll see that I can get some time later today to investigate and update the bug, if necessary.
mt wrote on 2016-10-06 17:16:
Ah ok, this seems to be known already and is indeed a different X bug. https://bugzilla.redhat.com/show_bug.cgi?id=1381840
trman wrote on 2016-10-06 18:39:
Will double forking work? This will assign the process as a child of process 1:
( (sudo dnf -y update) & )
adamw wrote on 2016-10-06 22:46:

Possibly, I haven't tried. But you might just as well use tmux or screen...

Mohammad Mahdi Ramezanpour wrote on 2016-10-21 08:26:
Thank you for the post. I had the same issue on my HP G62 laptop and found a (temporary) solution for it. Whenever an update is available for systemd-udev I do the following:
1) Log out.
2) At the login screen, press Alt+F2 to launch a new terminal.
3) Log in as root.
4) Update systemd-udev from there.