Fedmsg is the Fedora project-wide messaging bus we’ve had since 2012. It backs FMN / Fedora Notifications and Badges, and is used extensively within Fedora infrastructure for the general purpose of “have this one system do something whenever this other system does something else”. For instance, openQA job scheduling and result reporting are both powered by fedmsg.
Over time, though, there have turned out to be a few issues with fedmsg. It has a few awkward design quirks, but most significantly, it’s designed such that message delivery can never be guaranteed. In practice it’s very reliable and messages almost always are delivered, but for building critical systems like Rawhide package gating, the infrastructure team decided we really needed a system where message delivery can be formally guaranteed.
There was initially an idea to build a sort of extension to fedmsg allowing for message delivery to be guaranteed, but in the end it was decided instead to replace fedmsg with a new AMQP-based system called fedora-messaging. At present both fedmsg and fedora-messaging are live and there are bridges in both directions: all messages published as fedmsgs are republished as fedora-messaging messages by a 0MQ->AMQP bridge, and all messages published as fedora-messaging messages are republished as fedmsgs by an AMQP->0MQ bridge. This is intended to ease the migration process by letting you migrate a publisher or consumer of fedmsgs to fedora-messaging at any time without worrying about whether the corresponding consumers and/or publishers have also been migrated.
This is just the sort of project I usually work on in the ‘quiet time’ after one release comes out and before the next one really kicks into high gear, so since Fedora 30 just came out, last week I started converting the openQA fedmsg consumers to fedora-messaging. Here’s a quick write-up of the process and some of the issues I found along the way!
I found these three pages in the fedora-messaging docs to be the most useful:
All the fedmsg consumers I wrote followed this approach, where you essentially write consumer classes and register them as entry points in the project’s setup.py. Once the project is installed, the fedmsg-hub service provided by fedmsg runs all these registered consumers (as long as a configuration setting is set to turn them on).
This exact pattern does not exist in fedora-messaging – there is no hub service. But fedora-messaging does provide a somewhat-similar pattern which is the natural migration path for this type of consumer. In this approach you still have consumer classes, but instead of registering them as entry points, you write configuration files for them and place them in /etc/fedora-messaging. You can then run an instantiated systemd service that runs fedora-messaging consume with the configuration file you created.
So to put it all together with a specific example: to schedule openQA jobs, we had a fedmsg consumer class called OpenQAScheduler which was registered as an entry point in setup.py, and had a config_key named “fedora_openqa.scheduler.prod.enabled”. As long as a fedmsg config file contained 'fedora_openqa.scheduler.prod.enabled': True, the fedmsg-hub service then ran this consumer. The consumer class itself defined what messages it would subscribe to, using its topic attribute.
In a fedora-messaging world, the OpenQAScheduler class is tweaked a bit to handle an AMQP-style message, and the entrypoint in setup.py and the config_key in the class are removed. Instead, we create a configuration file /etc/fedora-messaging/fedora_openqa_scheduler.toml and enable and start the fm-consumer@fedora_openqa_scheduler.service systemd service. Note that all the necessary bits for this are shipped in the fedora-messaging package, so you need that package installed on the system where the consumer will run.
That configuration file looks pretty much like the sample I put in the repository. This is based on the sample files I mentioned above.
amqp_url specifies which AMQP broker to connect to and what username to use: in this sample we’re connecting to the production Fedora broker and using the public ‘fedora’ identity. The callback specifies the Python path to the consumer callback class (our OpenQAScheduler class). The [tls] section points to the CA certificate, certificate and private key to be used for authenticating with the broker: since we’re using the public ‘fedora’ identity, these are the files shipped in the fedora-messaging package itself which let you authenticate as that identity. For production use, I think the intent is that you request a separate identity from Fedora infra (who will generate certs and keys for it) and use that instead – so you’d change the amqp_url and the paths in the [tls] section appropriately.
The other key things you have to set are the queue name and the routing_keys in the [[bindings]] section. The queue name appears twice in the sample file as 00000000-0000-0000-0000-000000000000: each consumer should have its own queue, so for each consumer you are supposed to generate a random UUID with uuidgen and use that as the queue name. The routing_keys are the topics the consumer will subscribe to; unlike in the fedmsg system, this is set in configuration rather than in the consumer class itself. Another thing you may wish to take advantage of is the consumer_config section: this is basically a freeform configuration store that the consumer class can read settings from. So you can have multiple configuration files that run the same consumer class but with different settings, and you might well have different ‘production’ and ‘staging’ configurations. We do indeed use this for the openQA job scheduler consumer: we use a setting in this consumer_config section to specify the hostname of the openQA instance to connect to.
So, what needs changing in the actual consumer class itself? For me, there wasn’t a lot. For a start, the class should now just inherit from object – there is no base class for consumers in the fedora-messaging world, so there’s no equivalent to fedmsg.consumers.FedmsgConsumer. You can remove things like the topic attribute (that’s now set in configuration) and validate_signatures. You may want to set up an __init__, which is a good place to read in settings from consumer_config and set up a logger (more on logging in a bit). The method for actually reading a message should be named __call__() (so yes, fedora-messaging just calls the consumer instance itself on the message, rather than explicitly calling one of its methods). And the message object the method receives is slightly different: it will be an instance of fedora_messaging.api.Message or a subclass of it, not just a dict. The topic, body and other bits of the message are available as attributes, not dict items. So instead of message['topic'], you’d use message.topic. The message body is message.body.
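To make the shape of a converted consumer concrete, here is a minimal sketch. It is not the real openQA scheduler code: the openqa_hostname setting name, the topic and the message contents are all made up for illustration, and a SimpleNamespace stands in for a real fedora_messaging.api.Message so the example runs without the library or a broker. In a real consumer I'd expect the settings to come from the consumer_config section of the TOML file (available, as far as I know, as fedora_messaging.config.conf["consumer_config"]) rather than from a constructor argument.

```python
import logging
from types import SimpleNamespace


class OpenQAScheduler(object):
    """Sketch of a fedora-messaging style consumer: no special base class."""

    def __init__(self, consumer_config=None):
        # Assumption: a plain dict stands in for the consumer_config section.
        # The default of None keeps the class constructible with no arguments.
        settings = consumer_config or {}
        # "openqa_hostname" is a hypothetical setting name.
        self.openqa_hostname = settings.get("openqa_hostname", "localhost")
        self.logger = logging.getLogger(self.__class__.__name__)

    def __call__(self, message):
        # The instance itself is called with each message; topic and body
        # are attributes of the message, not dict items.
        self.logger.info("Got %s for %s", message.topic, self.openqa_hostname)
        return (message.topic, message.body)


# Stand-in for a fedora_messaging.api.Message instance (illustration only).
fake_message = SimpleNamespace(
    topic="org.example.compose.finished",  # hypothetical topic
    body={"compose_id": "Example-1"},  # hypothetical body
)

consumer = OpenQAScheduler({"openqa_hostname": "openqa.example.org"})
result = consumer(fake_message)
```

Returning the topic and body from __call__() is only there to make the sketch easy to check; a real consumer would act on the message instead.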
Here I ran into a significant wrinkle. If you’re consuming a native fedora-messaging message, the message.body will be the actual body of the message. However, if you’re consuming a message that was published as a fedmsg and has been republished by the fedmsg->fedora-messaging bridge, message.body won’t be what you’d probably expect. Looking at an example fedmsg, we’d probably expect the message.body of the converted fedora-messaging message to be just the msg dict, right? Just a dict with keys such as agent. However, at present, the bridge actually publishes the entire fedmsg as the message.body, so what you get as message.body is that whole dict. To get to the ‘true’ body, you have to take message.body['msg']. This is a problem because whenever the publisher is converted to fedora-messaging, there won’t be a message.body['msg'] any more, and your consumer will likely break. It seems that the bridge’s behavior here will likely be changed soon, but for now, this is a bit of a problem.
Once I figured this out, I wrote a little helper function called _find_true_body to fudge around this issue. You are welcome to steal it for your own use if you like. It should always find the ‘true’ body of any message your consumer receives, whether it’s native or converted, and it will work when the bridge is fixed in future too, so you won’t need to update your consumer when that happens (though later on down the road it’ll be safe to just get rid of the function and use message.body directly).
Those things, plus rejigging the logging a bit, were all I needed to do to convert my consumers – it wasn’t really that much work in the end.
To dig into logging a bit more: fedmsg consumer class instances had a log() method you could use to send log messages, so you didn’t have to set up your own logging infrastructure (although a problem of this system was that it gave no indication which consumer a log message came from). fedora-messaging does not have this. If you want a consumer to log, you have to set up the logging infrastructure within the consumer, and tweak the configuration file a bit.
The pattern I chose was to import logging and then init a logger instance for each consumer class in its __init__(), like this:
self.logger = logging.getLogger(self.__class__.__name__)
Then you can log messages with self.logger.info("message") or whatever. I thought that would be all I’d need, but actually, if you just do that, there’s nothing set up to actually receive the messages and log them anywhere. So you have to add a bit to the TOML config file that looks like this:
[log_config.loggers.OpenQAScheduler]
level = "INFO"
propagate = false
handlers = ["console"]
OpenQAScheduler there is the class name; change it to the actual name of the consumer class. That will have the messages logged to the console, which – when you run the consumer as a systemd service – means they wind up in the system journal, which was enough for me. You can also configure a handler to send email alerts, for instance, if you like – you can see an example of this in Bodhi’s config file.
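As far as I understand it, the log_config section is handed to Python's standard logging.config.dictConfig, so the logger stanza above can be tried out in plain Python. The sketch below is that rough equivalent: the console handler and the simple formatter are defined here explicitly as assumptions (in a real setup they would come from the rest of the log_config section), and the log message text is invented.

```python
import logging
import logging.config

# Rough Python equivalent of the TOML logger stanza. The handler and
# formatter definitions are assumptions made so the sketch is standalone.
LOG_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {"format": "[%(name)s %(levelname)s] %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "simple",
            "stream": "ext://sys.stdout",
        },
    },
    "loggers": {
        # Keyed by the consumer class name, matching getLogger() below.
        "OpenQAScheduler": {
            "level": "INFO",
            "propagate": False,
            "handlers": ["console"],
        },
    },
}

logging.config.dictConfig(LOG_CONFIG)


class OpenQAScheduler(object):
    def __init__(self):
        # One logger per consumer class, named after the class.
        self.logger = logging.getLogger(self.__class__.__name__)


consumer = OpenQAScheduler()
consumer.logger.info("scheduler consumer started")
```

Naming the logger after the class is what ties the two halves together: the name passed to getLogger() has to match the key under loggers, otherwise the messages have no handler to go to.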
One other wrinkle I ran into was with authenticating to the staging broker. The sample configuration file has the right URL and [tls] section for this, but the files referenced in the [tls] section aren’t actually in the fedora-messaging package. To successfully connect to the staging broker, as fedora.stg, you need to grab the necessary files from the fedora-messaging git repo and place them at the paths the [tls] section points to.
To see the whole of the changes I had to make to the openQA consumers, you can look at the commits on the fedora-messaging branch of the repo and also this set of commits to the Fedora infra ansible repo.