Fedora openQA now public

Well, I don't know about everyone else, but this is a beautiful sight to me!

As I've written about before, we've been using openQA for some time in Fedora testing - many thanks to the great folks over at openSUSE who work hard on it. It started as a sort of ad-hoc, skunkworks project, so it initially ran on a single box we happened to have lying around in one of Red Hat's offices, running openSUSE rather than Fedora. It's turned out to be quite valuable, and we want to keep running it at least for the medium term (and potentially longer, if we can find a sensible way to integrate it into Taskotron in the long term). Obviously it would be against Fedora's spirit to leave it running behind Red Hat's firewall (where non-RH folks couldn't see it) and on an openSUSE system.

So we've been working quite hard to get openQA running and cleanly deployable on Fedora, and to set up a new, more official deployment. There are now actually two Fedora openQA deployments: production and staging. Each is running on Fedora Infrastructure-managed boxes (running Fedora 23) in the Fedora data centre, and deployment is completely handled by Ansible - you can see the playbooks in the infrastructure ansible git repo. The staging deployment has a VM acting as the server and one bare metal system hosting the 'workers' (which run the actual tests in VMs); the production deployment has a VM server and two bare metal worker host systems. This gives us rather more test capacity than the old deployment had, so the tests run faster and we can add more without them taking too long.

Almost all the packages are in the Fedora repositories, but a few are still outstanding: perl-Mojolicious-Plugin-Bootstrap3 and os-autoinst are in the review process, and the openqa package itself can be reviewed after those two are approved. For now, those three packages are available in a COPR for anyone wanting to set up their own openQA (only Fedora 23 and Rawhide have all the necessary packages at present).

The 'compose check report' emails that get sent daily to the test and devel mailing lists are now generated by the new production deployment, and from tomorrow (2015-12-06) will include direct links to all failed tests for that day, so now non-RH folks can actually see what went wrong with the tests!

Comments

Bernhard M. Wiedemann wrote on 2015-12-08 05:05:
It is a beautiful sight to me, too. Cannot properly describe how good it feels to see this working and being useful.

adamw wrote on 2015-12-08 23:27:

Hi Bernhard! Thanks a lot for kicking off this whole thing and doing all the early work on it :) Glad you saw this blog post!

mbooth wrote on 2015-12-08 13:02:
I'm not ever so familiar with the magic that happens in Fedora QA, what is the relationship between OpenQA and Taskotron?

adamw wrote on 2015-12-08 23:23:

There isn't really a relationship between them at present.

Taskotron is Fedora's ongoing project to build a really kickass test automation framework: that is, it's the framework that's the focus of Taskotron, not the tests themselves. There's lots of work in Taskotron on coming up with really good answers to questions like "how can we fire off a bunch of tests whenever a repo compose happens", that kinda thing. There's a lot of work on results storage. To Taskotron, the actual tests are more or less just arbitrary 'things' that get plugged in; all Taskotron needs is some kind of entry point and some kind of output (handwave handwave).

Apologies to any openQA devs if I muff anything here, but openQA at least feels like it's pretty much the other way around. openQA feels like the answer to the problem "jesus, it's boring as hell firing up a virtual machine and clicking through the installer thirty times a goddamn day, why can't we just build a robot to do it instead?" The main focus of openQA is really the bit called os-autoinst. os-autoinst is what you get if you just say "let's build a robot to run installer tests": what it's basically capable of doing is running a virtual machine, looking at what's on the screen, and then injecting keyboard and mouse events into it. openQA tests are basically big sequences of 'look for a match for this area of this screenshot, then click on it, then type "foobar"', etc. etc. It's almost exactly like teaching an extremely literal-minded intern to test something - "click on INSTALLATION DESTINATION, look for a box that says 'encrypt my data', click it..."
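
To make the 'look for a match, click, type' idea a bit more concrete, here's a toy sketch in Python. It is not how os-autoinst is actually written (real tests match image regions of VM screenshots, with some fuzziness); here the 'screen' is just a few lines of text and the match is exact. Every name in it (SCREEN, find_needle, run_steps) is made up for illustration.

```python
# Toy stand-in for a VM screenshot: a list of text rows (an assumption;
# the real thing works on images, not text).
SCREEN = [
    "  INSTALLATION DESTINATION  ",
    "  [ ] encrypt my data       ",
]


def find_needle(screen, needle):
    """Return (row, col) of the first exact match of needle, or None."""
    rows, cols = len(needle), len(needle[0])
    for r in range(len(screen) - rows + 1):
        for c in range(len(screen[r]) - cols + 1):
            if all(screen[r + i][c:c + cols] == needle[i] for i in range(rows)):
                return r, c
    return None


def run_steps(screen, steps):
    """Turn a test script into the input events that would be injected."""
    events = []
    for action, arg in steps:
        if action == "click":
            pos = find_needle(screen, arg)
            if pos is None:
                # The 'intern' stops as soon as the expected thing is missing.
                raise LookupError(f"no match on screen for {arg!r}")
            events.append(("click", pos))
        elif action == "type":
            events.append(("type", arg))
        else:
            raise ValueError(f"unknown action {action!r}")
    return events


events = run_steps(SCREEN, [
    ("click", ["INSTALLATION DESTINATION"]),
    ("click", ["[ ]"]),
    ("type", "foobar"),
])
print(events)
```

The point is just that a test is a flat list of instructions, and it fails the moment something expected isn't on the screen.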

openQA per se is basically a webapp which provides some kind of interface for scheduling os-autoinst jobs and accessing the results (it also has nifty capabilities for editing the tests). You poke the openQA webapp with a POST request when you want to run tests, telling it where to find the ISO you want tested and some settings that are passed as query parameters. It then consults a store of configuration data you set up, and from that generates a bunch of test jobs for the ISO, each with different settings that will cause os-autoinst to do different things. Then it farms the jobs out to the 'worker' instances that are registered with the openQA server. Worker instances are just very simple processes that run a continual loop of asking the server for a job; once the server gives them a job (which is basically just a bunch of settings), they run os-autoinst with all the settings they were given, send the results back to the server, then ask for another job.
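
Here's a minimal Python sketch of that flow, all in one process with no real HTTP involved. It is a toy model under loud assumptions, not the openQA API: ToyServer, TEST_SUITES, run_os_autoinst, the setting names and the ISO filename are all hypothetical, and a real worker loops forever rather than giving up after an idle poll.

```python
from collections import deque

# Hypothetical stand-in for the server's configuration store: one entry
# per test suite, with the extra settings that suite adds to a job.
TEST_SUITES = {
    "default_install": {},
    "encrypted_install": {"ENCRYPT": "1"},
    "uefi_install": {"UEFI": "1"},
}


class ToyServer:
    """Toy model of the scheduling side; not the real openQA API."""

    def __init__(self):
        self.pending = deque()
        self.results = {}
        self._next_id = 1

    def post_iso(self, iso, **settings):
        """Expand one ISO into one job per configured test suite."""
        job_ids = []
        for suite, extra in TEST_SUITES.items():
            job = {"id": self._next_id, "ISO": iso, "TEST": suite,
                   **settings, **extra}
            self.pending.append(job)
            job_ids.append(job["id"])
            self._next_id += 1
        return job_ids

    def grab_job(self):
        """Hand out the oldest pending job, or None if there is none."""
        return self.pending.popleft() if self.pending else None

    def report(self, job_id, result):
        self.results[job_id] = result


def run_os_autoinst(job):
    """Placeholder for launching os-autoinst with the job's settings."""
    return "passed"


def worker_loop(server, max_idle_polls=1):
    """Ask for a job, run it, send the result back, then ask again."""
    idle = done = 0
    while idle < max_idle_polls:  # stop condition exists only for the demo
        job = server.grab_job()
        if job is None:
            idle += 1
            continue
        server.report(job["id"], run_os_autoinst(job))
        done += 1
    return done


server = ToyServer()
ids = server.post_iso("example.iso", DISTRI="fedora", ARCH="x86_64")
worker_loop(server)
print(server.results)
```

Note that a job really is 'just a bunch of settings': the worker doesn't need to know anything about test suites, it only passes the dictionary along.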

The 'framework' bits of openQA probably wouldn't win any awards - they do the job fine but they're not the focus of the project. If you wanted to do some form of automated testing that wasn't the os-autoinst kind, you probably wouldn't say "hey, openQA's an awesome test automation framework, let's plug in a whole new test running backend to it!" openQA's capabilities are pretty much just those that are needed to automate the kinds of testing openSUSE uses os-autoinst for, and not much else. This isn't an insult, or anything - it's just a description of the kind of project openQA is. You actually have to roll quite a bit of your own scheduling infrastructure to use openQA, because openQA has no capabilities along the lines of "know where and when testable artefacts are going to appear, and what tests to run on what artefacts" - you have to provide all that yourself. We built a whole thing called fedora_openqa_schedule to do that for Fedora - it's at https://bitbucket.org/rajcze/openqa_fedora_tools/src (and it uses my 'fedfind' project, which is a chunk of code all on its own).

We're using openQA quite simply because it does something really valuable for us, right now, and it was pretty easy to get up and running; as I said in the post it started out as basically a spare-time project for a couple of the guys in Brno, and they were able to get it up and doing useful stuff quite quickly. Just like the openSUSE folks when they started the openQA project, we do a lot of manual validation testing in VMs, and it's really dull, and openQA certainly does the job of replacing that (and doing the tests more often than we really could by hand, so we're much more likely to find out when Rawhide breaks overnight now).

We have not looked at the details at all yet, but our very vague plan is that somewhere down the road, we'll basically have Taskotron scheduling openQA jobs. We could perhaps completely rip out the openQA bits of openQA and just have Taskotron directly schedule os-autoinst jobs, but I suspect that would turn out to be a lot of work; it would likely be more feasible to still have an openQA instance running, but use Taskotron's capabilities to schedule the openQA jobs - i.e. replace the fedora_openqa_schedule thing we built - and get the results into Taskotron's resultsdb as well. But as I said, this is completely castles-in-the-air stuff ATM, we haven't actually started working on it in any way yet.

mbooth wrote on 2015-12-09 19:52:
Aha, thanks for the detailed explanation :-)