Flock 2014, and other happenings

I had a really great time at this year's Flock in Prague. As always it was a packed event with a really positive and productive vibe - clusters of people all over the place hacking on things. I posted a lot about it on YAY Diaspora and BOO Google+, but here's a long-form post-event wrap-up.

I got into town in the afternoon of day 0, and did the thing I like to do, which is to find a pool and go for a swim - it's an interesting way to see places, and gives you a bit of a feel for the local infrastructure - it's like Van Halen's brown M&Ms, a sort of handy little benchmark for how the whole system works. Last time I was in Prague I wound up in the weird quiet little pool in the downtown YMCA, but this time I found a public pool a 20-minute bus ride north of the Flock hotel. It was interestingly similar to a typical British public pool - one modestly sized 25m pool, clean, functional, pretty spartan, very 1970s public institution. Though they use what I think of as the Hong Kong system for lap swimming, where there aren't many marked lanes and instead everyone swims in a big mass, trying to do on-the-fly collision avoidance. It gets interesting when you get a clash of algorithms...

The Red Hat QA folks went out for dinner on the first day at one of Kamil's signature 'it's just a five minute walk, guys' restaurants, which turned out to have a very nice location and outdoor seating when we arrived after a brutal five day trek (I exaggerate only slightly) but also to have the typical Middle European slightly fuzzy notion of 'vegetarianism' - it's all very well, but surely a bit of chicken can't hurt, right?

Note before jumping into the event proper - you can find the recorded versions of all the presentations (though not hackfests) on Youtube here, many thanks to the folks who set up all that recording.

I saw the day one keynote on F/OSS in the EU (more or less) which had some interesting notes on fairly significant deployments besides the ones that have really made the tech press, then watched the QA overview talk which was kindly and expertly presented by Amita Sharma, who's been contributing some energy to QA lately. I saw Matthias Clasen give an update on the status of Wayland. I've actually been checking up on that for myself lately - there have been some pretty significant fixes landing in Fedora 21 / Rawhide in the last month or two, and it's now at the point where I can run this laptop on Wayland and just about everything I use works fine except for VAAPI video playback acceleration and suspend (or at least, suspend on lid close - I need to look into that one some more). Matthias gave a quick no-nonsense 'here's where we are and here's where we're going' infodump talk, which was probably the best line to take for the audience.

After lunch I was sitting in Christian Schaller's talk on Workstation, but I was actually mostly working on stuff by that point - I think I was still busy cleaning up a bunch of stale wiki bits I'd noticed from screenshots in Amita's talk, ancient test plans and the like. You can see the stuff I cleaned up on August 6 in my contribution log (at least until it gets scrolled out). After that I caught Luke Macken's "Evolving the Fedora updates process" talk - nice job of hiding the now-traditional "Bodhi 2.0" talk, Luke ;) Seriously, Luke mostly outlined the current work on Bodhi 2.0, which looks like it's really truly actually happening real soon now, and is full of awesome. Really can't wait for that to arrive. Some of the hook-ups with Taskotron and other bits that are planned look like they'll be really great.

Next I sat in the usability methods talk, but again I was mostly working on something else - that may have been around the time I started work on getting a test MozTrap instance running, on which more later. Afterwards I believe I was in Dennis Gilmore's release engineering talk, and I think I was splitting my attention again. Finally I sat in on the Badges talk, which mostly served to remind me what an awesome job Marie Nordin has been doing on Badges this year, both as part of the Outreach Program for Women run by GNOME and later. It was great to meet Marie and nice to have the chance to thank her for all the work - Badges is still one of my favourite Fedora things, though it's lost out to Fedlet and ownCloud in the passion project stakes lately, and without her we probably wouldn't have anywhere near as many badges out there!

The evening's festivities involved a pub which gamified the consumption of alcohol - each table has some beer pumps and tracks the amount consumed by each seat and the table as a whole. The table consumptions are plotted up in nice big bar charts and projected on the wall. This concept I think rather terrified those of us who've been going to Flocks/FUDCons for a while (and particularly the folks who've been around long enough to have been at the still-legendary early Boston ones, about which...that was probably enough said already), but obviously the crowd has gotten mellower and/or saner over the years, as I don't think anyone was carted off to the hospital or attracted the attentions of local law enforcement. Fedora QA formed a pretty strong table, but we were narrowly beaten out by a table with a strong contingent of Anaconda developers and release engineers. I don't think this result surprised anyone. It was also where I snapped the now-legendary F/OSS axis of evil photo - I think the Workstation table (with special guest star Lennart) was too busy plotting world domination to challenge for the drinking title... Before the pub, Tim Flink and I decided to eat separately to forestall any more issues over the precise definition of 'vegetarian'. The pasta place I'd seen turned out to be closed, and so was everywhere else on that street (presumably it's for the lunch crowd), but somehow we turned a corner and stumbled across an Asian vegetarian restaurant, which has to be some pretty long odds. Had some very nice fried rice and eggplant and other bits and pieces there, at highly reasonable prices.

On the second morning I saw the neatest 'surprise' talk of the week - the Review Server talk, proposing a new way to do package review, which sounds about fifteen thousand times better than the old way and which I hadn't heard of yet. It may have been floated on devel@, I'm still like 1500 mails behind on that list right now, but it was a new and pleasant surprise to me anyway, and I really hope it gets done. I then gave my UEFI talk (in a sleeveless t-shirt, striking just the right professional attitude - in my defense, it was frickin' hot, and I was running out of clean laundry). It went fine and I got some good feedback, but I forgot to update the talk after I last gave it at LFNW and so I made the same mistakes over again, not fully explaining fallback path and the EFI system partition and not talking about Secure Boot at the right time. I've now updated the slide deck, so if I give it again, it should run better. I watched the keynote from the awesome Sean Cross about the Novena open laptop he's working on. Open hardware is really, really hard, but it looks like they're going to be one of the more successful projects out there. It was really nice to see just a few hours later that John Dulaney, one of QA's awesome volunteers (who also contributes to many other areas of Fedora) had helped Sean get Fedora up and running on the demo hardware, a challenge Sean had issued during the talk!

Then I dropped in on the unit testing talk for a while, but it was heavily coder-focused so I ducked out and worked on something else for a bit. Following that, we put together an impromptu QA hackfest. Amita had suggested doing a triage hackfest during her talk, and we liked the idea, so we went ahead and did it. Unfortunately due to the usual conference communication fubars, Amita didn't make it, but most of the other QA folks and a few people who couldn't escape before we nailed them to their chairs (including the awesome Jon Stanley, making a triumphant triage comeback) got together with Richard Hughes and worked to do some triage of the GNOME Software bug list. It's an illustration of the surprisingly time-consuming nature of triage that a roomful of QA folks and the package maintainer couldn't blow through the whole bug list in the 90 minutes we gave it, but there ya go - triage is hard. We did make a dent! For the last half hour or so of the hackfest, I gave a very quick overview of my MozTrap demo install. People were sufficiently interested that we planned to have a more detailed look at it later in the conference.

The revelry that evening was aboard a boat, again an entirely sensible and safe idea with no possible drawbacks whatsoever, but unaccountably no-one wound up being pulled from the Vltava and a good time was had by all (well, at least, no-one complained to me). Prague is an amazing city, in case you didn't know, and from a riverboat is a great way to see it - probably worth a couple of hours if you're ever there.

On Day 3 I unfortunately missed Arun Sag's Docker talk - the title was a bit generic so I sort of figured it was the 'hey, look! Docker!' talk, the one you can find six people capable of giving by just hurling a brick at any F/OSS conference these days, but apparently it was much cooler than that. I still need to check the video. I think I then stopped in on the systemd daemon integration talk, but it seemed a bit more basic than I was expecting and mostly going over stuff I knew, so I worked on MozTrap stuff for a bit (still more coming on that, be patient!). I dropped by the Fedora.next joint session for a bit - it seemed to be mostly a retrospective on how the Fedora.next changes had been handled so far and where we could improve, which was a pretty good idea, and I think we moved things forward a bit. I got some good feedback on how we can maybe get more input from other groups into QA planning, which should help out with doing the validation testing process updates for Beta and Final.

After lunch I caught Arun's second talk, on rapid deployment to bare metal systems, which was really great. It's an interesting area - there are lots of things out there to help you do rapid deployments to VMs, like virt-install and so on, but not so much for bare metal. The approach Arun's been taking at Yahoo is a neat one, extending anaconda's existing support for deploying a disk image 'payload' (which is used for live installations) to support installing a tarball containing such an image, and he did a nice live demo. There were a bunch of anaconda devs and other heavy hitters in the room, and I think we came up with a good half dozen suggestions for improvements to anaconda that would help out this workflow (and incidentally make things nicer for other users too). Should be some really good work there. I also abused the talk to nail one of the anaconda devs to his chair - Vratislav Podzimek was my unfortunate victim - and trace out an install crasher we were hitting during Flock, which eventually turned out to probably be effectively a cunningly-disguised dupe of the bad live image compose bug we knew about already. We were able to come up with a couple of improvements to anaconda's error checking though, which should make it much more apparent when something like this is happening in future.

I managed an hour in the Documentation hackfest, where Jared Smith very kindly gave me the crash course on Docbook, and I got about halfway through turning my post on NetworkManager bridging into a proper guide - I hope to find time to finish that up. After that we ran a second impromptu QA hackfest, this time spending the whole time on MozTrap. Well, first we went for dinner because we were really hungry, THEN we did the hackfest.

We got most of the QA folks together, put up a demo instance on a laptop so we were all on the same subnet, and then just played with it for a bit, setting up a few dummy products and test suites and environments and playing around with the reporting and querying interfaces.

MozTrap is a TCMS - basically it keeps track of test cases and their results. Fedora essentially uses Mediawiki as a TCMS; that's what we're doing when we write test cases as wiki pages based on a template. Things like the release validation matrices are what a TCMS might refer to as a 'test suite' or 'test plan' or 'test run' or similar, and also a (very bad) result storage system. We 'query' our 'TCMS' by, well, looking at matrix pages, using wiki categories, and writing cute nonsense hacks. Up to Fedora 20 we were able more or less to keep up with what we needed to do, but the seams have really started showing with Fedora 21; the amount of inventive judo I have to do with wiki table layouts to encode the concept of 'run this test case on these images in these configurations' is getting unmanageable. Reporting results has always been kind of a sucky experience in the current system - basically you have to edit a page, insert a bit of text like {{result|fail|adamwill|123456}} which is kind of gibberish, hope you got it right and in the right place, then save the page and hope you didn't just blow up the wiki, which is hardly a welcoming experience. 'Querying' results has always been a bit of a pain point - you can usually find the absolutely essential stuff like "what bugs did we find at this Test Day?" or "what release validation tests failed on the last run?" if you know what you're doing, but it's always a bit disjointed, and you just can't really do stuff like "what results have we had for this test case, regardless of the context, for the last three months?" Finally, we've started running into things we'd like to do that we just can't figure out an entirely satisfactory way to do with the current system. For me, at least, the 'let's get a real TCMS at last' idea seems to be becoming rather more compelling of late.
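
Just to give the flavour of it, here's roughly what a fragment of one of those wiki matrices looks like under the hood - the {{result}} template syntax is real, but the test case, columns and results here are made-up examples:

{| class="wikitable"
! Test case !! x86_64 DVD !! i386 DVD
|-
| [[QA:Testcase_Boot_default_install|Boot default install]]
| {{result|pass|adamwill}}
| {{result|fail|adamwill|123456}}
|}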

We've evaluated real TCMSes in the past; the obvious one has long been Nitrate, the F/OSS release of RH's in-house TCMS. But for various reasons, we've always found Nitrate just doesn't line up very well for Fedora; it makes quite a few assumptions that just aren't quite in line with how we do testing. We found a similar story with the other TCMSes we've looked at in the past.

MozTrap is a fairly new TCMS written by Mozilla, and used in production by them since 2012 - they use it for pretty much all their testing, I believe. It's fully open-source of course, and seems to have a decent bunch of developers behind it who are open to outside suggestions and contributions. It's a Django webapp, which is something we're reasonably familiar with and isn't, you know, Java or raving insanity (wait, aren't those the same thing?). And most importantly, its basic design lines up rather more closely with how we do Fedora testing than any other TCMS I've come across. You write 'test cases' and organize them into 'test suites', then associate them with 'builds' of 'versions' of 'products' and create 'test runs' (which are more or less 'this is a single instance of this test suite executed against this build of this version of this product'). People actually running tests see the 'test runs', pick one, and get a nice interface where they see all the test cases and can easily provide their results, without running the risk of blowing up the system somehow. And there's one other concept which I really like - 'environments' act as a sort of multiplier on the 'test runs'. An environment can be, well, anything, but basically it's that confounding factor of 'well, running this test against ARM is a different result from running it against x86_64, and we need to do both and track them' - in that context the architecture is the 'environment'; it can also easily be the product 'flavour' (e.g. Server or Workstation for Fedora.next), or even a version (say the product you're testing is ownCloud; the distribution version may function as an 'environment', depending on exactly how you want to organize your testing). Environment elements are organized into 'environment groups', so you can have multiple 'environment' factors apply to a given test run, and you can track and query all the possible combinations of all the environment elements in the results. The whole setup is very flexible and configurable - you can have all kinds of different products and versions and test suites and environments, and hence track very different forms of testing all in one TCMS.

So for the hackfest, as I said, we just played around with a test deployment for a while, looking for pain points. We wrote down our notes as we went along in a Piratepad. I think in general we all liked the basic design and layout of the system, but we did identify various things that worried us or might need work. The design lends itself to massive multiplication of test runs and test cases, and the query interface gets a bit 'idiosyncratic' when that happens; if you look at Mozilla's production deployment, they have literally thousands of test runs listed in a five-item-high scrollbox in their query interface. I think there are ways to use the 'smart' query search box to make querying manageable with a huge corpus of test cases and test runs and results, but it's not as nice and obvious an experience as it is when you just have a small, manageable amount of test data. We also noted that for production we'd need to jump the usual packaging hurdles - I'm currently looking at exactly how much of a job this would be, how bad the dependencies are (it's always the dependencies, for webapps) - integrate the authentication with FAS somehow, manage the import and fix up of our existing test cases, and a bunch of other little things like checking that the API is sufficient to our Grand Plans for integration with the blockerbugs webapp and so on.

We're currently kicking the idea around a bit more on the mailing list, and the next step will probably be a slightly more permanent staging deployment in Fedora OpenStack or similar - still not suitable for production use, but persistent enough that we could do a bit more of a convincing build-out of test cases within it, maybe duplicate some results from the mediawiki tables, and see how it goes in that kind of use. It'd also be useful for us to do any development we want to submit to upstream against.

There wasn't an official party on that evening, but I think that was the one where the QA team agreed to meet up at a Czech pub down the street from the hotel only to find that the anaconda team was already there, as was Kanarip, and we wound up forming one big happy party, taking over half the tables in the pub in dominoes fashion, and probably driving the rest of the patrons crazy. But it was a fun night.

On the final day I attended Michal Toman's ABRT talk, which was another very useful session - it was another 'past/present/future' infodump, but that was just what I needed, particularly the bits about their immediate future plans - they have some nice stuff in the pipeline. I was able to pitch the importance of improving the UIs for the bits of the system people get to interact with, as well - there's such a mess of workflows and use cases in the gnome-abrt tool and the web UI that it's hard or impossible to really 'fix' it by filing one bug at a time; it really needs the more fundamental work of a proper UX designer sitting down with some users and figuring out the various use cases and workflows, and coming up with a fundamentally sane design.

I then went to Peter Robinson's talk on Fedora's ARM support status, which was one of the most uplifting of Flock because the news was basically all good - our support as of now is pretty damn good and about to get way more so. Fedora 21 should support a huge range of ARM hardware, and especially the new aarch64 platforms that'll really be coming on stream around the time it comes out. Fedora's still primarily focused on dev boards and the new crop of server hardware, so we're still not really talking Fedora-on-cellphones here, but it's awesome news for people playing with Beagleboards and Cubietrucks and Utilites and so forth and so on, and particularly for people interested in the brand-new 64-bit ARM server platforms. Definitely a highly recommended talk for video catch-up (and I don't know how long the cameras ran, but there was some fun stuff at the end where several people who were clearly under NDAs were trying to figure out how much they could say about exciting new shiny hardware without getting in trouble). After that I hit up Cole Robinson's talk on virtualization, which was also very useful because it made me realize I've been a giant idiot for years by not using the offline snapshotting feature of Fedora's virt stack, which is super easy to use and built right into virt-manager since (IIRC) Fedora 20. I think when I looked at 'snapshotting' I wound up looking at online snapshots, which are apparently a completely different and much more finicky creature, and didn't realize this whole easy-as-falling-off-a-log alternative mechanism existed which is just ideal for QA purposes. So that'll help me out a whole heck of a lot in future. The talk also has lots of other interesting tips for improving your use of virtualization, so check it out!

In the afternoon I attended the governance hackfest, which produced about as much controversy and hot air as you'd expect, but also produced at least the skeleton of a possibly-viable plan for revising the makeup of the Fedora Board, which could be interesting. I was the principal note-taker for the session, and you can still find the Piratepad notes at least for now (I don't know how long PP keeps them). I've emailed them off to Haïkel, who ran the session, and I believe the idea now is that the plan gets fleshed out a bit and sent off to the lists for more comprehensive public review.

After that hackfest I sat in on the spins session, which I think was pretty productive, and developed a general consensus to try and focus in harder on higher quality spins as a part of Fedora 21 and later, and move on from the very open and inclusive vision that Spins initially had but never quite achieved - it's mostly become more of a delivery mechanism particularly for alternative desktops, and similar things.

Again there was nothing official for the evening, so I walked around town for a bit with Amita (who had never visited Prague before, so hadn't seen the sights) and then spent the evening doing justice to a bottle of Glenmorangie with Peter, which was thoroughly enjoyable!

Once again a really fun and productive Flock, and as always I have about two dozen possible avenues for doing interesting stuff which I'll never get time to explore fully (the handwritten notes on possible next things to do on Badges that I wrote after the last Flock are still lying all over my desk, where I'm going to get around to them real soon now, honest). But that's a good thing, not a bad one - I'll start getting really worried when I don't see anywhere interesting to go! Since Flock I've been taking a few vacation days, travelling across Europe the slow way and spending a night in London before pitching up back here at my family's home for a few days before I return home and get back to work. I'm hoping by then it'll finally be full speed ahead on Fedora 21 Alpha testing! Thanks to all the folks I saw at Flock, both old and new, for making it such a fun, productive and informative event as always - I love this community.

Proposing Flock QA workshops for bug triage and TCMS

This post is meant mostly for Fedora QA folks attending Flock. We already have a Taskotron hackfest scheduled for Saturday, but nothing for Friday. Amita had the idea of doing a triage hackfest, and I'd like to have one on test case management (particularly looking at MozTrap). So if those involved in QA think it's a good idea, I'm proposing we find a room on Friday afternoon and run those two hackfests, in the timeslot from 3pm to the end of the day (or even earlier if no-one has a talk they want to see in the 2pm slot).

I'll try and let people know in person today, but wanted to post it up on the intarwebs for people to find as well.

Flock 2014

I'm here at Flock 2014 in Prague, currently watching Christian Schaller's talk on the future of Fedora Workstation. It's been a fun conference so far, and you can watch livestreams of (I think) all the talks here. I've been trying to use Diaspora more lately, and I'm updating as I go on my account. Join me there for regular updates! Or you can follow me on boring old evil Google+, but don't do that, do Diaspora. (I'm posting Extra Bonus Diaspora-Only Content too!)

OpenWRT on Zyxel NBG6716

I finished up the last bit of my infrastructure revision (for now) today. Switched out the Linksys WRT-310N (router, with dd-wrt) / Netgear WNDR3700v2 (running solely as a wireless AP) combo I've been using for a while for a Zyxel NBG6716 running OpenWRT.

I'm really impressed with OpenWRT, so far. I'm running a nightly snapshot of the bleeding-edge Chaos Calmer release, though it seems like CC hasn't diverged much from the soon-to-be-released Barrier Breaker yet. I found a bug preventing build of the image for the NBG6716 which was quickly fixed, but there don't seem to be nightlies of BB and I couldn't be bothered waiting for the next RC, so CC it is. I used the Image Generator tool to produce an image, flashed it with mtd as described on the wiki page, and it came right up ready for first login. I was configuring it offline and realized it'd be a pain to get the web interface Luci installed that way, so I ran Image Generator again to build an image with the luci package included, and installed the sysupgrade version of that image, which again went in flawlessly.
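
For reference, the Image Generator side of that is basically a one-liner run from the unpacked ImageBuilder directory - the profile name here is from memory, so check the wiki for your device before trusting it:

make image PROFILE=NBG6716 PACKAGES="luci"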

I was able to reproduce my setup through the web interface more or less completely, and it's a really nice web interface - somehow my impression of OpenWRT had been that it'd be much more creaky, but it's really smooth, better than any commercial firmware or dd-wrt I've seen. There were only a couple of things I had to do behind the web interface's back. One was to set up SRV and TXT DNS records. SRV records only required me to go one level of abstraction down, to OpenWRT's UCI interface, using the /etc/config/dhcp file. I had to add the TXT record by directly editing /etc/dnsmasq.conf, but that was the only thing that needed me to go that far.
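
In case it saves anyone else some digging, an SRV record in /etc/config/dhcp looks something like this (the service and hostnames are just examples, not my real setup):

config srvhost
    option srv '_xmpp-client._tcp.example.com'
    option target 'jabber.example.com'
    option port '5222'
    option class '0'
    option weight '5'

and the TXT record is a single plain dnsmasq directive in /etc/dnsmasq.conf, again with made-up contents:

txt-record=example.com,"v=spf1 mx -all"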

I also had to go down to the UCI level to fully configure the wireless adapters, but only because I wanted 802.11ac capability; this has only very recently been submitted for Luci. So I just had to set /etc/config/wireless appropriately; the hwmode parameter is just 11na, same as for 802.11a/n support, but you have to specify an htmode option to get full 802.11ac support. No big deal.
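
The relevant chunk of my /etc/config/wireless ends up looking roughly like this - the channel is just an example, and I've left out the rest of the radio section (path, country and so on):

config wifi-device 'radio0'
    option type 'mac80211'
    option hwmode '11na'
    option channel '36'
    option htmode 'VHT80'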

CLI-level access and config is really, really nice as well; a far nicer environment than dd-wrt or consumer routers. Of course you can (and should) configure it to allow console access by public key SSH only. You get a sensible shell with most of the expected commands available by default (way more than in most stripped-down router environments), and a package manager you can use to install a whole ton more. nano and htop on my router are worth the price of admission alone, so far as I'm concerned. Most config files are in /etc/config, and there's so much sensible stuff, like the fact that to re-init the wireless adapters after you change the configuration, you just run wifi. It's very nice.

OpenWRT keeps pretty good track of your custom configuration (what you've altered from stock), and you can generate a backup of it from Luci (so if you ever lose it you just re-flash and then restore your backup). The sysupgrade tool for updating your OpenWRT install will preserve any files it knows you've modified - you can see the list with opkg list-changed-conffiles - and any files explicitly listed in /etc/sysupgrade.conf, so you can update your install pretty conveniently without losing your configuration.
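
The whole backup-and-upgrade dance looks roughly like this - the extra file and the image name are just placeholders for whatever applies to your setup:

# see which config files opkg knows you have changed
opkg list-changed-conffiles
# add anything else you want carried across upgrades
echo /etc/dnsmasq.conf >> /etc/sysupgrade.conf
# optionally generate a backup tarball of the custom config
sysupgrade -b /tmp/config-backup.tar.gz
# flash a new image, preserving the config
sysupgrade -v /tmp/new-openwrt-sysupgrade.bin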

And most importantly, of course, it works. So far, at least. After reproducing my current router config in OpenWRT I powered down the routers, switched the jacks to the new one, powered it up, and everything came right up again, existing connections remaining open even (I was pretty quick). I had to cycle every system's connection in the end anyway (because until they actually got a DHCP lease from the router, it wouldn't have them in its dnsmasq config so DNS wouldn't work), but it was still pretty smooth. Now all that's left to see is if it's stable; until recently my 310N was pretty much 100% stable, so the new one has a high bar to live up to.

I don't have any actual 802.11ac hardware yet so I can't say for sure how that protocol works, but other than that I can certainly recommend giving OpenWRT a try, and the NBG6716 seems good so far. I found three viable alternatives for 802.11ac routers with current OpenWRT support: the NBG6716, the Engenius ESR1750, and the TP-Link Archer C7 v2.0 (and possibly one other, but I forget which). The Engenius seems to have somewhat poor performance and the Archer seems to be flat sold out, so that's why I went with the Zyxel. I'll try and remember to post an update if everything goes pear-shaped...

Overhaulin'

I spent the last week and a half on a bit of a tear working on various things, especially ownCloud, but I imagine my pace appeared to drop over the last couple of days. For me, though, it was as busy or more so - I just redirected energy temporarily into a different project: overhauling the network infrastructure here at HappyAssassin Towers.

For a few years now I've been keeping most of the "big iron" (hah) in a little Ikea "PS" cabinet in the corner of the bedroom. This has several drawbacks I mentioned in this post (to which this one constitutes a 4.5-years-later sequel!), to which has now been added the fact that a small metal cabinet containing three computers, a cable box, and a bunch of networking hardware is not an ideal thermal situation. Especially in August. I think it was raising the ambient temperature in the bedroom by about three degrees; you could possibly have fried an egg on it last week, when I put the new PVR PC into it.

So this week I finally snapped and decided I'd had enough of melting heat and dragging the damn thing out of the corner every time I needed to tweak anything, and decided to ship the whole damn kit and caboodle over to the corner of the living room next to my desk, and replace the metal box with something else.

At first I was going to use this other Ikea thing, but I wasn't feeling great about it, partly because it's not that awesome a thing, partly because Ikea is kind of a trek, but mostly because the (rare) unionized staff at my local Ikea are on strike and I'm really not keen on crossing picket lines. So I was pretty happy when I nipped into the big random-junk-liquidation store next to the PC parts store and found a TV stand from Ameriwood (it's not any of the ones listed there, but pretty similar) for fifty bucks. Made in America flat pack furniture, who knew, right?

So without further ado, I present the new HappyAssassin Towers data centre:

Photo of the new setup

I really didn't plan for everything to be shiny black with blue blinkenlights, it just sort of turned out that way. Bonus! Attentive viewers may note the one piece of the TV stand I inevitably screwed in the wrong way up.

On top is a cheapo TV/monitor I also got from the liquidation store, to be plugged in whenever I need to access one of the systems directly (I don't do this often enough to really justify a KVM and all the cables it entails), and my printer. (Also my tablets - you can see the Fedlet in all its glory if you look close). In the top-right compartment, networking stuff - modem (at the back, it's some Cisco thing the cable company sent, operating in bridge mode), router (still for now the Linksys WRT-310N) and wireless AP (Netgear WNDR3700v2, under the router). Bottom-right compartment, the UPS (CyberPower 1500AVR) on the right, PVR setup on the left: the new PVR computer I built, a quick cheap mini-ITX job in an InWin case, and a Motorola DCX3200 cable box on top of it, hooked together with Firewire. Bottom left compartment, my Thecus N5550 NAS. Top left compartment, my server host machine (described in the same post as the NAS), and spare keyboard.

Oh, so yeah, after that other post I did go ahead and set up a dedicated MythTV box. Picked up the DCX3200 on Craigslist. Hit a couple of annoying issues setting it up, but it's mostly running pretty smoothly now. The playback on my HTPC had been slightly jerky with the test box, and it initially was with the dedicated box too, but I worked out it's because the ancient underpowered CPU in my HTPC box was struggling to keep up with deinterlacing the video; switching to a VDPAU-accelerated deinterlace method smoothed it right out. Now there's just an annoying problem where the backend seems to lose its ability to talk to the cable box after it's been inactive for a while, but I should be able to figure out / work around that one somehow or other.

I'm happy with the new setup - it's cooler, takes the heat out of the bedroom, and is waaaaay more accessible for me. I also took the opportunity to improve the network infrastructure. There are now 8-port gigabit switches mounted on the walls behind the data centre and the TV stand that actually has my TV (and HTPC and game consoles etc) on it, and a 5-port switch I had lying around in the bedroom. The switches and the AP run into the router, and all the actual devices hang off the switches. Much more orderly, and much more room for expansion. However, it did lead to the large chunk of manual labour that kept me busy for the better part of the last two days: all the freaking cabling.

I have a massively heightened respect for those poor buggers who just run ethernet cable all day, now. I had to run and tidy 100ft of ethernet and 100ft of coax to run this setup; the ethernet cable that brings the bedroom online runs behind a dresser, a couch, two ceiling-height Billy bookcases, past the dining table, behind the corner where the cable drop is (both coax cables do the whole run to that point too), behind the media centre (where the other long ethernet cable stops), around the bedroom doorframe, behind another dresser and finally my husband's PC before it hits the switch. All that for a straight-line distance of about 12 feet (I used to do this run around the doorframe and across the ceiling, but it looked kinda ugly so I took this chance to redo it around the walls). And I did the coax cables last night without remembering that the ethernet cables would have to do the same run, so I got to do that part twice. Of course, there are about a zillion other cables in the way that I had to get the new cables under, and both bookcases are fully loaded and properly attached to the wall with a bracket just like the instructions say. Whew.

The next step is to replace the router - which seems to be starting to go senile, every so often lately it stops passing packets to the internet and has to be reset - and wireless AP combination with a new Zyxel NBG6716 I just bought, running OpenWRT. I was going to start configuring the new router today, but discovered OpenWRT had a bug preventing the NBG6716 images from building, so I had to file that and wait for the fix first. Hopefully there'll be images or a working ImageBuilder with the next nightly build.

Here ends the latest news from HappyAssassin Towers, still obstinately doing for itself what it could be paying Google to do better. Just like its soulmate over at Scrye Gardens...

Downtime: impending

Looks like I'll be doing the big router / modem move in about five minutes. Be down for a few minutes if all goes well, or for rather longer if it all explodes (as no doubt it will)...

Bridged networking for libvirt with NetworkManager: 2014 / Fedora 21

I'm not quite sure how, but I got sucked into spending the whole of today poking at various aspects of handling bridged networking with NetworkManager.

One of the most common uses of bridged networking is for virtualization: you set up a bridge for the host's connection to your router and configure virtual machines to use that bridge, which allows them to connect to the router just as if they were real physical machines that were plugged into it: they'll grab their configuration from your router's DHCP server, and virtual and metal systems can all talk to each other.

This is rather more convenient than libvirt's default setup where the VM host more or less acts like a NAT router for all the virtual machines running on it. This works out of the box, but has limitations. The VMs can connect out to the internet and to other systems on the same network as their host, but those systems and systems outside the local network can't connect in to the guests without some messy manual intervention. It's sort of the same situation you have with relation to the public internet when you're sitting behind your NAT router - you have to fiddle with stuff in the router settings in order to allow outside systems to connect in to servers running on your machine.

Since more or less time immemorial, one of the first things you see in any set of instructions you happen to find for configuring bridging for libvirt is "disable NetworkManager, because it doesn't work with bridges".

Every few months I ask the NM devs what the status of this is, and get a sort of handwavy reply, and move on to something else.

But no more! Today I decided to actually poke about at it and see how it works.

Executive summary: yes, you actually can configure bridging for libvirt purposes using NetworkManager, and have it work properly. It's not even that difficult. But there are some really evil gotchas.

Setting up a bridge with NetworkManager

If you have a clean Fedora 20+ (I think - I tested with 21 and 22, but from reports I've read I think this all applies to F20 as well) system, and you just want to make it work, here is what you should do.

  1. Turn off or delete your existing wired network connection. In GNOME, run the Control Center 'Network' panel and slide the slider to Off. You can do it with nmcli or ifdown or KDE or whatever as well. If you only turn it off, it's best to also set it not to start at boot: in GNOME, open its properties (with the weird cog icon), click Identity, and uncheck Connect automatically.
  2. Create a new bridge connection profile. In GNOME 'Network', click the +, then click Bridge. With nm-connection-editor, click Add, set the dropdown to Bridge, and click Create...
  3. If you use DHCP, leave everything in the Editing Bridge connection X window you see at default, except click Add next to the empty pane labelled "Bridged connections:", leave the dropdown box at Ethernet, and click Create.... If you need to customize your configuration at all - for static IP addressing, or whatever - do it here, in the bridge's properties, in the IPv4 Settings and IPv6 Settings tabs, before you click Add. If you forget to do it now, don't worry, you can always come back and edit the bridge's properties later.
  4. In the Editing bridgeX slave Y window you see, select your ethernet adapter in the drop-down labelled "Device MAC address:". Go to the General tab and check Automatically connect to this network when it is available.
  5. Click Save... (in Editing bridgeX slave Y)
  6. Click Save... (in Editing Bridge connection X)
  7. Open a terminal, and run as root: nmcli con show. You should see a connection whose name matches the 'bridgeX slave Y' profile you just created. Copy the UUID of that connection.
  8. Run as root nmcli con up (UUID), using the UUID you just copied. If you look at GNOME's "Network" applet again, you'll see a new connection suddenly appeared under "Wired". The fact that it didn't show up before and we had to use nmcli to turn it on is a bug. At this point, your network should come back up again, maybe after a 30 second or so delay. If you look in ifconfig or ip addr you'll see the IP address is tied to the bridge interface. If you look at brctl show you should see the bridge you created, with your network interface listed in the right-hand column.
  9. Configure your virtual machines to use the bridge as their Network source (in virt-manager, it's one of the properties of the VM's NIC, in the VM details page)
  10. Profit!
  11. If everything works, and it still works after a reboot, you might want to delete the original profile for your ethernet interface, to stop it confusing things.
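
If you'd rather skip the GUI entirely, the same thing can be done with a handful of nmcli commands. This is just a rough sketch - the interface and connection names are examples you'd adjust for your system:

# create the bridge profile and a slave profile for the ethernet adapter
nmcli con add type bridge ifname br0 con-name bridge-br0
nmcli con add type bridge-slave ifname em1 master br0 con-name br0-slave-em1
# optional: disable STP (see 'Reduce the startup delay' below)
nmcli con modify bridge-br0 bridge.stp no
# bring the slave up; the bridge should come up with it
nmcli con up br0-slave-em1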

During Fedora 21 development, two extra steps were needed between 9 and 10:

  1. Run as root sysctl -p /usr/lib/sysctl.d/00-system.conf
  2. Create a file /etc/udev/rules.d/99-bridge.rules containing just this line: ACTION=="add", SUBSYSTEM=="module", KERNEL=="bridge", RUN+="/usr/lib/systemd/systemd-sysctl --prefix=/proc/sys/net/bridge". What we just did in these last two steps is deal with another, rather notorious, bug, which network.service has a workaround for, but NetworkManager does not.

but I have just tested that these are no longer needed with Fedora 21 Final, as a fix for that bug was included.

Setting up a bridge with virt-manager

You may be able to successfully create a bridged network using virt-manager, with some care and a following wind, but I don't recommend it, as there are some bugs that could leave you in a slightly messy state (though probably nothing a reboot wouldn't solve). But if you really want to try it, here's what I recommend:

  1. Run virt-manager
  2. Right click on your host, and click Details
  3. Go to the Network Interfaces tab
  4. Click + to add an interface
  5. Leave the type at Bridge and click Forward
  6. Set the Start mode: to onboot
  7. Do NOT check Activate now: ! Don't do it!
  8. Check the tick mark for your network interface in the "Choose interface(s) to bridge:" pane
  9. Change IP settings: to manually configured, and set the appropriate configuration for your network (DON'T try copying the settings from the existing wired connection, that seems to be really broken)
  10. Click Finish
  11. Do step 9 from the NetworkManager instructions above (configure your VMs to use the bridge)
  12. Now you can try running nmcli con show as root and bringing up the bridge and slave profiles with nmcli con up (UUID) for each, or you could try rebooting. Again, the network should come back up for the host when you get both the bridge and slave profiles active.
  13. Do steps 10 and 11 from the NetworkManager instructions.

Don't try actually activating the bridge from virt-manager. It uses ifup commands, and I ran into various bugs with that (listed later in the post). If you're going to use virt-manager, just use it to create the configs, but use nmcli or ifup manually to actually interact with the connections (and see the bugs linked later).

Reduce the startup delay

By default the bridge will probably take 30 seconds to become active, each time it comes up (actually each time a slave connection comes up, it delays for 30 seconds). This is apparently intended for complex networks where 'bridging loops' are possible if traffic is forwarded wrongly - the connection observes the network traffic flow for a while to see what it should do.

This is part of a protocol called STP which is apparently meant for complex enterprise-y networks with multiple bridges between network segments. It's probably safe to simply turn it off. To configure this, edit the settings for the Bridge connection from the network configuration tool, and on the Bridge tab, uncheck Enable STP (Spanning Tree Protocol).

Alternatively, you can reduce the delay to the minimum. To configure this, edit the settings for the Bridge connection from the network configuration tool, and on the Bridge tab, set Forward delay and Hello time to 2. (Don't try and set them to 0, or you may run into a bug).
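
If you prefer the command line, the same two tweaks look something like this - the connection name here is just an example, use whatever nmcli con show calls your bridge:

nmcli con modify "Bridge connection 1" bridge.stp no
# or keep STP but minimize the timers
nmcli con modify "Bridge connection 1" bridge.forward-delay 2 bridge.hello-time 2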

Background and details

So what's actually going on here? Well, NetworkManager's way of handling bridges is actually not very different from the way the old network service handled them. In fact, at the config file level it's identical. If you already have correct configuration for a simple bridge in /etc/sysconfig/network-scripts you should be able to drop any NM_CONTROLLED=no lines, stop network.service, start NetworkManager.service, and find that NM brings up your bridge successfully. You'll need to do steps 9 through 11 from the NetworkManager instructions to make traffic flow correctly from the VMs, though.
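
(The service switchover itself is just the usual systemctl juggling, something like:)

systemctl stop network.service
systemctl disable network.service
systemctl enable NetworkManager.service
systemctl start NetworkManager.service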

A simple config with one bridge to one ethernet adapter basically consists of these files in /etc/sysconfig/network-scripts (on RH-ish distros):

ifcfg-br1

DEVICE=br1
ONBOOT=yes
TYPE=Bridge
BOOTPROTO=dhcp
IPV6INIT=yes
IPV6_AUTOCONF=no
DHCPV6=no
STP=yes
DELAY=2
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
NAME="Bridge br1"
UUID=xxx
BRIDGING_OPTS=priority=32768
PEERDNS=yes
PEERROUTES=yes

ifcfg-em1 (or whatever your adapter is called, though the name of the file doesn't actually matter any more)

DEVICE=em1
HWADDR=f4:6d:04:9a:1d:45
ONBOOT=yes
BRIDGE=br1

All the 'normal' config options go in the bridge connection's config, note (most of the settings aren't strictly necessary, those are just typical ones from a Fedora install on my system; the UUID= line obviously will be some UUID or other that NM generated for the connection, not xxx). The STP and DELAY options control the bits discussed under "Reduce the startup delay" above.

The interface's config just identifies the interface, says to start it on boot (assuming that's what you want), and says it's a bridge slave interface.

To NetworkManager these are two connections, and you want both of them to be active for the bridge to be running. If you set them both to ONBOOT=yes then the whole thing should just come up at boot time, or you can use nmcli con up to bring them up. Remember that NetworkManager can cope with the concept of there being multiple connections for a single device: you may well still have your original connection for the ethernet interface knocking around, and it may confuse things. If you have issues look in nmcli con show and see if you have more than one connection that's for the ethernet interface, and if the wrong one is active. You probably should just get rid of the non-bridge connection once you have the bridge working. At least set it ONBOOT=no.

If you create the setup using NetworkManager as described above, the bridge connection's name will likely be "Bridge connection 1", and the slave interface connection's name will likely be "bridge0 slave 1", or similar. In /etc/sysconfig/network-scripts they'll be named ifcfg-Bridge_connection_1 and ifcfg-bridge0_slave_1. If you use virt-manager, it'll use old-style interface names, and will actually overwrite the existing connection for your physical device (so at least you won't have two knocking around and confusing things).

As mentioned above, when everything's working, the bridge connection / interface should have the IP address, and 'brctl show' should list the bridge with the slave interface in the column on the right. And sysctl -a | grep bridge should show:

net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0

and your VMs should get IPs in the normal range for your router, and you should be able to connect between VMs and 'regular' systems.
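
On the libvirt side, 'using the bridge' just means the guest's NIC definition points at it; the relevant fragment of the domain XML looks like this (br1 being whatever your bridge is called), and it's what virt-manager writes for you when you set the Network source:

<interface type='bridge'>
  <source bridge='br1'/>
  <model type='virtio'/>
</interface>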

Bugs I found along the way

  1. sysctl.conf / sysctl.conf.d settings not read when modules are loaded
  2. Network control center panel does not show non-active bridge slave profiles (consequently, cannot activate bridges properly)
  3. virt-manager errors in 'ifup br1' when creating and activating a bridged connection on NetworkManager system
  4. Cannot bring up a bridge via ifup without causing an error ('waiting for slaves before proceeding')
  5. Set up bridged connection, active slave connection, activate bridge -> active profile for interface switches from slave connection to 'Wired connection' profile
  6. Fails copying a simple interface configuration to a bridge

Firewire + MythTV + XBMC PVR test setup

So I got a bit impatient after that last post and threw together a temporary test system to see how well MythTV+firewire input works these days, and how well XBMC works as a front end to it.

With cables trailing all over the apartment I temporarily hooked up our ancient DCT-6200 to my desktop, installed mythtv-backend and mariadb, and successfully set up my desktop as a temporary MythTV backend with only a couple of hiccups here and there. Connecting the XBMC system to it as a frontend was pretty easy, and XBMC certainly seems like a viable client experience with a bit of button behaviour tweaking and stuff.

Even using my Rawhide desktop as the backend with the storage on my NAS the performance wasn't bad - just a bit of jerkiness on sports channels - so I think it might be worthwhile throwing together a dedicated setup. I can pick up the newer DCX-3200 boxes capable of tuning h.264 channels pretty cheap on Craigslist, and I specced out a dedicated backend box for a couple hundred bucks, so it shouldn't break the bank.

USB IR MCE remote wake from suspend with Harmony: the missing piece?

tl;dr: if you've read all the references and still can't get your Harmony or other universal remote to wake up a computer, make sure you use the Media Center Extender or Media Center Keyboard device profile in the Harmony software.

One of the feelings that almost makes up for all the hassle that comes with building your own infrastructure is the one you get when that last little bit of a jigsaw puzzle finally fits into place.

So that Harmony remote I talked about recently is hooked up to (among other things) my HTPC box. The HTPC box has been one of my more successful hardware purchases: I've had it running ever since that blog post, and it's just great, really. XBMC and OpenElec are great projects.

Ever since I set it up, I've used a janky USB infrared receiver I got off eBay to control it. It worked fine for a long time, but one thing that never quite worked was that I couldn't manage to suspend and resume the system from the remote. I can't recall whether it was on or off that didn't work with the old one, but one didn't. I just had to control the power manually, which really isn't that big of a deal but ate away at me inside, leaving me a hollow, hollow man.

So that receiver started packing up recently; it'd frequently get stuck repeating keys, or just not register when it was plugged in, throwing USB errors in the kernel logs. So I chucked it and replaced it with a 'genuine' MCE remote transceiver, a Philips OVU4120 (much like this newer model). I say 'genuine' because I bought it off eBay, so who knows, but hey. I set it up with the Philips profile for the Harmony remote, and everything worked fine (as long as I stick the transceiver on a shelf and point it at the wall...yeah, IR is weird), then I thought "hey, maybe on/off from the remote will finally work now!"

Then I tried it, and was a sad bunny when it didn't. I could suspend the system from the remote, but not wake it up.

Now this is one of those topics where if you DuckDuckGo it, you'll find some possibly relevant information, and an awful lot of woo-woo. A fairly typical page is this one. I don't think there's a lot of woo-woo there, but input from kernel folks who know what the stuff that's being cargo culted there actually does would be welcome. It does seem like poking /proc/acpi/wakeup and/or /sys/blahblahblah/power/wakeup is sometimes necessary for some folks, to enable wake-from-USB at the kernel level for the relevant USB host interface. I suspect the usbcore.autosuspend reference is ancient now, but I couldn't say for sure.
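
For completeness, the kind of poking that page is talking about looks like this (run as root) - EHC1 and usb3 are just example device names, and as it turned out none of this was actually needed on my system:

# list ACPI wakeup-capable devices and their current state
cat /proc/acpi/wakeup
# writing a device's name toggles its enabled/disabled state
echo EHC1 > /proc/acpi/wakeup
# per-device wakeup control in sysfs
echo enabled > /sys/bus/usb/devices/usb3/power/wakeup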

None of that applied to me, though. All the entries in /proc/acpi/wakeup and /sys that could possibly be the port which my transceiver was plugged into were definitely set to enabled. I could wake up just fine with a USB keyboard plugged into the same port. I had all the even-possibly-relevant firmware settings I could find set to 'yes please let me wake up thank you very much'. I had XBMC configured appropriately: wake from actual power-off is rarely going to work, so you want to configure XBMC to suspend when it's told to shut off; that setting is in System / Settings / System / Power saving / Shutdown function, set it to Suspend. Everything seemed to be pointing to Go, yet my remote obstinately would not wake up the system.

Grr.

Obviously I couldn't sleep with things this way, so I decided to try just one other thing: I changed the profile I was using for the transceiver in the Harmony configuration. The MCE remote protocol is a standard of sorts, so there are actually a whole bunch of 'devices' in Logitech's database which are listed as being for various HTPCs or remote controls which are really just sending the standard MCE commands, and you can pick any of them if what you're actually talking to is an MCE transceiver. As I mentioned above, I'd picked the one that most closely matched the transceiver I actually bought, the Philips OVU4120. But I'd found a note somewhere that only one specific MCE IR code can actually trigger a wake from suspend, and I wondered if somehow the power command in that Philips profile was the wrong one.

Apparently it was! I switched to the "Microsoft Media Center Extender" profile in the Harmony software, sent a power toggle command, and watched in joy as the damn thing finally actually woke up from suspend.

So yup: if you want to both suspend and wake a computer with a USB MCE IR transceiver using a Harmony remote, do all the other stuff you can read about, but also make sure you use the Microsoft Media Center Extender profile and use the power toggle command. I couldn't find this explicitly noted anywhere else, but it was the bit of the puzzle I was missing.

Happily, in the interim, when I'd given up on this working, OpenElec/XBMC seem to have fixed a bug where the Zotac box I'm using didn't come back entirely reliably from suspend, and it all seems to be pretty bulletproof now. Whee!

I'm now considering a second attempt at a MythTV-based PVR. I had one more or less up and running for a while, but we got annoyed at only having a single tuner, and there were a few other kinks. In the end I bought a new box from the cable co which has PVR functionality if you plug in an external hard disk, but that's proved to be even more of a nightmare so I don't use it any more. It now appears to be the case that I can pick up a Motorola DCX-3200, DCX-3400 or DCX-3510 box reasonably cheap from Shaw or Craigslist, and by all indications those boxes work well with MythTV and firewire control, and Shaw still transmits most channels without the flag that blocks firewire output. I still have an old DCT6200 box in the bedroom, so with that plus two of the 3200s or one of the PVR boxes I'd have three tuners. I can put together a dedicated MythTV backend box (just a couple of big hard disks, some RAM, and firewire inputs are really all it'd need, as there's no need to transcode firewire-captured video) for $300 or so, and XBMC apparently works well as a MythTV front end these days, so I could use the OpenELEC box as the main front end. Maybe I'll give the project a go this weekend. If I pick up the DCX-3510, even if the MythTV plan doesn't work out, I'd have a better 'official' PVR box...

Fedlet update

Since I forgot to actually include any Fedlet news in my last post, here's some instead!

So I've done a few 3.16rc kernel builds in the repo. Modesetting still doesn't work on my Venue 8 Pro, but various other folks have reported it does work on their hardware.

I did a new image build last week, but I can't really test it, because of the modesetting fail. However, I did at least boot it in a VM. Or rather, I tried, and it failed miserably.

Fedora 21 is a bit fragile right now, so I think the image is broken due to bugs in Fedora itself. Given that it doesn't work in a VM and I can't test it on metal, I'm not willing to put it out, I'm afraid. If you have an installed Fedlet, though, you can grab the latest kernel from the repo, and hopefully you should have accelerated graphics. You'll want to drop the custom X config package and the kernel parameters that force the video mode.

Intel still hasn't put out the firmware necessary for the sound to work in the official linux-firmware repository, unfortunately, or released it anywhere under a license that lets me redistribute it, so far as I can tell. I've just contacted them to ask about that again.