All the small things

As well as doing a whole lot of catch-up with mailing lists and Beta stuff during the last few days since I made it back home, I've tried to devote a bit of time to small things - the little issues that are easy to work around and overlook. Some of these might be useful to you too. Presented in no particular order:

Ever been annoyed at how the output of 'history' has no timestamps, and it'd be really handy to know when you ran that command? Try this: HISTTIMEFORMAT="%F %T " history. Neat, huh? Stick it in a /etc/profile.d file and you'll never have to worry about it again.
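The permanent version is a one-line script in /etc/profile.d (the file name here is my invention; anything ending in .sh works):

# /etc/profile.d/histtimeformat.sh
# the trailing space keeps the timestamp from running into the command
export HISTTIMEFORMAT="%F %T "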

Do you have a few mail folders for mails from various Bugzillas? Do you always have a vague feeling you don't really have much useful information about any of those mails till you open them? Well, Bugzilla puts various rather useful attributes in X-Bugzilla headers in every email it sends out. And most mail clients have a setting somewhere to add any given header field to the 'compact' header view: for Evolution, for example, it's in Preferences / Mail Preferences / Headers. I've configured my Evolution instances to display X-Bugzilla-Who (whose activity caused the email to be generated), X-Bugzilla-Product and X-Bugzilla-Component headers whenever they exist, and it's a huge improvement. Also, a really basic thing, but somehow I never thought about it till now: remember you can customize columns in mail clients. The From: column for a Bugzilla folder is entirely useless as it'll only ever show the admin address of the BZ instance. So just get rid of it, and you'll have more space for the Subject: column...

The rest of the things I did are probably more of personal interest, but hey. I finally got around to testing the patch for very slow rebooting on VPCZ1 laptops (my 'old' laptop is one of these) which had been waiting on me for a few weeks. I also finally sat down and figured out why the brightness keys on the same system weren't working right: that's this GNOME bug, and I tested and confirmed the fix for it and asked the devs to backport it to 3.10. I also looked at trying to get middle-click emulation of some kind working on my 'new' laptop (the Dell XPS 13), which has one of those new-fangled trackpads with no real buttons: I did find a way to make it work which I don't really like much, and filed a bug, with Peter Hutterer's help, on the bug preventing me from doing it in a way I would like.

I enrolled the 'new' laptop (which is finally back from Dell with a repaired screen) in my FreeIPA setup, and cleaned up the ugly workaround I'd put in place on the FreeIPA server for the issues preventing it from starting on boot properly, since those bugs have now been properly fixed upstream. The FreeIPA setup has been working completely reliably and unexceptionably since I worked out all the kinks - I'm growing to like it.

I took current backup snapshots of all my server VM images - encrypted and stored on my NAS and (new!) in Amazon Glacier, which is pretty unbeatably cheap for this kind of off-site backup use.

I set up OpenDMARC on my mail server, so now I can apply DMARC policies for domains that publish them, if I feel like it. For now I'm just appending headers on a trial basis, though.
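Appending headers is OpenDMARC's out-of-the-box behaviour, for what it's worth; the relevant knobs look roughly like this (a sketch from memory - check opendmarc.conf(5) for the real option names, and the hostname is made up):

# /etc/opendmarc.conf
AuthservID mail.happyassassin.net
# only mark messages, never reject - the 'trial basis' mode
RejectFailures false
Socket inet:8893@localhost

# postfix main.cf: pass incoming mail through the opendmarc milter
smtpd_milters = inet:localhost:8893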

I did a bit of thinking about exactly why I feel so much more productive here at Mission Control than when travelling with my laptop. Both of my laptops are damn good machines: plenty fast, with good storage and screens and acceptable keyboards. But it still just doesn't really compare. I came up with a combination of factors:

  • A decent laptop keyboard is a decent laptop keyboard, but a proper mechanical desktop board is a completely different beast.
  • I really do love my vertically-oriented dual heads. There is just massively less friction involved in any number of things when you have that much screen real estate available.
  • One I hadn't really considered before: the processing power and storage speed on the laptops is fine, pretty comparable to my desktop. But I don't have anywhere near as good network access anywhere else as I do at home. My home connection is a good 50Mb/sec cable link running through a dd-wrt router with DNS caching, with gigabit ethernet to my desktop. On the road I never have anything anywhere near as fast. pbrobinson's setup was probably the closest, and indeed I got the most work done on my trip when staying at his place, but that was still only ADSL2+ accessed over wifi. At 'home' in the UK I have ADSL2+ at slightly slower than max speed (about 8Mb/sec down, less than 1Mb/sec up) accessed over a fairly antiquated wireless router on stock firmware, and once I started really noticing it, the difference was pretty considerable. Things just don't pop up as instantly as they do for me here. And of course, when I'm at home, happyassassin servers are exactly one network hop across two ethernet cables away from me; when I'm travelling, they're usually a few thousand miles away...
  • Environmental factors: I can usually count on at least eight hours a day at home just sitting at my desk with nothing at all to bother me. The kitchen is five steps from my desk. In the UK the kitchen is two storeys below my desk and I usually get distracted by someone on my way there; there's always something dividing my attention.

Anyhoo, just something I'd been thinking about. Remember, it's still Graphics Test Week, too! Today is Radeon Test Day, and tomorrow will be Nouveau Test Day. I'm still hoping we can pull together Wayland Test Day for Friday, but it's getting a bit tight - I asked the graphics devs today and didn't get a response, so might have to throw something together myself.

Graphics Test Week kicks off soon!

As I promised last week, here's another post about Graphics Test Week. We have the wiki pages and test result pages in place now, so everything's nearly ready to go - I'm just tuning the special Test Day live images at present. We'll be kicking off tomorrow, Tuesday 2013-10-22, with Intel graphics Test Day, then following up with Radeon Test Day on Wednesday 2013-10-23, Nouveau Test Day on Thursday 2013-10-24, and finally, if we get it together in time, Wayland Test Day on Friday 2013-10-25!

As always, we're working to make testing as easy as possible. All the information is there on the Wiki pages, and we'll be providing special Test Day live images which contain all the necessary tools and even help you open the Test Day wiki page and join the chatroom right when you boot up. Reporting results should be easier than ever now we're using a special webapp for it instead of asking you to edit the Wiki pages themselves! So just visit the Test Day page of your choice and follow the instructions, and join us in #fedora-test-day on Freenode IRC to discuss bugs or get help with testing. If you don't know how to use IRC, read these instructions, or just use WebIRC.

Almost everyone has an Intel, AMD or NVIDIA graphics adapter in at least one of their systems, and you'll be able to do all the tests from the special live image, so there's no need to install Fedora. So really, you've no excuse not to join in and help us! Please do!

Fedora 20 Graphics Test Week next week!

Yes, the time is nearly upon us: next week will be Fedora 20 Graphics Test Week! We're still working on the test day pages, but it's all in hand. Tuesday 2013-10-22 will be Intel, Wednesday 2013-10-23 will be Radeon, and Thursday 2013-10-24 will be NVIDIA. Most excitingly for most people, we're aiming to include a Wayland Test Day that will be on Friday, 2013-10-25. Most excitingly for sad test monkeys like me, we'll be using Josef Skladanka's test day result app to track the results - no more tedious hand editing of mediawiki tables!

Handling test results has been a sort of ongoing problem for Fedora QA for a while now. For package validation we have Bodhi (which has its own little foibles...); for our other major workflows, Test Days and release validation, we rely on the wiki as an ad hoc 'TCMS' (test case management system), both for storing test cases themselves and for tracking results. We have looked at various ways of replacing it several times over the years, but looked at on that scale, it's a major project.

The test day result app is neat because it just de-couples one little part of the problem: it only handles test results (the test cases are still wiki pages), and it only handles them for Test Days (it is not designed to handle release validation). That sounds limited, but it means we were able to actually get something written quite quickly that is light and fit for purpose. It's very easy to set up an event within the app for any given Test Day - the process is explained here - and it's pretty easy to file reports in, much easier than the fussy and error-prone process of editing the Test Day wiki page directly.

We've used it for a few Test Days already, and at this week's QA meeting we agreed to make it the 'official' method for reporting Test Day results going forward - we expect most Test Days will use the system from now on. We hope you find it an improvement!

I'll write another post with more details on the Graphics Test Days themselves soon, once we have everything lined up and ready to go - but for now, mark your calendars!

SSD caching Test Day today (2013-10-13)

Sorry for the short notice, but I wasn't expecting a Test Day on a Sunday! Today is SSD caching Test Day. This is a shiny new feature in Fedora 20 which supports using an SSD as a cache for a larger regular hard disk, using a kernel feature called bcache. As this is a brand new feature in Fedora 20 it needs testing, so please, if you can spare an SSD and a regular disk temporarily for testing, come along and help out!
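If you want a preview of what the testing involves, the basic bcache-tools flow goes something like this (a sketch - the device names are examples, and this wipes both devices):

# /dev/sdb = the SSD (cache device), /dev/sdc = the spinning disk (backing device)
make-bcache -C /dev/sdb -B /dev/sdc
# the combined device shows up as /dev/bcache0; format and mount it as usual
mkfs.ext4 /dev/bcache0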

As usual, the Wiki page has testing instructions, test cases, and a table into which you report your results, so head there and follow the guide! Also as usual, the Test Day organizers will be hanging out in #fedora-test-day on Freenode IRC to discuss bugs and help out with testing, so please join in there too. If you don't know how to use IRC, read these instructions, or just use WebIRC. Thanks!

Upcoming Test Days, and Fedora 20 status

If anyone's noticed I haven't been around as much lately, it's because I'm in Europe visiting family and friends (and, later this week, the Brno office!). If anything I'm busier than usual, but there's a lot of dealing with personal administrivia and seeing people, so I'm not getting as much work done as usual. (Plus my internet connections here are much slower and I'm on my laptop instead of my usual mission control, which makes me a lot less efficient.) Normal service should be resumed around Oct 19th, please do not adjust your sets!

We have a couple of Test Days coming up this week: tomorrow (Tuesday 2013-10-08) is Virtualization Test Day, checking out all the latest features of virtualization for Fedora 20, and Thursday 2013-10-10 is GNOME 3.10 Test Day - there are lots of new features in GNOME 3.10 and you might even be able to give it a spin on Wayland, so please do come along if you can make it! As always, the Test Day pages are full of all the information you need to take part, and there will be QA folks and developers in the #fedora-test-day IRC channel to help out.

Fedora 20 is working its way fairly smoothly along the schedule, with some exciting changes - GNOME 3.10 is a pretty big release (as noted above), and the changes to the status of ARM and cloud are great steps in the right direction (and quite...interesting from a QA perspective). The first test compose for the Beta hit last week, and we'll be doing the second soon. The ARM and cloud changes worked out quite smoothly for Alpha, and we're hoping they'll continue that way for Beta and final. Both were actually pretty good for the last few Fedora releases, but the fact that they were 'secondary' in terms of placement and promotion did mean the press and users didn't take them as seriously, I think, so it'll be interesting to see how things work out this time.

Sysadmin adventures: success, glorious success (also ssh tunnelling is awesome)

So I'm pretty happy with myself: I'm sitting in the airport, well off my home network, and all my stuff is still working. The mail server config I hacked up extensively to work with multiple users (never something I bothered to deal with before) is holding up, and so is the FreeIPA-based auth for almost all my services.

I had a bit of a scare when I got to the gate, went to twiddle my web server config a bit, and realized I hadn't set up the ssh pubkey for this laptop as an authorized key in FreeIPA. I'd forgotten that I'd created a new key for my new laptop (the Dell that's sitting in the depot waiting for my "Windows password...") and set only that one as an authorized key. For a minute I thought I was utterly locked out of interactive access.

Then I realized, damnit, there's always a way, and remembered I had trusted my phone, so 'all' I had to do was stick my old laptop's pubkey on that, connect to a system on my local network from the phone, add the laptop's pubkey to its local authorized_keys (which is still read along with the ones from FreeIPA), ssh into that system from my laptop, issue myself a kerberos ticket, and use 'ipa' to enrol the laptop's pubkey in the central store, and I was back in business. Simple, eh? :)
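For reference, that final 'ipa' step looks something like this (a sketch - the username and key path are stand-ins, and note that --sshpubkey replaces the stored key list, so include any existing keys as well):

kinit adamw
ipa user-mod adamw --sshpubkey="$(cat ~/.ssh/id_rsa.pub)"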

I also have no goddamn idea how I never got around to figuring out ssh tunnels until now. I've been using Linux for...er...15 years. Here is the Dumb Monkey's Simple Guide To Ssh Tunnelling:

ssh -D 9001 (some_system_on_a_network_you_want_to_get_at_something_from)

now configure your browser (or other tool) to use a SOCKS proxy on port 9001, possibly hack up /etc/hosts for things that insist on rewriting the hostname you're trying to access them by (koff koff ipa), and you can get at stuff 'behind the firewall' on the network you ssh'ed into. Specifically, by ssh'ing into one of the systems on my network that I allow external ssh access to, I can access the FreeIPA web interface or my router's web interface, even though both are behind the firewall. That simple. I really ought to have worked that one out earlier.
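The same trick works for non-browser tools too; a quick sketch (the gateway hostname is made up, and the IPA web UI path is the usual one):

# -f: drop to the background, -N: don't run a remote command
ssh -fN -D 9001 gateway.happyassassin.net
# --socks5-hostname resolves the name through the tunnel as well
curl --socks5-hostname localhost:9001 https://id.happyassassin.net/ipa/ui/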

Adventures in tech support

So, my laptop screen got broken on the way back from Flock, and ever since then I have been dealing intermittently with Dell, trying to get it fixed under either the warranty or the rather expensive support contract we purchased from them - with, so far, very little result.

The system is now in a Dell repair depot somewhere. With an extremely obviously cracked screen. With an incident report that, as it's been told to me, clearly explains that the screen needs to be fixed and that this is a priority job. (And also should, finally, be fairly straight on understanding that this is the version of the system that comes preinstalled with Linux, which it only took them a week to grasp).

I'm sitting in front of my cell phone, here, on a Saturday morning. It just beeped to tell me I had new voicemail (the phone never rang; Sammy tells me this is a trick you can do by calling the voicemail centre...).

It's a message from the friendly people at the Dell repair depot, telling me to call them back between Monday and Friday, 8 to 5 (despite the fact there's clearly someone there making calls right now, and this is supposed to be a priority job), and tell them my "BIOS password" and my "Windows password".

...weeps

OpenID and Persona off again, temporarily

So the WordPress OpenID plugin is completely broken with PHP 5.5, and the hack I used to have Persona create a new user account on my blog if you didn't already have one has kind of gotten outdated. So neither of those auth methods is much use for the blog any more. Accordingly, I've turned them both off for now - we're back to old-school WP accounts, and captchas for those who don't have 'em, folks. Sorry!

Further sysadmin adventures: where's my FreeIPA badge?

So, hey, I have a working FreeIPA deployment for happyassassin.net.

After I put up my last post about how to do unified authentication for the various services I run through happyassassin.net, I got like five tweets and a couple of comments telling me to use FreeIPA. It's the obvious choice!, they said. It's easy!, they said. It only takes half an hour!, some jackass said.

To these people, I blow a gigantic raspberry.

But first, I figured I was smarter than them anyway, and didn't need it. The services I'm mostly interested in having unified authentication for are mail - which for me means dovecot and postfix - and owncloud. I already had postfix configured to auth via dovecot's SASL (which I'd totally forgotten about), so they were already in lockstep, and owncloud has an IMAP authentication option.

So it's easy as pie, right? Turn on IMAP authentication in owncloud and proceed directly to Go.

One problem: owncloud's IMAP authentication does not freaking well work. You will not convince me otherwise. I've read every goddamn Google result for 'owncloud imap', carefully negotiated all the gotchas documented therein, and it still doesn't goddamn work. Nothing logs anything useful anywhere. All I have to go on unless I feel like breaking out wireshark (here's a clue: I DON'T EVER FEEL LIKE BREAKING OUT WIRESHARK) is a 500 error from the server, and a 500 error from an HTTP server is roughly equivalent to "because I don't FEEL like it, that's why."

So. That was a bit of a dead end. Having cursed for a bit and set fire to some things, I decided, screw it, I can set up FreeIPA. It's a bit like using a large atomic bomb to crack a nut, but it's all shiny and I can probably convince people that it is somehow work-related and it'll, I don't know, save me about ten seconds of sshing into boxes and running 'passwd' when I want to change my passwords, or something. And hey, it's only going to take me half an hour anyway, right?

So, cue that raspberry.

I don't know how half hours work in CERTAIN PEOPLE's worlds, but in mine, it takes me half an hour to get through the damn introduction to the FreeIPA installation guide, never mind the rest. I've been up past 4am every day this week so far fiddling with it. But hey, now I actually have it working, and it is pretty shiny.

I don't even remember all the hurdles I leapt on the way any more, but here are some of them. One, DNS. You kind of want to let FreeIPA do your DNS for you, you see. If you're going to have a 'real' deployment of it on a domain with hundreds or thousands of machines and services, which will change quite a lot, you really want to let FreeIPA do your DNS for you. But I don't want to let FreeIPA do my DNS for me, because of REASONS.

Well, okay, because it's just not a very good fit for my setup, right? I'm running a small home network with a bunch of systems which won't ever be part of the FreeIPA domain, and there are issues with how I have my VPN access set up, and it just...didn't look like it was going to work out very well at all to have my network's DNS run out of the FreeIPA server. It makes a lot more sense to run it out of my router, which is what I'm doing. You can set up a subdomain with the hosts that are actually in the FreeIPA domain and just let FreeIPA manage the DNS for that, but ehhh, that would've been even more finicky in some ways. And the practical drawbacks to running DNS outside of FreeIPA once you get it working don't really apply to me: I don't need my boxes to be able to change their hostnames and push that info to the DNS server, and I don't need to keep moving services around between boxes and have the SRV records updated automatically when that happens, my network just isn't that big; I don't use SRV auto-discovery at all.

So. External DNS it was going to be. Setting up FreeIPA to work with external DNS is kind of a pain in the ass, though. Well, if you know a lot about FreeIPA and a lot about DNS I'm sure it's a walk in the freaking park, but for monkeys, no. That one took me most of one evening, including all the time I spent learning what the hell an SRV record is.

To save you the trouble, an SRV record is pretty much just a DNS record which says 'hey, on happyassassin.net, the LDAP server lives HERE'. Or something like that. You specify a service - which is something like LDAP, or Kerberos, or IMAP, or something like that - and list the name of the server that provides that service, the port and protocol, and optionally a couple of numeric parameters that are used for prioritizing multiple servers that provide the same service on big, grown-up domains (used for load balancing and fallback). In 'stuff I'm kinda glad I learned as a side benefit of this whole project', this is actually how mail auto-discovery works (well, it's the way it ought to work, I think Microsoft has their own ridiculous way of doing it too or something): that's RFC 6186 (on which the Evolution maintainer, Matthew Barnes, has some interesting thoughts, but I digress).
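For illustration, here's roughly what one looks like in BIND zone file syntax - the LDAP service on my domain, with 0 and 100 being the priority and weight values just mentioned:

_ldap._tcp.happyassassin.net. IN SRV 0 100 389 id.happyassassin.net.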

Anyhoo. If you're going to run FreeIPA with an external DNS server, you basically need to stick a whole whack of these SRV records in your DNS configuration. If you let FreeIPA handle DNS for you it does all this for you during server deployment, of course. When you run the server deployment script, FreeIPA will spit out an example configuration file in BIND format that you can use. Unfortunately, the DNS server on my router's firmware is dnsmasq, so I had to figure out how to transpose the BIND config lines to dnsmasq format, which took an hour or so before I grokked it. dnsmasq's man page and example config are invaluable here. And the format is:

srv-host=_kerberos._udp,id.happyassassin.net,88,0,100

srv-host= is the directive for dnsmasq that says 'add this arbitrary SRV record', _kerberos is the name of the service (always has a _ before it in SRV-speak for some reason), _udp is the protocol (ditto), id.happyassassin.net is the server, 88 is the port, and 0 and 100 are those numerical values I mentioned earlier, not really important unless you have multiple servers.
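The full set FreeIPA wants is a whole batch of records along these lines (a partial sketch - the authoritative list is whatever the install script generates for your deployment):

srv-host=_kerberos._udp,id.happyassassin.net,88,0,100
srv-host=_kerberos._tcp,id.happyassassin.net,88,0,100
srv-host=_kpasswd._udp,id.happyassassin.net,464,0,100
srv-host=_ldap._tcp,id.happyassassin.net,389,0,100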

These records also illustrate one of the bear traps I fell into during this process: note the FQDN, id.happyassassin.net. According to the docs I read on the SRV format, just the hostname 'id' is perfectly valid so long as your DNS server will further resolve that to an IP, which mine does. If I set that bit of the line to be just 'id' and query the server with dig, the response I get is sensible. But for FreeIPA purposes, well, it's gotta be an FQDN. That simple. You want FreeIPA to work, make damn sure the server names in your SRV records use FQDNs. Public service announcement right there, folks.

edit: The FreeIPA devs would like me to point out that this is really a Kerberos requirement, not a FreeIPA requirement. Fine, folks, consider it pointed out. I wasn't trying to suggest that FreeIPA was being ridiculous here, just giving appropriate emphasis to the topic. To the monkey who never set up either before, the point is moot: Kerberos is just one of those scary bits that I'm counting on FreeIPA to deal with for me, and save me the trouble of knowing how the heck it works.

On that topic: actually, if you want FreeIPA to work, just generally make sure everything that can possibly be an FQDN is an FQDN. As part of this project I have learned that hostnames are far more psychopathically complicated than I ever wanted to know, but here is the very short version: for every machine that's going to be part of your FreeIPA domain, run 'hostnamectl set-hostname (fqdn)', and check that just plain 'hostname' returns the FQDN for that machine. This:

[adamw@id ~]$ hostname
id
[adamw@id ~]$ hostname -f
id.happyassassin.net

is not good enough for FreeIPA (edit: fine, Kerberos). No. It needs this:

[adamw@id ~]$ hostname
id.happyassassin.net
[adamw@id ~]$ hostname -f
id.happyassassin.net

So just give it that, and don't worry about it too much. If you are trying to be clever and get your hostnames set by DHCP, but when you do that, 'hostname' only gives a shortname, then give up on your clever DHCP thing and just whack the hostname with a hammer. Really. It's easier this way. (Please don't ask me why there is a 'hostname' and a 'hostname -f' because it will make me cry.)

Oh, and also, make sure every machine has a /etc/hosts line like this (for its own IP address and hostname):

192.168.1.XX machine.your.domain machine

Make sure the FQDN comes before the shortname. You can even leave the shortname out. But don't have the shortname before the FQDN.

Another bear trap: it's very nice of the FreeIPA server install script to spit out an example BIND config when you run it with its own DNS configuration disabled, but it rather spoils the effect by immediately running the client install script right after the server install. Here's the problem: the client install script won't work properly unless your DNS records are in order. So inevitably it'll fail, because you haven't stuck all the DNS records in your DNS server's config yet. Sigh.

Somewhere around here I found that the FreeIPA client deployment script doesn't always clean up after itself perfectly when it fails. Bear trap: if you run the client install script once, and it fails, and you fix whatever was causing it to fail, and you run it again, and now it doesn't goddamn well auto-discover the IPA server even though it goddamn discovered it JUST FINE the first time and WHY?, the answer is 'rm -f /etc/ipa/ca.crt'. Bug #1011396. You're welcome. You might also get to a point where it gets so far as to actually register the system with the FreeIPA server but then fails later. If you do, then to do a complete setup once you fix whatever was causing the failure, you'll need to pass --force-join.
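So the retry after a partial registration ends up looking something like this (a sketch - the domain and server values are just my own setup's):

ipa-client-install --domain=happyassassin.net --server=id.happyassassin.net --force-join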

Once I finally worked out all THOSE bear traps, I got into a pretty good swing. I had a test client box which I used to get familiar with the web UI and do a few test policies and things, and it was all working out nicely. So I was about ready to make the jump to actually registering my real desktop systems as clients and using FreeIPA in anger, to manage my HUGE RANGE of login accounts (which is...er...two, plus a 'facebook' account I use to sandbox facebook in a really janky way). Only one thing: sudo.

If you're going to centralize your auth and privilege policies you probably want it to include sudo, right? Bit pointless otherwise. When you find the obviously available instructions for enabling sudo with FreeIPA, you may, like me, be liable to run off screaming. It's not quite as complex as it looks at first glance, but it is still kinda complex. More significantly, AIUI, that approach has the rather significant drawback that it won't work if the system is offline. (Most bits of a typical FreeIPA setup do, thanks to sssd, which does credential caching; you can assume anything in your FreeIPA setup which runs through sssd will work just fine offline, I think).

Fortunately, there is another approach which is both simpler to implement and works offline, so what you should do is just use that. It's documented very concisely here: that is, indeed, all you need to do, on each machine in the domain. You will also want to add this line to /etc/rc.d/rc.local:

nisdomainname your.domain

or else group-based policies won't work, I think.
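I won't reproduce the linked instructions, but for flavour, the offline-friendly approach boils down to a couple of small config tweaks like these (a sketch from memory - treat the actual docs as authoritative):

# /etc/sssd/sssd.conf: add sudo to the services sssd provides
services = nss, pam, ssh, sudo

# /etc/nsswitch.conf: tell sudo to ask sssd for its rules
sudoers: files sss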

So once I found that out, I got on a nice roll of enrolling most of my machines in the domain and setting up a few basic policies. For each machine I'd tidy things up so each existing local user matched the UID and GID for that user in the FreeIPA config, enrol it as a client, add the host to the appropriate host group so my policies took effect, set up the sudo stuff, check everything was working, and then delete the local user accounts. It was all working nicely all along the way.

Then I thought 'hey, I'm basically nearly done. I guess I should shut down the FreeIPA server VM and take a backup copy of it at this point.'

So I did. Then I booted up the FreeIPA server again. And found...no FreeIPA server at all. Just a failed ipa.service, with no particularly obvious reason given in the logs. Not much to get my teeth into at all.

So I debugged that for four hours this evening, and let me tell you, that wasn't a fun one. It was complicated by the slightly weird way FreeIPA initializes itself: most of the relevant systemd units are disabled, and ipa.service actually starts them up itself. I'm guessing Lennart isn't a fan. I thought the fact that they were disabled was the bug and wasted hours fiddling about trying to enable them 'correctly', which wasn't the problem at all. No, it was this GIANT bear trap:

Sometimes, it seems, when you boot a FreeIPA server machine, the directive in /etc/tmpfiles.d/dirsrv-YOUR-DOMAIN.conf which tells systemd to create a directory called /var/lock/dirsrv/slapd-YOUR-DOMAIN...just doesn't work. That causes 389 Directory Server (FreeIPA's LDAP server) to fail to start.
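The directive in question is a one-liner along these lines ('d' means 'create this directory with this mode and ownership'; the mode and owner here are from memory):

d /var/lock/dirsrv/slapd-YOUR-DOMAIN 0770 dirsrv dirsrv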

edit: Thanks to viking-ice and nkinder, I find that this was recently re-reported, and the new bug reports have been far more fruitful. This is getting fixed, in multiple places (not just 389-ds has the problem, it turns out):

FreeIPA #996716, pki-core #996847, 389-ds #1008306

Once I finally found that ticket, fixing it was easy: just manually run 'systemd-tmpfiles --create', restart ipa.service, and Bob's your uncle.

Well, he would be, except for this:

[adamw@adam ~]$ ssh id
System is booting up.
Connection closed by UNKNOWN

Ladies and gentlemen, what the freaking freak?

Maybe some of you ran into that error before. I don't know. But judging by Google, a lot of you are like me, and never have: Googling turns up some results, but not many, and they're mostly Arch users stumbling around and kinda fixing it but not really knowing what they did. I found one reference which clued me in to exactly what was going on, though.

If you ever see that message, what it means is that the file /var/run/nologin exists. This file seems to be as old as the hills, but I'd never come across it before. The idea is that it's created early in system startup and deleted at the end, then again created early in system shutdown, and while it exists, the system will deny any kind of login to any user but root. In a modern-day Fedora-y system, this is implemented by systemd (creates and destroys the file) and PAM (denies you access if it exists). The contents of the file are what is displayed when you are denied login; that's the "System is booting up." message in my output: systemd creates the file with the content "System is booting up."

So here's what happened: that file is actually created during early boot by systemd-tmpfiles, on a systemd system. It's then deleted at the end of boot by systemd-user-sessions.service (which more or less exists solely to do this). But if you have a problem with systemd-tmpfiles and run it manually to correct that problem...hey, you just created /var/run/nologin, and ain't nothing gonna delete it. You just locked yourself out. Congratulations.

The dumb fix for that is 'rm -f /var/run/nologin' after you run systemd-tmpfiles --create, but the smart fix is just to use a better systemd-tmpfiles command:

systemd-tmpfiles --create /etc/tmpfiles.d/dirsrv-YOUR-DOMAIN.conf

That tells tmpfiles to run only the directives from that specific config file, which is the only one we need, and avoids the whole /var/run/nologin problem, as that file is listed in a different tmpfiles config file.

So for now I just disabled ipa.service and have an icky /etc/rc.d/rc.local which runs systemd-tmpfiles then starts ipa.service, and that's working well enough for now.
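Concretely, the icky rc.local amounts to this:

# /etc/rc.d/rc.local - temporary workaround for the tmpfiles bug described above
# create the dirsrv lock directory that sometimes doesn't get created at boot
systemd-tmpfiles --create /etc/tmpfiles.d/dirsrv-YOUR-DOMAIN.conf
# then bring up FreeIPA by hand (ipa.service itself is disabled)
systemctl start ipa.service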

And...whew, that brings me pretty much up to RIGHT NOW. I have a FreeIPA server which works, and survives a reboot, and all my systems but one are members of the domain, with no local user accounts any more, using the FreeIPA server for authentication and such. Oh, yeah, forgot to mention, one rather useful feature of FreeIPA is that you can store ssh pubkeys for each user, for 'authorized_keys' purposes: instead of having to create ~/.ssh/authorized_keys on every goddamn machine you ever configure, if you use FreeIPA, authorized_keys for any given user will just be read from the IPA server, so for any machine that's a member of the domain it'll "just work". Nice.
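(If you're wondering how sshd knows to ask the IPA server: ipa-client-install points it at sssd's key helper, with something like the following in /etc/ssh/sshd_config - option names from memory, so double-check against your OpenSSH version:)

AuthorizedKeysCommand /usr/bin/sss_ssh_authorizedkeys
AuthorizedKeysCommandUser nobody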

Just one thing: I haven't enrolled my mail server yet. Which is the original goal of the whole bloody exercise. But that's because it'll be the most complex bit - as postfix and dovecot are using the system user database for auth at the moment, I'll have to FreeIPA-ify those services as well, when I register it. So I was saving it for last. owncloud is using its own user db ATM, so I could happily register www in the FreeIPA domain and wipe local user accounts without breaking owncloud, but for mail, I'll have to work through the guides for setting up postfix and dovecot to use the FreeIPA server. But once I've done that, I should be home freeeee. And then, with luck and a following wind, I can pretty much grant anyone access to the main happyassassin.net services just by creating them a user on the server and sticking them in a particular group. Everything else should happen magically. Neato.

Per-domain DNS on Linux using a local caching server

More adventures in sysadmin time!

I had a bit of a sticky conundrum to deal with this evening. After setting up local DNS yesterday, I realized I'd have a problem for machines I have that connect to the Red Hat VPN. Here's the conundrum: my local DNS server can resolve, say, 'nas.happyassassin.net' - a box that has no public DNS record because I don't want to allow external access to it - but the Red Hat DNS server obviously can't. But my local DNS server can't resolve, say, 'topsecret.redhat.com', a box on the Red Hat VPN but again with no public DNS record.

How do I set things up so a box connected to the RH VPN can resolve both hosts on the local network and hosts on the RH VPN?

edit: Peter Robinson has pointed out that if you use NetworkManager, it should just handle this. I still run static network.service on my servers - I think because it just wasn't plausible to use NM when I initially configured them. I could probably switch to NM, now, I'll have to look into it. But the below may still be of use to others.

You'd think just listing both nameservers in /etc/resolv.conf would work (perhaps with a delay while it hits up the one that won't work, for the case where the one that won't work is listed first), but it doesn't seem to. This is how things wind up if you just use everything 'out of the box', and in that configuration, I can resolve stuff on the VPN (whose servers wind up listed first) but not the local network (whose servers wind up listed second). I don't know why; if someone does, do let me know.

Unless I missed something somewhere, it sure ain't simple. One option would be to have your router connect to the VPN, of course, but that has other drawbacks, and anyway, the firmware I have on my router at present isn't capable of acting as a VPN client. I poked through name resolution documentation until it became pretty clear that this just isn't something you can really do 'simply'. There's no nsswitch option to say 'send lookups for .redhat.com to THIS name server but lookups for .happyassassin.net to THIS name server'.

But! You can do it with a bit of finesse and a local copy of dnsmasq. Basically the approach is to set up dnsmasq as a simple local caching name server using the router as its 'upstream' server, and then have the openvpn bring-up process write a little bit to dnsmasq's configuration which tells it 'use these DNS servers I just found for all redhat.com lookups'. As long as the VPN isn't up, dnsmasq is basically just forwarding all requests to the router; when the VPN comes up it keeps doing that for almost all requests, but requests for redhat.com addresses get sent to the RH server instead. Here's how I did it:

Install dnsmasq. Edit /etc/dnsmasq.conf - very few changes needed here, you just want to set listen-address=127.0.0.1 (to make sure nothing outside of the local box sees it, as we're just using it for this trick - we get caching as a bonus), and resolv-file=/etc/dnsmasq-resolv.conf (or any other file that isn't resolv.conf).
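So the whole local config delta on the dnsmasq side is just those two lines:

# /etc/dnsmasq.conf
# only answer queries from this box
listen-address=127.0.0.1
# read upstream nameservers from here instead of /etc/resolv.conf
resolv-file=/etc/dnsmasq-resolv.conf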

EDIT added later, forgot it at first: Edit /etc/resolv.conf so it simply reads:

search yourdomain yourvpndomain
nameserver 127.0.0.1

i.e., send everything to dnsmasq. I suppose you could add the Google or OpenDNS addresses as a fallback in case everything goes south. Also edit /etc/sysconfig/network-scripts/ifcfg-(ifname) and add PEERDNS="no", which should stop the network service overwriting resolv.conf every time you bring up the connection (this step is inexplicably missed out of every 'use dnsmasq as a caching server' guide I've ever read, so I had to figure it out for myself).

If you actually need to have your new 'not-resolv.conf' file populated via DHCP every time you bring up a connection you have a bit of a conundrum to solve here; fortunately I don't, because I'm just always going to want it to use 192.168.1.1, so I simply hand-write it with 'search happyassassin.net redhat.com' and 'nameserver 192.168.1.1'.

Now the tricky bit. This is the bit that'll vary as well depending on exactly why you want to use per-domain DNS. But for me, this is how it goes. The recommended openvpn config scripts for our VPN include this wonderful bit of elegant shell scripting. On VPN up:

if [ -n "${dns[*]}" ]; then
    for i in "${dns[@]}"; do
        sed -i -e "1,1 i nameserver ${i}" /etc/resolv.conf || die
    done
fi

On VPN down:

if [ -n "${dns[*]}" ]; then
    for i in "${dns[@]}"; do
        sed -i -e "/nameserver ${i}/D" /etc/resolv.conf || die
    done
fi

So, I just munged that up a bit, and made it do this instead. On VPN up:

if [ -n "${dns[*]}" ]; then
    for i in "${dns[@]}"; do
        sed -i -e "1,1 i server=/redhat.com/${i}" /etc/dnsmasq.d/redhat.conf || die
    done
    systemctl try-restart dnsmasq.service
fi

On VPN down:

if [ -n "${dns[*]}" ]; then
    for i in "${dns[@]}"; do
        sed -i -e "/server=\/redhat.com\/${i}/D" /etc/dnsmasq.d/redhat.conf || die
    done
    systemctl try-restart dnsmasq.service
fi

Beautiful, innit? Just beautiful. Instead of haphazardly munging up resolv.conf we're now haphazardly munging up a dnsmasq config snippet. MUCH better. You need to make sure redhat.conf exists and contains at least one line before that mess will work - sed's 'i' insert silently does nothing on an empty file, as there's no line 1 to insert before. What that does, with a following wind, is write lines like this to /etc/dnsmasq.d/redhat.conf when the VPN comes up, and delete them when it goes down:

server=/redhat.com/XX.YY.ZZ.FOO

which in dnsmasq syntax means 'send requests for this domain to this server'.
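A quick way to check both halves are behaving (dig lives in the bind-utils package):

# should be answered via the router
dig +short nas.happyassassin.net @127.0.0.1
# should be answered via the RH server, while the VPN is up
dig +short topsecret.redhat.com @127.0.0.1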

I've tested this like twice for thirty seconds, so I'm pretty sure it's bulletproof! Please leave comments indicating exactly how I have sinned against nature this week, I'm always willing to learn...