Further adventures in media streaming: don't buy a Popcorn Hour, and props to XBMC and OpenELEC

So, I said back last April that I'd found an ideal solution to media streaming, replacing my big heavy complex HTPC system with a Patriot Box Office dedicated streamer.

It turns out that I lied...kinda.

After several months of experience, I was almost happy with the PBO. It's really impressive for the price, size and power consumption. But it had a couple of irritating flaws: it couldn't play one particular set of large video files I had, and every so often when playing back a subtitled video - more with certain videos than others - it would 'drop' one of the subtitles: it just wouldn't be displayed. This obviously gets irritating over time. The box was pretty well served with firmware updates and alternate firmwares, but the PBO itself and the other similar devices from which alternate firmwares were derived are all pretty much out of support now, so it's unlikely that'll ever get fixed.

So I came to the conclusion that the obvious solution was...a better media streamer, right? Or rather, a more expensive and heavily community-hyped one. So I went out and pre-ordered a Popcorn Hour A-300 for $220 plus shipping.

I highly recommend you DON'T do that.

The thing is unfit for purpose and has been since release. It has had three firmware updates, none of which has fixed the three serious bugs I've had with it:

  • It frequently turns itself off (into suspend mode) when you try and quit the screensaver
  • It's incredibly slow to browse directories on a CIFS server; every 22 entries there is a pause of a second or so. My NAS is not the world's fastest, but no other device I've ever used to browse that share - other, cheaper, streamers, various Windows and Linux-running computers, hell, even some phones - has had that problem. The forum thread on this issue is full of others with the same problem
  • It frequently loses synchronization between the video/audio and the subtitles when you skip around inside a subtitled video (or, sometimes, just when you play one)

These bugs are pretty much showstoppers for my use of a media streamer. They've each been confirmed by dozens of other users - I'm not some kind of weird bug-inducing outlier. Syabas is clearly aware of them, as they have people reading the forums. Yet in four months since my box shipped, they haven't been fixed. This is utterly unacceptable for a streamer that costs more than twice as much as its competition; I don't expect it to be worse. In addition to the above three main bugs, I've had it crash on me several times. It's crap, and Syabas's support sucks. Don't buy one.

I'm currently trying to persuade them to give me a full refund on their clearly unfit for purpose product, but so far they've only offered to do so if I ship it back in the original box. Seeing as I live in a 500 square foot apartment, I don't have the original box - it would be utterly impractical to keep the original packaging for everything I ever buy lying around. Just not going to happen. So right now I have a complete waste of $220 sitting on my dining table. Thanks, Syabas. Jackwagons.

I decided to give media streamers one more roll of the dice, and got a WDTV Live for $100 - it's from Future Shop, so I can return it within a month. It turns out to be in the same class as the PBO: it mostly works - it's a lot less buggy than the PCH, which costs over twice as much - but has just a few irritating flaws which I can't quite live with. It doesn't display stylized subtitles - a pretty small nit, but I kinda got used to them on the PCH and I miss them now. More importantly, it seems like you have to 're-scan' a directory every time you add a file to it or it won't pick the file up, which is a pain in the ass, and it has no ability to 'permanently mount' a shared drive by IP address; it always needs to browse for it, and CIFS share browsing is...ahem...not the most reliable thing in the world. So sometimes it just doesn't see the NAS, which isn't great.

So I finally decided, the hell with media streamers. I briefly considered running XBMC on an Apple TV or getting a Boxee but I could just see too many potential problems with those routes, and I was sick of it.

I'm going back to having an HTPC. But it's not going to be big and heavy and it's not going to run a full Linux distro with the concomitant (ooh! word of the day!) maintenance hassles.

I got me a Zotac AD02: it's an AMD-based 'nettop', a netbook-class system in a miniature desktop-ish chassis, pretty much tailored for HTPC use. It's a bit bigger than the PBO and WDTV Live, but smaller than the PCH. It's pretty efficient on power, and can even be attached to the back of a TV with a VESA mount - me likey. Comes as a fully-specced system with 2GB of RAM and a 250GB hard disk (no need for even that much with all the media on the NAS) for $300 - not a lot more than the PCH plus shipping, but this thing's a fully-powered PC.

Then I put OpenELEC on it. This is a brilliant project - it's a dedicated XBMC-based media centre distribution, rather like OpenNAS for NAS boxes or OpenWRT for router boxes. It's ridiculously easy to install and works - at least, as far as I can tell so far - very, very well. You download the appropriate build for your hardware (the Fusion build for my box; they have an Ion build for NVIDIA-based boxes), plug a USB stick into your system, run a little installer script which turns the USB stick into an OpenELEC installer, plug the stick into your HTPC box, boot it up, hit enter a few times, and it installs the distro onto the hard disk in the box. You reboot, and XBMC comes up. It's literally that straightforward - it's barely harder than updating the firmware on a dedicated streamer box. Really impressive stuff.

You get ssh access, though it's pretty limited and really only meant for debugging. You can use it to set up 'permanent' mounts of shares, though, which is all I really need. It boots in about ten seconds even from a hard disk and performs pretty nicely. And you get XBMC. To a traditional F/OSSite XBMC is a pretty icky project, and it's a bit of a headache to package within distro conventions, but if you consider it as being basically a standalone blob for turning PC hardware into a media centre, it's awesome. Beats the software on any dedicated streamer I've seen or tried into a cocked hat.
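
For the curious, the 'permanent' mount trick is about as simple as it sounds - a sketch, assuming OpenELEC's usual convention of an autostart.sh script under /storage/.config, with the NAS address, share and mount point as placeholders for whatever your own setup uses:

    # /storage/.config/autostart.sh - run by OpenELEC at boot
    # mount the NAS video share somewhere XBMC can always find it
    mkdir -p /storage/nas-videos
    mount -t cifs //192.168.1.10/videos /storage/nas-videos -o guest,ro

Add the mounted path as a video source in XBMC and it's always there, whether or not CIFS share browsing is feeling cooperative that day.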

I'm sure I'll figure out something wrong with this setup, given time. But so far I'm happy with it. Now all I need to do is find the damn USB transceiver for my MCE remote so I can remote control the thing...

Fedora 17 Test Days start up this week: desktop localization and OpenStack

John Dulaney has been doing a great job of co-ordinating the Fedora 17 Test Day cycle so far, so many thanks to him for that. This week sees the first two events of the release cycle: desktop localization (l10n) Test Day on 2012-03-07 (Wednesday) and OpenStack Test Day on 2012-03-08 (Thursday).

The desktop localization event focuses on checking translation completeness and quality across the most visible desktop applications. You can help out if you can read any language other than English - just head over to the test day page, check the translations for the listed applications in your language, file bugs if you find any errors, and fill out the results table.

OpenStack is an open source cloud stack which has gained a lot of momentum and support across the industry, and Fedora is working to include OpenStack in Fedora 17. It looks like the Test Day organizers have come up with a set of test cases to work through the entire process of setting up an OpenStack-based cloud and deploying an instance on it, so this will be a great event for ensuring Fedora has a fully working OpenStack...stack...and could even work as an introduction to OpenStack if you've always been interested in it but haven't had the chance to dive in yet.

You can even contribute to both events without ever installing Fedora on real hardware - apparently it's possible to run OpenStack on a virtual machine, which is really just showing off. So you have no excuse for not coming along and helping ensure Fedora 17's desktop translations are in good shape, and that it can act as a great platform for OpenStack deployments!

We'll hope to see you there in #fedora-test-day on Freenode IRC - you can use WebIRC to connect if you're not a regular IRC user. There will be experienced members of the relevant teams on hand to help you out with testing and reporting your results all day long. Let's get the Fedora 17 test day cycle off to a great start!

Fedora 17 Alpha released!

As a note - I know I've been blogging less recently. I'm finding that being the QA team lead kind of changes my approach to things and I just wind up blogging about stuff less. I keep meaning to make an effort to do it more but it never quite comes together. So sorry about that! I'll keep trying.

In more exciting news - yep, we released Fedora 17 Alpha today. F17 is going to be a pretty cool release with a bunch of interesting new features, and since there aren't any gigantic changes to anaconda or grub this time, I'm hoping it will actually be fairly solid.

As it happens, though, the Alpha is not terribly solid. There's naturally quite a wide range of variation in the quality of Alpha and Beta releases, because our minimum standards for the quality of Beta and especially Alpha releases are quite low; often we happen to exceed them, but there's no guarantee of that for any particular release. It just so happens that the F17 Alpha doesn't exceed the Alpha quality requirements by much, and it's palpably an Alpha - it has some rather large and obvious bugs you'd never expect to see in a final release. This isn't any kind of huge emergency, though, and it doesn't imply anything about the quality of the final release.

So I'd recommend that if you're going to try out the Alpha, do listen to the usual boilerplate and think really hard before you do so on your production machine...and whether you do that or install it on a test system / partition, read the common bugs page, because it provides a lot of useful information on the most severe and obvious bugs in the Alpha, including workarounds or fixes in a lot of cases. You're really going to want to check it out before you go ahead and install the Alpha.

Once you've read that - go ahead and download away! And remember, if you hit a bug - first check that it hasn't been reported already, of course, but then please do go ahead and file a report. The main purpose of pre-releases is to help us catch bugs and fix them before the final release happens, so please do your bit to help us do that. Thanks, everyone!

(random note: Ryan Seacrest, put your damn tie on.)

Go LibreOffice!

A propos of nothing in particular, I just wanted to echo Jono in giving it up to LibreOffice. Ever since the fork everyone involved with LO has done fantastic work on both the coding and community building side; I'd say they're a great model for how such a large F/OSS project should work. The substantial improvements they've made to the project in such a short time are impressive. Kudos to the LO team!

PSA: Attention News Sites

Hello, CNNs, CBCs, BBCs and so forth of the world!

I am visiting your web site through a web browser. Its primary interface is text. I am doing this because I wish to read things.

I do not wish to see fifteen interesting headlines linked from your front page, all of which turn out to be videos. In the time it takes your ridiculous Flash-based video player to load up - never mind for the stupid intro sequence to play, or the possible pre-segment advertising, and double-never-mind the story itself - I could have skimmed the text of the story and moved on. Video is a hideously inefficient method of transferring information that's basically textual. I don't have time to wait for it. I came to you using a primarily textual form of communication because I wanted to read text.

If you must offer videos for the damn Youtube generation, who need to get off my lawn already, so be it. But please at least provide a text version of the story underneath the stupid video player, so I can learn about the story in ten seconds and move on, rather than having to wait two minutes for some ass in a bad suit to tell me about it.

Thank you!

FUDCon Blacksburg 2012: QA stuff

Here's the last in my FUDCon 2012 post series! This time covering QA things that happened at the event.

We had a nice group of QA folks - myself, Tim Flink, John Dulaney and Sandro Mathys were all there, plus many of our regular co-conspirators from the development, anaconda and rel-eng teams.

The 'biggest bang' we had on the schedule was a hackfest for testing the new anaconda UI that had been planned to land in Fedora 17, but things changed substantially on the day. The big news is that the new UI is not close to being done, and won't be within the Fedora 17 timeframe, so it's no longer planned to land in Fedora 17.

So we turned the hackfest into a planning session for how we'd handle landing the new UI, then anaconda team took it over to actually work on some new UI design issues. As far as the release plan goes, we came up with the following broad agreement:

  • Prior to the new UI landing anywhere, anaconda team will generate images which include the new UI for testing, and provide a quick and dirty list of what's actually testable at any given time
  • New UI will be landed into Rawhide as early as possible after F17 is branched, giving the longest possible window to test it for F18
  • Brian Lane and I will work together to try and ensure that any major bugs found in the old UI during the F16 cycle - ones left unfixed on the justification that they'd be irrelevant under the new UI - get fixed for F17, now that it will be using the old UI after all

This way we can take advantage of the 'no frozen Rawhide' system to provide a >6 month window for new UI testing, which should definitely help smooth out the new UI bumps for F18.

Outside of that, we had an AutoQA hackfest on Friday afternoon where Tim and John worked out some planning, and a QA presentation on Saturday which went off well. We had an Introduction to QA hackfest planned for Sunday morning, but being the first session on the day after FUDPub, attendance was around zero, and it got subsumed into the anaconda session.

We also did lots of checking in with other groups on specific issues; for example, I got a useful request for better navigation of the validation results matrix pages, we heard from Luke that Bodhi 2.0 is really going to be coming Real Soon Now, and we talked to several people on the ARM team about ramping up efforts on ARM testing. All in all it was an enjoyable and productive event.

Fixins

So what does it make sense to do the week after FUDCon, with Fedora 17 Alpha impending?

Why, work on fixing random annoying bugs, of course!

That's pretty much what I've done this week. For some reason - I can't even remember what the reason was any more - I decided to finally nail down the bug in libconcord which prevented things from connecting to the remote without root privileges. libconcord is the library for programming Logitech Harmony smart remote controls: you're supposed to use Logitech's web interface to define how you want the remote to work, then it sends you what's effectively a custom firmware file for the remote, which needs to get flashed onto the remote. libconcord is what handles that flashing, but for a long time, apps based on it (concordance is the CLI and congruity the GUI) have needed to run as root, which is kind of useless, as we can't just trigger congruity from Firefox and have everything 'just work'.

With lots of help from Kay Sievers (thanks Kay!) we managed to nail down the couple of issues which were screwing it up, so now it's fixed, until the next time they decide to change udev, anyway: if you have a Harmony remote, you can just 'yum install congruity' and go use the Logitech web interface, and when it sends you a file, congruity should pop up and handle writing it to your remote, nothing else required. Make sure you have updates-testing enabled, of course.
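
For the curious, the class of fix involved is udev rules that grant ordinary users access to the remote's USB device when it's plugged in. Purely as an illustration - the IDs below are placeholders, and the rules libconcord actually ships are a bit more involved - it's the sort of thing that looks like this:

    # /etc/udev/rules.d/99-harmony-remote.rules (illustrative only)
    # make the remote's USB device accessible to ordinary users
    SUBSYSTEM=="usb", ATTRS{idVendor}=="abcd", ATTRS{idProduct}=="1234", MODE="0666"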

Today I decided, again for no very good reason, to finally nail down the mysterious 25 second delay I and (probably) others sometimes see after waking from suspend, before the network kicks in.

This turned out to be one of those extremely annoying bugs where you're trying to track five different variables at once: at various points I thought it was somehow down to pm-utils' convenience 'dbus_send' function vs. the straight command 'dbus-send', or the fact that packagekit's pm-utils sleep hook didn't run its dbus-send command with && at the end, and it didn't help that I seem to hit the bug reliably on the first suspend after boot, but not if I suspend again soon after. But finally I'm pretty sure I figured it out.

When you suspend, pm-suspend calls all the hooks in /usr/lib64/pm-utils/sleep.d in numerical order. When you resume, it calls the same hooks in reverse numerical order (just like old-school SysV boot and shutdown) - so when you're resuming, 95packagekit happens before 55NetworkManager. This turned out to be the key thing. For some reason, the dbus-send call in 95packagekit often fails on my system on resume. The --print-reply parameter means that the command doesn't return immediately and leave dbus to work in the background; instead it blocks, and only returns when it gets a reply or times out. So when the call fails, it has to time out before pm-suspend will proceed to the rest of the hooks (and the rest of the resume process in general).

Just dropping the --print-reply parameter seems to fix the problem nicely: the network comes up quickly on resume, and the dbus call now even seems to succeed, a few seconds later; I suspect that, for some reason, the network has to be up for it to succeed reliably. There's a second bug, which is that the dbus methods used by 55NetworkManager got killed in 0.9, so that hook doesn't actually work; but this is less important than you'd think, because NM has a 'backup' mechanism where it wakes up in response to upower waking up, and that kicks in just fine as long as 95packagekit isn't blocking it.
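
To make that concrete, here's what the --print-reply difference looks like on an arbitrary dbus-send call - the service and method names here are made up for illustration, not the actual contents of the 95packagekit hook:

    # with --print-reply, dbus-send blocks until it gets a reply or hits the
    # timeout; if the service on the other end never answers, that's a long wait
    dbus-send --system --type=method_call --print-reply \
        --dest=org.example.SomeService /org/example/SomeService \
        org.example.SomeService.StateHasChanged string:resume

    # without it, the call is fire-and-forget: dbus-send returns immediately
    # and the rest of the resume hooks (55NetworkManager included) can run
    dbus-send --system --type=method_call \
        --dest=org.example.SomeService /org/example/SomeService \
        org.example.SomeService.StateHasChanged string:resume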

So that's two annoying niggles that have been bugging me for months cleared up, and my FUDCon todo list entirely neglected. Fun times! Jon Stanley, I promise you those links soon, though. I will also finish up my FUDCon wrap posts with one looking at QA activity, of which there was quite a bit.

We start ramping up for Fedora 17 in earnest around about now: the first RATS run is scheduled for this week, and the first blocker review meeting is next Friday. So everyone start clearing your Friday schedules!

FUDCon Blacksburg: Social track

(This is part 3 of my FUDCon odyssey, which I'm trying to split into smaller chunks as per Adam Young's suggestion. Part 1 is my Day 1 summary. Part 2 is a post on the topic of the talk I gave on the barcamp day.)

One of the most fun bits of FUDCon is always the social track, so here's a quick recap of my involvement in that. On the first night I had dinner with a big group at an Indian restaurant, which was pretty good though somewhat under-spiced. I spent most of the time talking to those of Chris Tyler's current group of students at Seneca who made it to FUDCon. The project Chris runs, getting undergrad students involved with real-world F/OSS projects including Fedora, is a fantastic one, and it's amazing to see the amount of effort he puts into it. It has real results, as well - the current crop of students is responsible for building and maintaining the current Fedora ARM build farm, for instance. They were a bright group and it was a pleasure hanging around with them all weekend.

Back at the hotel, I hung around in the lobby for a while, chatting with various groups. The evening wound up quite early for a FUDCon, though. Will Woods was fulfilling his traditional role of feeding people terrible, terrible liquor, a point which will become important later on: I don't remember what it was he'd brought this time, but whatever it was, it was awful.

On the second night, There Was FUDPub. It was a pretty good time, with bowling and pool available, and decent food with two drinks a person. By the time I got up for my last drink, there was only Michelob Ultra and Coors Light left - that's right, they'd run out of beer!

I spent the night bowling, because I rarely get a chance to do it (the last time was at the last FUDPub, actually) and I do enjoy it when I get the chance. The night was capped off by an epic showdown with Greg DeKoenigsberg, in which I bowled back-to-back strikes in the tenth frame, leaving me on 130 and needing essentially just about anything to beat Greg's 131...at which point I felt the pressure and sent down a genuine, honest-to-god gutterball. Sigh.

Once FUDPub finished up, it was back to the hotel for the main event - poker. We have an informal Fedora Poker SIG of Red Hatters and Fedorans who are somewhat nutty about poker, and for the last couple of FUDCons, we've run a game. Before that, though, the Woods Saga had to come to an end. Revenge...had to be taken. I rolled up to the bar offering drinks to random people at the anaconda table, 'happening' to include Will in the offer. He fell for it hook, line and sinker, and when offered a choice of drinks, uttered the fateful words 'surprise me!'

Well, I studied the bar for a good long time, and eventually decided the most evil thing on offer was Bacardi 151. So I bought him a shot of that as his 'surprise drink', and the table chipped in heroically by telling him to down it.

He did.

The look on his face will keep me going until the next FUDCon. I like to think a straight shot of 151 had something to do with the subsequent merriment that involved large quantities of sparklers, because...it was Mr. Woods who wound up holding 'em. You're welcome, FUDCon.

Poker then ensued. The first game, honestly, was not much fun: we had a table of ten people, many of whom hadn't played much before (especially live), odd chip values and stacks, and Greg deciding he was not going to play an especially friendly game. What with the nine-handed flops and Side Pot Math 406 courses, it took about two hours to play ten hands. I played tight then wound up all in with top pair and a king kicker against nothing much, until nothing much turned into a set and I was done. The game came down to Greg and John Dulaney, who won by the expedient of never folding anything and winning all his showdowns - hey, nice when it works.

After that, though, we had a second game with fewer, more experienced players, and standard chip values and stacks. It worked out much better, and was a really fun game to be in. We were five handed for a while, then people dropped out due to tiredness or drunkenness, I went bust (my poker specialty: playing a perfectly solid game and always losing showdowns), and it came down to Casey Dahlin and a cool guy named Ivan who isn't on the signup list, so I don't know his surname or who he actually is. I played dealer, and they had a really well-played hour of heads up, which was lots of fun. We finally wound up turning in about 4:30am.

There wasn't much social track on the last day, as most people were leaving, including me. So that was pretty much it for Blacksburg. It was fun to catch up with people again, and FUDPub and poker night are pretty much my favourite things about FUDCon, most times!

FUDCon Blacksburg: My presentation, Cloud 0.1

In deference to Adam Young, I'm going to try and write a series of broken-down posts on FUDCon, rather than one or two giant mish-mash-y summaries.

So, this one's about the presentation I gave, titled 'Cloud 0.1', with a subtitle I haven't quite nailed down yet, but which is something like 'Why Not to Spend Lots of Time and Energy Running Your Own Infrastructure Much Worse Than Google Would, And How To Do It If You Insist'.

I've had the idea for a while now, but being lazy, didn't write anything at all until the day before FUDCon, nor make any slides. Then I pitched it. To my surprise, it got enough votes to be scheduled. To my consternation, it got scheduled in the very first timeslot - so I had no time to finish my half-written notes, make any slides, do a runthrough, or generally do any of the stuff that would make it into a good talk.

Instead I got up, read my introduction, then improvised inexpertly for an hour. Many thanks to the dozen people who showed up and managed to avoid falling asleep or throwing rotten fruit.

The way I presented the talk was to spend a while talking about the many reasons it's not a good idea to run your own infrastructure and the few reasons it is, then spend quite a while giving a 10,000 foot overview of how to set up a mail and web server, then spend the last 15 minutes briefly going over some rather neat webapps I run on my servers, and IRC/IM proxying. However, in hindsight, I think the most valuable bits are the consideration of whether you should run your own infrastructure, and the notes on neat, not-necessarily-well-known webapps and so on you can use if you do. The mailserver / webserver stuff is just too complex for a one hour presentation. So, since my notes are terrible, personal shorthand gibberish, and I have no slides, instead of giving you those, I'll write a post about the same topics. Deal?

When I talk about 'infrastructure' I'm talking about the services that support your computing. The classic, old-school example is running your own mail server; other bits that come into the talk are a personal web server and IRC / IM proxying servers.

In the past it was pretty hard to find managed ways of doing any of those things, and it was fairly common for geeky types with personal internet connections to DIY. The internet of, say, 1995 was kind of designed as a giant interoperable network of nodes which would provide these kinds of services to a group of users, and geeky types would essentially act as a node unto themselves - a service provider of one, providing services to themselves and maybe a few friends and family, instead of relying on the mail and web hosting services provided by their ISPs, which were inevitably crappy and limited.

These days it's much less common, for a good reason: you can almost always get someone else to do it for you, much cheaper and better than you would do it yourself.

This forms the 'why you probably shouldn't do this' side of the argument. There is just about nothing you can achieve by hosting your own mailserver which Google won't do much better in exchange for sending you some ads and assimilating your personal information into the future Skynet, or which a service like Fastmail won't do much better in exchange for a frankly pretty small cost - a cost which will almost certainly be less than the value of the time and money you'll invest into doing it yourself. This is not surprising. There are huge, huge economies of scale built into infrastructure provision. Doing it for a user base of 1-5, on a hobbyist basis, is unsurprisingly vastly less efficient than doing it for ten million people on a very very professional basis.

The other disadvantages to self-hosting really just derive from this fact. You will almost certainly screw up more than a hosted provider will: you will break the server by deploying some dodgy app or an untested update. You will have less capacity (wave goodbye to your self-hosted blog when you get slashdotted, for example). You will almost certainly have less redundancy - I know I don't have any kind of failover on this webserver. You will almost certainly fail to take adequate backups. All these are boring, menial things which any decent hosted provider will do better just because it's part of doing a professional job. You won't, because you're doing this for fun, and those things are not fun.

Briefly, paid or 'free' (ad-supported / personal data supported) hosting services can provide you with almost anything you can host yourself, and do it much more efficiently. So why would you ever want to do it yourself? There are only a few reasons:

Necessity. I'm sticking this up at the top to make sure you don't miss one of the best bits of this lengthy post. There are some things you can self-host that, to the best of my knowledge, you can't actually get from a paid provider. The thing I know about is IRC/IM proxying. There's no hosted provider of this that I know of. There's a bit of this post down the bottom which explains what this is and, briefly, how to do it. If you're a heavy user of IRC and/or IM you may well want to do it, because it's really useful. So if you skip a lot of this post, do read that bit.

Education. You can learn quite a lot about how the internet (still, more or less) works by doing this stuff yourself. It will certainly teach you things. The internet is a somewhat different beast in practice these days, with so much of it existing inside Google's and Facebook's monstrously internally complex domains, but at a certain level it still works more or less how the RFCs of the Internet Past declare it works, and running your own services will teach quite a lot about that.

Control. Obviously, the higher the level of functionality that you outsource, the less control you have over the implementation. This seems like a really big reason, but it often isn't. When it comes to mail, a hosted mail provider will almost always provide everything you want. You just don't need really fine grained control over the server configuration. You do not need to control the maximum simultaneous connection count to the IMAP server. You want a service that delivers your mail, allows you to send mail, allows you to organize your mail, and filters spam out for you. That's really pretty much it. Gmail certainly achieves all these things. So do dozens of other services. Again, when it comes to web hosting, often what you want is a Wordpress instance. You do not need deep control over the server's PHP configuration. It's more likely to irritate you than help you. There are cases where you actually need such control, as opposed to just maybe finding it cool that you have it, but those cases are fairly uncommon.

Fun. Yeah, it's worth mentioning this. Some of us have very strange mindsets which find battling obscure MTA configuration to be an interesting way to spend our time. I've checked with medical professionals, and this is an incurable condition. Sorry. We just have to live with it. If you're a fellow sufferer, you may self-host for no reason other than that you enjoy doing it.

Privacy. This is probably the largest remaining really valid reason. If you use a 'free' service for your infrastructure, you should always keep in mind that you almost certainly no longer own your stuff in any practical sense. If you use Gmail, Google pretty much owns your email. You don't. They can look at it, use it to develop Skynet, send it to the government, and just generally do whatever the hell they like with it. In strict point of fact this is not entirely true - there are some legal restraints on what they can do with 'your' data - but I find it's an excellent rule of thumb to work from. When dealing with such services I find it pays more or less to assume that everything you put into them will immediately be forwarded to the police and all your worst enemies, and then used to generate large amounts of advertising that will be mailed to you. Doing so avoids you being shocked in future when some of those things actually happen.

Paid services are a somewhat different ball of wax, in that you are not offering up your data in exchange for some services, but actually paying for the services. You therefore have a reasonable expectation that you will retain most of the ownership of your data. If you use a decent service provider, the contract you have with them may even possibly bear this out. However, there are still several problems, mostly legal ones. Your hoster can almost certainly be obliged to nuke your services and probably turn over your data to law enforcement under the terms of various bits of legislation, depending on where you are and where they are. Even if they're not obliged to, they may well do so if asked by a sufficiently powerful body (like the government, or Universal Studios), on the basis that pissing you off is probably less damaging to them than pissing off the government. If you host your own services, this becomes much more unlikely.

It remains only to point out that, in brutal point of fact, this is often unlikely to be a consideration, but it is still worth bearing in mind, and though it's not a huge issue for me, I do still value the fact that it'd be quite difficult for anyone to kill or forcibly access my mail or private web content.

In relation to this last point, it's worth remembering that 'self-hosting' vs 'using a provider' is more of a spectrum than a binary state. Even those of us who 'self-host' are inevitably going to be outsourcing some stuff to someone. I use No-IP for DNS registration, for instance, so in theory someone could at least knock happyassassin.net offline by leaning on No-IP. I don't have control over that level of things. But still, No-IP doesn't own or even have access to any of my actual data, only my DNS records.

At the general level, even if you decide you want to 'self-host', you have a lot of flexibility in terms of what level you want to control yourself and what you want to pay someone else to look after for you. You don't have to actually buy physical hardware and host everything off an internet connection you personally control. If that's at, or near, the extreme 'self-hosting' end of the spectrum, then moving towards 'completely managed', we have:

  • Stick your own hardware in a co-lo (i.e. you outsource the physical internet connection)
  • Use a service like Slicehost where you get full root access to a bare virtual server (i.e. you outsource the physical connection and the 'hardware' provision)
  • Use a service which gives you access somewhere higher up the stack

Everything else is a variant on that last one. It really only matters what level you get access at. Maybe you get a pre-set web server instance in which you can run whatever webapps you want. So-called 'PaaS clouds', like Openshift, are really just this kind of managed hosting, in a way; 'IaaS clouds' are pretty much like Slicehost. Maybe you just get a managed instance of some specific app or service, like Wordpress (or 'email'). It comes down to how much control and privacy you need, with the trade-off for more control and privacy usually being more expense and complexity.

So, there's the theoretical for-and-against of self-hosting. It comes down to the broad conclusion that you probably don't want to do it, and even if you do, you're probably better off going for something in the middle of the spectrum - Slicehost, or one of the new public clouds, or something like that - than really doing (almost) everything yourself.

Assuming you self-host, or are going to start trying, despite all the above: here's some notes on actually doing it.

Domain

Getting a domain of your own is pretty much the Point 0 of self-hosting. It's also, fortunately, pretty simple. You can find a lot of confusing information on the topic but essentially it boils down to: buy a domain name and then set up the information that says 'this domain is associated with this IP address' - DNS records. It is much simpler to do these two things together, through one service. I use No-IP - their prices are reasonable and I've had no problem with their service. There are many other providers. It's really as simple as picking a domain - like my happyassassin.net - paying your fee, and then filling out a little form which says 'www.happyassassin.net should point to IP address xxx.xxx.xxx.xxx, mail.happyassassin.net should point to IP address xxx.xxx.xxx.xxx', and so on. If you're going to host mail for your domain, you'd also need an MX record, which says 'mail for any address at happyassassin.net goes to IP address xxx.xxx.xxx.xxx'. And that's really pretty much it. If you're really self-hosting, as in you own the machines and they're hanging off your own internet connection, all those IP addresses should be your own IP address. You're going to want a static IP, for that.
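
Once the records are set up, it's worth sanity-checking them from outside your own network. A couple of quick dig queries will do it, with example.com standing in for your own domain:

    # does www point where you think it does?
    dig +short www.example.com A

    # and is mail for the domain routed to the right host?
    dig +short example.com MX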

Mail

Mail is the most complex thing to self-host and probably the least sensible, as hosted mail providers really do have it all figured out. I'm not going to turn this into a comprehensive 'how to host your own mail' walkthrough, because there are many of those already, and if you're going to do it, what you should do is get a hold of a good guide and follow it carefully. But I do have one thing to contribute. I find it helps to bear in mind there are broadly three functions of a mail server, at least in my mental model, and you can pretty much treat them separately:

1: Retrieve messages from your existing mail accounts and serve them back out via IMAP for you to read on your client machines

I do this using fetchmail to actually retrieve the mail, procmail to sort it into folders and spam-test it via spamassassin, and dovecot to serve it back out via IMAP. I would strongly recommend dovecot; it really is the best IMAP server around. It's efficient, actively developed, highly standards-compliant, and supports things like IDLE very well. Other IMAP servers generally fail at at least one of those things. The retrieving and serving out are kind of different functions, but it makes no sense to do one without the other, really. There's no point aggregating the mail from your various accounts in one place without also setting up a convenient interface - i.e. a server - for you to access it with.
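
To give a flavour of how little configuration the retrieval side actually needs, here's a stripped-down sketch - the hostnames, account, password and folder names are all placeholders, and a real setup wants a bit more care (certificate checking, more filing rules and so on):

    # ~/.fetchmailrc - pull mail from an existing account, hand it to procmail
    poll imap.example.com protocol IMAP
        user "me@example.com" password "my-password" ssl
        mda "/usr/bin/procmail -d %T"

    # ~/.procmailrc - spam-check everything, file spam into its own folder
    MAILDIR=$HOME/mail
    :0fw
    | /usr/bin/spamassassin

    :0:
    * ^X-Spam-Status: Yes
    spam

Dovecot then just gets pointed at the same mail location and serves it back out over IMAP.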

2: Act as an SMTP server for your outgoing mail

When you want to send mail, you send it through an SMTP server - most people know that. Running your own SMTP server, for your personal use, has the advantage that you don't have to keep changing to an SMTP server that's accessible from the network you're currently on. (Though, of course, if you just use Gmail, you can send outgoing mail from anything...)

3: Accept incoming mail from anyone to mail addresses at a domain you own

This is the most complicated case, probably. The fact that I'm set up to do this is why you can mail me at happyassassin.net, my own domain. When you send a mail there, your mail provider sees that mail to happyassassin.net is supposed to go to an IP address I own, and sends it there. That IP address actually is my own IP address, and connections to port 25 on that IP address are forwarded by my router to my mail server, which accepts the mail and sticks it into my mail folders just like fetchmail/procmail do for the email addresses I don't administer myself.

I'm not going to explain in detail how to achieve all the above, but the key point is to remember these functions are distinct - you can do any one of them without doing the others. Where it's easy to get confused is that you usually would use the same application, the same process, to do functions 2 and 3. I use postfix, because it's marginally less insane than sendmail. But it's best to think of them as two separate operations, and do one and then the other. If you think in terms of 'how do I set up postfix', you're likely to get confused - finding guides for function 3 when what you really wanted was function 2. I know I did.
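
To make the 'two separate operations' point concrete, here's a heavily simplified sketch of how they map onto postfix's main.cf - the domain and network are placeholders, and a real configuration needs rather more than this (TLS, authentication if you want to relay from outside your own network, and so on):

    # /etc/postfix/main.cf (simplified sketch)
    myhostname = mail.example.com

    # function 3: domains this box accepts incoming mail for
    mydestination = example.com, mail.example.com, localhost

    # function 2: clients allowed to relay outgoing mail through this box
    mynetworks = 127.0.0.0/8, 192.168.1.0/24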

Another little note on that topic: the sketch of happyassassin.net mail I gave is, strictly speaking, incorrect. Your mail provider doesn't really see that mail for happyassassin.net should go to my IP address: it sees that mail for happyassassin.net should go to No-IP. Why? Well, because I host my servers off my home internet connection, and that has port 25 blocked. Most home internet connections do. The way email actually works, mail for a domain is always initially delivered on port 25. The DNS record which says 'mail for happyassassin.net goes to IP XXX' cannot say 'IP XXX on port 26'. It just says 'IP XXX'. The port is hard-coded in the standards. So if you have a connection on which port 25 is blocked, you really can't be the server that initially receives mail for your own domain. No-IP provide a neat service to get around this, called mail reflector. Essentially you set up your DNS records so that mail for your domain goes to No-IP's server, and you tell No-IP the actual port of your server. Then No-IP's server simply forwards mail straight through to your server. They don't store it or have any access to it, except in the case that your server is down - they will keep it on theirs until your server comes back up, then forward it on. It's a neat way around the port 25 problem, which costs $40 a year - at which price you could instead have fastmail handle your entire mail setup, including your own domain's mail. Again, like I said, self-hosting is almost never actually economically sensible.

Web

Setting up a web server, at the 10,000 foot level, isn't very difficult. Basically, you do 'yum install httpd' (or equivalent), and you're done. You already registered www.mydomain.com and pointed it to your server's IP address. Now you set your router to forward traffic on port 80 to the appropriate box, and that's it. People going to www.mydomain.com will see the 'hello world'-type page that is Apache's default homepage. Oh, and you do want to use Apache. There are alternatives, but they're rarely what you want for self-hosting, and you will find much more help with configuring Apache than configuring anything else.
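
If you want the domain served from its own document root rather than the default page, the minimal Apache configuration is a short virtual host stanza. A sketch, with the domain and path as placeholders:

    # /etc/httpd/conf.d/www.example.com.conf
    <VirtualHost *:80>
        ServerName www.example.com
        DocumentRoot /var/www/www.example.com
    </VirtualHost>

Drop your content (or your webapps) under that directory, reload Apache, and you're in business.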

These days, you're likely not going to be faffing around creating static content and dumping it in /var/www/html on your server. You really want to run webapps - you probably want to run a Wordpress blog, for instance. Essentially your web server is providing useful services for you.

The 10,000 foot overview of how to install web apps is similarly simple: yum them. The most common ones are packaged. Wordpress is: you can just do 'yum install wordpress'. There are guides for the finicky bits of configuration.

There's one stumbling block you'll hit for most webapps, so I'll mention it quickly: they almost all need a database. Web apps rarely store things as files on your local disk, because that's silly. They want access to an SQL database instead, and they'll store their configuration, your blog posts, and whatever else in there. You almost certainly want to use MySQL for this. MySQL will be packaged in any sane distro. Once you install it, it will probably be configured with no root password and a guest account. You will want to set a root password and destroy the guest account. There are guides to how to do this in the excellent MySQL documentation. Then, for each webapp you install, you'll likely create a new database specially for that webapp, with a user account specially for that webapp which has access to the database. You can do this with a single one line command. The webapp will ask for a MySQL username and password as part of its setup process; feed it the username you created especially for it. That way, no webapp can access another's data; only root will have access to all the databases, and you should only use the root account for any manual poking of the database you personally have to do. Never give the root password to any webapp (or any other person). The most popular webapps, like Wordpress, tend to have the MySQL setup well documented, and you can apply the documentation to any other webapp which just needs a simple MySQL config to work. Which is most of them.
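
For reference, here's roughly what that looks like, using Wordpress as the example - the database name, username and password are all placeholders you'd pick yourself:

    # create the database, then the 'single one line command' that creates a
    # user which can only touch that one database
    mysqladmin -u root -p create wordpress
    mysql -u root -p -e "GRANT ALL ON wordpress.* TO 'wordpress'@'localhost' IDENTIFIED BY 'pick-a-password';"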

That's web serving. Here are some of the webapps I run on my server. You may not have known about some, and find 'em useful.

Wordpress. Well, everyone knows about Wordpress. It's a blogging platform. If you want to have a blog on your server, you're probably going to want to run Wordpress. It's well documented, easy to set up, hugely popular (and hence well supported), does everything you need from a blog, and has a bewildering array of plugins. Of course, if all you want is to have a blog, it's almost certainly a better idea just to get it hosted by wordpress.com than faff around with setting up your own web server.

Roundcubemail. This is a webmail front end. Combined with my mailserver, it's the last puzzle piece in extremely painfully replicating the functionality of Gmail - it gives me a pretty snazzy web front end to my mail, for the rare cases where I'm on someone else's system and don't want to set up an IMAP client, or something. It also came in quite handy at one FUDCon when the port blocking was so tight that IMAP clients didn't actually work. Roundcube is a very, very good webmail app: it has all the functionality of a desktop mail client, is pretty fast, and has a very snazzy interface. The old-school choice, Squirrelmail, is about as functional but nowhere near as pretty.

tt-rss is a news reader webapp. Running it is like hosting your own Google Reader, essentially. It's a lot nicer than just running separate news reader clients on each of your client machines, because it means your read/unread state is always in sync. But of course, you could always just...use Google Reader. It's not like knowledge of what RSS feeds you like is likely to be astonishingly private information.

MyTinyToDo is a very simple todo list webapp. I tried for years to find a big stonking egroupware suite - contacts, calendaring, and tasks, essentially - which would cover those things and sync well with my desktop clients and my phone. I never quite did. But mytinytodo handles one piece of the equation - tasks - just fine. I haven't bothered trying to sync it with desktop clients / phone because you can just use the web interface very easily on any of those devices; it renders nicely on phones. Of course, you could always just use a hosted service like Remember The Milk.

OwnCloud is a 'personal cloud' server, or to avoid the buzzwordiness, it's basically just a file server webapp. You point it at a place where files live and it makes them available through a web frontend and also via WebDAV (which lets you mount them as a shared drive on most OSes). It pretty much just does that, but it does it quite well and easily. At FUDCon, Jeroen gave me a long list of things that are wrong with it, and Jeroen is massively smarter than me so I'm sure he's right, but all I know is it does what I ask it to. It's handy for, say, storing your (encrypted!) password database, or a document you want accessible from anywhere. I store a lot of my notes in it. Your hosted equivalent would be, say, Dropbox.
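
A quick note on the WebDAV side, since it's the bit that makes this usable as a 'shared drive': with davfs2 installed you can mount an OwnCloud instance like any other filesystem. The URL below is a placeholder - the exact WebDAV path varies between OwnCloud versions - but the shape of it is:

    # mount an OwnCloud instance as a local directory over WebDAV (davfs2)
    mkdir -p /mnt/owncloud
    mount -t davfs https://cloud.example.com/files/webdav.php /mnt/owncloud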

Finally (man, 4000 words? Anyone still awake?) we come to the one thing I host myself, find useful, and could not find a hosted-provider equivalent of: IRC and IM proxying.

This achieves for IRC and IM what using a mail server achieves for mail, or using a web feed reader achieves for news: you can use many clients without them conflicting, and with state preserved between them. How it works is essentially that you run an app which acts as both an IRC client and an IRC server. It connects to all your IRC servers, and then on your client machines, instead of connecting directly to Freenode or EFnet, you connect to the proxy, which also acts as an IRC server. It then forwards all the traffic to you.

What does this get you? Well, you can sign in from six different clients at once - and instead of each looking to the rest of the world like a separate user, they all act as 'you'. You can have part of a conversation from your laptop, part from your phone, and part from your desktop, and the outside world won't know the difference.

Also, as the proxy's always logged in, you can disconnect all your client machines, and the proxy will keep storing conversations, including any private messages. Then the next time you connect a client, you'll get a log of all the channel traffic that happened while you were away, and any PMs you got sent will show up. It's very handy.

Finally it'll give you a handy central store of logs. It's just a much better way to IRC.

I use Bip as an IRC proxy. It's very easy to set up - really, you just install it and give it a list of IRC networks and channels you use, and tell it your nickname. Then you run it, and set up your IRC clients to connect to it, not directly to the networks. And you're done. It's probably the easiest thing you can self-host, as well as being the most useful.
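
A Bip configuration really is about as small as config files get. A trimmed-down sketch - the nick, network, channel and the password hash (which you generate with bipmkpw) are all placeholders:

    # ~/.bip/bip.conf (trimmed-down sketch)
    network {
        name = "freenode";
        server { host = "chat.freenode.net"; port = 6667; };
    };

    user {
        name = "me";
        password = "hash-generated-with-bipmkpw";
        default_nick = "mynick";
        default_user = "mynick";
        default_realname = "My Name";

        connection {
            name = "freenode";
            network = "freenode";
            channel { name = "#fedora-qa"; };
        };
    };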

On the same machine I run Bitlbee, which is an IM proxy - it connects out to MSN, Jabber, ICQ, AIM and so on, and also acts as an IRC server, effectively turning IM traffic into IRC traffic. I then have Bip use my Bitlbee server, so when I'm using MSN, my desktop is connected to my Bip instance, which is connected to my Bitlbee instance, which is connected to MSN. Fun, huh? Bitlbee can also actually connect to Twitter and Identi.ca, effectively turning your 'social network' traffic into IRC. You can tweet just by typing a message into your IRC client, and tweets from people you follow pop up as IRC messages. It's a fun interface if you're used to using IRC.
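
Bitlbee setup happens from inside your IRC client, by talking to the &bitlbee control channel it creates. Roughly like this - the account details are placeholders, and the exact command set varies a little between Bitlbee versions:

    # typed into the &bitlbee control channel (the '#' lines are just notes)
    register my-bitlbee-password
    account add jabber me@example.com my-jabber-password
    account add twitter mytwittername
    account on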

So...that's my self-hosting story. Why you probably shouldn't do it, and some things you might want to run if you do. Hope it's helpful!

FUDCon Blacksburg: Day 1

Welp, I'm here at FUDCon Blacksburg. I meant to blog about it ahead of time, but never quite got around to it.

It's a slightly odd organization this year, very hackfest-heavy, with the keynote and barcamp stuff happening only on Day 2. So far I'm feeling fairly useless, as I can't contribute much to any of the running hackfests, but never mind.

We do have a good QA presence: myself, Tim Flink, John Dulaney and Sandro Mathys are all here, a nice RH / community mix. We are aiming to give a few talks and run a couple of hackfests, an AutoQA hackfest this afternoon and a hackfest to work on testing the new Anaconda UI on Sunday, so keep an eye out on the schedule for those!

I'm going to pitch a talk I've been thinking about vaguely for a while, but only actually (somewhat) wrote in the last half hour, called Cloud 0.1, about running your own infrastructure for mail, web, news and so on - in other words, these days, duplicating Google's services with less reliability, at more expense and effort. But hey, it's 'fun'. I don't have any slides, but I think it might make a mildly diverting hour at least. We'll see if anyone signs up for it. I do have a more QA-related talk as well, but it's more for a general audience than a FUDCon one - it's about various principles I've learned on how to function usefully as a small group trying to do a big job within a big project in a short timeframe. I could pitch that too, but we're probably going to be long on talks for a single day anyway, so I may not. Tim and I have also been talking about maybe coming up with a talk on Cool QA Tools, with his new remote_install thingy, fedora-easy-karma, the proposed GUI for f-e-k, and a few others.

If you're at FUDCon and want to chat about anything, feel free to buttonhole me at any time - I'm mostly just taking a 'show up and see what happens' approach to this one. I'll update as things start to come together...