April 20th, 2010
I wrote up a recap of Graphics Test Week for the mailing lists and thought it had some interesting data, so I’m reprinting it here.
The insanity that is Graphics Test Week is now over, so it’s time for the recap!
We had a great turnout again; thanks to everyone for testing. Here are some interesting numbers I just pulled out:
f11 nouveau: 104 tests, 42 bugs – ratio 0.40
f12 nouveau: 53 tests, 34 bugs – ratio 0.64
f13 nouveau: 78 tests, 26 bugs – ratio 0.33
f11 radeon: 55 tests, 46 bugs – ratio 0.84
f12 radeon: 61 tests, 81 bugs – ratio 1.33
f13 radeon: 48 tests, 33 bugs – ratio 0.69
f11 intel: 23 tests, 21 bugs – ratio 0.91
f12 intel: 29 tests, 31 bugs – ratio 1.07
f13 intel: 38 tests, 38 bugs – ratio 1.00
The ‘ratio’ is the number of bugs per test. Obviously there’s wiggle room here; different people report different bugs, and some of the drivers implement features the others don’t (and hence have more ‘surface area’ for bugs). But I think the numbers are quite fun anyway. According to them, Nouveau and Radeon both regressed from F11 to F12, and then did better in F13 (as of Test Day) than in either previous release. Intel has stayed fairly steady. According to this analysis, Nouveau wins the ‘least buggy driver’ contest by a fair margin, which is interesting! For F13, it has about half as many bugs per test as Radeon, and a third as many as Intel. (Of course, I’m sure some of our erstwhile devs would argue it could fairly be renamed the ‘least crack-addled hardware manufacturer contest’…)
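For anyone who wants to check the ratios, here is a minimal Python sketch of the bugs-per-test calculation. The counts are copied from the lists above; the names `results` and `bugs_per_test` are just illustrative choices, not part of any Fedora tooling.

```python
# (tests, bugs) for each release and driver, copied from the lists above.
results = {
    ("f11", "nouveau"): (104, 42),
    ("f12", "nouveau"): (53, 34),
    ("f13", "nouveau"): (78, 26),
    ("f11", "radeon"): (55, 46),
    ("f12", "radeon"): (61, 81),
    ("f13", "radeon"): (48, 33),
    ("f11", "intel"): (23, 21),
    ("f12", "intel"): (29, 31),
    ("f13", "intel"): (38, 38),
}


def bugs_per_test(tests, bugs):
    """Return the number of bugs filed per test run, rounded to two decimals."""
    return round(bugs / tests, 2)


# Hypothetical helper dict: ratio keyed by (release, driver).
ratios = {key: bugs_per_test(*counts) for key, counts in results.items()}

for (release, driver), ratio in ratios.items():
    print(f"{release} {driver}: ratio {ratio:.2f}")
```

A lower ratio means fewer bugs were filed per test run, with all the caveats about ‘surface area’ mentioned above.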
The F11 Test Days happened in March 2009, the F12 Test Days in September 2009, and the F13 last week; these are pretty comparable points in the respective cycles.
In terms of participation, F11 saw the most tests run (182 across the three drivers), with nouveau accounting for most of that; I suspect this is because nouveau was very new and shiny in F11 and impossible to use on most distros, so people were very interested in trying it out for the first time. F12 had the fewest tests run (143), and F13 (164) pretty much splits the difference.
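The participation comparison is just a sum of the test counts per release. A quick sketch, again with the counts taken from the lists above and with `tests_run` and `totals` as made-up names:

```python
# Tests run per driver for each release, copied from the lists above.
tests_run = {
    "f11": {"nouveau": 104, "radeon": 55, "intel": 23},
    "f12": {"nouveau": 53, "radeon": 61, "intel": 29},
    "f13": {"nouveau": 78, "radeon": 48, "intel": 38},
}

# Total tests per release, summed across the three drivers.
totals = {release: sum(per_driver.values()) for release, per_driver in tests_run.items()}

for release, total in totals.items():
    print(f"{release}: {total} tests")

# Midpoint between the busiest (F11) and quietest (F12) events.
midpoint = (totals["f11"] + totals["f12"]) / 2
print(f"midpoint of f11 and f12: {midpoint}")
```

This gives 182 tests for F11, 143 for F12 and 164 for F13, against a midpoint of 162.5, which is what ‘splits the difference’ refers to.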
Since I’m getting up a head of steam, let’s look at fixes!
f11 nouveau: 42 bugs, 4 open, 8 closeddupe, 24 closedfixed, 6 closedunfixed – 70.59%
f12 nouveau: 34 bugs, 11 open, 8 closeddupe, 14 closedfixed, 1 closedunfixed – 53.85%
f11 radeon: 46 bugs, 14 open, 10 closeddupe, 19 closedfixed, 3 closedunfixed – 52.78%
f12 radeon: 81 bugs, 19 open, 32 closeddupe, 28 closedfixed, 2 closedunfixed – 57.14%
f11 intel: 21 bugs, 7 open, 1 closeddupe, 12 closedfixed, 1 closedunfixed – 60%
f12 intel: 31 bugs, 7 open, 12 closeddupe, 12 closedfixed, 0 closedunfixed – 63.16%
I counted CANTFIX, WONTFIX and INSUFFICIENT_DATA as ‘unfixed’, ERRATA, RAWHIDE, CURRENTRELEASE and NEXTRELEASE as ‘fixed’. NOTABUG I lumped in with DUPLICATE as the ‘closeddupe’ number (these are reports that should be discarded from consideration entirely). The percentage is calculated as:
closedfixed / (bugs - closeddupe) * 100
i.e. it roughly indicates the percentage of genuine unique bugs reported that have been fixed so far. These numbers are pretty close, both across drivers and across releases; we’ve fixed just over half the bugs reported. I think the outlying high result for F11 nouveau (70.59%) is probably a consequence of ‘low-hanging fruit’ – the driver was at a pretty early stage at that point, so the bugs exposed are likely to have been, on the whole, easier to fix. Obviously I’ve left F13 out, as the maintainers have had only half a week to work on the bugs!
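To make the bucketing and the percentage concrete, here is a small Python sketch. It assumes the closed bugs carry Bugzilla-style resolution strings as named above; the `BUCKETS` mapping, `classify` and `fix_rate` are illustrative names, not the actual script used to pull these numbers.

```python
# Resolution -> bucket, following the grouping described above.
BUCKETS = {
    "CANTFIX": "closedunfixed",
    "WONTFIX": "closedunfixed",
    "INSUFFICIENT_DATA": "closedunfixed",
    "ERRATA": "closedfixed",
    "RAWHIDE": "closedfixed",
    "CURRENTRELEASE": "closedfixed",
    "NEXTRELEASE": "closedfixed",
    "NOTABUG": "closeddupe",
    "DUPLICATE": "closeddupe",
}


def classify(resolution):
    """Map a resolution to its bucket; None (assumed to mean 'not closed') counts as open."""
    if resolution is None:
        return "open"
    return BUCKETS[resolution]


def fix_rate(bugs, closeddupe, closedfixed):
    """closedfixed / (bugs - closeddupe) * 100, as in the formula above."""
    return closedfixed / (bugs - closeddupe) * 100


# (bugs, closeddupe, closedfixed) copied from the lists above.
counts = {
    ("f11", "nouveau"): (42, 8, 24),
    ("f12", "nouveau"): (34, 8, 14),
    ("f11", "radeon"): (46, 10, 19),
    ("f12", "radeon"): (81, 32, 28),
    ("f11", "intel"): (21, 1, 12),
    ("f12", "intel"): (31, 12, 12),
}

for (release, driver), (bugs, dupes, fixed) in counts.items():
    print(f"{release} {driver}: {fix_rate(bugs, dupes, fixed):.2f}%")
```

Subtracting the `closeddupe` bucket from the denominator is what keeps duplicates and NOTABUG reports from dragging the percentage down.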
Thanks very much to all testers, and to the wonderful Fedora X.org developers and triagers:
for helping to organize the events, set up the test cases, man the IRC channel and triage – and of course fix! – all the bugs.