Important: This is not a rant, and in no way do I want to blame anybody or make anybody feel responsible. I’ve thought about all this before writing it down and the best word to sum it up is: frustration.
About a week ago kernel-2.6.32.10-90 was pushed to Fedora 12 updates with a bug in the SSB driver that apparently made any machine with a Broadcom NIC crash on boot. I’m not sure how widespread the problem is, but checking different bug reports, the Broadcom thing seems to be a constant (#577463, #578217, and there might be others).
So if you own a Dell machine, it’s very likely you’ve been affected and your computer won’t boot (but you already knew that, didn’t you?). That’s my case with a Dell laptop (I don’t use the Broadcom ethernet NIC at all, but shit happens).
I’m going to focus on bug report #578217, which covers the kernel that the update installed on my laptop: it was filed 2010-03-30 11:14 EDT, about 3 hours before I updated my system, and yesterday, 2010-04-05 10:25:18 EDT, the kernel-2.6.32.10-94.fc12 update was pushed to updates. Something happened to that update, because today we can see kernel-2.6.32.11-99.fc12 pushed to updates waiting for testing.
OK, shit happens. It shouldn’t happen, but this time it happened: we pushed an update that prevented the system from booting.
The first obvious thing is: OK, let’s work so it doesn’t happen again. QA, testing before pushing, all the stuff Fedora is doing quite well.
The second thing is: OK, everything went wrong. Now what?
A broken kernel after an update is not a big deal for me, nor is it for any moderately experienced Linux user. Although I needed two tries, I could easily get into the GRUB menu and boot the older kernel (Fedora keeps three kernels: the updated one and two older ones).
But what about inexperienced users? I don’t know if there’s any kind of mechanism to detect a problem in a kernel update, but it would definitely be a good idea.
I’ve worked hard to push Fedora in local governments in Spain, and while all this was happening I thought: what if we deploy 10k desktops in the local health care system and one day, after an update, all the machines refuse to boot? Man, just thinking of it makes me sweat bullets.
Anyway, you don’t have to go that far. My mom uses Linux, and only Linux. She trusts the updates and she’s been taught to install them. What would happen if her computer broke because of this problem and she had to do some dark computer stuff to fix it? What would be the effect on her confidence in Linux and Fedora?
And the third thing is: more than seven days to push a fix seems too long, although as I said, the workaround is pretty easy to apply.
Update: we have an answer for the first question: the testing process failed, and it seems that it was related to a bug in Bodhi mixed with the ongoing review of the update policy (there’s a link to a proposal, but I don’t know if it’s current).
