04 November 2008

Help Avoid Regressions

If you saw Planet Ubuntu today, you saw Jonathan Ernst's post about feature regressions. I was discussing this with Jordan Mantha last week. I think too much time is lost during sync, and nobody is willing to test until beta or RC, by which point it's too late to fix much. He seemed to agree. I think a year-long release cycle would be nice, but he says Ubuntu's testers are a major chunk of GNOME's testers. Hrm, that's tough then, because we don't want to cut down the number of people testing each of GNOME's releases.

Well, anyway, we got to talking about testing and more effective testing. Writing automated test scripts is something that comes up a lot, but it never happens. And anyway, we don't even have specs written up for what those automated tests would do. So we need to start small. We need a set of specs: a list of the minimum functionality that at least all of the default desktop software needs to have. And we need to have directions for using that functionality by hand. Then, during development, we can all run through little 5-minute tests each week and ensure a minimum of functionality. The more often we test, the quicker we catch regressions.

To this end, I've created a page on the wiki for Application Testing. I made lists of applications that need to be tested. We need test cases. If you've got 15, 20 minutes, why not pick an application out of the list, create its page (there's a template available!), and fill in a few test cases? Right now, Seahorse is the only one with test cases. Or if you get there and more pages have been started, take a read through and fill in any missing details or add a few more test cases. Each test case takes about 10 minutes to write for simple tasks.

I want to have a lot of these test cases ready by UDS, but I can't do it myself. Anyone with a reasonable grasp of English (so anyone that can read this) can help write these. You don't need to be a developer. You don't need to know how to triage. You just need to know what the application can do and what buttons you need to click to do it.


Raseel said...

I think this is a great idea. But people will need some mentoring as the first few entries get populated. For example, I have updated the Gnome Terminal app, but I am not sure if my tests are valid.

But I'm sure, as more and more people start contributing, the newbies will learn from their entries.

Mackenzie said...

Well, I filled in a chunk of Seahorse's page and linked it as an example of how to write test cases. Of course, it's perfectly fine to refine others' test cases, and I hope people do go through and try to improve and grow the test cases.

Mackenzie said...

Er, Raseel, what page did you do Gnome Terminal on? There's a link on that page for where to do it, but I don't see it.

swegner said...

What do you mean, automated testing never happens? Just about every software project should have a "make check" target in its source distribution. These tests should (ideally) cover every aspect of the program, from unit testing, to story testing, to *drum roll* regression testing.

I don't think we need to start from scratch here. It seems like the problem is either: a) the testing already built into most source packages isn't being run often enough (do they get checked with every new package version?), or b) the test suite inside the source package isn't robust enough. I assume the latter is the case, and this is where we could concentrate our efforts more effectively. I realize not everybody is a programmer, so perhaps others can contribute by adding these "user stories" for other programmers to translate into real automated tests.

jblackhall said...

It's an interesting idea and I think it will help, but I think it banks on the assumption that these regressions aren't found early enough (or at all). The point of Jonathan's post is that there are known bugs that are (in some cases major) regressions and we're clinging to a date over a little bit of quality.

For example, the udev CD-ROM ejection bug was found at the beginning of October. This was known weeks in advance of release and it should have blocked the release for a few days, especially since a fix was released less than a week post-release. People without regular Internet will suffer from these bugs, and it's just bad form to release a product with major bugs and regressions like that.

It's obvious that not every regression or bug should be cause for blocking release, but there should DEFINITELY be some, especially when there's good evidence that a fix would be available soon, as was the case with the CD-ROM bug.

I do think it's tied to not wanting to change the release number, and I whole-heartedly agree with Jonathan that scheduled release dates should aim for the beginning of the month in order to give more wiggle room to hold back a release for a couple of days (or even 2 weeks if necessary). I also think it will help focus developer attention in the weeks before a release to say "hey, we really need to concentrate on X and Y during the next week or we're going to be pushing back the release date."

Mackenzie said...

I don't know anything about that, but I think that most of the bugs filed happen when people stumble across them on their way to do something else. There aren't any real guidelines for *how* to test, just "use it, and if something goes wrong, file a bug" which could be a lot stronger if we knew what to look for. And Ubuntu does not, as a project, currently have any automated testing in place. It's entirely up to end users to file a bug report. I think the assumption is that if upstream has any tests, they've done them already.

There are a lot of things that happen every release that could have been caught earlier if it wasn't for the fact that most of the testers start during Beta or RC. The aim is to find the bugs early enough in the development cycle that we're not scrambling the last week trying to fix everything.

Unfortunately, things like usability bugs are less likely to be noticed by the power users who are fine with alphas, and will likely only be pointed out by the people who wait til it's released. That's why the DC LoCo has started doing usability testing though.

e-ricardo said...

Good, I like that idea! But are the tests just for the default GNOME programs on Ubuntu, or for all the default programs, including those developed by the GNOME community?


Justin said...

Isn't this exactly like what the QA people are doing at https://testcases.qa.ubuntu.com/ ? It kind of seems like a duplication of effort.

method said...

I think there should be a /usr/share/tests/ directory that would contain all the tests for every package. A cruisecontrol-like program would run the tests on a dedicated machine every time the distribution is "built" (expect it to take a looong time).
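
A minimal sketch of what such a runner could look like, in Python. Everything specific here is an assumption for illustration, not an existing Ubuntu convention or tool: the layout (one subdirectory per package under the tests root, with each executable file counting as one test that exits 0 on success) and the function name run_package_tests are both hypothetical.

```python
import os
import subprocess
from pathlib import Path


def run_package_tests(tests_root):
    """Run every executable file found in tests_root/<package>/.

    Assumed layout (hypothetical): one subdirectory per package, each
    executable file inside it is a single test, exit status 0 means pass.
    Returns a dict mapping "package/test-name" to True (pass) or False (fail).
    """
    results = {}
    root = Path(tests_root)
    if not root.is_dir():
        # Nothing installed under the tests root: nothing to run.
        return results

    for package_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for test in sorted(f for f in package_dir.iterdir() if f.is_file()):
            # Skip data files, READMEs, and anything else not marked executable.
            if not os.access(test, os.X_OK):
                continue
            completed = subprocess.run([str(test)], capture_output=True)
            results[f"{package_dir.name}/{test.name}"] = completed.returncode == 0
    return results
```

Calling run_package_tests("/usr/share/tests") after each build and printing the failing entries would give a rough pass/fail report per package. A real setup would also want per-test timeouts and saved output, which this sketch leaves out.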

Mackenzie said...

I didn't know about that, but Henrik merged them yesterday.