The End of a Bug’s Life

I recently wrote an article describing five best practices that will increase the odds of your bugs getting fixed. If you were killing time at work, you might have read it. One practice I suggested was closing your Resolved bugs as early as possible. Although I follow this practice now, I didn’t always do so.

Not long after I became a Microsoft SDET, I was assigned to a project that was approaching its release date. Two weeks before our deadline, it looked like we were in trouble. (I’ve since learned it always looks like you’re in trouble two weeks before a deadline.) There were a lot of bugs that had been Resolved by Dev, but not validated and Closed by Test.

A couple of days before our deadline, the Program Manager emailed the team a status update. We were making great progress, and there were only a few bugs left to Resolve and Close. The email had a chart that looked like this:

[Chart: number of Resolved, but un-tested, bugs over the two weeks before the deadline]

The graph showed we went from over 300 Resolved, but un-tested, bugs to almost zero in two weeks. How did we do this? A lot of long days. Working over the weekend. Meals at the office. Little sleep. Lots of stress. As good teams do, we rallied together to meet our deadline. It felt good.

It felt good until our next deadline, when we repeated this cycle all over again.

I’ve since learned there’s a simple way to avoid the stress that comes with validating so many bug fixes just before a deadline: don’t wait until the last minute to close your bugs!

Closing your bugs should be part of your weekly routine, not something you do in batch before a deadline. This approach has six advantages.

  1. You won’t have to work long hours to close bugs at your deadline.
  2. You’ll never be tempted to cut corners to close bugs at your deadline.
  3. Bugs are easier to close when they’re fresh in your mind; if you wait too long, you’ll have to re-learn them.
  4. If you find a problem with the bug fix, it gives the developer more time to resolve it.
  5. If you have to re-open the bug because it still doesn’t work, it won’t be rejected because the “bug bar” has been raised.
  6. When a bug isn’t closed, it’s not known whether it’s fixed or not, or whether the fix breaks something else. The more Resolved bugs you have, the less you know about the real state of your software.

Since there are so many advantages to closing your bugs in real time, rather than in batch, why don’t more testers do it? Let’s create another list.

  1. It’s not very interesting to close your bugs. After automating a test, finding the bug, and investigating the root cause, all the interesting work is done. Verifying the fix is a matter of either running an automated test or manually testing the feature again, neither of which is a favorite pastime of most testers.
  2. Although many organizations encourage the timely resolution of bugs, few encourage the timely closing of bugs. I’ve worked on many teams that had guidelines on how quickly bugs must be Resolved, such as P0 bugs within 24 hours, P1 bugs within a week, and P2 bugs within the current milestone. However, there were rarely similar guidelines for testers to verify and Close these bugs. It seems reasonable that testers should be held to the same standards as developers, and have to Close P0 bugs within 24 hours of them being Resolved. One way I’ve seen teams encourage the Closing of bugs is by having a “bug jail”. Once a tester or team exceeds a pre-defined number of Resolved bugs, they have to Close these bugs before moving on to other tasks.
  3. Some testers work on teams that reward employees based on the number of bugs they find. What effect does this have on testers? It encourages the finding of bugs, not the Closing of them. If anything, testers should be rewarded based on the number of bugs fixed and closed, which would at least encourage the resolution of quality bugs.
  4. Sometimes testers don’t Close their bugs in a timely manner because they’re disorganized. Most testers have a hectic schedule, and it can be hard to keep track of which bugs you need to Close. Fortunately, Microsoft testers have access to a tool called Bugger. Bugger docks on your desktop, queries the bug database, and displays the number of bugs assigned to you. Clicking on Bugger shows the details. With Bugger docked on your desktop, it’s almost impossible to forget which bugs are assigned to you.

[Screenshot: Bugger docked on the desktop]

If you work at Microsoft, install Bugger and avoid some of the stress that comes at project deadlines. If you don’t work at Microsoft, consider writing your own version of this tool. Your manager and co-workers will appreciate it.
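If you do write your own, the core of such a tool is just a filtered query. The following is a minimal sketch, not Bugger itself: the `Bug` class, the `BugState` enum, the in-memory list standing in for the bug database, and the `BugJailLimit` threshold are all hypothetical names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical bug model; a real tool would query your bug database instead.
public enum BugState { Active, Resolved, Closed }

public class Bug
{
    public int Id { get; set; }
    public string AssignedTo { get; set; }
    public BugState State { get; set; }
    public DateTime ResolvedOn { get; set; }
}

public static class ResolvedBugReminder
{
    // Hypothetical "bug jail" threshold.
    public const int BugJailLimit = 2;

    // Returns the Resolved bugs assigned to a tester, oldest fix first.
    public static List<Bug> AwaitingVerification(IEnumerable<Bug> bugs, string tester)
    {
        return bugs
            .Where(b => b.State == BugState.Resolved && b.AssignedTo == tester)
            .OrderBy(b => b.ResolvedOn)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data.
        var bugs = new List<Bug>
        {
            new Bug { Id = 101, AssignedTo = "alice", State = BugState.Resolved, ResolvedOn = new DateTime(2024, 1, 8) },
            new Bug { Id = 102, AssignedTo = "alice", State = BugState.Active },
            new Bug { Id = 103, AssignedTo = "bob", State = BugState.Resolved, ResolvedOn = new DateTime(2024, 1, 9) },
            new Bug { Id = 104, AssignedTo = "alice", State = BugState.Resolved, ResolvedOn = new DateTime(2024, 1, 3) },
        };

        List<Bug> waiting = AwaitingVerification(bugs, "alice");
        Console.WriteLine("{0} Resolved bugs waiting to be verified", waiting.Count);
        foreach (Bug bug in waiting)
        {
            Console.WriteLine("  Bug {0}, resolved {1:yyyy-MM-dd}", bug.Id, bug.ResolvedOn);
        }

        if (waiting.Count >= BugJailLimit)
        {
            Console.WriteLine("Bug jail: close these before starting new work.");
        }
    }
}
```

Sorting by resolution date puts the stalest fixes at the top, which are the ones most likely to need re-learning.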

How to Get Your Bugs Fixed

One of the worst kept secrets in Test is that all released software contains bugs. Some bugs are never found by the Test team; others are found, but the cost of fixing them is too high. To release good quality software, then, you not only have to be proficient at finding bugs, but you also have to be proficient at getting them fixed. Unfortunately some testers are good at the former, but not the latter.

A project I worked on a couple of years ago illustrates the point. We were working hard to resolve our large backlog of active bugs as the release deadline approached. Just before our deadline, I received an email declaring that we were successful, and our backlog had been resolved. This meant every bug had either been fixed, postponed, marked as a duplicate, etc. Congratulations were seemingly in order–until I looked more closely at the chart included in the email, which looked like this:

[Pie chart: how we resolved our backlog of active bugs]

I noticed the size of the green Fixed wedge was rather small. The chart represented 150 bugs, so only 4 or 5 were fixed. I wondered why it wasn’t more. Perhaps many of those logged “bugs” weren’t product bugs at all. This turned out not to be the case, however, as the small size of the Not Repro, By Design, and Duplicate wedges tells us.

Now look at the size of the Postponed and Won’t Fix wedges; they make up almost the entire pie! It turned out that although most of the bugs were “real”, they were too trivial to fix that late in the development cycle. For example, there were misaligned UI buttons and misspellings in the product documentation.

I agreed we made the right decision by not fixing these bugs. However, each legitimate bug that was reported, but not fixed, suggests that some amount of test effort didn’t affect the released product.

It takes a lot of effort to log a bug! You have to set up the test environment, learn the product, find the bug, check if it was already logged, investigate the root cause, and file the bug. Let’s be pretentious and call this effort E. Since we had 150 bugs, we had:

150 * E = 5 bugs fixed

In a perfect world, this would be:

150 * E = 150 bugs fixed

I’m not saying that our test effort wasn’t valuable. If the Fixed wedge contained just one security flaw, it would easily justify our work. It did, however, seem as if the ratio of test-effort to bugs-fixed was less than ideal.

Some testers might blame this low ratio on the program manager who rejected their bugs. Others might blame it on the developers who either caused the bugs or never got around to fixing them. These are excuses.

Testers should take responsibility for their own bugs-logged to bugs-fixed ratio. If you think one of your bugs should be fixed, it’s up to you to make sure this happens. You’ll probably never get to the point where every one of your bugs has a positive impact on the product, but here are five “best practices” we can follow to improve our odds.

Account for Bugs in the Schedule

All software contains bugs. Your Test schedule should account for the time it’ll take to find and investigate these bugs. Similarly, the Dev schedule should account for the time it’ll take to fix them. If you don’t see this time in the Dev and Test schedules, kindly call it out.

Start Testing as Soon as Possible

The earlier you report a bug, the better the chance it’ll be fixed. Not only will Dev have more time to fix your bugs, but it’s also less likely they’ll be rejected because the “bug bar” is too high. A common mistake is waiting until the end of the release cycle to schedule your bug bashes or performance testing. If you schedule these activities close to the release date, you’ll end up logging a bunch of bugs that never get fixed.

Log Detailed Bugs

When you log a bug, make sure you include all the information necessary to reproduce the issue. This should include repro steps, environment information, screenshots, videos, or anything else that may help. Be sure you also describe the customer impact in as much detail as possible; this is often the biggest factor in the decision about whether to fix or defer the bug. For example, detailing real exploits for security bugs, instead of hypotheticals, can help push for fixes. Finally, use reasonable spelling, grammar, and punctuation; it doesn’t have to be perfect, but it has to be understandable by the triage team. Poorly filed bugs are often resolved as Not Repro, Won’t Fix, or Postponed because the triage team doesn’t understand the true impact of the issue.

Close Your Resolved Bugs as Soon as Possible

Although testers love finding bugs, they don’t like verifying bug fixes. Some testers consistently keep a large backlog of resolved bugs, and have to be “asked” by management to close them as the release date approaches. Their theory is that it’s better to spend time finding new bugs than closing old ones. The problem is that by the time they get around to verifying the bug fix, it’s often too late to do anything if the fix didn’t work.

Push Back on Bugs You Feel Strongly About

If you believe a bug should be fixed, don’t be afraid to fight for it. Don’t put the blame on the triage team that closed your bug as Won’t Fix or Postponed. Take ownership of the issue and present your case. If you can successfully push the bug through, both you and the customer win. But even if you can’t push it through, you’ll be respected for fighting for what you believe in and being an advocate for the customer. (Just be smart about which battles you pick!)

If you have other best practices that improve the odds of your bugs being fixed, please share them below. I would love to hear them.

Testers Caught Sleeping on the Job

At Microsoft, we submit every code change for peer review before it’s checked in. I’ve performed hundreds (although it feels like millions) of code reviews in the six years I’ve been in Test. One of my biggest red flags during a code review is the Sleep statement.

The Sleep command suspends the current thread for a specified period. A typical test that uses Sleep might look like this:

  // Send a message 
  SendMessage();

  // Wait 2 minutes for message to be received.
  Thread.Sleep(120000); 

  // Message must be there. Let's read it.
  ReadMessage();

The problem with Sleep statements is that they usually sleep either too long or not long enough. If the sleep is too short, your test will fail because the expected state hasn’t been reached. In the above example, if it takes three minutes to receive the message, the test will fail since it’ll try to read a message that isn’t there yet.

If the sleep is too long, your test might still fail; the expected state may have come and gone. But even if it passes, the test isn’t as efficient as it could be. This could lead to test passes that take a long time to complete. If this message was received in one minute, the test will take a minute longer than required. One minute might not sound like a lot, but if you have thirty similar tests, you’re wasting a half hour of run time.

When you find a long Sleep statement inside a loop, there could be room for a significant performance improvement. During one code review, I found a test that looked like the following; this one loop wasted more than a half hour of run time.

  // Send a bunch of messages 
  for (int i=1; i<30; i++) 
  { 
     // Send one message 
     SendMessage(i); 

     // Wait 2 minutes for message to be received. 
     Thread.Sleep(120000); 

     // Read the message.   
     ReadMessage(i); 
  }

Long Sleep statements are also a common cause of tests that fail intermittently. The examples we’ve been using assume we’ll receive every message within two minutes. Even if that’s the case 99% of the time, if the test runs every day it’ll still fail every few months. Furthermore, these tests aren’t very portable. If you run them on a faster or slower machine, or on a server with a different load, the tests might fail. Each of these false positives must then be investigated, wasting your semi-precious time.
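The "every few months" figure follows from simple arithmetic. Here is a small sketch of it, assuming each daily run is independent and passes 99% of the time (the class and method names are made up for illustration):

```csharp
using System;

public static class FlakinessExample
{
    // Probability of at least one failure across `runs` independent runs,
    // given the probability that any single run passes.
    public static double ChanceOfAtLeastOneFailure(double passRate, int runs)
    {
        return 1.0 - Math.Pow(passRate, runs);
    }

    public static void Main()
    {
        double passRate = 0.99; // assumption taken from the text above

        // On average, one failure every 1 / (1 - passRate) runs.
        Console.WriteLine("Average runs between failures: {0:F0}", 1.0 / (1.0 - passRate));

        // Chance of seeing at least one failure in about three months of daily runs.
        Console.WriteLine("Chance of a failure within 90 runs: {0:F2}",
            ChanceOfAtLeastOneFailure(passRate, 90));
    }
}
```

Under these assumptions the test fails about once every hundred runs on average, and there is roughly a 60% chance of at least one spurious failure in any 90-day stretch.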

When you find a Sleep that lasts more than a few seconds, the best solution is to remove the Sleep and subscribe to an event that’s raised when the expected state occurs. This takes the guesswork out of your test case and ensures you wait neither too long nor too little. Such an event might not exist, in which case you may be able to write one yourself, or ask the product developer to create one.
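As a rough illustration of the event-based approach, here is a minimal sketch. The `MessageReceiver` class and its `MessageReceived` event are hypothetical stand-ins for whatever the product exposes; only `ManualResetEventSlim`, `Task.Run`, and `Thread.Sleep` are real .NET APIs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the product component that receives messages.
public class MessageReceiver
{
    // Assumed event; the real product may need one added by the developer.
    public event EventHandler<string> MessageReceived;

    public void Deliver(string body)
    {
        MessageReceived?.Invoke(this, body);
    }
}

public static class EventWaitExample
{
    // Returns true if a message arrives before the timeout, false otherwise.
    public static bool WaitForMessage(MessageReceiver receiver, Action send, TimeSpan timeout)
    {
        using (var arrived = new ManualResetEventSlim(false))
        {
            EventHandler<string> handler = (sender, body) => arrived.Set();

            // Subscribe before sending so a fast reply can't be missed.
            receiver.MessageReceived += handler;
            try
            {
                send();
                // Returns as soon as the event fires, or false at the timeout.
                return arrived.Wait(timeout);
            }
            finally
            {
                receiver.MessageReceived -= handler;
            }
        }
    }

    public static void Main()
    {
        var receiver = new MessageReceiver();

        // Simulate a message that arrives 200 ms after it is sent.
        bool gotIt = WaitForMessage(
            receiver,
            () => Task.Run(() => { Thread.Sleep(200); receiver.Deliver("hello"); }),
            TimeSpan.FromMinutes(2));

        Console.WriteLine(gotIt ? "Message received" : "Timed out");
    }
}
```

The two-minute value is now only an upper bound: the test continues the moment the message arrives, and raising the bound costs nothing on passing runs.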

If an event-based solution is impractical to use, all hope is not lost. In that case, replace the long Sleep with logic that polls until either the expected event occurs or a timeout period is reached. In the test above, we saved thirty minutes of run time by replacing the two-minute Sleep with the following:

  private static bool PollForMessage() 
  { 
     // Use UTC so a daylight-saving change can't stretch or shrink the timeout
     DateTime timeout = DateTime.UtcNow.AddMinutes(2); 
     bool gotIt = MessageExists(); 

     // Loop until message received or two minutes have passed 
     while (!gotIt && DateTime.UtcNow < timeout) 
     { 
        Thread.Sleep(1000); 
        gotIt = MessageExists(); 
     } 
     return gotIt; 
  }

You may be wondering why I whined about Sleep statements and then went ahead and used one in this solution. That’s because, at most, this routine sleeps only one second longer than it should, which isn’t too shabby. If you find the message isn’t always received within two minutes, you can safely change the timeout to three minutes without wasting any run time.
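If you need this pattern in more than one test, the same idea can be pulled into a reusable helper. This is one possible sketch, not the code from the original test; the `Poll.Until` name and its parameters are made up, and it uses `Stopwatch` rather than wall-clock time so clock adjustments don't affect the timeout.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Poll
{
    // Checks `condition` immediately, then once per `interval`,
    // until it returns true or `timeout` has elapsed.
    public static bool Until(Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        while (true)
        {
            if (condition())
            {
                return true;
            }
            if (elapsed.Elapsed >= timeout)
            {
                return false;
            }
            Thread.Sleep(interval);
        }
    }

    public static void Main()
    {
        // Stand-in condition that becomes true on the third check.
        int checks = 0;
        bool ok = Until(() => ++checks >= 3,
                        TimeSpan.FromSeconds(5),
                        TimeSpan.FromMilliseconds(100));
        Console.WriteLine(ok ? "Condition met" : "Timed out");
    }
}
```

With a helper like this, the message test reduces to a single call such as `Poll.Until(MessageExists, TimeSpan.FromMinutes(2), TimeSpan.FromSeconds(1))`.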

The next time you find yourself writing a long Sleep statement, replace it with either an event, or a function that polls until your expected state occurs. If you do this consistently, your test passes will finish faster with more consistent results.

P.S. After I wrote this article, I found that BJ Rollison recently wrote about the same topic, only much more eloquently, on his own blog. See for yourself.

Test Estimation Tips (But No Tricks)

Estimating the time it takes to test a feature is an important, yet often overlooked, skill. If your estimate is too low, it can affect your product’s quality and release date. If your estimate is too high, it can lead to an inefficient use of resources.

Project estimation is a complex enough topic to fill entire books. (See what I mean?) But here are a few simple tips that can quickly improve your estimation skills, not to mention your work/life balance and general happiness.

Consider non-functional testing

A common mistake among less-experienced testers is to only consider functional testing when providing an estimate. In this type of testing, software is evaluated against the functional requirements. On many projects, however, non-functional testing can take longer than functional testing. Non-functional testing includes areas such as security, performance, stress, accessibility, and long-haul.

I once badly underestimated my testing effort because I didn’t consider long-haul testing. (This was mostly because I didn’t know what long-haul testing was.) I worked some crazy hours to make sure my oversight didn’t negatively affect my deadline. My extra hours, however, did negatively affect my wife’s mood, as she was left by herself many evenings to care for our newborn.

Consider the non-testing tasks

Non-testing tasks include writing your test plan, conducting a test plan review, participating in specification reviews, attending meetings, updating the automation framework, and logging your test cases and results. These tasks can easily take 25% or more of your time. You might want to track the time you spend on these tasks for a project or two to get a good idea of how long they take.

Assume bugs will be found

All complex software contains bugs. One of the goals of testing is to find these bugs so a decision can be made on whether or not to fix them. If you’re not planning on finding bugs, then you’re essentially planning that your tests will be ineffective. Don’t plan on writing ineffective tests. Account for the time it’ll take to investigate test case failures, reproduce issues, log bugs, follow up with dev, fight for bugs in triage, validate fixes, and re-run your tests.

Be cautious about giving an estimate before doing your research

One of the biggest mistakes you can make is to offer an early ballpark estimate before doing your due diligence. Your initial guess will greatly influence your final estimate due to a behavior known as anchoring. People have a tendency to “anchor” to their first guess and make small adjustments from there. These adjustments, however, typically under-estimate new information, leading to an inaccurate final estimate.

For example, the always-reliable Wikipedia cites a study in which people were asked to guess the percentage of African nations that are in the United Nations. Those who were asked “Was it more or less than 10%?” guessed lower values (25% on average) than those who were asked if it was more or less than 65% (45% on average).

I was once asked to estimate my test effort during the project kick-off meeting. I had just learned about the project, and had no business giving an estimate. But I was pressed to give a rough estimate, so I guessed two weeks.

When I wrote my test plan, I realized my original estimate was too low. I had forgotten to consider some non-functional testing tasks and found the automation framework I planned on using wasn’t reliable. I spoke to my manager and we both agreed to adjust the estimate, but the damage had already been done. My two-week guess had become our anchor. We doubled the estimate to four weeks, but it turned out we should have raised it to six. As a result, my work/life balance took a turn for the worse, and once again, my wife was not happy.

Ask the developers for their estimate

Although it’s not ideal, the reality is that at some point you will be asked to estimate your test work before properly analyzing the project. In these situations, the amount of development work is a good gauge for the amount of test work. The more complex the feature, the longer it takes to both develop and test.

I’ve found that a reasonable rule of thumb is that the test effort is 1 to 2 times the development effort. If this ratio sounds high, consider how many times you gave a test estimate, but then wished you had more time as the project completed. I want to stress, however, that this is just a rule of thumb; it should only be used if you absolutely have to give an estimate before doing your due diligence.

If the developers haven’t yet made their estimate, then you have a stronger case for not yet giving yours. You can’t reasonably be expected to make an accurate test estimate if the developers don’t even know how complex their code will be. If management still insists on an estimate, have them speak to your wife.

Ostrich a la Mode

One hallmark of a great test team is how they behave in the final days before a release. I’m not referring to working late to automate the last few tests, validate bug fixes, and run a final test pass. These long hours are an expected part of the job, and when you think about it, these tasks are really just bookkeeping. I’m talking about something that actually affects the product: does your test team still try to find bugs just before a release?

The best test teams I’ve worked on (including my current one) looked for bugs right up to the release date. Testers took pride in finding a release-blocking bug. The test team, manager included, congratulated the tester who found one. The developers and program managers (half) jokingly gave the tester a hard time, but everyone rallied together to resolve the issue on schedule. Overcoming this adversity brought the team closer together. More importantly, we prevented a customer from finding the bug.

Some teams, however, don’t share this sentiment. In The Art of Software Testing, Glenford Myers wrote that most teams are “measured on the ability to produce a program by a given date and for a certain cost.” Unfortunately, finding bugs late in the release cycle “may be viewed as decreasing the probability of meeting these schedule and cost objectives.” These teams can go into ‘ostrich mode’, with their heads buried in the sand for fear of finding a bug and missing their goal.

I remember working on a team a while back when our pass rate fell short of our release criteria. On the day of Test Complete, the team was called into a meeting where we were asked, “How can we get this pass rate above 95% today?”

The problem I had with this question was that the emphasis was on meeting the metric, not finding the bugs that caused the failures. Worse, it didn’t convey the sentiment that finding a bug would even be appreciated.

One way to quickly raise a pass rate is to fix errors in the test automation code. Another is to add new passing tests–with every new passing test case added, the pass rate increases. Both of these accomplish the stated goal, but neither improves the product quality. And what if we succeeded in reaching our pass rate goal? Shouldn’t we still investigate the failures to see if any were caused by a ship-blocking bug?
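To see why adding passing tests moves the metric without touching the product, consider a small sketch with made-up numbers: a suite of 200 tests with 20 failures, to which 200 new passing tests are added. The class and method names are hypothetical.

```csharp
using System;

public static class PassRateExample
{
    // Pass rate as a percentage of all tests run.
    public static double PassRate(int passed, int total)
    {
        return 100.0 * passed / total;
    }

    public static void Main()
    {
        // Made-up numbers for illustration.
        int passed = 180;
        int failed = 20;
        Console.WriteLine("Before: {0}% ({1} failures)",
            PassRate(passed, passed + failed), failed);

        // Add 200 new tests that all pass; the 20 failures are untouched.
        int added = 200;
        Console.WriteLine("After:  {0}% ({1} failures)",
            PassRate(passed + added, passed + failed + added), failed);
    }
}
```

The rate climbs from 90% to 95%, yet the same 20 failures are still there, and any one of them could be hiding a ship-blocking bug.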

Instead of being asked how we could raise our pass rate, we should have been told, “Let’s investigate these failures right away and make sure any product bugs are triaged and resolved immediately.”

Myers wrote that most teams “are more interested in having their system test pass proceed as ‘smoothly’ as possible and on-schedule, and are not truly motivated to demonstrate that the program does not meet its objectives.” As a tester, however, it’s your responsibility to report an accurate picture of the product quality. Think of yourself as a lawyer; the product is on trial and you’re representing the customers. If you find a ship-blocking bug, present it as evidence. A ‘jury’ will then use that evidence to decide whether to release on-schedule.

Have you ever worked on a team that went into ostrich mode? Please take the poll and share your stories in the comments section below.
