Why NOT to fix a bug

We testers love to have our issues/bugs fixed, especially Sev 1 (i.e. crashing or data loss) ones. Sometimes, though, we love it when they DON’T fix a bug. Say what? Yes, I fought to NOT fix a crashing bug. But I’m getting ahead of myself.

Whenever we find a bug, we assign a number to it denoting the severity of the bug. Maybe it’s a trivial issue and the customer would likely never notice it. Maybe it’s a must-fix bug such as a crash, data loss, or security vulnerability. At Microsoft, we generally assign all bugs two numbers when we enter them: Severity and Priority. Severity is how bad the bug is: Crash = 1, a button border color off by a shade = 4. Priority is how soon the bug should be fixed: Search does nothing so I can’t test my feature = 1, Searching for escaped text doesn’t work = 4.

Once we enter a bug, then it’s off to Bug Triage. Bug Triage is a committee made up of representatives from most of the disciplines. At the start of a project, there is a good chance all bugs will be fixed. We know, though, based on data mining our engineering process data, that whenever a bug is fixed, there is a non-zero chance that the fix won’t be perfect or something else will be broken. Early on in the project, we have time to find those new bugs. As we get closer to release, there may not be time to find those few cases where we broke the code.

One more piece to this puzzle: Quality Essentials (QE). It is a list of the practices and procedures – the requirements – that our software or service must meet in order to be released. It could be as simple as verifying the service can be successfully deployed AND rolled back. It could be as mundane as zeroing out the unused portions of sectors on the install disk.

Now, that bug I told you about at the beginning. We have an internal web site that allows employees to search for and register for trainings. We had a sprint, a four-week release cycle, at the end of the year where we had to make the site fully accessible to those with disabilities. This was a new QE requirement. We were on track for shipping on time…as long as we skipped our planned holiday vacations. While messing around with the site one lunch, I noticed that we had a SQL injection bug. I could crash the SQL backend. The developer looked at the bug and the fix was fairly straightforward. The regression testing required, though, would take a couple of days. That time was not in the schedule. Our options were:
• Reset the sprint, fix the new bug, and ship late. We HAD to release the accessibility work by the end of the year, so this wasn’t an option.
• Bring in more testing resources. With the holiday vacations already taking place, this wasn’t a really good option.
• Take the fix, do limited testing, and be ready to roll back if problems were found. Since this site has to be up 99.999% of the time, this wasn’t a legitimate option.
• Not fix the bug. This is the option we decided to go with.

Why did we go with the last option? There were a couple of reasons:
1) The accessibility fix HAD to be released before the end of the year due to a Quality Essentials requirement.
2) The SQL backend was behind a load balancer, with a second server and one standby. One SQL server was usually enough to handle the traffic.
3) The crashed SQL server was automatically rebooted and rejoined the load balancer within a minute or two, so the end user was unlikely to notice any performance issues.
4) The web site is internal only, and we expect most employees to be well behaved…the project tester, me, being the exception.

So, the likelihood of a crash was small and the impact of a crash was small, and we shipped with the bug. After a few days off, the next sprint, a short one, was carried out just to fix and regress this one bug. According to the server logs, the SQL server was crashed once between the holidays and the release of the fix. It was noted by our ever-diligent Operations team. But, hey, I was testing the logging and reporting system. 🙂

I would be remiss if I didn’t add that each bug is different and must be examined as part of the whole system. The fix decision would have been very different if this were an external-facing service, or if something critical such as financial data were involved.
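For readers who have not met this class of bug, here is a minimal sketch of the usual fix for SQL injection: pass user input as a bound parameter instead of concatenating it into the query string. Everything in it is an assumption made for illustration. The post does not show the real schema, query, or database, so the `trainings` table and the `find_trainings` function are hypothetical, and Python’s built-in `sqlite3` module stands in for the actual SQL backend.

```python
import sqlite3


def find_trainings(conn, search_text):
    """Search training titles using a bound parameter (hypothetical schema)."""
    # The user's text is passed as data, so quotes and semicolons in it
    # are never interpreted as SQL.
    pattern = f"%{search_text}%"
    cursor = conn.execute(
        "SELECT title FROM trainings WHERE title LIKE ? ORDER BY title",
        (pattern,),
    )
    return [row[0] for row in cursor.fetchall()]


# In-memory database standing in for the real backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trainings (title TEXT)")
conn.executemany(
    "INSERT INTO trainings (title) VALUES (?)",
    [("Accessibility 101",), ("SQL Basics",)],
)
conn.commit()

# A typical hostile search string a tester might try.
hostile = "x'; DROP TABLE trainings; --"

print(find_trainings(conn, "SQL"))
print(find_trainings(conn, hostile))
```

With the bound parameter, the hostile string is simply a search term that matches nothing, and the table is left intact.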

UI Testing Checklist

When testing a UI, it’s important to not only validate each input field, but to do so using interesting data. There are plenty of techniques for doing so, such as boundary value analysis, decision tables, state-transition diagrams, and combinatorial testing. Since you’re reading a testing blog, you’re probably already familiar with these. Still, it’s nice to have a short, bulleted checklist of all the tools at your disposal. When I recently tested a new web-based UI, I took the opportunity to create one.

One of the more interesting tests I ran was a successful HTML injection attack. In an input field that accepted a string, I entered: <input type="button" onclick="alert('hi')" value="click me">. When I navigated to the web page that should have displayed this string, I instead saw a button labeled “click me”. Clicking on it produced a pop-up with the message “hi”. The web page was rendering all HTML and JavaScript I entered. Although my popup was fairly harmless, a malicious user could have used this same technique to be, well, malicious.
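The usual defense is to escape user-supplied text before writing it into a page. The sketch below shows the idea with `html.escape` from Python’s standard library. The `render_comment` function is a hypothetical stand-in; the post does not say how the real page was built or how the bug was fixed.

```python
import html


def render_comment(user_text):
    """Wrap user text in a paragraph, escaping it so it displays as text."""
    # quote=True also escapes double and single quotes, which matters
    # when the value ends up inside an HTML attribute.
    return f"<p>{html.escape(user_text, quote=True)}</p>"


# The same string used in the test described above.
payload = '<input type="button" onclick="alert(\'hi\')" value="click me">'

print(render_comment(payload))
```

After escaping, the browser shows the markup as literal characters instead of rendering a clickable button.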

Another interesting test was running the UI in non-English languages. Individually, each screen looked fine. But when I compared similar functionality on different screens, I noticed some dates were formatted mm/dd/yyyy and others dd/mm/yyyy. In fact, the most common bug type I found was inconsistencies between screens. The headings on some pages were title-cased, while others were lower-cased. Some headings were butted against the left side of the screen, and others had a small margin. Different fonts were used for similar purposes.

Let’s get back to boundary value analysis for a minute. Assume you’re testing an input field that accepts a value from 1 to 100. The obvious boundary tests are 0, 1, 100, and 101. However, there’s another, less obvious, boundary test. Since this value may be stored internally as an integer, a good boundary test is a number too large to be stored as an int.
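As a rough sketch, the boundary values described above can be generated rather than remembered. The `boundary_values` function is hypothetical, and the 32-bit signed integer width is an assumption; the real storage type depends on the application under test.

```python
def boundary_values(low, high, int_bits=32):
    """Return boundary test inputs for a field that accepts low..high."""
    # Largest value a signed integer of the assumed width can hold.
    int_max = 2 ** (int_bits - 1) - 1
    return [
        low - 1,      # just below the valid range
        low,          # smallest valid value
        high,         # largest valid value
        high + 1,     # just above the valid range
        int_max,      # largest value the assumed int type can store
        int_max + 1,  # too large to be stored in the assumed int type
    ]


for value in boundary_values(1, 100):
    expected = "accept" if 1 <= value <= 100 else "reject"
    print(f"{value}: the field should {expected} this value")
```

Python integers do not overflow, so the script itself is safe to run; the interesting question is what the application under test does with the last value.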

My UI Testing Checklist has all these ideas, plus plenty more: accented characters, GB18030 characters, different date formats, leap days, etc. It’s by no means complete, so please leave a comment with anything you would like added. I can (almost) guarantee it’ll lead you to uncover at least one new bug in your project.

 

Click this image to download the UI Testing Checklist

The Tax Man

Introduction

As I write this, Tax Day in the United States is just behind us. This is the day when individuals are required by law to settle their tax bill with the US Federal Government, and in many cases with their state governments. Filing your taxes can get complicated, and income tax preparation is a large industry in the United States, with revenue around ten billion dollars according to IBISWorld. That’s ten billion dollars spent annually, helping people cope with the hassle, the complexity, and the general all around unpleasantness of paying taxes.

In the past, software testing could feel like paying taxes and being a software tester could start to feel like working for the IRS. There are forms to fill out. There are deadlines to be met. And there’s process, process, process and at the end of the day it feels like people only notice when something goes Horribly Wrong. But there’s a way to change all that.

Do you know how to get people to appreciate and understand paying taxes? Demonstrate a clear value for their investment, be flexible, and be data-driven.

Deliver Measurable Value

How many times this week did you run a test that didn’t find a bug in the last year? Be honest.

Running tests that don’t find bugs is not, generally, a great way to deliver value. It’s tempting to confuse activity with value and it’s nice to feel busy, but can you imagine explaining how you spent your time to the CEO of your company? It would go something like this:

CEO: So, what have you done this week?

You: Well, I ran a bunch of tests.

CEO: I see. Tell me more about how that improves our product or our profitability?

You: Well, most of the tests I run don’t find bugs. We just run them because we’ve always run them.

[Cricket sounds]

In a conversation with your CEO, your boss, or even your mom you should be able to clearly and distinctly describe what you did to deliver value to the organization. Imagine this conversation instead:

CEO: So, what have you done this week?

You: I prevented thousands of dollars in revenue loss.

CEO: Pray tell, how did you do this?

You: I looked at the numbers from our website and I saw that traffic was down. I talked to Larry in development and found out that we’d just changed the font on the front page, about the same time traffic dropped. I worked with him and we put together an experiment where we tested the new font with some users and the old font with some other users. Turns out users hate the new font. We reverted the change and traffic’s gone up. We’ve put a process in place to do experimentation on all our big changes now.

CEO: I see. Tell me more about your ideas.

[Applause]

Be Flexible

In the past, we kept our tests in long documents, like giant Word files or long Excel spreadsheets. Then we created specialized software to manage the test cases. We had checklists. Lots of checklists.

The funny thing was, it was possible to hit every box on the checklist and still deliver mediocre software. It was also possible for the checklist to slow down product development, without having anything measurable to show for it. How many times have you had to reject a change because the test process couldn’t accommodate it in time for your product deadline? Again, be honest. I bet you can think of at least one great change that couldn’t make it into your product because your change management system or your test pass couldn’t move fast enough to get the change in before the deadline.

There are some software development activities that require very rigorous, very cautious change management, either because of regulatory requirements or because of risk to life or property. It’s very likely that you do not work in one of those areas. Even if you do, there are probably places in your process that could stand some scrutiny and some adjustment.

At the end of the day you are not shipping your checklist. Go through the stages of your process and ask yourself if they’re really adding value, or if they’re only there because they’ve always been there. Then cut everything out of the process that doesn’t deliver real, measurable value to your users or your organization.

Be Data-driven

There are tons of tools available for getting data back from your application. Amazon has published a framework for doing A/B testing and analytics, and Facebook has Airlock. There are frameworks available for .NET, JavaScript, and pretty much any other platform that you need.

But, you protest, your product is written in COBOL and distributed via floppy disk to a small group of users in Thule, Greenland and McMurdo Base, Antarctica. “There’s no way to be data-driven without writing my own framework from scratch,” you say, “and I don’t have time for that!”

To this I say: horsefeathers. Being data-driven isn’t about using a framework or a buzzword. It’s about going out and looking at the real world (which is where your users live) and trying to understand how they’re using your software and what could change to make life better for them. If you don’t have a framework for doing A/B testing or getting user data back directly from your application, consider doing user interviews or surveys. The important thing is that you look at real users doing their real work and don’t just rely on results you got from your manual or automated testing.

To a certain extent, being data-driven can be like answering a Fermi question. Start with a question that you need an answer to, for example “Do my users like the new font?” Then figure out what you already know and what tools you have at your disposal that can help you learn more. Frameworks are nice and they can make information easier to gather, but they aren’t the only way for you to go into the real world and get data from real users.
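You don’t need a framework to analyze a simple experiment like the font test, either. The sketch below compares two groups with a two-proportion z-test, one common way to check whether a difference is likely to be real, using only the standard library. The function name and the visitor counts are made up for illustration; they are not results from any real experiment.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the assumption that both groups behave the same.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: users who clicked through with each font.
z, p = two_proportion_z_test(success_a=120, total_a=1000,   # old font
                             success_b=90, total_b=1000)    # new font
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between the fonts is unlikely to be noise. With a handful of users, or with data from interviews and surveys, the same question still applies even if the arithmetic is done by hand.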

Conclusion

I don’t think there’s anybody who truly enjoys paying taxes. There are, however, tons of people who enjoy testing software. If you’re one of these people, and if you want your users and your dev peers to respect your work, then deliver measurable value, be flexible and be data-driven. If you don’t want to do these things, then don’t be surprised when your colleagues greet you with about as much enthusiasm as they greet an audit from the IRS.

How to release more often, have great customer focus, and great quality

Introduction

Something that I hear in software development circles is that folks want more frequent releases, they want more customer focus, more opportunities for innovation, and excellent quality. However, sometimes they do not know where to start to achieve what may at first seem like daunting goals.

Long software release cycles, feature sets that match a specification but might not match customer expectations, and lack of perceived innovation can be significant problems in software development.

—————————————————————————————————

The Methodology

Some software methodologies have certain preconceptions associated with them and can sometimes be discounted immediately without really taking a hard look at the details of what they are.

I have had the pleasure of working on a few teams here at Microsoft that use this sometimes misunderstood methodology.  I have both been a leader and participant in the process.

This methodology is called Scrum and is valuable for those in several different engineering disciplines and to their leadership as well.

—————————————————————————————————

How does it work at a high level?

Outline of process:

  • Create a backlog of scenarios and associated features for the product. This backlog can be updated, or added to, at any time and can be reviewed periodically. There are various tools such as Team Foundation Server (TFS) that can be used for this purpose.
  • Assemble a backlog of scenarios and features to be completed in a relatively short milestone (two to four weeks). Design, coding, testing, and bug fixing all occur during the milestone.
    • A planning meeting prior to the start of the milestone can be used to discuss this and give relative costing to the work items.
  • Each workday, hold a short meeting, fifteen minutes at maximum, to discuss progress from the previous day, what will be worked on, and any blocking issues. If the meeting is reserved for these topics only, it will stay on point and short.
    • There can also be similar meetings to coordinate larger cross-team projects.
  • At the end of the milestone there should be potentially shippable results, or one should be able to demo the progress.
    • Additionally, there can be opportunities to do an early release to a limited group of customers via alphas or betas.

—————————————————————————————————

Science behind Scrum

Scientific studies have confirmed several benefits of this approach.  You might be surprised to find that one of the studies was regarding three different teams at Microsoft!

These teams used various practices along with Scrum to achieve their goals: regular short meetings, planning poker, continuous integration, test-driven development (unit tests), done criteria, source control, code coverage, peer review, static analysis, and XML documentation.

What did these teams learn from transitioning to this methodology? They found that initially their productivity decreased slightly while they were learning the process. However, by their fourth short milestone they observed a significant increase in productivity without an increase in defects! They also had better quality with regard to defect density compared with other teams not using the methodology, including data benchmarked across forty projects from nine companies! “These results indicate that [the methodology] combined with sound engineering practices have the potential to yield a higher quality product.” (Scrum and Engineering Practices: Experiences of Three Microsoft Teams)

—————————————————————————————————

Key Concepts

“Scrum is an iterative and incremental Agile software development framework for managing software projects and product or application development. Its focus is on ‘a flexible, holistic product development strategy where a development team works as a unit to reach a common goal’ as opposed to a ‘traditional, sequential approach’. Scrum enables the creation of self-organizing teams by encouraging co-location of all team members, and verbal communication among all team members and disciplines in the project.

A key principle of Scrum is its recognition that during a project the customers can change their minds about what they want and need (often called requirements churn), and that unpredicted challenges cannot be easily addressed in a traditional predictive or planned manner. As such, Scrum adopts an empirical approach—accepting that the problem cannot be fully understood or defined, focusing instead on maximizing the team’s ability to deliver quickly and respond to emerging requirements.” (http://en.wikipedia.org/wiki/Scrum_(software_development))

The statement below is a high-level glance at the philosophy behind the methodology. It is the Agile Manifesto.

“We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

Individuals and interactions over processes and tools

Working software over comprehensive documentation

Customer collaboration over contract negotiation

Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.” (http://agilemanifesto.org/)

So, in other words, the objective is to use the Scrum methodology to achieve an organization’s goals, not to follow a process for its own sake.

—————————————————————————————————

Other Helpful Items

Scrum can be even more effective when used in tandem with other helpful tools and methods.  Some of these are: user research, A/B testing, testing in production, telemetry, business intelligence data, etc…  For example, on my Xbox SmartGlass client platform team the development and test teams work very closely together.  We leverage unit tests, continuous integration, and a focus on the end-to-end experience, amongst other items during development to keep quality high.  Using these tools we can quickly catch regressions and other issues.  In our case, we work in an open workspace that easily facilitates communication between developers and testers.  However, there are other teams that still successfully use Scrum without this workspace layout.

—————————————————————————————————

Training and Implementation

To be successful using Scrum it is very important to get good quality training. There are many resources available, and even certifications.

Teams should give the new process time.  It is common for teams to question or have doubts for the first few milestones or sprints.  For some big teams or organizations it may take longer, as much as twelve to eighteen months in some cases, for them to get into a good rhythm with the new methodology.

Tracking the relative costing of work items and how much work is completed per milestone makes it possible to predict what could be accomplished in future milestones.
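As a small sketch of that arithmetic: average the relative cost completed in past milestones, then divide the remaining backlog by that average. The function name and the numbers are hypothetical, and a plain average is only one of several ways teams estimate this.

```python
import math


def forecast_milestones(completed_per_milestone, remaining_cost):
    """Estimate how many more milestones the remaining backlog will take."""
    # Average relative cost completed per milestone so far.
    velocity = sum(completed_per_milestone) / len(completed_per_milestone)
    # Round up: a partly used milestone still has to be scheduled.
    return velocity, math.ceil(remaining_cost / velocity)


# Hypothetical history: relative cost completed in the last three milestones.
velocity, milestones_left = forecast_milestones([18, 22, 20], remaining_cost=65)
print(f"Average per milestone: {velocity}, milestones remaining: {milestones_left}")
```

The forecast gets more trustworthy as more milestones are recorded, which is one reason to give the process time before judging it.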

Another key factor to consider is to get approval and agreement from the entire team, including upper management.  This is so each person not only agrees with the approach but also has a sense of ownership too.

—————————————————————————————————

Tailoring the Process to Fit Your Team

Teams can tailor Scrum to fit the way that works best for them.  After having the context and knowledge from training and experience it can be easily changed to meet the team’s needs and desires.  For example, if a team prefers a detailed process with several reports (burn-down graphs, etc…) then they can do that.  However, if they prefer a very lean process with more focus on team members talking in person to resolve priorities and issues (this can work well for small teams) then they can implement that.

—————————————————————————————————

Improving Customer Focus

Improving customer focus is a key benefit of using Scrum. Demonstrations at the end of a milestone are a good way to show progress and gather feedback. For example, during Xbox One development my Xbox SmartGlass team would do demos to show progress and gather feedback from others in the larger Xbox organization.

For features that are ready to ship or to be adopted early there are other options for gathering feedback.  Pre-releases such as Alphas and Betas can be used to get feedback.  Also, regular releases to customers can be useful.  One example was for the Xbox One release. We leveraged Beta feedback to address issues and improve Xbox SmartGlass prior to the holiday release.  Also, for the third party developers that we work with we did regular monthly releases to meet their needs and gather feedback.

In addition, Scrum provides the flexibility to add work items, or stories, that were not planned and prioritize them before other work items.  An example of this was that we quickly developed and tested features and a demo for the E3 conference in 2012 related to work on the Xbox 360.  We had already planned other work for the milestones in question but were able to be flexible and re-prioritize.

Another advantage of this approach is that teams do not have to follow a predefined schedule for releases.  Products can ship when they are ready and not at a prearranged date, if so desired.  After seeing the results of the team’s labor they come to know how much work can be completed each milestone, and can predict better when releases can occur.  So, essentially a team could ship anytime that they and the customer deem is appropriate.

I have worked on a few teams that have used Scrum, and a few that have claimed to use it.

I would like to emphasize the importance of training for Scrum.  The teams who have had the most success with it were the ones that had training and had ownership in the process.

Examples of teams where Scrum worked well were: Microsoft SharedView, Xbox.com, and currently Xbox SmartGlass. Sometimes it takes time for members of the team to adjust to the new process and that is okay, and likely should be expected. However, the process leads to better productivity, more predictability, better quality in terms of the customer-facing results, and better transparency of what is happening on the team through planning and daily standup meetings. It also gives the team a better sense of identity.

Conversely, the teams that had mixed results were teams that claimed to use it but did not really follow the process.  In other words, they would say that they were using Scrum but still clung to some old ways of planning, developing, testing, releasing, etc…

—————————————————————————————————

Conclusion

In summary, Scrum is an effective methodology that enables shorter software release cycles, more opportunities for gathering and responding to customer feedback, and more avenues for innovation, while maintaining excellent quality. It is even better when used with additional tools and processes, like those mentioned above.

—————————————————————————————————

Testing Is Like Going To The Doctor

Introduction

By now, you’ve probably heard the phrase “move quality upstream.” The idea is that we want developers to take more ownership of validating the quality of the code they produce. There are a couple of common-sense practices, most of which you’ve already heard of, that you need to adopt to start moving your org in this direction.

The first practice you need to adopt is unit testing. If your org doesn’t embrace unit testing, this is the first thing you need to go fix. Fortunately there’s been a lot written about unit testing and about how to get a dev org to adopt unit testing as a best practice. There’s even an IEEE standard on how to approach unit testing, if you’re into that kind of thing.

The next practice you need to adopt is test automation. Everything I write is going to assume that you believe test automation is generally a good thing and that you’ve got a standard test harness that you can use to exercise the code under test. Maybe you bought a test harness off the shelf. Maybe you’re using an open source test framework. Maybe you’re a special, unique snowflake and your org has a test harness that’s completely internal. Whatever. The point is, I’m going to assume that you believe in writing tests that are automated, repeatable and maintainable. I’m also going to assume you have some automated sanity tests, and probably an automated way to build and deploy code. If you haven’t got these yet, there are some great books and blogs out there that’ll help you get the job done.

Great. What’s Next?

OK. So you’ve adopted both unit testing and automated functional testing. You’ve got an automated build and an automated build verification test (BVT) system that tells you when the build is just plain broken. That’s great. The next thing you need to think about is how to move more of the functional testing of software into the hands of the dev that’s writing it.

It makes sense for dev to own at least some functional testing. If you think about it, every time we move code from a developer to a tester we’re introducing overhead. It’s like having to set up and tear down a stack during a function call, or like having to move from “on box” to “on rack” in a cloud computing environment. It’s a tiny cost that, repeated hundreds of times, adds up to a really big cost. The problem is that functional testing is a really big problem space. Are we asking dev to take on basic happy path testing? What about pairwise or stress testing? What about negative testing?

Yes. Yes, to all of it. But we’re going to do it one piece at a time. And when we’re done, we’re still going to have a ton of other stuff to do as quality assurance professionals. It’ll just be different stuff than what we do today.

The first thing you need to ask yourself is, “What is the first thing I do when dev hands me a piece of code to test?” The next thing you need to ask yourself is, “Does this really need to be done at all?” Then ask, “If this needs to be done, am I really the right person to do it?”

The first thing that you do when dev hands you a piece of code is going to be different depending on the kind of software you’re shipping. If you own a web service, you might validate that basic browsing and payment processing are working. If you own a game, you might make sure the menus load. The thing is, there’s probably something that you always do first when a dev tells you that a piece of code is done.

That thing, whatever it is–do you really need to do it? Is it finding bugs? What’s the risk to your product if you don’t find a bug at this stage? If a test doesn’t find bugs and it doesn’t mitigate risk, it’s probably not worth running. If a test always finds bugs, then we need to do something to improve quality because consistent failure is a sign that something’s systemically wrong.

Here’s a table to illustrate the point:

          | Always Finds Bugs                            | Rarely Finds Bugs
High Risk | We need more quality before we run this test | Somebody should run this test
Low Risk  | We need more quality before we run this test | Don’t run the test
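If it helps to see the table as a decision rule, here is a minimal sketch that encodes it as a lookup. The `recommend` function and the string keys are hypothetical names chosen for this example.

```python
# The four cells of the table, keyed by (risk, how often the test finds bugs).
DECISIONS = {
    ("high", "always"): "We need more quality before we run this test",
    ("high", "rarely"): "Somebody should run this test",
    ("low", "always"): "We need more quality before we run this test",
    ("low", "rarely"): "Don't run the test",
}


def recommend(risk, finds_bugs):
    """Look up what to do with a test given its risk and bug-finding history."""
    return DECISIONS[(risk.lower(), finds_bugs.lower())]


for risk in ("high", "low"):
    for finds_bugs in ("always", "rarely"):
        print(f"{risk} risk, {finds_bugs} finds bugs: {recommend(risk, finds_bugs)}")
```

Note that risk only changes the answer when the test rarely finds bugs; a test that always finds bugs points to a quality problem upstream regardless of risk.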

Is There A Doctor In The House?

A test in software is kind of like getting a test from the doctor. If the doctor finds that my blood pressure is high, I’m the one who gets on the treadmill to try to get my blood pressure down, not the doctor. If the tests find that something’s consistently wrong, then the developer needs to do something to rectify the consistent failure: more code review, more unit testing, or maybe running the functional test themselves.

If the doctor thinks that it’s worthwhile to have my blood pressure monitored, I could come into the office to do it but that’ll get expensive and time consuming really fast. Or I could buy a blood pressure cuff and monitor my vitals myself, which will be much cheaper. The automated test that you’ve been running every time your dev hands you code? That test is your automatic, home-use blood pressure cuff.

The key here is that the home blood pressure monitor is automatic. I push a button, it squeezes my arm like an anaconda, and it spits out a reading. I don’t need a stethoscope. I don’t even need to know how to spell “systolic”. If somebody hadn’t invented this nifty little automated testing device, I wouldn’t be able to do this test by myself. But they did, and I can, and it saves a ton of time and money.

So if your test is always finding bugs or if the area is high risk, take your automated test and say to your dev partner, “Hey, I run this test every time you hand me code. If you ran it instead it would save us both some time.”

I’m assuming, of course, that your automated test runs pretty quickly, doesn’t generate false failures, and doesn’t require mastery of the Dark Arts to set up and execute, just like the automated blood pressure cuff. As long as you meet these conditions with your tests, the win for everybody is usually pretty obvious. It’s exactly the same as not driving to the doctor when you have a home blood pressure cuff: don’t go to the tester when you’ve got a quick, easy way of validating quality yourself.