How to release more often, have great customer focus, and great quality


Something that I hear in software development circles is that folks want more frequent releases, they want more customer focus, more opportunities for innovation, and excellent quality. However, sometimes they do not know where to start to achieve what may at first seem like daunting goals.

Long software release cycles, feature sets that match a specification but might not match customer expectations, and lack of perceived innovation can be significant problems in software development.


The Methodology

Some software methodologies have certain preconceptions associated with them and can sometimes be discounted immediately without really taking a hard look at the details of what they are.

I have had the pleasure of working on a few teams here at Microsoft that use this sometimes misunderstood methodology. I have been both a leader and a participant in the process.

This methodology is called Scrum and is valuable for those in several different engineering disciplines and to their leadership as well.


How does it work at a high level?

Outline of process:

  • Create a backlog of scenarios and associated features for the product. This backlog can be updated, or added to, at any time and can be reviewed periodically. Various tools, such as Team Foundation Server (TFS), can be used for this purpose.
  • Assemble a backlog of scenarios and features to be completed in a relatively short milestone (two to four weeks). Design, coding, testing, and bug fixing all occur during the milestone.
    • A planning meeting prior to the start of the milestone can be used to discuss this and give relative costing to the work items.
  • Each workday, hold a short meeting, fifteen minutes at most, to discuss progress from the previous day, what will be worked on, and any blocking issues. If the meeting is reserved for these topics only, it will stay on point and short.
    • There can also be similar meetings to coordinate larger cross-team projects.
  • At the end of the milestone there should be potentially shippable results, or one should be able to demo the progress.
    • Additionally, there can be opportunities to do an early release to a limited group of customers via alphas or betas.


Science behind Scrum

Scientific studies have confirmed several benefits of this approach. You might be surprised to find that one of the studies examined three different teams at Microsoft!

These teams used various practices along with Scrum to achieve their goals: regular short meetings, planning poker, continuous integration, test-driven development (unit tests), done criteria, source control, code coverage, peer review, static analysis, and XML documentation.

What did these teams learn from transitioning to this methodology? They found that initially their productivity decreased slightly while they were learning the process. However, by their fourth short milestone they observed a significant increase in productivity without an increase in defects! They also had better quality with regard to defect density compared to other teams not using the methodology, including data benchmarked across forty projects from nine companies! “These results indicate that [the methodology] combined with sound engineering practices have the potential to yield a higher quality product.” (Scrum and Engineering Practices:  Experiences of Three Microsoft Teams)


Key Concepts

“Scrum is an iterative and incremental Agile software development framework for managing software projects and product or application development. Its focus is on ‘a flexible, holistic product development strategy where a development team works as a unit to reach a common goal’ as opposed to a ‘traditional, sequential approach’. Scrum enables the creation of self-organizing teams by encouraging co-location of all team members, and verbal communication among all team members and disciplines in the project.

A key principle of Scrum is its recognition that during a project the customers can change their minds about what they want and need (often called requirements churn), and that unpredicted challenges cannot be easily addressed in a traditional predictive or planned manner. As such, Scrum adopts an empirical approach—accepting that the problem cannot be fully understood or defined, focusing instead on maximizing the team’s ability to deliver quickly and respond to emerging requirements.” (

The statement below is a high-level look at the philosophy behind the methodology. It is the Agile Manifesto.

“We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

Individuals and interactions over processes and tools

Working software over comprehensive documentation

Customer collaboration over contract negotiation

Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.” (

So, in other words, the objective is to use the Scrum methodology to achieve an organization’s goals, not to follow a process for its own sake.


Other Helpful Items

Scrum can be even more effective when used in tandem with other helpful tools and methods. Some of these are user research, A/B testing, testing in production, telemetry, and business intelligence data. For example, on my Xbox SmartGlass client platform team the development and test teams work very closely together. We leverage unit tests, continuous integration, and a focus on the end-to-end experience, amongst other items, during development to keep quality high. Using these tools we can quickly catch regressions and other issues. In our case, we work in an open workspace that easily facilitates communication between developers and testers. However, there are other teams that still successfully use Scrum without this workspace layout.


Training and Implementation

To be successful using Scrum it is very important to get good quality training.  There are many resources available and even certifications, as well.

Teams should give the new process time.  It is common for teams to question or have doubts for the first few milestones or sprints.  For some big teams or organizations it may take longer, as much as twelve to eighteen months in some cases, for them to get into a good rhythm with the new methodology.

Tracking the relative cost of work items, and how much work is completed per milestone, makes it possible to predict what could be accomplished in future milestones.
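To make that idea concrete, here is a minimal sketch of the arithmetic, assuming work items are costed in relative points and that the last few milestones are a fair guide to the next ones. The function names, the three-milestone window, and the sample numbers are all hypothetical, not something prescribed by Scrum or taken from my teams.

```python
import math
from statistics import mean


def average_velocity(completed_points, window=3):
    """Average points completed over the most recent `window` milestones.

    `completed_points` is a list of relative-cost totals, one per finished
    milestone, oldest first. The window size is an illustrative assumption.
    """
    if not completed_points:
        raise ValueError("need at least one completed milestone")
    return mean(completed_points[-window:])


def milestones_needed(remaining_points, velocity):
    """Milestones needed to finish the remaining backlog, rounded up."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(remaining_points / velocity)


# Hypothetical history: points completed in four past milestones.
history = [18, 22, 26, 24]
velocity = average_velocity(history)          # mean of 22, 26, 24
print(velocity, milestones_needed(60, velocity))
```

A forecast like this is only as good as the costing behind it, so treat the result as a planning aid rather than a commitment.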

Another key factor is getting approval and agreement from the entire team, including upper management, so that each person not only agrees with the approach but also has a sense of ownership.


Tailoring the Process to Fit Your Team

Teams can tailor Scrum to fit the way that works best for them.  After having the context and knowledge from training and experience it can be easily changed to meet the team’s needs and desires.  For example, if a team prefers a detailed process with several reports (burn-down graphs, etc…) then they can do that.  However, if they prefer a very lean process with more focus on team members talking in person to resolve priorities and issues (this can work well for small teams) then they can implement that.


Improving Customer Focus

Improving customer focus is a key benefit of using Scrum. Demonstrations at the end of a milestone are a good way to show progress and gather feedback. For example, during Xbox One development my Xbox SmartGlass team would do demos to show progress and gather feedback from others in the larger Xbox organization.

For features that are ready to ship or to be adopted early there are other options for gathering feedback.  Pre-releases such as Alphas and Betas can be used to get feedback.  Also, regular releases to customers can be useful.  One example was for the Xbox One release. We leveraged Beta feedback to address issues and improve Xbox SmartGlass prior to the holiday release.  Also, for the third party developers that we work with we did regular monthly releases to meet their needs and gather feedback.

In addition, Scrum provides the flexibility to add work items, or stories, that were not planned and prioritize them before other work items.  An example of this was that we quickly developed and tested features and a demo for the E3 conference in 2012 related to work on the Xbox 360.  We had already planned other work for the milestones in question but were able to be flexible and re-prioritize.

Another advantage of this approach is that teams do not have to follow a predefined schedule for releases. Products can ship when they are ready and not at a prearranged date, if so desired. After seeing the results of its labor, the team comes to know how much work can be completed each milestone, and can better predict when releases can occur. So, essentially, a team could ship anytime that they and the customer deem appropriate.

I have worked on a few teams that have used Scrum, and a few that have claimed to use it.

I would like to emphasize the importance of training for Scrum.  The teams who have had the most success with it were the ones that had training and had ownership in the process.

Examples of teams where Scrum worked well were: Microsoft SharedView,, and currently Xbox SmartGlass. Sometimes it takes time for members of the team to adjust to the new process, and that is okay and likely should be expected. However, the process leads to better productivity, more predictability, better quality in terms of the customer-facing results, and better transparency of what is happening on the team through planning and daily standup meetings. It also contributes to a better sense of identity for the team.

Conversely, the teams that had mixed results were teams that claimed to use it but did not really follow the process.  In other words, they would say that they were using Scrum but still clung to some old ways of planning, developing, testing, releasing, etc…



In summary, Scrum is an effective methodology that enables shorter software release cycles, more opportunities for gathering and responding to customer feedback, and more avenues for innovation, all while maintaining excellent quality. It is even better when used with additional tools and processes, like those mentioned above.



Testing Is Like Going To The Doctor


By now, you’ve probably heard the phrase “move quality upstream.” The idea is that we want developers to take more ownership of validating the quality of the code they produce. There are a couple of common sense practices–most of which you’ve already heard of–that you need to adopt to start moving your org in this direction.

The first practice you need to adopt is unit testing. If your org doesn’t embrace unit testing, this is the first thing you need to go fix. Fortunately there’s been a lot written about unit testing and about how to get a dev org to adopt unit testing as a best practice. There’s even an IEEE standard on how to approach unit testing, if you’re into that kind of thing.

The next practice you need to adopt is test automation. Everything I write is going to assume that you believe test automation is generally a good thing and that you’ve got a standard test harness that you can use to exercise the code under test. Maybe you bought a test harness off the shelf. Maybe you’re using an open source test framework. Maybe you’re a special, unique snowflake and your org has a test harness that’s completely internal. Whatever. The point is, I’m going to assume that you believe in writing tests that are automated, repeatable and maintainable. I’m also going to assume you have some automated sanity tests, and probably an automated way to build and deploy code. If you haven’t got these yet, there are some great books and blogs out there that’ll help you get the job done.

Great. What’s Next?

OK. So you’ve adopted both unit testing and automated functional testing. You’ve got an automated build and an automated BVT system that tells you when the build is just plain broken. That’s great. The next thing you need to think about is how to move more of the functional testing of software into the hands of the dev that’s writing it.

It makes sense for dev to own at least some functional testing. If you think about it, every time we move code from a developer to a tester we’re introducing overhead. It’s like having to set up and tear down a stack during a function call, or like having to move from “on box” to “on rack” in a cloud computing environment. It’s a tiny cost that, repeated hundreds of times, adds up to a really big cost. The problem is that functional testing is a really big problem space. Are we asking dev to take on basic happy path testing? What about pairwise or stress testing? What about negative testing?

Yes. Yes, to all of it. But we’re going to do it one piece at a time. And when we’re done, we’re still going to have a ton of other stuff to do as quality assurance professionals. It’ll just be different stuff than what we do today.

The first thing you need to ask yourself is, “What is the first thing I do when dev hands me a piece of code to test?” The next thing you need to ask yourself is, “Does this really need to be done at all?” Then ask, “If this needs to be done, am I really the right person to do it?”

The first thing that you do when dev hands you a piece of code is going to be different depending on the kind of software you’re shipping. If you own a web service, you might validate that basic browsing and payment processing are working. If you own a game, you might make sure the menus load. The thing is, there’s probably something that you always do first when a dev tells you that a piece of code is done.

That thing, whatever it is–do you really need to do it? Is it finding bugs? What’s the risk to your product if you don’t find a bug at this stage? If a test doesn’t find bugs and it doesn’t mitigate risk, it’s probably not worth running. If a test always finds bugs, then we need to do something to improve quality because consistent failure is a sign that something’s systemically wrong.

Here’s a table to illustrate the point:

          | Always Finds Bugs                            | Rarely Finds Bugs
High Risk | We need more quality before we run this test | Somebody should run this test
Low Risk  | We need more quality before we run this test | Don’t run the test
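The table can also be read as a tiny decision function. This is just a sketch of the same four cells; the function name and the two boolean inputs are my own shorthand for illustration, and deciding whether an area counts as "high risk" is still a judgment call for your team.

```python
def triage_test(high_risk: bool, always_finds_bugs: bool) -> str:
    """Return the table's recommendation for a given test.

    Both inputs are judgment calls made by the team; this function only
    encodes the four cells of the table.
    """
    if always_finds_bugs:
        # Consistent failure points to a systemic quality problem,
        # regardless of how risky the area is.
        return "We need more quality before we run this test"
    if high_risk:
        return "Somebody should run this test"
    return "Don't run the test"


# Example: a test in a risky area that rarely finds anything.
print(triage_test(high_risk=True, always_finds_bugs=False))
```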

Is There A Doctor In The House?

A test in software is kind of like getting a test from the doctor. If the doctor finds that my blood pressure is high, I’m the one who gets on the treadmill to try to get my blood pressure down, not the doctor. If the tests find that something’s consistently wrong, the developer needs to do something to rectify the consistent failure: more code review, more unit testing–or maybe running the functional test themselves.

If the doctor thinks that it’s worthwhile to have my blood pressure monitored, I could come into the office to do it but that’ll get expensive and time consuming really fast. Or I could buy a blood pressure cuff and monitor my vitals myself, which will be much cheaper. The automated test that you’ve been running every time your dev hands you code? That test is your automatic, home-use blood pressure cuff.

The key here is that the home blood pressure monitor is automatic. I push a button, it squeezes my arm like an anaconda, and it spits out a reading. I don’t need a stethoscope. I don’t even need to know how to spell “systolic”. If somebody hadn’t invented this nifty little automated testing device, I wouldn’t be able to do this test by myself. But they did, and I can, and it saves a ton of time and money.

So if your test is always finding bugs or if the area is high risk, take your automated test and say to your dev partner, “Hey, I run this test every time you hand me code. If you ran it instead it would save us both some time.”

I’m assuming, of course, that your automated test runs pretty quickly, doesn’t generate false failures, and doesn’t require mastery of the Dark Arts to set up and execute–just like the automated blood pressure cuff. As long as you meet these conditions with your tests, the win for everybody is usually pretty obvious. It’s exactly the same as not driving to the doctor when you have a home blood pressure cuff–don’t go to the tester when you’ve got a quick, easy way of validating quality yourself.
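If you want a rough way to check the first two conditions before handing a test over, something like the sketch below might help. It assumes the test is a plain Python callable that raises AssertionError on failure and that the code under test doesn’t change between runs; the function name, the run count, and the one-second limit are hypothetical. It can’t measure the third condition, ease of setup, so that one is still on you.

```python
import time


def ready_for_handoff(test, runs=5, max_seconds=1.0):
    """Run `test` repeatedly against unchanged code and report problems.

    Returns (ok, reasons). A test is flagged as flaky if identical runs
    disagree, and as slow if any single run exceeds `max_seconds`.
    """
    outcomes = []
    slowest = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        try:
            test()
            outcomes.append(True)
        except AssertionError:
            outcomes.append(False)
        slowest = max(slowest, time.perf_counter() - start)

    reasons = []
    if len(set(outcomes)) > 1:
        reasons.append("flaky: results differ between identical runs")
    if slowest > max_seconds:
        reasons.append(f"slow: slowest run took {slowest:.2f}s")
    return (not reasons, reasons)


# Example with a trivial, always-passing test.
print(ready_for_handoff(lambda: None))
```

A test that fails every time is not flagged here, since consistent failure is a quality signal rather than a problem with the test itself.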

Instagram for Mongo

When Instagram finally released an app for Windows Phone, a year after its release on iPhone and Android, it was missing perhaps its most important feature: the ability to take a picture within the app. Instagram explained they wanted to get the app to users as quickly as possible, and although a few features were missing, they assured everyone these features would come in a future release. Welcome to the new paradigm of shipping software to the cloud.

When I started in Test, we shipped our products on DVD. If a feature was missing or broken, the customer either had to wait for a new DVD (which might take a year or three) or download and install a service pack. Today, our software is a service (SaaS). We ship to the cloud, and all customers are upgraded at once. For the Instagram app, the software is upgraded in the cloud, customers are automatically notified, and they can upgrade their phone with a single click.

Mongo likes Instagram.


In both cases, updating software is easier than ever. This allows companies to get their products to market quicker than ever. There’s no longer a need to develop an entire product before shipping. You can develop a small feature set, possibly with some known (but benign) bugs, and then iterate, adding new scenarios and fixing bugs.

In How Google Tests Software, James Whittaker explains that Google rarely ships a large set of features at once. Instead, they build the core of the product and release it the moment it’s useful to as many customers as possible. Whittaker calls this the “minimal useful product”. Then they iterate on this first version. Sometimes Google ships these early versions with a Beta tag, as they did with Android and Gmail, which kept its Beta tag for 4 years.

When I first read Whittaker’s book, I had a hard time accepting you should release what’s essentially an unfinished product. But I hadn’t worked on a cloud service yet, either. Now that I’ve done so for a few years, I’m convinced this is the right approach. This process can benefit everyone affected by the product, from your customers, to your management, to your (wait for it) self.


For your customers:

  1. By shipping your core scenarios as soon as they’re ready, customers can take advantage of them without waiting for less important scenarios to be developed.

    I worked on a project where the highest priority was to expose a set of data in our UI. We completed this in about two weeks. We knew, however, some customers needed to edit this data, and not just see it. So we decided to ship only after the read/write functionality was complete. Competing priorities and unexpected complications made this project take longer than expected. As a result, customers didn’t get their highest-priority scenario, the ability to simply view the data, for 2 months, rather than 2 weeks.

  2. When a product is large and complex, scenarios can go untested due to either oversight or test debt. With a smaller feature set, it’s less likely you’ll overlook scenarios. And with this approach, there shouldn’t be any test debt. Remember, Whittaker is not saying to release a buggy product with a lot of features, and then iterate fixing the bugs. He’s saying to ship a high-quality product with a small feature set, and iterate by adding features.
  3. Many SaaS bugs are due to deployment and scale issues, rather than functional issues. Using an iterative approach, you’ll find and address these bugs quickly because you’re releasing features quickly. By the time you’re developing the last feature, you’ll hopefully have addressed all these issues.
  4. Similarly, you’ll be able to incorporate customer feedback into the product before the last feature has even been developed.
  5. Customers like to get updates! On my phone, I have some apps that are updated every few weeks, and others that are updated twice a year. Sure, it’s possible the apps modified twice a year are actually getting bigger updates than the ones updated often, but it sure doesn’t feel that way.


For your management:

  1. This approach gets new features to market quicker, giving your company a competitive advantage over, well, your competitors.
  2. By releasing smaller, high-quality features, fewer bugs will be found by customers. And bugs reported by customers tend to be highly visible to, and frowned upon by, management.


For yourself:

  1. If you work on a cloud product, there’s undoubtedly an on-call team supporting it. You’re probably part of it. By releasing smaller feature sets with fewer bugs, the on-call team will receive fewer customer escalations, and be woken up fewer times at night. You’re welcome.

“I see test as a role going away – and it can’t disappear fast enough”

In case you missed it, Alan Page dropped a bombshell in his last post of 2013 – and then immediately (and literally) got on a plane and left. Definitely worth a read.

The Evolution of Software Testing

As I sat down to eat my free Microsoft oatmeal this morning, I noticed how dusty my Ship-It plaque had become. I hadn’t given it much attention lately, but today it struck me how these awards illustrate the evolution of software testing.

Since I started at Microsoft seven years ago, I’ve earned nine Ship-It awards — plaques given to recognize your contribution to releasing a product. The first Ship-It I “earned” was in 2006 for Microsoft Antigen, an antivirus solution for Exchange.

I put “earned” in quotes because I only worked on Antigen a couple of days before it was released. A few days into my new job, I found myself on a gambling cruise — with open bar! — to celebrate Antigen’s release. I was also asked to sign a comically over-sized version of the product DVD box. Three years later I received another Ship-It — and signed another over-sized box — for Antigen’s successor, “Forefront Protection 2010 for Exchange Server”.

Fast-forward to 2011, when I received my plaque for Microsoft Office 365. This time there was no DVD box to sign — because the product wasn’t released on DVD. It’s a cloud-based productivity suite featuring email and Office.

This got me thinking. In 2006, when we shipped Antigen, all the features we wanted to include had to be fully developed, and mostly bug-free, by the day the DVD was cut. After all, it would be another three years before we cut a new one. And it would be a terrible experience for a customer to install our DVD, only to then have to download and install a service pack to address an issue.

By 2011, however, “shipping” meant something much different. There was no DVD to cut. The product was “released” to the cloud. When we want to update Office 365, we patch it in the cloud ourselves without troubling the customer; the change simply “lights up” for them.

This means products no longer have to include all features on day one. If a low-priority feature isn’t quite ready, we can weigh the impact of delaying the release to include the feature, versus shipping immediately and patching it later. The same holds true if/when bugs are found.

What does this all mean for Test? In the past, it was imperative to meet a strict set of release criteria before cutting a DVD. For example, no more code churn, test-pass rates above 95%, code coverage above 65%, etc. Now that we can patch bugs quicker than ever, do these standards still hold? Have our jobs become easier?

We should be so lucky.

In fact, our jobs have become harder. You still don’t want to release a buggy product; customers would flock to your competitors, regardless of how quickly you fix the bugs. And you certainly don’t want to ship a product with security issues that compromise your customers’ data.

Furthermore, it turns out delivering to the cloud isn’t “free.” It’s hard work! Patching a bug might mean updating thousands of servers and keeping product versions in sync across all of them.

Testers now have a new set of problems. How often should you release? Which bugs need to be fixed before shipping, and which ones can you fix after shipping? How do you know everything will work when it’s deployed to the cloud? If the deployment fails, how do you roll it back? If the deployment works, how do you know it continues to work (that servers aren’t crashing, slowing down, or being hacked)? Who gets notified when a feature stops working in the middle of the night? (Spoiler alert: you do.)

I hope to explore some of these issues in upcoming articles. Unfortunately, I’m not sure when I’ll have time to do that. I’m on-call 24×7 for two weeks in December.

