State-Transition Testing

One of our goals at Expert Testers is to discuss practical topics that can help every tester do their job better. To this end, my last two articles have been about Decision Table Testing and Being an Effective Spec Reviewer. Admittedly, neither of these topics breaks new ground. That doesn’t mean, however, that most testers have mastered these techniques. In fact, almost 50% of the respondents to our Decision Table poll said they’ve never used one.

Continuing the theme of discussing practical topics, let’s talk about State Transition Diagrams. State Transition Diagrams, or STDs as they’re affectionately called, are effective for documenting functionality and designing test cases. They should be in every tester’s bag of tricks, along with Decision Tables, Pair-Wise analysis, and acting annoyed at work to appear busy.

STDs show the state a system will move to, based on its current state and other inputs. These words, I understand, mean little until you’ve seen one in action, so let’s get to an example. Since I’m particularly busy (i.e., lazy) today, I’ll use a simple example I found on the web.

Below is a Hotel Reservation STD. Each rectangle, or node, represents the state of the reservation. Each arrow is a transition from one state to the next. The text above the line is the input–the event that caused the state to change. The text below the line is the output–the action the system performs in response to the event.
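The same four ingredients (state, input, output, next state) map naturally onto a lookup table. Here is a minimal Python sketch, assuming a simplified reservation flow; the state, event, and action names are invented for illustration and are not copied from the diagram.

```python
# Hypothetical reservation states, events, and actions, invented for
# illustration; they are not taken from the diagram.
TRANSITIONS = {
    # (current state, input event): (next state, output action)
    ("Start", "request_room"): ("Made", "start_payment_timer"),
    ("Made", "pay"): ("Paid", "send_confirmation"),
    ("Made", "timer_expires"): ("Cancelled", "release_room"),
    ("Paid", "check_in"): ("Used", "issue_key"),
    ("Paid", "cancel"): ("Cancelled", "refund"),
}


def step(state, event):
    """Return (next_state, action) for an event, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state!r}") from None


def run(events, state="Start"):
    """Feed a sequence of events through the machine, collecting the actions."""
    actions = []
    for event in events:
        state, action = step(state, event)
        actions.append(action)
    return state, actions
```

Calling run(["request_room", "pay", "check_in"]) walks one path through this made-up machine, and any event that has no arrow from the current state raises an error, which is itself a useful negative test.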

[Figure: Hotel Reservation state transition diagram]

Pasted from <http://users.csc.calpoly.edu/~jdalbey/SWE/Design/STDexamples.html>

One benefit of State Transition Diagrams is that they describe the behavior of the system in a complete, yet easy-to-read and compact, way. Imagine describing this functionality in sentence format; it would take pages of text to describe it fully. STDs are much simpler to read and understand. For this reason, they can show paths that were missed by PM or Developer, or paths the Tester forgot to test.

I learned this when I was testing Microsoft Forefront Protection for Exchange Server, a product that protects email customers from malware and spam. The product logic for determining when a message would be scanned was complicated; it depended on the server role, several Forefront settings, and whether the message was previously scanned.

The feature spec described this logic in sentence format, and was nearly impossible to follow. I took it upon myself to create a State Transition Diagram to model the logic. I printed it out and stuck it on my office (i.e., cubicle) wall. Not a week went by without a Dev, Tester, or PM stopping by to figure out why their mail wasn’t being scanned as they expected.

If you read my article on Decision Tables (DTs), and I’m sure you didn’t, you may be wondering when to use an STD and when to use a DT. If you’re working on a system where the order of events matters, then use an STD; Decision Tables only work if the order of events doesn’t matter.

Another benefit of STDs is that we can use them to design our test cases. To test a system completely, you’d need to cover all possible paths in the STD. This is often either impractical or impossible.

In our simple example, there are only four paths from start of the STD to the end, but in larger systems there can be too many to cover in a reasonable amount of time. For these systems, you can use multiple STDs for sub-systems rather than trying to create a single STD for the entire system. This will make the STDs easier to read, but will not lower the total number of paths. It’s also common to find loops in an STD, resulting in an infinite number of possible paths.

When covering all paths is impractical, one alternative is to ensure each state (node) is covered by at least one test. This, however, would result in weak coverage. For our hotel booking system, we could test all seven states while leaving some transitions and events completely untested.

Often, the best strategy is to create tests that cover all transitions (the arrows) at least once. This guarantees you will test every state, event, action, and transition. It gives you good coverage with a reasonable number of tests.
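All-transitions coverage can be derived mechanically from the diagram. The sketch below is one simple way to do it, not the only one: for each uncovered arrow, it finds the shortest route from the start state to that arrow and records the events along the way. The state machine is a hypothetical one made up for this example, and the approach assumes every state is reachable from the start.

```python
from collections import deque

# Hypothetical state machine for illustration: (state, event) -> next state.
TRANSITIONS = {
    ("Start", "request_room"): "Made",
    ("Made", "pay"): "Paid",
    ("Made", "timer_expires"): "Cancelled",
    ("Paid", "check_in"): "Used",
    ("Paid", "cancel"): "Cancelled",
}


def shortest_path(start, target):
    """Breadth-first search for the shortest list of (state, event) steps."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == target:
            return path
        for (src, event), dst in TRANSITIONS.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [(src, event)]))
    raise ValueError(f"{target!r} is unreachable from {start!r}")


def all_transition_tests(start="Start"):
    """Build event sequences that together exercise every transition once."""
    covered, tests = set(), []
    # Handle the deepest transitions first so shorter prefixes come for free.
    pending = sorted(TRANSITIONS, key=lambda t: -len(shortest_path(start, t[0])))
    for transition in pending:
        if transition in covered:
            continue
        steps = shortest_path(start, transition[0]) + [transition]
        covered.update(steps)
        tests.append([event for _, event in steps])
    return tests
```

For this small machine the function produces three event sequences that hit all five arrows. It does not promise the smallest possible test set, only that no transition is left out.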

If you’re interested in learning more about STDs (it’s impossible to cover them fully in a short blog article) I highly recommend reading A Practitioner’s Guide to Software Test Design. It’s where I first learned about them.

The next time you’re having trouble describing a feature or designing your tests, give a State Transition Diagram or Decision Table a try. The DTs and STDs never felt so good!

 


Test In Production – What is it all about?

I would like to share my thoughts about test-in-production (a.k.a. TiP). The term has become a buzzword in the tester’s wonderland as the industry moves toward providing solutions in the cloud. I plan to address it through four easy questions: what, why, when, and how.

I like to explain this with an analogy: the boxed product of the olden days versus the cloud services of today. When software shipped as a boxed product or a downloadable executable, testing was, in a way, much simpler. Boxed products have well-defined system requirements: operating system (type, version), supported locales, disk space, RAM, yada yada yada. So when testers define the test plan, it’s self-contained within the boundaries defined by the product. The end user who buys the product decides which hardware to install it on. It’s a balanced equation, I guess: what’s tested matches what’s installed, and the product works as expected as long as the end user chooses hardware that meets the system requirements.

With the evolution of today’s cloud-oriented solutions, customers want solutions that optimize cost (which, in my opinion, is one of the reasons the cloud is evolving). The company providing the software service decides on the hardware to suit its scale and performance needs. In reality, not all software is custom-made for particular hardware, so many hardware-related variables come into play when testing software services in the cloud. For example, when you host a solution used by hundreds of thousands of users, you can expect tens or hundreds of servers in the data center.

Software that was once tested on one machine or a handful of machines (depending on the architecture you are testing) now becomes a huge network tied to Service Level Agreements (SLAs) covering performance, latency, scale, fault tolerance, security, blah blah blah. Although it is very much possible to simulate a data center setup within your corporate environment, there will likely be many differences from the actual setup in the data center. These may include, but are not limited to, load balancers, Active Directory credentials, security policies applied to the hosts, domain controller configurations specific to your hosting setup, and storage access credentials; and these are just the tip of the iceberg.

What

So what is TiP? My definition of TiP is an end-to-end customer test scenario that you author with the required input parameters and schedule to run continuously, at a predefined interval, against the software endpoints of the hosted service. This validates the functionality and component integration, and provides a binary result: Pass or Fail. There are at least two types of TiP tests you can author: Outside-In (OI) and Inside-Out (IO).

Outside-In (OI): These tests run from outside your production environment, targeting your software endpoints.

Inside-Out (IO): These tests run from within your data center, targeting the different roles you may have to ensure they are all functioning properly.
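To make the idea concrete, here is a minimal Python sketch of an Outside-In probe, using only the standard library. The function names (http_status, probe, monitor) and the choice of HTTP 200 as the pass condition are assumptions for illustration; a real TiP scenario would exercise a full customer workflow rather than a single request.

```python
import time
import urllib.request


def http_status(url, timeout=10):
    """Fetch a URL and return its HTTP status code (raises on network errors)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def probe(url, fetch=http_status):
    """Run one end-to-end check and reduce it to a binary Pass/Fail result."""
    try:
        return "Pass" if fetch(url) == 200 else "Fail"
    except Exception:
        # Any timeout, DNS failure, or HTTP error counts as a failed probe.
        return "Fail"


def monitor(url, interval_seconds, runs, fetch=http_status, sleep=time.sleep):
    """Repeat the probe at a fixed interval and collect the results."""
    results = []
    for _ in range(runs):
        results.append(probe(url, fetch))
        sleep(interval_seconds)
    return results
```

The fetch and sleep parameters exist so the probe logic can be exercised without touching a live endpoint. In production you would leave the defaults in place and feed the results into whatever monitoring and escalation system your service already uses.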

Why

TiP enables you to proactively find issues before you hear about them from a customer. Since the tests run against your live site, appropriate monitoring should be built into the architecture so that failures from these critical tests are escalated and acted on. TiP is a valuable asset for validating your deployment and the plumbing between the different software roles* in your architecture. It plays a critical role during service deployment or upgrade, because it runs end-to-end tests on the production systems before they go live to take real-world traffic. Automated TiP scenario tests can also save testers from manually validating functionality in the production system.

When

TiP should run all the time, for as long as you keep your software service alive.

How

I’m not going to go into design details here; this is a high-level thought. From your test plan, identify a few critical test paths that cover both happy-path and negative test cases. Give priority to the test cases that cover the most code paths and components. For example, if your service has replication, SQL transactions, a flush policy, etc., encapsulate all of this into a single test case and try to automate the complex path. This will help ensure that the whole pipeline in your architecture is working as expected. There are no right or wrong tools for this. From batch files and shell scripts to C# and Ruby on Rails, it’s up to you to find the tool set and language appropriate for the task.

*role – An installation or instance of the operating system serving a specific capability. For example, an authentication system could be one instance of the OS in your deployment whose functionality is just to authenticate all the traffic to access your service.

Being an Effective Spec Reviewer

The first time testers can impact a feature is often during functional and design reviews. This is also when we make our first impression on our co-workers. If you want to make a great initial impact on both your product and peers, you have to be an effective reviewer.

In my seven years in testing, I’ve noticed that many testers don’t take advantage of this opportunity. Most of us fall into one of four categories:

  1. Testers who pay attention during reviews without providing feedback. This used to be me. I learned the feature, which is one of the goals of a review meeting. A more important goal, however, is to give feedback that exposes defects as early as possible.
  2. Testers who push back (argue) at seemingly every minor point. Their goal is to increase their visibility and prove their worth as much as it is to improve the product. They learn the feature and can give valuable feedback. However, while they think they’re impressing their teammates, they’re actually frustrating them.
  3. Testers who attend reviews with their laptops open, not paying attention. If this is you, please stop; no one’s impressed with how busy you’re pretending to be.
  4. Testers who pay attention and learn the feature, while also providing constructive feedback. Not only do they understand and improve the feature, but they look good doing it. This can be you!

How do you do this, you ask? With this simple recipe that only took me four years to learn, I answer.

1. Read the Spec

Before attending any functional or design review, make sure you read the documentation. This is common sense, but most of us are so busy we don’t have time to read the specs. Instead, we use the review itself to learn the feature.

This was always a problem for me because although I learned the feature during the review, I didn’t have enough time to absorb the details and give valuable feedback during the meeting. It was only afterwards when I understood the changes we needed to make. By then it was too late–decisions had already been made and it was hard to change people’s minds. Or coding had begun, and accepting changes meant wasted development time.

A great idea Bruce Cronquist suggested is to block out the half hour before the review meeting to read the spec. Put this time on your calendar to make sure you don’t get interrupted.

2. Commit to Contributing

Come to every review with the goal of contributing at least one idea. Once I committed to this, I immediately made a bigger impact on both my product and peers. This strategy works for two reasons.

First, it forces you to pay closer attention than you normally might have. If you know you’ll be speaking during the meeting, you will pay closer attention.

Second, it forces you to speak up about ideas you might otherwise have kept to yourself. I used to keep quiet in reviews if I wasn’t 100% sure I was right. Even if I was almost positive, I would still investigate further after the meeting. The result was that someone else would often mention the idea first.

It took four years for me to realize this is an effective tool.

3. Have an Agenda

It’s easy to say you’ll give a good idea during every review, but how can you make sure you’ll always have a good idea to give? For me, the answer was a simple checklist.

The first review checklist I made was to make sure features are testable. Not only are testers uniquely qualified to enforce testability, but if we don’t do it no one will. Bringing up testability concerns as early as possible will also make your job of testing the feature later on much easier. My worksheet listed the key tenets of testability, had a checklist of items for each tenet, and room for notes.

At the time, I thought the concept of a review checklist was revolutionary. So much so, in fact, that I emailed Alan Page about it no less than five times. I’m now sure Alan must have thought I was some kind of stalker or mental patient. However, he was very encouraging and was kind enough to give the checklist a nice review on Toolbox–a Microsoft internal engineering website. If you work at Microsoft, you can download my testability checklist here.

I now know that not only are checklists the exact opposite of revolutionary, but there are plenty of other qualities to look for than just testability.

Test is the one discipline that knows about most (or all) of the product features.  It’s easy for us to find and identify inconsistencies between specs, such as when one PM says the product should do X, while another PM says it should do Y. It’s also our job to be a customer advocate. And we need to enforce software qualities such as performance, security, and usability. So I decided to expand my checklist.

My new checklist includes 50 attributes to look for in functional and design reviews. It’s in Excel format, so you can easily filter the items based on Review Type (Feature, Design, or Test) and Subtype (Testability, Usability, Performance, Security, etc.)

Review Checklist

Click this image to download the Review Checklist.

If there are any other items you would like added to the checklist, please list them in the comments section below. Enjoy!

Exploratory Testing == Fun Productivity

You are probably familiar with the testing approaches of black box, white box, and gray box testing.  Each “tool” in the tester’s tool belt can be used in the right circumstances, or misused in the wrong circumstances.  Exploratory Testing (ET) can be used in almost all circumstances, and whether done formally or informally, it is a tool we shouldn’t be afraid to use.

Exploratory testing (ET) is something you probably already do. It is more than just “clicking around” the product.  ET is defined as a test-execution approach where the tester uses information gained while performing tests to intuitively derive additional tests. You can think of it as that little voice in the back of your head telling you “Did I just see something that looked wrong? I better check that out more deeply.” This is subtly different from black-box (BB) testing where you apply tools like Boundary Value Analysis (BVA) and Equivalence Class (EQ) to first develop a list of tests, and second run those tests. It also differs from gray-box (GB) testing where you first use internal knowledge of the structure of the feature and code to develop a list of tests, and second run those tests. You can think of ET as BB and GB testing with a feedback loop—you do test design and test execution at the same time. You are free to explore other avenues of the product in order to track down bugs and issues.

Exploratory testing provides value to the testing effort. It is generally good at evaluating the “look and feel” of a project, but several studies raise important questions about the overall effectiveness and efficiency of behavioral testing and popular exploratory testing approaches to software testing. The details of the studies can be found in chapter six of How We Test Software at Microsoft.

ET can be explained with an analogy (from James Bach’s “Exploratory Testing Explained”):

Have you ever solved a jigsaw puzzle? If so, you have practiced exploratory testing. Consider what happens in the process. You pick up a piece and scan the jumble of unconnected pieces for one that goes with it. Each glance at a new piece is a test case (“Does this piece connect to that piece? No? How about if I turn it around? Well, it almost fits but now the picture doesn’t match…”). You may choose to perform your jigsaw testing process more rigorously, perhaps by concentrating on border pieces first, or on certain shapes, or on some attribute of the picture on the cover of the box. Still, can you imagine what it would be like to design and document all your jigsaw “test cases” before you began to assemble the puzzle, or before you knew anything about the kind of picture formed by the puzzle?

When I solve a jigsaw puzzle, I change how I work as I learn about the puzzle and see the picture form. If I notice a big blotch of color, I might decide to collect all the pieces of that approximate color into one pile. If I notice some pieces with a particularly distinctive shape, I might collect those together. If I work on one kind of testing for a while, I might switch to another kind just to keep my mind fresh. If I find I’ve got a big enough block of pieces assembled, I might move it into the frame of the puzzle to find where it connects with everything else. Sometimes I feel like I’m too disorganized, and when that happens, I can step back, analyze the situation, and adopt a more specific plan of attack. Notice how the process flows, and how it remains continuously, each moment, under the control of the practitioner. Isn’t this very much like the way you would assemble a jigsaw, too? If so, then perhaps you would agree that it would be absurd for us to carefully document these thought processes in advance. Reducing this activity to one of following explicit instructions would only slow down our work.

This is a general lesson about puzzles: the puzzle changes the puzzling. The specifics of the puzzle, as they emerge through the process of solving that puzzle, affect our tactics for solving it. This truth is at the heart of any exploratory investigation, be it for testing, development, or even scientific research or detective work.

Key advantages of ET:

  • Exploratory testing is heavily influenced by the tester’s in-depth system and domain knowledge and experience. The more you know, the better you are at following the paths that are most likely to find bugs or issues.
  • Less preparation is needed.
  • Important bugs are quickly found.
  • ET tends to be more intellectually stimulating than execution of scripted tests.
  • Even if you come back and test the same area again, you are likely to perform your tests in a slightly different way (you aren’t following a script), so you are more likely to find more bugs.
  • ET is particularly suitable if requirements and specifications are incomplete, or if there is a lack of time.
  • ET can also be used to  validate that previous testing has found the most important defects.
  • ET is better than just testing.

Key drawbacks of ET:

  • You must manage your time wisely. You need to know when to stop pursuing one avenue and move on to another.
  • You can’t review cases in advance (and by that prevent errors in code and test cases).
  • It can be hard to reproduce tests later unless you are documenting everything you do.  One idea is to use a screen recorder whenever you are doing ET.
  • It can be difficult to know exactly which tests have been run. This can be partially alleviated if you are recording your test steps and creating automation, or if you are tracking code coverage.
  • You may end up testing paths that the user would never take. You can use customer data as a supplement to your ET so that you don’t spend time testing areas that don’t need to be tested.

Use ET when:

  • You need to provide feedback on a new product or feature.
  • You need to quickly learn a new product.
  • You have already been using scripts and seek to diversify the testing.
  • You want to find the single most important bug in the shortest time.
  • You want to check the work of another tester by doing a brief independent investigation.
  • You want to find and isolate a particular defect.
  • You want to determine the status of a particular risk, in order to evaluate the need of scripted tests in that area.
  • You are on a team practicing agile or Extreme Programming.

The last bullet deserves some context. Why would an agile team be interested in ET? Agile teams can suffer from groupthink. The team members spend all day working together, talking, coding, attending meetings, and so on. They tend to start thinking alike. While this helps the agile process, it can hinder testing. Why? Everyone starts to think about the product in the same way and use the product in the same way. Your scripted tests start following the same sequence as the developer’s code. ET can help break that groupthink, randomize the testing, and find issues that the customer would.

Are you a master or an amateur ET tester?

  • Test design: An exploratory tester is first and foremost a test designer. Anyone can design a test accidentally. The excellent exploratory tester is able to craft tests that systematically explore the product. This requires skills such as the ability to analyze a product, evaluate risk, use tools, and think critically, among others.
  • Careful observation: Excellent exploratory testers are more careful observers than novices, and for that matter, experienced scripted testers. The scripted tester will only observe what the script tells them to observe. The exploratory tester must watch for anything unusual or mysterious. Exploratory testers also must be careful to distinguish observation from inference, even under pressure, lest they allow preconceived assumptions to blind them to important tests or product behavior.
  • Critical thinking: Excellent exploratory testers are able to review and explain their logic, looking for errors in their own thinking. This is especially true when reporting the status of a session of exploratory tests investigating a defect.
  • Diverse ideas: Excellent exploratory testers produce more and better ideas than novices.  They may make use of heuristics to accomplish this. Heuristics are devices such as guidelines, generic checklists, mnemonics, or rules of thumb. The diversity of tester temperaments and backgrounds on a team can also be harnessed by savvy exploratory testers through the process of group brainstorming to produce better test ideas.
  • Rich resources: Excellent exploratory testers build a deep inventory of tools, information sources, test data, and friends to draw upon. While testing, they stay alert for opportunities to apply those resources to the testing at hand.

Exploratory testing can be valuable in specific situations and reveal certain categories of defects more readily than other approaches. The overall effectiveness of behavioral testing approaches is heavily influenced by the tester’s in-depth system and domain knowledge and experience. Of course, the effectiveness of any test method eventually plateaus or becomes less valuable and testers must employ different approaches to further investigate and evaluate the software under test (The Pesticide Paradox).

Decision Table Testing

Decision tables are an effective tool for describing complex functional requirements and designing test cases. Yet, although they’re far from a new concept, I’ve only seen a handful of functional specs and test plans that included one. The best way to illustrate their effectiveness is by regaling you with a tale of when one wasn’t used.

The year was 2009. Barack Obama was recently inaugurated as the 44th President of the United States, and the Black Eyed Peas’ hauntingly poetic Boom Boom Pow was topping the Billboard charts. I was the lead tester on a new security feature for our product.

Our software could be installed on six server types. Depending on the server type and whether it was a domain controller, eleven variables would be set. The project PM attempted to describe these setting combinations in paragraph format. The result was a spec that was impossible to follow.

I tried designing my test cases from the document, but I wasn’t convinced all possible configurations were covered. You know how poor programming can have “code smell”? Well, this had “spec smell”. I decided to capture the logic in a decision table.

According to the always-reliable Wikipedia, decision tables have been around since ancient Babylon. However, my uncle, Johnny “Dumplings”, who is equally reliable, insists they’ve only been around since Thursday. I suspect the true answer lies somewhere in between.

If you’re not familiar with decision tables, let’s go through a simple example. Assume your local baseball squadron offers free tickets to kids and discounted tickets to senior citizens. One game a year, free hats are given to all fans.

To represent this logic in a decision table, create a spreadsheet and list all inputs and expected results down the left side. The inputs in this case are the fan’s age and sex. The expected results are ticket price and hat color.

Each row of a decision table should contain the different possible values for a single variable. Each column, then, represents a different combination of input values along with their expected results. In this example, the first column represents the expected results for boys under 5 — “Free Admission” and “Blue Hat”. The last column shows that female senior citizens get $10 tickets and a pink hat.

                   Rule 1  Rule 2  Rule 3  Rule 4  Rule 5  Rule 6
Inputs
Age < 5            Y       Y       -       -       -       -
5 <= Age < 65      -       -       Y       Y       -       -
Age >= 65          -       -       -       -       Y       Y
Sex                M       F       M       F       M       F
Results
Free Admission     Y       Y       -       -       -       -
$10 Admission      -       -       -       -       Y       Y
$20 Admission      -       -       Y       Y       -       -
Blue Hat Giveaway  Y       -       Y       -       Y       -
Pink Hat Giveaway  -       Y       -       Y       -       Y
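A decision table translates almost directly into data that a test can loop over. The Python sketch below is one way to encode the six rules; the names RULES and expected_result are made up for this example, and the $20 price for the middle age band is inferred from the table, since the text only states the free and $10 cases explicitly.

```python
import math

# Each entry mirrors one column of the table:
# (minimum age, maximum age exclusive, sex) -> (admission, hat colour)
RULES = [
    ((0, 5, "M"), ("Free", "Blue")),          # Rule 1
    ((0, 5, "F"), ("Free", "Pink")),          # Rule 2
    ((5, 65, "M"), ("$20", "Blue")),          # Rule 3
    ((5, 65, "F"), ("$20", "Pink")),          # Rule 4
    ((65, math.inf, "M"), ("$10", "Blue")),   # Rule 5
    ((65, math.inf, "F"), ("$10", "Pink")),   # Rule 6
]


def expected_result(age, sex):
    """Look up the expected (admission, hat) pair for a fan."""
    for (low, high, rule_sex), result in RULES:
        if low <= age < high and sex == rule_sex:
            return result
    raise ValueError(f"no rule covers age={age!r}, sex={sex!r}")
```

Keeping the rules as data rather than nested if statements means the table in the spec and the table in the test code can be compared side by side, column by column.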

There are three major advantages to decision tables.

  1. Decision tables define expected results for all input combinations in an easy-to-read format. When included in your functional spec, decision tables help developers keep bugs out of the product from the beginning.
  2. Decision tables help us design our test cases. Every column in a decision table should be converted into at least one test case. The first column in this table defines a test for boys under 5. If an input can be a range of values, however, such as “5 <= Age < 65”, then we should create tests at the high and low ends of the range to validate our boundary conditions.
  3. Decision tables are effective for reporting test results. They can be used to clearly show management exactly what scenarios are working and not working so informed decisions can be made.
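The boundary-condition advice in the second point can be sketched in a few lines of Python. This assumes ages are whole numbers and ranges are half-open (low <= age < high), as in the table; the function names are hypothetical.

```python
def boundary_ages(low, high):
    """Ages at the low and high ends of the range low <= age < high."""
    return [low, high - 1]


def just_outside(low, high):
    """The nearest ages on either side that fall outside the range."""
    return [low - 1, high]
```

For the range 5 <= Age < 65 this yields 5 and 64 as the in-range boundary tests, with 4 and 65 as the neighbours that should fall under the adjacent rules.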

It’s important to note that decision tables only work if the order the conditions are evaluated in, and the order of the expected results, doesn’t matter. If order matters, use a state transition diagram instead. I’ll blabber about them in a future article.

Getting back to my story, I used the spec to fill in a decision table representing all possible server combinations and expected results. The question marks in the table clearly showed that results weren’t defined for several installation configurations. I brought this to the Program Manager’s attention, and we were able to fill in the blanks to lock down the requirements.

             Case 1  Case 2  Case 3  Case 4  Case 5  Case 6  Case 7  Case 8  Case 9  Case 10
Inputs
Server Type  Type 1  Type 1  Type 2  Type 2  Type 3  Type 3  Type 4  Type 4  Type 5  Type 6
DC           Y       N       Y       N       Y       N       Y       N       *       *
Results
Setting1     Y       Y       Y       Y       Y       ?       N       N       N       N
Setting2     Y       ?       Y       Y       Y       ?       N       N       N       N
Setting3     Y       Y       Y       Y       Y       ?       N       N       N       N
Setting4     Y       Y       Y       Y       Y       ?       N       ?       N       N
Setting5     Y       Y       Y       Y       Y       ?       N       N       N       N
Setting6     Y       Y       Y       Y       Y       ?       N       N       N       N
Setting7     ?       ?       ?       ?       ?       ?       ?       ?       ?       ?
Setting8     N       N       N       N       N       ?       Y       Y       N       N
Setting9     ?       N       N       N       N       ?       Y       N       N       N
Setting10    N       N       ?       Y       Y       ?       N       N       ?       N
Setting11    N       N       Y       Y       Y       ?       N       N       N       N
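Spotting the gaps by eye works for a table this size, but the same check is easy to automate. Here is a small Python sketch that lists every undefined cell; it uses three rows copied from the table above, and the names TABLE and undefined_cells are invented for this example.

```python
# Three rows from the table, one value per case (Case 1 to Case 10).
TABLE = {
    "Setting1": "Y Y Y Y Y ? N N N N",
    "Setting2": "Y ? Y Y Y ? N N N N",
    "Setting4": "Y Y Y Y Y ? N ? N N",
}


def undefined_cells(table):
    """Return (setting, case number) for every cell still marked '?'."""
    gaps = []
    for setting, row in table.items():
        for case, value in enumerate(row.split(), start=1):
            if value == "?":
                gaps.append((setting, case))
    return gaps
```

Running undefined_cells(TABLE) gives a ready-made list of questions to take to the Program Manager, one per missing requirement.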

This easy-to-understand table took the place of 2 full pages of spaghetti text in the functional spec. As a result, the developers had a comprehensive set of requirements to work off of, and I had designed a thorough set of test cases to validate their work. Boom Boom Pow, indeed!
