Equivalence Class Testing

When testing even a simple application, it’s often impossible, or at least impractical, to test with all possible inputs. This is true whether you’re testing manually or using automation; throwing computing power at the problem doesn’t usually solve it. The question is, how do you find the inputs that give the most coverage in the fewest tests?

Assume we have a UI with just two fields for creating a user account. User Name can be between 1 and 25 characters, but can’t contain a space. Sex can be either Male or Female.

To test all possible inputs, we would first test single-character names using all possible characters. Then test two-character names using all possible characters in the first position, along with all possible characters in the second position. This process would be repeated for all lengths between 1 and 25. We then need to repeat all these names for Males and Females. I’m no mathematician, but this works out to be close to a kagillion possible inputs.
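To put a rough number on "a kagillion", here is a minimal sketch of the count described above. The alphabet size of 95 (printable ASCII) is an assumption; the article does not define the character set, and a real UI would likely accept far more characters. The function name is illustrative only.

```python
def count_possible_inputs(alphabet_size: int, min_len: int, max_len: int, sex_values: int) -> int:
    """Count every (User Name, Sex) pair an exhaustive test would have to try."""
    # One term per name length: alphabet_size ** length distinct strings.
    names = sum(alphabet_size ** length for length in range(min_len, max_len + 1))
    # Every name is repeated once per Sex value.
    return names * sex_values


# Assumption: 95 printable ASCII characters, name lengths 1 through 25, two Sex values.
total = count_possible_inputs(alphabet_size=95, min_len=1, max_len=25, sex_values=2)
print(f"{total:.2e} possible inputs")
```

Under these assumptions the total is on the order of 10^49, and that is before counting empty names or names longer than 25 characters.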

If this simple UI can have so many inputs, then it’s clear we’ll usually have to test with a very small subset of all possible data. Equivalence classes (ECs) give us the guidance we need to select these subsets.

ECs are subsets of input data, where testing any one value in the set should be the same as testing the others. Here we have two equivalence classes of valid input data.

#   Field       Description
1   User Name   Length >=1 and <=25, not containing a space
2   Sex         Male or Female

Since we have more than one EC of valid data, we can create a single test that randomly picks one value from each. After all, testing with A/Male should give us the same result as testing with AB/Female or JKHFGKJkjgjkhg/Male. In all three cases, a new user account should be created.
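As a minimal sketch, picking one random value from each valid EC might look like this. The character set (ASCII letters and digits) and the function names are assumptions for illustration, not part of the product.

```python
import random
import string

# Assumption: names are drawn from ASCII letters and digits.
NAME_CHARS = string.ascii_letters + string.digits


def random_valid_name(rng: random.Random) -> str:
    """EC 1: length 1 through 25, no space."""
    length = rng.randint(1, 25)  # randint includes both endpoints
    return "".join(rng.choice(NAME_CHARS) for _ in range(length))


def random_valid_sex(rng: random.Random) -> str:
    """EC 2: Male or Female."""
    return rng.choice(["Male", "Female"])


rng = random.Random()
name, sex = random_valid_name(rng), random_valid_sex(rng)
print(f"Testing with {name}/{sex}")
```

Each run produces a different name/Sex pair, and every pair should result in a new user account.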

It’s also important to create classes of invalid input. If you have an EC of valid data in the range of 1-10, you should also have an EC of invalid values less than 1 and an EC of invalid values greater than 10. We can create five equivalence classes of invalid input for our sample app.

#   Field       Description
3   User Name   Empty
4   User Name   Length > 25 (no spaces)
5   User Name   Length > 25 (containing a space)
6   User Name   Length >=1 and <=25, containing a space
7   Sex         Neither Male nor Female
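The invalid classes can be turned into value generators in the same way. This is a sketch under assumptions: the character set, the cap of 100 characters for "too long" names, and all names below are hypothetical. EC 6 draws lengths from 2 to 25 here so the name is never just a single space.

```python
import random
import string

NAME_CHARS = string.ascii_letters + string.digits  # assumed character set
MAX_LEN = 25
TOO_LONG_CAP = 100  # hypothetical upper bound for "too long" names


def _chars(rng, n):
    """Return n random characters, none of them a space."""
    return "".join(rng.choice(NAME_CHARS) for _ in range(n))


def _with_space(rng, length):
    """Return a string of exactly `length` characters containing one space."""
    pos = rng.randrange(length)
    return _chars(rng, pos) + " " + _chars(rng, length - pos - 1)


def _invalid_sex(rng):
    """Return a random string that is neither Male nor Female."""
    while True:
        value = _chars(rng, rng.randint(1, 10))
        if value not in ("Male", "Female"):
            return value


# One generator per invalid EC, keyed by the EC number used in the table above.
INVALID_ECS = {
    3: lambda rng: "",
    4: lambda rng: _chars(rng, rng.randint(MAX_LEN + 1, TOO_LONG_CAP)),
    5: lambda rng: _with_space(rng, rng.randint(MAX_LEN + 1, TOO_LONG_CAP)),
    6: lambda rng: _with_space(rng, rng.randint(2, MAX_LEN)),
    7: _invalid_sex,
}

rng = random.Random()
for ec_number, generate in INVALID_ECS.items():
    print(ec_number, repr(generate(rng)))
```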

When designing your tests, it’s important that each test case contains at most one value from an invalid EC; using more than one can lead to missed product bugs. Suppose you choose a test with both an invalid User Name and an invalid Sex. The product may throw an error when it detects the invalid User Name, and never even get to the logic that validates the Sex. If there are bugs in the Sex-validation logic, your test won’t catch them.
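The masking effect is easy to reproduce. The validator below is entirely hypothetical and contains a deliberate bug in its Sex check; it only illustrates how a test with two invalid values can pass while the bug goes unnoticed.

```python
def create_account(name: str, sex: str) -> str:
    """Hypothetical validator that stops at the first failing check."""
    if not (1 <= len(name) <= 25) or " " in name:
        return "error: invalid User Name"
    # Deliberate bug: only an empty Sex is rejected, so "Blah" slips through.
    if sex == "":
        return "error: invalid Sex"
    return "created"


# Two invalid values: the name check fires first, so the test sees an error
# and passes, even though the Sex check was never reached.
print(create_account("John Smith", "Blah"))

# One invalid value: the Sex check is reached and the bug is exposed.
print(create_account("John", "Blah"))
```

The first call reports an invalid User Name; the second wrongly returns "created", which is the failure a one-invalid-value test would catch.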

Using our ECs, we’d create the following tests.

  1. Valid name from #1 and valid Sex from #2
  2. Invalid name from #3 and valid Sex from #2
  3. Invalid name from #4 and valid Sex from #2
  4. Invalid name from #5 and valid Sex from #2
  5. Invalid name from #6 and valid Sex from #2
  6. Valid name from #1 and invalid Sex from #7
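The six tests above can also be written down as data, which makes it possible to check the plan itself: no test uses more than one invalid EC, and every EC is used at least once. This is a sketch; the names and the expected-result strings are illustrative.

```python
VALID_ECS = {1, 2}
INVALID_ECS = {3, 4, 5, 6, 7}

# (test number, User Name EC, Sex EC), copied from the list above.
TEST_PLAN = [
    (1, 1, 2),
    (2, 3, 2),
    (3, 4, 2),
    (4, 5, 2),
    (5, 6, 2),
    (6, 1, 7),
]


def invalid_count(name_ec: int, sex_ec: int) -> int:
    """Number of invalid ECs used by one test case."""
    return sum(ec in INVALID_ECS for ec in (name_ec, sex_ec))


def expected_result(name_ec: int, sex_ec: int) -> str:
    """An account is created only when both values come from valid ECs."""
    return "account created" if invalid_count(name_ec, sex_ec) == 0 else "error"


covered = set()
for number, name_ec, sex_ec in TEST_PLAN:
    # Rule from the article: at most one invalid EC per test case.
    assert invalid_count(name_ec, sex_ec) <= 1
    covered.update((name_ec, sex_ec))
    print(number, expected_result(name_ec, sex_ec))

# Every EC, valid or invalid, appears in at least one test.
assert covered == VALID_ECS | INVALID_ECS
```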

Equivalence classes have reduced our number of tests from a kagillion to six. As Adam Sandler might say, “Not too shabby.”

Not only does EC testing make sure you don’t have too many tests, it also ensures you don’t have too few. Testers not familiar with equivalence classes might test this UI with just four pairs of hardcoded values:

  1. John/Male
  2. Jane/Female
  3. John Smith/Male
  4. Joan/Blah

When these tests are automated and running in your test pass, they validate the same values day after day. Using equivalence class testing with random values, we validate different input every day, increasing our odds of finding bugs.
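One practical detail the article does not cover, offered here as an assumption about how you might set this up: if the values are random, a failing run is only useful when you can reproduce it. A common approach is to log the seed and allow it to be replayed. The helper name is hypothetical.

```python
import random
import time


def make_rng(seed=None):
    """Create a random generator and report its seed so a failure can be replayed."""
    if seed is None:
        seed = int(time.time())  # a different seed on each run
    print(f"random seed: {seed}")
    return random.Random(seed), seed


# Normal run: new values every time.
rng, seed = make_rng()
first_run = [rng.randint(1, 25) for _ in range(5)]

# Replay: passing the logged seed reproduces exactly the same values.
replay_rng, _ = make_rng(seed)
replayed = [replay_rng.randint(1, 25) for _ in range(5)]
print(first_run == replayed)
```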

The day I wrote this article I found a UI bug that had been in the product for several months. The UI didn’t correctly handle Display Names of more than 64 characters. Sure enough, the UI tests didn’t use ECs, and tested with the same hardcoded values every day. If they had chosen a random value from an EC, the test would have eventually found this bug.

Another benefit of equivalence class testing is that it leads us to boundary value testing. When an equivalence class is a range, we should create tests at the high and low edges of the range, and just outside the range. This helps us find bugs in cases where the developer may have used a > instead of >=.
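Here is a minimal sketch of boundary values for the User Name length range, run against a hypothetical length check that contains exactly the > versus >= mistake described above. Both functions are illustrative, not product code.

```python
def boundary_values(low: int, high: int) -> list:
    """Values at each edge of the range and just outside it."""
    return [low - 1, low, high, high + 1]


def expected_valid(length: int) -> bool:
    """What the rule says: length 1 through 25 is valid."""
    return 1 <= length <= 25


def is_valid_length_buggy(length: int) -> bool:
    """Hypothetical product code with a deliberate bug: > where >= was intended."""
    return length > 1 and length <= 25


mismatches = [
    n for n in boundary_values(1, 25)
    if is_valid_length_buggy(n) != expected_valid(n)
]
print(mismatches)
```

Only the value 1 disagrees with the expected result, so a random pick between 1 and 25 would hit this bug on roughly one run in 25, while the boundary test finds it every time.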

Boundary tests, however, should not take the place of your EC tests. In our example, don’t hardcode your EC test to have a User Name length of 25. Your EC tests should choose a random length between 1 and 25. You should have separate boundary tests that validate length 25.

And don’t forget to create tests for values far outside the range. The size of your variable is a type of EC as well! Such values could expose other bugs, such as exceeding the max length of a string, or trying to shove a long number into an int.
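As a sketch of "far outside the range" values: Python integers do not overflow on their own, so the example below uses the standard `struct` module to check whether a number fits in a 32-bit signed int. The 32-bit size and the one-million-character name are assumptions chosen for illustration.

```python
import struct

INT32_MAX = 2**31 - 1


def fits_in_int32(value: int) -> bool:
    """Return True if the value can be packed into a 32-bit signed int."""
    try:
        struct.pack("<i", value)
        return True
    except struct.error:
        return False


# Candidate test values far outside the valid ranges.
far_outside_number = INT32_MAX + 1
far_outside_name = "a" * 1_000_000  # far beyond the 25-character limit

print(fits_in_int32(INT32_MAX), fits_in_int32(far_outside_number))
print(len(far_outside_name))
```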

When defining your ECs, the first place to look is the Feature Specification Document (or whatever your Agile equivalent may be). The other place is in the product code. This, however, can be risky. If the source code has a statement like if (x > 1 && x < 5), you would create your EC as 2, 3, 4, and your tests will pass. But how do you know the source code was correct? Maybe it was intended to be x >= 1 && x <= 5. That’s why you should always push for valid input rules to be defined in a spec.
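The risk can be shown in a few lines. Both functions are hypothetical: one mirrors the condition found in the code, the other the rule a spec might have intended.

```python
def in_range_as_coded(x: int) -> bool:
    """The condition as it appears in the product code."""
    return x > 1 and x < 5


def in_range_per_spec(x: int) -> bool:
    """The rule a spec might have intended (an assumption for this example)."""
    return x >= 1 and x <= 5


ec_from_code = [2, 3, 4]
ec_from_spec = [1, 2, 3, 4, 5]

# An EC derived from the code can only confirm what the code already does.
print(all(in_range_as_coded(x) for x in ec_from_code))

# An EC derived from the spec exposes the values the code wrongly rejects.
missed = [x for x in ec_from_spec if in_range_as_coded(x) != in_range_per_spec(x)]
print(missed)
```

The first check always passes, while the spec-derived EC flags 1 and 5 as values the code handles differently from the intended rule.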

Another technique for creating ECs is to break up the output into equivalence classes. Then choose inputs that give you an output from each class. In our example, assume the user account can be stored in one of two databases depending on whether the User Name contains Chinese characters. In this case, we would need one EC of valid Chinese User Names and one of valid non-Chinese User Names.
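A sketch of output-based ECs for this example follows. The database names, the routing function, and the use of the CJK Unified Ideographs block (U+4E00 to U+9FFF) as the definition of "Chinese characters" are all assumptions; a real product would define this in its spec.

```python
import random
import string

# Assumption: "Chinese characters" means the CJK Unified Ideographs block.
CJK_START, CJK_END = 0x4E00, 0x9FFF


def contains_chinese(name: str) -> bool:
    return any(CJK_START <= ord(ch) <= CJK_END for ch in name)


def target_database(name: str) -> str:
    """Hypothetical routing rule: one output class per database."""
    return "db_chinese" if contains_chinese(name) else "db_default"


def random_chinese_name(rng: random.Random) -> str:
    """Input EC that should produce the db_chinese output."""
    length = rng.randint(1, 25)
    return "".join(chr(rng.randint(CJK_START, CJK_END)) for _ in range(length))


def random_non_chinese_name(rng: random.Random) -> str:
    """Input EC that should produce the db_default output."""
    length = rng.randint(1, 25)
    return "".join(rng.choice(string.ascii_letters) for _ in range(length))


rng = random.Random()
for name in (random_chinese_name(rng), random_non_chinese_name(rng)):
    print(name, "->", target_database(name))
```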

PS – Not only is this article a (hopefully) interesting lesson on equivalence classes, it’s also an interesting experiment for the Expert Testers blog. In this article, I used the word Sex 15 times. I’m curious if it gets more traffic than usual because of it. If it does, you can look forward to my next article, “The Naked Truth – Pairwise Testing Exposed!”

6 Responses

  1. Doesn’t EC 5 violate your rule of one invalid value per test case? The name field here would violate two rules- excess length and containing an illegal character (space). EC 5 would be redundant if you have EC4 and EC6.

    This brings up the difference between branch/decision coverage and condition coverage. The definition of a valid input for a particular field may consist of more than one (boolean) condition: here, length between 1 and 25, and containing no spaces. For the name field, do we want to have just two tests, which conform to and violate the overall definition of valid input, or more tests, which pass and fail each of the two conditions within the definition?

    Also, if a field has only two valid input values (male and female), testing to see if both valid values are accepted should not be avoided.
    If the sex is entered via text field rather than pulldown or radio button, there should be tests for capital, lowercase and mixed case spellings of male and female.

    • Fred, thanks for the feedback.

      EC 5 doesn’t violate the rule of one invalid value per test case. What I was trying to say (but didn’t do it very well!) is that a test case should contain at most one value from an invalid EC. All 6 tests listed above conform to this rule. I’ll modify the wording in the article to be more clear.

      You’re correct about testing mixed case spellings of male and female. Just keep in mind that Equivalence Class testing is only one of many techniques for designing test cases. In addition to your EC tests, you should also write tests for any special values that are likely to cause issues.

  2. Hi,
    you said: “Using equivalence class testing with random values, we validate different input every day, increasing our odds of finding bugs.”
    Is it possible to use a generator of passwords (for instance) inside an automated test suite? Or did you refer to manual testing?
    Thanks,
    Reut

  3. […] Equivalence Class Testing This week we turn to equivalence class testing. Equivalence class testing is a black box software testing technique that divides function variable ranges into classes/subsets that are disjoint. It is beneficial for two cases: When exhaustive testing is required. When there is a strong need to avoid redundancy. Equivalence class testing selects test cases one element from each equivalence class. This helps to reduce redundancy. There are 4 types of equivalence class testing: Weak normal equivalence class testing: The term weak refers to single fault assumption. This is accomplished by using one variable from each equivalence class in the test case. Strong normal equivalence class testing: The term strong refers to multiple fault assumption. This method uses test cases from each element of the cartesian product of the equivalence classes. The cartesian product guarantees the notion of completeness in that all of the equivalent classes are covered and there is one of each possible combination of inputs. Weak robust equivalence class testing: This term combines weak with robust, meaning single fault assumption with invalid values. Strong robust equivalence class testing: This term combines strong with robust, meaning multiple fault assumption with invalid values. The test cases are the cartesian product of all the equivalence classes. This article talks about redundancy in equivalence class testing. It introduces the following problem: Suppose a there is a UI for creating user accounts. The Username is between 1 and 25 characters, spaces are not allowed. The gender can be Male or Female. It would be impossible to test all combinations of inputs instead we turn to equivalence class testing. […]

  4. I have been doing this for 18 years; I pulled it from training neural networks. Funny to see it given a name.
