Testers Caught Sleeping on the Job

At Microsoft, we submit every code change to a peer review before it’s checked in. I’ve performed hundreds (although it feels like millions) of code reviews in the six years I’ve been in Test. One of my biggest red flags during a code review is the Sleep statement.

The Sleep command suspends the current thread for a specified period. A typical test that uses Sleep might look like this:

  // Send a message

  // Wait 2 minutes for message to be received.
  Thread.Sleep(120000);

  // Message must be there. Let's read it.

The problem with Sleep statements is that they usually sleep either too long or not long enough. If the sleep is too short, your test will fail because the expected state hasn’t been reached. In the above example, if it takes three minutes to receive the message, the test will fail since it’ll try to read a message that isn’t there yet.

If the sleep is too long, your test might still fail; the expected state may have come and gone. But even if it passes, the test isn’t as efficient as it could be. This could lead to test passes that take a long time to complete. If the message is received in one minute, the test takes a minute longer than required. One minute might not sound like a lot, but if you have thirty similar tests, you’re wasting a half hour of run time.

When you find a long Sleep statement inside a loop, there could be room for a significant performance improvement. During one code review, I found a test that looked like the following; this one loop wasted more than a half hour of run time.

  // Send a bunch of messages
  for (int i=1; i<30; i++)
  {
     // Send one message

     // Wait 2 minutes for message to be received.
     Thread.Sleep(120000);

     // Read the message.
  }

Long Sleep statements are also a common cause of tests that fail intermittently. The examples we’ve been using assume we’ll receive every message within two minutes. Even if that’s the case 99% of the time, if the test runs every day it’ll still fail every few months. Furthermore, these tests aren’t very portable. If you run them on a faster or slower machine, or on a server with a different load, the tests might fail. Each of these false positives must then be investigated, wasting your semi-precious time.
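
To put rough numbers on this, assume each wait independently succeeds 99% of the time. The independence assumption and the `Flakiness` class name are mine, for illustration only. A short calculation shows how quickly the odds of at least one spurious failure grow:

```csharp
using System;

public static class Flakiness
{
    // Probability that at least one of 'attempts' independent waits fails,
    // given that each one succeeds with probability 'successRate'.
    public static double ChanceOfAnyFailure(double successRate, int attempts)
    {
        return 1.0 - Math.Pow(successRate, attempts);
    }
}
```

`Flakiness.ChanceOfAnyFailure(0.99, 1)` is 0.01, so a daily single-message test fails about once every 100 days on average. `Flakiness.ChanceOfAnyFailure(0.99, 30)` is roughly 0.26, so under these assumptions a single run that waits on thirty messages fails spuriously about one time in four.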

When you find a Sleep that lasts more than a few seconds, the best solution is to remove the Sleep and subscribe to an event that’s raised when the expected state occurs. This takes the guesswork out of your test case and ensures you wait neither too long nor too briefly. If no such event exists, you may be able to write one yourself or ask the product developer to create one.
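
As a minimal sketch of the event-based approach, suppose the product exposes a `MessageReceived` event. The `MessageQueue` class, its event, its `Deliver` method, and the `EventWait.SendAndWait` helper below are all hypothetical stand-ins, not a real API. The test subscribes before sending, then blocks on a `ManualResetEventSlim` until the event fires or a timeout expires:

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the product under test; not a real API.
public class MessageQueue
{
    // Raised when a message arrives.
    public event EventHandler MessageReceived;

    // Simulates the product delivering a message.
    public void Deliver()
    {
        MessageReceived?.Invoke(this, EventArgs.Empty);
    }
}

public static class EventWait
{
    // Blocks until the queue raises MessageReceived or the timeout expires.
    // 'send' runs only after subscribing, so a fast delivery cannot be missed.
    public static bool SendAndWait(MessageQueue queue, Action send, TimeSpan timeout)
    {
        using (var received = new ManualResetEventSlim(false))
        {
            EventHandler handler = (sender, args) => received.Set();
            queue.MessageReceived += handler;
            try
            {
                send();
                // Returns true as soon as the event fires, false on timeout.
                return received.Wait(timeout);
            }
            finally
            {
                queue.MessageReceived -= handler;
            }
        }
    }
}
```

A call such as `EventWait.SendAndWait(queue, () => queue.Deliver(), TimeSpan.FromMinutes(2))` returns as soon as the event is raised. The two-minute timeout only costs run time when the message never arrives.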

If an event-based solution is impractical, all hope is not lost. In that case, replace the long Sleep with logic that polls until either the expected state occurs or a timeout period is reached. In the test above, we saved thirty minutes of run time by replacing the two-minute Sleep with the following:

  private static bool PollForMessage()
  {
     DateTime timeout = DateTime.Now.AddMinutes(2);
     bool gotIt = MessageExists();

     // Loop until message received or two minutes have passed
     while (!gotIt && DateTime.Now < timeout)
     {
        // Wait one second between checks
        Thread.Sleep(1000);
        gotIt = MessageExists();
     }
     return gotIt;
  }

You may be wondering why I’m whining about Sleep statements when I went ahead and used one in this solution. That’s because the Sleep inside the polling loop lasts only one second, so this routine will sleep at most one second longer than it should, which isn’t too shabby. If you find the message isn’t always received within two minutes, you can safely go ahead and change the timeout to three minutes without wasting any run time.
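
The same pattern generalizes to any condition. The following is one possible sketch, not the helper from the original test: `Poll.Until` is a hypothetical name, and it uses `Stopwatch` rather than `DateTime.Now` on the assumption that the timeout should not be affected by system clock adjustments.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Poll
{
    // Hypothetical helper: re-checks 'condition' every 'interval' until it
    // returns true or 'timeout' elapses. Returns the result of the last check.
    public static bool Until(Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        while (true)
        {
            if (condition())
            {
                return true;
            }
            if (elapsed.Elapsed >= timeout)
            {
                return false;
            }
            // Short sleep between checks; the most we over-wait is one interval.
            Thread.Sleep(interval);
        }
    }
}
```

With this in place, the message test could call `Poll.Until(MessageExists, TimeSpan.FromMinutes(2), TimeSpan.FromSeconds(1))`, and raising the timeout only affects runs where the message never shows up.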

The next time you find yourself writing a long Sleep statement, replace it with either an event, or a function that polls until your expected state occurs. If you do this consistently, your test passes will finish faster with more consistent results.

P.S. After I wrote this article, I found that BJ Rollison recently wrote about the same topic, only much more eloquently, on his own blog. See for yourself.
