Testers Caught Sleeping on the Job

At Microsoft, we submit every code change for peer review before it’s checked in. I’ve performed hundreds (although it feels like millions) of code reviews in the six years I’ve been in Test. One of my biggest red flags during a code review is the Sleep statement.

The Sleep command suspends the current thread for a specified period. A typical test that uses Sleep might look like this:

  // Send a message 
  SendMessage();

  // Wait 2 minutes for message to be received.
  Thread.Sleep(120000); 

  // Message must be there. Let's read it.
  ReadMessage();

The problem with Sleep statements is that they usually sleep either too long or not long enough. If the sleep is too short, your test will fail because the expected state hasn’t been reached. In the above example, if it takes three minutes to receive the message, the test will fail since it’ll try to read a message that isn’t there yet.

If the sleep is too long, your test might still fail; the expected state may have come and gone. But even if it passes, the test isn’t as efficient as it could be, and test passes end up taking longer than they need to. If the message arrives in one minute, the test takes a minute longer than necessary. One minute might not sound like a lot, but if you have thirty similar tests, you’re wasting a half hour of run time.

When you find a long Sleep statement inside a loop, there could be room for a significant performance improvement. During one code review, I found a test that looked like the following; this one loop wasted more than a half hour of run time.

  // Send a bunch of messages 
  for (int i=1; i<30; i++) 
  { 
     // Send one message 
     SendMessage(i); 

     // Wait 2 minutes for message to be received. 
     Thread.Sleep(120000); 

     // Read the message.   
     ReadMessage(i); 
  }

Long Sleep statements are also a common cause of tests that fail intermittently. The examples we’ve been using assume we’ll receive every message within two minutes. Even if that’s the case 99% of the time, a test that runs every day will still fail about once every hundred runs on average, or every few months. Furthermore, these tests aren’t very portable. If you run them on a faster or slower machine, or on a server with a different load, the tests might fail. Each of these false positives must then be investigated, wasting your semi-precious time.

When you find a Sleep that lasts more than a few seconds, the best solution is to remove the Sleep and subscribe to an event that’s raised when the expected state occurs. This takes the guesswork out of your test case and ensures you wait neither too long nor not long enough. Such an event might not exist, in which case you may be able to write one yourself, or ask the product developer to create one.
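As a rough sketch of the event-based approach, the wait might look something like the code below. Everything named here is an assumption made for illustration: MessageClient, its MessageReceived event, MessageReceivedEventArgs, and the SendAndWait helper are hypothetical stand-ins, not part of any real product API. The event your product exposes, if it exposes one, will look different.

```csharp
using System;
using System.Threading;

// Hypothetical event payload carrying the id of the message that arrived.
public class MessageReceivedEventArgs : EventArgs
{
    public int MessageId { get; }
    public MessageReceivedEventArgs(int messageId) { MessageId = messageId; }
}

// Hypothetical stand-in for the product's messaging client.
public class MessageClient
{
    public event EventHandler<MessageReceivedEventArgs> MessageReceived;

    // The product would call this when a message arrives.
    public void RaiseMessageReceived(int messageId)
    {
        MessageReceived?.Invoke(this, new MessageReceivedEventArgs(messageId));
    }
}

public static class MessageWaiter
{
    // Runs `send`, then blocks until MessageReceived is raised for `messageId`
    // or `timeout` elapses. Returns true if the message arrived in time.
    public static bool SendAndWait(MessageClient client, int messageId,
                                   Action send, TimeSpan timeout)
    {
        using (var received = new ManualResetEventSlim(false))
        {
            EventHandler<MessageReceivedEventArgs> handler = (sender, e) =>
            {
                if (e.MessageId == messageId) received.Set();
            };

            // Subscribe before sending so a fast reply cannot be missed.
            client.MessageReceived += handler;
            try
            {
                send();
                // Returns as soon as the event fires; no fixed sleep involved.
                return received.Wait(timeout);
            }
            finally
            {
                client.MessageReceived -= handler;
            }
        }
    }
}
```

The timeout here is only a safety net for the failure case. When the message does arrive, the test moves on immediately instead of sleeping out the rest of the two minutes.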

If an event-based solution is impractical, all hope is not lost. In that case, replace the long Sleep with logic that polls until either the expected state occurs or a timeout period is reached. In the test above, we saved thirty minutes of run time by replacing the two-minute Sleep with the following:

  private static bool PollForMessage()
  {
     // Use UTC so a daylight-saving clock change cannot stretch or cut the timeout
     DateTime timeout = DateTime.UtcNow.AddMinutes(2);
     bool gotIt = MessageExists();

     // Loop until the message is received or two minutes have passed
     while (!gotIt && DateTime.UtcNow < timeout)
     {
        Thread.Sleep(1000);
        gotIt = MessageExists();
     }
     return gotIt;
  }

You may be wondering why I whined about Sleep statements and then went ahead and used one in this solution. That’s because this routine sleeps at most one second longer than it needs to, which isn’t too shabby. If you find the message isn’t always received within two minutes, you can safely change the timeout to three minutes without wasting any run time.
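If several tests need the same pattern, the polling loop can be pulled into a helper that takes the condition, the timeout, and the poll interval as parameters. The sketch below is one possible shape, not the code from the original test; the Poll class and its Until method are names made up for this example. It uses Stopwatch rather than the wall clock to measure elapsed time.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper; the class and method names are illustrative only.
public static class Poll
{
    // Checks `condition` immediately, then once per `interval`, until it
    // returns true or `timeout` elapses. Returns the last result of `condition`.
    public static bool Until(Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        // Stopwatch measures elapsed time, so system clock changes do not affect it.
        Stopwatch elapsed = Stopwatch.StartNew();
        bool met = condition();

        while (!met && elapsed.Elapsed < timeout)
        {
            Thread.Sleep(interval);
            met = condition();
        }
        return met;
    }
}
```

With a helper like this, raising the timeout from two minutes to three is a one-argument change, for example Poll.Until(MessageExists, TimeSpan.FromMinutes(3), TimeSpan.FromSeconds(1)), and passing runs still finish within about one poll interval of the message arriving.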

The next time you find yourself writing a long Sleep statement, replace it with either an event, or a function that polls until your expected state occurs. If you do this consistently, your test passes will finish faster with more consistent results.

P.S. After I wrote this article, I found that BJ Rollison recently wrote about the same topic, only much more eloquently, on his own blog. See for yourself.

5 Responses

  1. This post hits all the check marks on the sleep issue. In case the checking condition is costly, one more improvement that I have seen and used is to start with a low sleep time and increase it either linearly or exponentially over the iterations.

  2. For event-based automation, developers and testers should consider using ETW (Event Tracing for Windows); see http://msdn.microsoft.com/en-us/library/windows/desktop/aa363692(v=vs.85).aspx

  3. […] Testers Caught Sleeping on the Job is not browser-based, but you cannot bash sleeps enough. Also has a pretty amusing photo with it […]

  4. One more issue: when subscribing to an event, be sure to choose the timeout appropriately. I suggest not hard-coding the timeout interval. Use a configurable value instead, so that you can change it in a config file and re-run the test, or change it dynamically according to the test and product run environments. A failure caused by a hard-coded timeout that is too short is too easily dismissed as a false positive (the test would have passed with a longer timeout) when really there’s a perf or external dependency issue to examine or account for. Avoid hard-coded or “magic” numbers!

  5. […] • What is wrong with the Sleep command, and what counts as a more acceptable alternative? […]
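The backoff idea from the first response, starting with a short sleep and lengthening it when the check itself is costly, could be sketched roughly as follows. This is an illustration under assumptions, not code from the post or its commenters: the BackoffPoll class, its Until method, and all parameter names are hypothetical, and doubling with a cap is just one of the growth schemes the response mentions.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper; names and the doubling policy are illustrative only.
public static class BackoffPoll
{
    // Polls `condition` until it returns true or `timeout` elapses, doubling
    // the delay after each failed check, capped at `maxDelay`.
    public static bool Until(Func<bool> condition, TimeSpan timeout,
                             TimeSpan initialDelay, TimeSpan maxDelay)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        TimeSpan delay = initialDelay;

        while (true)
        {
            if (condition()) return true;

            TimeSpan remaining = timeout - elapsed.Elapsed;
            if (remaining <= TimeSpan.Zero) return false;

            // Never sleep past the deadline.
            Thread.Sleep(delay < remaining ? delay : remaining);

            // Exponential growth: double the delay, up to the cap.
            TimeSpan doubled = TimeSpan.FromTicks(delay.Ticks * 2);
            delay = doubled < maxDelay ? doubled : maxDelay;
        }
    }
}
```

The trade-off is that the test may now overshoot by up to maxDelay rather than by a fixed one second, so the cap should stay small relative to the timeout.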
