Back in February, in his blog post titled “Monitoring your service platform – What and how much to monitor and alert?”, Prakash discussed monitoring a service running in the cloud, the multitude of alerts that could be set up, and how testers need to trigger and test those alerts. Toward the end he said:
Once it’s successful to simulate [the alerts], these tests results will provide confidence to the operations engineering team that they will be able to handle and manage them quickly and effectively.
When I read this part of his post, one thing jumped into my Tester’s mind: Who is verifying that there is a working process in place to deal with the alerts once they start coming through from Production? If you have an Operations team, do they already have a system in place that you can just ‘plug’ your alerts into? If they don’t have such a system, are they responsible for developing one?
If you, the tester, have any doubts about the alert handling process, here are a few questions you might want to ask. Think of them as a test plan for the process.
- Is the process documented someplace accessible to everyone it affects?
- Is there a CRM (Customer Relationship Management) or other tracking system for the alerts?
- Is a system like SCOM (System Center Operations Manager) being used that can automatically generate a ticket for the alert, and handle some of the alerts without human intervention?
- Is there an SLA (Service Level Agreement) or OLA (Operational Level Agreement) detailing turnaround time for each level of alert?
- Who is on the hook for the first level of investigation?
- What is the escalation path if the Operations team can’t debug the issue? Is it the original product team? Is there a separate sustainment team?
- Who codes, tests, and deploys any code fix?
You might want to take an alert from the test system and inject it into the proposed production system, appropriately marked as a test, of course. Does the system work as expected?
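As a rough illustration of that injection test, here is a minimal Python sketch. Everything in it is hypothetical: the `AlertPipeline` class merely stands in for whatever ticketing or monitoring system your Operations team uses, and the `SLA_TURNAROUND` values, the `ops-on-call` owner, and the timestamp are invented for the example. It only shows the shape of the check: mark a captured alert as a test, push it through the process, and confirm that a ticket appears with the right owner and due time.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone

# Hypothetical turnaround targets; real values would come from your SLA/OLA.
SLA_TURNAROUND = {
    "critical": timedelta(minutes=15),
    "warning": timedelta(hours=4),
}


@dataclass(frozen=True)
class Alert:
    source: str
    severity: str
    message: str
    is_test: bool = False


@dataclass(frozen=True)
class Ticket:
    alert: Alert
    owner: str
    due: datetime


class AlertPipeline:
    """Stand-in for the production alert-handling system (an assumption)."""

    def __init__(self, first_responder: str) -> None:
        self.first_responder = first_responder
        self.tickets: list[Ticket] = []

    def ingest(self, alert: Alert, now: datetime) -> Ticket:
        # Assign the first-level investigator and compute the due time
        # from the turnaround target for this alert level.
        due = now + SLA_TURNAROUND[alert.severity]
        ticket = Ticket(alert=alert, owner=self.first_responder, due=due)
        self.tickets.append(ticket)
        return ticket


def mark_as_test(alert: Alert) -> Alert:
    """Return a copy of the alert that is clearly labeled as a test."""
    return replace(alert, message="[TEST] " + alert.message, is_test=True)


# An alert captured from the test system, injected as a marked test.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # arbitrary fixed time
captured = Alert("test-env", "critical", "Disk latency above threshold")
pipeline = AlertPipeline(first_responder="ops-on-call")
ticket = pipeline.ingest(mark_as_test(captured), now)

print(ticket.owner, ticket.due.isoformat(), ticket.alert.message)
```

Against a real system, the interesting part is everything this sketch fakes: whether the ticket actually shows up, whether the right person is notified, and whether the test marker survives all the way through so nobody mistakes it for a live incident.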
More information on monitoring and alerts can be found in the Microsoft Operations Framework.