Thursday, 12 December 2013

Service Management Getting Back to Basics - Part 4 - Critical Incidents vs Number of Emergency Changes


When it comes to service delivery, a happy business makes for a happy support team. But what happens when the stats in our reporting say the business should be happy, yet their experience doesn't match those stats?

Let's look at a scenario:
A few weeks after the monthly IT statistics are published, the incident manager runs into one of the application managers in the corporate office kitchen. While preparing his morning coffee, the application manager relates his recent experience of unending emergency changes, which have involved several after-hours deployments over the past several months. The incident manager nods and smiles, puts a lid on her coffee, and heads back to her desk. While walking back she starts thinking about these weekend issues. She replays in her mind the image of the application manager motioning with his hands like a mushroom cloud. “We didn’t have that many outages last month,” she recalls. Like most service management professionals, she pulls up the reporting in the ITSM tool to see what was going on, and the metrics look like this:

 
Much as she thought, the volume of Priority 1 incidents was low and had not changed from the previous month. She contacts the change manager to get her take on the situation. The change manager indicates that while there has been a slight increase in the volume of changes over the past 6 months, the percentage of emergency changes has increased much more sharply. The two of them get together and map their stats on one chart, which looks like this:
 
Further discussion with the customers confirms that we have more issues going on than our metrics indicate.

This example highlights once again the need not only to communicate within IT but also to take a close look at what our metrics are trying to tell us. Separately, the Incident and Change numbers only told us half the story, but combined they show us that there were other issues going on. We could see that while the Incidents were related to the changes, their priority was never updated; a good majority of them were left as Priority 4. The Incident manager didn’t realize this, mainly because P4 incidents are rarely reviewed since they are of a lower priority.
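For anyone who wants to try this kind of cross-check on their own exports, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the record layout, the field names (`month`, `type`, `priority`, `related_change`), the sample records, and the cutoff for what counts as "low priority" are hypothetical and do not come from any particular ITSM tool. It computes the share of emergency changes per month and lists the low-priority incidents that are linked to an emergency change.

```python
from collections import defaultdict

# Hypothetical sample records; field names and values are illustrative only.
changes = [
    {"id": "CHG001", "month": "2013-10", "type": "normal"},
    {"id": "CHG002", "month": "2013-10", "type": "emergency"},
    {"id": "CHG003", "month": "2013-11", "type": "emergency"},
    {"id": "CHG004", "month": "2013-11", "type": "emergency"},
    {"id": "CHG005", "month": "2013-11", "type": "normal"},
]
incidents = [
    {"id": "INC001", "priority": 1, "related_change": None},
    {"id": "INC002", "priority": 4, "related_change": "CHG002"},
    {"id": "INC003", "priority": 4, "related_change": "CHG003"},
    {"id": "INC004", "priority": 3, "related_change": "CHG005"},
]


def emergency_share_by_month(changes):
    """Return {month: fraction of that month's changes that were emergencies}."""
    totals = defaultdict(int)
    emergencies = defaultdict(int)
    for change in changes:
        totals[change["month"]] += 1
        if change["type"] == "emergency":
            emergencies[change["month"]] += 1
    return {month: emergencies[month] / totals[month] for month in sorted(totals)}


def low_priority_incidents_behind_emergency_changes(incidents, changes, cutoff=3):
    """Return IDs of incidents at priority `cutoff` or lower urgency
    (a numerically higher priority) that are linked to an emergency change."""
    emergency_ids = {c["id"] for c in changes if c["type"] == "emergency"}
    return [
        incident["id"]
        for incident in incidents
        if incident["related_change"] in emergency_ids
        and incident["priority"] >= cutoff
    ]


if __name__ == "__main__":
    for month, share in emergency_share_by_month(changes).items():
        print(f"{month}: {share:.0%} of changes were emergencies")
    print("Review priority on:",
          low_priority_incidents_behind_emergency_changes(incidents, changes))
```

The incidents this flags are candidates for a priority review, not proof of a mistake; the point is simply to put the two data sets side by side so the P4 queue gets a second look.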

To remedy these inaccuracies, the Incident and Change Managers will need to review their metrics, identify where the issues lie, and decide what actions to take to ensure that incidents are prioritized correctly. Completing these reviews regularly will give IT the knowledge to appropriately strategize on service improvements.

Like the other posts in this series, we might see a drastic change in the metrics when we take another look at how they are generated. That’s OK. I know what you are thinking: “easy for you to say as you type away in your blog about a fictitious situation.” But trust me, I have been through this. The key is identifying the issues and taking the next steps to correct them. We can explain the anomaly to anyone who is looking at this as an exercise in improvement and communication.

Keep in mind that at this point the only people who are in the dark about the service being provided are IT. The business already experiences the issues firsthand.


Check out the conclusion - Wrapping it Together

Follow me on Twitter @ryanrogilvie or connect with me on LinkedIn


 

1 comment:

  1. Yes! It is real life stories like these that should serve as examples on how to interpret Incident and Change metrics.