Tuesday, 5 August 2014

Practice Shouldn’t Always Make Perfect - Using Standard Changes for Break/Fix

When I was younger, my parents, teachers and others explained to me that, for the most part, if you want to be really good at something you need to practice. I imagine most people have heard that at some point. Unfortunately, from an IT support perspective, we have in some cases applied this to operational activities where it really doesn’t belong.

An example of this is implementing a standard or routine change to correct an underlying issue. I know what you are saying: “…surely there must be a problem or some type of investigation for this issue though…right?”

Let’s look at it from this perspective. In an effort to improve service, we (the IT organization) have decided to separate our changes into three types:

Emergency – break/fix repair changes
Normal – requires more approvals, possibly a CAB review, and has high business impact
Standard – may have an automated approval; low risk and low visibility

One area of your reporting may provide data on the number of Emergency (break/fix) changes you have each month. Among other things, this allows you to align those changes with Incidents and to identify areas where you can continue to make improvements. But let’s assume for a moment that an IT Ops analyst says, “We know we have an issue, and we need to perform fix ‘x’ each week to keep the service up and running. Why can’t we just create standard changes for these? After all, the fix is pretty standard; it’s just a restart of services.”

I would suggest avoiding this, and here’s why.

First of all, most people naturally want their “numbers” to reflect a stable environment. To this analyst’s credit, they are saying that the risk is low and that we don’t want it to look like the sky is falling when in reality it is not. While I understand this position, think about it for a moment. Ask yourself: once we take away the visibility of the issue, are we really putting ourselves in a position to improve the service, or just letting it limp along? Here are the risks.

1. There may be an underlying infrastructure issue which we are bandaging each week and which, if nobody knows it exists, could in fact get worse. Further to that, if we introduce other changes on top of this issue, we could impact a deployment in ways we are unaware of.

2. The business may not be aware that this issue exists. Letting the business know what limitations we have puts IT and the business in a position to make better strategic decisions together down the road. The business may decide that correcting the issue is not worth the expense and that we should continue to bandage the wound. They may also indicate that this potential performance risk is preventing them from taking their business to the next level, and that the expenditure to repair or replace is in the business’s best interest.

This is why I like to see these issues remain as an emergency change type. There should be no secrets when it comes to outlining the weaknesses which are present. After all, they already exist whether we want to see them or not.
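
To make the visibility point concrete, here is a minimal Python sketch of the kind of report that becomes possible when the weekly fix stays logged as an Emergency change. The record layout, the sample data, the function names and the threshold of three repeats are all assumptions for illustration; they do not come from any particular ITSM tool.

```python
from collections import Counter
from datetime import date

# Hypothetical change records: (date, change type, short description).
# The layout and the sample data are illustrative assumptions only.
CHANGES = [
    (date(2014, 7, 7), "Emergency", "restart services"),
    (date(2014, 7, 14), "Emergency", "restart services"),
    (date(2014, 7, 21), "Emergency", "restart services"),
    (date(2014, 7, 28), "Emergency", "restart services"),
    (date(2014, 7, 16), "Normal", "upgrade database"),
    (date(2014, 7, 30), "Emergency", "replace failed disk"),
]


def emergency_counts_by_month(changes):
    """Count Emergency changes per (year, month)."""
    return Counter(
        (day.year, day.month)
        for day, change_type, _ in changes
        if change_type == "Emergency"
    )


def recurring_fixes(changes, threshold=3):
    """Return Emergency fixes repeated at least `threshold` times."""
    counts = Counter(
        description
        for _, change_type, description in changes
        if change_type == "Emergency"
    )
    return {desc: n for desc, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    print(emergency_counts_by_month(CHANGES))
    print(recurring_fixes(CHANGES))
```

If the weekly restart were reclassified as a Standard change, it would drop out of both results, and the recurring fix would no longer stand out as a candidate for investigation.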

 

Follow us on Twitter @ryanrogilvie
