If you regularly read my blog you’ll know that I’ve already written a fair bit on the tough nut to crack that is problem management. It’s often something that’s started as part of the latest IT service management (ITSM) tool implementation project, but it’s not unusual for this initial investment in problem management (processes) to fail in execution for one or more reasons.
From a problem management uptake perspective, if you believe what the annual industry surveys report, roughly two-thirds of IT organizations are already “doing” problem management. But it’s not always what it should be, i.e., the investment of time and resources to proactively investigate and address recurring IT and business issues, and their root causes. It’s this type of investigation that helps to identify the issues that cause (or may ultimately cause) repetitive and potentially serious IT and business issues or failures. Instead, IT organizations are often just doing major incident reviews, using problem management techniques, as and when needed. It’s problem management of sorts but not truly effective problem management.
In reality, problem management is often something of a “poor relative” to service desk and incident management activities. Whereas service desk and incident management commonly receive adequate investment in terms of staff, definition, training, and ongoing operation, problem management is often “something to be done later” and therefore often not done at all.
In my opinion, the low level of proactive problem management adoption is quite ironic. The pressure to cut IT operational costs is why many IT organizations don’t do problem management, but it should be the reason why they need to be doing problem management. Of all the major ITIL processes, the investment of time and resources in truly effective problem management activity can provide some of the highest returns to an organization.
So to give you a simple introduction to problem management, I’ll quickly cover:
I refer to ITIL a fair bit, you might think too much, but you can quite easily use your own self-created problem management process and activities or look to alternative sources of ITSM and IT management advice such as ISO/IEC 20000, ISACA’s COBIT, USMBOK, or Microsoft’s MOF.
Problems (definition below) can be identified throughout the IT ecosystem. For example: acceptance into production, changes, updates/patches, vendor products, user errors, production execution, and failures. However, the main source for problem identification within an organization is probably the analysis of incidents as part of what is often called the “proactive problem management process.”
However, not only is problem management often solely associated with major incidents, another barrier to effective problem management is that problems are often confused with incidents (with the terminology interchanged wrongly). Or they are seen as an incident state rather than a separate entity requiring a different type of ITSM response.
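To make the idea of analyzing incidents a little more concrete, here is a minimal sketch in Python. It assumes incident records can be exported with a service name and a short symptom description; the `Incident` fields, the `candidate_problems` function, and the threshold of three occurrences are all hypothetical choices for illustration, not something prescribed by ITIL or any particular ITSM tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    # Hypothetical fields; real ITSM tools expose far richer records.
    incident_id: str
    service: str
    symptom: str


def candidate_problems(incidents, min_occurrences=3):
    """Return (service, symptom) pairs that recur often enough to investigate.

    The threshold is an assumption for illustration; tune it to your own
    incident volumes.
    """
    counts = Counter((i.service, i.symptom) for i in incidents)
    # most_common() orders pairs from most to least frequent.
    return [(key, n) for key, n in counts.most_common() if n >= min_occurrences]


incidents = [
    Incident("INC-1", "email", "mailbox full"),
    Incident("INC-2", "email", "mailbox full"),
    Incident("INC-3", "vpn", "login timeout"),
    Incident("INC-4", "email", "mailbox full"),
]

for (service, symptom), count in candidate_problems(incidents):
    print(f"{service}: '{symptom}' seen {count} times, worth a problem record?")
```

Counting repeats is only a starting point. A recurring symptom tells you where to look, not what the root cause is.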
If it helps, an easy way to remember the difference between the two is that:
To succeed at problem management, IT senior management needs to appreciate that far too many costly, and possibly scarce, IT resources are currently spent fighting repetitive fires and that these resources would be better utilized supporting problem management activity to tackle the root causes, rather than the symptoms, of IT failures.
ITIL, the ITSM best-practice framework formerly known as the IT Infrastructure Library, uses the term problem to describe:
“The unknown cause of one or more incidents.”
And it describes problem management as:
“The process of minimizing the adverse effect on the business of incidents and problems caused by errors in IT infrastructure and systems, and to proactively prevent the occurrence of incidents, problems, and errors.”
A problem will become a “known error” when the root cause is known and a temporary “workaround” or a permanent alternative solution has been identified.
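If you think in data terms, the known error rule can be sketched as a simple check on a problem record. This is only an illustration under my own assumptions: the `Problem` class and its field names are hypothetical and are not taken from ITIL or any specific ITSM tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    # Hypothetical record layout for illustration only.
    problem_id: str
    root_cause: Optional[str] = None
    workaround: Optional[str] = None
    permanent_solution: Optional[str] = None

    @property
    def is_known_error(self) -> bool:
        """True once the root cause is known AND a workaround or a
        permanent solution has been identified."""
        has_response = (
            self.workaround is not None or self.permanent_solution is not None
        )
        return self.root_cause is not None and has_response


problem = Problem("PRB-1")
print(problem.is_known_error)  # False: nothing is known yet

problem.root_cause = "Log partition fills up overnight"
print(problem.is_known_error)  # False: cause known, but no workaround or fix

problem.workaround = "Rotate logs manually each morning"
print(problem.is_known_error)  # True: cause known and a workaround exists
```

Note that a workaround on its own is not enough in this sketch. Without a root cause the record stays a problem rather than a known error.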
For completeness, although I state my own benefits below, ITIL states that the value of problem management includes:
ITIL defines the objectives of the problem management process as:
Importantly, it can’t operate in a vacuum.
Problem management should have strong relationships with other key IT service management processes. In addition to the more-obvious linkages with incident and change management, it also needs to use configuration management data to help determine the impact of problems and resolutions. Let’s also not forget that availability management has a dependency on problem management information and activity, and some problems will require investigation by capacity management teams and techniques.
Problem management can also be an entry point into IT service continuity activity and major incident management, where a significant problem needs to be resolved before it starts to have a major adverse impact on the business. Finally, from a service level management perspective, problem management contributes to improvements in service levels, and its management information should be used as the basis for service review activity.
While problem management doesn’t follow a linear lifecycle like incident management, you can view a problem as going on a journey from identification through to “resolution,” where resolution might come from error control or the creation of a workaround.
Thus it’s worth understanding that there are two common problem management “sub-processes”:
The result might be one of three outcomes:
It’s important to recognize that these three problem states are not mutually exclusive and that a problem may move between them over time. For instance, when possible, a workaround should still be made available while a problem is awaiting the implementation of a required change.
My simple diagram hopefully provides a snapshot of what can happen with problem management.
In my opinion the key benefits of problem management include, but are not restricted to:
Well, there you have it: a quick guide and a simple introduction to problem management. Hopefully you found it helpful.
If you want to read more from me, and a few of my friends, on problem management, then please look at: