Defining Metrics for Problem Management

By | Originally published on June 24, 2014 in ITIL | Updated on November 28, 2017

Editor’s Note:
While reviewing the readership of our blogs, we couldn’t help but notice that certain blogs never lost their popularity over the years. This is one of them, with thousands of unique views every month. We thank Stuart Rance for his words of wisdom, which have clearly stood the test of time (the advice is as relevant today as it was when it was originally published). So, ICYMI, we’re pleased to republish this blog for your convenience.

Many people define KPIs for their IT service management processes by looking in books (such as ITIL Service Operation) or by copying metrics that other organizations use. This is rarely going to give good results, because KPIs need to INDICATE the PERFORMANCE of the KEY things you care about (that’s why they’re called Key Performance Indicators). In the worst cases I have seen ITSM processes with huge numbers of so-called KPIs that are measured and reported even though nobody uses the values to drive any changes in behaviour or improvements in business outcomes.

I recently wrote a blog titled Defining Metrics for Change Management in which I explained how you can create KPIs that support what you are trying to achieve. A number of people contacted me after reading that blog to ask for examples of how to derive KPIs for other ITSM processes. I decided to write this blog about problem management KPIs because this is one process where many organizations I have worked with had very poor KPIs. Remember, you shouldn’t just copy the outcomes, critical success factors (CSFs) and KPIs that I am describing here. Use them to understand the approach and methodology I have used, and then think about what is important to you and derive metrics that measure the things you care about.

The first step in defining good KPIs is to identify the objectives of problem management: what outcomes does problem management help us to achieve? For me there are two key outcomes of a good problem management process:

  • Reducing the number of incidents that occur
  • Reducing the business impact of incidents that can’t be avoided

We could just measure the number of incidents and the overall business impact of incidents. These would certainly be valuable things to know, but I’m not sure they’d show how well problem management has been working, because so many other factors could have contributed. So I will break these down a bit and identify some problem management CSFs that could contribute to these outcomes:

  • Identify problems that have caused multiple incidents
  • Implement workarounds that reduce the impact of incidents
  • Initiate changes that reduce the number of incidents

It’s worth noting that I didn’t mention root cause analysis (RCA). I see many problem management people who only think about RCA, but RCA on its own doesn’t deliver any benefit; it’s just a technique that we use in problem management. The worst problem management KPIs that I see are “Average time to root cause”, “Percentage of problems with RCA complete in 3 days”, or similar. These KPIs drive behaviours that we really don’t want: they encourage problem management people to declare that they have found “the” root cause of a complex situation, rather than continuing to analyse and understand it even after they have identified one significant contributory factor.

One of my customers has a process for prioritising problems that takes account of the frequency and business impact of the problem, including the mitigation provided by any workarounds that are in place. They then have a KPI of “Average time to reduce problems to P3 priority.” This reduction can be achieved by resolving the problem, or by implementing a good workaround. The point is that they are measuring problem management based on how well they are reducing pain to the business. I’m not going to suggest that KPI here because it requires a fairly sophisticated approach to problem prioritisation, which not many IT organizations can achieve, but if you can measure this then it’s certainly something you could think about.
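As a rough illustration of how a KPI like “Average time to reduce problems to P3 priority” might be calculated, here is a minimal Python sketch. The record layout, the field names `opened` and `reached_p3`, and the sample dates are all hypothetical; the customer’s actual prioritisation process is not described here, so the sketch simply assumes that the date on which each problem dropped to P3 (through resolution or a workaround) has already been recorded.

```python
from datetime import date

# Hypothetical problem records. "reached_p3" is the date the problem was
# reduced to P3 priority, by resolution or by a workaround; None means the
# problem is still above P3.
problems = [
    {"id": "PRB01", "opened": date(2014, 5, 1), "reached_p3": date(2014, 5, 11)},
    {"id": "PRB02", "opened": date(2014, 5, 3), "reached_p3": date(2014, 5, 8)},
    {"id": "PRB03", "opened": date(2014, 5, 20), "reached_p3": None},
]


def average_days_to_p3(problems):
    """Average number of days from opening a problem to reaching P3.

    Problems that have not yet reached P3 are excluded; returns None if
    no problem has reached P3.
    """
    durations = [
        (p["reached_p3"] - p["opened"]).days
        for p in problems
        if p["reached_p3"] is not None
    ]
    if not durations:
        return None
    return sum(durations) / len(durations)


print(average_days_to_p3(problems))
```

Because problems that are still above P3 are left out of the average, a figure like this is best reported alongside the count of such problems, so that a long-running problem cannot hide behind a healthy-looking average.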

Here are some suggested KPIs that might help to demonstrate the CSFs I have listed above. Remember you shouldn’t just copy these – use a similar process to identify KPIs that will measure what you care about.

CSF1 - Identify problems that have caused multiple incidents

  • Increased percentage of incidents associated with a problem record or known error
  • Top 5 problem report created every month
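The two CSF1 measures above could be calculated along the following lines. This is only a sketch: the incident records, the `problem_id` field and the sample values are assumptions made for illustration, and a real ITSM tool will have its own data model and reporting features.

```python
from collections import Counter

# Hypothetical incident records. "problem_id" is the linked problem record
# or known error, or None if the incident has not been linked to one.
incidents = [
    {"id": "INC001", "problem_id": "PRB01"},
    {"id": "INC002", "problem_id": None},
    {"id": "INC003", "problem_id": "PRB01"},
    {"id": "INC004", "problem_id": "PRB02"},
    {"id": "INC005", "problem_id": None},
]


def linked_percentage(incidents):
    """Percentage of incidents associated with a problem record or known error."""
    if not incidents:
        return 0.0
    linked = sum(1 for inc in incidents if inc["problem_id"] is not None)
    return 100.0 * linked / len(incidents)


def top_problems(incidents, n=5):
    """The n problems with the most linked incidents, as (problem_id, count) pairs."""
    counts = Counter(
        inc["problem_id"] for inc in incidents if inc["problem_id"] is not None
    )
    return counts.most_common(n)


print(f"{linked_percentage(incidents):.1f}% of incidents are linked to a problem")
for problem_id, count in top_problems(incidents):
    print(problem_id, count)
```

Run once a month over that month’s incidents, the first function gives the trend to watch and the second gives the raw material for the top 5 problem report.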

CSF2 - Implement workarounds that reduce the impact of incidents

  • Increased percentage of incidents for which a knowledge base article provided the solution
  • Increased percentage of incidents closed by users using self-service incident management
  • Reduced impact of incidents associated with previous months’ top 5 problems

CSF3 - Initiate changes that reduce the number of incidents

  • Reduced number of incidents associated with previous months’ top 5 problems
  • Reduced backlog of outstanding problems
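The “previous months’ top 5 problems” KPIs listed under CSF2 and CSF3 both compare this month’s figures with last month’s for the same set of problems. A minimal sketch of that comparison follows, assuming per-problem totals are already available as dictionaries; the problem IDs and numbers are invented for illustration. The same function works for incident counts (CSF3) or for an impact measure such as lost user hours (CSF2), provided you have a way to record impact.

```python
def top_n_trend(prev_totals, curr_totals, n=5):
    """Compare this month with last month for last month's top-n problems.

    Returns a list of (problem_id, previous, current, change) tuples,
    ordered by last month's total, highest first. A problem with no
    entry this month counts as zero.
    """
    top = sorted(prev_totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
    return [
        (pid, prev, curr_totals.get(pid, 0), curr_totals.get(pid, 0) - prev)
        for pid, prev in top
    ]


# Hypothetical incident counts per problem for two consecutive months.
last_month = {"PRB01": 12, "PRB02": 9, "PRB03": 7, "PRB04": 4, "PRB05": 3, "PRB06": 1}
this_month = {"PRB01": 5, "PRB02": 9, "PRB03": 2, "PRB05": 4, "PRB07": 6}

trend = top_n_trend(last_month, this_month)
for pid, prev, curr, change in trend:
    print(f"{pid}: {prev} -> {curr} ({change:+d})")

total_change = sum(change for _, _, _, change in trend)
print(f"Total change for last month's top 5: {total_change:+d}")
```

A negative total suggests that the workarounds and changes initiated for those problems are having an effect. New entries such as the hypothetical PRB07 do not affect this figure; they would show up in the following month’s top 5 report instead.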

I have worded these KPIs as “Increased…” or “Reduced…” because I don’t have the data needed to set explicit targets. As you make use of metrics like these you can put in place numerical targets, based on the baseline that you create when you first start measuring and reporting.
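One possible way to turn an “Increased…” or “Reduced…” KPI into a numerical target is sketched below. It assumes the baseline is the average of the first few months of measurements and that the target is a fixed percentage improvement on that baseline. Both the method and the sample figures are illustrative assumptions, not a recommendation of specific numbers.

```python
from statistics import mean


def baseline(values):
    """Baseline for a KPI: the average of the first measurements taken."""
    return mean(values)


def target_from_baseline(base, improvement_pct, higher_is_better):
    """Target that is improvement_pct better than the baseline.

    For "Increased..." KPIs the target is above the baseline; for
    "Reduced..." KPIs it is below.
    """
    if higher_is_better:
        return base * (1 + improvement_pct / 100)
    return base * (1 - improvement_pct / 100)


# Hypothetical first three months of measurements.
linked_pct_history = [40.0, 44.0, 42.0]  # % of incidents linked to a problem
backlog_history = [30, 28, 32]           # outstanding problems at month end

linked_target = target_from_baseline(baseline(linked_pct_history), 10, higher_is_better=True)
backlog_target = target_from_baseline(baseline(backlog_history), 20, higher_is_better=False)

print(f"Target for incidents linked to a problem: {linked_target:.1f}%")
print(f"Target for problem backlog: {backlog_target:.0f}")
```

The improvement percentages are the part that needs real thought: they should reflect what the business cares about and what is realistically achievable.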

How well do your problem management metrics measure what your customers care about? Is it time to review your problem management KPIs and align them with your CSFs and objectives?

Update: Since writing this blog, Stuart has helped to write the publication ITIL Practitioner Guidance, which includes lots of helpful suggestions on how to define CSFs and KPIs.

Stuart Rance

About Stuart Rance

Stuart is an ITSM and security consultant, working with clients all round the world. He is one of the authors of ITIL 4, as well as an author of ITIL Practitioner, ITIL Service Transition, and Resilia: Cyber Resilience Best Practice. He is also a trainer, teaching standard and custom courses in ITSM and information security management, and an examiner helping to create ITIL and other exams. Now that his children have all left home, he has plenty of time on his hands for contributing to our blog - lucky us!

6 thoughts on “Defining Metrics for Problem Management”

  1. Joanne Ashford

    Thank you for getting to the heart of what is essential to understand as you help others figure out what is important to their unique success. Appreciate the clarity and straightforward approach.


  2. Yogesh Chande

    Thanks for sharing this informative guide. This would definitely help to define KPIs based on the actuals as applicable to organizations.


  3. Tania

    I’m just getting started in this position of Problem Management. I have never been one before and it has been years since this organization has had one. Along with that pressure… everyone was excited that there is a problem manager on board. One problem… where do I begin? What are some things I should do first?

