First things first

This is not a business-related blog. When I write about KPI (plural), I mean the real stuff: KPI that tell you whether your environment is functioning or not. You can, and should, mentally file them as an extension of your monitoring solution. They carry the same importance, but they require a deeper understanding of your services and make use of other metrics and resources (e.g. database, logs, online counters).

Most of what you may read elsewhere regarding Key Performance Indicators is probably true. Everybody says they should be specific, and they are right. A KPI should be descriptive, right again. Targeting the bottom line, not a step in the process, correct. Read as much as you can about it and you will be a better person for it (and less likely to do something stupid). Business KPI are important, but they should not wake you up at night.

Green, red, and maybe yellow

Business KPI will probably be numerical, maybe even a beautiful gauge. Usually they represent trends. Production environment KPI should be Good / Bad. In some cases, you can allow for a middle ground (which means not enough information yet, so if you planned to go home, go to sleep or pretend that you have a life, hold on to that and keep refreshing for now). So we have green for good, red for bad, and yellow where applicable.
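To make the three colors concrete, here is a minimal sketch in Python. Everything in it is my own illustration, not a real product: the Status enum, the evaluate function and the 5% error-rate threshold are all assumptions. The point is only that a missing measurement maps to yellow instead of being guessed as green or red.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    GREEN = "green"    # good
    RED = "red"        # bad
    YELLOW = "yellow"  # not enough information yet


def evaluate(error_rate: Optional[float], max_error_rate: float = 0.05) -> Status:
    """Map a measured error rate to a KPI color.

    `error_rate` is None when the check could not collect data yet.
    The 0.05 threshold is a made-up example value.
    """
    if error_rate is None:
        return Status.YELLOW
    return Status.GREEN if error_rate <= max_error_rate else Status.RED
```

Note that the numeric threshold lives inside the check. What reaches the dashboard is only the color.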

Now that we are all aligned, here come the rules.

1. A red KPI cannot be ignored

This has two meanings. First – you should never ignore a red KPI. If the KPI describes something that can be left alone in its redness for a while – you don’t need it in the killer KPI list. Second – you should not be able to ignore a red KPI. It should hunt you using all forms of communication: push notifications, phone calls, email, the works.

2. A KPI is only as strong as the SLA that supports it

In other words, if you have the best KPI, but no team and procedures to handle it, you have nothing. Nothing I tell you! Plan an escalation process, where all team members are contacted (one by one or all at once) until someone takes responsibility for handling the issue. Visit my friends at ayehu.com to see how this process can be taken to the extreme. Even if you cannot afford their solution, it is inspiring to see how far you can take the automation of IT issue handling.
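The "one by one" flavor of escalation can be sketched in a few lines of Python. This is only an illustration under my own assumptions: escalate and notify are hypothetical names, and notify stands in for whatever actually calls, texts or emails a person and reports back whether they acknowledged.

```python
from typing import Callable, Iterable, Optional


def escalate(contacts: Iterable[str], notify: Callable[[str], bool]) -> Optional[str]:
    """Contact team members in order until one takes responsibility.

    `notify` is a hypothetical callback: it alerts one person and
    returns True if that person acknowledged the issue.
    Returns the name of the owner, or None if nobody answered.
    """
    for person in contacts:
        if notify(person):
            return person
    return None
```

A None result means the whole list was exhausted with no owner. A real process would probably start over or alert everyone at once at that point, rather than give up.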

3. Get yourself a decent dashboard that displays status and time

If you want people to go over your KPI – make it easy for them. All the data in one quick glance. The time of the last test of each KPI should also be presented, just in case your KPI mechanism dies and you keep looking at a day-old green KPI. Tip – make your dashboard page as simple as possible, and as mobile friendly as possible.
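One way to guard against the day-old green is to let the dashboard itself refuse to show a status that is too old. A rough sketch in Python follows. The function name, the "stale" label and the ten-minute limit are my own assumptions, so pick whatever fits your test intervals.

```python
from datetime import datetime, timedelta


def display_status(
    status: str,
    last_checked: datetime,
    now: datetime,
    max_age: timedelta = timedelta(minutes=10),  # example value only
) -> str:
    """Return the status to show, or "stale" if the last test is too old."""
    if now - last_checked > max_age:
        return "stale"
    return status
```

Passing now in as a parameter, instead of reading the clock inside the function, keeps the check easy to test.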

4. Decide in advance if a KPI turns back to green automatically

In some cases you will only be interested in the current state. In others, a one-time red is enough to require human interaction, and the KPI should stay red until it is reset (and not turn green again on its own). Make sure your KPI platform can support both kinds.
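The difference between the two kinds is small enough to show in code. Here is a minimal Python sketch. The Kpi class and its latching flag are names I made up for illustration and do not come from any specific KPI platform.

```python
class Kpi:
    """A KPI that either follows the current state or latches on red."""

    def __init__(self, name: str, latching: bool = False) -> None:
        self.name = name
        self.latching = latching
        self.status = "green"

    def record(self, ok: bool) -> None:
        """Feed in the result of the latest test."""
        if not ok:
            self.status = "red"
        elif not (self.latching and self.status == "red"):
            # Auto-recover only when the KPI is not latched on red.
            self.status = "green"

    def reset(self) -> None:
        """Manual reset by a human who has handled the issue."""
        self.status = "green"
```

With latching=False the KPI goes back to green on the next good test. With latching=True it stays red until someone calls reset().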

5. Learn from red KPI

If you experience the same KPI going red over and over again – something is wrong with your environment, or with your KPI. In both cases – shame on you. It is perfectly fine to spend a lot of time researching the reasons why a KPI went red, but only the first time. After that, you should have a procedure. Also, help yourself – add logs and shortcuts that will enable you to pinpoint the issue as quickly and as simply as possible (unless you want to be the "KPI woke me at night" guy forever).