Service levels have traditionally been associated with major outsourcing arrangements, as well as software supply and support agreements. But as more IT and business functions – including software (SaaS), platforms (PaaS) and infrastructure (IaaS) – are being provided on a hosted (or “cloud”) basis, the need for suppliers and customers to agree service levels (and associated service credits) is becoming more common. In this post I look at what sort of service levels customers should be thinking about (or perhaps getting their suppliers to think about), and what both customers and suppliers need to bear in mind when negotiating service levels. In the next few weeks there will be a second post on SLAs, in which I look at the consequences of the supplier breaching the service level.
In broad terms, a service level constitutes an agreed standard for a supplier’s performance of services, or a specific aspect of those services. What type(s) of service level are appropriate depends on the precise nature of the services being provided, but in the context of remotely-hosted services the most important service level is likely to be availability.
An availability service level is usually expressed as a percentage, eg an uptime of 99.9%. Although this looks straightforward at first glance, both customers and suppliers should consider the detail:
- Service availability measurement period: With a 99.9% availability service level, a monthly measurement period allows the supplier one or more outages totalling no more than about 43 minutes without being in breach of the service level. The same service level measured on a quarterly basis would allow a single outage of more than two hours, which could have a far more significant effect on the customer’s business or operations without triggering a service credit or other compensation.
- Individual outage limit: In addition to an aggregate limit on outages during the service availability measurement period, should individual outages be subject to a separate limit/service level?
- Out-of-hours outages: Should outages outside core hours be measured in the same way as outages during core hours?
- Outage measurement period: Should the outage be measured from the time it actually starts, or when the customer reports the outage to the supplier?
- Scheduled maintenance: Should scheduled or planned maintenance carried out by the supplier be excluded from the availability calculation? If so, should the amount of time allowed for maintenance in a measurement period be capped and/or limited to out-of-hours? How much advance notice of scheduled maintenance should the supplier have to provide?
- Emergency maintenance: Should emergency maintenance be excluded from the availability calculation?
- Availability of minor functionality: Should the non-availability of minor, non-critical elements of the services be excluded from the availability calculation? If so, what constitutes a “minor, non-critical element”?
- Force Majeure: Who bears the risk of Force Majeure events, ie events which affect the availability of the services but which are outside the control of either the supplier or the customer? Should events which, although technically within the control of one party, may not reasonably be considered preventable constitute Force Majeure, eg the failure of a third party supplier?
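The arithmetic behind the measurement period point can be sketched in a few lines of Python. This is an illustration only: the function names, the 30-day month and 91-day quarter, and the way excluded maintenance is removed from the measurement period are all assumptions made for the example, and the actual calculation method is whatever the parties agree in the contract.

```python
from datetime import timedelta


def allowed_downtime(target_pct: float, period: timedelta) -> timedelta:
    """Aggregate downtime permitted in one measurement period at the target."""
    return period * (1 - target_pct / 100)


def measured_availability(
    period: timedelta,
    outages: list[timedelta],
    excluded_maintenance: timedelta = timedelta(0),
) -> float:
    """Availability as a percentage of the time the service was due to be up.

    Assumption: excluded maintenance is removed from the measurement period
    and is not counted among the outages passed in.
    """
    service_time = period - excluded_maintenance
    downtime = sum(outages, timedelta(0))
    return 100 * (1 - downtime / service_time)


# Assumed period lengths, for illustration only.
month = timedelta(days=30)
quarter = timedelta(days=91)

print(allowed_downtime(99.9, month))    # 0:43:12
print(allowed_downtime(99.9, quarter))  # 2:11:02.400000

# The same single two-hour outage, measured monthly and quarterly.
two_hours = [timedelta(hours=2)]
monthly = measured_availability(month, two_hours)
quarterly = measured_availability(quarter, two_hours)
print(monthly >= 99.9)    # False: breach under monthly measurement
print(quarterly >= 99.9)  # True: no breach under quarterly measurement
```

On these assumptions, one two-hour outage breaches a 99.9% service level measured monthly but not the same service level measured quarterly, which is why the measurement period matters as much as the headline percentage.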
Other service levels
For remotely-hosted services, the customer and supplier may want to think about some or all of the following service levels. Each is set out below with a short description, followed by the issues to consider:
Service delivery date
(the date by which the supplier is to start providing the services (or by which the services are to have passed acceptance tests), with compensation being calculated by reference to the duration of the delay)
1. Should the non-availability of minor, non-critical elements be disregarded for the purpose of the service delivery date (or acceptance)?
2. How long is the “permitted” delay period, ie how long after the agreed service delivery date does the customer need to wait before terminating the services supply agreement?
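To show how delay compensation and a permitted delay period might interact, here is a minimal Python sketch. The daily credit, the cap, the 30-day permitted delay and the function name are all hypothetical figures and names chosen for the example, not recommended terms.

```python
from datetime import date


def delay_position(
    agreed: date,
    actual: date,
    daily_credit: float,
    credit_cap: float,
    permitted_delay_days: int,
) -> tuple[int, float, bool]:
    """Return (days late, compensation due, whether the customer may terminate).

    Assumptions: compensation accrues per calendar day of delay up to a cap,
    and the termination right arises once the permitted delay is exceeded.
    """
    days_late = max((actual - agreed).days, 0)
    compensation = min(days_late * daily_credit, credit_cap)
    may_terminate = days_late > permitted_delay_days
    return days_late, compensation, may_terminate


# Hypothetical figures: 500 per day, capped at 5,000, 30 days' permitted delay.
result = delay_position(date(2013, 6, 1), date(2013, 6, 15), 500, 5000, 30)
print(result)  # (14, 5000, False)
```

In this example the cap is reached after ten days, so the last four days of delay attract no further compensation and the customer cannot yet terminate. A customer may want to check that the cap and the permitted delay period do not leave a long stretch in which it has neither remedy.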
Technical helpdesk or fault response
(the time within which the supplier must respond to the customer’s request for technical support or the reporting of a fault)
1. Should different severities of fault be subject to different response times?
2. Are response times measured during the supplier’s working hours, or on a 24/7 (or other) basis?
3. What form should the customer’s fault notification take? Is there a prescribed reporting/fault logging procedure?
4. What constitutes a “response”, eg is the supplier’s acknowledgment sufficient, or does the supplier need to provide a rectification/workaround plan?
5. Following the supplier’s response, should the supplier be required to provide periodic updates and, if so, how frequently?
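The difference between measuring response times on a 24/7 basis and during working hours only can be illustrated with a short Python sketch. The working hours (09:00 to 17:00, Monday to Friday), the severity names and the target times are hypothetical, and public holidays are ignored for simplicity.

```python
from datetime import datetime, time, timedelta

# Hypothetical working hours and response targets, for illustration only.
WORK_START = time(9, 0)
WORK_END = time(17, 0)
RESPONSE_TARGETS = {
    "critical": timedelta(hours=1),
    "major": timedelta(hours=4),
    "minor": timedelta(hours=8),
}


def working_time_between(start: datetime, end: datetime) -> timedelta:
    """Elapsed time between start and end, counting Mon-Fri working hours only."""
    total = timedelta(0)
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday is 0, Friday is 4
            window_start = max(start, datetime.combine(day, WORK_START))
            window_end = min(end, datetime.combine(day, WORK_END))
            if window_end > window_start:
                total += window_end - window_start
        day += timedelta(days=1)
    return total


def response_time(reported: datetime, responded: datetime,
                  working_hours_only: bool) -> timedelta:
    """Response time on either a working-hours or a 24/7 clock."""
    if working_hours_only:
        return working_time_between(reported, responded)
    return responded - reported


def target_met(severity: str, reported: datetime, responded: datetime,
               working_hours_only: bool) -> bool:
    elapsed = response_time(reported, responded, working_hours_only)
    return elapsed <= RESPONSE_TARGETS[severity]


# Fault reported late on a Friday, supplier responds on Monday morning.
reported = datetime(2013, 3, 1, 16, 30)
responded = datetime(2013, 3, 4, 9, 30)

print(response_time(reported, responded, working_hours_only=False))  # 2 days, 17:00:00
print(response_time(reported, responded, working_hours_only=True))   # 1:00:00
print(target_met("critical", reported, responded, working_hours_only=False))  # False
print(target_met("critical", reported, responded, working_hours_only=True))   # True
```

The same response is 65 hours late on a 24/7 clock but exactly on target on a working-hours clock, so the choice of clock can matter more than the headline response time.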
Technical support or fault resolution
(the time within which the supplier must provide the customer with the requested technical support or a resolution for a fault)
1. Should different severities of support requests/faults be subject to different resolution times?
2. Is a workaround sufficient? If so, for faults of all severities, or just minor ones?
© Marcus Andreen 2013. All rights reserved.