LatinCon08 - Guaranteeing Service Availability in SLAs; a Study of the Risk Associated with Contract Period and Failure Process

Andrés J. González, Bjarne E. Helvik

1 Department of Telematics and Q2S Centre, NTNU
2 Faculty of Information and Technology, NTNU

This paper appears in: Revista IEEE América Latina

Publication Date: Aug. 2010
Volume: 8,   Issue: 4 
ISSN: 1548-0992

Service Level Agreements (SLAs) are a common means of defining the obligations of network/service providers and users in business relationships. The terms that define the guaranteed availability for a given period are an important element of these contracts. Selecting appropriate values is difficult because of the large number of variables involved, the complexity of the network and of service provision, and the computational challenge posed by the transient solution that is needed, as opposed to a steady-state one. A common policy is to use the steady-state availability as a reference. Nevertheless, this simplification may put contract fulfillment at risk, since the stochastic variation of the measured availability is significant over a typical contract period. This paper analyzes the relevance of interval availability analysis to SLAs and provides suggestions to network providers on the selection of adequate availability guarantees. The interval availability of unprotected and shared-protected connections is studied under exponential and Weibull failure and repair distributions. It is observed that, for a single-path scenario, a small reduction of the guaranteed availability below the steady-state value considerably improves the probability of meeting the requirements. The same holds for connections with shared backup protection. However, performing this analysis in the transient domain is quite demanding. Hence, to simplify it, it is proposed to obtain the steady-state results and introduce a safeguard factor to ensure that the availability guarantee is met. For Weibull-distributed times between failures with a shape factor less than one, as observed in operational networks, the probability of meeting a guaranteed availability over a finite contract period decreases more sharply than for the commonly assumed Poisson failure process. This increases the importance of performing a transient analysis.
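
The difference between the steady-state availability and the availability measured over a finite contract period can be illustrated with a small Monte Carlo sketch. This is not the paper's analysis method; it is a minimal simulation of a single unprotected connection modeled as an alternating up/down process. The function names, the parameter values (MTTF of 1000 h, MTTR of 10 h, a one-year contract), the exponential repair times, and the assumption that the connection starts as good as new are all hypothetical choices made for illustration. A Weibull shape of 1.0 reproduces exponential times to failure.

```python
import math
import random


def interval_availability(T, mttf, mttr, shape, rng):
    """Fraction of the contract period [0, T] spent in the up state (one run)."""
    # Weibull scale chosen so that the mean time to failure equals mttf.
    scale = mttf / math.gamma(1.0 + 1.0 / shape)
    t = 0.0
    up_time = 0.0
    while t < T:
        up = rng.weibullvariate(scale, shape)   # time to next failure
        up_time += min(up, T - t)               # count only up time inside [0, T]
        t += up
        if t >= T:
            break
        t += rng.expovariate(1.0 / mttr)        # repair time (assumed exponential)
    return up_time / T


def prob_meeting_guarantee(guarantee, T, mttf, mttr, shape=1.0, runs=20000, seed=1):
    """Estimate P(interval availability over [0, T] >= guarantee)."""
    rng = random.Random(seed)
    met = sum(
        interval_availability(T, mttf, mttr, shape, rng) >= guarantee
        for _ in range(runs)
    )
    return met / runs


if __name__ == "__main__":
    T, mttf, mttr = 8760.0, 1000.0, 10.0        # hypothetical values, in hours
    a_ss = mttf / (mttf + mttr)                 # steady-state availability
    for guarantee in (a_ss, 0.985):
        p = prob_meeting_guarantee(guarantee, T, mttf, mttr)
        print(f"guarantee={guarantee:.4f}  P(met)={p:.3f}")
```

Because both calls reuse the same seed, the estimate for the lower guarantee can never fall below the estimate for the steady-state value; how much it rises depends on the assumed parameters. Passing `shape` below 1.0 allows the same experiment to be repeated for Weibull-distributed times to failure.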

Index Terms:
Network dependability, failure characterization, Weibull distribution, SLA definition, risk in SLAs   

