Unified Communications.com

The truth about five nines availability in unified communications networks

By Gary Audin

Technology vendors and providers often throw around the term five nines (99.999%) availability when discussing their products or services, and the unified communications (UC) market is no different. The term is used so often that the average technologist doesn't think twice about it. But what does 99.999% mean? Let's take a look at the concept of five nines within the context of unified communications.

First of all, five nines does NOT refer to reliability. It refers to availability. Availability is the probability that a device or service will be working when you go to use it.

Availability is composed of two factors: Mean Time Between Failures (MTBF), or uptime, and Mean Time To Repair (MTTR), or downtime. MTBF is the measure of reliability -- how failure-prone the technology is. Both MTBF and MTTR are commonly measured in hours.

Calculating availability for unified communications

Availability is described by the following equation:

Availability = [MTBF ÷ (MTBF + MTTR)] × 100 = 9x.xxx%
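As a quick illustration, the equation can be turned into a few lines of Python. This is only a sketch: the function name and the sample MTBF and MTTR values are assumptions made up for the example, not vendor figures.

```python
def availability_percent(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a percentage, given MTBF and MTTR in hours."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours) * 100


# Hypothetical figures, chosen only for illustration:
# a failure every 10,000 hours on average, 4 hours to restore service.
example = availability_percent(mtbf_hours=10_000, mttr_hours=4)
print(f"{example:.3f}%")
```

Note that availability moves when either input moves: cutting the restoration time helps just as surely as stretching the time between failures.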


 

The R in MTTR stands for repair, but that's not the measurement you should use. The R should refer to the total time to restore the product or service to full operating condition. The restoration number needs to include the time for:

The availability metric does not tell everything you need to know. It doesn't tell you about the severity of an outage or the operational characteristics. Your system could suffer one huge outage or many short outages and still deliver 99+% availability. But the metric is still useful.

The following table translates 99.x% availability into operational terms. As you can see, the total downtime allowed for five nines availability over a full year (24 hours a day for 365.25 days) is only five minutes and 15 seconds. This is a hard figure to deliver.

Translating five nines availability into time

Availability    Downtime in one year
99.9999%        32 seconds
99.999%         5 minutes, 15 seconds
99.99%          52 minutes, 36 seconds
99.95%          4 hours, 23 minutes
99.9%           8 hours, 46 minutes
99.5%           1 day, 19 hours, 50 minutes
99%             3 days, 15 hours, 40 minutes
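The downtime figures can be reproduced with a short script. The sketch below assumes the 365.25-day year mentioned above; the function names are illustrative only. It rounds to the nearest second, so a result can differ from a published table by a second or so depending on how that table was rounded.

```python
HOURS_PER_YEAR = 24 * 365.25  # year length assumed for this sketch


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Seconds of downtime per year allowed at the given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * HOURS_PER_YEAR * 3600


def format_duration(seconds: float) -> str:
    """Render a duration as days, hours, minutes and seconds (nearest second)."""
    total = round(seconds)
    days, rest = divmod(total, 86_400)
    hours, rest = divmod(rest, 3_600)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"


for level in (99.9999, 99.999, 99.99, 99.95, 99.9, 99.5, 99.0):
    print(f"{level}% -> {format_duration(downtime_seconds_per_year(level))}")
```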

Applying MTBF and MTTR to UC hardware and software

So what does the availability figure include? The MTBF and MTTR are almost always related to hardware. In a unified communications environment this includes servers, gateways, switches, routers, power supplies and endpoints, such as PCs and IP phones. It is true that most hardware components are highly available and could meet the 99.999% figure.
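One point worth checking is what happens when a call depends on several of these components at once. A common simplification, and it is only an assumption here, is that the components sit in series and fail independently, so the end-to-end availability is the product of the individual figures. The component list and the five nines values below are hypothetical.

```python
import math


def series_availability(component_availabilities) -> float:
    """Availability (0..1) of a chain in which every component must be working.

    Assumes independent failures, which is a simplification.
    """
    values = list(component_availabilities)
    if not values:
        raise ValueError("at least one component is required")
    for value in values:
        if not 0 <= value <= 1:
            raise ValueError("availabilities must be between 0 and 1")
    return math.prod(values)


# Hypothetical chain: each component is assumed to deliver five nines on its own.
chain = {
    "server": 0.99999,
    "gateway": 0.99999,
    "switch": 0.99999,
    "router": 0.99999,
    "power supply": 0.99999,
    "IP phone": 0.99999,
}
end_to_end = series_availability(chain.values())
print(f"End-to-end availability: {end_to_end * 100:.4f}%")
```

Under these assumptions, six five nines components in series come out at roughly 99.994%, so the chain falls short of five nines even when every link meets it.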

Availability figures provided by vendors are rarely based on field experience. The MTBF figure is usually a calculated prediction using the Telcordia parts count method originally developed by Bell Labs for telecommunications systems. It takes two years of operating in the field, without changes, to prove an MTBF figure. It is unusual for a system to remain unchanged for two years. Every time the hardware changes, a new prediction calculation must be produced. So, the availability figures are also predictions.

What is not included in the MTBF calculation is very revealing. The vendors do not include:

There is no formula for predicting the reliability of unified communications software -- or any software. With today's dependence on software, the products and services offered are no better than the software installed. The real reliability figure should be based on the software reliability, which cannot be predicted. Only field experience can be used to determine the software MTBF. Furthermore, unified communications software periodically changes, which does not help to stabilize the reliability or MTBF.

Is five nines availability a worthy pursuit?

Consider an operating unified communications network. The following example covers one year of operation with some modest assumptions of downtime. Just to be very conservative, this calculation assumes there are no hardware failures in one year.

This is a total of 17 hours of outage per year, which produces an availability of 99.8%. That's not bad, but it's not 99.999%. This raises the question: Does an enterprise unified communications environment ever experience five nines availability? Not likely. However, is five nines availability worth pursuing?
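The 99.8% figure follows directly from the outage total. Here is a minimal sketch of the arithmetic, assuming a 365.25-day year; the function name is made up for the example.

```python
HOURS_PER_YEAR = 24 * 365.25  # year length assumed for this sketch


def availability_from_outage(outage_hours: float,
                             period_hours: float = HOURS_PER_YEAR) -> float:
    """Availability (percent) over a period, given the total outage time in it."""
    if not 0 <= outage_hours <= period_hours:
        raise ValueError("outage must be between 0 and the period length")
    return (1 - outage_hours / period_hours) * 100


print(f"{availability_from_outage(17):.1f}%")
```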

Assume that your enterprise is operating 12 hours per day, five days a week, all 52 weeks of the year. This equates to 3,120 hours, or only about 35.6% of the full year. If anything goes down and is fixed outside of working hours, then 99.8% is very acceptable.
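The share of the year that the enterprise is actually open can be checked the same way. This sketch assumes a 365.25-day year, and the helper names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365.25  # year length assumed for this sketch
HOURS_PER_WEEK = 24 * 7


def business_hours_per_year(hours_per_day: float, days_per_week: float,
                            weeks: float = 52) -> float:
    """Total hours per year that the enterprise is open."""
    return hours_per_day * days_per_week * weeks


def share_of_year(hours: float) -> float:
    """Percentage of the full year that the given number of hours represents."""
    return hours / HOURS_PER_YEAR * 100


open_hours = business_hours_per_year(12, 5)
print(f"Open hours per year: {open_hours:,.0f}")
print(f"Share of the full year: {share_of_year(open_hours):.1f}%")
print(f"Closed hours per week: {HOURS_PER_WEEK - 12 * 5}")
```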

Trying to attain five nines availability is very costly because you must have redundancy for nearly every hardware component. There must be near instantaneous switchover from a failed component to an operating component. Also, the software must be very stable. That expense may not be necessary for an enterprise that is closed for 108 hours out of the week. If, however, the enterprise never closes, then the design must include some redundancy for those components most likely to fail.
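To see why redundancy is the usual route to more nines, here is a rough sketch of the standard parallel-availability calculation. It assumes identical units that fail independently and a perfect, instantaneous switchover, which real systems only approximate. The function name and the 99.8% starting figure are hypothetical.

```python
def parallel_availability(unit_availability: float, units: int) -> float:
    """Availability (0..1) of redundant units where any one working unit suffices.

    Assumes independent failures and perfect, instantaneous switchover.
    """
    if not 0 <= unit_availability <= 1:
        raise ValueError("unit_availability must be between 0 and 1")
    if units < 1:
        raise ValueError("at least one unit is required")
    # The group is down only when every unit is down at the same time.
    return 1 - (1 - unit_availability) ** units


# Hypothetical single-unit availability of 99.8%.
for count in (1, 2, 3):
    combined = parallel_availability(0.998, count)
    print(f"{count} unit(s): {combined * 100:.5f}%")
```

Under these assumptions, a second unit lifts 99.8% to roughly 99.9996%, which is why the cost of five nines is largely the cost of duplicating hardware.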

The points to be made are:

If you would like a more detailed discussion on this topic that includes the calculations for redundant configurations, email [email protected] and mention this article.

About the author: Gary Audin has more than 40 years of computer, communications and security experience. He has planned, designed, specified, implemented and operated data, LAN and telephone networks. These have included local area, national and international networks as well as VoIP and IP convergent networks in the U.S., Canada, Europe, Australia and Asia.


 

11 Nov 2010
