How to Reduce Downtime for Your Business

We recently conducted a poll of over 25 businesses, and the results were crystal clear. The absolute, number one, most important aspect of their relationship with their IT vendors was uptime. This was no surprise to us, and it’s probably not a surprise to you either. What is a surprise is that, though downtime is a huge cost to any business, it is rarely quantified. All most businesses know is that they will not survive if their systems are continually down and unavailable.

Do you know how long your business can deal with downtime? And are your IT vendors living up to that expectation? This might surprise you too…

Your Business Needs May Be Different by Department

Whenever I ask a business owner how long their systems can be down, the answer is always the same: Never.

I usually tell them, “Well, anything can be done with enough money!”

No, I’m not joking. When I break down the cost of ensuring 100% uptime (not just 99.9%, but 100%) for the business owner, they typically respond, “Well, maybe we’ll be okay with some downtime.”

Businesses get used to having access to their systems – having access at home, on the road and at all hours of the night. But there are costs and complexities involved in creating an environment that has extreme uptime.

So instead of having one generalized uptime requirement for your business, look at and define your uptime requirements by operational area and/or by system.

You may find that it’s critical that your accounting systems have a max of 2 hours per year of downtime, but your business can live with up to a day of marketing software downtime. Document your uptime requirements in each area – this will help manage the expectations of your employees and your providers. It will also help ensure you pay for what you need – no more and no less.
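One way to document those requirements is to turn each "maximum downtime" figure into the uptime percentage you would ask a vendor for. Here is a minimal Python sketch of that arithmetic. The department names and hour values are hypothetical, and it assumes both limits are measured over a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a 365-day year

# Hypothetical requirements: maximum tolerable downtime per year, in hours.
max_downtime_hours = {
    "accounting": 2,   # at most 2 hours per year
    "marketing": 24,   # up to a day per year
}

def required_uptime_percent(max_hours_down, period_hours=HOURS_PER_YEAR):
    """Convert a downtime allowance into the uptime percentage it implies."""
    return 100 * (1 - max_hours_down / period_hours)

for area, hours in max_downtime_hours.items():
    print(f"{area}: {required_uptime_percent(hours):.3f}% uptime required")
```

Under these assumptions, accounting needs a bit under 99.98% uptime while marketing needs only about 99.73%. That gap is why one blanket requirement tends to overpay in some areas.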

Confirm Your Vendors’ Uptime Guarantees

Especially as you move to more cloud-based solutions, you should always confirm what the vendor’s Service Level Agreement (SLA) is regarding uptime.

An SLA can vary widely based on the vendor and the solution. Typical “best in class” providers will state in their SLA that they guarantee 99.999% (a.k.a. “five 9s”) of uptime, and if they don’t meet that guarantee in a given month they will refund part of your monthly fee. But I have seen uptime guarantees in SLAs for cloud software range from as low as 75% to as high as 100%.
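If you track outages yourself, checking a vendor against its guarantee is simple arithmetic. The sketch below is only an illustration: it assumes a 30-day month and that you have the total minutes of downtime for that month. How a real SLA defines and measures downtime varies by vendor, so treat the function names and inputs as hypothetical.

```python
def measured_uptime_percent(downtime_minutes, days_in_month=30):
    """Uptime actually delivered in a month, as a percentage."""
    total_minutes = days_in_month * 24 * 60
    return 100 * (total_minutes - downtime_minutes) / total_minutes

def sla_met(downtime_minutes, guaranteed_percent, days_in_month=30):
    """True if the measured uptime meets or exceeds the guarantee."""
    return measured_uptime_percent(downtime_minutes, days_in_month) >= guaranteed_percent

# A single one-minute outage already misses a five 9s guarantee for the month.
print(sla_met(downtime_minutes=1, guaranteed_percent=99.999))
# A 75% guarantee is still met after a full week of downtime.
print(sla_met(downtime_minutes=7 * 24 * 60, guaranteed_percent=75))
```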

To get a sense of the range, compare 90% uptime with “nine 9s”:

 

90% Uptime

Downtime per year: 36.5 days

Downtime per month: 72 hours

Downtime per week: 16.8 hours

 

Nine 9s (99.9999999%) Uptime

Downtime per year: 31.5569 milliseconds

Downtime per month: 2.6297 milliseconds

Downtime per week: 0.6048 milliseconds
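These figures come straight from multiplying the length of the period by the fraction of time the service is allowed to be down. Here is a short Python sketch you can use for any percentage. The period lengths are assumptions: the 90% figures above match a 365-day year, a 30-day month and a 7-day week, while the nine 9s yearly figure matches an average year of 365.2425 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def downtime_seconds(uptime_percent, period_days):
    """Seconds of downtime allowed in a period at the given uptime percentage."""
    return (100 - uptime_percent) / 100 * period_days * SECONDS_PER_DAY

# 90% uptime over a 365-day year, a 30-day month and a 7-day week.
for label, days in [("year", 365), ("month", 30), ("week", 7)]:
    hours = downtime_seconds(90, days) / 3600
    print(f"90% uptime, per {label}: {hours:.1f} hours")

# Nine 9s over an average year, reported in milliseconds.
ms = downtime_seconds(99.9999999, 365.2425) * 1000
print(f"nine 9s, per year: {ms:.4f} ms")
```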

 

How the Internet Changed Uptime

As business Internet usage has increased, so has the dependence on that service being reliable and “up” all the time. But there are various levels of Internet service, and understanding each provider’s uptime commitment is critical before signing on the dotted line.

For example, if you have a residential or small-business solution from Verizon FiOS or Time Warner Cable, they will schedule support based on “best effort.” In other words, they will make their best effort to get someone there soon, but they make no guarantees or commitments. This often means waiting days for an appointment. In this case, you need to have a Plan B for the days you are down.

Business-class Internet service from providers such as Level 3 or AT&T typically has a 4-hour response-time guarantee. This will be more expensive, but you may determine that your business requirements more than justify it.

In either case, to reduce lost productivity, it is critical for you to define a backup plan in the event of an outage, and communicate that plan to all employees. For example, your backup plan may define that if the Internet outage lasts more than an hour, you will send everyone home to work (ensuring, too, that they have what they need at home to be productive).

What Are Your Business’s Single Points of Failure?

A SPOF (single point of failure) is just what the name describes: a device or service that, if it goes down, causes the entire business or a large subset of the business to go down too.

Using the example of a business that uses cloud services for all its systems, the SPOFs will typically be within the office and include:

  1. Internet service provider router
  2. Firewall
  3. Network switch

If any of these were to break, the entire office would be unable to connect to the Internet to get to their cloud systems. Each one independently has a direct impact on your desired level of uptime. If you need 99.999% uptime, having even one SPOF means you will be unable to meet that requirement. Many times the device may simply need to be restarted, but if it must be replaced it could be days before the office is back online.
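To see why each device drags down the whole, multiply the availability of every device in the chain. Here is a minimal Python sketch. The 99.9% figure for each device is a made-up number for illustration, and the calculation assumes the devices fail independently of each other.

```python
from math import prod

# Hypothetical availability of each single device between the office and the cloud.
office_chain = {
    "isp_router": 0.999,
    "firewall": 0.999,
    "network_switch": 0.999,
}

def chain_availability(availabilities):
    """Availability of devices in series: all of them must be up at once."""
    return prod(availabilities)

overall = chain_availability(office_chain.values())
print(f"Overall availability: {overall * 100:.4f}%")
```

With three devices at 99.9% each, the office as a whole ends up at roughly 99.7%, lower than any one device on its own.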

The good news is, you can mitigate the risk using an N+1 strategy.

In the technical world, the way we mitigate a SPOF is by implementing a strategy called “N+1.” N is the base number of devices needed to run the company, and +1 is the addition of one more device to provide failover should a device fail.

For example, if you have one firewall to run the office, N+1 would require having two firewalls running. If one fails, the other takes over and the company remains connected while the faulty device is repaired or replaced. If you have two network switches, you would need to add a third to take over if one fails.

To be effective, the N+1 model must cascade through all SPOF areas. If you have two firewalls but have not added the third network switch, you still are not fully protected. For this reason, some companies will add a second Internet service in their office — maybe wireless service from another provider — to have redundancy and a failsafe if their primary Internet service goes down.
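The effect of N+1, and of leaving one area uncovered, can be estimated with a little probability. The Python sketch below is a rough model only: it assumes identical devices that fail independently, and the 99.9% per-device availability is a hypothetical figure.

```python
from math import comb, prod

def k_of_n_availability(device_availability, needed, installed):
    """Chance that at least `needed` of `installed` identical devices are up."""
    a = device_availability
    return sum(
        comb(installed, up) * a**up * (1 - a) ** (installed - up)
        for up in range(needed, installed + 1)
    )

A = 0.999  # hypothetical availability of any single device

firewalls_n_plus_1 = k_of_n_availability(A, needed=1, installed=2)
switches_no_spare = k_of_n_availability(A, needed=2, installed=2)
switches_n_plus_1 = k_of_n_availability(A, needed=2, installed=3)

# N+1 firewalls, but the switches still have no spare.
partial = prod([firewalls_n_plus_1, switches_no_spare])
# N+1 cascaded through both areas.
full = prod([firewalls_n_plus_1, switches_n_plus_1])

print(f"Firewalls only: {partial * 100:.4f}%")
print(f"Firewalls and switches: {full * 100:.4f}%")
```

In this model, the redundant firewalls barely help until the switches get their spare as well, which is the cascade effect described above.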

Obviously, implementing N+1 is more expensive because you must have an additional product or service in every part of your infrastructure. For this reason, whether to deploy N+1 is purely a business decision.

For example, you may determine the corporate office must have N+1 while the branch offices don’t need it. Or in a retail environment, you may determine the stores need N+1 in order to ensure they can process sales transactions, but the corporate office does not need it. The requirements are specific to your business and should be defined as such.

Your IT service provider should be able to help you make all of these decisions. We help our clients with this all the time here at Fluid. Would you like to talk to us about your business IT needs? Contact us right here.