According to a study released this past summer by a group of downtown NYC executives and commissioned by the Alliance for Downtown New York, the Real Estate Board of New York, the New York Building Congress, and the Association for a Better New York, “Many of the 34,000 customers who lost telephone and Internet capability on Sept. 11 didn’t realize how reliant they were on Verizon Communications Inc., which operates the largest telephone and data system in the city. To be safe, redundant communications systems should be built that don’t use Verizon lines, the report found.” More than one month after the attacks, thousands of residents and businesses were without basic phone service.
Questions about exposed areas in the network that weren’t being dealt with, whether because of financial limitations, an “it can’t happen to me” mentality, or just plain ignorance, are now in focus as companies strive for true network survivability.
According to the same report, one of the main conclusions was that landlords and businesses should establish redundant telephone, digital, and wireless communication networks. Prior to 9-11 they probably thought they had redundancy, but customers, property owners, and carriers alike have since raised the bar on those standards.
Most firms, one would think, are self-motivated to take proactive measures to minimize or eliminate single points of failure in their communications networks. After all, it makes good business sense to eliminate risk of loss. But this awareness is now spreading into the government sector as well. There are indications that the government will take a more active role in ensuring that companies plan for disaster recovery and protect themselves with truly redundant telecom facilities. There is just too much at stake to ignore the situation.
The Government Steps In
Several notable figures are now speaking out on the deficiencies brought to the surface by extreme risks to business continuity. For example, on Nov. 9, 2001, former Securities and Exchange Commission (SEC) Chairman Harvey L. Pitt made a speech to the Securities Industry Association annual meeting. In it he covered, in unusual detail, recommendations for establishing true telecommunications redundancy throughout the industry.
“Business continuity planning should seek to avoid reliance on single points of failure in critical systems,” said Pitt. “Since points of failure can occur in ways that are unforeseen, and even odd. The lines of competing telecom providers may all lie side by side in old, obscure conduits.
“Critical functions need backup capabilities with fail-over functionality allowing rapid recovery.”
These statements indicate where the regulatory climate appears to be moving. A disaster contingency plan — prepared by the top three regulators of the U.S. financial system — could push as many as two dozen major banks and securities companies to move their backup operations as much as 200 to 300 miles away from their main sites.
According to a report released on Aug. 29, 2002, by the Fed and the SEC, “Firms that play significant roles in critical financial markets should, at a minimum, plan to recover on the same business day” of a catastrophe.
It suggests business should resume no more than two to four hours after an interruption — as opposed to the four days most markets took to reopen after 9-11. To ensure quick recovery, it urges financial behemoths to develop “fully operational, out-of-region backup facilities for data and operations” or similar “remote outsourced facilities.”
The Federal Reserve is even getting into the mix by trying to establish new disaster recovery regulations on financial institutions so financial business activities can resume within a day after a catastrophic event. Some of the guidelines regulators are putting forward include:
1. Time parameters for business resumption
a. Organizations engaged in “core clearing and settlement” should be able to resume business in two hours.
b. Those processing transactions or communicating changes in customer positions should be able to recover within the business day.
2. Locations of back-up facilities
a. Primary sites and back-up sites should be at least 200 miles apart.
3. Testing procedures
a. Institutions should design cross-organization tests to assure compatibility.
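The guidelines above lend themselves to a simple automated check of a recovery plan. The following is a minimal sketch, not a regulatory tool: the names Site, great_circle_miles, and check_plan are hypothetical, the distance is a straight-line (great-circle) estimate, and "within the business day" is modeled as an assumed 8-hour limit that the guidelines themselves do not state.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8          # mean Earth radius
MIN_SITE_SEPARATION_MILES = 200.0    # guideline 2a
CORE_CLEARING_RTO_HOURS = 2.0        # guideline 1a
BUSINESS_DAY_HOURS = 8.0             # assumption: "within the business day" treated as 8 hours


@dataclass(frozen=True)
class Site:
    """Hypothetical record for a primary or back-up facility."""
    name: str
    lat: float  # degrees
    lon: float  # degrees


def great_circle_miles(a: Site, b: Site) -> float:
    """Haversine (straight-line over the globe) distance in statute miles."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def check_plan(primary: Site, backup: Site,
               planned_recovery_hours: float, core_clearing: bool) -> list:
    """Return a list of findings; an empty list means no gaps were detected."""
    findings = []

    distance = great_circle_miles(primary, backup)
    if distance < MIN_SITE_SEPARATION_MILES:
        findings.append(
            f"{backup.name} is only {distance:.0f} miles from {primary.name}; "
            f"guideline calls for at least {MIN_SITE_SEPARATION_MILES:.0f}."
        )

    # Core clearing and settlement firms get the stricter two-hour target.
    limit = CORE_CLEARING_RTO_HOURS if core_clearing else BUSINESS_DAY_HOURS
    if planned_recovery_hours > limit:
        findings.append(
            f"Planned recovery of {planned_recovery_hours:g} hours exceeds "
            f"the {limit:g}-hour target."
        )
    return findings
```

A planner could run check_plan for each primary and back-up pair and review any findings. Cross-organization testing (guideline 3a) is procedural and is not modeled here.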
It is evident that the way companies manage their local telecommunications infrastructure is changing. Whether through regulation or through change driven by customers, property owners, and telecom service providers, companies are searching for creative and reliable solutions to help them reduce the possibility of network outages without “breaking the bank.”
In doing so, the goal is to introduce a fair level of network diversity and redundancy without overkill. Finding the right balance among acceptable cost, risk tolerance, and the right technology is the challenge. The good news is that everyone is now sensitized to the issues and taking some positive steps to combat the problem. The bad news is that a legacy of vulnerability still exists in how customers have configured their networks and how carriers have deployed theirs. It will take money, time, and effort to chip away at the gaps and fortify the infrastructure in the major business markets.
Network deficiencies have been effectively exposed. According to an Oct. 19, 2001, Wall Street Journal report, “What the 9-11 attacks showed is the vulnerability of the final, local link to phones and computers through the nation’s telecom hubs, or ‘nodes,’ which act as collecting points for traffic. More than one month after the attacks, thousands of residents and businesses are without basic phone service.”
In terms of local network vulnerability, the attacks may represent the event that “awakened the sleeping giant.” However, it is more than just terrorist attacks that pose real threats to a telecom network. History provides real clues about what to protect against: companies are vulnerable to any number of natural and man-made disruptions, accidental or intentional, that can bring down one or many customers for extended periods of time. When considering a contingency plan, it is important to be aware of what can occur. For instance:
• The notorious backhoe digging up the street and taking all the fiber cable feeding a building with it.
• The vendor maintenance window that fixes one problem and creates another.
• The water main break in a 100-year-old pipe that floods the local central office.
• The unexplainable and mysterious software failure within a carrier’s network.
• The happy-fingered technician who goes to fix a problem and creates a bigger, different one by patching the wrong cable.
• The unexpected hardware failure that requires a part, which is not readily available.
• The hurricane, cyclone, or tornado that creates all manner of havoc.
We could go on and on with many other scenarios. The fact is that contingency planning is needed for all sorts of prospective outages.
Depending on the level of tolerable risk, a good telecommunication infrastructure contingency plan should offer some degree of resiliency, transparency, redundancy, and diversity. Let’s take a look at each:
• Resiliency: The ability to restore in the event of an outage, in the time necessary, before the impact to a business becomes too serious.
• Transparency: The operation of a completely alternate or “standby” network that is virtually the same as an existing fiber-based network in terms of operational performance, reliability, and security.
• Diversity: A “mirror-image” network, to the extent necessary, with an alternate carrier to keep business operations flowing in the event of a primary network failure, in addition to offering automated network load-balancing capabilities.
• In-Network Redundancy: The ability to limit or eliminate single points of failure within the primary or secondary network, to the extent necessary to keep business operations flowing with an existing carrier service offering.
Contingency planning is not a “one-size-fits-all” proposition. Most firms will need to consider several factors in order to finalize or implement the right contingency plan. Here are some of the key items that need to be evaluated:
• Customer risk threshold: How much outage time can be sustained? How long can a business be out of service before it begins to cost money or halt normal business operations? Will those losses be recouped after service is restored or are they “permanent” losses?
• Restoration time targets: The MTTR (mean time to repair) parameter. What is the “fail-over plan” to meet the restoration intervals with both a primary and secondary carrier? How fast will the plan restore adequate telecommunications?
• Financial impact: How much risk exists? How much does a company need to spend to minimize the risk? Does the potential loss of business or cost of potential loss outweigh the cost of protection?
• Level of scalability: How much customization is available to meet an individual company’s unique needs? One customer may need to replicate their entire voice, data, and private line network with their secondary provider. Another may need to just diversify a subset of their primary network with a truly diverse alternate carrier.
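The risk-threshold and financial-impact questions above reduce to a comparison between expected outage losses and the cost of protection. A minimal sketch of that arithmetic follows. The function names and the simple linear loss model are assumptions for illustration only; a real analysis would use a company's own outage history and loss figures.

```python
def expected_annual_loss(outages_per_year: float,
                         hours_per_outage: float,
                         loss_per_hour: float,
                         permanent_fraction: float = 1.0) -> float:
    """Expected yearly loss from outages under a simple linear model.

    permanent_fraction is the share of lost business that is never
    recouped after service is restored (1.0 means all of it is permanent).
    """
    return outages_per_year * hours_per_outage * loss_per_hour * permanent_fraction


def protection_pays_off(loss_unprotected: float,
                        loss_protected: float,
                        annual_protection_cost: float) -> bool:
    """True when the loss avoided each year exceeds the yearly cost of protection."""
    return (loss_unprotected - loss_protected) > annual_protection_cost
```

For example, with purely hypothetical inputs, a firm might compare its expected loss with day-long outages against its expected loss when a standby network cuts each outage to under an hour, then weigh the difference against the standby network's annual cost.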
Once a company understands its risks and needs, developing a plan and executing it will come into focus. The key is finding a service partner or vendor capable of delivering the best technology, the right mix of network services, and a network that is fully diverse from the facilities and infrastructure of the company’s primary network services vendor. Specifically, that diversification should focus on the characteristics of each component in the “last mile” connection, which was identified as the single point of failure in the most recent instances of terrorism.
What is the best way to find true diversity today? Let’s take a look at digital wireless broadband services.
The Broadband Wireless Way
One way to effectively meet the key contingency plan components (resiliency, transparency, diversity, and in-network redundancy) is to deploy a digital wireless broadband “standby network.” Today most companies have traditional fiber paths serviced by ILECs (incumbent local exchange carriers), CLECs (competitive local exchange carriers), and other fiber-based providers.
In fact, in some instances there can be quite a bit of choice for specific tenants in certain buildings. In some cases, as many as five, 10, 15, or more different fiber-based service providers can be providing access and voice and data services to a given location. But the fact is that choice does not necessarily translate into true carrier diversity.
Some carriers resell a portion of another carrier’s network, most commonly the ILEC network. This means you may be getting an invoice from one company while the services ride on the same local service path as your other active ILEC services. This is commonly known as Type 2 service. Also, even if the service path rides on a separate network, there might not be sufficient physical separation between competing carriers to establish true diversity. One way to address this could be to build a separate, independent fiber network, but that may be too costly when you factor in optical equipment, fiber construction, inside wiring, and overall management and maintenance of that network.
The bottom line is that there is usually some or ample availability of fiber-based services a customer can tap into, but it is a false sense of security to assume that because you are being billed by two different fiber-based carriers, you are fully protected from a single point of failure.
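The distinction between being billed by two carriers and being truly diverse can be made concrete by comparing the physical elements each circuit depends on. The sketch below assumes a company has gathered, for each circuit, the underlying network it rides on and the conduits, building entrances, and central offices along its path. The Circuit record and both function names are hypothetical, and in practice this information has to be requested from the carriers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circuit:
    """Hypothetical inventory record for one access circuit."""
    billed_by: str               # company that sends the invoice
    underlying_network: str      # network the service actually rides on
    path_elements: frozenset     # conduits, building entrances, central offices


def shared_points_of_failure(a: Circuit, b: Circuit) -> set:
    """Return everything the two circuits have in common."""
    shared = set(a.path_elements & b.path_elements)
    # A resold (Type 2) circuit shares the whole underlying network.
    if a.underlying_network == b.underlying_network:
        shared.add(f"network:{a.underlying_network}")
    return shared


def is_truly_diverse(a: Circuit, b: Circuit) -> bool:
    """Two circuits are diverse only if they share nothing at all."""
    return not shared_points_of_failure(a, b)
```

Under this model, a reseller's circuit that rides the ILEC network is flagged as sharing that network even though a different company sends the invoice, while a rooftop wireless link to a separate central office shares nothing with the fiber path.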
Digital wireless broadband services offer unique and critical solutions for business continuity and disaster recovery. Wireless offers the ability to provide the right level of network diversity; the option to implement a full or partial spectrum of applied voice and data services, available “on standby” and delivered from separate and distinct carrier central offices. It offers an effective way to round out a good “fail-over” plan. Introducing an alternative wireless broadband network can help companies to effectively avert or minimize risk associated with fiber network outages and establish the right business continuity and disaster recovery capabilities.
Properly designed, a wireless broadband network can be flexible, scalable and customized enough to offer a full range of solutions from a “mirror-image” voice/data “hot standby” network to a single diverse private line connecting to a primary carrier.
During an outage of terrestrial-based services, a wireless contingency plan can help restore service virtually instantaneously or in less than a day, depending on the network element and how the back-up services are configured.
In addition, it can be potentially very cost-effective compared to capital expenditures for optical equipment, fiber construction/leasing or purchasing an alternative dedicated SONET ring.
Resiliency Measurement: Network Attribute Desired for Wireless Solution
• Network and service restoration time during a fiber network outage: Virtually instantaneous for most voice, data, and private line services if configured in true “standby” mode. A few hours or within the same day to port over existing inbound phone numbers not actively carried by the wireless network.
• Capacity overflow: Virtually instantaneous if the network is configured with load balancing enabling other users to access trunks. (Example: Dial 7 to access wireless trunks. Dial 8 to access the primary network.)
Does service quality have to suffer because wireless technology is being used as the “standby network”?
Absolutely not. In many cases service parameters can equal or exceed those of some fiber-based provider networks. Since the network is so reliable, it may make sense not to keep it sitting idle in the background, waiting for a disaster to occur.
A better approach is to use your alternate network actively by load-balancing traffic. By doing so, a business can effectively take some “eggs out of the basket” by shifting part of the burden away from a potentially taxed fiber network.
In addition, you consistently exercise the “stand-by” network so it is unquestionably ready and active when disaster occurs. And finally, you establish a competitive balance and increase bargaining power by instituting an effective multi-vendor solution.
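The active load-balancing approach described above can be sketched as a round-robin selector that alternates calls between trunk groups and skips any group that is currently down. This is an illustrative model only: the TrunkSelector class is hypothetical, and real PBX or switch platforms implement trunk selection through their own configuration rather than code like this.

```python
class TrunkSelector:
    """Round-robin across trunk groups, skipping any that are marked down."""

    def __init__(self, trunk_groups):
        self._groups = list(trunk_groups)
        self._healthy = set(self._groups)
        self._next = 0  # index of the group to try first on the next call

    def mark_down(self, group):
        """Record that a trunk group has failed (for example, after a fiber cut)."""
        self._healthy.discard(group)

    def mark_up(self, group):
        """Return a known trunk group to service."""
        if group in self._groups:
            self._healthy.add(group)

    def pick(self):
        """Return the trunk group for the next call."""
        for _ in range(len(self._groups)):
            group = self._groups[self._next]
            self._next = (self._next + 1) % len(self._groups)
            if group in self._healthy:
                return group
        raise RuntimeError("no trunk group available")
```

Because both networks carry live traffic every day, a failure of one simply shifts all calls to the other, and the standby path is exercised continuously rather than only during a disaster.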
Transparency Measurement: Network Attribute Desired for Wireless Solution
• Reliability: 99.999% uptime (< 6 minutes per year of unplanned outage), which equals or surpasses competitive fiber network performance.
• Fast installation/provisioning: Equal or better service intervals. 30 days or less on service turn-up, depending on connectivity or footprint in a building.
• Network management: 24-hour, seven-day-a-week network surveillance, proactive monitoring, and four-hour MTTR (mean time to repair).
• Security: Radio transmissions cannot be “tapped.” Proprietary radio interfaces between the radio and the in-building electronics. Spectrum can be exclusive within a service area.
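The reliability figure above can be checked with simple arithmetic: an availability percentage translates directly into a yearly downtime budget. The helper below is a small sketch. The function name is hypothetical, and a year is taken as 365.25 days.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def downtime_minutes_per_year(availability: float) -> float:
    """Unplanned downtime allowed per year for a given availability (0 to 1)."""
    return (1 - availability) * MINUTES_PER_YEAR
```

At 99.999% availability the budget works out to a little over five minutes per year, consistent with the "< 6 minutes" figure quoted above.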
A wireless broadband network allows you to separate and distinguish the key network components. The wireless infrastructure provides a virtually guaranteed diverse route linking a building to the network cloud. It is typically designed via a rooftop architecture, which inherently protects against some of the traditional network failures (fiber cuts, floods, and other common fiber-based network outages).
As an added layer of protection and survivability on top of carrier diversity, a redundantly configured wireless broadband network could potentially offer ultimate protection against an extended network failure.
Through the deployment of diverse wireless hubs serving a building, establishing alternate connection to back-up sites, and provisioning a duplicate configuration to distinct voice and data networks, you can effectively deliver the ultimate solution by introducing a fault-tolerant layered design which virtually eliminates all single points of failure at the network level and at the service level. Here is the potential three-layered approach:
1. Existing fiber-based carrier network (primary).
2. A diverse wireless primary connection (alternate or second primary if load-balancing is employed).
3. A diverse wireless redundant connection (secondary connections to buildings and central offices).
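The benefit of layering can be estimated with a standard reliability calculation: if the paths fail independently, service is lost only when all of them are down at once. The sketch below assumes that independence, which holds only to the extent that the paths are truly diverse. The function name is hypothetical, and any availability values fed to it are the planner's own estimates, not figures from this article.

```python
from math import prod


def combined_availability(path_availabilities) -> float:
    """Probability that at least one path is up, assuming independent failures."""
    # Every path must be down simultaneously for the service to be unavailable.
    all_down = prod(1 - a for a in path_availabilities)
    return 1 - all_down
```

Two paths that share a conduit or central office do not fail independently, so this formula would overstate their combined availability. That is why the diversity of each layer matters as much as the number of layers.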
Redundancy Measurement: Network Attribute Desired for Wireless Solution
• Separate circuit path and routing: Separate and distinct physical paths connecting the building to the network.
• Multiple hubs serving a building: If one hub fails, the other remains operational to the building.
• Separate connections to customer back-up locations: If the customer has established a back-up site, another wireless broadband connection can link the back-up site to the network.
• Diverse central office: Monitored network via an alternative wireless central office instead of a fiber-based carrier.
• Redundant applied/switched services: Access to another network of voice, data, and Internet services.
Where’s the Solution?
According to a Gartner-Raging Wire Telecommunications report, 150 of the 350 companies that operated in the World Trade Center before the 1993 bombing were out of business a year later because of the disruption. According to the Wall Street Journal, more than one month after the attacks, thousands of residents and businesses were without basic phone service. Wireless broadband is a proven technology that can deliver resiliency, transparency, diversity, and in-network redundancy the wireless way.
Fabio Campagna is director of product management for IDT Solutions Private Line and Business Continuity Network Services. Campagna has more than 16 years of experience in telecommunications, including Global Crossing, AT&T/TCG, and MCI International. Campagna holds an MBA in business management from Fairleigh Dickinson and a BS in marketing from Seton Hall.