Perhaps no two communities worry more about a disaster and service restoral than the military and financial communities, for which sophisticated telecommunications and distributed processing intelligence are the very lifeblood of their daily activities. However, increasing numbers in the manufacturing, transportation, retail and medical industries are realizing that they, too, are vulnerable to service outages which seriously jeopardize their bottom line profitability and competitiveness.
The Communications Resource
Communications resources are coming under the closer scrutiny of top managers in commerce and industry, in a similar manner to that applied to other critical resources. We easily understand why management looks carefully at finance, personnel, raw materials, plant and equipment, etc. The phone system, however, has rarely been on the list of critical resources to be managed and protected.
Times have changed, and senior managers have come to realize that along with the increased reliance on computers, the strategic link that supports the flow of information between operating units, customers, and suppliers is the telecommunications network, no longer a primitive “phone system.” The reliance on this resource has increased in recent times to a level that demands greater attention to network planning, utilization and protection.
The growing concern about the impact of catastrophic failure on business communications can be gauged by the mushrooming number of companies that offer disaster recovery protection services, and by the many recent seminars on network security that have focused on such events as the Hinsdale fire, which affected some 150,000 business circuits in the greater Chicago area for some two weeks in 1988. One firm invested nearly $600,000 in fees and expenses to resume temporary operations during the Hinsdale fire. The firm calculated that the recovery effort saved it $30 million in potential lost sales. In the past two years, major fires in New England and Los Angeles, floods in the Midwest, and earthquakes on the West Coast have exacted huge financial penalties on thousands of businesses, not in loss of capital equipment, but in service outages ranging from several days to several weeks.
The following section briefly reviews how service restoral today is addressed by many public carriers and “disaster recovery” companies. Some only address nodal failures, others address link failures, and some try to contend with both.
Companies involved in disaster protection and recovery are aware of the growing dependence on telecommunications by major clients, where operations rely on high speed data transfer, transaction processing, and voice communications in the highly competitive business environment. Communications carriers and local operating companies are also evaluating the impact of catastrophic failure on business users and are offering terrestrial diversity solutions for disaster prevention and recovery.
Today’s terrestrial service restoral alternatives employ techniques such as ring architectures (Figure 1), portable electronic switches, and microwave link detours. Companies such as SunGard in Philadelphia and Comdisco in Chicago offer “Hot Rooms,” sites preloaded with modems, muxes, T1 facilities and other data processing equipment; or “Cold Rooms,” empty rooms made available for customers to relocate their own equipment and management in temporary quarters. These are the Telemedics of today--computer paramedics. Public carriers have also laid complex restoral plans as a result of the well-publicized rash of incidents involving fire, flood and extensive fiber cuts.
Many of the plans involve alternate routing of cables and transmission lines. These backup options are often compromised because (a) alternate paths share a common right-of-way for some portion of their span, and (b) these architectures still rely on the ability of the central offices to maintain service throughout the disaster. The Chicago disaster proved both assumptions wrong (Figure 4). Moreover, recovery centers and alternate cable routes lie dormant until disaster strikes. The high cost of deploying standby strategies is borne by the end user as the price of insurance: invariably high, but vital when needed. The stakes are even higher when the alternate paths and emergency equipment deployed in recovery centers are not for voice and low-rate data, but for T1 and T3 services and video conference capability. One regional carrier estimates a cost of $500 million for its service restoral backup network.
Hybrid diversity architectures recognize that satellite communications provides a reliable alternative path, one that avoids many of the pitfalls of terrestrial diversity topologies. Satellite links originating and terminating on customers’ premises, for example, provide complete terrestrial bypass, even of the central office and local loop! Moreover, satellite paths can be overlaid on existing terrestrial networks.
A pivotal advantage in hybrid diversity architecture involves economics and system flexibility. The large footprint coverage of the satellite and the flexibility of the earth terminal switching equipment offer three specific cost-saving measures over and above the primary purpose for service restoral:
(1) The satellite channel is viewed as a single-pool telecommunications resource that can be shared by many sites and temporarily allocated to one or more “failed” locations
(2) The satellite links can be used as overflow and expansion venues when they are not needed for emergency restoral
(3) The flexible switching equipment can be configured to reroute only high-priority services during disaster via user-friendly software control.
Terrestrial/satellite hybrid network architectures require that each of the load-sharing communications links be able to emulate the technical and operational characteristics of the other. Features such as DAMA (Demand Assigned Multiple Access), Network Synchronization, Protocol Handling, Signaling, Encryption, Network Management, etc. must be compatible in both elements of the hybrid network.
Two major hybrid architectures are emerging. The dominant yet less flexible strategy, termed Host Center Backup, focuses on nodal failure and is limited to point-to-point and VSAT networks (Figure 5). Here, satellite links are deployed to provide backup for the catastrophic loss of a primary data center, and network operation is easily and quickly redirected to a secondary center. The diversity strategy includes what is termed “electronic vaulting”--i.e., the frequent storage and updating of the primary host database at a secondary location. This procedure ensures a hot standby secondary host site. The small, portable remote satellite terminals can be rapidly deployed to the scene; the large hub is a permanent installation.
However, contrary to popular belief, most business telecommunication needs are not primarily point-to-point links, nor do they exclusively carry computer data traffic. Voice, video conference, and high speed T1/T3 traffic represent the larger percentage of traffic today and for the next decade. The advent of ISDN services only accentuates the need for a more flexible hybrid architecture than merely backing up a central host site. Figure 6 depicts an intelligent satellite network superimposed on existing terrestrial services. The key difference here is that, unlike Figure 5, the satellite capacity is a single pool. Any one location can access any part of this capacity to communicate with any other site, and all sites can receive any single node’s transmission.
Viewed in the larger context of this paper, i.e. the total superimposed telecommunications facility, we will now examine how this architecture can satisfy all three cost-saving measures listed earlier in this section.
Single-Pool Satellite Channel
During steady-state operations the network in Figure 6 is an integrated resource which is sized to carry the aggregate capacity:
Total network capacity = capacity on satcom + capacity on terrestrial.
The key point here is that the partial capacity carried by the satellite links is not dedicated capacity. Hence, unlike the fixed point-to-point terrestrial links, the satellite channel is shared by all sites in the network. So what? The major advantage becomes apparent in the ability of the satcom subnet to carry voice. Exactly emulating the terrestrial telephone switches, the multiple access satellite subnet equipment can be sized for less than the sum of all voice circuits, since not all circuits will be simultaneously busy. In effect, the satellite subnet can carry voice traffic more efficiently than the point-to-point terrestrial links. This is exactly how a telephone switch operates. Similarly, the satellite subnet equipment can be configured to share the satellite channel for some data services, too. Note that the pooling of access to a common satellite channel allows for dynamic balancing of traffic carried among the sites by the satellite; i.e., when site B experiences peak traffic periods, it may borrow more of the satellite channel than site D, which is passing through a low traffic period. Furthermore, mesh topologies (Figure 6), as well as the star topologies (Figure 5), can be easily configured from a user-friendly software console.
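The sizing argument above can be illustrated numerically. The sketch below is not taken from the article: it assumes the standard Erlang B blocking formula as the sizing rule (the article names no formula), and the per-site voice loads and the 1% blocking target are hypothetical figures chosen only for illustration. It compares the circuits needed when each site has its own dedicated group with the circuits needed when all sites share one pool.

```python
def erlang_b(offered_erlangs: float, circuits: int) -> float:
    """Blocking probability when `offered_erlangs` of traffic is offered
    to `circuits` trunks, using the standard Erlang B recurrence."""
    blocking = 1.0  # with zero circuits every call is blocked
    for n in range(1, circuits + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def circuits_needed(offered_erlangs: float, max_blocking: float) -> int:
    """Smallest number of circuits that keeps blocking at or below the target."""
    n = 0
    while erlang_b(offered_erlangs, n) > max_blocking:
        n += 1
    return n


# Hypothetical figures (assumptions, not from the article):
# four sites, each offering 2 erlangs of voice traffic, 1% blocking target.
SITE_LOADS = [2.0, 2.0, 2.0, 2.0]
TARGET_BLOCKING = 0.01

# Dedicated sizing: every site gets its own circuit group.
dedicated = sum(circuits_needed(load, TARGET_BLOCKING) for load in SITE_LOADS)

# Pooled sizing: all sites draw on one shared group of circuits.
pooled = circuits_needed(sum(SITE_LOADS), TARGET_BLOCKING)

print(f"dedicated circuits: {dedicated}, pooled circuits: {pooled}")
```

Under these assumed loads the pooled figure comes out well below the dedicated total, which is the effect the text describes: a shared channel need not be sized for the sum of all circuits.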
Assume that the terrestrial portion of the network is the primary path and carries a mix of data and voice traffic T (D1, D2, D3, V1, V2, V3), and the secondary satellite portion of the network is carrying its share of data, voice, and video traffic S (D4, D5, V4, V5, V6, C1, C2, C3). Now the flexibility of this diversity architecture really comes into play:
(a) Circuit Expansion
If the primary terrestrial network is subjected to temporary or permanent traffic overloads, circuits can be transferred to the unused satcom capacity (if any), or circuits can preempt lower priority satellite circuits; e.g., priority circuits D1, D2 and V1 are transferred from the terrestrial network T to the satellite backup network S. New services can also be added in minutes instead of months; e.g., add two new voice circuits V7 and V8 to the satellite backup network.
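The transfer-and-preempt behavior described above can be sketched as follows. The circuit names come from the example in the text; everything else is an assumption made for illustration: the priority values, the capacity each circuit occupies (expressed in hypothetical "slots"), the size of the satellite pool, and the rule of preempting the lowest-priority circuits first. It is a minimal model, not the vendor's switching logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circuit:
    name: str
    priority: int  # 1 = highest priority (values are assumed)
    slots: int     # satellite capacity units occupied (assumed)


def transfer_to_satellite(incoming, satellite, pool_slots):
    """Move `incoming` circuits onto the satellite pool.

    Unused capacity is taken first; if that is not enough, strictly
    lower-priority satellite circuits are preempted, lowest priority first.
    Returns (circuits now carried, circuits preempted).
    """
    carried = list(satellite)
    preempted = []
    for circuit in sorted(incoming, key=lambda c: c.priority):
        free = pool_slots - sum(c.slots for c in carried)
        # Candidates for preemption, lowest priority (largest number) first.
        victims = sorted(
            (c for c in carried if c.priority > circuit.priority),
            key=lambda c: c.priority,
            reverse=True,
        )
        chosen = []
        for victim in victims:
            if free >= circuit.slots:
                break
            chosen.append(victim)
            free += victim.slots
        if free < circuit.slots:
            continue  # cannot fit; leave it on the terrestrial path
        for victim in chosen:
            carried.remove(victim)
            preempted.append(victim)
        carried.append(circuit)
    return carried, preempted


# Satellite traffic S from the text, with assumed priorities and sizes.
SATELLITE_S = [
    Circuit("D4", 2, 1), Circuit("D5", 3, 1),
    Circuit("V4", 2, 1), Circuit("V5", 3, 1), Circuit("V6", 3, 1),
    Circuit("C1", 4, 2), Circuit("C2", 4, 2), Circuit("C3", 4, 2),
]
POOL_SLOTS = 12  # assumed pool size, leaving one slot unused

# Priority circuits D1, D2 and V1 transferred from terrestrial traffic T.
TRANSFERRED = [Circuit("D1", 1, 1), Circuit("D2", 1, 1), Circuit("V1", 1, 1)]

carried, preempted = transfer_to_satellite(TRANSFERRED, SATELLITE_S, POOL_SLOTS)
print("carried:  ", [c.name for c in carried])
print("preempted:", [c.name for c in preempted])
```

With these assumed values, the first transferred circuit uses the spare slot and the remaining two fit after a single video conference circuit is preempted. Treating video conferencing as the lowest priority is an illustrative choice only; in practice the ranking is whatever the user programs.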
(b) Nodal Expansion
The satellite’s broad coverage allows the network to extend quickly to new sites, whether or not adequate terrestrial services are available at these sites. Site E in Figure 6 can join the network as a full partner without disrupting existing services--in days, not months.
This flexibility illustrates how the backup subnet does not lie unused during steady state periods. Reacting to link or nodal failure becomes a matter of preempting selective low priority satellite services as top-priority traffic is routed to the backup satellite subnet. This is done within minutes, not days or weeks. This quick reaction to failures is inherent in the equipment’s switching components at no additional cost to the user.
Software-Defined Priority Rerouting
Figure 7 shows the switching “nerve center” at any protected site (not all sites must be protected). This site could either be at the end user’s premises or at the local central office; i.e., Figure 7 shows details of any single node of Figure 6. Two switches are at play here. The terrestrial switch allocates the site circuits to either the terrestrial or the satellite subnet. This is done automatically by programming the switch, via a local software console, or remotely via a centralized control console at some other site. The switch can also employ Automatic Route Selection to redirect selected circuits upon detection of failure on either the satellite or the terrestrial subnet.
The second switch is in the emergency satcom earth terminal. This switch redirects the satellite circuits to their destination nodes (as would a C.O. telephone switch), or redirects all its traffic from a failed node to a predesignated alternate termination. This switch, too, is configured via preprogrammed “connectivity maps,” via a local console, or a remote centralized control console at some other site.
Together, both switches allow this site to maximize its use of the primary and secondary telecommunications resources, yet are flexible enough to accommodate each node’s varying traffic patterns.
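The division of labor between the two switches can be sketched in a few lines. This is a simplified model under stated assumptions, not the equipment's actual configuration format: the circuit-to-subnet assignments, the set of protected circuits, the site letters, and the alternate-termination table are all hypothetical, as are the function names `select_routes` and `resolve_destination`.

```python
# --- Terrestrial switch: Automatic Route Selection (simplified) ---

# Assumed steady-state assignment of circuits to subnets.
PRIMARY_ROUTE = {
    "D1": "terrestrial", "V1": "terrestrial", "V3": "terrestrial",
    "D4": "satellite", "C1": "satellite",
}
# Assumed set of circuits the user has marked for rerouting on failure.
PROTECTED = {"D1", "V1", "D4"}


def select_routes(primary, protected, failed_subnets):
    """Return the subnet each circuit uses, or None if it is out of service."""
    routes = {}
    for circuit, subnet in primary.items():
        if subnet not in failed_subnets:
            routes[circuit] = subnet  # normal path is healthy
            continue
        other = "satellite" if subnet == "terrestrial" else "terrestrial"
        if circuit in protected and other not in failed_subnets:
            routes[circuit] = other   # redirect protected circuit
        else:
            routes[circuit] = None    # unprotected circuit waits for repair
    return routes


# --- Satellite terminal switch: connectivity map (simplified) ---

# Assumed alternate terminations: traffic for failed site A goes to site C.
ALTERNATE_TERMINATION = {"A": "C"}


def resolve_destination(destination, failed_nodes, alternates):
    """Return the node that should terminate traffic bound for `destination`."""
    if destination not in failed_nodes:
        return destination
    alternate = alternates.get(destination)
    if alternate is None or alternate in failed_nodes:
        return None
    return alternate


# Example: the terrestrial subnet fails and site A is lost.
print(select_routes(PRIMARY_ROUTE, PROTECTED, {"terrestrial"}))
print(resolve_destination("A", {"A"}, ALTERNATE_TERMINATION))
```

Keeping the two tables separate mirrors the text: the first decides which subnet carries a circuit, the second decides where satellite traffic lands, and each could be edited from a local or a remote console without touching the other.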
The hybrid diversity architecture is a powerful means of providing rapid-response service protection and restoral without the high cost of deploying unused facilities, links and equipment. Today’s technology leaves no room for excuses that reliable networks must be saddled with the high cost of dormant backup telecommunications facilities.
Network diversity and service restoral are in effect warnings not to “put all our eggs in one basket,” particularly if service outages are not mere inconveniences but catastrophic to your performance and profitability. The unique hybrid architecture presented here is not intended merely to show how to redirect vital traffic quickly, but to afford the user the flexibility of selecting which traffic is of highest priority and cannot incur outages. Moreover, the proprietary architecture allows for the cost-effective use of backup facilities during steady state periods.
In today’s increasingly competitive and shrinking world, businesses jockey fiercely to maintain or improve their market positions. No business strategy reliant on the time-sensitive transfer of information is complete if it does not address the vulnerability of the telecommunications resource.
Written by Dr. S. S. Kamal and Keith Dunford, SPAR Communications Group.
This article is adapted from Vol. 3, No. 1, p. 10.