The Hidden Factor In IT Network Downtime

Written by  Jonathan Buckley Wednesday, 21 November 2007 23:46

Ultimately, all companies across all industries today must maintain a high level of availability of their IT and network systems or face great peril. Certainly this readership does not need a rehashing of the economics of downtime. Suffice it to say, it’s not good. A disaster to a company need not be a world-affecting, broadcast event. It could be one of those not-so-quiet yearly corporate outages that require a visit to the CEO’s office after systems are restored.

Presumably of more interest to this audience is a root cause understanding of network and system downtime, along with techniques or tools to help avoid such unfortunate occasions. Evidence of this interest is the boom in network and systems sales and, more recently, in the specific segment of root cause software, which indicates the worldwide appetite for solutions to measure, assess, predict and, ultimately, avoid network and system outages. Sales of these software packages are in the tens of billions of dollars annually by all survey accounts.

Despite our best efforts and the best IT software management packages, failures occur. Why?

This author believes we don’t do enough to manage our IT and network ecosystem as a supply chain. One might think of IT systems as a supply chain linked together from its raw material inputs (electrons, process cooling, operating environments) to its processing and storage (systems), delivery (network) and the like.

Now, take this supply chain model and put it in a more familiar stack, like the one we are used to seeing in the seven-layer OSI model (below), but simplify the details of processing, storage and delivery, and expand on the inputs such as power, environmental, fire safety, space, and physical security assets.

What you might notice is that the inputs to the IT supply chain seem more like a foundational layer upon which IT depends. What is interesting, however, is that most corporations today lack rapid, remote visibility into these supply chain elements, even with their billions of dollars in network and system management packages and root cause engines. At the same time, study after study shows that somewhere between 30 percent and 50 percent of failures in the IT supply chain have a root cause in these foundational layer inputs.

For example, Ontrack, one of the best-known professional data restoration services, studied data loss in more than 50,000 hard drives and other storage devices. It concluded that hardware and system malfunction accounted for 44 percent of all data lost. The listed causes are all related to failure at the input or foundational layer of the IT supply chain: power failures, power surges, dust, moisture, heat and physical shock.

Even the United States courts have weighed in and indirectly lent credence to the case that the links of the IT supply chain are inseparable, despite our managing them otherwise. On April 18, 2000, the United States District Court, D. Arizona, ruled in AMERICAN GUARANTEE & LIABILITY INSURANCE COMPANY vs. INGRAM MICRO, INC. In summary:

“This case presented an insurance coverage dispute between Plaintiff/Counterdefendant American Guarantee & Liability Insurance Company (“American”) and Defendant/Counterclaimant Ingram Micro, Inc. (“Ingram”). American issued Ingram a property damage policy which insured against certain business interruption and service interruption losses. As a result of a power outage, Ingram’s computer systems were rendered inoperable. Ingram made a claim under its policy to American and American denied the claim. Thereafter, American filed a Complaint for declaratory relief against Ingram and Ingram filed a Counterclaim for breach of contract.

“Pending before the Court were cross-motions for partial summary judgment on the issue of whether a 1998 power outage caused “direct physical loss or damage from any cause, howsoever or wheresoever occurring” to Ingram’s computer system.”

The court concluded:

“At a time when computer technology dominates our professional as well as personal lives, the Court must side with Ingram’s broader definition of “physical damage.” The Court finds that “physical damage” is not restricted to the physical destruction or harm of computer circuitry but includes loss of access, loss of use, and loss of functionality.

“The Court is not alone in this interpretation. The federal computer fraud statute, which makes it an offense to cause damage to a protected computer, defines damage as “any impairment to the integrity or availability of data, a program, a system, or information.””

In this case, the court judged that power and the IT machinery it supported were inseparable ... in other words, an interconnected chain. Why, then, do we not manage the uptime equation as a chain, without a divide between IT and facilities?

The reasons for the disconnectedness of the IT supply chain are understandable:

1. The intelligent equipment in the input or foundational layer of the IT supply chain does not lend itself well to monitoring via IT’s standard SNMP polling (this topic alone would require an article);
2. To date, the tools to remotely monitor this foundational layer have been legacy building control technologies designed for local, proprietary use, not enterprise-wide monitoring incorporated into the rest of the supply chain;
3. Consequently, too few companies have merged the interests of IT and facilities.

Thus the entire IT supply chain has not been effectively managed as an enterprise, and this disjointedness, due to a lack of root cause understanding or of tools to manage and assess these causes at the IT supply chain input level, has led to famous disasters. Service providers, Internet companies, banks and manufacturers alike have spent time in the newspaper because of outages due to failed generators during rolling blackouts, water leaks or simply failed air cooling and unmanned sites.

Keep in mind that software tools can help only so much with disaster avoidance within the IT supply chain, but there is value in being able to rapidly assess the viability of the different supply chain components post-disaster.

For example, how long might it now take for your company to assess the viability of its facilities systems after a major earthquake?

The answer to that question is almost certainly different from the answer to this one: how long would it take to assess the viability of your network connection after the same natural disaster?

If the IT supply chain were managed as such, the answers would match, because you would have remote, unified, global visibility into all areas of the IT supply chain, including power, fire, environmental and physical security systems, just as you do into server health.
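To make the idea of unified visibility concrete, here is a minimal Python sketch of a single status roll-up across facilities, systems and network. Everything in it is an assumption made for illustration and not any vendor's product: the `Status` values, the layer labels, the component names, and in particular the choice to rank a component with no remote visibility (`UNKNOWN`) as the worst case.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical health states, ordered so that a larger value is worse."""
    OK = 0
    DEGRADED = 1
    FAILED = 2
    UNKNOWN = 3  # assumption: no remote visibility is treated as the worst case


@dataclass(frozen=True)
class Component:
    name: str
    layer: str  # illustrative labels: "foundation", "systems", "network"
    status: Status


def chain_status(components):
    """A chain is only as healthy as its weakest link."""
    return max((c.status for c in components), default=Status.UNKNOWN)


def weakest_links(components):
    """Return every component that shares the worst status in the chain."""
    worst = chain_status(components)
    return [c for c in components if c.status == worst]


# Example site: the generator cannot be polled remotely, so the whole
# chain cannot be declared viable until someone checks it.
site = [
    Component("utility power", "foundation", Status.OK),
    Component("generator", "foundation", Status.UNKNOWN),
    Component("cooling", "foundation", Status.DEGRADED),
    Component("database server", "systems", Status.OK),
    Component("WAN link", "network", Status.OK),
]

print(chain_status(site).name)
print([c.name for c in weakest_links(site)])
```

In this sketch the healthy server and network link do not make the site viable: the roll-up stays at the worst status until the foundational layer reports in, which is the point of assessing facilities and IT in one view.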

New tools are now coming to the market to begin to address this forgotten piece of the IT supply chain. Built on new-era architectures that directly monitor and predict the health and well-being of the foundational layer and link it to the rest of the IT supply chain, these tools promise to provide the same visibility into the IT enterprise as the CFO would expect of his/her financial system.

In conclusion, proactively managing the IT supply chain as one complete enterprise can help businesses avoid the costly outages and downtime associated with unplanned failure in their facilities’ machinery. Getting to this information is a challenging task that few companies ever accomplish, especially given today’s external pressures and implementation obstacles. As companies seek solutions to automate this entire process, they must look holistically at all the key requirements and leverage the benefits of new technologies coming to the market over the next few years.


Jonathan Buckley (jbuckley@netbrowser.com) is the vice president of marketing and business development for NetBrowser Communications. NetBrowser has pioneered and patented an enterprise monitoring software suite, e-Guardian, for what it calls The Zero Layers, or the facility foundations layer upon which critical IT systems depend. NetBrowser’s Fortune 1000 customer base has plenty of stories of how they avoided disasters using this new technology.

 
