- was brought up with central technology and dedicated networks;
- is aware of the relevance of workstations, but has trouble adopting them;
- is more experienced;
- usually supports mission-critical systems;
- is therefore aware of the importance of recovery.
- was brought up in the microcomputer and local area network environment;
- does not understand why mainframes are still in use;
- lacks experience;
- is little concerned with mission-critical systems;
- believes that recovery can be handled ad hoc within a few minutes and that procedures are superfluous.
This situation persists. However, we are now witnessing a new wave that is pulling a number of organizations into the still ill-defined wake of electronic commerce. Despite the enthusiasm this concept arouses, it has few clear definitions to this day. In the context that concerns us, however, what is both clear and worrisome comes down to the following facts:
- they are systems whose reliance on the Internet justifies their implementation and drives their spread;
- they are multi-partner systems, typically with three partners:
  - a leader partner organization supporting the system's mission;
  - a financial services provider;
  - a telecommunication services provider;
- they are systems where each partner's local area network plays a predominant role.
During the required technological upgrade, each partner must integrate new components into its local area network infrastructure (still the same one, without continuity and recovery mechanisms). These may be components that are classic in the telecommunication realm but new to other organizations, such as switches or routers, or components specific to the commerce and open communications domain, such as firewalls. The first step then consists in adding these components to the existing infrastructure so that it can be integrated into the electronic commerce system. At the same time, organizations will sometimes take advantage of the situation to entrust the monitoring of operating activities (performance, security, back-up, etc.) to sub-systems running on the local area network. When we realize that these responsibilities depend mainly on the local area network, that the same is true of the other partners, each with its own culture and means of measuring risk, and that there is considerable catching up to do on continuity measures, we literally hit a wall.
The solution: the leader partner organization must oversee the implementation of measures aimed at guaranteeing a common level of stability and availability across the different segments of the local area networks concerned. The leader partner organization must take control since, in the eyes of the clients, there is only one visible partner: the leader. This partner will have to deal with this situation on a daily basis by providing and managing first-level help desk services. Arriving at a consensus will be a delicate matter, mainly for the following two reasons:
- For certain partners, the operations that characterize their mission did not justify this level of protection before electronic commerce was implemented. For others, the measures imposed by the needs of electronic commerce are clearly below corporate standards;
- The partners have no interest in disclosing the nature of the mechanisms that ensure their continuity and recovery.
The game plan for the leader partner organization breaks down into three steps:
1. Understanding the problem
It must be clear to all partners that a recovery situation for one is likely to cause continuity problems for the others and losses for all. A delay in a batch job may be bearable for one partner but harmful to the service levels of another. Each organization also has other mission-critical systems to run.
Once this is understood and accepted by all, representatives designated for this purpose by each partner will need to form a permanent committee whose goal will be to periodically reevaluate the overall measures for continuity and recovery.
This committee will have to:
- Establish a clear means of communication between its members and periodically plan follow-up meetings concerning the technological evolution of the concerned infrastructures;
- Follow the recovery tests of each partner;
- Ensure service during test periods;
- Plan integrated recovery tests;
- Be informed of, and act on, any technological change introduced by a partner, if this change implies modifications to the recovery and continuity procedures;
- Advise external providers of the prevailing situation, that is to say, the potential simultaneous use of different cold site providers;
- Determine tolerance levels and the steps to take concerning operation in degraded and partial modes.
Once these principles have been adopted, each partner must guarantee a minimum level of robustness for the networks involved in the system. Each partner must first examine its own (pre-commerce) infrastructure and upgrade it as required through:
- The definition of service levels to be attained;
- The implementation of mechanisms to ensure that they are met;
- The development of the required continuity and recovery procedures;
- The establishment of periodic review methodologies.
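As an illustration of the first two points, a service level can be written down as explicit targets and then checked against measured figures. The following is only a minimal sketch: the `ServiceLevel` fields, the `check` function, the segment name, and the numeric targets are all hypothetical and do not come from any particular partner's standards.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServiceLevel:
    """Targets agreed for one network segment (illustrative values only)."""
    segment: str
    min_availability_pct: float   # e.g. 99.5 tolerates 0.5% downtime
    max_recovery_minutes: int     # longest tolerated single outage


def availability_pct(period_minutes: int, downtime_minutes: int) -> float:
    """Percentage of the period during which the segment was available."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes


def check(level: ServiceLevel, period_minutes: int,
          downtime_minutes: int, longest_outage_minutes: int) -> List[str]:
    """Return the list of violations; an empty list means targets were met."""
    violations = []
    measured = availability_pct(period_minutes, downtime_minutes)
    if measured < level.min_availability_pct:
        violations.append(
            f"{level.segment}: availability {measured:.2f}% is below "
            f"target {level.min_availability_pct:.2f}%")
    if longest_outage_minutes > level.max_recovery_minutes:
        violations.append(
            f"{level.segment}: longest outage {longest_outage_minutes} min "
            f"exceeds target {level.max_recovery_minutes} min")
    return violations


if __name__ == "__main__":
    # Hypothetical target for a partner's LAN segment over a 30-day month.
    target = ServiceLevel("partner LAN segment", 99.5, 30)
    for line in check(target, 30 * 24 * 60, 300, 45):
        print(line)
```

Running the same check during each periodic review gives the committee a comparable figure for every partner without requiring anyone to disclose how the target is achieved.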
Once this upgrade is complete, each partner must repeat the exercise for the new hardware and software components required by the implementation of e-commerce. Particular attention must be paid to the firewall function, which is unique of its kind. Any organization managing such a function must determine, in the absence of redundancy, how to ensure service in the event of major unavailability. Selecting systems that allow consoles to be connected directly to the firewall is one solution. Are there periods during which we can tolerate having the system operate without a firewall? Are certain clients more at risk than others?
Furthermore, we must, at all times, be able to continue monitoring in the absence of the local area network.
Can we reuse certain other stations or local area network nodes used for other purposes, in order to ensure this function?
The same question must be asked for all operational tasks managed through the local area network. We must therefore consider security management (including firewall management), back-ups, performance tracking, etc.
3. Inventory of available resources
This integration exercise can be a complete failure if we get dragged into a process aimed at establishing standards, criteria and norms of all sorts. Although agreement is sometimes possible, certain arrangements will not be accepted unanimously. Rather than trying to win over reluctant members, for example by requiring that certain critical components be kept in stock on site, it is easier to draw up an inventory of the resources at the partners' disposal. These include human resources, equipment, rooms with appropriate power supplies, components, telecommunication links, etc.
This approach lets us inform the different partners of what the group has at its disposal in case of need, without revealing each partner's assets in terms of architecture and recovery and continuity mechanisms.
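Such an inventory can be kept very simple. The sketch below is one possible shape for it, under the assumption that each entry records only who can lend what and in what quantity, with no field describing architecture or recovery mechanisms. The `Resource` record, the `group_inventory` function, and the partner and item names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One entry a partner agrees to list; no architecture details appear."""
    partner: str       # who can make the resource available
    category: str      # e.g. "staff", "equipment", "room", "telecom link"
    description: str
    quantity: int


def group_inventory(resources):
    """Total quantity per (category, description), with contributing partners."""
    summary = {}
    for r in resources:
        entry = summary.setdefault(
            (r.category, r.description),
            {"quantity": 0, "partners": set()})
        entry["quantity"] += r.quantity
        entry["partners"].add(r.partner)
    return summary


if __name__ == "__main__":
    # Hypothetical entries from the three typical partners.
    listed = [
        Resource("leader", "equipment", "spare router", 2),
        Resource("telecom provider", "equipment", "spare router", 1),
        Resource("financial provider", "room", "room with backup power", 1),
    ]
    for (category, description), entry in sorted(group_inventory(listed).items()):
        holders = ", ".join(sorted(entry["partners"]))
        print(f"{category}: {description} x{entry['quantity']} ({holders})")
```

The committee sees what the group can draw on and whom to call, while how each partner uses these resources internally stays private.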
Despite all these precautions, electronic commerce systems are rolled into production under heavy pressure, and the integrity of the information itself is often not ensured, without the partners involved being aware of it. On the other hand, watchful partners will detect negligent behavior of the 'just keep things running' type and will find it easier to gain support for correcting course where deficiencies are found.
Marc Le Brun is a senior consultant at the Quebec City branch of CGI Group. He spent most of his time building security, continuity and recovery plans, especially for electronic commerce systems involving many Canadian government agencies.