It is fairly common knowledge that the business world is not immune to Murphy’s Law. However, because disasters do not occur often, even the best disaster recovery plan (DRP) begins to go stale as soon as it is written. Therefore, in addition to developing a test plan to ensure the effectiveness of your DRP, it is necessary to develop an audit procedure that reviews the plan periodically. The audit process ensures that the plan is adequate as well as current.
The effectiveness of a Disaster Recovery Plan is diminished by changes in the environment that the plan was created to protect. These changes can take many forms. The following are some major factors that tend to reduce the plan’s effectiveness:
New Equipment Acquisition--Advances in hardware technology are shortening the installed life of most hardware. Each change should lead to a re-evaluation of the risk analysis done for previous configurations. If you subscribe to a vendor Disaster Recovery site and its hardware or technology changes dramatically, the plan should be thoroughly tested and audited to verify that it still works in the changed environment.
Staff Changes--The skills of the Information Systems and Business Units staff are constantly changing. Even in a slack economy, new positions are created, others are eliminated, and the people who staff these positions move both within and outside the organization. The audit should ensure that any changes in key positions or names that appear in the plan are highlighted and brought to the attention of appropriate management.
Shifting Processing Priorities--As the workload of the Information Systems Facility shifts, the protection and recovery requirements in the plan might change significantly. For example, Application “X” has been processing on the mainframe computers for years and is included in the Disaster Recovery Plan. Because of technological advances and, particularly, finances, this application can now run on a PC; what provision is now made to recover this application in the overall plan? The reverse may also apply: PC-based applications may migrate up to the mainframe computer, and corresponding recovery provisions must be made. The audit and review of processing priorities should highlight these types of changes.
Increasing Application Complexity--As an application matures and work units or company processes become increasingly dependent on an automated process, some backup procedures (particularly manual ones) are no longer feasible.
Legislation Changes--There is an increasing demand brought on by legislation for information retention both internally and externally. The disaster recovery effort must address these issues.
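The Staff Changes check described above lends itself to a simple automated comparison. The following is a minimal sketch, assuming the plan's contact list and the current staff roster can each be exported as name and position pairs; the `Contact` type, the `find_stale_contacts` function, and the sample names are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """A person named in the plan or on the roster (hypothetical structure)."""
    name: str
    position: str


def find_stale_contacts(plan_contacts, current_staff):
    """Return plan contacts whose (name, position) pair is not on the current roster."""
    roster = {(c.name, c.position) for c in current_staff}
    return [c for c in plan_contacts if (c.name, c.position) not in roster]


# Illustrative data only; a real audit would load these from the plan and HR records.
plan = [
    Contact("A. Rivera", "Recovery Coordinator"),
    Contact("J. Chen", "Operations Manager"),
]
staff = [
    Contact("A. Rivera", "Recovery Coordinator"),
    Contact("J. Chen", "Database Administrator"),  # moved to a new position
]

stale = find_stale_contacts(plan, staff)
for contact in stale:
    print(f"Review plan entry: {contact.name} ({contact.position})")
```

A report like this only flags entries for review; deciding who should now fill a key position remains a management responsibility.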
The audit should be an independent and objective appraisal of the Disaster Recovery Plan for both the Data Center and Business Units as time, technology and logistics change. The audit process should be expected to accomplish the following:
- Identify and evaluate security controls (both physical and data)
- Provide management an opportunity to improve and update the plan
- Provide a stimulus to keep management from becoming complacent
- Uncover areas of vulnerability as they relate to management planning, controls and information security, etc.
FREQUENCY OF AUDITS
Factors to consider when determining frequency include the following:
- Rate of change in the Information Systems Facility (vendor and own site)
- Frequency of other audits
- Number of problems that occur while testing the plan
- Level of outside threat to the operation
Guidelines for Selecting the Audit Team
The audit team should NOT be responsible for Information Systems operations or Business entity functions. This is necessary to ensure objectivity on the part of the team and to comply with the concept of segregation of duties. The team members should have data processing knowledge and a knowledge of auditing principles; in addition, user and business expertise would be a valuable asset to the team as a whole. The audit team should not be responsible for enforcement of procedures; this is a responsibility of Information Systems management and the respective Business Units. The size of the team will depend on the size of the organization, but regardless of size, the team members should be knowledgeable in the following areas:
- Internal audit
- Data Security
- Data Processing
- Business Department and User Community
- Building Management and Engineering
Outside consultants or specialists may be engaged to provide some of the necessary skills where applicable.
THE AUDIT PLAN
- The plan should be action-oriented (i.e., executable).
- The Data Security policy for the organization should be reviewed and modified to include Business Resumption and Information Asset Protection. It is this policy which dictates the extent of detail in the plan itself.
- The risk analysis portion of the Disaster Recovery Plan should be identified, reviewed and evaluated. From the plan, those vulnerabilities that are significant for this particular installation will be uncovered.
- The documentation relating to the plan should be reviewed to determine if it represents a true picture of the environment and its procedures.
Questions To Be Considered In Formulating The Plan
- What are the critical applications, software and computer operating environments?
- What measures of the Disaster Recovery Plan are tested?
- How can the audit scope be structured to produce maximum results with the least amount of effort and facility disruption?
- Has the Disaster Recovery mechanism been invoked in a recent problem? If so, were the results adequate?
- Does management support the plan? What criteria would indicate this support (e.g., meetings, memos, budget, communication)?
- What do the employees feel are the main deficiencies of the plan? (An employee involved with the Disaster Recovery process is a great source of information.)
- Is the plan approved, published and presented to the respective Business Units?
- Are those responsible for the actions outlined in the plan aware of their responsibilities?
- Is the plan kept up to date?
Data and Program Backup
1. Determine where critical backup files and vital records are stored
2. Review procedures for identifying critical files and their retention periods
3. Review the current inventory of critical files
4. Determine that records are stored in fire-rated or fireproof containers
5. Test the ease and accuracy of the file backup system by performing a dry run. Determine if the department holds periodic tests.
6. Determine how backup files are created
7. Review backup and recovery procedures
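Parts of the inventory review in the steps above can be supported by a small script. The following is a minimal sketch, assuming each critical file has a recorded date of last backup, a maximum acceptable backup age, and a recorded off-site storage location; the `BackupRecord` fields, the `audit_inventory` function, and the sample entries are hypothetical and do not reflect any particular backup product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BackupRecord:
    """One line of the critical-file inventory (hypothetical structure)."""
    file_name: str
    last_backup: date
    max_age_days: int      # assumed policy: longest acceptable gap between backups
    offsite_location: str  # empty string means no off-site copy is recorded


def audit_inventory(records, today):
    """Return (file_name, finding) pairs for overdue backups and missing off-site copies."""
    findings = []
    for r in records:
        age = (today - r.last_backup).days
        if age > r.max_age_days:
            findings.append((r.file_name, f"backup is {age} days old (limit {r.max_age_days})"))
        if not r.offsite_location:
            findings.append((r.file_name, "no off-site storage location recorded"))
    return findings


# Illustrative data only.
inventory = [
    BackupRecord("general_ledger", date(2024, 1, 10), 7, "vault-A"),
    BackupRecord("payroll_master", date(2023, 12, 1), 30, ""),
]
findings = audit_inventory(inventory, today=date(2024, 1, 15))
for file_name, finding in findings:
    print(f"{file_name}: {finding}")
```

A script of this kind does not replace the dry run in step 5; it only narrows down which files the auditor should examine first.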
1. Review plans for computing alternatives. Determine location of the installation, contractual agreements in effect, periodic testing, and working relationships.
2. Evaluate implementation plan for the backup installation. This plan should be reviewed and tested periodically.
3. Determine that equipment and spare parts are available locally and can be acquired expeditiously.
4. Evaluate data security of the files and other sensitive material stored within the computer.
5. Evaluate provisions for physical security during disaster recovery operations (testing) at the backup facility.
The Written Disaster Recovery Plan
1. Evaluate written plan determining that all significant items are covered.
2. Determine who is responsible for each functional area covered by the plan.
3. Review and evaluate the detailed notification procedure for implementation.
4. Review criteria for determining the extent of disruption.
5. Determine the responsibility for retaining source documents and data files for each application.
6. Review the disaster recovery training program for Information Systems personnel as well as the different Business Units.
CONDUCTING THE AUDIT
Some tips for you to consider:
- A mix of surprise and scheduled audits is desirable
- The audit should take place at least once a year
- The first step in any audit, planned or surprise, should be to notify the respective management of the audit
- Employee interviews should be scheduled. Different management levels should be interviewed appropriately.
- Tests of the plan should be developed early in the audit
- The audit should be conducted in a friendly manner.
At the completion of the audit, a written report should be prepared immediately. It should include the following:
- An executive summary
- A description of the audit dates, locations, scope, objectives, etc.
- A detailed report of observations made
- Conclusions drawn from the observations
- Recommendations for corrective actions as appropriate
The degree of cooperation should be noted and favorable conclusions given the same prominence as unfavorable ones.
AUDIT FOLLOW UP
The first step of concluding the audit review is to have an “Audit Take-Up” meeting.
The purpose of this meeting is to clarify and ensure that the findings drafted are valid and acceptable to the Auditee, before formally issuing the audit report.
It is through this meeting that amendments are made to the audit report draft before issuance.
The last step is to issue the final report to the appropriate management parties and to expect a response from the Auditee within a predetermined timeframe.
Finally, the Audit Team emphasis should always be positive--one of helping management to improve security and control of their disaster recovery plan.
By following these practical ideas for auditing and testing the disaster recovery plan, you can make the audit objective, thorough, and valid in both depth and content.
Bruce H. Blank, CDRP, is the Data Security Administrator for an international bank. He has worked extensively with auditors and contingency planning consultants while developing disaster recovery plans for both the data center and business units of several major corporations.
This article was adapted from Vol. 4 No. 1, p. 26.