BCP/DR, Remote Backup and Recovery

Business Continuity and Disaster Recovery

  

Table of Contents:
Introduction
Business Impact and Risk Analysis
Business Continuity Planning (BCP)
BCP Life Cycle
BCP Life Cycle Definitions
Business Service Inter-Dependencies
Business Service Intra-Dependencies
Disaster Recovery Planning (D/R)
Short Term Outage
Long Term Outage
Recovery Time Objective (RTO)
Recovery Point Objective (RPO)
Appendix

Introduction

Business Continuity Planning (BCP) and Disaster Recovery Planning (D/R) are necessary tools that businesses are incorporating as standard practice. Corporate senior management is the cornerstone of an effective BCP plan, providing management sponsorship and organizational policies mandating compliance. Natural disasters and human errors have highlighted to businesses their vulnerability and exposure to financial losses should they not be prepared to react and recover.

Adding to the complexity of business management are the regulatory compliance acts mandating businesses to put in place BCP and D/R action plans to assure vital services can be provided during events causing disruption to mission critical business services.

Sarbanes-Oxley (SOX), Gramm-Leach-Bliley (GLBA) and the Health Insurance Portability and Accountability Act (HIPAA) are a few of the regulatory compliance acts that contain specific language and mandates governing the need for business continuity planning as a part of the legislation.

Businesses that are governed by regulatory legislation are required to comply by putting in place BCP and D/R action plans.  The businesses must be able to participate and demonstrate their ability to conduct business during street wide exercises (currently only within the financial services industry).  Other industries may require yearly reviews of their BCP and D/R plans by third party auditors.

Business Impact Analysis and Risk Analysis

Business Impact Analysis requires the input of the business's decision makers to determine what mission critical business services will need to be restored and the order of priority to the restoration.  The output of the Business Impact Analysis process is a concise understanding of the business's mission critical services and their value to the survival of the business.

The Business Impact Analysis process documents the financial impact of losing each service and the vulnerabilities or exposures to failure to which the service is susceptible. The data output from the Business Impact Analysis process is then provided to the areas of the business designated to manage risk. The risk managers are then tasked with providing solutions to mitigate the vulnerabilities and alternative solutions for the recovery of the mission critical business services.

Risk analysis takes into account all types of threats that can impact a business. Fires, floods, hurricanes, acts of terrorism, hardware / software failures, virus attacks, cyber crimes and internal exploits are just a few examples of the types of events to be analyzed, with a probability assessment value assigned to each.

The probability assessments are documented and vetted to outline alternative solutions that may be deployed to mitigate the risk to the business and the potential costs associated with each solution. Probability assessments can often be subjective and influenced by the cost constraints of implementing a recovery solution. It is the task of the Business Impact Analysis and risk management processes to weigh the business's survival benefits against the cost of a recovery solution based on the recovery time objective and recovery point objective of the business service.
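
One common way to make this weighing concrete is to multiply each threat's probability by its financial impact and compare the total with the cost of the recovery solution. The sketch below assumes that simple model; the Threat class, the function names and every figure are hypothetical illustrations, not values prescribed by any regulation or by this document.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    annual_probability: float  # assumed estimate between 0.0 and 1.0
    impact_cost: float         # assumed financial loss if the event occurs


def expected_annual_loss(threat: Threat) -> float:
    """Probability-weighted loss for one threat over a year."""
    return threat.annual_probability * threat.impact_cost


def solution_is_justified(threats: list[Threat], annual_solution_cost: float) -> bool:
    """True when the combined expected loss exceeds the cost of mitigation."""
    total = sum(expected_annual_loss(t) for t in threats)
    return total > annual_solution_cost


# Hypothetical figures, for illustration only.
threats = [
    Threat("server hardware failure", 0.20, 150_000),
    Threat("flood at primary site", 0.02, 2_000_000),
]
print(solution_is_justified(threats, annual_solution_cost=50_000))
```

A real assessment would also reflect the recovery time and recovery point objectives of each service, which a single cost comparison does not capture.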

Business Continuity Planning and D/R

Business Continuity Planning (BCP) and Disaster Recovery (D/R) are related but distinct. BCP focuses on Business Impact and Risk Analysis and the identification of mission critical business services that must be recovered in the event of a failure. BCP is a direct input to the business's D/R action plans.

Disaster Recovery (D/R) looks at the business as a whole and incorporates into the recovery plans all aspects (applications, technologies, 3rd party services, maintenance contracts and hardware providers, to name a few) of all the mission critical business services that are identified via the Business Continuity Planning and Business Impact and Risk Analysis processes.

The focus of this discussion will be on the process of BCP and where appropriate, we will highlight the areas where Disaster Recovery planning differs.

To be an effective tool, Business Continuity Planning requires support from all management levels of the business. BCP policies need to be incorporated as a part of the business's standard operating procedures and measured as critical core competencies across all functional areas of the business. Anything less than full commitment to BCP will leave gaps in the recovery services and could severely impact the business's ability to survive a failure when recovery is needed most.

BCP is defined by the BCP life cycle process and focuses on the identification of mission critical business services necessary to sustain business operations during short term or long term outages. A primary objective of the BCP life cycle is to identify all of the business services deemed mission critical and necessary for recovery in the event of an outage or disaster event. The identification process includes all of the Inter and Intra dependencies of each service to ensure the readiness of the service upon recovery. It also ensures that all aspects of the mission critical business services are identified, including all of the technology components associated with each service.

The output from the BCP Life Cycle process is used as the impetus for the implementation of the solution required to support the recovery of the critical business service. The by-product of the planning and analysis becomes the input to the Business Impact and Risk Analysis portion of the Business Continuity Plan for all the business services identified as mission critical.

As your business evolves, so too will your BCP activities. Evolutionary Change Management will become an integral part of your core business competencies, as change is inevitable. As your business grows and new products and services are introduced, the BCP Life Cycle will be revisited to include those services in the BCP and D/R strategies.

BCP activities do not become dormant once they are defined.  Instead, BCP activities take on a life of their own.  Once a BCP solution is put in place and Testing and Validation of the solution has been completed, Routine Maintenance activities will become necessary. Routine maintenance activities will assist in keeping all the solutions put in place at the proper operating levels of manufacturer recommendations, software defect remediation, operational changes and proprietary application enhancements.

The process of Routine Maintenance will require Testing and Validation to occur every time there is a change to any of the platforms under the BCP and D/R umbrella. It is highly recommended that full BCP and D/R testing be conducted at least twice a year. Testing and validation should also be conducted whenever major configuration changes are made to the technology infrastructure or upon major releases of business proprietary applications.
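
The testing cadence described above can be expressed as a simple check. This is a minimal sketch: the 182-day interval is an assumed reading of "twice a year", and the function and parameter names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "twice a year" is approximated as one full test every 182 days.
MAX_DAYS_BETWEEN_FULL_TESTS = 182


def full_test_due(last_full_test: date, today: date, major_change_since_test: bool) -> bool:
    """True if a full BCP and D/R test should be scheduled now."""
    overdue = today - last_full_test > timedelta(days=MAX_DAYS_BETWEEN_FULL_TESTS)
    # A major configuration change or application release triggers a test
    # regardless of how recently the last one ran.
    return major_change_since_test or overdue


print(full_test_due(date(2024, 1, 1), date(2024, 3, 1), major_change_since_test=False))
print(full_test_due(date(2024, 1, 1), date(2024, 3, 1), major_change_since_test=True))
```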

See the BCP Life Cycle Definitions for a complete description of all the steps within the BCP planning process.

Appendix:

Business Continuity Planning (BCP):

Business Continuity Planning (BCP) is the business management life cycle process used to identify, define, implement, test, maintain and re-validate critical business processes and the technology components necessary to fully recover a failed mission critical business service.

See the BCP Life Cycle process flow for the steps involved in BCP planning process.  Also look at the BCP Life Cycle Definitions for the details on what each step of the BCP Life Cycle process entails.

BCP Life Cycle:

The BCP life cycle process takes on many dimensions, identifying all of the operational attributes of a mission critical business service. The BCP life cycle is a continuous process that evolves with your business as it changes and as new products and services are introduced. Failure to include any new mission critical business service in the BCP life cycle process can result in a failure to recover that service.

BCP Life Cycle

BCP Life Cycle Definitions:
Mission Critical Business Service Identification
Document Service Inter-Dependencies
Document Service Intra-Dependencies
Recovery Specifications
Solution Implementation
Testing and Validation
Routine Maintenance
Evolutionary Change Management

  • Mission Critical Business Service Identification:

    A business function that is identified as mission critical to the survival of the business. The criticality of the service is weighed against its Recovery Time Objective and Recovery Point Objective attributes, as derived from the Business Impact Analysis and Risk Assessment process, and determines the priority and the order in which the business service will be recovered. Identification of mission critical business services is part of the BCP Life Cycle process.

    Some examples: payroll, managing corporate books and records, recovery and protection of client confidential information and patient health information, intellectual property (application source code, business legal documents, copyrights), customer contact lists, and proposals for new business opportunities.
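
A recovery order can be derived from the objectives assigned to each service. The sketch below assumes the simplest possible rule, shortest RTO first with RPO as a tie-breaker; the BusinessService class and the sample values are hypothetical, and a real plan would also weigh dependencies between services.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class BusinessService:
    name: str
    rto: timedelta  # Recovery Time Objective
    rpo: timedelta  # Recovery Point Objective


def recovery_order(services):
    """Sort services so the tightest objectives are recovered first."""
    return sorted(services, key=lambda s: (s.rto, s.rpo))


# Hypothetical objectives, for illustration only.
services = [
    BusinessService("payroll", timedelta(hours=24), timedelta(hours=24)),
    BusinessService("financial transaction system", timedelta(minutes=15), timedelta(0)),
    BusinessService("customer contact lists", timedelta(hours=4), timedelta(hours=1)),
]
for service in recovery_order(services):
    print(service.name)
```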

  • Business Service Inter-Dependencies:

    Every business service will have dependencies that have attributes comprised of either Inter services, Intra services, or both.

    Case Study: Your business depends on your sales force having access to the Customer Relationship Management (CRM) database located on your production server, which runs a popular operating system (e.g., UNIX, Linux or Windows). The CRM application is an in-house proprietary application utilizing a well known database technology. See the expanded case study overview for details on how Intra dependencies are identified.

    The Inter dependencies for the CRM application can be identified and documented as in the following examples:

    • The definition of the operating system version and current patch level used to assure application recovery compatibility.
    • The definition of the infrastructure dependencies: server name, DNS or WINS entries and firewall rules.
    • The definition of the database technology version and current patch level used to assure application recovery compatibility.
    • The definition of the location where the most recent backup of the server and application is stored and when it last completed successfully.
    • The definition of the proprietary application version and current patch level used to assure recovery compatibility.
    • The definition of the application type (client / server, peer-to-peer, etc.)
    • The application directory hierarchy location where it is presently located on the server.
    • The shared drive mappings on how the application is either distributed or accessed.
    • The authentication method used to gain access to the application services.
    • Listing of the valid users authorized to use the application and their access level entitlements.
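
The items listed above could be captured in a structured record so that gaps in the documentation are easy to spot. This is only a sketch: the ServiceDependencyRecord class, its field names and the sample values are hypothetical and simply mirror the list.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ServiceDependencyRecord:
    service_name: str
    os_version: str = ""
    os_patch_level: str = ""
    server_name: str = ""
    dns_entries: list = field(default_factory=list)
    firewall_rules: list = field(default_factory=list)
    database_version: str = ""
    last_backup_location: str = ""
    last_backup_completed: str = ""
    application_version: str = ""
    application_type: str = ""
    install_directory: str = ""
    shared_drive_mappings: list = field(default_factory=list)
    authentication_method: str = ""
    authorized_users: dict = field(default_factory=dict)  # user -> access level


def undocumented_items(record):
    """Return the names of fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Hypothetical, partially completed record for the CRM case study.
crm = ServiceDependencyRecord(
    service_name="CRM",
    application_type="client / server",
    authorized_users={"sales_user": "read-write"},
)
print(undocumented_items(crm))
```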

  • Business Service Intra-Dependencies:

    Every business service will have dependencies that have attributes comprised of either Inter services, Intra services, or both.

    Expanding on the case study: As outlined in the Inter dependency section of the BCP Life Cycle process, your business depends on your sales force having access to the CRM application and database. Over the years, other areas of your business have been using the database for other applications that report on sales and marketing trends. Via the BCP Life Cycle analysis it was determined that the application services external to the CRM application are also vital to the day-to-day services provided. All reporting services are classified as critical to the business and included as part of the BCP recovery strategy.

    The Intra dependencies for the CRM platform can include additional services, identified and documented as in the following examples:

    • A listing of all of the external applications utilizing the CRM database, to assure application recovery compatibility.
    • The definition of the infrastructure dependencies: server name, DNS or WINS entries and firewall rules.
    • The definition of the database technology version and current patch level used to assure application recovery compatibility.
    • The definition of the location where the most recent backup of the server and application is stored and when it last completed successfully.
    • The definition of the proprietary application version and current patch level used to assure recovery compatibility.
    • The definition of the application type (client / server, peer-to-peer, etc.)
    • The application directory hierarchy location where it is presently located on the server.
    • The shared drive mappings on how the application is either distributed or accessed.
    • The authentication method used to gain access to the application services.
    • Listing of the valid users authorized to use the application and their access level entitlements.

  • Recovery Specifications:

    Recovery specifications are the direct by-product of the Business Impact Analysis and Risk Management processes of the BCP Life Cycle. Recovery specifications are used to vet the processes and technology solutions (and associated costs) necessary to ensure the recovery of the mission critical business service within its defined Recovery Time Objective and Recovery Point Objective.

  • Solution Implementation:

    The solution implementation phase of the BCP Life Cycle encompasses all of the associated activities necessary for the installation, configuration and readiness of the recovery solution defined from the Recovery Specifications.

  • Testing and Validation:

    The tasks associated with the testing and validation of the BCP and D/R services.   Test results should be measured to ensure BCP and D/R services are in a ready state and fully functional.

    Testing and validation of the BCP and D/R services must be incorporated into the business's change management processes as a mandatory compliance requirement of the BCP Life Cycle.

  • Routine Maintenance:

    The tasks associated with the deployment and implementation of operating level updates from manufacturer recommendations, software defect remediation, operational changes or proprietary application enhancements.  Routine maintenance activities are necessary in keeping the BCP and D/R solutions current and operationally ready.

    Routine maintenance must be incorporated into the business's change management processes as a mandatory compliance requirement of the BCP Life Cycle, and all update activities should undergo Testing and Validation.

  • Evolutionary Change Management:

    Evolutionary Change Management is the continuous cycle of review and update of all existing and new services introduced for inclusion within the business's BCP Life Cycle.

    Every major proprietary application enhancement, technology infrastructure change or new service offering should have its Business Impact Analysis and Risk Management reviewed for applicability, along with its Recovery Time Objective and Recovery Point Objective.

Disaster Recovery Planning (D/R):

Every business may at some point in time suffer some form of major disruption to their mission critical business services: fires, floods, hurricanes, acts of terrorism, hardware / software failures, virus attacks, cyber crimes and internal exploits are just some of the areas drawing concern.   This is a short list of potential disastrous events and it is by no means complete.

D/R planning requires the same levels of management sponsorship and organizational policies as BCP to be an effective tool. D/R planning requires comprehensive planning and analysis defining each business service provided and the Inter and Intra dependencies needed to recover the business service components. But unlike BCP, which focuses on individual business processes and services, D/R focuses on the business as a whole. D/R plans determine what services will be recovered, when, and in what order to bring the business back on line. D/R plans not only account for day one recovery requirements but also take into account the recovery requirements for extended outages on day two and beyond. D/R planning encompasses all aspects of the business and all of the business process dependencies.

D/R plans can include alternative work locations, air travel restrictions, emergency exit routes, information hotlines, work force mobilization procedures, alternative meeting locations, routine D/R exercises, maintenance, validation and certification of D/R readiness and D/R employee handbooks. This is not a complete listing of what needs to be included in a D/R plan, but it should get you thinking more comprehensively about what needs to be in place to get your business back on line.

Short-Term Outages:

An interim disruption of service to a defined business process requiring an action plan to be executed based on the nature of disruption.  The tolerance for the disruption duration is determined by the criticality of the business process based on the defined Recovery Time Objective and Recovery Point Objective for the business service.

Short term outage action plans often outline which processes and services may need to be brought on line if the disruption extends beyond the defined short term threshold, at which point the event is reclassified as a Long Term Outage. Both short term and long term recovery methods are derived from the Business Impact and Risk Analysis process.

Long Term Outages:

Long term outages are failure events extending beyond the defined short term outage threshold as weighed against their Recovery Time Objective and Recovery Point Objective for the business service.

A long term outage can be declared a Disaster Recovery event if the affected service was determined to be mission critical by the Business Impact and Risk Analysis process. In either case, long term outage contingency planning is an output of the BCP Life Cycle process.
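
The reclassification rule from the two outage definitions can be sketched as follows. The threshold value, the function names and the string labels are hypothetical; each business defines its own short term threshold per service.

```python
from datetime import timedelta


def classify_outage(elapsed: timedelta, short_term_threshold: timedelta) -> str:
    """An outage becomes long term once it extends beyond the threshold."""
    return "short-term" if elapsed <= short_term_threshold else "long-term"


def is_dr_event(elapsed: timedelta, short_term_threshold: timedelta,
                mission_critical: bool) -> bool:
    """A long term outage of a mission critical service may be treated as a D/R event."""
    long_term = classify_outage(elapsed, short_term_threshold) == "long-term"
    return long_term and mission_critical


threshold = timedelta(hours=4)  # assumed threshold, for illustration only
print(classify_outage(timedelta(hours=2), threshold))
print(classify_outage(timedelta(hours=6), threshold))
print(is_dr_event(timedelta(hours=6), threshold, mission_critical=True))
```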

Street Wide Exercises:

Since 1999 the financial services industry has been conducting industry-wide tests with exchanges, trade clearing houses and other industry service providers (e.g., market data, telecommunications and alternate D/R site organizations). These tests have been coordinated and sponsored by the Securities Industry Association.

The financial industry street wide test is an annual exercise that tests the securities industry's ability to conduct business as usual under a simulated disaster recovery scenario.

Recovery Time Objective (RTO):

RTO is the time period from a failure to when the systems or services must be restored and ready for use.  Not all mission critical business services may have the same RTO value.  RTO values are specific to each service as the time tolerable for recovery based on Business Impact and Risk Analysis.

For example, a business requirement to bring back on line a mission critical business service within 15 minutes of a failure event may be necessary for a financial transaction system but may not apply to the Human Resources training and education services.
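
Checking a recovery against its RTO amounts to comparing two timestamps. A minimal sketch, using the 15 minute figure from the example above; the function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def recovery_deadline(failure_time: datetime, rto: timedelta) -> datetime:
    """Latest moment at which the service must be restored and ready for use."""
    return failure_time + rto


def rto_met(failure_time: datetime, restored_time: datetime, rto: timedelta) -> bool:
    return restored_time <= recovery_deadline(failure_time, rto)


failure = datetime(2024, 3, 5, 10, 0)
rto = timedelta(minutes=15)
print(rto_met(failure, datetime(2024, 3, 5, 10, 12), rto))  # restored after 12 minutes
print(rto_met(failure, datetime(2024, 3, 5, 10, 40), rto))  # restored after 40 minutes
```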

Recovery Point Objective (RPO):

RPO determines the amount of data loss that is tolerable in the event of an outage or disaster. It is generally anticipated that there will be some data loss in the event of a service disruption or disaster. The RPO is used to plan the point in time (relative to the disaster event) to which you want your data recovered.

For example, an RPO of your prior day's backup may be considered tolerable for your business if the disaster event occurred early in the business day. However, if the disaster event occurred late in the business day (after business close), would that RPO still be acceptable? A very low RPO adds complexity and increases the cost of the data recovery requirement. All comparative analyses of RPO costs and objectives are inputs to the Business Impact and Risk Analysis process.
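
The early-versus-late question above can be made concrete by measuring the window of data at risk, that is, the time between the last good backup and the disaster event. This sketch assumes a nightly backup and a 12 hour tolerance purely for illustration; neither value comes from a standard.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, disaster: datetime) -> timedelta:
    """Span of work that cannot be restored from the last good backup."""
    return disaster - last_backup


def rpo_met(last_backup: datetime, disaster: datetime, rpo: timedelta) -> bool:
    return data_loss_window(last_backup, disaster) <= rpo


# Hypothetical schedule: a nightly backup that completed at 23:00.
last_backup = datetime(2024, 3, 4, 23, 0)
rpo = timedelta(hours=12)  # assumed tolerance

early = datetime(2024, 3, 5, 9, 0)   # 10 hours of data at risk
late = datetime(2024, 3, 5, 18, 0)   # 19 hours of data at risk
print(rpo_met(last_backup, early, rpo))
print(rpo_met(last_backup, late, rpo))
```

Under these assumed values the same nightly backup satisfies the objective for the morning event but not for the evening one, which is why a tighter RPO usually calls for more frequent backups or replication.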

BCP/DR, Remote Backup and Recovery. Copyright © 2000 - 2017 FR Technologies, LLC. All rights reserved.