PT Daya Cipta Mandiri Solusi
Sunday, October 19, 2014

Turning Business Continuity Into A Competitive Advantage

Written by DENNIS WENK, January 13, 2014
The pace of technological change and its complexity are challenging traditional business continuity paradigms. What was once considered a best practice in business continuity (BC) no longer serves the new digital world, and organizations can’t rely on these outdated processes to reach their future objectives. These best practices, and the standards and guidelines on which they are based, are unsuitable for the modern technologically dependent organization because they were intended to serve a different purpose within a vastly different business environment. Some might question or be puzzled by the notion that long-standing and widely accepted best practices could be unreliable, but that really shouldn’t be so disturbing. After all, bloodletting was once a medical best practice.
It is time to modernize business continuity and align it with the genuine needs of today’s technologically dependent organization. Today almost everything an organization does relies in some form or other on information technology. Businesses use IT to link to customers, suppliers and partners; to increase their operating efficiencies; to connect global supply chains; and more. With advancements in IT, we now do more transactions online, of greater value, and faster than ever before. It could be said that the modern organization is entirely dependent on IT. In a world filled with thousands of servers that process petabytes of data and move it across hundreds of miles of networks in fractions of a second, unforeseen one-in-a-million glitches can happen in the blink of an eye. The complexity of today’s IT is extremely different from the uniform and homogeneous IT environment that was in place when many business continuity best practices were designed. Relying on these old best practices for your business continuity strategy creates blind spots that may lead to significant oversights, which profoundly affect the reliability of the overall strategy.
The prevailing best practices in business continuity have favored a “better safe than sorry” approach to dealing with risk. In ordinary life “better safe than sorry” seems quite sensible. It does seem better to be safe. However, this paradigm does not work when the cost of the safety is greater than the cost of the risk.
Safety is not an all-or-nothing condition. Risk is a matter of degree, and mitigation actions involve a variety of trade-offs. There are times when the perception of safety creates blind spots that lead us astray and cause us to overspend or waste valuable resources; feeling safe is not the same as being safe. As Robert Hahn (in his book “Making Sense of Risk”) points out to Congress, “This leads to a paradox that is becoming increasingly recognized. Large amounts of resources are often devoted to slight or speculative dangers while substantial and well-documented dangers remain unaddressed.”
We can’t bet our organization’s valuable, scarce resources on intuition and rules of thumb. The harm is that when resources are disproportionately allocated to efforts based on precautionary heuristics, those resources are no longer available for less obvious but potentially more harmful risks.
Managing continuity in today’s complex IT-dependent organization requires replacing the ‘better safe than sorry’ heuristics with optimal risk-reduction actions. Managing risk depends on measuring the size of the investment and the speculativeness of the harm. The potential negative consequences of catastrophic events such as floods, fires, hurricanes, tornados, earthquakes, terrorism, pandemics, or a meteor strike are quite significant. The question is not whether these events are hazardous or whether they should be of interest to an organization. It is obvious that the loss of life and resources from catastrophic events can cripple a business, and the danger of being unprepared for such an event is equally obvious, but capitalism is not about doomsday prepping. Capitalism is about calculated risk-taking: no risk-taking, no innovation, no competitive advantage, and no shareholder value. Congressman Michael G. Oxley points out in a House financial report that, “Capitalism is about taking risk, and that is what makes our system so productive.”
The big question, the question that the precautionary principle does not and cannot answer for business continuity, is “when to stop spending resources on safety?”
Many business continuity best practices conceal the precautionary-bias by using legitimate sounding terms such as risk appetite, risk tolerance, and risk aversion, but these terms are never developed beyond heuristics and subjective judgment. These terms are just ordinary perceptions about risk and they are neither measurements of risk nor can they be used to calculate risk. They simply tell us how we feel about risk.
Other business continuity best practices mask their subjectivity and bias through the use of elaborate high-medium-low (HML) matrix-models. These tools don’t calculate risk – they merely rank perceptions of risk, providing no credible information or statistical grounding needed to make a rational decision on how to optimally reduce risk. These models describe how we feel about risk, which does not help answer “what to do” or “how much to spend?”
The precautionary-bias is peppered throughout the many business continuity standards, guidelines, and best practices, as well as their certifications. Today a balanced approach to business continuity is more important than ever, and precautionary guidelines that consistently ignore minor cracks in continuity will not serve that purpose. Our organizations would be better served if business continuity first looked for ways to proactively fill those continuity cracks rather than solving for the next apocalypse. After all, “a stitch in time saves nine.”
The real problem with traditional approaches is not that they are wrong, but that they offer no guidance to modern organizations on how to optimally reduce risk; how to fill the cracks. The unintended consequence of these outdated business continuity methods has been that the operational aspects of IT have been systematically neglected, and this might be the biggest blunder in business continuity today.
With all these best practices, these HML-matrix-models and this talk of risk aversion, there seems to be a growing and significant disconnect with what is actually happening in our new digital world. Business continuity routinely dismisses IT-risks in favor of the prevailing “risk-of-the-month” because the best practices have a close affinity to the precautionary-bias. While few would dispute that IT is becoming increasingly important to every organization, a business continuity certification consultant recently stated at an industry event, “the ultimate goal of BC activities was to get out of the data center.”
That is an antiquated notion that undoubtedly implies the IT-infrastructure is unworthy of serious attention from business-oriented BCM practitioners. Nothing could be further from the truth.
The precautionary-bias coupled with people’s fear will trigger perceptions about worst-case scenarios that make them appear increasingly plausible. In 2008/2009, the United States suffered a major financial meltdown, one with an impact that many economists have estimated at $1.8 trillion.
While we intuitively understand the consequences of a loss at that scale, most of us fail to recognize the extent of a silent IT disaster unfolding under our virtual noses. According to IT complexity expert and ObjectWatch founder Roger Sessions, organizations in the United States lose $1.2 trillion from IT failures every year. Worldwide, the total comes to $6.2 trillion. Sessions’ numbers have been challenged by other economists, but their calculations remain sobering: they conclude that the worldwide total is “only” $3.8 trillion.
The most notable aspect of Sessions’ math is this: the overwhelming majority of the annual $1.2 trillion loss is not caused by the low-probability/high-consequence catastrophes that capture attention, but by high-probability/low-consequence events that occur frequently, such as software bugs, hardware failures and security breaches. Worse, as applications become more complex, involving an ever-larger tangle of code, data nodes, and system networks, these “smaller” events become more frequent and their impact more costly.
The sheer size of these losses due to IT failure should serve as a wake-up call for anyone involved in business continuity. How could the very practices that were intended to provide continuity for our organizations allow interruptions that generate losses of this magnitude? Either business continuity’s target or its aim has been considerably off. While business continuity has been waiting and preparing for a catastrophic event, it has ignored the real risk to continuity: IT. Business continuity best practices absolutely must start to do things differently. We need to start thinking rationally about where to devote our efforts and where to place our emphasis. Genuine business continuity best practices must make certain that real and serious risks receive the attention they deserve.
The big question, as we discussed earlier, covers optimization of scarce resources in the present to achieve the greatest benefit for our organization in the future. After all, it is not about turning the lights back on once they fail; continuity is about ensuring the lights never go off in the first place.
For business continuity the big question has two components: (1) which risks are the serious ones and (2) what are the optimal risk-reduction actions? Traditional methods currently used in business continuity offer little guidance for answering the big question. In fact, the current set of heuristics can be dysfunctional because it unknowingly diverts resources to slight or speculative dangers.
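One way to approach the first component is to estimate, for each risk, how often it occurs per year and what a single occurrence costs, then rank risks by the product of the two. The sketch below is a minimal illustration of that idea, not a method prescribed by any standard. The `Risk` class, the function name, and every figure in `RISKS` are hypothetical and chosen only to show how a frequent, low-consequence event can outrank a rare catastrophe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    events_per_year: float  # estimated annual frequency (hypothetical)
    loss_per_event: float   # estimated cost of one occurrence in dollars (hypothetical)

    @property
    def expected_annual_loss(self) -> float:
        # Frequency times consequence gives the average yearly loss.
        return self.events_per_year * self.loss_per_event


def rank_by_expected_loss(risks):
    """Return the risks sorted from largest to smallest expected annual loss."""
    return sorted(risks, key=lambda r: r.expected_annual_loss, reverse=True)


# Illustrative figures only; not taken from any real organization.
RISKS = [
    Risk("Regional flood closes data center", 0.01, 20_000_000),
    Risk("Software defect in order system", 12, 50_000),
    Risk("Storage hardware failure", 4, 75_000),
]

if __name__ == "__main__":
    for risk in rank_by_expected_loss(RISKS):
        print(f"{risk.name}: ${risk.expected_annual_loss:,.0f} per year")
```

With these made-up inputs the monthly software defect tops the list and the flood comes last, even though the flood has by far the largest single-event cost. Real estimates would come from incident records and loss data rather than guesses.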
Many in the business continuity community share a mistaken belief that it is impossible to develop credible quantitative risk estimates. That belief is illusory: real-world experience shows there is a wealth of data on which to base quantitative risk estimates with a degree of precision and confidence sufficient to support sound management decisions. We don’t have to be perfect; in fact we can’t be perfect, because perfection is infinitely expensive. We do need to increase the probability of success by reducing our losses. We need to apply the appropriate level of discipline relative to the complexity of the situation; IT is too complex to use heuristics, rules of thumb, and intuitive judgment.
While precise information is extremely valuable, even small amounts of reliable data are infinitely better than relying on subjective value judgments when it comes to making sound decisions about managing IT-infrastructure risk. Risk-related data is increasingly available. There is a surprising amount of information from which to make realistic calculations and estimates about IT infrastructure and operational risks. As Immanuel Kant said, “We have a duty – especially where the stakes are large – to inform ourselves adequately about the facts of the situation.” All in all, it is far better to use empirical data than rely on intuitive, subjective judgments.
Business continuity must make informed estimates about future losses and then take appropriate action based on those estimates. The underlying economic models must be constructed to accurately portray all of the pertinent risk parameters, as opposed to measuring risk-perceptions. Cost-benefit balancing can be applied to ensure a proper proportional-response. To keep the odds in our favor we must economically quantify the operational risks of the IT-infrastructure so we can properly evaluate the many tradeoffs and reach the optimal risk-reduction solution for our organizations.
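Cost-benefit balancing of this kind can be sketched in a few lines: a mitigation is worth funding only while the expected loss it removes exceeds what it costs, which also gives one possible answer to when to stop spending. The example below is a simplified illustration under stated assumptions. The `Mitigation` class, the `worthwhile` function, and all dollar amounts are hypothetical, and a real model would also account for uncertainty in the estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mitigation:
    name: str
    annual_cost: float  # yearly cost of the mitigation (hypothetical)
    loss_before: float  # expected annual loss without it (hypothetical)
    loss_after: float   # expected annual loss with it (hypothetical)

    @property
    def net_benefit(self) -> float:
        # Risk reduction minus the price paid for it.
        return (self.loss_before - self.loss_after) - self.annual_cost


def worthwhile(mitigations):
    """Keep mitigations whose risk reduction exceeds their cost, best first."""
    chosen = [m for m in mitigations if m.net_benefit > 0]
    return sorted(chosen, key=lambda m: m.net_benefit, reverse=True)


# Illustrative figures only.
OPTIONS = [
    Mitigation("Second hot site for flood recovery", 1_500_000, 200_000, 20_000),
    Mitigation("Automated regression testing", 100_000, 600_000, 250_000),
    Mitigation("Redundant storage controllers", 80_000, 300_000, 120_000),
]

if __name__ == "__main__":
    for m in worthwhile(OPTIONS):
        print(f"{m.name}: net benefit ${m.net_benefit:,.0f} per year")
```

In this made-up case the hot site costs far more than the expected loss it removes, so it drops out, while the two operational fixes pay for themselves.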
With $3 to $6 trillion a year at stake, understanding how to prevent the continuing spiral of IT failures will have substantial benefits. In these difficult economic times, there is a tremendous amount of good that $3 to $6 trillion could add to our global economy. Making rational decisions about calculated risks that reduce the economic impact of IT failures will be key to achieving a competitive advantage.
Dennis Wenk is a senior manager in competitive strategy and market insights for Symantec covering cloud/virtualization/big data. He has consulted with large Fortune 500 companies in more than 20 countries.

Log Management: When Disaster Strikes You’ll Be Glad You Did

Written by MISHA GOVSHTEYN, January 13, 2014
How Log Management Can Enhance Infrastructure Security in Disaster Recovery Situations
When an enterprise network goes down – because of natural disaster, accidental system failure, or security breach – the first priority is identifying the impact and restoring the infrastructure to its pre-disaster state. However, how “clean” is the pre-disaster infrastructure? Have servers been tainted with malware that can replicate to multiple data systems on the disaster recovery site and allow security breaches? Were data confidentiality, integrity, and availability compromised in any way, either before or during the incident? And would you have the data required to troubleshoot these issues during an outage?
The bad news is that such security breaches are often not found for weeks or months after they occur. However, log management best practices can provide those answers and help you determine if you’re working with a compromised infrastructure following disaster recovery (DR). Moreover, compliance with PCI, Sarbanes-Oxley, HIPAA, and GLBA regulations demands that log data be collected, regularly reviewed, and securely archived. The need for effective log management may seem obvious, but surprisingly the value of log forensics for supporting infrastructure security is often overlooked. This article looks at best practices for using log data analysis to enhance the overall security and availability of data after disaster recovery, and at the advantages of automated log management and analysis delivered as a service.
The Value of Log Management
Log management is an infrastructure management best practice that supports not only performance management but also security incident response and compliance requirements. It is a complex process of collecting all sources of data in an enterprise environment and making it usable in a normalized, searchable format. Reviewing and analyzing log data regularly is a best practice for meeting compliance regulations, identifying suspicious activity, and generating forensic data for internal investigations. Properly collected, stored, and analyzed log data can provide a holistic view of your data flows, and most importantly, alert you to anomalies that could indicate security breaches. Log analysis can reveal unauthorized system access, failed logins, file access records, malware activity, botnet activity, and other either failed or successful attempts to hack or steal data.
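To make the idea of a normalized, searchable format concrete, the sketch below maps two differently shaped log lines onto one common record. It is only an illustration: the `LogEvent` fields, the function names, and the key=value application format are assumptions made for this example, and the first pattern covers just the familiar OpenSSH password messages. Syslog timestamps carry no year or time zone, so the caller supplies the year and UTC is assumed.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class LogEvent:
    timestamp: datetime
    source: str   # host or application that produced the entry
    user: str
    action: str
    outcome: str  # "success" or "failure"
    src_ip: str


# OpenSSH-style syslog line, e.g. "Jan 13 08:15:02 web01 sshd[2211]: Failed password ..."
SSHD_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) sshd\[\d+\]: "
    r"(?P<result>Failed|Accepted) password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+)"
)


def parse_sshd(line: str, year: int) -> Optional[LogEvent]:
    m = SSHD_RE.match(line)
    if m is None:
        return None
    # Assumption: the caller knows the year and the host logs in UTC.
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    outcome = "success" if m["result"] == "Accepted" else "failure"
    return LogEvent(ts.replace(tzinfo=timezone.utc), m["host"], m["user"],
                    "login", outcome, m["ip"])


def parse_keyvalue(line: str) -> Optional[LogEvent]:
    # Hypothetical application format: "<ISO timestamp>Z key=value key=value ..."
    ts_text, _, rest = line.partition(" ")
    fields = dict(pair.split("=", 1) for pair in rest.split() if "=" in pair)
    if not {"app", "event", "user", "result", "src"} <= fields.keys():
        return None
    try:
        ts = datetime.strptime(ts_text, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None
    return LogEvent(ts.replace(tzinfo=timezone.utc), fields["app"], fields["user"],
                    fields["event"], fields["result"], fields["src"])


def normalize(line: str, year: int) -> Optional[LogEvent]:
    """Try each known format in turn; return None for unrecognized lines."""
    return parse_sshd(line, year) or parse_keyvalue(line)
```

Once every source is reduced to the same record, a single query such as "all failures for user admin" works across servers and applications alike. A production system would support many more formats and keep unparsed lines rather than dropping them.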
However, effective log management remains a challenge for many companies as the size and types of data sources continue to multiply exponentially across the enterprise. When performed manually, log management requires investments in additional IT staff and product acquisitions, both of which are cost prohibitive to many organizations and ultimately often fail to deliver the consistent, in-depth analysis required for DR infrastructure security.
Automated log management in the cloud as a SaaS offering is overcoming these challenges by simplifying the implementation, management, and analysis of log data, especially in the critical areas of analysis and reporting. This approach provides the consistent, automated collection, normalization, and storage that is critical to the confidentiality, integrity, and availability of enterprise data. Moreover, automated log management gives companies the agility and flexibility to collect and manage data from today’s virtual servers, elastic cloud environments, and hybrid environments, and integrate this data with traditional on-premise sources. The most effective SaaS solutions will also enhance automated functionality with the services of live security experts who can translate complex log data into actionable insight for protecting the infrastructure against security threats.
Recommended log management best practices:
  • Include log management in the incident response plan – Log management is most effective as an infrastructure security measure if it is included as a component of the incident response plan and not treated as an afterthought during the chaotic hours after the incident. Specifically, consistent collection and analysis of log information from all data sources is the core process.
  • Store log data securely off-site to ensure availability – Log information can be an attractive target for malicious hackers. Maintain log data securely offsite just as you would your core data to ensure its availability and integrity during a disaster incident.
  • Alert on key activities to get warnings of unusual activity – Beyond its use for after-the-fact forensics, log management can also be a key “early warning system” against possible breaches in progress that could replicate onto a DR infrastructure. In addition to typical log types covering logins and administrator actions, an automated log management system can support infrastructure security by including the following log collections and alerts:
    • Anti-malware software – These logs can indicate that malware was detected, disinfection attempt results, file quarantines, when file-system scans were last performed, when anti-virus signature files were last updated, and when software upgrades have taken place.
    • Applications – Logs can include account changes, user authentication attempts, client and server activity, and configuration changes.
    • Authentication servers – These typically log each and every authentication attempt showing the originating user ID, destination system or application, date and time, and success/failure details.
    • Firewalls – These very detailed and informative logs can show what activity was blocked according to security policies.
    • Intrusion detection and protection systems – These systems record detailed information about suspicious behavior and detected attacks as well as actions taken to halt malicious activity in progress.
    • Network access control servers – These logs can provide a great deal of useful information about both successful/permitted and unsuccessful/quarantined network connections.
    • Network devices (routers, switches) – Network device logs can provide information on network communication activity and what types of traffic were blocked.
    • Operating systems – Beyond typical log entries, operating system logs can contain information from security software and system applications that can help identify suspicious activity involving a particular host.
    • Virtual private networks (VPNs) – VPN logs record both successful and failed connection attempts, date and time of connects and disconnects, and the types and amount of data sent and received during a session.
    • Vulnerability management software – Scanning and patch management software logs entries such as configuration, missing software updates, identified vulnerabilities, and patch/scan currency downloads.
    • Web application firewalls – WAFs generate “deny logs” which identify blocked application requests, useful in identifying attempted attacks that included applications as a possible entry into systems.
    • Web proxies – Web proxy logs record user activity and URLs accessed by specified users.
  • Have experienced analysts regularly review log data – Warnings of possible threats to the infrastructure are embedded in all of the log data flowing through the above systems. Regular log analyses can reveal them and trigger preventive action. Few companies can afford the time and cost to have in-house IT staff with the expertise to sift through thousands of log entries per day and detect anomalies. But the powerful analytic engines in today’s automated log management systems, combined with the expertise of live security analysts in a Security-as-a-Service environment, can quickly collect and analyze log data to deliver actionable results.
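As a rough illustration of the "early warning" idea in the list above, the following sketch raises an alert when one account accumulates too many failed logins inside a short time window. The function name, the tuple layout of the events, and the default threshold and window are all assumptions made for this example; a commercial log management service would expose its own alert rules.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def failed_login_alerts(events, threshold=5, window=timedelta(minutes=10)):
    """Yield (user, timestamp, count) each time a user has at least
    `threshold` failed logins within the trailing `window`.

    `events` is an iterable of (timestamp, user, outcome) tuples,
    assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # user -> timestamps of recent failures
    for ts, user, outcome in events:
        if outcome != "failure":
            continue
        failures = recent[user]
        failures.append(ts)
        # Drop failures that have aged out of the window.
        while failures and ts - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            yield user, ts, len(failures)


if __name__ == "__main__":
    start = datetime(2014, 1, 13, 8, 0, 0)
    sample = [(start + timedelta(minutes=i), "admin", "failure") for i in range(5)]
    for user, when, count in failed_login_alerts(sample):
        print(f"ALERT: {count} failed logins for {user} as of {when}")
```

The same sliding-window pattern can be pointed at other signals from the list, such as blocked firewall connections or quarantine events, by changing what counts as a matching entry.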
Finding the Needle in the Haystack
Outside of an emergency incident, automated log management systems also support the availability of enterprise data by parsing and normalizing all of the large, multiple flows of diverse log data to make it easily searchable. As such, it is possible to find in seconds that series of failed login requests or privilege escalation that led to a problem later. However, log management systems really pay off when trying to find the “needle in the haystack” during suspicious network incidents and dealing with compliance issues. The following use cases show how log management best practices can enhance infrastructure security and daily compliance tasks alike:
Use case #1: Detection of unauthorized changes to domain policies
A review of log data tracked changes to an administrator account, but the administrator credibly claimed not to have made the changes. Further analysis traced the login to a known attacker who used stolen credentials. Local logs had been deleted, but the customer had secure offsite data available via a log management solution. To prevent further breaches, the customer set an automated alert for admin-level changes. In addition, the daily log review analysis function in the solution “watches” for other suspicious activity.
Use case #2: Finding audit information quickly
Compliance regulations required that a business identify failed login attempts on admin accounts and demonstrate to an auditor that these attempts were identified. Finding the failed attempts was difficult and time consuming. Using a log review function, a daily analyst report on failed admin login attempts could be generated. These reports are stored for a year and are easily available in seconds to show compliance with a daily log review mandate.
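A daily report like the one in this use case can be approximated with a simple grouping step. The sketch below is an assumption-laden illustration, not the reviewed product's feature: the `ADMIN_ACCOUNTS` set, the function name, and the event tuple layout are hypothetical, and events are assumed to be already normalized to (timestamp, user, outcome).

```python
from collections import Counter
from datetime import datetime

# Hypothetical list of privileged accounts; a real deployment would load this
# from a directory service or configuration.
ADMIN_ACCOUNTS = frozenset({"admin", "root"})


def daily_failed_admin_logins(events, admin_accounts=ADMIN_ACCOUNTS):
    """Return {date: Counter({account: failures})} for failed logins
    on admin accounts.

    `events` is an iterable of (timestamp, user, outcome) tuples.
    """
    report = {}
    for ts, user, outcome in events:
        if outcome == "failure" and user in admin_accounts:
            report.setdefault(ts.date(), Counter())[user] += 1
    return report


if __name__ == "__main__":
    sample = [
        (datetime(2014, 1, 13, 8, 0), "admin", "failure"),
        (datetime(2014, 1, 13, 9, 0), "admin", "failure"),
        (datetime(2014, 1, 13, 9, 5), "alice", "failure"),
        (datetime(2014, 1, 14, 1, 0), "root", "failure"),
    ]
    for day, counts in sorted(daily_failed_admin_logins(sample).items()):
        for account, failures in sorted(counts.items()):
            print(f"{day}  {account}: {failures} failed login(s)")
```

Storing each day's output alongside the raw logs gives an auditor a dated record that the review actually took place.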
Be Proactive with Log Management
What you don’t see can hurt you. In the end, including log management in your incident response plan is a proactive way to gain deeper visibility into your infrastructure and protect against the replication of security breaches during disaster recovery. Make a commitment to consistent collection of all log data from all disparate sources – on-premise, virtual, and cloud – and to consistent analysis that will give you actionable insight into keeping your data always secure and available.
Misha Govshteyn is chief strategy officer and co-founder of Alert Logic, a leading provider of security-as-a-service solutions for the cloud. Govshteyn co-founded Alert Logic in 2002, and is responsible for security strategy, security research and software development at Alert Logic.