Log Management: When Disaster Strikes You’ll Be Glad You Did
Written by MISHA GOVSHTEYN, January 13, 2014
How Log Management Can Enhance Infrastructure Security in Disaster Recovery Situations
When an enterprise network goes down – because of natural disaster, accidental system failure, or security breach – the first priority is identifying the impact and restoring the infrastructure to its pre-disaster state. However, how “clean” is the pre-disaster infrastructure? Have servers been tainted with malware that can replicate to multiple data systems on the disaster recovery site and allow security breaches? Were data confidentiality, integrity, and availability compromised in any way, either before or during the incident? And would you have the data required to troubleshoot these issues during an outage?
The bad news is that such security breaches are often not found until weeks or months after they occur. However, log management best practices can provide those answers and help you determine whether you’re working with a compromised infrastructure following disaster recovery (DR). Moreover, compliance with PCI, Sarbanes-Oxley, HIPAA, and GLBA regulations demands that log data be collected, regularly reviewed, and securely archived. The need for effective log management may seem obvious, but the value of log forensics for supporting infrastructure security is surprisingly often overlooked. This article looks at best practices for using log data analysis to enhance the overall security and availability of data after disaster recovery, and at the advantages of automated log management and analysis delivered as a service.
The Value of Log Management
Log management is an infrastructure management best practice that supports not only performance management but also security incident response and compliance requirements. It is a complex process of collecting log data from all sources in an enterprise environment and making it usable in a normalized, searchable format. Reviewing and analyzing log data regularly is a best practice for meeting compliance regulations, identifying suspicious activity, and generating forensic data for internal investigations. Properly collected, stored, and analyzed log data can provide a holistic view of your data flows and, most importantly, alert you to anomalies that could indicate security breaches. Log analysis can reveal unauthorized system access, failed logins, file access records, malware activity, botnet activity, and other attempts, failed or successful, to hack systems or steal data.
However, effective log management remains a challenge for many companies as the number and types of data sources continue to multiply across the enterprise. When performed manually, log management requires investments in additional IT staff and product acquisitions. Both are cost-prohibitive for many organizations and often ultimately fail to deliver the consistent, in-depth analysis required for DR infrastructure security.
Automated log management in the cloud as a SaaS offering is overcoming these challenges by simplifying the implementation, management, and analysis of log data, especially in the critical areas of analysis and reporting. This approach provides the consistent, automated collection, normalization, and storage that is critical to the confidentiality, integrity, and availability of enterprise data. Moreover, automated log management gives companies the agility and flexibility to collect and manage data from today’s virtual servers, elastic cloud environments, and hybrid environments, and integrate this data with traditional on-premise sources. The most effective SaaS solutions will also enhance automated functionality with the services of live security experts who can translate complex log data into actionable insight for protecting the infrastructure against security threats.
Recommended log management best practices:
- Include log management in the incident response plan – Log management is most effective as an infrastructure security measure when it is a component of the incident response plan, not an afterthought during the chaotic hours after the incident. Specifically, consistent collection and analysis of log information from all data sources is the core process.
- Store log data securely off-site to ensure availability – Log information can be an attractive target for malicious hackers. Maintain log data securely off-site, just as you would your core data, to ensure its availability and integrity during a disaster incident.
- Alert on key activities to get warnings of unusual activity – Beyond its use for after-the-fact forensics, log management can also be a key “early warning system” against possible breaches in progress that could replicate onto a DR infrastructure. In addition to typical log types covering logins and administrator actions, an automated log management system can support infrastructure security by including the following log collections and alerts:
  - Anti-malware software – These logs can indicate that malware was detected, disinfection attempt results, file quarantines, when file-system scans were last performed, when anti-virus signature files were last updated, and when software upgrades have taken place.
  - Applications – Logs can include account changes, user authentication attempts, client and server activity, and configuration changes.
  - Authentication servers – These typically log every authentication attempt, showing the originating user ID, destination system or application, date and time, and success/failure details.
  - Firewalls – These very detailed and informative logs can show what activity was blocked according to security policies.
  - Intrusion detection and prevention systems – These systems record detailed information about suspicious behavior and detected attacks as well as actions taken to halt malicious activity in progress.
  - Network access control servers – These logs can provide a great deal of useful information about both successful/permitted and unsuccessful/quarantined network connections.
  - Network devices (routers, switches) – Network device logs can provide information on network communication activity and what types of traffic were blocked.
  - Operating systems – Beyond typical log entries, operating system logs can contain information from security software and system applications that can help identify suspicious activity involving a particular host.
  - Virtual private networks (VPNs) – VPN logs record both successful and failed connection attempts, date and time of connects and disconnects, and the types and amount of data sent and received during a session.
  - Vulnerability management software – Scanning and patch management software logs entries such as configuration, missing software updates, identified vulnerabilities, and how current patch and scan downloads are.
  - Web application firewalls – WAFs generate “deny logs” that identify blocked application requests, which is useful in identifying attempted attacks that used applications as a possible entry point into systems.
  - Web proxies – Web proxy logs record user activity and URLs accessed by specified users.
- Have experienced analysts regularly review log data – Warnings of possible threats to the infrastructure are embedded in all of the log data flowing through the above systems. Regular log analyses can reveal them and trigger preventive action. Few companies can afford the time and cost to have in-house IT staff with the expertise to sift through thousands of log entries per day and detect anomalies. But the powerful analytic engines in today’s automated log management systems, combined with the expertise of live security analysts in a Security-as-a-Service environment, can quickly collect and analyze log data to deliver actionable results.
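To make the “early warning system” idea from the alerting recommendation concrete, here is a minimal Python sketch of one common kind of alert: a burst of failed logins for the same account inside a short time window. The threshold, the window length, and the `(timestamp, user, success)` event shape are all assumptions made for illustration; they do not come from any particular log management product.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Hypothetical thresholds; tune them for your own environment.
MAX_FAILURES = 3
WINDOW = timedelta(minutes=5)


def find_failed_login_bursts(events, max_failures=MAX_FAILURES, window=WINDOW):
    """Return (user, timestamp) for each failure that brings a user to or
    above the threshold within the sliding window.

    `events` is an iterable of (timestamp, user, success) tuples sorted by time.
    """
    recent = defaultdict(deque)  # user -> timestamps of recent failures
    alerts = []
    for timestamp, user, success in events:
        if success:
            continue
        failures = recent[user]
        failures.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while timestamp - failures[0] > window:
            failures.popleft()
        if len(failures) >= max_failures:
            alerts.append((user, timestamp))
    return alerts


# Invented sample events, already in the assumed tuple shape.
t0 = datetime(2014, 1, 13, 9, 0)
sample_events = [
    (t0, "admin", False),
    (t0 + timedelta(minutes=1), "admin", False),
    (t0 + timedelta(minutes=2), "alice", False),
    (t0 + timedelta(minutes=3), "admin", False),
    (t0 + timedelta(minutes=30), "alice", False),
]
bursts = find_failed_login_bursts(sample_events)
for user, when in bursts:
    print(f"ALERT: repeated failed logins for {user} at {when}")
```

In this sample only the `admin` account trips the alert, because its three failures fall within five minutes, while the two `alice` failures are half an hour apart. A production system would also need to handle out-of-order events and feed alerts to an analyst rather than print them.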
Finding the Needle in the Haystack
Outside of an emergency incident, automated log management systems also support the availability of enterprise data by parsing and normalizing the many large, diverse flows of log data to make them easily searchable. As a result, it is possible to find in seconds the series of failed login requests or the privilege escalation that led to a problem later. However, log management systems really pay off when you are trying to find the “needle in the haystack” during suspicious network incidents and when dealing with compliance issues. The following use cases show how log management best practices can enhance infrastructure security and daily compliance tasks alike:
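As a rough illustration of what “parsing and normalizing” means in practice, the Python sketch below maps two differently shaped log lines onto one common record so that a single query can search both. Both line formats, the field names, and the `normalize` helper are invented for this example; real devices and real log management services use their own formats and schemas.

```python
import re
from datetime import datetime

# Both line formats below are invented for illustration only.
AUTH_PATTERN = re.compile(
    r"^(?P<ts>\S+) auth user=(?P<user>\S+) result=(?P<result>\w+) src=(?P<ip>\S+)$"
)


def normalize(line):
    """Map one raw line onto a common record, or return None if unrecognized."""
    match = AUTH_PATTERN.match(line)
    if match:
        return {
            "timestamp": datetime.fromisoformat(match["ts"]),
            "source": "auth",
            "user": match["user"],
            "ip": match["ip"],
            "outcome": "failure" if match["result"] == "FAIL" else "success",
        }
    fields = line.split(",")
    if len(fields) == 5 and fields[0] == "vpn":
        _, ts, user, ip, status = fields
        return {
            "timestamp": datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"),
            "source": "vpn",
            "user": user,
            "ip": ip,
            "outcome": "success" if status == "connected" else "failure",
        }
    return None


raw_lines = [
    "2014-01-13T09:00:00 auth user=admin result=FAIL src=10.0.0.5",
    "vpn,2014-01-13 09:02:00,alice,203.0.113.7,connected",
    "2014-01-13T09:03:00 auth user=admin result=OK src=10.0.0.5",
    "unparseable line",
]
records = [r for r in map(normalize, raw_lines) if r is not None]

# One query now works across every source that was normalized.
failed = [r for r in records if r["outcome"] == "failure"]
for record in failed:
    print(record["timestamp"], record["source"], record["user"], record["ip"])
```

Once every source shares the same field names, searching for failed logins no longer depends on knowing which system produced the line. Silently dropping unrecognized lines, as this sketch does, would be a poor choice in production, where unparsed lines should be kept and reviewed.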
Use case #1: Detection of unauthorized changes to domain policies
A review of log data tracked changes to an administrator account, but the administrator credibly claimed not to have made the changes. Further analysis traced the login to a known attacker who used stolen credentials. Local logs had been deleted, but the customer had secure offsite data available via a log management solution. To prevent further breaches, the customer set an automated alert for admin-level changes. In addition, the daily log review analysis function in the solution “watches” for other suspicious activity.
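An automated alert for admin-level changes like the one in this use case could look something like the following Python sketch. The action names, the record fields, and the idea of a trusted-host list are assumptions made for the example, not details of the customer’s actual solution.

```python
# Hypothetical action names; a real system would use its own event taxonomy.
ADMIN_ACTIONS = {
    "policy_change",
    "group_membership_change",
    "account_privilege_change",
}


def admin_change_alerts(records, trusted_hosts):
    """Yield a message for every admin-level change, marking untrusted origins."""
    for record in records:
        if record["action"] not in ADMIN_ACTIONS:
            continue
        origin = "trusted" if record["host"] in trusted_hosts else "UNTRUSTED"
        yield (
            f"{record['time']} {record['user']} {record['action']} "
            f"from {record['host']} ({origin})"
        )


# Invented sample records in an assumed normalized shape.
sample_records = [
    {"time": "2014-01-13 02:14", "user": "jsmith", "action": "login", "host": "ws-17"},
    {"time": "2014-01-13 02:15", "user": "jsmith", "action": "policy_change", "host": "unknown-42"},
    {"time": "2014-01-13 10:30", "user": "jsmith", "action": "policy_change", "host": "ws-17"},
]
alerts = list(admin_change_alerts(sample_records, trusted_hosts={"ws-17"}))
for alert in alerts:
    print(alert)
```

The sketch reports every admin-level change rather than only the suspicious ones, so that the administrator can confirm or deny each change. The origin flag simply helps an analyst decide which entries to look at first.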
Use case #2: Finding audit information quickly
Compliance regulations required that a business identify failed login attempts on admin accounts and demonstrate to an auditor that these attempts were identified. Finding the failed attempts was difficult and time-consuming. Using a log review function, the business could generate a daily analyst report on failed admin login attempts. These reports are stored for a year and can be retrieved in seconds to show compliance with a daily log review mandate.
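A daily report of this kind boils down to grouping failed admin logins by calendar day. The Python sketch below shows the idea under stated assumptions: the list of admin account names and the record fields (`timestamp`, `user`, `outcome`) are hypothetical, and a real log review function would add analyst notes and retention handling on top.

```python
from collections import Counter
from datetime import date, datetime

# Hypothetical list of accounts treated as administrative.
ADMIN_ACCOUNTS = {"admin", "root"}


def daily_failed_admin_logins(records):
    """Count failed logins on admin accounts per calendar day, oldest day first."""
    counts = Counter()
    for record in records:
        if record["user"] in ADMIN_ACCOUNTS and record["outcome"] == "failure":
            counts[record["timestamp"].date()] += 1
    return dict(sorted(counts.items()))


# Invented sample records in an assumed normalized shape.
sample = [
    {"timestamp": datetime(2014, 1, 13, 9, 0), "user": "admin", "outcome": "failure"},
    {"timestamp": datetime(2014, 1, 13, 9, 5), "user": "admin", "outcome": "failure"},
    {"timestamp": datetime(2014, 1, 13, 9, 6), "user": "alice", "outcome": "failure"},
    {"timestamp": datetime(2014, 1, 14, 8, 0), "user": "root", "outcome": "failure"},
    {"timestamp": datetime(2014, 1, 14, 8, 1), "user": "root", "outcome": "success"},
]
report = daily_failed_admin_logins(sample)
for day, count in report.items():
    print(f"{day}: {count} failed admin login attempt(s)")
```

Keeping each day’s output alongside the raw logs is what lets the business show an auditor, on request, that the attempts were identified on the day they happened.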
Be Proactive with Log Management
What you don’t see can hurt you. In the end, including log management in your incident response plan is a proactive way to gain deeper visibility into your infrastructure and protect against the replication of security breaches during disaster recovery. Make a commitment to consistent collection of all log data from all disparate sources – on-premise, virtual, and cloud – and to consistent analysis that will give you actionable insight into keeping your data always secure and available.