Security log files can save you a ton of money
Security log files are gold. They are rich with clues that can help you prevent data breaches and cyberattacks proactively. Log analytics is a cottage industry that covers collecting, analyzing, and searching large volumes of security-related log data in order to detect and respond to potential security threats.
In December 2020, the US cybersecurity firm FireEye reported that it had been the victim of an advanced cyber attack involving the theft of its proprietary red team tools. The perpetrators gained access to FireEye's internal systems through a compromised update to the SolarWinds Orion platform, a widely used network monitoring tool.
“Without question, this is the most sophisticated cyber espionage campaign I've observed in my career as a professional cybersecurity defender. The level of sophistication and operational security demonstrated here is unprecedented, and not likely the work of a single nation-state actor.”
Kevin Mandia, CEO of FireEye
FireEye detected the attack by virtue of its extensive use of security log analytics. The company's security team noticed unusual activity on their systems, which led to the discovery of the breach. FireEye immediately notified the relevant authorities and launched an investigation.
Further analysis of the attack revealed that multiple organizations, including US government agencies and major corporations, had also been targeted in what became known as the SolarWinds supply chain attack. The attack was highly sophisticated and well-coordinated, and it took several months to fully understand its scope and impact.
The incident highlighted the importance of security log analytics in detecting and responding to cyber threats. Without security logs and advanced analytics tools, the SolarWinds attack might have gone undetected for much longer, potentially causing even greater damage.
What is security log analytics?
Security log analytics refers to the process of collecting, analyzing, and searching large volumes of security-related log data in order to detect and respond to potential security threats.
Before going any further, here are a few common types or examples of cybersecurity practices at work.
Treating, handling, protecting, and leveraging log files is just one more type of cybersecurity practice.
“Security logs are a crucial source of information for detecting and responding to cyber threats, but analyzing them in a scalable and effective manner is a daunting task. Machine learning techniques show promise in automating log analysis, but they face significant challenges in dealing with the large volume of data, inconsistent formats, and limited ground truth in enterprise settings. Developing accurate and reliable automated log analysis tools will require significant research and development efforts, as well as ongoing collaboration between industry and academia.”
Heng Yin, Associate Professor of Computer Science at UC Riverside
There are various forms of analysis that play into a cybersecurity strategy.
- Static analysis
- Dynamic analysis
- Log file analysis
- (there are more, but these are sufficient for proving the point)
Within the flow of troubleshooting anomalies and protecting against bad actors, your cybersecurity team is working through these daily. But notice the critical nature of the log file work: it is no less powerful or important than any of the other fancy words.
There are so many log files we could point to. Here are just the basic, low-hanging-fruit ones:
- Firewall logs
Record the activity of network traffic that is either allowed or blocked by the firewall.
- Intrusion detection and prevention system (IDS/IPS) logs
Record events related to unauthorized access attempts, malware infections, and other security threats.
- Event logs
Record various system and application events such as software installations, system reboots, and user logins.
- Application logs
Record events related to specific applications, such as login attempts, data transfers, and errors.
- Network device logs
Record events related to network devices such as routers, switches, and firewalls, including configuration changes and network traffic.
- Operating system logs
Record events related to the operating system, such as system startup, shutdown, and software installations.
- Database logs
Record events related to database access, such as user login attempts and data modifications.
- Authentication logs
Record user authentication events, including login and logout times, and the outcome of authentication attempts.
- Web server logs
Record events related to web server activity, such as requests for web pages, data transfers, and errors.
- VPN logs
Record events related to virtual private network (VPN) connections, including login and logout times and the outcome of VPN connections.
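To make one of these concrete, here is a minimal sketch of turning a single web server log line into structured fields. It assumes the server writes the widely used Common Log Format; the sample line, the user name, and the function name `parse_access_line` are made up for illustration, and real logs often carry extra fields that this pattern ignores.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_line(line):
    """Return a dict of fields for one access-log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert text fields into typed values so they can be filtered and sorted.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Invented sample line using a documentation-reserved IP address.
sample = '203.0.113.7 - alice [10/Oct/2023:13:55:36 +0000] "GET /admin/login HTTP/1.1" 401 512'
record = parse_access_line(sample)
print(record["host"], record["status"], record["request"])
```

Once lines are parsed like this, a question such as "which hosts keep getting 401 responses on the admin login page?" becomes a simple filter rather than a text search.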
Infrastructure junkies would argue at this point that most of the guts and glory is found in the network logs, but that’s debatable and hard to prove fully. The story behind the story, if there is one, is that IT network and infrastructure engineers are on the hook to be out in front of the battle. That is, to proactively monitor their systems and not just respond to threats or executives’ fearful emails at 2am. It makes them a little jumpy, but with good reason. It’s a tough job. They dig in, rely on strict policies, and beat those policies into the heads of their IT comrades. The battle-line stance typically spawns PowerPoint slides that speak to five very specific areas:
- Data privacy - “We absolutely must encrypt everything!”
Log data often contains sensitive information about an organization's systems, users, and transactions. This information needs to be protected to prevent unauthorized access and misuse. Organizations need to implement appropriate data privacy and security measures to ensure the confidentiality and privacy of log data.
- Data volume and velocity - “We may run out of space in 6 months!”
The sheer volume and velocity of log data can make it difficult to store, process, and analyze the data in a timely and effective manner. This can lead to delays in detecting and responding to security incidents and can also put a strain on an organization's IT resources.
- Data quality and accuracy - “Someone needs to be working on this!”
The quality and accuracy of log data can be affected by a variety of factors, including human error, system malfunctions, and data corruption. This can make it difficult to detect and respond to security incidents in a reliable and effective manner.
- Log tampering and manipulation - “We need to buy the license for that special software that detects all this stuff!”
Attackers can attempt to manipulate or tamper with log data in order to conceal their activities or cover their tracks. This can make it difficult to detect and respond to security incidents, and can also undermine the trustworthiness of log data.
- False positives and false negatives - “My team is chasing their tail half the time but I don’t want to admit it.”
Security log analytics tools and systems can generate false positives and false negatives, leading to false alarms or missed security incidents. This can reduce the effectiveness of security log analytics and also waste valuable IT resources.
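The log tampering concern in particular lends itself to a small illustration. One common idea is a hash chain, where every log line is stored with a digest that also covers the previous digest, so an edit or deletion in the middle breaks everything after it. The sketch below is only that, a sketch: the function names, the `seed` value, and the sample lines are invented, and it is not how any particular product works.

```python
import hashlib


def chain_logs(lines, seed="genesis"):
    """Pair each log line with a SHA-256 digest that also covers the previous digest."""
    previous = hashlib.sha256(seed.encode("utf-8")).hexdigest()
    chained = []
    for line in lines:
        digest = hashlib.sha256((previous + line).encode("utf-8")).hexdigest()
        chained.append((line, digest))
        previous = digest
    return chained


def first_tampered_index(chained, seed="genesis"):
    """Return the index of the first entry whose digest no longer matches, or None."""
    previous = hashlib.sha256(seed.encode("utf-8")).hexdigest()
    for index, (line, digest) in enumerate(chained):
        expected = hashlib.sha256((previous + line).encode("utf-8")).hexdigest()
        if expected != digest:
            return index
        previous = digest
    return None


# Invented sample log lines.
original = chain_logs([
    "13:55:36 login ok user=alice",
    "13:56:02 config change user=alice",
    "13:57:10 logout user=alice",
])
print(first_tampered_index(original))  # None: the chain is intact

# Simulate an attacker rewriting the second line but keeping the old digest.
tampered = list(original)
tampered[1] = ("13:56:02 nothing happened", original[1][1])
print(first_tampered_index(tampered))  # 1: the first entry that fails verification
```

A chain like this only helps if the digests are kept somewhere the attacker cannot also rewrite, such as a separate logging host. Otherwise the whole chain can simply be recomputed.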
“Security is always going to be a cat and mouse game because there'll be people out there that are hunting for the zero day award, you have people that don't have configuration management, don't have vulnerability management, don't have patch management.”
Kevin Mitnick, a famous computer security consultant and hacker
Challenges in digging into those log files
There are a number of factors that may make effective security log analysis difficult:
- The sheer volume of data is typically overwhelming. Enterprises often generate massive volumes of data in their security logs, making it increasingly difficult to identify and respond to potential threats in a timely manner. Every day, some poor network guy is loading a CSV file into an Access or SQL Server database, hoping to run a few queries and narrow down a handful of targets to show the boss. It’s just a ton of data.
- Most logging systems have completely inconsistent formats. Logs can come from multiple vendors and be very different in their column usage and field delimiters, which makes it difficult to extract meaningful insights and patterns in bulk. Each log file has to be sucked in via custom mappings, and that simply takes time.
- There is often a sizable semantic gap between the information recorded in logs and what is required to detect a breach. The values recorded in system logs may not be enough on their own, and what they mean may not be clear or obvious. Security analysts have to interpret and map the data to identify suspicious activities or patterns that could indicate a breach. This takes a lot of expertise and experience, because the analysts must connect seemingly unrelated pieces of data to uncover potential security threats.
- There is usually very limited ground truth to work with. “Ground truth” refers to the actual data about real security intrusions that can be used to train machine learning algorithms to recognize and respond to similar patterns or behaviors. Ground truth of real intrusions is limited in an enterprise setting due to the large number of perimeter protections employed. This makes it difficult to train machine learning algorithms to accurately detect and respond to threats.
- Organizations may not have enough IT professionals with the skills and expertise needed to effectively analyze security logs and detect potential threats.
- Security logs can generate a high number of false positives, which can be time-consuming and expensive to investigate.
- Data privacy and compliance rules may require log files to be stored encrypted, with several steps needed to decrypt them. This takes time and can simply discourage IT professionals from working with the data. Most types of organizations nowadays must comply with many new and constantly evolving data privacy regulations, and may be restricted in their ability to analyze certain types of logs or share log data with third-party vendors.
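The inconsistent-formats problem from the list above is easier to see with a toy example. The sketch below maps two made-up vendor formats onto one shared set of field names, so later queries only deal with a single shape. Both formats, the field names, and the parser function names are invented for illustration; real vendor logs will differ and need their own mappings.

```python
import csv
import io


def parse_vendor_a(text):
    """Hypothetical vendor A: CSV with a header row (timestamp,src,dst,action)."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "timestamp": row["timestamp"],
            "source_ip": row["src"],
            "dest_ip": row["dst"],
            "action": row["action"].lower(),
        }


def parse_vendor_b(text):
    """Hypothetical vendor B: space-separated key=value pairs, one event per line."""
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = dict(pair.split("=", 1) for pair in line.split())
        yield {
            "timestamp": fields["ts"],
            "source_ip": fields["from"],
            "dest_ip": fields["to"],
            # This invented format reports PASS/BLOCK instead of allow/deny.
            "action": "allow" if fields["verdict"] == "PASS" else "deny",
        }


# Invented sample data using documentation-reserved IP addresses.
vendor_a_log = "timestamp,src,dst,action\n2023-10-10T13:55:36Z,203.0.113.7,198.51.100.2,DENY\n"
vendor_b_log = "ts=2023-10-10T13:56:01Z from=203.0.113.9 to=198.51.100.2 verdict=PASS\n"

events = list(parse_vendor_a(vendor_a_log)) + list(parse_vendor_b(vendor_b_log))
for event in events:
    print(event)
```

Every new log source needs another small parser like these, which is exactly the custom-mapping time sink described above.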
Analytic approaches employed in security log analysis
Ok so let’s say you have the data out of the log files now, sitting in a clean, not-too-large database, ready to work on. What now? Well, ideally, you’d attack it with the latest and greatest licensed software, including artificial intelligence tools and the many sub-types under that umbrella.
Approaches to security analysis include rule-based analysis, statistical analysis, machine learning, natural language processing (NLP), behavioral analysis, deep learning, and graph analysis. Rule-based systems use predefined rules to classify security events, while statistical models identify patterns in security logs. Machine learning algorithms detect hard-to-spot patterns and anomalies, while NLP extracts meaning from unstructured log data. Behavioral analysis identifies deviations from normal behavior, and deep learning can help identify new and emerging threats. Graph analysis identifies relationships between events to help surface potential threats.
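The two simplest approaches in that list, rule-based and statistical analysis, fit in a few lines. The sketch below is illustrative only: the event fields, the thresholds, and the sample numbers are all assumptions, and commercial tools are far more elaborate. The rule flags any source with too many failed logins, and the statistical check flags hours whose event count sits far from the average.

```python
from collections import Counter
from statistics import mean, pstdev


def rule_based_alerts(events, max_failures=3):
    """Rule: flag any source IP with more than max_failures failed logins."""
    failures = Counter(e["source_ip"] for e in events if e["outcome"] == "failure")
    return sorted(ip for ip, count in failures.items() if count > max_failures)


def statistical_outliers(hourly_counts, threshold=2.0):
    """Flag hours whose count is more than `threshold` standard deviations from the mean."""
    average = mean(hourly_counts)
    spread = pstdev(hourly_counts)
    if spread == 0:
        return []  # every hour looks the same, so nothing stands out
    return [hour for hour, count in enumerate(hourly_counts)
            if abs(count - average) / spread > threshold]


# Invented sample data: one noisy source, one ordinary user.
events = (
    [{"source_ip": "203.0.113.7", "outcome": "failure"}] * 5
    + [{"source_ip": "198.51.100.4", "outcome": "failure"},
       {"source_ip": "198.51.100.4", "outcome": "success"}]
)
print(rule_based_alerts(events))

# Invented hourly event counts with a spike in the last hour.
hourly_counts = [20, 22, 19, 21, 20, 23, 18, 95]
print(statistical_outliers(hourly_counts))
```

The rule catches what you already know to look for, while the statistical check can surface something nobody wrote a rule for. The price is more false alarms when normal traffic is naturally spiky.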
The overarching goal is to use security logs to detect attacks or breaches early, rather than use them in forensic analysis ex post facto. Here are some thoughts about how and why...
- By proactively monitoring and analyzing security logs, organizations can identify areas where their security defenses may be weak or ineffective, and make improvements to strengthen their overall security posture.
- Don’t let go of the ultimate dream of full, real-time monitoring of all systems. It’s entirely possible that your security logs can be monitored in real-time to detect and respond to potential threats as they occur, rather than waiting for a forensic analysis after an attack has already taken place.
- Create a culture where team members promote early detection. The security IT team can stay on top of things, and encouraging people to dig in and cuddle up with those textual log files (with advanced software, of course) fosters the right mental approach to detecting potential threats at an early stage, before they have a chance to escalate into full-blown attacks.
- “Show me the log files!” You can say this to your CISO (Chief Information Security Officer), and he/she should be able to gather it all up and present it. There might be consternation around how much data and how much detail, but it should be doable. If not, something isn’t right. Log files should be as normal and rhythmic as a morning cup of coffee.
- By analyzing historical data and using predictive analytics, organizations can identify patterns and trends that may indicate a potential security threat in the future, and take proactive measures to prevent it from occurring.
- Proactive use of security logs can be far more cost-effective than after-the-fact forensic analysis, as it allows organizations to detect and respond to potential threats before they can cause significant damage.
- A newfound focus on security logs can help organizations comply with regulatory requirements, such as those related to data privacy and security.
- By analyzing security logs proactively, organizations can gather valuable threat intelligence and use it to improve their overall security posture and defend against future attacks.
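The real-time monitoring idea from the list above can be sketched as a small sliding-window counter that is fed events as they arrive, instead of querying a database after the fact. Everything here is an assumption made for illustration: the class name `FailedLoginMonitor`, the window size, the threshold, and the sample events.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Track failed logins per source in a sliding time window and alert on bursts."""

    def __init__(self, window_seconds=60, max_failures=5):
        self.window_seconds = window_seconds
        self.max_failures = max_failures
        self.recent = defaultdict(deque)  # source -> timestamps of recent failures

    def observe(self, timestamp, source, outcome):
        """Feed one event; return True if this source is now over the threshold."""
        if outcome != "failure":
            return False
        window = self.recent[source]
        window.append(timestamp)
        # Drop failures that have aged out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_failures


# Invented event stream: timestamps are seconds since some starting point.
monitor = FailedLoginMonitor(window_seconds=60, max_failures=3)
alerts = [monitor.observe(t, "203.0.113.7", "failure") for t in (0, 10, 20, 30)]
print(alerts)

# A lone failure much later falls outside the window and does not alert.
print(monitor.observe(200, "203.0.113.7", "failure"))
```

In practice the events would come from a log shipper or message queue rather than a hard-coded list, but the point stands: the alert fires while the burst is happening, not during the post-mortem.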