Incident Response data acquisition, but scalable and fast

At Hunt & Hackett we realised that traditional incident response methods rooted in digital forensics, whereby large amounts of data are acquired over several days before initiating an investigation, are no longer sufficient, especially when dealing with large-scale security incidents, ransomware or advanced persistent threats (APTs). It's time for a shift towards an automated incident response strategy that combines the investigative prowess of a digital detective with a DevOps mindset.  

To make efficient data acquisition possible, Hunt & Hackett has developed an innovative cloud-based incident response lab, which provides us with a scalable solution for data acquisition during incident response cases. Leveraging Infrastructure-as-Code and open-source software, the Computer Emergency Response Team feeds the knowledge gathered from previous incident response cases into sets of investigative playbooks, configurations and other repeatable data acquisition methods, which can be re-used during new cases and by the rest of the incident response community.  

This approach to a cloud-based incident response lab has proven so efficient that it has become the cornerstone of our incident response service. This blog post will walk you through our approach and the data acquisition methods that are part of our incident response lab. 

Embracing DevOps in Modern Incident Response

The Hunt & Hackett Computer Emergency Response Team (CERT) helps organizations deal with and recover from security incidents, breaches, and cyber threats within today's rapidly evolving global threat landscape[1]. As the frequency and scale of security incidents continue to rise, CERTs face significant challenges. Traditional incident response methods rooted in digital forensics, whereby large amounts of data are acquired over several days before initiating an investigation, are no longer sufficient, nor are they efficient enough when dealing with large-scale security incidents and advanced persistent threats (APTs).

A shift towards an automated incident response strategy that combines the investigative prowess of a digital detective with a DevOps mindset is needed. We believe that by adopting this approach, incident response teams can better adapt to the evolving threat landscape, using the following principles that are part of the DevOps mindset:

  • [Iterative process of continuous improvement]: implementing incremental improvements, which allows for faster delivery of features and enhancements based on feedback from incident response cases;
  • [Repeatability]: automation is leveraged to ensure consistency, reliability and repeatability in the incident response processes and tools, reducing the risk of errors;
  • [Feedback loops]: insights gathered from incident response cases are used to identify issues, validate assumptions, and make informed decisions to drive continuous improvement;
  • [Validation]: ensuring that software and processes fulfill the requirements not only of the CERT but also those of the client (IT team, legal, third-party partners, etc.) throughout the development lifecycle, validating the results of automation and ensuring the reproducibility of findings throughout the entire incident response process;
  • [Collaboration]: breaking down the traditional silos between development and operations to foster collaboration between incident responders, reverse engineers, threat intelligence analysts, red teamers, and SOC analysts, enhancing efficiency and effectiveness. This collaboration extends beyond our internal teams to include direct collaboration with our client’s response team, as well as the seamless system integration of threat intelligence and managed detection & response[2] with the incident response lab.

Foundation of our Cloud-based Incident Response Lab

In setting up our cloud native incident response service[3], Hunt & Hackett had the advantage of being able to start from scratch, without any legacy, in a cloud native world and with a significant number of lessons learned from the past. Drawing upon these insights, we outlined our lab’s foundation based on the following key principles:

  • [Reliability]: ensure that services work independently from each other;
  • [Availability]: guarantee that services can be accessed from anywhere across the globe;
  • [Agnostic]: support different kinds of data sources and tools;
  • [Scalability]: scale based on the demands of the incident response case;
  • [Isolation]: provide segregated investigation environments for different investigations, ensuring that data and operations are contained and cannot interfere with each other;
  • [Compliance]: ensure that environments can be governed and deleted in a GDPR-compliant manner, respecting regulatory requirements.

Leveraging the open-source project Digital Forensics & Incident Response Lab[4] as a foundation, we have built a cloud-based incident response lab hosted on the Google Cloud Platform (GCP). The lab is built using Packer to automate the building of images and Terraform for the automatic provisioning of infrastructure on GCP using Infrastructure as Code, which enables us to do the following:

  • [Leverage knowledge]: feed the knowledge gathered from incident response cases into sets of investigative playbooks, configurations and other repeatable data acquisition methods (i.e. tools and plugins) as part of the lab, that can be re-used during new cases;
  • [Automate scalable provisioning]: provision the lab via automation at scale, which ensures consistency, reliability, and repeatability;
  • [Streamline data acquisition, processing, and analysis]: perform data acquisition, processing and analysis using various methods within two hours, without the intervention of an incident responder.

Utilizing Infrastructure as Code configuration with Terraform, we can automatically provision the lab for a specific investigation on the Google Cloud Platform within just 15 minutes, resulting in a total of 405 Terraform resources. Currently, the lab consists of the following data acquisition methods, as shown in Figure 1:

  • The deployment of a Velociraptor server to monitor Velociraptor clients;
  • The creation of Velociraptor collectors for the following data sources:
    • The collection of forensic packages of Windows, Unix and macOS systems;
    • The collection of Active Directory objects using SharpHound;
    • The collection of memory images of Windows and Linux systems;
  • The deployment of a system for collecting data from SaaS applications using Elastic Filebeat;
  • The creation of GCS buckets for uploading investigation material such as disk images and application, firewall and network logs.


Figure 1 - Overview of the data acquisition methods of the incident response lab

 

The Bigger Picture

The incident response lab of Hunt & Hackett is designed to handle the full incident response lifecycle:
  • [Preparation]: for incident response retainer clients of Hunt & Hackett, a dedicated instance of the lab is available 24/7, so it is already in place when a security incident occurs. This allows our clients to upload investigation material with the click of a button, eliminating the start-up time of an investigation and improving forensic readiness;
  • [Detection & Analysis]: the processing of investigation material is highly automated, accelerating the detection and analysis phase of the investigation and reducing the margin for error;
  • [Containment, Eradication & Recovery]: due to the rapid acquisition and processing of investigation material, compromised systems can be identified swiftly and efficiently. This enhances the containment, eradication and recovery phase, by enabling faster remediation of threats, reducing downtime, and minimizing potential damage;
  • [Reporting & Post-Incident Activity]: the lab’s collaborative functionalities allow Hunt & Hackett to collaborate more closely with incident response clients and to provide them with daily updates of new findings. Additionally, as our lab is cloud native, there are no storage capacity limitations, and data retention for archived investigation material can be set flexibly based on regulatory and other requirements.

As an end-to-end solution, the lab automates the ingestion and processing of all key forensic data sources in a single platform, enabling comprehensive investigation capabilities.

The next sections cover the following (implemented) data acquisition methods in detail:

  1. Dual capabilities of Velociraptor for threat hunting and data acquisition;
  2. Configuration-based data acquisition to prevent logs remaining uncollected;
  3. The underestimated value of memory forensics;
  4. The acquisition of Active Directory objects with SharpHound to identify Active Directory attack paths;
  5. Staying under the radar by performing data acquisition from the hypervisor.

A/ Dual capabilities of Velociraptor for threat hunting and data acquisition

Velociraptor[5] is an open-source digital forensics tool, which can perform targeted collections of forensic artefacts and logs across systems, as well as actively hunt and monitor systems for suspicious activity. Velociraptor serves the following two primary functions:

    • [Data acquisition]: Velociraptor collectors are utilized for the acquisition of forensic artefacts, logs, and other data sources;
    • [Hunting & monitoring]: Velociraptor clients facilitate the hunting and monitoring of systems for suspicious activity.

The following subsections describe these methods.

Data Acquisition with Velociraptor Collectors

Velociraptor supports the use of collectors, which are executables equipped with pre-defined lists of forensic artefacts, logs, and other data sources. These collectors automatically collect and upload data to a storage system. The collection of forensic artefacts and logs of a system is known as a forensic package.

At Hunt & Hackett, Velociraptor collectors are used to automatically collect forensic packages from Windows, Unix and macOS systems, uploading them to Google Cloud Storage (GCS) within minutes. This approach offers speed and scalability, as GCS supports unlimited storage capacity[6]. Within the lab environment, forensic packages are processed with the Dissect incident response framework to perform forensic timeline analysis, as elaborated on in a previous blog post.

    • For Windows systems, the artefact Windows.KapeFiles.Targets is used with the target SANS Triage Collection enabled, a collection of the most relevant forensic artefacts and logs of the Windows operating system that is part of the KapeFiles targets[7];
    • For Unix and macOS systems, the artefact Generic.Collectors.File is leveraged with a predefined list of forensic artefacts and logs inspired by the digital forensic artefact repository[8], a community-sourced knowledge base of forensic artefacts (see the sketch below). 
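
As an illustration, the following minimal VQL sketch shows what such a collector effectively runs once the artefact selection is baked in. Both artefacts ship with upstream Velociraptor, but exact parameter names and values (such as _SANS_Triage) vary between versions, so treat this as indicative rather than as our exact collector configuration:

    // Windows: collect the SANS Triage Collection target of the
    // Windows.KapeFiles.Targets artefact (parameter name assumed).
    SELECT * FROM Artifact.Windows.KapeFiles.Targets(_SANS_Triage=TRUE)

    // Unix-like systems: collect a predefined list of glob patterns with
    // Generic.Collectors.File (collectionSpec is a CSV with a Glob column).
    SELECT * FROM Artifact.Generic.Collectors.File(
        Root="/", collectionSpec="Glob\nvar/log/**\netc/**\n")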

By leveraging the known locations of forensic artefacts and logs in a data-driven manner, the acquisition of forensic packages is extremely fast in comparison to the acquisition of full disk images, primarily because the smaller data volume increases the speed with which relevant forensic artefacts can be acquired.

The creation of Velociraptor collectors is automated using Terraform, which configures the collectors with GCS credentials. The following configuration is an example that creates a Windows Velociraptor collector using the target SANS Triage Collection: https://github.com/huntandhackett/ir-automation/blob/main/velociraptor-collector.tf.

Furthermore, Velociraptor collectors can be used to execute other software packages that acquire memory images and Active Directory objects, as explained in the following sections. 

The Velociraptor philosophy: enhancing threat hunting 

The philosophy[9] of Velociraptor is to treat the actual system as the source of truth, using targeted queries that directly acquire and analyse data. Rather than collecting all the data into a central location, Velociraptor pushes the queries to the systems and parses data directly on those systems. The results of these queries are then uploaded to a Velociraptor server for further analysis and to aid threat hunting.

This philosophy is put into practice by deploying Velociraptor clients on running systems. These clients connect to the Velociraptor server and await instructions, which are executed as hunts written in the Velociraptor Query Language.

While Velociraptor collectors are primarily used to acquire forensic artefacts and logs of systems, there are cases where Velociraptor clients are used to perform live analysis on systems (an example hunt is sketched after this list), for the following reasons:

    • The endpoint detection & response solution deployed on the systems lacks the capability to effectively respond to and track advanced persistent threats (APTs);
    • Live memory analysis is needed due to the sophisticated capabilities of APTs;
    • Continuous system monitoring over an extended period is required as part of a threat hunt. 
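
A hunt scheduled on connected clients is ultimately just a VQL query. The sketch below uses the built-in pslist() plugin to list processes whose binaries run from user-writable paths; the filter is a simplified illustration, not one of our production hunts:

    // List running processes and flag binaries executing from locations
    // that are unusual for legitimate software (illustrative filter only).
    SELECT Pid, Name, Exe, CommandLine
    FROM pslist()
    WHERE Exe =~ "(?i)(appdata|temp)"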

B/ Configuration-based data acquisition to prevent logs remaining uncollected

As explained in the previous section, the acquisition of a forensic package relies on a predefined list containing the known locations of forensic artefacts and logs. While effective in most scenarios, an issue arises if a system administrator decides to store Windows Event Logs in a non-standard location[10] that is not included in the predefined list. These logs then remain uncollected, creating a blind spot in the investigation.

To address this concern, the developers of the Dissect incident response framework implemented a configuration-based solution in the software package Acquire[11]. This solution does not collect logs based on a predefined list, but based on the system configuration: since the location of the Windows Event Logs is configured in the Windows Registry, the registry can be read to collect the Windows Event Logs from their configured location. This represents a more robust approach than the usage of KapeFiles[10], and the method could be implemented in Velociraptor by leveraging the Velociraptor Query Language, as sketched below. 
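
The following hypothetical VQL sketch illustrates the idea: instead of globbing the default path, it reads the configured log file locations from the registry and uploads whatever they point to. Field names such as Data.value follow the conventions of Velociraptor's registry accessor and may differ slightly between versions:

    // Read the configured event log locations (the File value under each
    // EventLog service key) via the registry accessor.
    LET LogFiles = SELECT Data.value AS LogPath
    FROM glob(
        globs="HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Services\\EventLog\\*\\File",
        accessor="registry")

    // Expand %SystemRoot%-style variables and upload each configured file.
    SELECT LogPath, upload(file=expand(path=LogPath)) AS Upload
    FROM LogFiles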

C/ The underestimated value of memory images and live memory analysis 

Computer memory stores volatile artefacts, such as the state of the (file) system, processes, file handles, network connections and loaded modules. Memory is volatile and requires power to maintain the stored information; it is therefore important to acquire memory before traces are erased or the system is rebooted by a system administrator.

Advanced persistent threats (APTs) have increasingly been utilizing memory to subvert investigations, employing techniques like anti-forensics[12], fileless threats[13] and kernel rootkits[14], which often leave no traces on the filesystem. In response to these threats, it is imperative to broaden data acquisition beyond filesystem traces to include the acquisition of memory. This multi-source data collection approach, encompassing filesystem and memory, offers a comprehensive view of system activity, reducing the risk of overlooking or completely missing crucial traces during investigations.

As part of the lab, the tools WinPmem[15] and AVML[16] are used with Velociraptor collectors to acquire full memory images of Windows and Linux systems. It should be noted that at the time of writing, the lab has limited to no capability to acquire full memory images of macOS systems. To address this gap, the use of Volexity Surge[17] is currently being explored, which could offer reliable acquisition across Windows, Linux and macOS systems.

Acquiring memory images from multiple systems simultaneously significantly impacts acquisition time and the processing of these images, because an automated processing pipeline for memory images is not (yet) an integral part of the lab. Such a pipeline should also support a large variety of Linux and macOS profiles, which contain the kernel data structures and debug symbols of these systems. In addition, we currently use Velociraptor clients to hunt for malicious processes in memory using the Velociraptor Query Language[18]. This approach enables real-time tracking of APTs across thousands of systems in a scalable manner, enhancing our incident response capabilities.
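
Scheduling such an acquisition through Velociraptor amounts to calling the corresponding artefacts. The sketch below invokes the WinPmem-based artefact shipped with Velociraptor[15] and the AVML-based community exchange artefact[16]; both are called with default parameters here, which may need tuning in practice:

    // Windows: acquire a full memory image using the WinPmem-based artefact.
    SELECT * FROM Artifact.Windows.Memory.Acquisition()

    // Linux: acquire a full memory image using the AVML exchange artefact.
    SELECT * FROM Artifact.Linux.Memory.AVML()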

D/ The acquisition of Active Directory objects with SharpHound to identify Active Directory attack paths 

Active Directory (AD)[19] is a directory service developed by Microsoft that stores information about network objects and regulates user and system access within a network. AD is widely used in many organizations and is a critical IT asset, which makes it the perfect target for threat actors seeking to understand the network of an organization, in particular when a threat actor aims to move laterally through multiple systems within the network and maintain persistence within the AD environment.

Hence, it is vital to acquire and analyse AD objects to gain insights into potential attack paths utilized by threat actors to compromise an AD environment. BloodHound[20] is an application capable of identifying and analysing attack paths in AD environments, using the data collector SharpHound[21], which collects AD objects from domain controllers and domain-joined Windows systems.

As part of the current approach, Velociraptor collectors are utilized to execute SharpHound[22] on domain-joined Windows systems, which in turn collects AD objects. These objects can then be used to identify attack paths, in conjunction with event timelines of AD domain controllers and other domain-joined systems. Automatic processing of these objects and the deployment of BloodHound are part of the backlog, and the plan is to integrate them into the lab in the near future.
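
Through Velociraptor, running SharpHound again amounts to a single artefact call. The sketch below invokes the BloodHound artefact referenced above[22]; parameters are omitted and defaults assumed, as the exact options vary per artefact version:

    // Run SharpHound on a domain-joined Windows system and collect the
    // resulting archive of AD objects (default parameters assumed).
    SELECT * FROM Artifact.Windows.ActiveDirectory.BloodHound()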

E/ Staying under the radar by performing data acquisition from the hypervisor 

The methods described thus far involve actions that require interaction with systems, which could potentially overwrite traces and leave behind new traces that alert advanced persistent threats (APTs) to incident response activities. Depending on the incident response case and the capabilities of the APT, it might be preferable to remain unnoticed for as long as possible to prevent further escalation of the incident. Ultimately, you want to prevent a situation where, after recovery and mitigation actions have been performed, the APT returns to the network because you were not sufficiently aware of its capabilities and level of (persistent) access.

To prevent this from happening, it can be crucial to conduct data acquisition covertly, ensuring that APTs remain unaware whilst incident responders gain insights into their tactics, techniques, and procedures (TTPs). This knowledge can then be leveraged to contain the incident and prevent the APT from regaining access. Exploring undetectable data acquisition methods is therefore not just beneficial, but essential.

Recognising this challenge, the developers of the Dissect incident response framework implemented a solution in the software package Acquire[23] for collecting data from virtual machines running on the VMware ESXi hypervisor[24]. Integrating Acquire on the VMware ESXi hypervisor is part of our backlog[25] and will be added to the lab in the future.

Data acquisition should not be limited to the filesystem alone. Hence, possibilities for memory image acquisition from various hypervisors are also being explored. This approach minimizes the impact on running systems by snapshotting memory at the hypervisor level, without being detectable by malware running on the guest.

The validation of data acquisition and processing 

This blog post strongly emphasizes the importance of automation, yet it is essential to recognize that incident responders still play a crucial role in incident response. While automation streamlines certain tasks, the intervention of incident responders remains vital, particularly for the creative and research parts of incident response. Moreover, there is no silver bullet: humans cannot blindly trust automation, but it should support them by easing the burden of analysis. This is where the investigative prowess of a digital detective comes into play, ensuring the validation of results and the reproducibility of findings throughout the entire incident response process, from data acquisition to analysis. At the time of writing, the validation of the acquisition and processing of investigation material has not yet been integrated into the lab, but it is a priority for ongoing development. In our next blog post, the importance of validating data acquisition and processing will be the main topic, as it is essential for identifying bugs and ensuring the accuracy of automated results. 

Open-source software

Observant as you are, dear reader, you might have noticed the word ‘open-source’ mentioned several times in this blog post. At Hunt & Hackett we are huge fans of using open-source software as part of our lab. Leveraging open-source software offers several benefits, including the ability to view and understand the source code, collaborate with other developers, and contribute to the improvement of these software packages.

Our lab makes use of a variety of open-source software packages, including:

  • Microsoft: AVML;
  • SpecterOps: SharpHound;
  • Google: Timesketch[26], Plaso and WinPmem;
  • Rapid7: Velociraptor;
  • Kroll: KapeFiles;
  • Fox-IT: Dissect;
  • Elastic: Filebeat, Logstash, Kibana and Elasticsearch;
  • Jupyter: Jupyter Notebook;
  • HashiCorp: Terraform, Packer and Vault.

Hunt & Hackett would like to thank everyone who has contributed to these open-source software packages. Additionally, we’re proud to have made contributions ourselves to Velociraptor[27], KapeFiles[28], Dissect[29] and Timesketch[30].

Conclusion

This blog post provides incident responders with a foundation for leveraging cloud native services and open-source software, combined with a DevOps mindset to perform scalable data acquisition within minutes. We did this by explaining how the following data acquisition methods work within our cloud-based incident response lab:

  • Dual capabilities of Velociraptor for threat hunting and data acquisition;
  • Configuration-based data acquisition to prevent logs remaining uncollected;
  • The underestimated value of memory forensics;
  • The acquisition of Active Directory objects with SharpHound to identify Active Directory attack paths;
  • Staying under the radar by performing data acquisition from the hypervisor. 

As an end-to-end solution, our cloud-based incident response lab automates the ingestion and processing of all key forensic data sources in a single platform, enabling comprehensive investigation capabilities that go beyond the more traditional ones, especially when the more sophisticated capabilities of APTs demand it. For example, attack paths can be identified faster, including lateral movement activity and persistent access.

Also, the importance of validating data acquisition and processing was addressed, because it is essential for identifying bugs and ensuring the accuracy of automated results. Of course, the various automations we are working on, and that will be implemented over time, are complementary to the investigative prowess of a digital detective, ensuring the validation of results, the reproducibility of findings and the necessary creativity in the process.

Moving forward, we will continue to share our work and progress with the incident response community. By doing so, we aim to improve our collective capabilities and contribute to the development of more effective incident response solutions.

References

  1. https://www.huntandhackett.com/threats
  2. https://www.huntandhackett.com/mdr
  3. https://www.huntandhackett.com/services/ir
  4. https://github.com/Zawadidone/dfir-lab
  5. https://github.com/Velocidex/velociraptor
  6. https://cloud.google.com/storage/docs/storage-classes#standard
  7. https://github.com/EricZimmerman/KapeFiles
  8. https://github.com/ForensicArtifacts/artifacts/blob/main/artifacts/data/linux.yaml
  9. https://docs.velociraptor.app/docs/overview/#the-velociraptor-philosophy
  10. https://github.com/EricZimmerman/KapeFiles/blob/b1b17033dbeb63a2dbdc2ffeb90de9fa493e23d4/Targets/Windows/EventLogs.tkape#L15C15-L15C47
  11. https://github.com/fox-it/acquire/blob/d62d5bf19c455d94ca4b6267fce676a73c33ea2f/acquire/acquire.py#L586-L599
  12. https://attack.mitre.org/techniques/T1564/
  13. https://learn.microsoft.com/en-us/defender-endpoint/malware/fileless-threats
  14. https://www.csoonline.com/article/1311082/north-koreas-lazarus-deploys-rootkit-via-applocker-zero-day-flaw.html
  15. https://github.com/Velocidex/velociraptor/blob/84adea448c273d778f04883028417b8a3a014c9d/artifacts/definitions/Windows/Memory/Acquisition.yaml
  16. https://github.com/Velocidex/velociraptor-docs/blob/31f697cb1624307b8c5700b5b5d56cdeddfe533c/content/exchange/artifacts/Linux.Memory.AVML.yaml
  17. https://www.volexity.com/blog/2018/06/12/surge-collect-provides-reliable-memory-acquisition-across-windows-linux-and-macos/
  18. https://docs.velociraptor.app/docs/forensic/volatile/
  19. https://attack.mitre.org/datasources/DS0026/
  20. https://github.com/SpecterOps/BloodHound
  21. https://github.com/BloodHoundAD/SharpHound
  22. https://github.com/Velocidex/velociraptor/blob/d6e111988bd4b0fd4157aaa2c6cff4fdc20c8ebc/artifacts/definitions/Windows/ActiveDirectory/BloodHound.yaml
  23. https://github.com/fox-it/acquire
  24. https://blog.fox-it.com/2022/10/18/im-in-your-hypervisor-collecting-your-evidence/
  25. https://github.com/fox-it/acquire/issues/110
  26. https://github.com/hnhdev/timesketch
  27. https://github.com/Velocidex/velociraptor/commits?author=Zawadidone
  28. https://github.com/EricZimmerman/KapeFiles/commits?author=Zawadidone
  29. https://github.com/fox-it/acquire/commits?author=Zawadidone
  30. https://github.com/google/timesketch/compare/master...hnhdev:timesketch:master
