Deliver fast time-to-insights and blazing fast queries on exabytes of data directly on the cloud data lake for threat hunting, incident response and security investigations.
Security teams are measured by their ability to collect and analyze as much data as possible in a very short period of time. But with so many types of data flowing at massive volumes, security teams are experiencing:
The result is a limited ability to triage modern threats. SOC teams are overwhelmed with alerts and data, and existing SIEM platforms merely compound the problem.
The data lake architecture is gaining momentum in many verticals because it offers a modern and agile alternative to data warehouses and solves storage, access and scale challenges. For threat hunting and anomaly detection, which rely heavily on the ability to analyze massive amounts of complex data in a very short timeframe, the benefits are dramatic:
Indeed, many vendors in the space are making strategic investments in data lake-based solutions. But given the current inefficiencies of data lake analytics platforms (90% of compute is “wasted” on data scanning and filtering), moving to the data lake often forces organizations to compromise on the price/performance balance, which tends to limit workloads to experimental and non-production use.
Varada’s security data lake platform runs in the customer’s cloud environment (VPC), enabling SOC analysts, applications for threat detection, anomaly detection and incident management, and essentially any SQL consumer to easily query any data source on the data lake.
Varada leverages autonomous indexing and caching to accelerate queries by 10x-100x. The performance advantage grows as queries become more complex and selective (needle-in-a-haystack threat analysis), yielding a 40%-60% cost reduction.
Varada’s workload-level observability component enables data teams to seamlessly monitor, optimize and accelerate workloads to meet dynamic business requirements. Data teams can easily set priorities, performance requirements, and budget caps.
To explain what it means to be “autonomous”, you can break it down into three critical components: adaptive, dynamic and elastic.
1. Be Adaptive
Unlike partitioning-based optimizations, which are limited to several columns, Varada can index any column and automatically decides which data to index and which index to use on each nano-block (a small chunk of data: 64K rows of a single column).
Each nano-block is mapped to the original data set, and includes any of:
Varada’s indexing suite includes a variety of indexes such as Bitmap, Dictionary, Trees, Bloom, Lucene (text searches), etc. Based on the format, structure and cardinality of the data, the platform automatically assigns the most effective index to drive optimal performance.
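The idea of per-nano-block index selection can be illustrated with a minimal Python sketch. It is not Varada's implementation: the function names, the thresholds (64 distinct values, a 10% distinct ratio) and the mapping from data shape to index type are all assumptions made for illustration. Only the 64K-row block size and the index type names come from the text above.

```python
from typing import Iterator, Sequence

NANO_BLOCK_ROWS = 64 * 1024  # block size stated in the text: 64K rows of one column


def split_into_nano_blocks(column: Sequence, rows: int = NANO_BLOCK_ROWS) -> Iterator[Sequence]:
    """Yield consecutive fixed-size chunks of a single column."""
    for start in range(0, len(column), rows):
        yield column[start:start + rows]


def choose_index(block: Sequence) -> str:
    """Hypothetical heuristic: pick an index type from data type and cardinality."""
    if not block:
        return "none"
    distinct = len(set(block))
    ratio = distinct / len(block)
    # Free text (strings containing spaces) is routed to text search.
    if all(isinstance(v, str) for v in block) and any(" " in v for v in block):
        return "lucene"
    # Very few distinct values: one bitmap per value stays small.
    if distinct <= 64:
        return "bitmap"
    # Many repeats but more distinct values: dictionary-style index.
    if ratio < 0.1:
        return "dictionary"
    # High-cardinality numeric data: a tree supports range lookups.
    if all(isinstance(v, (int, float)) for v in block):
        return "tree"
    # High-cardinality non-numeric values: Bloom filter for point lookups.
    return "bloom"


if __name__ == "__main__":
    # A low-cardinality column of firewall actions (made-up sample data).
    actions = ["allow", "deny"] * 100_000
    for i, block in enumerate(split_into_nano_blocks(actions)):
        print(i, len(block), choose_index(block))
```

Because the choice is made per block rather than per table, two blocks of the same column may end up with different index types if their contents differ.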
Organizations collect massive amounts of data on various events from many different applications and systems. These events need to be analyzed effectively to enable real-time threat detection, anomalies and incident management. In various security-related use cases, text analytics is leveraged to provide deep insights on traffic and user behavior (segmentation, URL categorization, etc.).
Text analytics has proven to be critical for security information and event management (SIEM) and other SOC tools, reducing the overall time and resources required to investigate a security incident while keeping the investigation as effective and efficient as possible.
Text search with Apache Lucene is a native part of the platform and is applied automatically.
2. Be Dynamic. Stop worrying about peaks.
Varada automatically accelerates queries according to workload behavior and automatic detection of hot data and bottlenecks. The platform also enables data teams to define business priorities and accordingly adjust performance and budgets, eliminating the need to build separate silos for each use case.
The platform seamlessly chooses which queries to accelerate and which data to index.
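One way to picture workload-driven selection is a greedy pass over recent query filters. The sketch below is an assumption-laden illustration, not the platform's algorithm: `pick_columns_to_index`, the per-column cost figures and the single numeric budget are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def pick_columns_to_index(
    query_filters: Iterable[List[str]],
    column_cost: Dict[str, int],
    budget: int,
) -> List[str]:
    """Greedy sketch: index the most frequently filtered columns that fit the budget."""
    # Count how often each column appears in a query predicate ("hot" columns).
    usage = Counter(col for filters in query_filters for col in filters)
    chosen: List[str] = []
    spent = 0
    for col, _count in usage.most_common():
        if col not in column_cost:
            continue  # unknown cost: skip rather than guess
        cost = column_cost[col]
        if spent + cost <= budget:
            chosen.append(col)
            spent += cost
    return chosen


if __name__ == "__main__":
    # Columns referenced by the filters of four recent queries (made-up data).
    recent = [["src_ip", "url"], ["src_ip"], ["url", "user"], ["src_ip"]]
    costs = {"src_ip": 4, "url": 5, "user": 2}  # hypothetical storage units
    print(pick_columns_to_index(recent, costs, budget=6))
```

With a budget of 6, the most used column (`src_ip`) is chosen first, `url` no longer fits, and the cheaper `user` column fills the remaining space.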
3. Be Elastic
Agility applies not only to the type of queries but also to their volume, so the compute needed for query processing is expected to be highly volatile. Data teams are often measured on how quickly they can react to spikes in demand.
Varada’s architecture is highly elastic, so teams can add more clusters and use cases quickly and dynamically scale out and in, delivering the most effective TCO. Effective separation of compute and storage makes it possible to scale elastically and add clusters as query traffic fluctuates, avoiding overprovisioning and idle resources.
An index-once approach speeds up warm-up time by 10x-20x compared to indexing data from scratch. As the platform creates new indexes, they are stored in a designated folder on the customer’s data lake (“warm data”) in addition to the cluster’s SSDs (“hot data”).
When the cluster is scaled in or eliminated, and some (or all) nodes are shut down, indexes remain available as warm data. Warm indexes enable fast warming up when scaling back out, adding new clusters, and adding SSD resources to cluster(s).
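The hot/warm behavior described above can be modeled in a few lines of Python. This is a toy sketch under stated assumptions: the class `TieredIndexStore`, its in-memory dictionaries standing in for SSDs and the data lake folder, and the fake `_build` step are all hypothetical and only illustrate the index-once idea.

```python
from typing import Dict


class TieredIndexStore:
    """Toy model: 'hot' indexes on cluster SSDs, 'warm' copies on the data lake."""

    def __init__(self) -> None:
        self.hot: Dict[str, bytes] = {}   # stands in for the cluster's SSDs
        self.warm: Dict[str, bytes] = {}  # stands in for the data lake folder
        self.builds = 0                   # how many times an index was built from scratch

    def _build(self, block_id: str) -> bytes:
        # Placeholder for the expensive step of indexing raw data.
        self.builds += 1
        return f"index:{block_id}".encode()

    def get(self, block_id: str) -> bytes:
        if block_id in self.hot:
            return self.hot[block_id]
        index = self.warm.get(block_id)
        if index is None:
            # Index once, then persist the result as warm data.
            index = self._build(block_id)
            self.warm[block_id] = index
        # Promote to the hot tier for subsequent queries.
        self.hot[block_id] = index
        return index

    def scale_in(self) -> None:
        # Nodes shut down: hot data is lost, warm data survives.
        self.hot.clear()


if __name__ == "__main__":
    store = TieredIndexStore()
    store.get("events/block-1")
    store.scale_in()
    store.get("events/block-1")  # reloaded from warm data, not rebuilt
    print(store.builds)
```

After scaling in and back out, the index is reloaded from the warm tier, so the build counter stays at one.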
The End Result: Any SQL Query, Any Data, Blazing Fast. Period.