Analytics teams increasingly struggle to keep BI systems and dashboards interactive as business users demand views across extended time periods and richer, multi-dimensional data. One option is to import limited datasets into internal BI tools, but that typically compromises data freshness. It may also restrict data volume, concurrency, consistency and the ability to scale. As a result, data-driven decisions draw on only a small portion of the available data, leaving business executives blind to the big picture.
The common alternative is big data virtualization: connecting directly to the data lake. This often requires lengthy, resource-heavy SQL performance tuning and optimization of each specific dataset to support analytics needs. Extensive data modeling and pre-aggregation can limit end users' flexibility to ask any question and to keep up with evolving business requirements. It rarely works...
Query engines on top of data lakes can indeed support rapidly changing data requirements and large data volumes. However, they often rely on large data scans, which result in unacceptable response times and high, unpredictable costs.
Varada’s cloud data virtualization technology seamlessly connects to any SQL BI tool and instantly operationalizes the entire data lake, without compromising on interactivity and at a predictable cost.
Varada serves as a smart acceleration layer on the data lake, which remains the single source of truth, and runs in the customer cloud environment. Our secret sauce is our ability to automatically and dynamically index relevant data, in the structure and at the granularity of the source. Varada enables any query to meet various performance and concurrency requirements, running at the same speed as internal in-memory databases without exponentially growing costs.
Varada adaptively and dynamically indexes relevant columns on trillions of rows, supporting any ANSI SQL query. Varada automatically chooses which queries to accelerate, based on continuous monitoring and priorities set by data teams.
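To make the idea of adaptive indexing concrete, here is a minimal toy sketch in Python. It is not Varada's implementation: the `AdaptiveIndexer` class, its methods, and the sample rows are all hypothetical, and a plain in-memory value-to-position map stands in for a real index. It only illustrates the general pattern of monitoring which columns queries filter on, weighting them by a priority set by the data team, indexing the most-used ones, and then touching far fewer rows than a full scan would.

```python
from collections import Counter, defaultdict

# Hypothetical sample table: each row maps column name -> value.
ROWS = [
    {"country": "US", "device": "ios", "amount": 30},
    {"country": "DE", "device": "android", "amount": 12},
    {"country": "US", "device": "android", "amount": 7},
    {"country": "FR", "device": "ios", "amount": 21},
]


class AdaptiveIndexer:
    """Toy model (assumption): track filtered columns, index the hottest ones."""

    def __init__(self, rows, max_indexes=1):
        self.rows = rows
        self.max_indexes = max_indexes
        self.usage = Counter()  # column -> priority-weighted filter count
        self.indexes = {}       # column -> {value: [row positions]}

    def observe(self, column, priority=1):
        """Monitoring step: record that a query filtered on `column`."""
        self.usage[column] += priority

    def rebuild(self):
        """Index only the most-used columns; colder indexes are dropped."""
        hot = [col for col, _ in self.usage.most_common(self.max_indexes)]
        self.indexes = {col: self._build(col) for col in hot}

    def _build(self, column):
        index = defaultdict(list)
        for pos, row in enumerate(self.rows):
            index[row[column]].append(pos)
        return dict(index)

    def select(self, column, value):
        """Return (matching rows, number of rows touched)."""
        self.observe(column)
        if column in self.indexes:
            positions = self.indexes[column].get(value, [])
            return [self.rows[p] for p in positions], len(positions)
        # No index on this column: fall back to scanning every row.
        matches = [row for row in self.rows if row[column] == value]
        return matches, len(self.rows)


if __name__ == "__main__":
    indexer = AdaptiveIndexer(ROWS, max_indexes=1)
    indexer.observe("country", priority=3)     # data team marks this as important
    print(indexer.select("country", "US")[1])  # 4: no index yet, full scan
    indexer.rebuild()
    print(indexer.select("country", "US")[1])  # 2: only matching rows touched
```

In this sketch the index budget (`max_indexes`) plays the role of a cost limit: columns that are rarely filtered on stay unindexed and fall back to a scan.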
Varada connects directly to a wide range of data sources, including the data lake (AWS S3, on-prem Hadoop, etc.), data catalogs (Hive Metastore, AWS Glue) and other sources (MySQL, PostgreSQL, etc.).
Deliver 100x faster query response times across any data source, using the index to filter, join and aggregate data. Deliver 300x faster response times for JOIN queries, with support for star and snowflake schemas.
Support hundreds of active users while delivering interactive response times. You can easily support different workload priorities and budgets.
Significantly accelerate time-to-market for creating and updating dashboards, and enable drill-downs, by cutting down on manual modeling and data preparation.
Varada runs in your cloud environment, keeping data under your full control and in your own VPC, so you can apply your existing security policies.