Looking back at 2020 and ahead to 2021, we asked 130+ data experts and executives from North America about the pace of change in data virtualization, the challenges involved, and the path to helping enterprises truly embrace the data lake architecture.
Asked whether data virtualization is a viable alternative to a data warehouse, 60% of respondents called it a strong alternative, and 38% see it as an alternative with some limitations. Only 2% indicated there are significant limitations.
Not surprisingly, the top benefit for 71% of companies with a 10TB+ data lake is reducing and simplifying data ops.
For companies with less than 10TB, who haven’t experienced the challenges of truly massive amounts of data yet, the top benefit is the ability to run all queries on a single platform.
Looking into 2021, the number of organizations with a significant data virtualization footprint (50%+ of workloads) is expected to double.
88% of companies face challenges that impact their migration to a data virtualization platform. Query rewrites and cost are the top concerns. In addition, almost half of respondents indicated that performance is a critical challenge.
Varada is a data platform that is deployed in your VPC and on top of your data lake. Queries from any data consumer are routed via Varada, which acts as the query engine. Any SQL app, BI tool or even analysts and data scientists can easily query any data source in your data lake, without the need to rewrite queries, move data, prepare or model it in advance.
Queries run much faster thanks to Varada’s dynamic and adaptive indexing technology. Unlike partitioning-based platforms, Varada indexes any column in any table, so it can fetch data extremely fast. The indexing adapts to the type of data, and Varada’s engine automatically decides which data to index based on a smart observability layer that continuously monitors demand. Indexing is best for complex queries that run on highly dimensional data that would otherwise have required extensive modeling to achieve acceptable response times.
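To make the idea of demand-driven indexing more concrete, here is a minimal Python sketch. It is a toy illustration only, not Varada's implementation: the `AdaptiveIndexer` class, its `threshold` parameter, and the `filter_eq` method are hypothetical names invented for this example. It counts how often each column is filtered on and builds a simple value-to-rows index for a column once demand crosses the threshold, after which lookups skip the full scan.

```python
from collections import Counter, defaultdict


class AdaptiveIndexer:
    """Toy in-memory table that indexes a column once it is filtered on often enough.

    Hypothetical illustration of demand-driven indexing, not a real product API.
    """

    def __init__(self, rows, threshold=2):
        self.rows = rows              # list of dicts, one dict per row
        self.threshold = threshold    # assumed demand threshold before indexing
        self.demand = Counter()       # column -> number of filters observed
        self.indexes = {}             # column -> {value: [row positions]}

    def _build_index(self, column):
        # Map each distinct value in the column to the positions of its rows.
        index = defaultdict(list)
        for pos, row in enumerate(self.rows):
            index[row[column]].append(pos)
        self.indexes[column] = index

    def filter_eq(self, column, value):
        """Return all rows where row[column] == value."""
        self.demand[column] += 1
        if column in self.indexes:
            # Indexed path: jump straight to the matching row positions.
            return [self.rows[p] for p in self.indexes[column].get(value, [])]
        # Unindexed path: scan every row.
        result = [row for row in self.rows if row[column] == value]
        # Index the column for future queries once demand is high enough.
        if self.demand[column] >= self.threshold:
            self._build_index(column)
        return result


if __name__ == "__main__":
    table = AdaptiveIndexer(
        [
            {"country": "US", "amount": 10},
            {"country": "CA", "amount": 20},
            {"country": "US", "amount": 30},
        ]
    )
    for _ in range(3):
        print(table.filter_eq("country", "US"))
    print("indexed columns:", sorted(table.indexes))
```

A real engine would also weigh data types, index size, and eviction, none of which this sketch attempts to model.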
We didn’t just stop at performance. Different queries have different priorities and requirements. Admins can now easily assign budgets and priorities to each set of queries so you can say goodbye to notorious budget-busting surprises. You can also expect a 40%-60% reduction in TCO because Varada’s query engine is very light on compute resources and doesn’t require any data duplication or additional ETLs.
See how Varada’s big data indexing dramatically accelerates queries vs. AWS Athena:
To see Varada in action on your data set, schedule a short demo!