Varada continuously collects query execution metadata from the query engine. Query execution metadata is then transformed into a summarized model that is optimized for insight extraction. The modelled metadata is stored in columnar ORC format in an admin-defined S3 bucket.
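As a rough illustration of the "summarized model" idea, the sketch below rolls raw query events up into per-column usage counters. The `QueryEvent` fields and the `summarize` function are hypothetical names chosen for this example, not Varada's actual schema or API, and writing the result to ORC in S3 is left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class QueryEvent:
    """One raw query execution record (hypothetical fields)."""
    query_id: str
    table: str
    filter_columns: tuple
    projected_columns: tuple
    elapsed_ms: int


def summarize(events):
    """Aggregate raw events into per-(table, column) usage counters."""
    summary = defaultdict(
        lambda: {"filters": 0, "projections": 0, "elapsed_ms": 0}
    )
    for event in events:
        for column in event.filter_columns:
            stats = summary[(event.table, column)]
            stats["filters"] += 1
            # Attribute the query's elapsed time to each filtered column.
            stats["elapsed_ms"] += event.elapsed_ms
        for column in event.projected_columns:
            summary[(event.table, column)]["projections"] += 1
    return dict(summary)
```

A summary of this shape makes it cheap to ask which columns are filtered most often, which is the kind of question an insight step would need answered.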
Using historical query and data usage patterns derived from the Collector output, Varada creates an actionable set of insights.
Insights are continuously revised based on real-time usage and query performance, and are translated into two types of acceleration strategies: cache strategies and index strategies. These strategies are used to automatically create acceleration instructions that specify which data to cache, which data to index, and how to index it.
Based on the frequency of data usage and its business priority, the platform uses SSD columnar nanoblock caching to speed up data access.
The platform adapts to the data and uses different indexing technologies to speed up data searches, filters, and joins. The impact of each index is evaluated separately, based on data type and level of selectivity, so that the platform can use the optimal index.
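The idea of picking an index from data type and selectivity can be sketched as a simple rule table. Everything in this example is an assumption made for illustration: the index labels (`"text"`, `"hash"`, `"range"`, `"basic"`), the thresholds, and the definition of selectivity as the fraction of rows a typical predicate matches. It is not a description of the platform's actual index types or evaluation logic.

```python
def choose_index(data_type: str, selectivity: float) -> str:
    """Return a hypothetical index label for a column.

    selectivity is assumed to be the fraction of rows (0..1) that a
    typical predicate on the column matches; lower means more selective.
    """
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must be between 0 and 1")
    if selectivity > 0.3:
        # The predicate matches a large share of rows, so an index
        # is unlikely to beat a plain scan (illustrative threshold).
        return "none"
    if data_type == "string":
        return "text"
    if data_type in {"int", "float", "timestamp"}:
        # Very selective numeric predicates look like point lookups.
        return "hash" if selectivity <= 0.01 else "range"
    return "basic"
```

For example, `choose_index("int", 0.005)` returns `"hash"`, while `choose_index("int", 0.9)` returns `"none"`.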
Based on the query execution metadata, Varada delivers actionable workload-level observability that enables administrators to understand how data is used by different workloads and users, how resources are allocated among them, and how and why bottlenecks occur.
This deep observability also enables data teams to gain control by identifying and optimizing high-priority workloads, instead of optimizing queries one by one.
Administrators and data consumers can prioritize workloads and set budget caps to ensure the platform meets business requirements across different use cases. Workload prioritization is used by the platform to drive cache and indexing strategies.
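One way to picture how priorities and budget caps could drive a cache strategy is a greedy allocation of limited cache capacity. The sketch below is an assumption-laden illustration: the `allocate_cache` function, the tuple layout, and the choice to express budget caps in gigabytes are all hypothetical and not taken from the platform.

```python
def allocate_cache(workloads, capacity_gb):
    """Grant cache space to workloads in descending priority order.

    workloads: iterable of (name, priority, requested_gb, budget_cap_gb).
    Each grant is limited by the request, the workload's budget cap,
    and whatever capacity is still unallocated.
    """
    allocation = {}
    remaining = capacity_gb
    ordered = sorted(workloads, key=lambda w: w[1], reverse=True)
    for name, _priority, requested_gb, budget_cap_gb in ordered:
        grant = min(requested_gb, budget_cap_gb, remaining)
        allocation[name] = grant
        remaining -= grant
    return allocation
```

With 100 GB of capacity, a priority-10 workload requesting 50 GB under a 40 GB cap receives 40 GB, and a priority-1 workload requesting 80 GB receives the remaining 60 GB.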
Although acceleration instructions are generated automatically by the platform, administrators retain full control: they can view, manage, and override specific instructions via Varada’s Control Center, and determine which datasets to accelerate and which strategies to apply.
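The relationship between automatic instructions and administrator overrides can be sketched as a merge in which overrides always win. The `effective_instructions` function and the dictionary shapes below are hypothetical and do not reflect the Control Center's actual interface.

```python
def effective_instructions(automatic, overrides):
    """Combine generated instructions with administrator overrides.

    automatic: dict mapping a column name to an instruction dict.
    overrides: dict mapping a column name to an instruction dict,
               or to None to disable acceleration for that column.
    """
    merged = dict(automatic)
    for column, instruction in overrides.items():
        if instruction is None:
            # The administrator opted this column out of acceleration.
            merged.pop(column, None)
        else:
            merged[column] = instruction
    return merged
```

Keeping the generated instructions and the overrides as separate inputs means the automatic set can be regenerated at any time without losing administrator decisions.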