The First Step to Optimize PrestoDB & Trino: The Presto Analyzer

By Roman Vainbrand
June 29, 2021
June 29, 2021

It’s no secret that we are huge Presto fans. All flavors work 🙂

A while back we had to place a bet on which data lake query engine will win the data lake analytics race. Presto stood out immediately. It’s extremely flexible, and coupled with the vibrant community it was a perfect fit with Varada’s indexing-based acceleration technology.

To help our customers better understand their PrestoDB and Trino clusters, we recently released the Presto Workload Analyzer. Just like any other robust platform, observability is critical. Data platform teams are consistently in need to monitor their Presto cluster performance and identify bottlenecks in order to meet data workers demand. 

The Analyzer extracts and aggregates query metrics and other data so it can be surfaced in charts that provide greater visibility into the performance of a cluster running either PrestoDB or Trino. This makes it easier to identify SQL queries that are consuming large amounts of compute resources and discover what data is being accessed most frequently and how to improve overall JOIN performance. To get a clear sense of the value of the Analyzer, check out a sample report of the Analyzer.

If you have questions on the Analyzer or want to share feedback, join the Analyzer Slack workspace

Schedule A Demo

In this blog post I’ll describe some of the charts I find the most impactful and suggest options to optimize and improve your clusters performance and resource utilization (i.e. budget!).

Peak memory used by queries

One of the factors for concurrency is the amount of peak memory used by the queries. As highlighted below certain queries are using almost a Terabyte of RAM. These queries definitely need further investigation and optimization.

Also queries using large memory are doing large aggregations using distributed hash tables and can be optimized by repartitioning the data. This is interesting to a data engineer to see if there is resource contention when certain queries have very high memory utilization. There is a possibility to move these to a separate cluster if required or optimize it.

The Presto Workload Analyzer | Peak memory used by queries | Chart 7

Presto / Trino wall time usage by operator type

One of the aspects of query tuning is to monitor the amount of data that is scanned as part of each operator in the query plan. Certain operators like ScanFilerAndProject, LookupJoin can take advantage of indexing capabilities.

ScanFilerAndProject operators can bring in large amounts of data in the Presto cluster and perform join and filtering operations. This can result in increased use of compute resources. Leveraging indexing and data caching capabilities reduces the amount of data scanned by the operators thereby improving the performance as well as reducing the overall compute cost.

Similarly for LookupJoin we can analyze the query and see if the joins are not sorted correctly or there might be lookups that result in significantly more misses than hits (dynamic filtering can help) or add statistics for Cost Based Optimizer to re-order joins and also leverage broadcast joins instead of partitioned joins. All these optimizations can reduce the overall CPU time of the query thereby improving performance and reducing cost.

The Presto Workload Analyzer | Wall time usage by operator type | Chart 17

Presto / Trino wall time of table scans by selectivity bins

Identifying the % of data that is selected compared to the input provides a good indication of optimization opportunities.

The (X-axis) selectivity bin value determines the amount of data that is selected out of the total data scanned. So a selectivity bin of 0.0-0.1 indicates that upto 10 % of the data is selected while 90% of it is discarded 80% of times. This means that queries that are highly selective can benefit from reducing the amount of data scanned. This can be achieved by a variety of techniques like partition pruning and indexing.

The bins on the extreme right of the graph indicate full scans and predicate pushdown is efficient in these scenarios and the filtering is taken care of at the data source and not by Presto.

The Presto Workload Analyzer | Wall time of table scans by selectivity bins | Chart 19

Presto / Trino joins distribution

When you have joins in the queries one of the key principles for optimal performance is to ensure the right side of the join is smaller than the left side. This ensures that Presto can leverage broadcast joins wherein each node participating in the query builds a hash table from all of the data (data is replicated to each node). This results in less data movement and I/O thereby reducing the query performance.

Note, if you have CBO with statistics enabled it can help re-ordering the joins dynamically.

This is interesting for a Data Engineer to ensure that join reordering can be used in certain queries. This is another data point to reduce/save cluster resources.

Based on the graph we see that larger right side joins did not use REPLICATED joins. However we can safely use REPLICATED joins up to 1 MB of the right hand side. So it might be beneficial to either re-write these queries (highlighted in purple) or use statistics.

The Presto Workload Analyzer | Joins distribution | Chart 27

Clone the Presto Analyzer from Github

See how you can accelerate Presto queries by 10x-100x with autonomous indexing.

We use cookies to improve your experience. To learn more, please see our Privacy Policy