Varada vs. AWS Athena: Performance Benchmarking

David Krakov
By David Krakov
I
August 13, 2020
August 13, 2020

We ran a set of 19 queries on AWS Athena and compared them to a Varada cluster. We used different use cases to illustrate the performance uplift data teams can expect over a wide range of workloads.

On average, Varada delivered x30 faster response time and as much as x70, at a minimal compute footprint. On full scan queries, Varada ran slower than Athena due to the small cluster.

Below is the detailed analysis of the queries we ran and the results.

1. Ride Sharing Data (Queries Generated by Tableau)

We used a schema and query inspired by a ride sharing analysis and reporting use case. For this schema we had a single table with details from each trip that include geographic data, timeseries, and metadata such as trip fee.

For the queries we used complex predicates with various casting and field parsing to represent messy raw data. We used a variety of queries ranging from highly selective to cohort analysis and one full scan.

The queries were auto-generated by Tableau, with no rewrites.

In this use case we used a 4-node Varada cluster.

QueryCategoryAthena Response time (seconds)Varada Response time (seconds)Varada Performance Uplift
01Selective15.090.373978%
02Selective23.950.683422%
03Selective48.060.686968%
04Selective100.951.456862%
05Selective153.403.704046%
06Cohort11.513.65215%
07Cohort9.922.85248%
08Full scan25.1692.10-73%

2. Logs Analysis

We used a fluentbit compatible output format for files in S3 (CSV), and used jenkins server logs as a baseline. Logs are duplicated by 20, bringing the total size of gzipped CSVs to 200GB with 4B log rows.

For an apples-to-apples comparison we converted the log data to Parquet (compressed to about 100GB of Parquet files).

In this use case, we use a single node Varada cluster.

QueryDescriptionAthena Response time (seconds)Varada Response time (seconds)Varada Performance Uplift
09Count all errors (~10M matches)461.393209%
10Count all dev messages (~90M)723.491963%
11Count all dev messages for a specific job  (largest)342.001600%
12Count all errors in a specific build410.2416083%

3. TPC-DS Sample Queries

TPC-DS is one of the popular benchmarks that is being used today for big-data / Data Warehouse and decision support systems (DSS). It includes a blend of queries on a wide spectrum: from interactive queries to deep analytical / full table aggregations.

While the majority of TPC-DS queries aren’t selective as the ones listed in the previous use cases, the benefits of Varada’s distributed indexing, smarter CBO and Dynamic Filtering yields highly interactive queries at lower latency for the end-users vs. Athena, including sub-second response times for interactive queries, even when tested just on a single node.

TPC-DS QueryCategoryAthena Response time (seconds)Varada Response time (seconds)Varada Performance Uplift
Q55Ad-hoc / Interactive21.11.021969%
Q42Ad-hoc / Interactive19.440.961925%
Q52Ad-hoc / Interactive19.721.071743%
Q96Ad-hoc / Interactive13.740.931377%
Q66Reporting / BI125.098.441382%
Q03Reporting / BI22.941.161878%
Q94Deep Analytics13.6529.8-54%

Queries Syntax

1. Ride Sharing Data (Queries Generated by Tableau)

–Q01, Selective
SELECT trips_data.t_hour,
        count(*)
FROM demo.trips_data trips_data
WHERE ((trips_data.lon >= -122.407608)
        AND (trips_data.lon <= -122.40453960000001)
        AND (trips_data.lat >= 37.752309099999998)
        AND (trips_data.lat <= 37.7598585)
        AND (trips_data.t_hour >= CAST (
    CASE
    WHEN TRY_CAST((SPLIT_PART(‘20,23’, ‘,’, 1)) AS DOUBLE) >= 0 THEN
    FLOOR(TRY_CAST((SPLIT_PART(‘20,23’, ‘,’, 1)) AS DOUBLE))
    ELSE CEIL(TRY_CAST((SPLIT_PART(‘20,23’, ‘,’, 1)) AS DOUBLE))
    END AS BIGINT))
        AND (trips_data.t_hour <= CAST (
    CASE
    WHEN TRY_CAST((SPLIT_PART(‘20,23’, ‘,’, 2)) AS DOUBLE) >= 0 THEN
    FLOOR(TRY_CAST((SPLIT_PART(‘20,23’, ‘,’, 2)) AS DOUBLE))
    ELSE CEIL(TRY_CAST((SPLIT_PART(‘20,23’, ‘,’, 2)) AS DOUBLE))
    END AS BIGINT))
        AND (trips_data.d_weekday >= 5)
        AND (trips_data.d_weekday <= 7)
        AND (trips_data.fare < 7.21)
        AND (trips_data.last_point = 1))
GROUP BY  1
ORDER BY  2;

–Q02, Selective
SELECT trips_data.d_date,
        trips_data.rider_age,
        count(*)
FROM demo.trips_data trips_data
WHERE ((trips_data.lon >= -122.4083376)
        AND (trips_data.lon <= -122.4045181)
        AND (trips_data.lat >= 37.733525500000001)
        AND (trips_data.lat <= 37.7416029)
        AND (trips_data.t_hour >= CAST (
    CASE
    WHEN TRY_CAST((SPLIT_PART(‘08,12’, ‘,’, 1)) AS DOUBLE) >= 0 THEN
    FLOOR(TRY_CAST((SPLIT_PART(‘08,12’, ‘,’, 1)) AS DOUBLE))
    ELSE CEIL(TRY_CAST((SPLIT_PART(‘08,12’, ‘,’, 1)) AS DOUBLE))
    END AS BIGINT))
        AND (trips_data.t_hour <= CAST (
    CASE
    WHEN TRY_CAST((SPLIT_PART(‘08,12’, ‘,’, 2)) AS DOUBLE) >= 0 THEN
    FLOOR(TRY_CAST((SPLIT_PART(‘08,12’, ‘,’, 2)) AS DOUBLE))
    ELSE CEIL(TRY_CAST((SPLIT_PART(‘08,12’, ‘,’, 2)) AS DOUBLE))
    END AS BIGINT))
        AND (trips_data.rider_age <= 21)
        AND (trips_data.last_point = 1))
GROUP BY  1,2
ORDER BY  1,2;

–Q03, Selective
SELECT d_date,fare
FROM demo.trips_data trips_data
WHERE ((trips_data.lon >= -122.407608) AND 
(trips_data.lon <= -122.40453960000001) AND 
(trips_data.lat >= 37.752309099999998) AND 
(trips_data.lat <= 37.7598585) AND
      (trips_data.rider_id = 2735610) AND
      (trips_data.d_weekday >= 0) AND       (trips_data.d_weekday <= 4)  AND
      (trips_data.fare >= 7.21)   AND
      (trips_data.last_point = 1));

–Q04, Selective Join (Dynamic Filtering)
SELECT 
count(distinct a.tripid) as cnt_trips, 
min(a.fare) as min_fare,   max(a.fare) as max_fare
FROM demo.trips_data AS a
INNER JOIN
    (SELECT *
    FROM demo.trips_data
    WHERE tripid=200607076) AS b
ON a.rider_id=b.rider_id;

–Q05, Selective Join (Dynamic Filtering)
SELECT a.d_date,
        a.driver_first,
        a.driver_last
FROM demo.trips_data trips_data
INNER JOIN
    (SELECT *
    FROM demo.trips_data trips_data
    WHERE rider_first= ‘Parker’
            AND rider_last=’DAVIS’) a
    ON trips_data.rider_id = a.rider_id
WHERE a.rider_age=59
GROUP BY  1,2,3;

–Q06, Cohort
SELECT trips_per_week.cnt AS cnt,
        trips_per_week.trips_per_week AS trips_per_week
FROM
    (SELECT t /(date_diff(‘day’, cast(‘2020-05-01’ AS date), cast(‘2020-07-30’ AS date))/ 7.0) AS trips_per_week, count(rider_id) AS cnt
    FROM
        (SELECT rider_id,
        count(distinct tripid) AS t
        FROM demo.trips_data as trips_data
        WHERE d_date <= cast(‘2018-01-01’ AS date)
                AND d_date >= cast(‘2018-03-30’ AS date)
                AND rider_age = 19
                AND d_weekday = 2
                AND leg_duration > 4
        GROUP BY  rider_id)
        GROUP BY  t
        ORDER BY  t limit 50 ) trips_per_week
    GROUP BY  1, 2
limit 1;

–Q07, Cohort
SELECT trips_data.rider_age AS rider_age,
        trips_data.rider_gender AS rider_gender,
        SUM(1) AS cnt
FROM demo.trips_data trips_data
WHERE ((trips_data.d_date >= CAST(‘2018-01-01’ AS DATE))
        AND (trips_data.d_date <= CAST(‘2018-03-31’ AS DATE))
        AND rider_age = 23
        AND d_weekday = 5)
GROUP BY  1, 2
limit 1;

–Q08, Full Scan
SELECT year(d_date),
        month(d_date),
        rider_age,
        count(distinct tripid)
FROM demo.trips_data as trips_data
GROUP BY  1,2,3
ORDER BY  1,2;

2. Logs Analysis

–Q09, Count all errors (~10M matches)
SELECT job, build, COUNT(*) 
FROM jenkins_logs_parquet_x20 
WHERE log LIKE ‘%error%’ 
GROUP BY 1,2 
ORDER BY 3 DESC 
LIMIT 100;

–Q10, Count all dev messages (~90M)
SELECT job, build, COUNT(*) 
FROM jenkins_logs_parquet_x20 
WHERE REGEXP_LIKE(log,’dev.*’) 
GROUP BY 1,2 
ORDER BY 3 DESC 
LIMIT 100;

–Q11, Count all dev messages for a specific job (largest)
SELECT job, build, COUNT(*) 
FROM jenkins_logs_parquet_x20 
WHERE regexp_like(log,’dev.*’) AND job=’new/branches/develop’ 
GROUP BY 1,2 
ORDER BY 3 DESC 
LIMIT 100;

​–Q12, Count all errors in a specific build
​SELECT job, build, COUNT(*)
FROM jenkins_logs_parquet_x20 
WHERE log LIKE ‘%error%’ AND build =’74’ 
GROUP BY 1,2 
ORDER BY 3 DESC LIMIT 100;

3. TPC-DS Sample Queries

–Q55 (TPC-DS, Ad-hoc / Interactive)
SELECT i_brand_id brand_id ,
        i_brand brand ,
        sum(ss_ext_sales_price) ext_price
FROM date_dim , store_sales , item
WHERE (d_date_sk = ss_sold_date_sk)
        AND (ss_item_sk = i_item_sk) AND(i_manager_id = 28)
        AND (d_moy = 11)
        AND (d_year = 1999)
GROUP BY  i_brand, i_brand_id
ORDER BY  ext_price DESC, i_brand_id ASC
LIMIT 100

–Q42 (TPC-DS, Ad-hoc / Interactive)
SELECT dt.d_year ,
        item.i_category_id ,
        item.i_category ,
        sum(ss_ext_sales_price)
FROM date_dim dt , store_sales , item
WHERE (dt.d_date_sk = store_sales.ss_sold_date_sk)
        AND (store_sales.ss_item_sk = item.i_item_sk)
        AND (item.i_manager_id = 1) AND(dt.d_moy = 11)
        AND (dt.d_year = 2000)
GROUP BY  dt.d_year, item.i_category_id, item.i_category
ORDER BY  sum(ss_ext_sales_price) DESC, dt.d_year ASC, item.i_category_id ASC, item.i_category ASC
LIMIT 100

–Q52 (TPC-DS, Ad-hoc / Interactive)
SELECT dt.d_year ,
        item.i_brand_id brand_id ,
        item.i_brand brand ,
        sum(ss_ext_sales_price) ext_price
FROM date_dim dt , store_sales , item
WHERE (dt.d_date_sk = store_sales.ss_sold_date_sk)
        AND (store_sales.ss_item_sk = item.i_item_sk)
        AND (item.i_manager_id = 1)
        AND (dt.d_moy = 11)
        AND (dt.d_year = 2000)
GROUP BY  dt.d_year, item.i_brand, item.i_brand_id
ORDER BY  dt.d_year ASC, ext_price DESC, brand_id ASC
LIMIT 100

–Q96 (TPC-DS, Ad-hoc / Interactive)
SELECT count(*)
FROM store_sales , household_demographics , time_dim , store
WHERE (ss_sold_time_sk = time_dim.t_time_sk)
        AND (ss_hdemo_sk = household_demographics.hd_demo_sk)
        AND (ss_store_sk = s_store_sk)
        AND (time_dim.t_hour = 20) AND(time_dim.t_minute >= 30)
        AND (household_demographics.hd_dep_count = 7)
        AND (store.s_store_name = ‘ese’)
ORDER BY  count(*) ASC
LIMIT 100

–Q66 (TPCDS, Reporting / BI)
SELECT w_warehouse_name ,
        w_warehouse_sq_ft ,
        w_city ,
        w_county ,
        w_state ,
        w_country ,
        ship_carriers ,
        year ,
        sum(jan_sales) jan_sales ,
        sum(feb_sales) feb_sales ,
        sum(mar_sales) mar_sales ,
        sum(apr_sales) apr_sales ,
        sum(may_sales) may_sales ,
        sum(jun_sales) jun_sales ,
        sum(jul_sales) jul_sales ,
        sum(aug_sales) aug_sales ,
        sum(sep_sales) sep_sales ,
        sum(oct_sales) oct_sales ,
        sum(nov_sales) nov_sales ,
        sum(dec_sales) dec_sales ,
        sum((jan_sales / w_warehouse_sq_ft)) jan_sales_per_sq_foot ,
        sum((feb_sales / w_warehouse_sq_ft)) feb_sales_per_sq_foot ,
        sum((mar_sales / w_warehouse_sq_ft)) mar_sales_per_sq_foot ,
        sum((apr_sales / w_warehouse_sq_ft)) apr_sales_per_sq_foot ,
        sum((may_sales / w_warehouse_sq_ft)) may_sales_per_sq_foot ,
        sum((jun_sales / w_warehouse_sq_ft)) jun_sales_per_sq_foot ,
        sum((jul_sales / w_warehouse_sq_ft)) jul_sales_per_sq_foot ,
        sum((aug_sales / w_warehouse_sq_ft)) aug_sales_per_sq_foot ,
        sum((sep_sales / w_warehouse_sq_ft)) sep_sales_per_sq_foot ,
        sum((oct_sales / w_warehouse_sq_ft)) oct_sales_per_sq_foot ,
        sum((nov_sales / w_warehouse_sq_ft)) nov_sales_per_sq_foot ,
        sum((dec_sales / w_warehouse_sq_ft)) dec_sales_per_sq_foot ,
        sum(jan_net) jan_net ,
        sum(feb_net) feb_net ,
        sum(mar_net) mar_net ,
        sum(apr_net) apr_net ,
        sum(may_net) may_net ,
        sum(jun_net) jun_net ,
        sum(jul_net) jul_net ,
        sum(aug_net) aug_net ,
        sum(sep_net) sep_net ,
        sum(oct_net) oct_net ,
        sum(nov_net) nov_net ,
        sum(dec_net) dec_net
FROM
  (SELECT
    w_warehouse_name ,
    w_warehouse_sq_ft ,
    w_city ,
    w_county ,
    w_state ,
    w_country ,
    concat(concat(‘DHL’, ‘,’), ‘BARIAN’) ship_carriers , d_year YEAR ,
    sum ((CASE WHEN(d_moy = 1)  THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) jan_sales ,
    sum ((CASE WHEN(d_moy = 2)  THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) feb_sales ,
    sum ((CASE WHEN(d_moy = 3)  THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) mar_sales ,
    sum ((CASE WHEN(d_moy = 4)  THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) apr_sales ,
    sum ((CASE WHEN(d_moy = 5)  THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) may_sales ,
    sum ((CASE WHEN(d_moy = 6)  THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) jun_sales ,
    sum ((CASE WHEN(d_moy = 7)  THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) jul_sales ,
    sum ((CASE WHEN(d_moy = 8)  THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) aug_sales ,
    sum ((CASE WHEN(d_moy = 9)  THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) sep_sales ,
    sum ((CASE WHEN(d_moy = 10) THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) oct_sales ,
    sum ((CASE WHEN(d_moy = 11) THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) nov_sales ,
    sum ((CASE WHEN(d_moy = 12) THEN (ws_ext_sales_price * ws_quantity) ELSE 0 END)) dec_sales ,
    sum ((CASE WHEN(d_moy = 1)  THEN (ws_net_paid * ws_quantity) ELSE 0 END)) jan_net ,
    sum ((CASE WHEN(d_moy = 2)  THEN (ws_net_paid * ws_quantity) ELSE 0 END)) feb_net ,
    sum ((CASE WHEN(d_moy = 3)  THEN (ws_net_paid * ws_quantity) ELSE 0 END)) mar_net ,
    sum ((CASE WHEN(d_moy = 4)  THEN (ws_net_paid * ws_quantity) ELSE 0 END)) apr_net ,
    sum ((CASE WHEN(d_moy = 5)  THEN (ws_net_paid * ws_quantity) ELSE 0 END)) may_net ,
    sum ((CASE WHEN(d_moy = 6)  THEN (ws_net_paid * ws_quantity) ELSE 0 END)) jun_net ,
    sum ((CASE WHEN(d_moy = 7)  THEN (ws_net_paid * ws_quantity) ELSE 0 END)) jul_net ,
    sum ((CASE WHEN(d_moy = 8)  THEN (ws_net_paid * ws_quantity) ELSE 0 END)) aug_net ,
    sum ((CASE WHEN(d_moy = 9)  THEN (ws_net_paid * ws_quantity) ELSE 0 END)) sep_net ,
    sum ((CASE WHEN(d_moy = 10) THEN (ws_net_paid * ws_quantity) ELSE 0 END)) oct_net ,
    sum ((CASE WHEN(d_moy = 11) THEN (ws_net_paid * ws_quantity) ELSE 0 END)) nov_net ,
    sum ((CASE WHEN(d_moy = 12) THEN (ws_net_paid * ws_quantity) ELSE 0 END)) dec_net
  FROM web_sales , warehouse , date_dim , time_dim , ship_mode
  WHERE (ws_warehouse_sk = w_warehouse_sk)
          AND (ws_sold_date_sk = d_date_sk)
          AND (ws_sold_time_sk = t_time_sk)
          AND (ws_ship_mode_sk = sm_ship_mode_sk)
          AND (d_year = 2001)
          AND (t_time BETWEEN 30838 AND(30838 + 28800))
          AND (sm_carrier IN(‘DHL’ , ‘BARIAN’))
  GROUP BY  w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, d_year
  UNION ALL
  SELECT
    w_warehouse_name ,
    w_warehouse_sq_ft ,
    w_city ,
    w_county ,
    w_state ,
    w_country ,
    concat(concat(‘DHL’, ‘,’), ‘BARIAN’) ship_carriers , d_year YEAR ,
    sum((CASE WHEN(d_moy = 1)  THEN (cs_sales_price * cs_quantity) ELSE 0 END)) jan_sales ,
    sum((CASE WHEN(d_moy = 2)  THEN (cs_sales_price * cs_quantity) ELSE 0 END)) feb_sales ,
    sum((CASE WHEN(d_moy = 3)  THEN (cs_sales_price * cs_quantity) ELSE 0 END)) mar_sales ,
    sum((CASE WHEN(d_moy = 4)  THEN (cs_sales_price * cs_quantity) ELSE 0 END)) apr_sales ,
    sum((CASE WHEN(d_moy = 5)  THEN (cs_sales_price * cs_quantity) ELSE 0 END)) may_sales ,
    sum((CASE WHEN(d_moy = 6)  THEN (cs_sales_price * cs_quantity) ELSE 0 END)) jun_sales ,
    sum((CASE WHEN(d_moy = 7)  THEN (cs_sales_price * cs_quantity) ELSE 0 END)) jul_sales ,
    sum((CASE WHEN(d_moy = 8)  THEN (cs_sales_price * cs_quantity) ELSE 0 END)) aug_sales ,
    sum((CASE WHEN(d_moy = 9)  THEN (cs_sales_price * cs_quantity) ELSE 0 END)) sep_sales ,
    sum((CASE WHEN(d_moy = 10) THEN (cs_sales_price * cs_quantity) ELSE 0 END)) oct_sales ,
    sum((CASE WHEN(d_moy = 11) THEN (cs_sales_price * cs_quantity) ELSE 0 END)) nov_sales ,
    sum((CASE WHEN(d_moy = 12) THEN (cs_sales_price * cs_quantity) ELSE 0 END)) dec_sales ,
    sum((CASE WHEN(d_moy = 1)  THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) jan_net ,
    sum((CASE WHEN(d_moy = 2)  THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) feb_net ,
    sum((CASE WHEN(d_moy = 3)  THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) mar_net ,
    sum((CASE WHEN(d_moy = 4)  THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) apr_net ,
    sum((CASE WHEN(d_moy = 5)  THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) may_net ,
    sum((CASE WHEN(d_moy = 6)  THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) jun_net ,
    sum((CASE WHEN(d_moy = 7)  THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) jul_net ,
    sum((CASE WHEN(d_moy = 8)  THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) aug_net ,
    sum((CASE WHEN(d_moy = 9)  THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) sep_net ,
    sum((CASE WHEN(d_moy = 10) THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) oct_net ,
    sum((CASE WHEN(d_moy = 11) THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) nov_net ,
    sum((CASE WHEN(d_moy = 12) THEN (cs_net_paid_inc_tax * cs_quantity) ELSE 0 END)) dec_net
  FROM catalog_sales , warehouse , date_dim , time_dim , ship_mode
  WHERE (cs_warehouse_sk = w_warehouse_sk)
            AND (cs_sold_date_sk = d_date_sk)
            AND (cs_sold_time_sk = t_time_sk)
            AND (cs_ship_mode_sk = sm_ship_mode_sk)
            AND (d_year = 2001)
            AND (t_time BETWEEN 30838 AND (30838 + 28800))
            AND (sm_carrier IN (‘DHL’ , ‘BARIAN’))
  GROUP BY  w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, d_year ) x
GROUP BY  w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, ship_carriers, year
ORDER BY  w_warehouse_name ASC
LIMIT 100

–Q03 (TPC-DS, Reporting / BI)
SELECT dt.d_year,
        item.i_brand_id brand_id,
        item.i_brand brand,
        sum(ss_ext_sales_price) sum_agg
FROM date_dim dt, store_sales, item
WHERE (dt.d_date_sk = store_sales.ss_sold_date_sk)
        AND (store_sales.ss_item_sk = item.i_item_sk)
        AND (item.i_manufact_id = 128)
        AND (dt.d_moy = 11)
GROUP BY  dt.d_year, item.i_brand, item.i_brand_id
ORDER BY  dt.d_year ASC, sum_agg DESC, brand_id ASC
LIMIT 100

–Q94 (TPCDS, Deep Analytics)
SELECT count(DISTINCT ws_order_number) order count ,
        sum(ws_ext_ship_cost) total shipping cost ,
        sum(ws_net_profit) total net profit
FROM web_sales ws1 , date_dim , customer_address , web_site WHERE(d_date
    BETWEEN CAST(‘1999-2-01’ AS DATE)
        AND (CAST(‘1999-2-01′ AS DATE) + INTERVAL ’60’ DAY))
        AND (ws1.ws_ship_date_sk = d_date_sk)
        AND (ws1.ws_ship_addr_sk = ca_address_sk)
        AND (ca_state = ‘IL’)
        AND (ws1.ws_web_site_sk = web_site_sk)
        AND (web_company_name = ‘pri’) AND(EXISTS
    (SELECT *
    FROM web_sales ws2
    WHERE (ws1.ws_order_number = ws2.ws_order_number)
            AND (ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk) ))
        AND (NOT(EXISTS
    (SELECT *
    FROM web_returns wr1
    WHERE (ws1.ws_order_number = wr1.wr_order_number) )))
ORDER BY  count(DISTINCT ws_order_number) ASC
LIMIT 100


Take Varada for a spin! Click here to schedule a live demo.

We use cookies to improve your experience. To learn more, please see our Privacy Policy
Accept