feat: add sort_pushdown_inexact benchmark for RG reorder #21674
zhuqi-lucas wants to merge 1 commit into apache:main
Conversation
Pull request overview
Adds a new benchmark scenario to measure DataFusion’s sort-pushdown Inexact path behavior (TopK preserved) and the impact of row-group reorder by statistics, while also enabling Parquet filter pushdown so TopK’s dynamic filters can benefit from late materialization.
Changes:
- Enable `execution.parquet.pushdown_filters` in the sort_pushdown benchmark runner to allow TopK dynamic filter pushdown to Parquet decoding.
- Add `sort_pushdown_inexact` to `bench.sh`, including data generation (single Parquet file) and a dedicated run target.
- Add 4 new SQL queries under `benchmarks/queries/sort_pushdown_inexact/` covering narrow/wide and small/large `LIMIT` variants for `ORDER BY ... DESC`.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| benchmarks/src/sort_pushdown.rs | Turns on Parquet filter pushdown for sort_pushdown benchmark runs. |
| benchmarks/bench.sh | Wires in sort_pushdown_inexact data + run targets and generates a single-file dataset. |
| benchmarks/queries/sort_pushdown_inexact/q1.sql | DESC TopK (narrow) query for Inexact path coverage. |
| benchmarks/queries/sort_pushdown_inexact/q2.sql | DESC TopK (narrow) with larger LIMIT to stress threshold convergence. |
| benchmarks/queries/sort_pushdown_inexact/q3.sql | DESC TopK (wide SELECT *) to show late materialization benefits. |
| benchmarks/queries/sort_pushdown_inexact/q4.sql | Wide SELECT * with larger LIMIT for cumulative effects. |
```bash
run_sort_pushdown_inexact() {
    INEXACT_DIR="${DATA_DIR}/sort_pushdown_inexact"
    RESULTS_FILE="${RESULTS_DIR}/sort_pushdown_inexact.json"
    echo "Running sort pushdown Inexact benchmark (row group reorder by statistics)..."
    debug_run $CARGO_COMMAND --bin dfbench -- sort-pushdown --sorted --iterations 5 --path "${INEXACT_DIR}" --queries-path "${SCRIPT_DIR}/queries/sort_pushdown_inexact" -o "${RESULTS_FILE}" ${QUERY_ARG} ${LATENCY_ARG}
}
```
run_sort_pushdown_inexact passes --sorted, which tells DataFusion the file is ordered by l_orderkey (via WITH ORDER). However, the generated shuffled.parquet is intentionally not globally sorted by l_orderkey (row groups are arranged by the bucket key). This benchmark therefore relies on the planner staying on an Inexact path (keeping TopK/Sort for correctness). Please document this assumption explicitly and/or add a guard to ensure the benchmark can’t silently become incorrect if future optimizations start treating the reversed scan as Exact and eliminate the ordering operator.
I'm not sure what the --sorted flag is supposed to do but worth checking @zhuqi-lucas
--sorted adds WITH ORDER (l_orderkey ASC) which is needed to trigger the reverse scan path in try_pushdown_sort — currently the only path where reorder_by_statistics is called.
After reviewing this more carefully, I realized we need two benchmark suites to cover the full optimization path:
- `sort_pushdown_inexact` (with `--sorted`, DESC queries) — tests the reverse scan path where RG reorder is already supported
- `sort_pushdown_inexact_unsorted` (without `--sorted`, ASC+DESC queries) — tests the Unsupported path where RG reorder will be supported in a follow-up PR (feat: reorder row groups by statistics during sort pushdown #21580)
Updated in the latest push. This way each follow-up PR can run its corresponding benchmark to show the improvement.
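The effect the Inexact suite is meant to measure can be sketched with a small Python toy model (this is not DataFusion code; all names and numbers below are illustrative assumptions). Row groups are described only by the (min, max) stats of the sort column; for a DESC TopK, visiting the group with the largest max first lets the dynamic filter threshold converge after one group:

```python
# Toy model: why reordering row groups by statistics helps a DESC TopK
# query converge its dynamic filter threshold quickly. Each row group
# is represented only by the (min, max) stats of the sort column.

def scan_order(row_groups, descending=True):
    """Visit row groups with the largest max first for a DESC TopK."""
    return sorted(row_groups, key=lambda rg: rg[1], reverse=descending)

# Four row groups in scrambled file order: (min, max) of l_orderkey.
groups = [(200, 299), (0, 99), (300, 399), (100, 199)]

reordered = scan_order(groups)
print(reordered)  # [(300, 399), (200, 299), (100, 199), (0, 99)]

# After scanning the first reordered group, a TopK with LIMIT <= 100
# already holds the 100 largest keys; every remaining group's max
# (299, 199, 99) falls below the threshold (300), so their rows can be
# filtered out early via the pushed-down dynamic filter.
threshold = reordered[0][0]  # smallest key kept so far: 300
prunable = [rg for rg in reordered[1:] if rg[1] < threshold]
print(len(prunable))  # 3
```

In the original (scrambled) scan order the threshold only reaches 300 after the third group, so far fewer rows can be skipped.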
```rust
let mut config = self.common.config()?;
// Enable filter pushdown so TopK's dynamic filter is pushed to the
// parquet reader for late materialization — only sort-column rows
// pass the filter, non-sort columns are skipped for filtered rows.
config.options_mut().execution.parquet.pushdown_filters = true;
```
Does this enable it for all benchmarks? Generally we control this on a per-run basis.
Good catch @adriangb ! Fixed — reverted the sort_pushdown.rs change and moved pushdown_filters to an env var only for the inexact benchmark run. Other sort_pushdown benchmarks are unaffected.
Force-pushed ab5fe05 to bc87f85.
```bash
# Use datafusion-cli to bucket rows into 64 groups by a deterministic
# scrambler, then sort within each bucket by orderkey. This produces
# ~64 RG-sized segments where each has a tight orderkey range but the
# segments appear in scrambled (non-sorted) order in the file.
```
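The scheme described in that comment can be modeled with a short Python sketch (the real generation happens via datafusion-cli SQL; the scrambler function, row counts, and bucket sizes below are assumptions standing in for the actual expressions):

```python
# Hypothetical analogue of the shuffled.parquet generation: range-bucket
# a sorted key column, then write the buckets in scrambled order so each
# ~RG-sized segment has a tight min/max range but the file is not
# globally sorted.

def scramble(bucket_id: int) -> int:
    # Deterministic "scrambler": an odd multiplier mod 64 permutes
    # bucket ids (an assumption, not the expression used in bench.sh).
    return (bucket_id * 37) % 64

n, buckets = 6400, 64
rows = list(range(n))              # orderkey values, globally sorted
per_bucket = n // buckets

# Each bucket spans a tight, contiguous orderkey range...
segments = [rows[i * per_bucket:(i + 1) * per_bucket] for i in range(buckets)]
# ...and the buckets land in the file in scrambled order.
file_order = sorted(range(buckets), key=scramble)
shuffled = [segments[b] for b in file_order]

# Every segment keeps a tight min/max span (good RG statistics)...
spans = [(seg[0], seg[-1]) for seg in shuffled]
assert all(mx - mn == per_bucket - 1 for mn, mx in spans)
# ...but the concatenated file is no longer globally sorted.
flat = [k for seg in shuffled for k in seg]
assert flat != rows and sorted(flat) == rows
```

This is exactly the shape that makes the scan Inexact: per-row-group stats are tight and disjoint, so reordering by statistics recovers most of the benefit of a sorted file.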
I think another interesting benchmark, which is at least our case, is when there is overlap between the row groups but still some general ordering. E.g.:
| rg | min | max |
|---|---|---|
| 0 | 0 | 5 |
| 1 | 4 | 7 |
| 2 | 5 | 32 |
In our case, this happens because there's a stream of data coming in with timestamps, and it should arrive at around the same time it was created, but there are always some network delays, time skew, etc. that mean it's not perfect. But data that arrived 30 minutes later is guaranteed to have timestamps in a different range than data that arrived 30 minutes before it.
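A toy model of that pattern (every name and constant here is illustrative, chosen only to show the shape of the statistics):

```python
# Sketch: per-row-group (min, max) stats for a timestamp column written
# from a stream. Data arrives in rough time order, but bounded network
# delay / clock skew makes adjacent row groups overlap slightly, while
# row groups written far apart stay disjoint.

def streaming_rg_stats(num_rgs: int, span: int, max_skew: int):
    """(min, max) per row group: each covers its time window plus skew."""
    return [(rg * span, (rg + 1) * span + max_skew) for rg in range(num_rgs)]

stats = streaming_rg_stats(num_rgs=4, span=100, max_skew=10)
print(stats)  # [(0, 110), (100, 210), (200, 310), (300, 410)]

# Adjacent row groups overlap, so reorder-by-statistics must tolerate
# partially overlapping ranges rather than assume disjoint segments...
assert all(a_max > b_min for (_, a_max), (b_min, _) in zip(stats, stats[1:]))
# ...but any two row groups more than one apart are disjoint, because
# max_skew is much smaller than the per-row-group time span.
assert all(stats[i][1] < stats[i + 2][0] for i in range(len(stats) - 2))
```

A data variant generated with this shape would exercise the reorder path under realistic, partially overlapping statistics instead of the fully disjoint shuffled buckets.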
Great suggestion! Partially overlapping RGs from streaming data is a very realistic scenario. I will add a benchmark variant for this pattern when I update the PR — something like time-ordered chunks with small overlaps between adjacent chunks to simulate network delays / time skew.
Did you plan to include it in this PR or a followup?
Will add it in this PR — I'll create a third data variant with partially overlapping RGs (simulating streaming data with network delays) alongside the current shuffled data.
Add two benchmark suites for the Inexact sort pushdown path:

1. sort_pushdown_inexact (--sorted, DESC queries): Tests reverse scan path where RG reorder by statistics is applied. 4 queries: narrow/wide rows x LIMIT 100/1000.
2. sort_pushdown_inexact_unsorted (no WITH ORDER): Tests Unsupported path for future RG reorder support without declared file ordering. 6 queries: ASC/DESC x narrow/wide.

Both use a single shuffled parquet file with out-of-order row groups and enable pushdown_filters via env var for late materialization.

Closes apache#21582
Force-pushed bc87f85 to 9290773.
Which issue does this PR close?
Closes #21582
Rationale for this change
The existing sort_pushdown benchmarks only cover the Exact path (sort elimination). The Inexact path — where TopK is preserved and row group reorder by statistics helps threshold convergence — had no benchmark to measure its impact.
What changes are included in this PR?
- New benchmark suite `sort_pushdown_inexact` with 4 DESC LIMIT queries (narrow/wide rows, small/large LIMIT)
- Enable `pushdown_filters` in sort_pushdown benchmarks so TopK's dynamic filter is pushed to the parquet reader for late materialization

How to run
Or on GKE:
@alamb benchmark sort_pushdown_inexact

Are these changes tested?
Benchmark code only — validated locally.
Are there any user-facing changes?
No. New benchmark only.