Handle canceled partitioned hash join dynamic filters lazily #21666

Open
adriangb wants to merge 11 commits into apache:main from pydantic:codex/hash-join-empty-partition-reporting
@adriangb (Contributor) commented Apr 16, 2026

Which issue does this PR close?

Rationale for this change

Partitioned hash join dynamic filters assumed every build-side partition would eventually report build data to the shared coordinator. That assumption breaks when an upstream partitioned operator legally short-circuits and drops a child hash-join partition before it is ever polled far enough to report.

In the original reproducer, a parent RightSemi join completes early for partitions whose own build side is empty. That causes child partitioned hash-join streams to be dropped while still waiting to build/report their dynamic-filter contribution. Sibling partitions then wait forever for reports that will never arrive.
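
The failure mode can be modeled in a few lines. This is a minimal sketch under stated assumptions, not DataFusion's actual coordinator: `Coordinator`, `PartitionStream`, `report`, `is_complete`, and `simulate` are hypothetical names, and the real code waits asynchronously rather than checking a flag.

```rust
// Hypothetical model of the pre-fix behavior: the coordinator only finishes
// once every partition has reported build data.
struct Coordinator {
    expected: usize,
    reported: usize,
}

impl Coordinator {
    fn new(expected: usize) -> Self {
        Self { expected, reported: 0 }
    }

    // Called by a partition once its build side is known.
    fn report(&mut self) {
        self.reported += 1;
    }

    // Sibling partitions wait until this becomes true.
    fn is_complete(&self) -> bool {
        self.reported == self.expected
    }
}

// Stand-in for a child hash-join partition stream.
struct PartitionStream;

// Runs `total` partitions, of which the first `dropped_early` are dropped
// before reporting, and returns whether the coordinator ever completes.
fn simulate(total: usize, dropped_early: usize) -> bool {
    let mut coordinator = Coordinator::new(total);
    for i in 0..total {
        let stream = PartitionStream;
        if i < dropped_early {
            // The parent short-circuited: the stream goes away silently
            // and no report is ever sent for this partition.
            drop(stream);
            continue;
        }
        coordinator.report();
    }
    coordinator.is_complete()
}
```

In this model `simulate(4, 0)` completes, while `simulate(4, 1)` never does, which corresponds to the sibling partitions waiting forever.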

What changes are included in this PR?

  • teach the shared partitioned dynamic-filter coordinator to distinguish terminal partition states:
    • reported build data
    • canceled before build data was known
  • mark unreported partitioned hash-join streams as canceled on Drop
  • treat canceled partitions as true in the synthesized partitioned filter so they do not block completion or incorrectly filter probe rows
  • preserve existing empty-partition behavior so known-empty partitions still contribute false
  • preserve the existing compact filter plan shapes when there are no canceled partitions, including the single-branch collapse used in hash-collision mode
  • add a regression test for the cancellation pattern that previously hung
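
The state handling described in the list above might look roughly like the following. It is a simplified sketch rather than the code in this PR: every type and method name (`PartitionState`, `Branch`, `Coordinator`, `PartitionStream`, `set`, `branches`) is an assumption made for illustration, and the real per-partition predicate is reduced to a placeholder variant.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical per-partition state tracked by the shared coordinator.
#[derive(Clone, Copy, Debug, PartialEq)]
enum PartitionState {
    Pending,
    // Build data is known; `empty` means the build side had no rows.
    Reported { empty: bool },
    // The stream was dropped before its build data was known.
    Canceled,
}

// Contribution of one partition to the synthesized filter.
#[derive(Debug, PartialEq)]
enum Branch {
    PassAll,     // literal `true` for a canceled partition
    PassNone,    // literal `false` for a known-empty partition
    BuildFilter, // placeholder for a predicate derived from build data
}

struct Coordinator {
    states: Mutex<Vec<PartitionState>>,
}

impl Coordinator {
    fn new(partitions: usize) -> Arc<Self> {
        Arc::new(Self {
            states: Mutex::new(vec![PartitionState::Pending; partitions]),
        })
    }

    // Records a terminal state; the first terminal state wins.
    fn set(&self, partition: usize, state: PartitionState) {
        let mut states = self.states.lock().unwrap();
        if states[partition] == PartitionState::Pending {
            states[partition] = state;
        }
    }

    // Returns one branch per partition, or None while any is still pending.
    fn branches(&self) -> Option<Vec<Branch>> {
        let states = self.states.lock().unwrap();
        states
            .iter()
            .map(|s| match s {
                PartitionState::Pending => None,
                PartitionState::Canceled => Some(Branch::PassAll),
                PartitionState::Reported { empty: true } => Some(Branch::PassNone),
                PartitionState::Reported { empty: false } => Some(Branch::BuildFilter),
            })
            .collect()
    }
}

struct PartitionStream {
    partition: usize,
    coordinator: Arc<Coordinator>,
    reported: bool,
}

impl PartitionStream {
    fn new(partition: usize, coordinator: &Arc<Coordinator>) -> Self {
        Self { partition, coordinator: Arc::clone(coordinator), reported: false }
    }

    fn report(&mut self, empty: bool) {
        self.coordinator.set(self.partition, PartitionState::Reported { empty });
        self.reported = true;
    }
}

impl Drop for PartitionStream {
    // An unreported stream marks itself canceled so siblings are not blocked.
    fn drop(&mut self) {
        if !self.reported {
            self.coordinator.set(self.partition, PartitionState::Canceled);
        }
    }
}

// Three partitions: one with data, one known-empty, one dropped early.
fn demo() -> Option<Vec<Branch>> {
    let coordinator = Coordinator::new(3);
    let mut p0 = PartitionStream::new(0, &coordinator);
    let mut p1 = PartitionStream::new(1, &coordinator);
    let p2 = PartitionStream::new(2, &coordinator);
    p0.report(false);
    p1.report(true);
    drop(p2);
    coordinator.branches()
}
```

The point of keeping the canceled and known-empty cases apart is that a canceled partition says nothing about its build side, so the only safe contribution is one that filters no probe rows, whereas a known-empty partition can still reject everything routed to it.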

Are these changes tested?

  • cargo fmt --all
  • cargo test -p datafusion-physical-plan test_partitioned_dynamic_filter_reports_empty_canceled_partitions -- --nocapture
  • cargo test -p datafusion --test core_integration physical_optimizer::filter_pushdown::test_hashjoin_dynamic_filter_pushdown_partitioned -- --nocapture
  • cargo test -p datafusion --test core_integration physical_optimizer::filter_pushdown::test_hashjoin_dynamic_filter_pushdown_partitioned --features force_hash_collisions -- --nocapture
  • verified that test_partitioned_dynamic_filter_reports_empty_canceled_partitions times out on the pre-fix revision and passes on this branch

cargo clippy --all-targets --all-features -- -D warnings still fails on an unrelated existing workspace lint in datafusion/expr/src/logical_plan/plan.rs:3773 (clippy::mutable_key_type).

Are there any user-facing changes?

No.

github-actions bot added the physical-plan label (Changes to the physical-plan crate) Apr 16, 2026
adriangb marked this pull request as ready for review April 16, 2026 11:55
@adriangb
Contributor Author

run benchmarks

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4259875560-1362-nk27n 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing codex/hash-join-empty-partition-reporting (d17d5e4) to 5c653be (merge-base) diff using: tpcds
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4259875560-1361-d7znb 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing codex/hash-join-empty-partition-reporting (d17d5e4) to 5c653be (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4259875560-1363-qrwrt 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing codex/hash-join-empty-partition-reporting (d17d5e4) to 5c653be (merge-base) diff using: tpch
Results will be posted here when complete


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Comparing HEAD and codex_hash-join-empty-partition-reporting
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                   HEAD ┃ codex_hash-join-empty-partition-reporting ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │           1.19 / 4.37 ±6.28 / 16.93 ms │              1.19 / 4.46 ±6.37 / 17.19 ms │     no change │
│ QQuery 1  │         14.88 / 15.01 ±0.11 / 15.17 ms │            15.14 / 15.45 ±0.17 / 15.63 ms │     no change │
│ QQuery 2  │         43.43 / 43.74 ±0.26 / 44.21 ms │            43.98 / 44.39 ±0.30 / 44.90 ms │     no change │
│ QQuery 3  │         41.57 / 44.12 ±1.69 / 46.34 ms │            42.77 / 44.97 ±1.35 / 46.94 ms │     no change │
│ QQuery 4  │      282.16 / 293.79 ±9.62 / 307.92 ms │         285.44 / 296.97 ±7.26 / 308.10 ms │     no change │
│ QQuery 5  │      337.41 / 342.71 ±2.83 / 345.05 ms │         352.81 / 358.06 ±4.34 / 364.86 ms │     no change │
│ QQuery 6  │            4.98 / 5.41 ±0.22 / 5.55 ms │              5.30 / 9.08 ±3.72 / 15.49 ms │  1.68x slower │
│ QQuery 7  │         16.78 / 17.38 ±0.47 / 18.15 ms │            17.51 / 17.64 ±0.10 / 17.76 ms │     no change │
│ QQuery 8  │      415.76 / 424.23 ±8.58 / 440.27 ms │        410.23 / 429.54 ±12.05 / 446.87 ms │     no change │
│ QQuery 9  │      648.84 / 658.48 ±9.24 / 671.44 ms │         631.88 / 643.91 ±7.25 / 654.13 ms │     no change │
│ QQuery 10 │         92.78 / 95.81 ±2.44 / 99.64 ms │            92.15 / 94.38 ±1.43 / 96.26 ms │     no change │
│ QQuery 11 │      105.44 / 106.03 ±0.58 / 107.09 ms │         105.57 / 107.75 ±2.37 / 111.84 ms │     no change │
│ QQuery 12 │      342.54 / 352.42 ±9.61 / 370.28 ms │         334.21 / 339.65 ±3.70 / 345.28 ms │     no change │
│ QQuery 13 │      461.57 / 470.47 ±8.18 / 484.10 ms │        448.28 / 464.82 ±10.82 / 476.52 ms │     no change │
│ QQuery 14 │      342.47 / 343.60 ±0.93 / 345.24 ms │         343.75 / 346.33 ±2.36 / 349.96 ms │     no change │
│ QQuery 15 │     345.85 / 364.92 ±26.66 / 416.23 ms │        346.80 / 366.10 ±19.82 / 398.34 ms │     no change │
│ QQuery 16 │     698.51 / 714.81 ±12.20 / 730.96 ms │        712.57 / 727.80 ±15.51 / 752.28 ms │     no change │
│ QQuery 17 │      709.30 / 715.14 ±4.25 / 720.85 ms │         701.39 / 712.84 ±8.43 / 722.39 ms │     no change │
│ QQuery 18 │  1440.24 / 1484.07 ±37.55 / 1529.27 ms │     1411.15 / 1454.53 ±35.56 / 1512.41 ms │     no change │
│ QQuery 19 │        37.57 / 47.40 ±18.04 / 83.45 ms │          36.08 / 65.17 ±51.55 / 167.88 ms │  1.38x slower │
│ QQuery 20 │     716.24 / 769.21 ±74.56 / 914.63 ms │        716.23 / 740.68 ±36.54 / 813.18 ms │     no change │
│ QQuery 21 │     764.55 / 784.51 ±14.91 / 804.97 ms │         760.98 / 768.17 ±6.05 / 777.61 ms │     no change │
│ QQuery 22 │   1129.82 / 1135.88 ±6.00 / 1147.22 ms │      1131.04 / 1135.15 ±2.58 / 1138.99 ms │     no change │
│ QQuery 23 │ 3062.46 / 3218.46 ±160.84 / 3497.84 ms │     3097.45 / 3120.40 ±19.89 / 3147.34 ms │     no change │
│ QQuery 24 │       99.25 / 104.82 ±3.72 / 110.55 ms │         102.05 / 105.35 ±2.89 / 109.04 ms │     no change │
│ QQuery 25 │      138.23 / 140.44 ±1.84 / 142.65 ms │         139.62 / 141.81 ±2.43 / 146.55 ms │     no change │
│ QQuery 26 │      101.02 / 104.52 ±1.91 / 106.72 ms │         100.96 / 104.84 ±2.15 / 106.89 ms │     no change │
│ QQuery 27 │      846.99 / 853.43 ±8.79 / 870.68 ms │         843.80 / 850.00 ±4.96 / 858.11 ms │     no change │
│ QQuery 28 │  3256.52 / 3283.26 ±23.95 / 3320.66 ms │     3196.58 / 3236.01 ±22.69 / 3263.48 ms │     no change │
│ QQuery 29 │         50.38 / 54.22 ±4.93 / 63.77 ms │            50.65 / 57.18 ±7.12 / 70.39 ms │  1.05x slower │
│ QQuery 30 │      367.60 / 370.98 ±2.69 / 374.68 ms │        358.05 / 368.90 ±10.14 / 386.78 ms │     no change │
│ QQuery 31 │      367.20 / 382.34 ±9.63 / 396.41 ms │         370.31 / 378.77 ±5.44 / 386.87 ms │     no change │
│ QQuery 32 │  1207.62 / 1293.57 ±62.74 / 1382.69 ms │     1034.23 / 1158.96 ±97.51 / 1291.15 ms │ +1.12x faster │
│ QQuery 33 │  1495.17 / 1561.23 ±50.59 / 1642.46 ms │     1453.73 / 1464.62 ±11.31 / 1485.93 ms │ +1.07x faster │
│ QQuery 34 │  1542.70 / 1599.44 ±40.23 / 1662.87 ms │     1450.90 / 1488.12 ±22.30 / 1512.07 ms │ +1.07x faster │
│ QQuery 35 │      391.34 / 401.63 ±6.18 / 410.23 ms │         379.14 / 386.43 ±5.84 / 396.76 ms │     no change │
│ QQuery 36 │      112.72 / 119.13 ±4.25 / 124.89 ms │         118.38 / 120.30 ±1.19 / 121.57 ms │     no change │
│ QQuery 37 │         49.84 / 50.67 ±0.72 / 51.75 ms │            47.24 / 49.95 ±1.77 / 52.79 ms │     no change │
│ QQuery 38 │         74.20 / 75.96 ±1.12 / 77.60 ms │            74.12 / 75.68 ±1.42 / 78.11 ms │     no change │
│ QQuery 39 │      214.53 / 225.33 ±6.52 / 232.69 ms │         201.01 / 207.48 ±6.44 / 215.99 ms │ +1.09x faster │
│ QQuery 40 │         22.69 / 26.00 ±1.80 / 27.81 ms │            22.12 / 26.61 ±2.91 / 30.73 ms │     no change │
│ QQuery 41 │         19.88 / 20.98 ±0.71 / 21.98 ms │            18.39 / 19.77 ±1.22 / 21.79 ms │ +1.06x faster │
│ QQuery 42 │         19.78 / 20.76 ±1.19 / 23.06 ms │            19.10 / 19.96 ±0.69 / 21.19 ms │     no change │
└───────────┴────────────────────────────────────────┴───────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                        │ 23140.67ms │
│ Total Time (codex_hash-join-empty-partition-reporting)   │ 22568.97ms │
│ Average Time (HEAD)                                      │   538.16ms │
│ Average Time (codex_hash-join-empty-partition-reporting) │   524.86ms │
│ Queries Faster                                           │          5 │
│ Queries Slower                                           │          3 │
│ Queries with No Change                                   │         35 │
│ Queries with Failure                                     │          0 │
└──────────────────────────────────────────────────────────┴────────────┘
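
The Change column in the table above appears to follow from the mean (the second value in each `min / mean ±stddev / max` cell). The sketch below recomputes it under that assumption. The 5% cut-off for "no change" is inferred from the rows shown here, not taken from the benchmark runner's source, and `classify` is a hypothetical helper.

```rust
// Inferred from the tables in this thread; the runner's real threshold
// and rounding rules are not confirmed here.
const NO_CHANGE_THRESHOLD: f64 = 1.05;

// Classifies the change between two mean times given in milliseconds,
// using the same labels as the Change column.
fn classify(base_mean_ms: f64, branch_mean_ms: f64) -> String {
    if branch_mean_ms >= base_mean_ms * NO_CHANGE_THRESHOLD {
        format!("{:.2}x slower", branch_mean_ms / base_mean_ms)
    } else if base_mean_ms >= branch_mean_ms * NO_CHANGE_THRESHOLD {
        format!("+{:.2}x faster", base_mean_ms / branch_mean_ms)
    } else {
        "no change".to_string()
    }
}
```

For example, `classify(5.41, 9.08)` matches the QQuery 6 row and `classify(1293.57, 1158.96)` matches QQuery 32. Because the table prints rounded means, a recomputed ratio can differ in the last digit: QQuery 19 recomputes to about 1.37x against the printed 1.38x.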

Resource Usage

clickbench_partitioned — base (merge-base)

Metric       Value
Wall time    116.8s
Peak memory  39.2 GiB
Avg memory   28.8 GiB
CPU user     1082.4s
CPU sys      102.9s
Peak spill   0 B

clickbench_partitioned — branch

Metric       Value
Wall time    113.9s
Peak memory  41.0 GiB
Avg memory   29.4 GiB
CPU user     1069.7s
CPU sys      89.9s
Peak spill   0 B

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Comparing HEAD and codex_hash-join-empty-partition-reporting
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                     HEAD ┃ codex_hash-join-empty-partition-reporting ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │              6.98 / 7.37 ±0.69 / 8.75 ms │               7.09 / 7.54 ±0.69 / 8.90 ms │     no change │
│ QQuery 2  │        145.73 / 146.31 ±0.32 / 146.60 ms │         146.27 / 147.38 ±0.76 / 148.07 ms │     no change │
│ QQuery 3  │        114.06 / 114.75 ±0.58 / 115.63 ms │         115.04 / 116.01 ±0.69 / 116.89 ms │     no change │
│ QQuery 4  │    1410.96 / 1432.76 ±11.84 / 1445.23 ms │     1421.36 / 1449.73 ±18.93 / 1470.63 ms │     no change │
│ QQuery 5  │        174.22 / 175.39 ±1.01 / 177.11 ms │         175.06 / 175.90 ±0.69 / 176.96 ms │     no change │
│ QQuery 6  │       845.04 / 889.68 ±26.52 / 915.48 ms │        857.43 / 886.16 ±21.55 / 910.23 ms │     no change │
│ QQuery 7  │        348.38 / 350.63 ±2.63 / 354.78 ms │         348.24 / 350.59 ±1.51 / 352.86 ms │     no change │
│ QQuery 8  │        117.72 / 118.21 ±0.41 / 118.88 ms │         117.47 / 118.30 ±0.76 / 119.43 ms │     no change │
│ QQuery 9  │        102.35 / 105.91 ±2.84 / 109.15 ms │        102.56 / 111.85 ±11.17 / 133.78 ms │  1.06x slower │
│ QQuery 10 │        107.24 / 108.52 ±0.75 / 109.28 ms │         109.81 / 111.12 ±0.86 / 112.11 ms │     no change │
│ QQuery 11 │     1007.00 / 1021.26 ±9.23 / 1034.34 ms │      1007.07 / 1024.01 ±8.78 / 1031.00 ms │     no change │
│ QQuery 12 │           45.77 / 49.05 ±1.75 / 50.94 ms │            46.99 / 48.84 ±1.52 / 50.68 ms │     no change │
│ QQuery 13 │        413.26 / 417.25 ±5.89 / 428.86 ms │         408.54 / 411.67 ±3.27 / 417.80 ms │     no change │
│ QQuery 14 │     1009.07 / 1017.56 ±7.45 / 1029.43 ms │      999.17 / 1018.18 ±11.21 / 1031.36 ms │     no change │
│ QQuery 15 │           16.03 / 17.07 ±1.02 / 18.34 ms │            13.46 / 14.11 ±0.57 / 15.08 ms │ +1.21x faster │
│ QQuery 16 │             7.71 / 8.42 ±0.82 / 10.03 ms │              8.17 / 8.97 ±0.70 / 10.09 ms │  1.07x slower │
│ QQuery 17 │        230.76 / 232.93 ±1.84 / 235.83 ms │         264.16 / 265.73 ±1.29 / 267.81 ms │  1.14x slower │
│ QQuery 18 │        129.11 / 130.77 ±0.99 / 131.81 ms │         126.14 / 127.62 ±1.63 / 130.54 ms │     no change │
│ QQuery 19 │        157.98 / 159.68 ±1.07 / 161.01 ms │         157.97 / 160.54 ±1.39 / 162.22 ms │     no change │
│ QQuery 20 │           14.66 / 14.92 ±0.24 / 15.28 ms │            14.43 / 14.97 ±0.40 / 15.62 ms │     no change │
│ QQuery 21 │           20.07 / 20.47 ±0.23 / 20.75 ms │            19.56 / 20.43 ±0.48 / 21.02 ms │     no change │
│ QQuery 22 │        495.02 / 498.04 ±2.66 / 501.21 ms │         498.85 / 503.37 ±3.27 / 508.29 ms │     no change │
│ QQuery 23 │       906.32 / 918.96 ±14.83 / 947.35 ms │         904.95 / 917.85 ±9.09 / 931.57 ms │     no change │
│ QQuery 24 │        393.52 / 397.02 ±2.61 / 401.58 ms │         393.38 / 397.47 ±3.17 / 402.93 ms │     no change │
│ QQuery 25 │        349.38 / 351.01 ±1.44 / 352.96 ms │         319.05 / 322.76 ±2.66 / 325.97 ms │ +1.09x faster │
│ QQuery 26 │           82.33 / 84.45 ±1.31 / 86.02 ms │            82.27 / 84.29 ±1.17 / 85.90 ms │     no change │
│ QQuery 27 │              7.31 / 7.71 ±0.33 / 8.25 ms │               7.42 / 7.57 ±0.10 / 7.68 ms │     no change │
│ QQuery 28 │        150.00 / 152.07 ±2.03 / 155.86 ms │         151.41 / 154.11 ±2.28 / 157.50 ms │     no change │
│ QQuery 29 │        287.62 / 289.81 ±1.96 / 293.12 ms │         261.80 / 263.62 ±1.34 / 265.32 ms │ +1.10x faster │
│ QQuery 30 │           45.06 / 46.90 ±2.11 / 50.83 ms │            44.31 / 47.17 ±2.99 / 52.74 ms │     no change │
│ QQuery 31 │        172.82 / 174.61 ±1.44 / 176.55 ms │         173.06 / 175.43 ±1.50 / 176.95 ms │     no change │
│ QQuery 32 │           58.31 / 59.03 ±0.47 / 59.55 ms │            58.55 / 59.72 ±1.01 / 61.10 ms │     no change │
│ QQuery 33 │        142.58 / 144.46 ±1.37 / 146.65 ms │         143.26 / 144.66 ±0.84 / 145.74 ms │     no change │
│ QQuery 34 │              7.58 / 7.85 ±0.20 / 8.15 ms │               7.58 / 7.86 ±0.40 / 8.66 ms │     no change │
│ QQuery 35 │        108.25 / 110.55 ±1.38 / 112.22 ms │         112.09 / 112.68 ±0.70 / 114.00 ms │     no change │
│ QQuery 36 │              6.80 / 6.93 ±0.12 / 7.08 ms │               6.86 / 7.20 ±0.38 / 7.89 ms │     no change │
│ QQuery 37 │             8.75 / 9.46 ±0.91 / 11.20 ms │               8.72 / 9.33 ±0.36 / 9.80 ms │     no change │
│ QQuery 38 │           84.75 / 89.94 ±4.99 / 99.24 ms │            84.51 / 89.22 ±4.63 / 98.06 ms │     no change │
│ QQuery 39 │        128.40 / 130.29 ±1.38 / 131.72 ms │         130.12 / 132.66 ±2.17 / 135.25 ms │     no change │
│ QQuery 40 │        113.02 / 119.36 ±5.74 / 129.07 ms │         114.91 / 120.91 ±6.13 / 132.45 ms │     no change │
│ QQuery 41 │           14.95 / 16.04 ±1.21 / 17.60 ms │            15.47 / 16.31 ±1.03 / 18.16 ms │     no change │
│ QQuery 42 │        107.46 / 109.74 ±1.51 / 111.04 ms │         110.15 / 111.56 ±1.54 / 113.94 ms │     no change │
│ QQuery 43 │              6.28 / 6.95 ±1.06 / 9.06 ms │               6.41 / 6.47 ±0.06 / 6.56 ms │ +1.07x faster │
│ QQuery 44 │           12.29 / 12.64 ±0.27 / 13.10 ms │            12.00 / 12.77 ±0.67 / 14.01 ms │     no change │
│ QQuery 45 │           52.73 / 53.71 ±0.87 / 54.78 ms │            49.30 / 50.52 ±0.68 / 51.31 ms │ +1.06x faster │
│ QQuery 46 │              9.02 / 9.27 ±0.24 / 9.61 ms │              8.83 / 9.55 ±0.68 / 10.68 ms │     no change │
│ QQuery 47 │       765.87 / 780.37 ±11.90 / 800.33 ms │         777.23 / 782.52 ±3.26 / 786.69 ms │     no change │
│ QQuery 48 │        290.90 / 298.19 ±5.32 / 306.01 ms │         290.08 / 300.28 ±6.94 / 311.90 ms │     no change │
│ QQuery 49 │        254.22 / 256.67 ±2.54 / 259.93 ms │         253.39 / 257.06 ±2.61 / 260.94 ms │     no change │
│ QQuery 50 │        230.79 / 236.93 ±4.85 / 244.81 ms │         232.60 / 237.21 ±2.99 / 241.39 ms │     no change │
│ QQuery 51 │        181.69 / 184.70 ±3.85 / 191.91 ms │         183.36 / 185.15 ±1.25 / 186.51 ms │     no change │
│ QQuery 52 │        108.23 / 110.22 ±1.94 / 112.97 ms │         108.85 / 110.10 ±0.77 / 110.96 ms │     no change │
│ QQuery 53 │        104.62 / 105.64 ±1.10 / 107.73 ms │         104.07 / 104.74 ±0.54 / 105.69 ms │     no change │
│ QQuery 54 │        149.20 / 151.72 ±1.57 / 153.93 ms │         148.23 / 149.33 ±0.84 / 150.54 ms │     no change │
│ QQuery 55 │        107.44 / 109.58 ±1.84 / 112.25 ms │         108.66 / 109.76 ±0.67 / 110.57 ms │     no change │
│ QQuery 56 │        140.98 / 144.07 ±2.29 / 147.58 ms │         142.51 / 145.21 ±2.20 / 148.40 ms │     no change │
│ QQuery 57 │        174.73 / 177.74 ±2.30 / 181.53 ms │         176.21 / 178.75 ±1.92 / 182.09 ms │     no change │
│ QQuery 58 │        296.62 / 299.49 ±2.62 / 304.22 ms │         293.92 / 299.65 ±6.07 / 310.52 ms │     no change │
│ QQuery 59 │        201.64 / 203.67 ±1.17 / 204.95 ms │         200.42 / 202.67 ±1.68 / 204.76 ms │     no change │
│ QQuery 60 │        144.32 / 145.47 ±1.11 / 146.96 ms │         145.76 / 147.27 ±1.68 / 149.56 ms │     no change │
│ QQuery 61 │           13.40 / 13.68 ±0.32 / 14.21 ms │            13.52 / 13.80 ±0.25 / 14.21 ms │     no change │
│ QQuery 62 │      917.49 / 969.47 ±31.93 / 1012.15 ms │       909.86 / 940.20 ±34.01 / 1005.96 ms │     no change │
│ QQuery 63 │        102.76 / 105.09 ±1.59 / 107.33 ms │         105.51 / 106.84 ±1.27 / 108.79 ms │     no change │
│ QQuery 64 │        699.39 / 704.27 ±4.71 / 710.91 ms │         695.69 / 702.11 ±4.87 / 708.56 ms │     no change │
│ QQuery 65 │        267.81 / 270.51 ±2.68 / 275.65 ms │         264.08 / 266.55 ±1.84 / 269.28 ms │     no change │
│ QQuery 66 │        252.58 / 261.43 ±8.61 / 271.97 ms │         252.22 / 261.07 ±5.75 / 267.11 ms │     no change │
│ QQuery 67 │        315.54 / 327.77 ±8.16 / 336.44 ms │         324.90 / 326.54 ±1.56 / 328.45 ms │     no change │
│ QQuery 68 │            9.93 / 11.59 ±1.39 / 13.55 ms │            11.37 / 12.03 ±0.42 / 12.65 ms │     no change │
│ QQuery 69 │        103.94 / 105.06 ±1.15 / 106.88 ms │         104.48 / 107.07 ±2.06 / 109.30 ms │     no change │
│ QQuery 70 │       325.61 / 349.88 ±14.55 / 369.65 ms │         360.98 / 369.08 ±7.01 / 378.65 ms │  1.05x slower │
│ QQuery 71 │        136.25 / 139.69 ±4.27 / 148.04 ms │         134.49 / 137.33 ±2.17 / 140.13 ms │     no change │
│ QQuery 72 │       631.27 / 644.57 ±10.03 / 654.27 ms │     3224.68 / 3303.87 ±41.92 / 3343.97 ms │  5.13x slower │
│ QQuery 73 │              7.01 / 8.18 ±1.10 / 9.99 ms │              7.56 / 8.89 ±0.97 / 10.07 ms │  1.09x slower │
│ QQuery 74 │        631.33 / 635.03 ±3.65 / 640.88 ms │         638.45 / 648.92 ±8.96 / 659.56 ms │     no change │
│ QQuery 75 │        280.59 / 283.43 ±1.91 / 286.55 ms │         279.43 / 283.06 ±2.74 / 286.19 ms │     no change │
│ QQuery 76 │        134.37 / 135.35 ±1.10 / 137.16 ms │         131.74 / 135.45 ±2.15 / 137.77 ms │     no change │
│ QQuery 77 │        188.53 / 190.93 ±2.67 / 195.68 ms │         188.45 / 191.14 ±1.83 / 193.30 ms │     no change │
│ QQuery 78 │        350.04 / 353.68 ±2.48 / 357.68 ms │         351.20 / 356.89 ±3.79 / 362.64 ms │     no change │
│ QQuery 79 │        238.31 / 244.67 ±3.95 / 249.32 ms │         246.32 / 248.66 ±3.13 / 254.76 ms │     no change │
│ QQuery 80 │        325.79 / 328.11 ±2.14 / 331.55 ms │         322.59 / 328.01 ±5.48 / 336.60 ms │     no change │
│ QQuery 81 │           26.61 / 28.23 ±1.28 / 30.20 ms │            26.69 / 28.05 ±0.78 / 28.83 ms │     no change │
│ QQuery 82 │        200.20 / 201.28 ±0.95 / 203.02 ms │         201.73 / 203.23 ±2.10 / 207.35 ms │     no change │
│ QQuery 83 │           39.65 / 40.63 ±1.23 / 42.85 ms │            39.48 / 40.14 ±0.67 / 41.43 ms │     no change │
│ QQuery 84 │           50.05 / 51.18 ±0.95 / 52.45 ms │            49.80 / 50.38 ±0.47 / 51.06 ms │     no change │
│ QQuery 85 │        150.07 / 151.32 ±1.63 / 154.31 ms │         151.64 / 153.39 ±1.40 / 155.66 ms │     no change │
│ QQuery 86 │           39.46 / 40.90 ±0.92 / 42.09 ms │            39.80 / 40.70 ±0.46 / 41.15 ms │     no change │
│ QQuery 87 │           84.65 / 90.97 ±5.20 / 97.63 ms │            86.86 / 90.93 ±3.64 / 97.69 ms │     no change │
│ QQuery 88 │        101.82 / 103.50 ±1.44 / 105.76 ms │         102.38 / 103.76 ±1.32 / 105.87 ms │     no change │
│ QQuery 89 │        119.48 / 121.03 ±1.20 / 123.07 ms │         120.23 / 121.92 ±1.14 / 123.33 ms │     no change │
│ QQuery 90 │           24.24 / 24.73 ±0.31 / 24.99 ms │            24.15 / 24.97 ±0.56 / 25.71 ms │     no change │
│ QQuery 91 │           65.04 / 65.96 ±0.67 / 67.13 ms │            64.12 / 65.41 ±1.19 / 67.23 ms │     no change │
│ QQuery 92 │           58.60 / 59.91 ±1.42 / 62.09 ms │            57.83 / 58.98 ±0.69 / 59.82 ms │     no change │
│ QQuery 93 │        191.50 / 194.73 ±1.89 / 196.34 ms │         190.36 / 192.97 ±2.24 / 196.15 ms │     no change │
│ QQuery 94 │           61.71 / 63.17 ±1.37 / 65.51 ms │            62.95 / 64.33 ±1.40 / 66.98 ms │     no change │
│ QQuery 95 │        131.47 / 132.82 ±0.78 / 133.90 ms │          99.53 / 100.71 ±0.63 / 101.26 ms │ +1.32x faster │
│ QQuery 96 │           74.82 / 75.25 ±0.44 / 76.03 ms │            71.29 / 75.21 ±2.25 / 77.81 ms │     no change │
│ QQuery 97 │        127.11 / 130.63 ±3.04 / 134.26 ms │         127.13 / 129.78 ±1.91 / 131.88 ms │     no change │
│ QQuery 98 │        157.22 / 161.33 ±2.07 / 162.60 ms │         157.53 / 159.58 ±1.56 / 161.73 ms │     no change │
│ QQuery 99 │ 10864.11 / 10904.87 ±35.39 / 10961.30 ms │  10865.66 / 10930.68 ±36.39 / 10967.71 ms │     no change │
└───────────┴──────────────────────────────────────────┴───────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                        │ 32274.96ms │
│ Total Time (codex_hash-join-empty-partition-reporting)   │ 34947.05ms │
│ Average Time (HEAD)                                      │   326.01ms │
│ Average Time (codex_hash-join-empty-partition-reporting) │   353.00ms │
│ Queries Faster                                           │          6 │
│ Queries Slower                                           │          6 │
│ Queries with No Change                                   │         87 │
│ Queries with Failure                                     │          0 │
└──────────────────────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

Metric Value
Wall time 161.7s
Peak memory 5.4 GiB
Avg memory 4.4 GiB
CPU user 267.0s
CPU sys 17.7s
Peak spill 0 B

tpcds — branch

Metric Value
Wall time 175.0s
Peak memory 5.6 GiB
Avg memory 4.6 GiB
CPU user 417.3s
CPU sys 20.2s
Peak spill 0 B

File an issue against this benchmark runner

@adriangb
Contributor Author

run benchmark tpcds

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4260131019-1367-smhnv 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing codex/hash-join-empty-partition-reporting (d17d5e4) to 5c653be (merge-base) diff using: tpcds
Results will be posted here when complete



@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and codex_hash-join-empty-partition-reporting
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                     HEAD ┃ codex_hash-join-empty-partition-reporting ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │              6.55 / 7.03 ±0.79 / 8.60 ms │               6.62 / 7.11 ±0.87 / 8.84 ms │     no change │
│ QQuery 2  │        143.53 / 143.94 ±0.32 / 144.23 ms │         142.69 / 144.27 ±0.86 / 145.12 ms │     no change │
│ QQuery 3  │        113.45 / 113.89 ±0.36 / 114.35 ms │         113.86 / 114.43 ±0.60 / 115.53 ms │     no change │
│ QQuery 4  │    1334.29 / 1362.57 ±21.81 / 1394.94 ms │     1360.31 / 1390.10 ±16.45 / 1405.65 ms │     no change │
│ QQuery 5  │        170.59 / 172.99 ±1.76 / 175.28 ms │         172.45 / 174.47 ±1.77 / 177.23 ms │     no change │
│ QQuery 6  │       826.41 / 876.05 ±36.80 / 938.37 ms │        862.48 / 884.01 ±15.84 / 909.86 ms │     no change │
│ QQuery 7  │        339.57 / 344.15 ±2.45 / 346.94 ms │         340.62 / 345.25 ±2.84 / 348.46 ms │     no change │
│ QQuery 8  │        115.88 / 116.71 ±0.58 / 117.66 ms │         116.03 / 116.77 ±0.46 / 117.50 ms │     no change │
│ QQuery 9  │        100.96 / 103.05 ±2.10 / 106.73 ms │         100.29 / 106.99 ±5.92 / 114.42 ms │     no change │
│ QQuery 10 │        106.11 / 106.48 ±0.35 / 107.03 ms │         107.15 / 108.84 ±1.32 / 110.87 ms │     no change │
│ QQuery 11 │        930.37 / 943.58 ±8.27 / 952.46 ms │         956.74 / 960.95 ±3.40 / 966.02 ms │     no change │
│ QQuery 12 │           44.35 / 46.11 ±1.12 / 47.41 ms │            45.31 / 46.13 ±0.70 / 47.37 ms │     no change │
│ QQuery 13 │        399.86 / 401.27 ±1.29 / 402.88 ms │         400.47 / 403.82 ±2.05 / 406.38 ms │     no change │
│ QQuery 14 │     992.83 / 1007.54 ±10.23 / 1022.21 ms │        991.24 / 996.86 ±4.77 / 1002.00 ms │     no change │
│ QQuery 15 │           16.18 / 16.98 ±0.81 / 18.47 ms │            12.63 / 13.16 ±0.69 / 14.53 ms │ +1.29x faster │
│ QQuery 16 │              6.92 / 7.61 ±0.88 / 9.31 ms │               7.21 / 7.59 ±0.42 / 8.41 ms │     no change │
│ QQuery 17 │        227.27 / 230.80 ±1.97 / 233.06 ms │         260.00 / 262.30 ±1.30 / 263.95 ms │  1.14x slower │
│ QQuery 18 │        126.07 / 127.16 ±1.03 / 128.81 ms │         120.95 / 122.72 ±2.12 / 126.75 ms │     no change │
│ QQuery 19 │        153.77 / 155.14 ±0.79 / 156.14 ms │         155.05 / 156.46 ±0.77 / 157.17 ms │     no change │
│ QQuery 20 │           13.55 / 14.26 ±0.58 / 15.09 ms │            13.69 / 14.39 ±0.50 / 15.09 ms │     no change │
│ QQuery 21 │           18.98 / 19.52 ±0.48 / 20.33 ms │            19.28 / 20.07 ±0.57 / 20.97 ms │     no change │
│ QQuery 22 │        489.02 / 491.75 ±2.21 / 494.72 ms │         479.34 / 485.29 ±5.35 / 493.36 ms │     no change │
│ QQuery 23 │        874.09 / 884.62 ±7.47 / 896.56 ms │         860.00 / 872.32 ±9.91 / 885.16 ms │     no change │
│ QQuery 24 │        380.25 / 383.23 ±2.75 / 388.16 ms │         382.04 / 385.17 ±2.55 / 388.46 ms │     no change │
│ QQuery 25 │        338.10 / 340.05 ±1.53 / 341.97 ms │         310.76 / 312.97 ±2.58 / 316.31 ms │ +1.09x faster │
│ QQuery 26 │           80.05 / 82.05 ±1.91 / 85.51 ms │            80.82 / 83.45 ±1.46 / 85.20 ms │     no change │
│ QQuery 27 │              6.95 / 7.61 ±0.60 / 8.50 ms │               7.01 / 7.19 ±0.17 / 7.44 ms │ +1.06x faster │
│ QQuery 28 │        148.91 / 150.04 ±1.18 / 151.93 ms │         148.98 / 150.92 ±1.79 / 153.92 ms │     no change │
│ QQuery 29 │        280.58 / 283.06 ±1.41 / 284.44 ms │         253.40 / 255.68 ±1.39 / 257.33 ms │ +1.11x faster │
│ QQuery 30 │           41.88 / 44.45 ±1.89 / 47.71 ms │            43.93 / 44.97 ±0.98 / 46.25 ms │     no change │
│ QQuery 31 │        168.14 / 171.18 ±1.55 / 172.37 ms │         168.34 / 172.03 ±1.89 / 173.49 ms │     no change │
│ QQuery 32 │           57.14 / 57.93 ±0.64 / 58.58 ms │            57.35 / 58.01 ±0.83 / 59.45 ms │     no change │
│ QQuery 33 │        139.01 / 140.87 ±1.01 / 142.11 ms │         141.92 / 143.19 ±1.44 / 145.94 ms │     no change │
│ QQuery 34 │              6.96 / 7.11 ±0.18 / 7.44 ms │               6.80 / 7.23 ±0.35 / 7.73 ms │     no change │
│ QQuery 35 │        105.44 / 107.39 ±1.24 / 109.26 ms │         108.27 / 109.72 ±1.22 / 111.38 ms │     no change │
│ QQuery 36 │              6.65 / 6.84 ±0.16 / 7.06 ms │               6.70 / 6.83 ±0.14 / 7.06 ms │     no change │
│ QQuery 37 │              8.23 / 8.76 ±0.28 / 8.98 ms │               8.22 / 8.54 ±0.28 / 9.05 ms │     no change │
│ QQuery 38 │           84.46 / 88.06 ±4.03 / 95.63 ms │            83.88 / 86.97 ±3.18 / 92.75 ms │     no change │
│ QQuery 39 │        120.84 / 126.50 ±3.14 / 129.43 ms │         126.61 / 128.15 ±1.22 / 129.94 ms │     no change │
│ QQuery 40 │        109.43 / 114.57 ±4.81 / 121.98 ms │         109.14 / 114.41 ±4.76 / 122.94 ms │     no change │
│ QQuery 41 │           14.34 / 15.35 ±0.89 / 16.68 ms │            14.72 / 16.00 ±0.91 / 17.31 ms │     no change │
│ QQuery 42 │        106.72 / 109.59 ±1.60 / 111.52 ms │         107.04 / 108.95 ±1.25 / 110.85 ms │     no change │
│ QQuery 43 │              5.95 / 6.24 ±0.19 / 6.52 ms │               5.91 / 6.14 ±0.21 / 6.53 ms │     no change │
│ QQuery 44 │           11.92 / 12.35 ±0.45 / 13.22 ms │            11.54 / 11.92 ±0.33 / 12.43 ms │     no change │
│ QQuery 45 │           51.10 / 51.66 ±0.35 / 52.08 ms │            47.86 / 48.45 ±0.38 / 48.86 ms │ +1.07x faster │
│ QQuery 46 │              8.34 / 8.64 ±0.28 / 9.11 ms │               8.26 / 8.72 ±0.27 / 9.07 ms │     no change │
│ QQuery 47 │        689.32 / 694.93 ±3.78 / 700.37 ms │         703.02 / 717.25 ±8.67 / 727.94 ms │     no change │
│ QQuery 48 │        284.37 / 291.10 ±3.59 / 294.46 ms │         285.49 / 290.64 ±3.24 / 294.21 ms │     no change │
│ QQuery 49 │        252.09 / 254.02 ±1.30 / 256.01 ms │         251.74 / 254.61 ±2.33 / 258.33 ms │     no change │
│ QQuery 50 │        222.61 / 225.53 ±3.69 / 232.47 ms │         219.64 / 226.93 ±4.47 / 232.77 ms │     no change │
│ QQuery 51 │        177.97 / 181.66 ±3.63 / 188.50 ms │         182.49 / 184.50 ±1.53 / 186.33 ms │     no change │
│ QQuery 52 │        107.14 / 107.61 ±0.43 / 108.39 ms │         106.56 / 108.59 ±1.36 / 110.46 ms │     no change │
│ QQuery 53 │        102.58 / 103.15 ±0.61 / 104.26 ms │         102.60 / 105.01 ±1.30 / 106.57 ms │     no change │
│ QQuery 54 │        145.21 / 147.67 ±1.98 / 150.22 ms │         145.60 / 147.41 ±1.87 / 150.93 ms │     no change │
│ QQuery 55 │        106.03 / 108.38 ±1.53 / 109.99 ms │         106.12 / 108.19 ±1.59 / 110.14 ms │     no change │
│ QQuery 56 │        138.98 / 141.90 ±1.82 / 143.74 ms │         140.54 / 142.74 ±1.40 / 144.46 ms │     no change │
│ QQuery 57 │        171.93 / 174.45 ±1.86 / 177.43 ms │         172.63 / 175.75 ±2.53 / 178.75 ms │     no change │
│ QQuery 58 │        289.22 / 298.36 ±5.56 / 305.43 ms │         290.79 / 299.43 ±6.11 / 309.87 ms │     no change │
│ QQuery 59 │        200.66 / 203.01 ±1.31 / 204.31 ms │         197.14 / 199.53 ±3.23 / 205.71 ms │     no change │
│ QQuery 60 │        144.39 / 146.08 ±1.11 / 147.39 ms │        143.75 / 154.53 ±12.18 / 172.91 ms │  1.06x slower │
│ QQuery 61 │           13.22 / 13.48 ±0.21 / 13.78 ms │            13.18 / 13.70 ±0.36 / 14.15 ms │     no change │
│ QQuery 62 │    1026.94 / 1087.68 ±51.27 / 1182.26 ms │       917.43 / 941.71 ±38.20 / 1016.17 ms │ +1.16x faster │
│ QQuery 63 │        107.07 / 109.36 ±1.40 / 111.17 ms │         102.50 / 105.94 ±1.90 / 107.72 ms │     no change │
│ QQuery 64 │        701.02 / 709.35 ±8.82 / 726.37 ms │         676.43 / 681.30 ±3.51 / 685.59 ms │     no change │
│ QQuery 65 │        259.07 / 266.35 ±4.42 / 270.99 ms │         251.78 / 266.12 ±7.43 / 273.09 ms │     no change │
│ QQuery 66 │        253.83 / 260.64 ±4.46 / 266.95 ms │         251.75 / 263.38 ±6.36 / 269.18 ms │     no change │
│ QQuery 67 │        309.00 / 324.33 ±9.90 / 340.10 ms │         313.28 / 322.40 ±8.67 / 338.52 ms │     no change │
│ QQuery 68 │             9.37 / 9.98 ±0.50 / 10.69 ms │             9.49 / 10.11 ±0.69 / 11.36 ms │     no change │
│ QQuery 69 │        100.90 / 104.02 ±2.91 / 109.46 ms │         104.87 / 107.05 ±2.16 / 110.54 ms │     no change │
│ QQuery 70 │       318.57 / 341.84 ±18.33 / 372.24 ms │         337.85 / 347.83 ±9.34 / 360.11 ms │     no change │
│ QQuery 71 │        135.34 / 136.48 ±1.35 / 139.11 ms │         138.33 / 140.65 ±1.35 / 141.78 ms │     no change │
│ QQuery 72 │       609.61 / 631.81 ±18.10 / 656.17 ms │     3158.98 / 3200.01 ±30.95 / 3249.97 ms │  5.06x slower │
│ QQuery 73 │             7.45 / 9.15 ±1.31 / 11.25 ms │               6.50 / 7.32 ±0.54 / 8.02 ms │ +1.25x faster │
│ QQuery 74 │        638.77 / 647.33 ±5.31 / 653.18 ms │        587.67 / 610.11 ±17.70 / 631.91 ms │ +1.06x faster │
│ QQuery 75 │        278.30 / 282.95 ±3.00 / 286.74 ms │         273.74 / 277.11 ±2.04 / 279.36 ms │     no change │
│ QQuery 76 │        132.14 / 134.00 ±1.62 / 136.85 ms │         130.93 / 132.38 ±1.59 / 135.32 ms │     no change │
│ QQuery 77 │        188.88 / 191.01 ±1.22 / 192.51 ms │         186.77 / 190.45 ±2.17 / 193.32 ms │     no change │
│ QQuery 78 │        343.14 / 347.84 ±3.52 / 352.99 ms │         337.85 / 342.34 ±5.76 / 353.73 ms │     no change │
│ QQuery 79 │        240.90 / 243.12 ±1.40 / 245.01 ms │         229.94 / 234.76 ±3.13 / 239.09 ms │     no change │
│ QQuery 80 │        323.29 / 325.41 ±1.58 / 327.51 ms │         321.08 / 324.22 ±3.30 / 329.16 ms │     no change │
│ QQuery 81 │           26.82 / 27.71 ±0.81 / 29.12 ms │            28.77 / 29.65 ±0.81 / 30.90 ms │  1.07x slower │
│ QQuery 82 │        196.24 / 200.34 ±2.47 / 203.64 ms │         198.51 / 202.02 ±2.38 / 204.68 ms │     no change │
│ QQuery 83 │           38.59 / 39.23 ±0.66 / 40.24 ms │            38.61 / 39.22 ±0.68 / 40.50 ms │     no change │
│ QQuery 84 │           48.71 / 49.00 ±0.18 / 49.21 ms │            48.11 / 49.10 ±0.50 / 49.45 ms │     no change │
│ QQuery 85 │        147.82 / 149.07 ±1.22 / 151.32 ms │         149.45 / 150.01 ±0.32 / 150.37 ms │     no change │
│ QQuery 86 │           38.46 / 39.61 ±1.01 / 41.13 ms │            39.77 / 40.59 ±0.93 / 42.30 ms │     no change │
│ QQuery 87 │           85.43 / 87.09 ±1.43 / 89.51 ms │            86.33 / 88.41 ±2.34 / 91.86 ms │     no change │
│ QQuery 88 │         99.07 / 100.54 ±1.34 / 103.04 ms │          99.85 / 100.60 ±0.91 / 102.37 ms │     no change │
│ QQuery 89 │        119.81 / 120.85 ±0.83 / 121.94 ms │         118.38 / 120.15 ±1.25 / 121.74 ms │     no change │
│ QQuery 90 │           23.63 / 24.41 ±0.73 / 25.79 ms │            23.00 / 23.65 ±0.52 / 24.39 ms │     no change │
│ QQuery 91 │           61.29 / 63.25 ±1.75 / 65.81 ms │            61.33 / 63.97 ±1.50 / 65.77 ms │     no change │
│ QQuery 92 │           57.52 / 59.49 ±1.18 / 61.23 ms │            57.33 / 57.98 ±0.55 / 58.69 ms │     no change │
│ QQuery 93 │        187.02 / 188.50 ±1.23 / 190.50 ms │         186.69 / 188.54 ±1.05 / 189.88 ms │     no change │
│ QQuery 94 │           61.18 / 61.91 ±0.51 / 62.56 ms │            61.57 / 62.17 ±0.67 / 63.47 ms │     no change │
│ QQuery 95 │        126.74 / 129.68 ±1.77 / 131.27 ms │            96.36 / 97.12 ±0.61 / 98.21 ms │ +1.34x faster │
│ QQuery 96 │           71.55 / 73.60 ±1.32 / 75.30 ms │            73.49 / 74.96 ±1.49 / 77.72 ms │     no change │
│ QQuery 97 │        127.56 / 128.98 ±1.06 / 130.82 ms │         123.50 / 125.88 ±1.45 / 128.07 ms │     no change │
│ QQuery 98 │        153.79 / 154.99 ±0.75 / 156.09 ms │         151.88 / 154.09 ±1.87 / 157.21 ms │     no change │
│ QQuery 99 │ 10793.82 / 10824.89 ±19.01 / 10848.29 ms │  10821.63 / 10881.53 ±35.06 / 10929.49 ms │     no change │
└───────────┴──────────────────────────────────────────┴───────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                        │ 31793.82ms │
│ Total Time (codex_hash-join-empty-partition-reporting)   │ 34213.55ms │
│ Average Time (HEAD)                                      │   321.15ms │
│ Average Time (codex_hash-join-empty-partition-reporting) │   345.59ms │
│ Queries Faster                                           │          9 │
│ Queries Slower                                           │          4 │
│ Queries with No Change                                   │         86 │
│ Queries with Failure                                     │          0 │
└──────────────────────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

Metric Value
Wall time 159.3s
Peak memory 5.6 GiB
Avg memory 4.7 GiB
CPU user 262.1s
CPU sys 17.2s
Peak spill 0 B

tpcds — branch

Metric Value
Wall time 171.4s
Peak memory 5.6 GiB
Avg memory 4.7 GiB
CPU user 407.4s
CPU sys 18.8s
Peak spill 0 B


@Omega359
Contributor

Omega359 commented Apr 16, 2026

query 72 took a bit of a hit here.

Query72
select  i_item_desc
      ,w_warehouse_name
      ,d1.d_week_seq
      ,sum(case when p_promo_sk is null then 1 else 0 end) no_promo
      ,sum(case when p_promo_sk is not null then 1 else 0 end) promo
      ,count(*) total_cnt
from catalog_sales
join inventory on (cs_item_sk = inv_item_sk)
join warehouse on (w_warehouse_sk=inv_warehouse_sk)
join item on (i_item_sk = cs_item_sk)
join customer_demographics on (cs_bill_cdemo_sk = cd_demo_sk)
join household_demographics on (cs_bill_hdemo_sk = hd_demo_sk)
join date_dim d1 on (cs_sold_date_sk = d1.d_date_sk)
join date_dim d2 on (inv_date_sk = d2.d_date_sk)
join date_dim d3 on (cs_ship_date_sk = d3.d_date_sk)
left outer join promotion on (cs_promo_sk=p_promo_sk)
left outer join catalog_returns on (cr_item_sk = cs_item_sk and cr_order_number = cs_order_number)
where d1.d_week_seq = d2.d_week_seq
  and inv_quantity_on_hand < cs_quantity
  and d3.d_date > (d1.d_date + INTERVAL '5 days')
  and hd_buy_potential = '1001-5000'
  and d1.d_year = 2001
  and cd_marital_status = 'M'
group by i_item_desc,w_warehouse_name,d1.d_week_seq
order by total_cnt desc, i_item_desc, w_warehouse_name, d_week_seq
limit 100;

@adriangb
Contributor Author

Yes. And I think this was another bandaid. But it's closer to the root cause than previous attempts. This has to do with cancellation when multiple joins are involved.

TL;DR: I think what is happening is that when you have multiple joins you end up with a tree of operators. One of the joins higher up in the tree hits the new optimization and aborts work, dropping tasks that would have polled downstream joins. But now the downstream join is stuck waiting for all of its partition tasks to finish even though they never will. I think we were all operating under the assumption that the issue was within a single join operator, but really it's an issue any time an upstream operator cancels a join.

I think the real solution is to track when a join build partition task gets dropped and report that to the dynamic filter building so that it doesn't wait for that partition to report.
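
The idea in this comment can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the code in this PR: `Coordinator`, `PartitionStream`, and `PartitionStatus` are hypothetical names, and a `Mutex` plus `Condvar` stands in for whatever async primitive the real coordinator uses. A stream that is dropped before reporting marks its partition as canceled, so waiters are released instead of hanging.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Terminal state of one build-side partition (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq)]
enum PartitionStatus {
    Pending,
    Reported,
    Canceled,
}

/// Shared coordinator: waiters block until no partition is `Pending`.
struct Coordinator {
    statuses: Mutex<Vec<PartitionStatus>>,
    changed: Condvar,
}

impl Coordinator {
    fn new(partitions: usize) -> Arc<Self> {
        Arc::new(Self {
            statuses: Mutex::new(vec![PartitionStatus::Pending; partitions]),
            changed: Condvar::new(),
        })
    }

    /// Record a terminal status; the first terminal status wins.
    fn set(&self, partition: usize, status: PartitionStatus) {
        let mut guard = self.statuses.lock().unwrap();
        if guard[partition] == PartitionStatus::Pending {
            guard[partition] = status;
            self.changed.notify_all();
        }
    }

    /// Block until every partition is terminal, then return a snapshot.
    fn wait_all(&self) -> Vec<PartitionStatus> {
        let mut guard = self.statuses.lock().unwrap();
        while guard.iter().any(|s| *s == PartitionStatus::Pending) {
            guard = self.changed.wait(guard).unwrap();
        }
        guard.clone()
    }
}

/// Stand-in for a per-partition join stream (an assumption for illustration).
struct PartitionStream {
    partition: usize,
    coordinator: Arc<Coordinator>,
}

impl PartitionStream {
    fn report_build_data(&self) {
        self.coordinator.set(self.partition, PartitionStatus::Reported);
    }
}

impl Drop for PartitionStream {
    fn drop(&mut self) {
        // No-op if this partition already reported its build data.
        self.coordinator.set(self.partition, PartitionStatus::Canceled);
    }
}
```

With two partitions, reporting from one stream and dropping the other unreported leaves the coordinator with one `Reported` and one `Canceled` entry, and `wait_all` returns immediately rather than blocking on the dropped partition.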

@adriangb
Contributor Author

run benchmark tpcds

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4260950110-1371-8wgd5 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux


Comparing codex/hash-join-empty-partition-reporting (0584854) to 5c653be (merge-base) diff using: tpcds
Results will be posted here when complete



@adriangb changed the title from "Report empty build partitions for partitioned hash join filters" to "Handle canceled partitioned hash join dynamic filters lazily" Apr 16, 2026
@adriangb requested a review from Copilot April 16, 2026 14:48
Contributor

Copilot AI left a comment

Pull request overview

Fixes a hang in partitioned hash join dynamic filter coordination by allowing partitions that are dropped early to be treated as “canceled” and not block filter finalization (DataFusion issue #21625).

Changes:

  • Add cancellation tracking for partitioned build-side reports and treat canceled partitions as true in the synthesized partitioned dynamic filter.
  • Mark partitioned HashJoinStream partitions as canceled on Drop when they never reported build data.
  • Add a regression test covering the early-completing RightSemi parent join scenario that previously hung.
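
How the three partition outcomes contribute to the synthesized filter can be sketched as follows. This is an illustrative model only: `PartitionFilter` and `keep_probe_row` are hypothetical names, min/max bounds stand in for whatever predicate a reporting partition produces, and `key % n` is an assumed stand-in for the hash routing used by the real plan.

```rust
/// Per-partition contribution to the synthesized filter (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq)]
enum PartitionFilter {
    /// Build data was reported: keep probe keys within [min, max].
    Bounds { min: i64, max: i64 },
    /// Partition is known to be empty: nothing can match, so contribute `false`.
    Empty,
    /// Partition was canceled before its build data was known: contribute `true`.
    Canceled,
}

/// Decide whether a probe row survives the filter.
/// Assumes `filters` is non-empty; routing by `key % n` is an assumption.
fn keep_probe_row(filters: &[PartitionFilter], key: i64) -> bool {
    let partition = key.rem_euclid(filters.len() as i64) as usize;
    match filters[partition] {
        PartitionFilter::Bounds { min, max } => (min..=max).contains(&key),
        PartitionFilter::Empty => false,
        PartitionFilter::Canceled => true,
    }
}
```

A canceled partition must pass rows through because nothing is known about its build side; filtering there could drop rows that would have matched.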

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
datafusion/physical-plan/src/joins/hash_join/stream.rs Track whether build info was reported and report partition cancellation to the coordinator on Drop.
datafusion/physical-plan/src/joins/hash_join/shared_bounds.rs Replace barrier-based coordination with explicit partition status + notify-based completion; synthesize filters that handle canceled partitions.
datafusion/physical-plan/src/joins/hash_join/exec.rs Refactor dynamic-filter accumulator initialization and add a regression test for the cancellation/hang scenario.

enum CompletionState {
Pending,
Finalizing,
Ready(std::result::Result<(), String>),

Copilot AI Apr 16, 2026

CompletionState::Ready stores errors as Result<(), String>, which forces later callers to lose the original DataFusionError variant/backtrace/context. Consider storing Result<(), Arc<DataFusionError>> (or datafusion_common::SharedResult<()>) in CompletionState instead, so you can propagate DataFusionError::Shared(...) to all waiters without stringifying.

Suggested change
Ready(std::result::Result<(), String>),
Ready(std::result::Result<(), Arc<datafusion_common::DataFusionError>>),
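
The benefit of the suggestion can be shown with a stand-in error type. This is a sketch under assumptions: `JoinError` is a hypothetical enum that only mimics an error type with a `Shared(Arc<...>)` variant, and `outcome` and `root` are illustrative helpers, not DataFusion APIs. Storing the error behind an `Arc` lets every waiter receive a cheap clone while the original variant remains recoverable.

```rust
use std::sync::Arc;

/// Stand-in for an error enum with a `Shared` variant (hypothetical).
#[derive(Debug)]
enum JoinError {
    ResourcesExhausted(String),
    Shared(Arc<JoinError>),
}

/// Completion state that keeps the original error instead of a `String`.
enum CompletionState {
    Pending,
    Ready(Result<(), Arc<JoinError>>),
}

/// Hand each waiter its own copy of the stored outcome, if there is one.
fn outcome(state: &CompletionState) -> Option<Result<(), JoinError>> {
    match state {
        CompletionState::Pending => None,
        CompletionState::Ready(Ok(())) => Some(Ok(())),
        // Cloning the Arc is cheap and preserves the original variant.
        CompletionState::Ready(Err(e)) => Some(Err(JoinError::Shared(Arc::clone(e)))),
    }
}

/// Peel off `Shared` layers to reach the original error.
fn root(err: &JoinError) -> &JoinError {
    match err {
        JoinError::Shared(inner) => root(inner),
        other => other,
    }
}
```

Two waiters reading the same `Ready(Err(..))` state each get a `Shared` wrapper around the same underlying error, and `root` still reports `ResourcesExhausted` rather than a generic execution error built from a string.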

Comment on lines +524 to +529
let guard = self.inner.lock();
match &guard.completion {
CompletionState::Ready(Ok(())) => return Ok(()),
CompletionState::Ready(Err(err)) => {
return Err(DataFusionError::Execution(err.clone()));
}

Copilot AI Apr 16, 2026

wait_for_completion converts stored failures into DataFusionError::Execution(err.clone()), which changes the error category and can make debugging harder. If you keep an Arc<DataFusionError> in CompletionState, you can return Err(DataFusionError::Shared(err_arc)) here and preserve the original error semantics.

Comment on lines 391 to +394
/// # Returns
/// * `Result<()>` - Ok if successful, Err if filter update failed or mode mismatch
pub(crate) async fn report_build_data(&self, data: PartitionBuildData) -> Result<()> {
// Store data in the accumulator
{
let finalize_input = {

Copilot AI Apr 16, 2026

The report_build_data doc comment still says "have reported (barrier wait)", but this method no longer uses tokio::sync::Barrier (it uses Notify/CompletionState). Please update the docs to match the current synchronization mechanism so the comment doesn’t mislead future changes.
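
For readers unfamiliar with the difference, a state-plus-notification scheme can be sketched like this. It is only an illustration: the `Pending`/`Finalizing`/`Ready` names mirror the enum quoted in this review, but the semantics shown (the last reporter finalizes, everyone else waits) are an assumption, and a blocking `Condvar` stands in for the async `Notify` mentioned above. `Completion`, `report`, `finish`, and `wait` are hypothetical names.

```rust
use std::sync::{Condvar, Mutex};

#[derive(Clone, Debug, PartialEq)]
enum CompletionState {
    Pending,
    Finalizing,
    Ready(Result<(), String>),
}

struct Inner {
    remaining: usize,
    completion: CompletionState,
}

struct Completion {
    inner: Mutex<Inner>,
    ready: Condvar,
}

impl Completion {
    fn new(partitions: usize) -> Self {
        Self {
            inner: Mutex::new(Inner {
                remaining: partitions,
                completion: CompletionState::Pending,
            }),
            ready: Condvar::new(),
        }
    }

    /// Count one terminal partition. Returns `true` for exactly one caller,
    /// the one whose report was last; that caller should finalize the filter.
    fn report(&self) -> bool {
        let mut guard = self.inner.lock().unwrap();
        guard.remaining = guard.remaining.saturating_sub(1);
        if guard.remaining == 0 && guard.completion == CompletionState::Pending {
            guard.completion = CompletionState::Finalizing;
            true
        } else {
            false
        }
    }

    /// Publish the final outcome and wake every waiter.
    fn finish(&self, result: Result<(), String>) {
        let mut guard = self.inner.lock().unwrap();
        guard.completion = CompletionState::Ready(result);
        self.ready.notify_all();
    }

    /// Block until the outcome is published.
    fn wait(&self) -> Result<(), String> {
        let mut guard = self.inner.lock().unwrap();
        loop {
            if let CompletionState::Ready(result) = &guard.completion {
                return result.clone();
            }
            guard = self.ready.wait(guard).unwrap();
        }
    }
}
```

Unlike a fixed-size barrier, this shape does not require every participant to arrive at a rendezvous point: any code path that can call `report` (including a cancellation path) counts toward completion.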

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected
Details

Comparing HEAD and codex_hash-join-empty-partition-reporting
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                     HEAD ┃ codex_hash-join-empty-partition-reporting ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │              6.92 / 7.40 ±0.82 / 9.04 ms │               7.00 / 7.45 ±0.78 / 9.01 ms │     no change │
│ QQuery 2  │        144.54 / 145.39 ±0.79 / 146.62 ms │         147.27 / 148.61 ±0.75 / 149.50 ms │     no change │
│ QQuery 3  │        115.14 / 115.85 ±0.68 / 116.87 ms │         114.50 / 115.93 ±1.13 / 117.57 ms │     no change │
│ QQuery 4  │    1335.48 / 1382.00 ±36.82 / 1440.86 ms │     1402.24 / 1427.26 ±15.58 / 1447.66 ms │     no change │
│ QQuery 5  │        174.17 / 175.13 ±0.83 / 176.53 ms │         172.07 / 174.75 ±2.27 / 177.94 ms │     no change │
│ QQuery 6  │       853.20 / 880.67 ±22.38 / 913.37 ms │        863.36 / 885.51 ±22.93 / 927.68 ms │     no change │
│ QQuery 7  │        344.50 / 348.21 ±4.06 / 355.46 ms │         346.17 / 347.70 ±0.89 / 348.94 ms │     no change │
│ QQuery 8  │        115.84 / 116.59 ±0.68 / 117.69 ms │         116.73 / 118.22 ±1.22 / 119.85 ms │     no change │
│ QQuery 9  │        102.18 / 105.91 ±2.54 / 108.50 ms │         100.90 / 106.70 ±7.24 / 120.08 ms │     no change │
│ QQuery 10 │        107.43 / 108.45 ±0.84 / 109.44 ms │         106.39 / 108.37 ±1.16 / 109.59 ms │     no change │
│ QQuery 11 │        969.56 / 981.48 ±7.26 / 988.61 ms │       954.27 / 979.77 ±14.85 / 1000.34 ms │     no change │
│ QQuery 12 │           45.11 / 46.95 ±1.34 / 49.09 ms │            44.90 / 45.78 ±0.72 / 46.82 ms │     no change │
│ QQuery 13 │        403.49 / 408.01 ±3.30 / 412.85 ms │         400.55 / 404.63 ±2.29 / 407.48 ms │     no change │
│ QQuery 14 │     1006.48 / 1017.03 ±8.22 / 1025.61 ms │      1010.30 / 1015.54 ±6.32 / 1026.77 ms │     no change │
│ QQuery 15 │           16.63 / 17.24 ±0.63 / 18.26 ms │            16.87 / 17.94 ±0.77 / 18.73 ms │     no change │
│ QQuery 16 │              7.72 / 8.54 ±0.56 / 9.30 ms │               7.22 / 7.73 ±0.54 / 8.75 ms │ +1.11x faster │
│ QQuery 17 │        231.06 / 233.39 ±1.70 / 235.79 ms │         231.81 / 233.42 ±0.94 / 234.62 ms │     no change │
│ QQuery 18 │        129.42 / 130.07 ±0.45 / 130.62 ms │         131.82 / 133.23 ±1.06 / 134.94 ms │     no change │
│ QQuery 19 │        155.63 / 156.89 ±1.15 / 158.85 ms │         154.75 / 157.32 ±1.32 / 158.53 ms │     no change │
│ QQuery 20 │           14.37 / 15.27 ±1.16 / 17.56 ms │            13.87 / 14.71 ±0.69 / 15.70 ms │     no change │
│ QQuery 21 │           19.53 / 19.89 ±0.36 / 20.41 ms │            19.62 / 20.04 ±0.26 / 20.40 ms │     no change │
│ QQuery 22 │        486.19 / 493.40 ±8.86 / 510.86 ms │         492.47 / 496.82 ±2.72 / 500.59 ms │     no change │
│ QQuery 23 │        881.36 / 890.22 ±6.35 / 900.98 ms │        872.74 / 885.76 ±16.25 / 917.42 ms │     no change │
│ QQuery 24 │        388.20 / 389.79 ±2.17 / 394.08 ms │         384.80 / 388.73 ±4.76 / 398.03 ms │     no change │
│ QQuery 25 │        341.61 / 344.05 ±1.24 / 345.01 ms │         343.50 / 344.98 ±1.24 / 346.45 ms │     no change │
│ QQuery 26 │           82.30 / 82.66 ±0.48 / 83.58 ms │            83.14 / 84.50 ±0.98 / 85.88 ms │     no change │
│ QQuery 27 │              6.96 / 7.69 ±0.64 / 8.61 ms │               7.28 / 7.40 ±0.11 / 7.54 ms │     no change │
│ QQuery 28 │        150.81 / 152.63 ±1.52 / 154.95 ms │         149.78 / 150.58 ±0.85 / 152.10 ms │     no change │
│ QQuery 29 │        285.06 / 286.51 ±1.33 / 288.63 ms │         282.04 / 285.54 ±1.92 / 287.21 ms │     no change │
│ QQuery 30 │           45.50 / 46.31 ±0.59 / 47.04 ms │            43.61 / 45.80 ±1.39 / 47.98 ms │     no change │
│ QQuery 31 │        174.26 / 175.97 ±1.17 / 177.08 ms │         170.89 / 173.48 ±1.66 / 175.21 ms │     no change │
│ QQuery 32 │           58.35 / 59.11 ±0.83 / 60.36 ms │            57.92 / 58.65 ±0.54 / 59.58 ms │     no change │
│ QQuery 33 │        144.63 / 145.81 ±1.50 / 148.74 ms │         141.98 / 143.27 ±0.92 / 144.38 ms │     no change │
│ QQuery 34 │              7.41 / 7.67 ±0.23 / 7.98 ms │               7.15 / 7.52 ±0.29 / 7.88 ms │     no change │
│ QQuery 35 │        109.04 / 110.12 ±0.61 / 110.84 ms │         107.55 / 109.84 ±1.46 / 111.56 ms │     no change │
│ QQuery 36 │              6.86 / 7.09 ±0.23 / 7.47 ms │               6.57 / 6.89 ±0.39 / 7.61 ms │     no change │
│ QQuery 37 │              8.71 / 9.06 ±0.29 / 9.41 ms │               8.29 / 8.48 ±0.12 / 8.61 ms │ +1.07x faster │
│ QQuery 38 │           86.14 / 89.02 ±2.40 / 93.42 ms │            83.84 / 88.70 ±4.04 / 95.56 ms │     no change │
│ QQuery 39 │        129.89 / 132.81 ±2.41 / 136.09 ms │         127.71 / 129.86 ±1.91 / 132.57 ms │     no change │
│ QQuery 40 │        115.47 / 118.01 ±3.84 / 125.63 ms │         109.81 / 116.50 ±7.23 / 130.37 ms │     no change │
│ QQuery 41 │           15.11 / 16.82 ±1.63 / 19.73 ms │            14.14 / 15.39 ±0.86 / 16.42 ms │ +1.09x faster │
│ QQuery 42 │        109.75 / 111.47 ±1.31 / 112.93 ms │         107.93 / 109.57 ±1.38 / 111.45 ms │     no change │
│ QQuery 43 │              6.13 / 6.29 ±0.17 / 6.61 ms │               5.93 / 6.15 ±0.21 / 6.55 ms │     no change │
│ QQuery 44 │           12.58 / 12.96 ±0.26 / 13.25 ms │            12.34 / 12.56 ±0.15 / 12.80 ms │     no change │
│ QQuery 45 │           52.79 / 53.56 ±0.87 / 54.92 ms │            51.18 / 52.36 ±0.65 / 52.88 ms │     no change │
│ QQuery 46 │              8.78 / 9.15 ±0.30 / 9.60 ms │               8.74 / 8.93 ±0.16 / 9.17 ms │     no change │
│ QQuery 47 │        742.93 / 753.23 ±5.85 / 758.52 ms │         708.30 / 717.30 ±5.38 / 724.42 ms │     no change │
│ QQuery 48 │        290.05 / 295.39 ±3.74 / 301.60 ms │         285.59 / 288.81 ±2.62 / 292.36 ms │     no change │
│ QQuery 49 │        250.24 / 255.09 ±2.81 / 258.28 ms │         251.10 / 254.72 ±2.73 / 259.58 ms │     no change │
│ QQuery 50 │        223.40 / 234.27 ±6.15 / 241.49 ms │         218.87 / 224.72 ±5.18 / 233.14 ms │     no change │
│ QQuery 51 │        180.95 / 184.85 ±2.47 / 188.02 ms │         182.51 / 184.37 ±2.70 / 189.65 ms │     no change │
│ QQuery 52 │        109.08 / 110.67 ±1.28 / 112.66 ms │         108.18 / 109.57 ±1.54 / 111.78 ms │     no change │
│ QQuery 53 │        103.46 / 104.00 ±0.53 / 104.88 ms │         104.24 / 105.46 ±1.06 / 106.83 ms │     no change │
│ QQuery 54 │        148.06 / 149.13 ±0.68 / 149.96 ms │         149.61 / 150.82 ±0.71 / 151.58 ms │     no change │
│ QQuery 55 │        108.04 / 109.20 ±1.60 / 112.27 ms │         108.22 / 109.47 ±1.00 / 110.68 ms │     no change │
│ QQuery 56 │        144.70 / 145.47 ±1.23 / 147.92 ms │         142.57 / 143.39 ±0.73 / 144.48 ms │     no change │
│ QQuery 57 │        175.10 / 177.24 ±2.43 / 181.70 ms │         173.46 / 176.66 ±2.03 / 178.71 ms │     no change │
│ QQuery 58 │        295.65 / 312.83 ±9.63 / 324.54 ms │         293.44 / 299.98 ±4.50 / 305.03 ms │     no change │
│ QQuery 59 │        198.60 / 203.87 ±4.01 / 208.21 ms │         199.76 / 203.17 ±3.02 / 208.39 ms │     no change │
│ QQuery 60 │        145.57 / 147.62 ±1.30 / 149.01 ms │         144.75 / 146.38 ±1.00 / 147.76 ms │     no change │
│ QQuery 61 │           13.91 / 14.07 ±0.17 / 14.32 ms │            13.47 / 13.74 ±0.25 / 14.12 ms │     no change │
│ QQuery 62 │      910.53 / 952.12 ±38.07 / 1016.94 ms │        889.46 / 923.63 ±27.09 / 954.89 ms │     no change │
│ QQuery 63 │        105.01 / 106.27 ±1.50 / 109.21 ms │         105.84 / 107.25 ±1.09 / 109.01 ms │     no change │
│ QQuery 64 │        693.19 / 700.25 ±4.96 / 707.76 ms │         683.20 / 693.67 ±7.30 / 704.24 ms │     no change │
│ QQuery 65 │        256.32 / 258.81 ±2.14 / 261.65 ms │         251.02 / 254.74 ±5.12 / 264.08 ms │     no change │
│ QQuery 66 │       245.58 / 257.95 ±14.16 / 282.61 ms │        230.14 / 250.48 ±10.85 / 259.27 ms │     no change │
│ QQuery 67 │        320.13 / 324.04 ±3.78 / 330.13 ms │         309.64 / 318.64 ±8.28 / 332.71 ms │     no change │
│ QQuery 68 │            9.58 / 11.27 ±1.40 / 13.31 ms │              8.56 / 9.91 ±1.72 / 13.07 ms │ +1.14x faster │
│ QQuery 69 │        102.97 / 104.95 ±1.68 / 108.02 ms │          99.44 / 102.68 ±1.91 / 105.35 ms │     no change │
│ QQuery 70 │       344.34 / 357.33 ±12.58 / 380.42 ms │         340.10 / 349.58 ±7.08 / 360.43 ms │     no change │
│ QQuery 71 │        136.54 / 138.70 ±1.45 / 140.47 ms │         135.40 / 138.53 ±2.72 / 142.01 ms │     no change │
│ QQuery 72 │        612.25 / 621.71 ±6.86 / 630.78 ms │         627.79 / 634.31 ±5.88 / 644.35 ms │     no change │
│ QQuery 73 │              7.04 / 8.14 ±0.99 / 9.93 ms │               6.83 / 7.65 ±0.81 / 8.94 ms │ +1.06x faster │
│ QQuery 74 │        595.53 / 604.64 ±5.65 / 612.39 ms │        555.25 / 573.01 ±14.36 / 597.16 ms │ +1.06x faster │
│ QQuery 75 │        275.82 / 279.56 ±2.96 / 284.12 ms │         277.37 / 279.35 ±1.61 / 281.82 ms │     no change │
│ QQuery 76 │        132.50 / 134.20 ±1.70 / 137.20 ms │         134.73 / 135.28 ±0.49 / 135.88 ms │     no change │
│ QQuery 77 │        189.31 / 190.65 ±0.75 / 191.45 ms │         189.69 / 191.41 ±1.61 / 194.40 ms │     no change │
│ QQuery 78 │        343.15 / 346.69 ±2.56 / 349.66 ms │         346.51 / 350.83 ±2.82 / 353.82 ms │     no change │
│ QQuery 79 │        235.73 / 238.17 ±2.98 / 243.76 ms │         234.05 / 236.15 ±1.40 / 237.63 ms │     no change │
│ QQuery 80 │        322.76 / 325.99 ±2.78 / 331.08 ms │         323.23 / 324.45 ±0.78 / 325.24 ms │     no change │
│ QQuery 81 │           26.96 / 28.33 ±0.97 / 29.72 ms │            26.97 / 28.15 ±0.89 / 29.30 ms │     no change │
│ QQuery 82 │        201.49 / 203.50 ±2.11 / 206.09 ms │         199.24 / 201.41 ±2.23 / 205.59 ms │     no change │
│ QQuery 83 │           40.50 / 40.85 ±0.36 / 41.45 ms │            38.74 / 39.67 ±0.62 / 40.57 ms │     no change │
│ QQuery 84 │           48.94 / 50.73 ±0.95 / 51.76 ms │            48.23 / 48.85 ±0.51 / 49.79 ms │     no change │
│ QQuery 85 │        148.70 / 149.82 ±0.90 / 151.22 ms │         146.47 / 149.55 ±2.58 / 153.68 ms │     no change │
│ QQuery 86 │           39.34 / 41.24 ±1.54 / 43.37 ms │            39.48 / 39.97 ±0.47 / 40.77 ms │     no change │
│ QQuery 87 │           85.55 / 89.57 ±3.09 / 94.57 ms │            85.92 / 90.79 ±4.84 / 98.07 ms │     no change │
│ QQuery 88 │        101.46 / 103.45 ±2.06 / 106.73 ms │         100.73 / 101.37 ±0.63 / 102.18 ms │     no change │
│ QQuery 89 │        120.02 / 121.59 ±1.12 / 123.23 ms │         119.28 / 120.62 ±0.74 / 121.30 ms │     no change │
│ QQuery 90 │           24.12 / 24.79 ±0.43 / 25.43 ms │            23.89 / 24.38 ±0.63 / 25.60 ms │     no change │
│ QQuery 91 │           64.15 / 65.50 ±1.01 / 67.04 ms │            61.94 / 64.90 ±2.53 / 68.60 ms │     no change │
│ QQuery 92 │           58.17 / 60.01 ±1.14 / 61.32 ms │            57.94 / 58.96 ±0.75 / 59.84 ms │     no change │
│ QQuery 93 │        188.85 / 192.03 ±2.29 / 194.61 ms │         188.31 / 189.89 ±1.05 / 191.57 ms │     no change │
│ QQuery 94 │           63.51 / 64.16 ±0.69 / 65.42 ms │            61.83 / 63.45 ±1.36 / 65.15 ms │     no change │
│ QQuery 95 │        130.25 / 131.04 ±0.55 / 131.83 ms │         128.96 / 130.98 ±1.19 / 132.43 ms │     no change │
│ QQuery 96 │           71.56 / 75.16 ±1.87 / 76.72 ms │            74.33 / 74.93 ±0.67 / 76.23 ms │     no change │
│ QQuery 97 │        128.99 / 130.75 ±1.38 / 133.23 ms │         125.02 / 127.78 ±1.59 / 129.80 ms │     no change │
│ QQuery 98 │        156.22 / 157.55 ±0.81 / 158.67 ms │         154.80 / 157.76 ±1.98 / 160.27 ms │     no change │
│ QQuery 99 │ 10863.95 / 10885.69 ±22.13 / 10924.58 ms │  10794.69 / 10840.18 ±28.73 / 10881.09 ms │     no change │
└───────────┴──────────────────────────────────────────┴───────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                        │ 31952.71ms │
│ Total Time (codex_hash-join-empty-partition-reporting)   │ 31772.51ms │
│ Average Time (HEAD)                                      │   322.75ms │
│ Average Time (codex_hash-join-empty-partition-reporting) │   320.93ms │
│ Queries Faster                                           │          6 │
│ Queries Slower                                           │          0 │
│ Queries with No Change                                   │         93 │
│ Queries with Failure                                     │          0 │
└──────────────────────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 160.1s |
| Peak memory | 5.7 GiB |
| Avg memory | 4.5 GiB |
| CPU user | 263.8s |
| CPU sys | 17.7s |
| Peak spill | 0 B |

tpcds — branch

| Metric | Value |
|---|---|
| Wall time | 159.2s |
| Peak memory | 5.4 GiB |
| Avg memory | 4.5 GiB |
| CPU user | 262.8s |
| CPU sys | 17.1s |
| Peak spill | 0 B |

File an issue against this benchmark runner

@adriangb
Contributor Author

run benchmarks

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4261207193-1375-rv9zw 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing codex/hash-join-empty-partition-reporting (7011c5d) to 5c653be (merge-base) diff using: tpch
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4261207193-1374-6k5jz 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing codex/hash-join-empty-partition-reporting (7011c5d) to 5c653be (merge-base) diff using: tpcds
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4261207193-1373-pk7t6 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing codex/hash-join-empty-partition-reporting (7011c5d) to 5c653be (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and codex_hash-join-empty-partition-reporting
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃ codex_hash-join-empty-partition-reporting ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.21 / 4.56 ±6.51 / 17.58 ms │              1.21 / 4.48 ±6.40 / 17.28 ms │     no change │
│ QQuery 1  │        15.53 / 15.68 ±0.17 / 15.97 ms │            14.97 / 15.50 ±0.28 / 15.72 ms │     no change │
│ QQuery 2  │        44.79 / 45.06 ±0.32 / 45.59 ms │            43.95 / 44.23 ±0.20 / 44.44 ms │     no change │
│ QQuery 3  │        43.28 / 45.82 ±1.56 / 47.87 ms │            43.79 / 44.60 ±1.19 / 46.95 ms │     no change │
│ QQuery 4  │     300.16 / 305.00 ±3.93 / 311.02 ms │         291.01 / 298.90 ±6.49 / 306.78 ms │     no change │
│ QQuery 5  │     350.72 / 358.13 ±4.67 / 365.27 ms │         344.86 / 347.96 ±3.76 / 355.04 ms │     no change │
│ QQuery 6  │          6.02 / 7.78 ±2.92 / 13.60 ms │               5.30 / 6.27 ±0.60 / 6.94 ms │ +1.24x faster │
│ QQuery 7  │        16.97 / 17.52 ±0.66 / 18.53 ms │            17.14 / 17.38 ±0.18 / 17.64 ms │     no change │
│ QQuery 8  │     410.17 / 420.59 ±8.39 / 431.59 ms │         417.02 / 428.47 ±6.85 / 435.06 ms │     no change │
│ QQuery 9  │     652.25 / 661.89 ±5.75 / 668.45 ms │         644.64 / 655.84 ±7.14 / 665.75 ms │     no change │
│ QQuery 10 │        92.52 / 94.00 ±1.40 / 96.34 ms │            92.71 / 95.10 ±2.60 / 99.97 ms │     no change │
│ QQuery 11 │     105.99 / 107.24 ±1.18 / 109.13 ms │         105.92 / 107.73 ±1.33 / 109.64 ms │     no change │
│ QQuery 12 │     340.77 / 348.75 ±5.66 / 356.06 ms │         350.33 / 353.92 ±2.95 / 358.53 ms │     no change │
│ QQuery 13 │    455.88 / 472.88 ±14.96 / 498.25 ms │        476.85 / 495.23 ±14.45 / 513.87 ms │     no change │
│ QQuery 14 │     344.86 / 348.07 ±3.10 / 352.53 ms │         355.46 / 363.51 ±6.21 / 372.83 ms │     no change │
│ QQuery 15 │     348.06 / 357.64 ±7.32 / 366.36 ms │        362.13 / 385.40 ±23.59 / 430.46 ms │  1.08x slower │
│ QQuery 16 │     710.70 / 719.10 ±6.50 / 729.54 ms │        729.48 / 740.26 ±14.95 / 768.21 ms │     no change │
│ QQuery 17 │     708.45 / 712.99 ±3.91 / 719.70 ms │        730.10 / 763.36 ±33.17 / 819.83 ms │  1.07x slower │
│ QQuery 18 │ 1425.99 / 1466.24 ±23.31 / 1493.34 ms │     1466.34 / 1507.16 ±27.51 / 1542.96 ms │     no change │
│ QQuery 19 │       36.19 / 43.90 ±14.45 / 72.78 ms │            34.81 / 37.07 ±1.44 / 38.96 ms │ +1.18x faster │
│ QQuery 20 │    719.05 / 735.55 ±19.57 / 765.34 ms │        722.25 / 734.81 ±15.17 / 763.07 ms │     no change │
│ QQuery 21 │     765.26 / 767.77 ±2.08 / 770.71 ms │        771.59 / 782.17 ±13.09 / 807.23 ms │     no change │
│ QQuery 22 │  1128.42 / 1137.21 ±9.90 / 1155.46 ms │      1134.06 / 1137.51 ±2.71 / 1140.71 ms │     no change │
│ QQuery 23 │ 3078.94 / 3114.20 ±21.83 / 3147.29 ms │     3076.66 / 3104.17 ±24.78 / 3139.26 ms │     no change │
│ QQuery 24 │     100.36 / 102.87 ±2.23 / 106.11 ms │         101.64 / 103.90 ±1.81 / 106.87 ms │     no change │
│ QQuery 25 │     139.21 / 140.77 ±1.29 / 142.61 ms │         139.80 / 141.41 ±1.81 / 143.92 ms │     no change │
│ QQuery 26 │      98.30 / 102.32 ±2.42 / 105.03 ms │          98.68 / 101.50 ±1.74 / 103.28 ms │     no change │
│ QQuery 27 │     849.95 / 857.35 ±6.67 / 869.17 ms │         852.03 / 856.61 ±3.72 / 862.10 ms │     no change │
│ QQuery 28 │ 3243.78 / 3270.64 ±15.40 / 3290.54 ms │     3257.78 / 3301.90 ±28.61 / 3329.77 ms │     no change │
│ QQuery 29 │      49.88 / 77.08 ±48.53 / 173.78 ms │           50.04 / 57.73 ±10.85 / 78.66 ms │ +1.34x faster │
│ QQuery 30 │     355.29 / 359.90 ±4.76 / 368.54 ms │         360.76 / 371.94 ±7.15 / 383.19 ms │     no change │
│ QQuery 31 │    360.74 / 378.80 ±11.38 / 394.94 ms │        361.60 / 381.13 ±12.84 / 396.99 ms │     no change │
│ QQuery 32 │ 1120.91 / 1276.78 ±84.79 / 1378.17 ms │     1243.27 / 1283.99 ±51.24 / 1385.18 ms │     no change │
│ QQuery 33 │ 1467.54 / 1510.83 ±37.89 / 1578.55 ms │     1504.55 / 1559.73 ±37.66 / 1599.80 ms │     no change │
│ QQuery 34 │  1449.11 / 1465.87 ±9.25 / 1476.99 ms │     1550.56 / 1598.95 ±52.06 / 1670.22 ms │  1.09x slower │
│ QQuery 35 │     388.11 / 395.21 ±5.32 / 403.09 ms │         396.52 / 410.10 ±9.56 / 426.36 ms │     no change │
│ QQuery 36 │     113.98 / 120.84 ±3.55 / 123.83 ms │         112.49 / 119.79 ±4.86 / 125.89 ms │     no change │
│ QQuery 37 │        48.15 / 51.18 ±1.57 / 52.70 ms │            48.52 / 50.43 ±1.79 / 53.42 ms │     no change │
│ QQuery 38 │        74.24 / 76.31 ±1.79 / 78.79 ms │            77.49 / 78.49 ±0.85 / 79.80 ms │     no change │
│ QQuery 39 │     203.95 / 213.99 ±6.78 / 223.86 ms │         204.72 / 212.87 ±8.80 / 229.49 ms │     no change │
│ QQuery 40 │        24.38 / 27.00 ±2.19 / 30.63 ms │            23.04 / 25.37 ±1.51 / 27.72 ms │ +1.06x faster │
│ QQuery 41 │        20.35 / 21.47 ±1.10 / 23.16 ms │            20.66 / 21.02 ±0.28 / 21.39 ms │     no change │
│ QQuery 42 │        19.42 / 20.44 ±0.75 / 21.43 ms │            20.06 / 20.59 ±0.33 / 21.07 ms │     no change │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                        │ 22777.20ms │
│ Total Time (codex_hash-join-empty-partition-reporting)   │ 23168.50ms │
│ Average Time (HEAD)                                      │   529.70ms │
│ Average Time (codex_hash-join-empty-partition-reporting) │   538.80ms │
│ Queries Faster                                           │          4 │
│ Queries Slower                                           │          3 │
│ Queries with No Change                                   │         36 │
│ Queries with Failure                                     │          0 │
└──────────────────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 115.0s |
| Peak memory | 42.1 GiB |
| Avg memory | 29.3 GiB |
| CPU user | 1076.3s |
| CPU sys | 95.2s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 116.9s |
| Peak memory | 39.0 GiB |
| Avg memory | 29.0 GiB |
| CPU user | 1083.5s |
| CPU sys | 104.1s |
| Peak spill | 0 B |

@adriangbot
🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected
Details

Comparing HEAD and codex_hash-join-empty-partition-reporting
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                     HEAD ┃ codex_hash-join-empty-partition-reporting ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │              7.07 / 7.57 ±0.73 / 9.00 ms │               6.97 / 7.43 ±0.74 / 8.91 ms │     no change │
│ QQuery 2  │        146.94 / 148.20 ±1.03 / 149.53 ms │         145.91 / 146.88 ±0.82 / 147.85 ms │     no change │
│ QQuery 3  │        114.73 / 115.34 ±0.69 / 116.66 ms │         113.82 / 114.05 ±0.17 / 114.26 ms │     no change │
│ QQuery 4  │    1446.30 / 1484.55 ±24.78 / 1512.49 ms │     1428.52 / 1468.10 ±25.80 / 1498.60 ms │     no change │
│ QQuery 5  │        174.45 / 176.13 ±1.14 / 177.61 ms │         175.16 / 177.00 ±1.82 / 179.68 ms │     no change │
│ QQuery 6  │       892.55 / 911.70 ±20.43 / 948.24 ms │        854.85 / 896.75 ±23.73 / 920.00 ms │     no change │
│ QQuery 7  │        348.04 / 352.12 ±2.92 / 357.04 ms │         346.46 / 348.34 ±1.90 / 351.83 ms │     no change │
│ QQuery 8  │        118.14 / 118.71 ±0.56 / 119.74 ms │         118.42 / 119.54 ±0.99 / 121.06 ms │     no change │
│ QQuery 9  │        107.55 / 110.79 ±3.53 / 117.29 ms │         103.18 / 106.62 ±2.30 / 108.76 ms │     no change │
│ QQuery 10 │        108.45 / 109.88 ±0.84 / 110.81 ms │         107.59 / 109.38 ±1.47 / 111.94 ms │     no change │
│ QQuery 11 │     1057.30 / 1064.65 ±7.61 / 1078.23 ms │      1040.15 / 1049.87 ±7.29 / 1061.85 ms │     no change │
│ QQuery 12 │           47.28 / 48.55 ±0.95 / 49.41 ms │            45.67 / 47.06 ±0.83 / 47.91 ms │     no change │
│ QQuery 13 │        415.70 / 423.17 ±4.91 / 428.64 ms │         403.64 / 410.71 ±3.64 / 413.85 ms │     no change │
│ QQuery 14 │     1026.57 / 1032.91 ±3.77 / 1037.06 ms │      1007.54 / 1017.70 ±8.62 / 1030.60 ms │     no change │
│ QQuery 15 │           16.18 / 17.07 ±0.88 / 18.54 ms │            16.34 / 17.66 ±0.84 / 18.57 ms │     no change │
│ QQuery 16 │              8.07 / 8.70 ±0.57 / 9.45 ms │               7.64 / 7.98 ±0.40 / 8.76 ms │ +1.09x faster │
│ QQuery 17 │        238.35 / 241.53 ±2.69 / 246.07 ms │         235.18 / 237.56 ±1.73 / 239.59 ms │     no change │
│ QQuery 18 │        132.90 / 133.81 ±0.60 / 134.34 ms │         131.77 / 133.55 ±1.30 / 135.47 ms │     no change │
│ QQuery 19 │        161.00 / 162.14 ±1.13 / 164.18 ms │         159.67 / 161.06 ±0.87 / 162.24 ms │     no change │
│ QQuery 20 │           14.57 / 14.83 ±0.29 / 15.40 ms │            14.49 / 14.97 ±0.32 / 15.26 ms │     no change │
│ QQuery 21 │           20.33 / 20.84 ±0.39 / 21.33 ms │            20.07 / 20.57 ±0.35 / 21.08 ms │     no change │
│ QQuery 22 │        520.91 / 525.56 ±3.25 / 528.33 ms │         517.43 / 520.58 ±2.20 / 523.72 ms │     no change │
│ QQuery 23 │        930.90 / 938.56 ±7.29 / 952.45 ms │         918.42 / 924.72 ±3.65 / 929.43 ms │     no change │
│ QQuery 24 │        396.36 / 400.85 ±3.58 / 405.99 ms │         391.89 / 394.35 ±2.22 / 396.93 ms │     no change │
│ QQuery 25 │        353.39 / 355.48 ±1.27 / 357.00 ms │         350.65 / 354.41 ±2.11 / 356.18 ms │     no change │
│ QQuery 26 │           83.34 / 84.90 ±1.92 / 88.45 ms │            82.26 / 85.01 ±1.92 / 87.07 ms │     no change │
│ QQuery 27 │              7.19 / 7.54 ±0.25 / 7.90 ms │               7.19 / 7.42 ±0.19 / 7.67 ms │     no change │
│ QQuery 28 │        154.64 / 155.46 ±0.92 / 157.09 ms │         150.66 / 152.60 ±1.60 / 154.93 ms │     no change │
│ QQuery 29 │        287.68 / 292.87 ±3.22 / 297.77 ms │         290.15 / 291.81 ±1.40 / 293.83 ms │     no change │
│ QQuery 30 │           45.11 / 46.01 ±0.86 / 47.26 ms │            44.10 / 46.57 ±1.43 / 48.02 ms │     no change │
│ QQuery 31 │        174.13 / 175.86 ±1.21 / 177.77 ms │         175.46 / 176.85 ±1.40 / 179.33 ms │     no change │
│ QQuery 32 │         57.69 / 67.84 ±16.42 / 100.47 ms │            58.21 / 59.18 ±1.35 / 61.82 ms │ +1.15x faster │
│ QQuery 33 │        144.84 / 146.64 ±1.65 / 149.66 ms │         145.12 / 146.40 ±1.19 / 148.67 ms │     no change │
│ QQuery 34 │              7.50 / 8.15 ±0.87 / 9.89 ms │               7.51 / 7.63 ±0.12 / 7.84 ms │ +1.07x faster │
│ QQuery 35 │        110.32 / 112.92 ±1.57 / 114.82 ms │         108.24 / 110.53 ±1.83 / 113.04 ms │     no change │
│ QQuery 36 │              6.89 / 7.19 ±0.21 / 7.52 ms │               6.93 / 7.16 ±0.23 / 7.56 ms │     no change │
│ QQuery 37 │             8.79 / 9.60 ±0.57 / 10.49 ms │              9.07 / 9.39 ±0.45 / 10.28 ms │     no change │
│ QQuery 38 │           87.81 / 92.44 ±4.06 / 99.17 ms │           86.98 / 91.19 ±5.36 / 101.50 ms │     no change │
│ QQuery 39 │        133.09 / 136.35 ±2.16 / 139.87 ms │         132.22 / 133.59 ±1.00 / 134.63 ms │     no change │
│ QQuery 40 │        116.24 / 121.72 ±6.69 / 134.57 ms │         110.79 / 119.84 ±7.98 / 134.43 ms │     no change │
│ QQuery 41 │           15.16 / 15.79 ±0.63 / 16.94 ms │            15.14 / 15.52 ±0.33 / 16.02 ms │     no change │
│ QQuery 42 │        110.16 / 112.18 ±1.35 / 114.22 ms │         107.95 / 109.09 ±1.19 / 110.84 ms │     no change │
│ QQuery 43 │              6.54 / 6.62 ±0.07 / 6.71 ms │               6.35 / 6.54 ±0.12 / 6.66 ms │     no change │
│ QQuery 44 │           12.74 / 13.00 ±0.24 / 13.32 ms │            12.17 / 12.63 ±0.32 / 13.03 ms │     no change │
│ QQuery 45 │           51.58 / 52.66 ±0.88 / 54.08 ms │            52.11 / 52.62 ±0.53 / 53.56 ms │     no change │
│ QQuery 46 │             8.86 / 9.58 ±0.49 / 10.12 ms │              8.88 / 9.42 ±0.54 / 10.43 ms │     no change │
│ QQuery 47 │       798.09 / 816.66 ±13.47 / 833.81 ms │         797.59 / 810.08 ±7.03 / 818.75 ms │     no change │
│ QQuery 48 │        296.70 / 301.03 ±3.27 / 305.09 ms │         291.52 / 297.98 ±3.93 / 303.10 ms │     no change │
│ QQuery 49 │        257.67 / 260.29 ±1.81 / 262.97 ms │         253.95 / 255.69 ±1.00 / 257.00 ms │     no change │
│ QQuery 50 │        232.58 / 239.49 ±4.59 / 245.23 ms │         228.78 / 237.23 ±5.54 / 244.33 ms │     no change │
│ QQuery 51 │        183.34 / 187.08 ±3.28 / 191.33 ms │         179.95 / 185.03 ±3.22 / 189.68 ms │     no change │
│ QQuery 52 │        108.86 / 111.77 ±1.91 / 114.28 ms │         109.18 / 110.68 ±1.12 / 112.57 ms │     no change │
│ QQuery 53 │        104.43 / 106.32 ±1.43 / 108.00 ms │         104.05 / 105.09 ±1.26 / 107.57 ms │     no change │
│ QQuery 54 │        151.14 / 152.99 ±1.02 / 154.17 ms │         150.89 / 153.62 ±1.66 / 155.58 ms │     no change │
│ QQuery 55 │        109.27 / 110.83 ±1.82 / 113.83 ms │         107.82 / 109.52 ±1.45 / 111.35 ms │     no change │
│ QQuery 56 │        143.61 / 145.72 ±1.33 / 147.31 ms │         143.83 / 145.03 ±0.97 / 146.64 ms │     no change │
│ QQuery 57 │        176.50 / 179.98 ±1.84 / 181.69 ms │         174.38 / 176.65 ±2.56 / 181.57 ms │     no change │
│ QQuery 58 │        306.80 / 308.48 ±1.24 / 310.44 ms │         290.48 / 301.66 ±7.15 / 311.46 ms │     no change │
│ QQuery 59 │        204.21 / 207.22 ±2.33 / 210.15 ms │         201.50 / 203.43 ±1.35 / 204.87 ms │     no change │
│ QQuery 60 │        147.42 / 147.92 ±0.60 / 148.91 ms │         145.40 / 147.62 ±1.49 / 149.88 ms │     no change │
│ QQuery 61 │           13.68 / 13.93 ±0.23 / 14.28 ms │            13.72 / 14.14 ±0.69 / 15.52 ms │     no change │
│ QQuery 62 │      945.03 / 980.43 ±21.86 / 1009.49 ms │        933.92 / 963.43 ±22.58 / 991.33 ms │     no change │
│ QQuery 63 │        106.57 / 107.60 ±0.85 / 108.93 ms │         105.58 / 106.79 ±1.17 / 108.44 ms │     no change │
│ QQuery 64 │        711.49 / 714.22 ±2.03 / 717.09 ms │         707.18 / 715.13 ±5.30 / 721.61 ms │     no change │
│ QQuery 65 │        271.03 / 272.39 ±1.25 / 273.90 ms │         271.63 / 274.99 ±2.14 / 277.99 ms │     no change │
│ QQuery 66 │       247.17 / 267.28 ±13.18 / 286.63 ms │         245.66 / 256.27 ±9.68 / 274.33 ms │     no change │
│ QQuery 67 │        327.31 / 335.79 ±4.71 / 341.09 ms │         324.25 / 332.57 ±4.99 / 339.00 ms │     no change │
│ QQuery 68 │            9.14 / 11.18 ±1.37 / 13.41 ms │            10.26 / 12.01 ±1.33 / 14.33 ms │  1.07x slower │
│ QQuery 69 │        103.98 / 106.72 ±1.85 / 108.50 ms │         103.63 / 105.31 ±1.31 / 107.50 ms │     no change │
│ QQuery 70 │       340.51 / 355.79 ±11.32 / 372.52 ms │         343.42 / 349.94 ±9.36 / 368.37 ms │     no change │
│ QQuery 71 │        138.81 / 141.08 ±1.83 / 143.34 ms │         136.84 / 138.60 ±0.94 / 139.54 ms │     no change │
│ QQuery 72 │        631.14 / 644.46 ±9.36 / 655.45 ms │         641.20 / 650.22 ±8.72 / 665.55 ms │     no change │
│ QQuery 73 │             7.12 / 8.73 ±1.00 / 10.15 ms │               7.40 / 8.40 ±0.86 / 9.75 ms │     no change │
│ QQuery 74 │        651.44 / 662.25 ±5.69 / 667.28 ms │         654.58 / 665.06 ±6.73 / 673.65 ms │     no change │
│ QQuery 75 │        281.77 / 284.64 ±3.09 / 290.39 ms │         282.98 / 285.19 ±1.84 / 287.88 ms │     no change │
│ QQuery 76 │        135.75 / 136.99 ±0.93 / 138.12 ms │         133.63 / 135.81 ±1.62 / 137.82 ms │     no change │
│ QQuery 77 │        191.29 / 193.09 ±1.76 / 196.24 ms │         191.16 / 194.25 ±2.46 / 197.55 ms │     no change │
│ QQuery 78 │        353.84 / 358.39 ±3.14 / 361.81 ms │         346.77 / 352.78 ±5.18 / 360.82 ms │     no change │
│ QQuery 79 │        252.73 / 255.79 ±2.70 / 260.62 ms │         249.60 / 251.65 ±1.74 / 253.82 ms │     no change │
│ QQuery 80 │        326.74 / 328.76 ±1.61 / 330.74 ms │         322.63 / 325.84 ±2.00 / 328.90 ms │     no change │
│ QQuery 81 │           27.52 / 28.43 ±0.67 / 29.48 ms │            27.32 / 28.10 ±0.62 / 29.21 ms │     no change │
│ QQuery 82 │        203.59 / 205.63 ±1.73 / 207.94 ms │         198.72 / 202.06 ±2.08 / 205.19 ms │     no change │
│ QQuery 83 │           40.88 / 42.30 ±1.09 / 43.62 ms │            39.87 / 40.72 ±0.99 / 42.03 ms │     no change │
│ QQuery 84 │           49.85 / 50.78 ±0.61 / 51.76 ms │            50.25 / 50.71 ±0.69 / 52.08 ms │     no change │
│ QQuery 85 │        152.30 / 155.19 ±2.06 / 158.47 ms │         151.23 / 152.68 ±1.24 / 154.38 ms │     no change │
│ QQuery 86 │           40.69 / 41.78 ±0.96 / 43.34 ms │            40.10 / 41.10 ±0.54 / 41.60 ms │     no change │
│ QQuery 87 │           93.16 / 95.13 ±1.59 / 97.63 ms │            88.19 / 91.34 ±3.55 / 98.16 ms │     no change │
│ QQuery 88 │        105.15 / 105.87 ±0.72 / 107.20 ms │         102.25 / 103.80 ±1.10 / 105.48 ms │     no change │
│ QQuery 89 │        124.67 / 125.76 ±1.25 / 127.94 ms │         119.69 / 121.22 ±1.31 / 122.85 ms │     no change │
│ QQuery 90 │           24.77 / 25.25 ±0.44 / 25.92 ms │            24.26 / 25.50 ±1.19 / 27.73 ms │     no change │
│ QQuery 91 │           63.61 / 67.15 ±2.13 / 69.35 ms │            64.09 / 66.89 ±1.43 / 68.12 ms │     no change │
│ QQuery 92 │           59.95 / 60.68 ±0.78 / 62.10 ms │            58.73 / 59.54 ±0.60 / 60.59 ms │     no change │
│ QQuery 93 │        195.21 / 195.88 ±0.75 / 196.89 ms │         191.87 / 193.52 ±1.08 / 195.21 ms │     no change │
│ QQuery 94 │           63.75 / 64.70 ±0.50 / 65.20 ms │            62.45 / 63.60 ±0.74 / 64.58 ms │     no change │
│ QQuery 95 │        132.82 / 134.65 ±1.66 / 137.70 ms │         129.96 / 130.39 ±0.24 / 130.61 ms │     no change │
│ QQuery 96 │           77.82 / 78.34 ±0.42 / 79.02 ms │            75.04 / 75.59 ±0.36 / 76.08 ms │     no change │
│ QQuery 97 │        135.03 / 136.38 ±1.12 / 138.42 ms │         129.53 / 130.64 ±0.88 / 131.90 ms │     no change │
│ QQuery 98 │        162.89 / 164.68 ±1.67 / 166.70 ms │         155.77 / 158.79 ±2.32 / 162.24 ms │     no change │
│ QQuery 99 │ 10908.05 / 10955.34 ±34.31 / 11010.59 ms │  10893.43 / 10947.81 ±46.52 / 11008.84 ms │     no change │
└───────────┴──────────────────────────────────────────┴───────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                        │ 32788.21ms │
│ Total Time (codex_hash-join-empty-partition-reporting)   │ 32519.50ms │
│ Average Time (HEAD)                                      │   331.19ms │
│ Average Time (codex_hash-join-empty-partition-reporting) │   328.48ms │
│ Queries Faster                                           │          3 │
│ Queries Slower                                           │          1 │
│ Queries with No Change                                   │         95 │
│ Queries with Failure                                     │          0 │
└──────────────────────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 164.3s |
| Peak memory | 5.5 GiB |
| Avg memory | 4.5 GiB |
| CPU user | 271.7s |
| CPU sys | 18.0s |
| Peak spill | 0 B |

tpcds — branch

| Metric | Value |
|---|---|
| Wall time | 162.9s |
| Peak memory | 6.0 GiB |
| Avg memory | 4.8 GiB |
| CPU user | 269.6s |
| CPU sys | 16.9s |
| Peak spill | 0 B |

@adriangb
Contributor Author

Okay I think this addresses the root cause with no performance regression or behavior changes.

@adriangb
Contributor Author

@RatulDawar could you let me know what you think of this solution?

@RatulDawar
Contributor

@adriangb I went through the solution and the PR. Removing the barrier and tracking states instead makes much more sense here, and the explicit states also make the behaviour much more predictable.

Just one concern with the code: reaching the correct state depends on whether the caller remembers to call report_build_data. Can we have a state transition method that reports the build data automatically, so that we only need to call the state-change method, like the existing state_after_build_ready?
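
To make the "track states instead of a barrier" idea concrete, here is a minimal sketch, not the PR's actual code. `PartitionState`, `Coordinator`, `PartitionStream`, `finish`, `is_complete` and `state` are hypothetical names chosen for illustration. It assumes each partition ends in exactly one terminal state and that a stream dropped before reporting should be recorded as canceled.

```rust
use std::sync::{Arc, Mutex};

/// Per-partition state (hypothetical names, not DataFusion's types).
#[derive(Clone, Copy, Debug, PartialEq)]
enum PartitionState {
    /// Nothing is known about this partition yet.
    Pending,
    /// Build data was reported; `has_rows` is false for a known-empty partition.
    Reported { has_rows: bool },
    /// The stream was dropped before its build data was known.
    Canceled,
}

/// Shared coordinator that tracks one state per partition instead of a barrier.
struct Coordinator {
    states: Mutex<Vec<PartitionState>>,
}

impl Coordinator {
    fn new(partitions: usize) -> Arc<Self> {
        Arc::new(Self {
            states: Mutex::new(vec![PartitionState::Pending; partitions]),
        })
    }

    /// Records a terminal state; the first terminal state recorded wins.
    fn finish(&self, partition: usize, state: PartitionState) {
        let mut states = self.states.lock().unwrap();
        if states[partition] == PartitionState::Pending {
            states[partition] = state;
        }
    }

    /// True once no partition is still pending.
    fn is_complete(&self) -> bool {
        let states = self.states.lock().unwrap();
        states.iter().all(|s| *s != PartitionState::Pending)
    }

    fn state(&self, partition: usize) -> PartitionState {
        self.states.lock().unwrap()[partition]
    }
}

/// Stand-in for one partition's join stream.
struct PartitionStream {
    partition: usize,
    coordinator: Arc<Coordinator>,
}

impl Drop for PartitionStream {
    fn drop(&mut self) {
        // No effect if this partition already reported its build data.
        self.coordinator
            .finish(self.partition, PartitionState::Canceled);
    }
}
```

Because `finish` ignores any state recorded after the first, dropping a stream that has already reported does not downgrade it to canceled, so sibling partitions never wait on a partition that can no longer report.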

Replaces the manual PartitionBuildData construction + report_build_data
call + build_reported flag set in collect_build_side with a single
transition_after_build_collected method, making it impossible to forget
to report build data when transitioning state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
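
As a rough illustration of the commit message above, the following sketch folds reporting into the state transition itself. Only the names `PartitionBuildData`, `report_build_data`, `build_reported` and `transition_after_build_collected` come from the thread; every field, signature and the `StreamState`, `Coordinator` and `JoinStream` types are assumptions made for this example and may differ from the real code.

```rust
/// Hypothetical stand-in for the per-partition build summary.
#[derive(Debug, PartialEq)]
struct PartitionBuildData {
    partition: usize,
    num_rows: usize,
}

/// Hypothetical coordinator that simply collects reports.
#[derive(Default)]
struct Coordinator {
    reports: Vec<PartitionBuildData>,
}

impl Coordinator {
    fn report_build_data(&mut self, data: PartitionBuildData) {
        self.reports.push(data);
    }
}

/// Simplified stream states (assumed, for illustration only).
#[derive(Debug, PartialEq)]
enum StreamState {
    CollectingBuildSide,
    ProbeReady { build_rows: usize },
}

struct JoinStream {
    partition: usize,
    state: StreamState,
    build_reported: bool,
}

impl JoinStream {
    fn new(partition: usize) -> Self {
        Self {
            partition,
            state: StreamState::CollectingBuildSide,
            build_reported: false,
        }
    }

    /// Single exit from `CollectingBuildSide`: the report, the flag and the
    /// state change happen together, so none of them can be forgotten.
    fn transition_after_build_collected(
        &mut self,
        num_rows: usize,
        coordinator: &mut Coordinator,
    ) {
        coordinator.report_build_data(PartitionBuildData {
            partition: self.partition,
            num_rows,
        });
        self.build_reported = true;
        self.state = StreamState::ProbeReady { build_rows: num_rows };
    }
}
```

With this shape, a caller that wants to move past build collection has no path that skips the report, which is the property the review comment asked for.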
@adriangb
Contributor Author

@RatulDawar does 5ad96d9 help?

Labels

physical-plan Changes to the physical-plan crate

Development

Successfully merging this pull request may close these issues.

bug: TPCH 18 query hangs

5 participants