[core] Support partition predicate pushdown for PartitionsTable#7628
[core] Support partition predicate pushdown for PartitionsTable#7628sundapeng wants to merge 4 commits intoapache:masterfrom
Conversation
PartitionsTable.withFilter() was a no-op (TODO), causing full manifest scans when querying with partition filters. This adds predicate pushdown following the same pattern as BucketsTable (apache#7592) and FilesTable (apache#7376). Key changes: - PartitionsScan extracts partition predicate via LeafPredicateExtractor - PartitionsSplit carries the predicate to PartitionsRead - Catalog path: in-memory filter preserving metadata columns - TableScan path: manifest-level pushdown via withPartitionFilter - PartitionPredicateHelper refactored to build+apply two-step pattern - parsePartitionSpec extended for key=value/key=value format Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…redicate filterByPredicate used raw p.spec().get(key) which renders null as literal "null", while toRow substitutes null with defaultPartitionName. This caused predicate pushdown to fail matching null-valued partitions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
testPartitionPredicateFilterMultiColumnKeys created MultiPartTable directly via filesystem (SchemaUtils.forceCommit), which works for local catalog but fails for REST catalog since it's unaware of tables created outside its API. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| Object value = | ||
| TypeUtils.castFromString( | ||
| partSpec.get(partitionKeys.get(i)), partitionType.getTypeAt(i)); | ||
| predicates.add(partBuilder.equal(i, value)); |
There was a problem hiding this comment.
We should consider default partitions here. If the value is the default partition name, this should become isNull(...) rather than equal(...).
E2E Benchmark v2: 3 Rounds × 9 Samples (with bug fixes)Environment
Bug Fixes in Latest Push
Catalog Path (DLF REST API)
Filtered avg at 10K: +8.6% Scan Path (Time-Travel
|
| Partitions | Query | Before (ms) | After (ms) | Speedup |
|---|---|---|---|---|
| 100 | no_filter | 223 | 223 | 1.00x |
| 100 | equal_filter | 242 | 239 | 1.01x |
| 100 | in_filter | 207 | 222 | 0.93x |
| 1,000 | no_filter | 206 | 208 | 0.99x |
| 1,000 | equal_filter | 167 | 156 | 1.07x |
| 1,000 | in_filter | 168 | 161 | 1.04x |
| 5,000 | no_filter | 235 | 256 | 0.92x |
| 5,000 | equal_filter | 179 | 154 | 1.16x |
| 5,000 | in_filter | 193 | 152 | 1.27x |
| 10,000 | no_filter | 240 | 234 | 1.02x |
| 10,000 | equal_filter | 199 | 142 | 1.40x |
| 10,000 | in_filter | 201 | 156 | 1.29x |
Filtered avg at 10K: +25.5%
Summary
| Path | Best Single Speedup | 10K Filtered Avg |
|---|---|---|
| Catalog (DLF) | 1.18x (1K in_filter) | +8.6% |
| Scan (time-travel) | 1.40x (10K equal_filter) | +25.5% |
No regression in unfiltered baselines. Improvement scales with partition count.
| buildPartitionPredicate( | ||
| partitionPredicate, partitionKeys, partitionType, defaultPartitionName); | ||
| if (predicate == null) { | ||
| return false; |
There was a problem hiding this comment.
false? No record will be returned? {@code null} if the predicate cannot be pushed down (unsupported predicate type or invalid partition spec.
There was a problem hiding this comment.
Good catch. The issue is that buildPartitionPredicate returned null for two different reasons:
- Unsupported predicate type (e.g.,
isNotNull) → should skip pushdown, return all data - Invalid partition spec (e.g., partial key
[2]on a 2-column partition table) → no match, return empty is correct
Simply changing return false to return true would break FilesTableTest.testReadWithNotFullPartitionKey (case 2).
Fixed by using @Nullable Predicate with clear semantics: null means unsupported (skip pushdown), PredicateBuilder.alwaysFalse() means no match (return empty), and a normal predicate means apply pushdown. This reuses the existing AlwaysFalse predicate instead of introducing a wrapper class.
Added testBucketsTableUnsupportedPredicateFallsBackToFullScan to cover the regression.
4500b2c to
3f41e81
Compare
3f41e81 to
7213570
Compare
…correctness - Handle __DEFAULT_PARTITION__ in buildPartitionPredicate() by generating isNull() instead of castFromString() which throws NumberFormatException on non-string partition types (e.g. INT) - Fix scan path to skip pushdown for unsupported predicates instead of returning empty results - Pass defaultPartitionName through all system table callers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
7213570 to
a430947
Compare
Purpose
PartitionsTable.withFilter()was a no-op (// TODOat line 189), causing full manifest scans when querying with partition filters likeSELECT * FROM t$partitions WHERE partition = 'dt=20260410'. This adds partition predicate pushdown following the same pattern established by BucketsTable (#7592), FilesTable (#7376), ManifestsTable (#7310), and ConsumersTable (#7329).The implementation uses a dual-path filtering strategy:
catalog.listPartitions()call and filters results in memory, keeping metadata columns (created_at,created_by,updated_by,options) intactInnerTableScan.withPartitionFilter()for manifest-level pruningAlso refactors
PartitionPredicateHelper.applyPartitionFilter()into a two-step build+apply pattern (buildPartitionPredicate()+ apply), and extendsparsePartitionSpec()to support PartitionsTable'skey=value/key=valueformat in addition to the existing{value1, value2}format.Tests
BucketsTableTestpasses (verifies refactored helper is backward-compatible)API and Format
No API or storage format changes.
Documentation
No documentation changes needed.