  bucket: i32,
  bucket_path: String,
- total_buckets: Option<i32>,
+ total_buckets: i32,
Java uses a nullable Integer here to allow for a null value, but I don't think it will be null in newer versions of Paimon, so I removed the Option.
For reference: apache/paimon#5537 may be a related PR.
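If older metadata can still omit the value, one option is to keep the field as a plain i32 and reject a missing value at the point where the metadata is read. The sketch below is only an illustration of that idea: `require_total_buckets` is a hypothetical helper, not part of this PR or of the Paimon API, and the error type is a plain String for brevity.

```rust
/// Hypothetical helper (not from the PR): converts a possibly missing
/// `total_buckets` value read from metadata into a required `i32`.
/// A missing value becomes an explicit error instead of a silent default.
fn require_total_buckets(raw: Option<i32>) -> Result<i32, String> {
    raw.ok_or_else(|| "total_buckets is missing from the metadata".to_string())
}
```

With this shape, the struct field stays non-optional, and any file written by a version that leaves the value unset fails loudly at read time.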
  let base_path = table_path;
  let mut splits = Vec::new();

  for ((_partition, bucket), group_entries) in groups {
@luoyuxia Hi, for partitioned tables, _partition is dropped right after grouping, but the split is later built with BinaryRow::new(0) and "{table_path}/bucket-{bucket}". This is fine for unpartitioned tables, but for partitioned tables the split loses its partition identity, and bucket_path is missing the partition directory prefix (k=v/...). It would be better to reconstruct the partition from the grouped partition bytes, build each split with the real partition, and generate bucket_path as partition_path/bucket-{bucket}. What do you think?
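A minimal sketch of the path construction suggested above, assuming the partition bytes have already been decoded into ordered key/value pairs. The helpers `partition_path` and `bucket_path` are hypothetical names for illustration and are not the PR's actual code; decoding the grouped partition bytes into those pairs is left out.

```rust
/// Hypothetical helper: renders decoded partition values as `k1=v1/k2=v2`.
/// Returns an empty string for an unpartitioned table.
fn partition_path(partition: &[(String, String)]) -> String {
    partition
        .iter()
        .map(|(key, value)| format!("{key}={value}"))
        .collect::<Vec<_>>()
        .join("/")
}

/// Hypothetical helper: builds the bucket directory, inserting the
/// partition prefix only when the table is partitioned.
fn bucket_path(table_path: &str, partition: &[(String, String)], bucket: i32) -> String {
    // Avoid a double slash if the table path already ends with '/'.
    let base = table_path.trim_end_matches('/');
    let prefix = partition_path(partition);
    if prefix.is_empty() {
        format!("{base}/bucket-{bucket}")
    } else {
        format!("{base}/{prefix}/bucket-{bucket}")
    }
}
```

For an unpartitioned table this yields the same "{table_path}/bucket-{bucket}" as the current code, so only partitioned tables would see a different path.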
Purpose
Linked issue: close #105
Brief change log
Tests
API and Format
Documentation