[core] introduce Placeholder for Blob File Format#7889
Conversation
| * The placeholder blob, mainly for blob update in data-evolution. It should never be exposed to | ||
| * users. | ||
| */ | ||
| Blob PLACE_HOLDER = |
There was a problem hiding this comment.
This is strange, maybe just use NULL as place holder?
There was a problem hiding this comment.
Thanks for your advise! But in #7125 we supports storing nulls in blob file. I'm not clear how to distinguish placeholders and native NULLs if so.
From the semantics, NULLs are exposed to users, users know that they store some nulls. But placeholders are fully internal used, users should never be aware about them. If users set some rows as nulls, we may fallback those rows to earlier versions, this is not expected in our design.
Could you please give me some advise?
There was a problem hiding this comment.
Perhaps you can consider using row number in blob to determine how to merge? You can just return valid blobs with row number.
There was a problem hiding this comment.
The row number is actually the primary key.
There was a problem hiding this comment.
I understand that you not only need this class for reading, but also for writing. If you skip these elements, the changes will be significant.
I thin you can just introduce a BlobPlaceHolder implements Blob, Serializable for this, use instance of is better.
There was a problem hiding this comment.
Thanks! I'll modify my code!
JingsongLi
left a comment
There was a problem hiding this comment.
Review Comments for PR #7889
Overall this is a well-designed change with thorough tests. The sequence-group fallback mechanism for blob placeholders is a solid approach. A few observations:
Issues
1. Typo in comment (DataEvolutionSplitRead.java)
} else if (bunch instanceof BlobFileBunch) {
// for blob funch, fallback on placeholders→ "funch" should be "bunch"
2. Potential resource leak in BlobFallbackRecordReader.readBatch()
for (int i = 0; i < groupReaders.size(); i++) {
RecordIterator<InternalRow> iterator = groupReaders.get(i).readBatch();
if (iterator == null) {
return null; // ← iterators[0..i-1] are not released
}
iterators[i] = iterator;
}If the k-th reader returns null, the already-obtained iterators [0, k-1] will never have releaseBatch() called. Should release them before returning null.
3. Memory pressure from ForceSingleBatchReader wrapping all group readers
Each sequence group is fully materialized into memory via ForceSingleBatchReader. When there are many sequence groups with large row ranges, this could cause significant memory pressure, especially when blob-as-descriptor is disabled and actual blob data is loaded. The TODO comment acknowledges this - just want to confirm this is acceptable for the first iteration.
4. Singleton placeholder row reuse in BlobSequenceGroupRecordReader
private InternalRow placeHolderRow() {
if (placeholderRow == null) {
GenericRow row = new GenericRow(readRowType.getFieldCount());
row.setField(blobIndex, BlobPlaceholder.INSTANCE);
placeholderRow = row;
}
return placeholderRow;
}This returns the same mutable GenericRow instance for all placeholder positions. It works here because ForceSingleBatchReader copies the data, but it's fragile - a future caller that holds references to returned rows would see aliased data. Consider adding a comment noting this intentional reuse.
5. BlobFileBunch doesn't validate schemaId across files (by design?)
VectorFileBunch.add() enforces file.schemaId() == files.get(0).schemaId(), but BlobFileBunch.add() only checks writeCols. This seems intentional since blob files from different sequences naturally have different schemas, but worth confirming.
6. DataEvolutionFileReader contract relaxation
- checkArgument(readers != null && readers.length > 1, "Readers should be more than 1");
+ checkArgument(readers != null && readers.length >= 1, "should not pass empty readers.");This relaxes the precondition for all callers of DataEvolutionFileReader, not just the blob path. Is there a case where a single reader is passed from non-blob paths? If not, consider keeping > 1 for non-blob scenarios or documenting when single-reader is expected.
Minor / Style
-
In
BlobFallbackRecordReaderTest, theReadResult.add()method silently treats placeholder rows differently (counts them vs collecting rowIds). This is fine for testing but the test would be more explicit if assertions on placeholder count were always paired with total row count assertions. -
The
fixedBlobByteshelper inBlobUpdateTestallocates2 * 1024 * 124bytes (≈248KB). Was2 * 1024 * 1024(2MB) intended? Or is this intentionally smaller to keep tests fast?
Positive
- The separation of
SpecialFieldBunchintoBlobFileBunchandVectorFileBunchis a good refactoring - they have fundamentally different semantics now. - The backward-compatible version 1 reading test (
testReadLegacyVersionOneBlobFile) is a nice addition. - The
BlobSequenceGroupRecordReaderjavadoc with ASCII art is very helpful for understanding the complex layout. - Python-side changes properly mirror the Java changes with appropriate error handling for unsupported placeholder in
read_arrow_batch.
Purpose
This is the first part of #7881
Including:
a. At first, all data files will be divided according to max_seq_num
b. within each group, create a sequential reader to logically concat files and fill missing gaps. For example: If the full row range of normal files is [0, 100], but some group only have one file with range [20, 80], the output is: [0, 19] -> filled with placeholders; [20, 80] -> records from files; [81, 100] -> filled with placeholders.
c. create readers for each group, and read the blob from the max group whose value is NOT a placeholder.
The mechanism can be illustrated as below:

Tests
ITCase and Unit tests