[fix](binlog) Fix missing binlog column index when converting TTabletSchema to TabletSchemaPB #64484

Open
heguanhui wants to merge 2 commits into apache:master from heguanhui:fix/fix-binlog-be-coredump-issue

Conversation

@heguanhui heguanhui commented Jun 13, 2026

Contributor

What problem does this PR solve?

Issue Number: #64483

Problem Summary:
When running GroupRowsetWriterTest.sub_writer_rollback, a coredump occurs:
F20260613 19:13:27.347458 row_binlog_segment_writer.cpp:69] Check failed: lsn_col_id >= 0 binlog schema missing DORIS_BINLOG_LSN

Root Cause

TabletMeta::init_schema_from_thrift() does not set binlog_lsn_col_idx and binlog_timestamp_col_idx in TabletSchemaPB when converting from TTabletSchema. As a result, these fields remain -1 after deserialization, causing the CHECK failure.
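
The failure mode can be modelled in a few lines. This is a simplified sketch, not Doris code: `SchemaPB`, `convert_without_binlog_idx`, and `writer_can_init` are hypothetical stand-ins. The only facts taken from this PR are the -1 default and the `lsn_col_id >= 0` precondition.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for TabletSchemaPB: the binlog indexes default to -1.
struct SchemaPB {
    std::vector<std::string> column_names;
    int binlog_lsn_col_idx = -1;
    int binlog_timestamp_col_idx = -1;
};

// Models the pre-fix conversion: columns are copied, but the binlog
// indexes are never assigned, so they keep their -1 default.
SchemaPB convert_without_binlog_idx(const std::vector<std::string>& thrift_columns) {
    SchemaPB pb;
    pb.column_names = thrift_columns;
    return pb;
}

// Models the writer-side precondition (lsn_col_id >= 0). The real code
// aborts through CHECK; this sketch reports the result instead.
bool writer_can_init(const SchemaPB& pb) {
    return pb.binlog_lsn_col_idx >= 0;
}
```

Even when the LSN column is present in `thrift_columns`, `writer_can_init` returns false here, which corresponds to the CHECK failure quoted above.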

Solution

Add logic in init_schema_from_thrift() to:

  1. Detect binlog special columns (__DORIS_BINLOG_LSN__, __DORIS_BINLOG_TIMESTAMP__) by name
  2. Record their column indices
  3. Set the corresponding fields in TabletSchemaPB
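
The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual patch: `BinlogColumnIndexes` and `find_binlog_columns` are hypothetical names, and the schema is reduced to a list of column names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Column names as given in this PR description.
const std::string kBinlogLsnCol = "__DORIS_BINLOG_LSN__";
const std::string kBinlogTimestampCol = "__DORIS_BINLOG_TIMESTAMP__";

// Hypothetical result type; -1 means "column not present".
struct BinlogColumnIndexes {
    int32_t lsn_idx = -1;
    int32_t timestamp_idx = -1;
};

// Steps 1 and 2: detect the special columns by name and record their
// zero-based positions. Step 3 (copying into the PB) is left to the caller.
BinlogColumnIndexes find_binlog_columns(const std::vector<std::string>& column_names) {
    BinlogColumnIndexes out;
    for (std::size_t i = 0; i < column_names.size(); ++i) {
        if (column_names[i] == kBinlogLsnCol) {
            out.lsn_idx = static_cast<int32_t>(i);
        } else if (column_names[i] == kBinlogTimestampCol) {
            out.timestamp_idx = static_cast<int32_t>(i);
        }
    }
    return out;
}
```

For a schema without binlog columns both indexes stay at -1, so non-binlog tablets would be unaffected by a scan of this kind.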

Additionally, fix a nullable type mismatch for the timestamp column in _fill_binlog_columns() by using check_and_get_column instead of assert_cast.
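
The difference between an asserting cast and a checked cast can be illustrated with stand-in types. Everything below is hypothetical: `Column`, `NullableColumn`, `PlainColumn`, and `checked_column_cast` are not Doris classes, and the sketch assumes that `check_and_get_column` returns a null pointer on a type mismatch, as its name suggests.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the column class hierarchy.
struct Column {
    virtual ~Column() = default;
};
struct NullableColumn : Column {
    std::size_t rows = 0;
    void insert_many_defaults(std::size_t n) { rows += n; }
};
struct PlainColumn : Column {
    std::size_t rows = 0;
    void insert_many_defaults(std::size_t n) { rows += n; }
};

// Checked cast: yields nullptr on a type mismatch instead of aborting.
template <typename T>
T* checked_column_cast(Column* col) {
    return dynamic_cast<T*>(col);
}

// Appends n placeholder rows whether or not the column is nullable.
// Returns true if the nullable path was taken.
bool append_placeholders(Column* col, std::size_t n) {
    if (auto* nullable = checked_column_cast<NullableColumn>(col)) {
        nullable->insert_many_defaults(n);
        return true;
    }
    if (auto* plain = checked_column_cast<PlainColumn>(col)) {
        plain->insert_many_defaults(n);
    }
    return false;
}
```

With a checked cast the caller decides what a mismatch means, whereas an asserting cast turns the mismatch into a crash.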

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@heguanhui

Contributor Author

/review

@heguanhui heguanhui force-pushed the fix/fix-binlog-be-coredump-issue branch from f707b1d to ca44c72 Compare June 13, 2026 20:06
@heguanhui

Contributor Author

run buildall

@hello-stephen

Contributor
TPC-H: Total hot run time: 29004 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ca44c726c9401866066b89731eb6c9af0f488c6a, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17692	4004	3978	3978
q2	q3	10767	1338	812	812
q4	4684	465	344	344
q5	7535	875	571	571
q6	184	170	137	137
q7	776	844	620	620
q8	9485	1613	1622	1613
q9	7218	4524	4477	4477
q10	6825	1813	1513	1513
q11	432	265	245	245
q12	635	420	290	290
q13	18094	3412	2797	2797
q14	268	259	242	242
q15	q16	820	778	708	708
q17	1165	1048	818	818
q18	6662	5677	5549	5549
q19	1320	1271	1065	1065
q20	498	401	266	266
q21	5958	2808	2640	2640
q22	462	379	319	319
Total cold run time: 101480 ms
Total hot run time: 29004 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4793	4742	4705	4705
q2	q3	5314	5215	4586	4586
q4	2114	2179	1364	1364
q5	4884	4923	4759	4759
q6	237	178	135	135
q7	1884	1751	1563	1563
q8	2385	2017	1945	1945
q9	7321	7370	7341	7341
q10	5047	4652	4229	4229
q11	545	382	352	352
q12	731	731	527	527
q13	3007	3394	2795	2795
q14	274	293	266	266
q15	q16	685	700	611	611
q17	1279	1265	1250	1250
q18	7390	6830	6761	6761
q19	1140	1075	1099	1075
q20	2223	2219	1972	1972
q21	5295	4580	4437	4437
q22	554	464	399	399
Total cold run time: 57102 ms
Total hot run time: 51072 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 168871 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ca44c726c9401866066b89731eb6c9af0f488c6a, data reload: false

query5	4318	619	495	495
query6	433	192	173	173
query7	4870	574	302	302
query8	362	216	201	201
query9	8751	4051	4061	4051
query10	452	313	251	251
query11	5969	2367	2131	2131
query12	152	101	100	100
query13	1304	589	423	423
query14	6341	5362	5085	5085
query14_1	4391	4424	4419	4419
query15	202	196	182	182
query16	1022	458	418	418
query17	1094	672	574	574
query18	2415	468	338	338
query19	194	179	137	137
query20	110	107	106	106
query21	208	143	117	117
query22	13715	13650	13486	13486
query23	17196	16579	16161	16161
query23_1	16255	16232	16286	16232
query24	7646	1798	1303	1303
query24_1	1326	1324	1320	1320
query25	580	480	392	392
query26	1323	321	166	166
query27	2656	560	335	335
query28	4484	2053	2023	2023
query29	1130	651	501	501
query30	314	231	200	200
query31	1114	1074	961	961
query32	118	65	61	61
query33	525	331	266	266
query34	1187	1159	641	641
query35	760	820	674	674
query36	1400	1398	1249	1249
query37	155	113	101	101
query38	3174	3164	3041	3041
query39	953	913	895	895
query39_1	896	869	860	860
query40	222	128	108	108
query41	72	67	67	67
query42	100	98	134	98
query43	328	335	294	294
query44	
query45	194	187	179	179
query46	1058	1231	778	778
query47	2321	2309	2261	2261
query48	402	416	293	293
query49	637	492	348	348
query50	1012	358	266	266
query51	4379	4294	4199	4199
query52	87	88	78	78
query53	239	268	187	187
query54	264	222	196	196
query55	77	74	76	74
query56	228	214	218	214
query57	1423	1396	1325	1325
query58	236	209	210	209
query59	1572	1629	1393	1393
query60	277	244	225	225
query61	149	146	148	146
query62	701	648	579	579
query63	238	185	188	185
query64	2549	791	602	602
query65	
query66	1818	468	338	338
query67	29660	29646	29491	29491
query68	
query69	428	297	265	265
query70	1001	972	968	968
query71	286	221	241	221
query72	2902	2627	2373	2373
query73	871	744	427	427
query74	5111	4978	4779	4779
query75	2664	2597	2242	2242
query76	2319	1167	799	799
query77	353	376	300	300
query78	12261	12359	11934	11934
query79	1462	1052	791	791
query80	720	477	408	408
query81	468	280	236	236
query82	580	162	124	124
query83	340	281	253	253
query84	
query85	899	510	409	409
query86	407	303	274	274
query87	3390	3342	3237	3237
query88	3697	2752	2777	2752
query89	427	387	326	326
query90	1807	186	180	180
query91	175	161	134	134
query92	66	60	56	56
query93	1538	1445	883	883
query94	619	344	273	273
query95	681	372	444	372
query96	1019	826	335	335
query97	2708	2673	2623	2623
query98	210	205	205	205
query99	1126	1192	1019	1019
Total cold run time: 250510 ms
Total hot run time: 168871 ms

@hello-stephen

Contributor

BE UT Coverage Report

Increment line coverage 90.00% (18/20) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 54.36% (21306/39196)
Line Coverage 37.96% (203358/535659)
Region Coverage 33.99% (159594/469596)
Branch Coverage 34.97% (69788/199561)

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 90.00% (18/20) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 74.07% (28348/38274)
Line Coverage 58.06% (309172/532521)
Region Coverage 54.96% (259295/471748)
Branch Coverage 56.29% (112479/199820)

@heguanhui heguanhui force-pushed the fix/fix-binlog-be-coredump-issue branch 3 times, most recently from efa3cc3 to 4ef5c5d Compare June 14, 2026 03:53
@heguanhui

Contributor Author

/review

@heguanhui

Contributor Author

run buildall

@hello-stephen

Contributor
TPC-H: Total hot run time: 29303 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4ef5c5d19580d58d190a11b169b2f1c6f373c72f, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17920	4208	4118	4118
q2	q3	10869	1354	796	796
q4	4686	486	337	337
q5	7582	878	577	577
q6	177	173	137	137
q7	763	856	618	618
q8	9585	1597	1656	1597
q9	6289	4516	4498	4498
q10	6850	1810	1480	1480
q11	443	275	245	245
q12	646	422	301	301
q13	18158	3458	2799	2799
q14	261	258	247	247
q15	q16	826	769	709	709
q17	923	978	976	976
q18	6792	5698	5551	5551
q19	1338	1298	1148	1148
q20	527	406	259	259
q21	6109	2872	2594	2594
q22	449	379	316	316
Total cold run time: 101193 ms
Total hot run time: 29303 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5085	4923	4800	4800
q2	q3	4980	5223	4659	4659
q4	2118	2207	1404	1404
q5	4772	4746	4832	4746
q6	234	179	130	130
q7	1848	1761	1597	1597
q8	2377	1965	1992	1965
q9	7427	7491	7451	7451
q10	4727	4672	4242	4242
q11	529	383	353	353
q12	735	743	534	534
q13	3004	3432	2797	2797
q14	278	284	246	246
q15	q16	676	697	603	603
q17	1282	1268	1264	1264
q18	7406	6764	6900	6764
q19	1134	1100	1124	1100
q20	2218	2214	1952	1952
q21	5256	4606	4492	4492
q22	524	451	424	424
Total cold run time: 56610 ms
Total hot run time: 51523 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 169441 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4ef5c5d19580d58d190a11b169b2f1c6f373c72f, data reload: false

query5	4334	636	510	510
query6	463	192	176	176
query7	4835	547	307	307
query8	365	218	206	206
query9	8749	4049	4020	4020
query10	444	310	254	254
query11	5931	2368	2121	2121
query12	175	99	94	94
query13	1232	628	417	417
query14	6351	5441	5052	5052
query14_1	4408	4423	4397	4397
query15	203	197	179	179
query16	1003	454	439	439
query17	1124	719	597	597
query18	2596	462	349	349
query19	202	195	146	146
query20	110	112	108	108
query21	217	148	125	125
query22	13617	13574	13323	13323
query23	17392	16463	16233	16233
query23_1	16230	16343	16335	16335
query24	7568	1830	1350	1350
query24_1	1344	1319	1345	1319
query25	577	484	400	400
query26	1323	335	176	176
query27	2630	596	350	350
query28	4434	2029	2025	2025
query29	1094	630	504	504
query30	313	242	203	203
query31	1119	1085	974	974
query32	104	60	55	55
query33	516	322	251	251
query34	1176	1157	656	656
query35	756	848	714	714
query36	1396	1363	1198	1198
query37	157	109	91	91
query38	3229	3246	3074	3074
query39	961	960	946	946
query39_1	904	937	917	917
query40	222	136	113	113
query41	66	75	74	74
query42	103	101	96	96
query43	327	325	286	286
query44	
query45	203	194	187	187
query46	1169	1264	749	749
query47	2323	2409	2236	2236
query48	394	408	282	282
query49	617	458	368	368
query50	1023	361	258	258
query51	4353	4344	4251	4251
query52	88	90	76	76
query53	246	271	187	187
query54	281	217	193	193
query55	79	76	70	70
query56	232	232	207	207
query57	1423	1373	1303	1303
query58	246	220	212	212
query59	1603	1706	1423	1423
query60	276	264	224	224
query61	166	146	144	144
query62	715	648	602	602
query63	239	187	184	184
query64	2562	750	604	604
query65	
query66	1759	481	333	333
query67	29878	29838	29551	29551
query68	
query69	423	299	262	262
query70	982	954	974	954
query71	292	232	203	203
query72	2921	2600	2343	2343
query73	884	786	444	444
query74	5123	4961	4766	4766
query75	2723	2613	2228	2228
query76	2287	1228	803	803
query77	351	392	291	291
query78	12237	12512	11882	11882
query79	1485	1085	768	768
query80	942	488	401	401
query81	506	280	237	237
query82	571	158	121	121
query83	337	269	256	256
query84	
query85	917	495	414	414
query86	408	299	267	267
query87	3389	3462	3199	3199
query88	3667	2741	2714	2714
query89	431	381	328	328
query90	1779	185	186	185
query91	175	181	132	132
query92	64	63	59	59
query93	1619	1479	959	959
query94	614	337	304	304
query95	687	376	359	359
query96	1039	839	365	365
query97	2694	2730	2577	2577
query98	209	205	198	198
query99	1144	1178	1044	1044
Total cold run time: 251450 ms
Total hot run time: 169441 ms

@heguanhui

Contributor Author

run p0 regression

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 77.78% (28/36) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 74.08% (28354/38274)
Line Coverage 58.07% (309232/532537)
Region Coverage 54.89% (258947/471762)
Branch Coverage 56.26% (112434/199832)

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 77.78% (28/36) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 74.13% (28371/38274)
Line Coverage 58.08% (309314/532537)
Region Coverage 54.89% (258967/471762)
Branch Coverage 56.27% (112453/199832)

@heguanhui heguanhui force-pushed the fix/fix-binlog-be-coredump-issue branch from 4ef5c5d to c955a8e Compare June 14, 2026 17:57
@heguanhui

Contributor Author

/review

@heguanhui heguanhui force-pushed the fix/fix-binlog-be-coredump-issue branch from c955a8e to 3e8f036 Compare June 14, 2026 18:00
@heguanhui

Contributor Author

run buildall

@heguanhui

Contributor Author

/review

@heguanhui

Contributor Author

review

@csun5285

Contributor

/review

Comment thread be/src/storage/tablet/tablet_schema.cpp Outdated

@github-actions github-actions Bot left a comment

Contributor

Review result: no blocking issues found.

Critical checkpoint conclusions:

  • Goal and proof: The PR addresses the missing row-binlog LSN/timestamp column indexes when converting TTabletSchema to TabletSchemaPB, and adds BE unit coverage for both new conversion and old-metadata fallback paths. Existing GroupRowsetWriterTest coverage should exercise the original row-binlog writer crash path once these metadata fields resolve correctly.
  • Scope: The code change is small and focused on row-binlog schema metadata and the row-binlog timestamp placeholder write path.
  • Concurrency/lifecycle: No new shared mutable state, locks, threads, or non-trivial lifetime ownership are introduced.
  • Configuration/compatibility: No new config items. The TabletSchemaPB fallback by column name preserves compatibility with older metadata where these optional indexes remained -1.
  • Parallel paths: TabletSchema serialization already persists the resolved indexes; cloud PB conversion copies the fields when present, and the fallback also helps deserialized old PBs.
  • Conditional checks: The added timestamp nullable check has a concrete path from existing BE test-created row-binlog schemas while production FE schemas keep timestamp nullable; read-time TSO replacement already handles both nullable and non-nullable columns.
  • Test coverage: Added BE unit tests cover conversion and fallback. I did not run BE unit tests because no built output/test binary was present in this review checkout.
  • Test results: passed. No generated result files are involved.
  • Observability: No new observability appears necessary for this metadata fix.
  • Transaction/persistence/data correctness: Persisted row-binlog schema metadata now carries the indexes needed by RowBinlogSegmentWriter and row-binlog readers; old metadata remains recoverable by name lookup.
  • Performance: Only small schema-construction scans are added, outside hot row processing paths.
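
The compatibility fallback described in the review (old metadata whose optional indexes are still -1) could look roughly like this. It is a sketch under assumptions: `resolve_col_idx` is a hypothetical helper, and the schema is reduced to a list of column names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: prefer the persisted index. If it is still -1
// (for example, metadata written before the fix), fall back to scanning
// the column names.
int32_t resolve_col_idx(int32_t persisted_idx,
                        const std::vector<std::string>& column_names,
                        const std::string& special_name) {
    if (persisted_idx >= 0) {
        return persisted_idx;
    }
    for (std::size_t i = 0; i < column_names.size(); ++i) {
        if (column_names[i] == special_name) {
            return static_cast<int32_t>(i);
        }
    }
    return -1; // the column is genuinely absent
}
```

The scan only runs when the persisted value is missing, which matches the review's note that the added work happens at schema construction and not on the row-processing path.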

User focus: no additional user-provided review focus was present.

 IColumn* ts_col_ptr = binlog_prefix_columns[2].get();
-auto* ts_nullable_column = assert_cast<ColumnNullable*>(ts_col_ptr);
-ts_nullable_column->insert_many_defaults(num_rows); // NULL placeholder (value + null map)
+auto* ts_nullable_column = check_and_get_column<ColumnNullable>(ts_col_ptr);

@TsukiokaKogane TsukiokaKogane Jun 15, 2026

Contributor

not necessary, for row binlog writer ts_col_ptr should always exist

Contributor Author

Thanks for the review.

You're right that the timestamp column should always be nullable by design. However, while testing after fixing the binlog column index issue, I still hit a coredump at this assert_cast in GroupRowsetWriterTest.sub_writer_rollback.

The root cause is that in the unit test environment, the timestamp column is not wrapped as ColumnNullable when creating the Block via create_block_by_cids. This is because the test uses a simplified schema creation path that doesn't go through the full FE->BE schema conversion.

I agree this should be investigated separately. But to make the current PR clean and focused on the binlog column index issue, I will split this into two PRs:

  • PR1 (this PR): Fix binlog column index in init_schema_from_thrift
  • PR2 (follow-up): Investigate and fix the nullable type mismatch for timestamp column in unit tests

I'll revert the change in row_binlog_segment_writer.cpp from this PR and keep only the binlog column index fix. The timestamp column issue will be addressed in a separate PR after further investigation.

Thank you for pointing this out!

Contributor Author

not necessary, for row binlog writer ts_col_ptr should always exist

This is fixed in PR #64495.

Comment thread be/src/storage/utils.h
static const std::string SEQUENCE_COL = "__DORIS_SEQUENCE_COL__";
static const std::string BINLOG_TIMESTAMP_COL = "__DORIS_BINLOG_TIMESTAMP__";
static const std::string BINLOG_LSN_COL = "__DORIS_BINLOG_LSN__";
static const std::string BINLOG_OP_COL = "__DORIS_BINLOG_OP__";

Contributor

why add this?

Contributor Author

This was added for the fallback test. The fallback test is being removed, so I will remove this as well.

@heguanhui heguanhui force-pushed the fix/fix-binlog-be-coredump-issue branch from 3e8f036 to dd75689 Compare June 15, 2026 02:33

tablet_schema_pb->set_num_short_key_columns(tablet_schema.short_key_column_count);
tablet_schema_pb->set_num_rows_per_row_block(config::default_num_rows_per_column_file_block);
tablet_schema_pb->set_sequence_col_idx(tablet_schema.sequence_col_idx);

Contributor

Could this follow the sequence_col pattern, i.e. call tablet_schema_pb->set_binlog_lsn_col_idx
and tablet_schema_pb->set_binlog_timestamp_col_idx with values taken from tablet_schema?

Contributor Author

Thanks for the suggestion. However, TTabletSchema currently doesn't have
binlog_lsn_col_idx and binlog_timestamp_col_idx fields. Adding them would
require modifying Thrift definitions and FE code, which is a larger change.

This PR focuses on fixing the coredump in BE UT with minimal changes.
Computing the indices by column name is consistent with how other special
columns (like delete_sign_idx) are handled when TTabletSchema doesn't
provide them directly.

If you prefer, I can create a follow-up PR to add these fields to
TTabletSchema and pass them from FE to BE. But for this PR, the dynamic
approach works and is safe.

What do you think?

Contributor

ok, just leave it for now

Contributor Author

Thanks for the approval!

col_ordinal_to_unique_id[i] = i;
}

// 构造 SchemaCreateOptions

Contributor

Please don't use Chinese in comments.

Contributor Author

All comments have been changed to English.

@heguanhui heguanhui force-pushed the fix/fix-binlog-be-coredump-issue branch from dd75689 to 338a31a Compare June 15, 2026 03:02
@heguanhui

Contributor Author

run buildall

@heguanhui

Contributor Author

/review

@hello-stephen

Contributor
TPC-H: Total hot run time: 29031 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 338a31a08725b0c946f6ba6a7ceb1715f87640f5, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17591	4035	3974	3974
q2	q3	10820	1350	785	785
q4	4679	472	338	338
q5	7552	885	574	574
q6	181	170	137	137
q7	767	868	633	633
q8	9332	1628	1598	1598
q9	5767	4462	4453	4453
q10	6767	1808	1532	1532
q11	430	265	247	247
q12	620	420	289	289
q13	18111	3367	2795	2795
q14	262	255	236	236
q15	q16	816	772	707	707
q17	978	891	888	888
q18	6873	5855	5730	5730
q19	1376	1265	1045	1045
q20	491	383	255	255
q21	6042	2521	2528	2521
q22	425	362	294	294
Total cold run time: 99880 ms
Total hot run time: 29031 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4318	4237	4212	4212
q2	q3	4499	4937	4312	4312
q4	2098	2186	1390	1390
q5	4447	4286	4299	4286
q6	230	173	126	126
q7	1696	1665	1468	1468
q8	2910	2217	2128	2128
q9	8178	8163	8116	8116
q10	4824	4784	4280	4280
q11	564	412	394	394
q12	732	756	540	540
q13	3282	3682	2951	2951
q14	320	314	261	261
q15	q16	700	771	651	651
q17	1311	1310	1313	1310
q18	7878	7245	7223	7223
q19	1188	1135	1132	1132
q20	2203	2192	1943	1943
q21	5263	4539	4436	4436
q22	492	444	445	444
Total cold run time: 57133 ms
Total hot run time: 51603 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 168703 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 338a31a08725b0c946f6ba6a7ceb1715f87640f5, data reload: false

query5	4300	638	495	495
query6	436	190	168	168
query7	4864	565	294	294
query8	355	202	217	202
query9	8744	4009	3995	3995
query10	430	312	254	254
query11	5871	2360	2132	2132
query12	153	101	103	101
query13	1257	609	405	405
query14	6406	5369	5073	5073
query14_1	4380	4414	4397	4397
query15	207	205	178	178
query16	986	457	432	432
query17	1108	706	568	568
query18	2445	477	347	347
query19	199	191	166	166
query20	113	109	105	105
query21	220	138	115	115
query22	13568	13616	13383	13383
query23	17293	16549	16303	16303
query23_1	16319	16348	16209	16209
query24	7509	1776	1317	1317
query24_1	1343	1292	1294	1292
query25	556	466	391	391
query26	1317	301	168	168
query27	2735	580	335	335
query28	4448	2022	2021	2021
query29	1082	626	499	499
query30	307	245	203	203
query31	1125	1093	953	953
query32	107	62	62	62
query33	552	358	251	251
query34	1149	1210	643	643
query35	745	767	678	678
query36	1407	1421	1251	1251
query37	152	99	91	91
query38	3197	3139	3043	3043
query39	923	916	935	916
query39_1	868	857	870	857
query40	213	121	96	96
query41	63	62	59	59
query42	93	95	91	91
query43	312	318	271	271
query44	
query45	193	182	171	171
query46	1067	1176	726	726
query47	2411	2327	2266	2266
query48	381	405	308	308
query49	618	450	347	347
query50	945	344	248	248
query51	4348	4274	4260	4260
query52	85	87	77	77
query53	237	265	190	190
query54	262	221	201	201
query55	76	74	71	71
query56	227	225	223	223
query57	1445	1408	1321	1321
query58	239	203	211	203
query59	1541	1612	1391	1391
query60	291	244	226	226
query61	155	167	170	167
query62	703	662	577	577
query63	225	186	176	176
query64	2528	750	600	600
query65	
query66	1823	455	336	336
query67	29715	29657	29428	29428
query68	
query69	426	305	257	257
query70	984	925	951	925
query71	301	229	204	204
query72	2863	2620	2371	2371
query73	808	738	435	435
query74	5104	4921	4755	4755
query75	2621	2553	2260	2260
query76	2310	1195	759	759
query77	350	383	292	292
query78	12374	12407	11845	11845
query79	1417	1049	773	773
query80	578	461	380	380
query81	451	279	239	239
query82	582	156	118	118
query83	355	279	244	244
query84	
query85	844	495	405	405
query86	375	279	292	279
query87	3356	3337	3152	3152
query88	3616	2681	2722	2681
query89	419	369	340	340
query90	1992	174	171	171
query91	171	150	127	127
query92	60	59	56	56
query93	1551	1492	872	872
query94	549	333	299	299
query95	661	465	327	327
query96	1038	754	372	372
query97	2709	2687	2559	2559
query98	212	216	206	206
query99	1150	1174	1027	1027
Total cold run time: 249962 ms
Total hot run time: 168703 ms

@heguanhui

Contributor Author

run p0 regression

@heguanhui

Contributor Author

run external regression

@TsukiokaKogane

Contributor

row_binlog_segment_writer.cpp:334 needs to be fixed to handle a non-nullable column.

Comment thread be/src/storage/tablet/tablet_meta.cpp Outdated
}
col_ordinal++;
init_column_from_tcolumn(unique_id, tcolumn, column);
if (tcolumn.column_name == BINLOG_LSN_COL) {

@TsukiokaKogane TsukiokaKogane Jun 15, 2026

Contributor

-        col_ordinal++;
         init_column_from_tcolumn(unique_id, tcolumn, column);

         if (column->is_bf_column()) {
             has_bf_columns = true;
         }

+        if (column->name() == BINLOG_LSN_COL) {
+            tablet_schema_pb->set_binlog_lsn_col_idx(col_ordinal);
+        } else if (column->name() == BINLOG_TIMESTAMP_COL) {
+            tablet_schema_pb->set_binlog_timestamp_col_idx(col_ordinal);
+        }
+        col_ordinal++;
+

I think it's better this way.
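
The ordering in the suggested diff matters because `col_ordinal` doubles as the zero-based column position. The sketch below contrasts the two orderings with hypothetical helper functions. The source does not show which value the earlier revision stored, so the second function only illustrates what would happen if the ordinal were recorded after the increment without adjustment.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Records the ordinal BEFORE incrementing, as in the suggested diff.
int32_t ordinal_recorded_before_increment(const std::vector<std::string>& names,
                                          const std::string& target) {
    int32_t col_ordinal = 0;
    int32_t found = -1;
    for (const auto& name : names) {
        if (name == target) {
            found = col_ordinal;
        }
        col_ordinal++;
    }
    return found;
}

// Increments first, then records: the stored value is one past the
// column's zero-based position.
int32_t ordinal_recorded_after_increment(const std::vector<std::string>& names,
                                         const std::string& target) {
    int32_t col_ordinal = 0;
    int32_t found = -1;
    for (const auto& name : names) {
        col_ordinal++;
        if (name == target) {
            found = col_ordinal;
        }
    }
    return found;
}
```

Keeping the increment as the last statement of the loop body means any code added earlier in the body sees the current column's own position.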

Contributor Author

Thanks! Updated as suggested. The timestamp nullable cast fix will be in a separate PR.

@TsukiokaKogane TsukiokaKogane Jun 15, 2026

Contributor

why fix it in another PR? it won't pass be ut without fixing timestamp nullable cast

Contributor Author

You're right. I'll combine both fixes into this PR:

  1. Fix binlog column index in init_schema_from_thrift
  2. Fix test helper to set nullable for binlog special columns

This way the UT will pass without modifying row_binlog_segment_writer.cpp.

Contributor Author

why fix it in another PR? it won't pass be ut without fixing timestamp nullable cast

Thanks for the suggestion. I've merged the test helper fix into this PR:

  • Added BINLOG_OP_COL to utils.h
  • Set is_allow_null = true for binlog special columns in enable_row_binlog()

Now this PR includes both fixes:

  1. Fix binlog column index in init_schema_from_thrift()
  2. Fix test helper to create nullable binlog columns

The UT should pass without modifying row_binlog_segment_writer.cpp.
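
As a rough illustration of the test-helper change, the sketch below appends the three binlog special columns with nullability enabled. `TestColumnDesc` and `append_nullable_binlog_columns` are hypothetical stand-ins for the real `enable_row_binlog()` helper, and the column order here is arbitrary and not taken from the source.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical stand-in for a test-side column description.
struct TestColumnDesc {
    std::string name;
    bool is_allow_null = false;
};

// Hypothetical analogue of the test helper: appends the binlog special
// columns, each marked nullable so a writer that expects a nullable
// column receives one.
void append_nullable_binlog_columns(std::vector<TestColumnDesc>& columns) {
    for (const char* name : {"__DORIS_BINLOG_TIMESTAMP__", "__DORIS_BINLOG_LSN__",
                             "__DORIS_BINLOG_OP__"}) {
        TestColumnDesc col;
        col.name = name;
        col.is_allow_null = true;
        columns.push_back(col);
    }
}
```

Ordinary columns keep their own nullability; only the appended special columns are forced to nullable.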

PTAL. Thanks!

@heguanhui heguanhui force-pushed the fix/fix-binlog-be-coredump-issue branch from 338a31a to a672379 Compare June 15, 2026 08:53
@heguanhui heguanhui requested a review from TsukiokaKogane June 15, 2026 08:54
@heguanhui heguanhui force-pushed the fix/fix-binlog-be-coredump-issue branch from a672379 to 7e2c20b Compare June 15, 2026 08:57
@heguanhui

Contributor Author

run buildall

@hello-stephen

Contributor
TPC-H: Total hot run time: 28990 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7e2c20be557b4c35e4da8eec39bc180b01cf96f4, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17711	4070	4094	4070
q2	q3	10735	1438	804	804
q4	4681	470	339	339
q5	7569	858	593	593
q6	188	169	134	134
q7	801	836	621	621
q8	9325	1626	1565	1565
q9	5711	4485	4542	4485
q10	6808	1835	1564	1564
q11	433	274	244	244
q12	631	437	284	284
q13	18128	3425	2756	2756
q14	264	259	246	246
q15	q16	817	766	715	715
q17	975	966	976	966
q18	7386	5821	5606	5606
q19	1193	1265	1092	1092
q20	525	399	262	262
q21	5577	2608	2344	2344
q22	443	358	300	300
Total cold run time: 99901 ms
Total hot run time: 28990 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4361	4285	4259	4259
q2	q3	4487	4973	4364	4364
q4	2064	2212	1380	1380
q5	4501	4312	4303	4303
q6	232	175	129	129
q7	1714	1623	1732	1623
q8	2713	2239	2148	2148
q9	8158	8079	8091	8079
q10	4807	4908	4333	4333
q11	581	418	399	399
q12	752	746	528	528
q13	3338	3645	2948	2948
q14	308	322	269	269
q15	q16	712	732	671	671
q17	1374	1336	1370	1336
q18	8092	7227	7290	7227
q19	1205	1155	1101	1101
q20	2234	2218	1954	1954
q21	5275	4557	4458	4458
q22	531	443	419	419
Total cold run time: 57439 ms
Total hot run time: 51928 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 169858 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7e2c20be557b4c35e4da8eec39bc180b01cf96f4, data reload: false

query5	4315	619	477	477
query6	446	194	183	183
query7	4834	568	291	291
query8	350	211	202	202
query9	8758	4069	4109	4069
query10	445	304	255	255
query11	5891	2359	2201	2201
query12	159	102	98	98
query13	1272	612	420	420
query14	6401	5436	5107	5107
query14_1	4451	4373	4407	4373
query15	206	197	181	181
query16	1043	454	408	408
query17	1123	711	577	577
query18	2629	486	353	353
query19	199	190	146	146
query20	118	110	107	107
query21	216	139	119	119
query22	13618	13656	13419	13419
query23	17388	16514	16110	16110
query23_1	16366	16362	16379	16362
query24	7696	1857	1305	1305
query24_1	1319	1323	1335	1323
query25	556	465	417	417
query26	1306	314	172	172
query27	2672	559	340	340
query28	4443	2051	2043	2043
query29	1069	613	507	507
query30	310	252	192	192
query31	1135	1093	955	955
query32	105	63	57	57
query33	532	356	245	245
query34	1174	1130	667	667
query35	764	778	668	668
query36	1367	1414	1272	1272
query37	148	101	88	88
query38	3200	3143	3062	3062
query39	936	919	898	898
query39_1	862	862	888	862
query40	214	122	100	100
query41	70	78	69	69
query42	95	95	93	93
query43	322	326	278	278
query44	
query45	198	184	181	181
query46	1124	1196	762	762
query47	2339	2334	2218	2218
query48	392	395	304	304
query49	624	459	361	361
query50	1032	349	259	259
query51	4334	4278	4306	4278
query52	86	89	79	79
query53	261	276	191	191
query54	267	219	194	194
query55	79	77	72	72
query56	237	216	216	216
query57	1429	1375	1312	1312
query58	239	212	208	208
query59	1598	1663	1474	1474
query60	285	246	235	235
query61	154	147	150	147
query62	699	654	591	591
query63	232	187	188	187
query64	2477	762	606	606
query65	
query66	1745	453	338	338
query67	29762	29681	29017	29017
query68	
query69	412	300	265	265
query70	979	960	971	960
query71	333	217	212	212
query72	3001	2607	2394	2394
query73	845	790	420	420
query74	5110	4957	4776	4776
query75	2649	2590	2238	2238
query76	2365	1144	777	777
query77	368	383	284	284
query78	12326	12579	11918	11918
query79	1430	1004	770	770
query80	602	475	416	416
query81	453	277	237	237
query82	581	161	123	123
query83	352	275	254	254
query84	
query85	858	507	427	427
query86	384	306	285	285
query87	3403	3394	3205	3205
query88	3639	2776	2720	2720
query89	419	384	328	328
query90	1974	186	174	174
query91	171	159	133	133
query92	63	63	58	58
query93	1510	1432	928	928
query94	547	364	314	314
query95	703	401	434	401
query96	1077	809	337	337
query97	2726	2685	2559	2559
query98	217	204	199	199
query99	1168	1175	1030	1030
Total cold run time: 251313 ms
Total hot run time: 169858 ms

@heguanhui

Contributor Author

run buildall



5 participants