Skip to content

Fix SYN-ACK retransmission bug in accept() causing CI timeouts #43

Merged
danielinux merged 3 commits intowolfSSL:masterfrom
aidangarske:fix-wolfip-ci
Feb 24, 2026
Merged

Fix SYN-ACK retransmission bug in accept() causing CI timeouts #43
danielinux merged 3 commits intowolfSSL:masterfrom
aidangarske:fix-wolfip-ci

Conversation

@aidangarske
Copy link
Copy Markdown
Member

@aidangarske aidangarske commented Feb 23, 2026

Descritpion

Fixes test-wolfssl hanging on both Linux and FreeBSD CI due to an orphaned
RTO timer in wolfIP_sock_accept() that prevented SYN-ACK retransmission.

Root Cause

When a SYN arrived at the listening socket:

  1. Listening socket transitioned to TCP_SYN_RCVD and got an RTO timer
  2. accept() created a new socket and set it directly to TCP_ESTABLISHED
  3. Listening socket returned to TCP_LISTEN, orphaning its RTO timer
  4. If the SYN-ACK was lost or timing was off → no retransmission → host OS
    connect() timed out → test hung forever

The Fix (src/wolfip.c)

  • Keep accepted socket in TCP_SYN_RCVD instead of jumping to TCP_ESTABLISHED;
    the existing ACK handler at lines 3236-3247 completes the transition
  • Start RTO timer on the new accepted socket so SYN-ACK gets retransmitted
  • Stop the orphaned RTO timer on the listening socket when it returns to TCP_LISTEN
  • Defer CB_EVENT_WRITABLE until the final ACK completes the three-way handshake
  • Copy negotiated TCP options (window scale, timestamps, SACK, MSS) to the
    accepted socket so the connection starts with correct parameters

Test Changes

  • test_native_wolfssl.c: Replace blocking connect() with non-blocking connect
    using select() loop for robustness (matches the pattern in test_eventloop.c)
  • unit.c: Add 3 new unit tests verifying RTO timer is active on accepted sockets
    in TCP_SYN_RCVD, stops after the final ACK, and the socket transitions to
    TCP_ESTABLISHED with CB_EVENT_WRITABLE signaled
  • unit.c: Fix 3 existing unit test assertions to expect TCP_SYN_RCVD after
    accept() instead of TCP_ESTABLISHED

Test plan

  • Unit tests pass locally (517/517, 0 failures)
  • Linux CI: test-wolfssl completes without timeout
  • FreeBSD CI: test-wolfssl completes without timeout
  • macOS CI passes

aidangarske and others added 3 commits February 23, 2026 11:17
  The accepted socket was being set directly to TCP_ESTABLISHED state,
  which meant if the SYN-ACK was lost, no retransmission would occur
  because the RTO infrastructure only handles SYN_RCVD/SYN_SENT states.

  Changes:
  - Keep accepted socket in TCP_SYN_RCVD until final ACK received
  - Start RTO timer on accepted socket for SYN-ACK retransmission
  - Stop orphaned RTO timer on listening socket
  - Don't signal writable until connection fully established
  - Update test_native_wolfssl.c to use non-blocking connect with select()
  - Add unit tests for SYN-ACK retransmission behavior

  Fixes test-wolfssl and test-esp timeouts on Linux and FreeBSD CI.
  - Update all workflow push triggers from specific branches to ['*']
    so CI runs on feature/fix branches
  - Fix 3 unit tests that assert TCP_ESTABLISHED after accept(); the
    SYN-ACK retransmission fix correctly keeps accepted sockets in
    TCP_SYN_RCVD until the final ACK completes the handshake
@aidangarske aidangarske self-assigned this Feb 23, 2026
@aidangarske aidangarske marked this pull request as ready for review February 23, 2026 20:04
@danielinux danielinux requested a review from Copilot February 23, 2026 22:52
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a TCP accept()/SYN-ACK reliability issue in wolfIP that could leave an orphaned control-RTO timer on the listening socket, preventing SYN-ACK retransmissions and causing CI hangs during host connect().

Changes:

  • Update wolfIP_sock_accept() to keep the accepted socket in TCP_SYN_RCVD, start control-RTO on the accepted socket, stop it on the listener, and defer CB_EVENT_WRITABLE until the final ACK.
  • Add/adjust unit tests to validate accepted-socket control-RTO behavior and the SYN_RCVD→ESTABLISHED transition on final ACK.
  • Make the native wolfSSL test client use non-blocking connect() with a select() loop.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
src/wolfip.c Moves accepted sockets to TCP_SYN_RCVD, fixes control-RTO ownership (accepted vs listener), defers writable event signaling.
src/test/unit/unit.c Adds new tests around accept control-RTO and updates existing assertions for the new post-accept state.
src/test/test_native_wolfssl.c Switches to non-blocking connect + select() to reduce risk of indefinite blocking in CI.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/wolfip.c
Comment thread src/test/test_native_wolfssl.c
Comment thread src/test/test_native_wolfssl.c
Comment thread src/test/unit/unit.c
@danielinux danielinux merged commit 2a39848 into wolfSSL:master Feb 24, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants