refactor(uffd-tests): pre-fault uffd-registered pages via madvise to eliminate STW-deadlock class#2542
Conversation
PR SummaryMedium Risk Overview Reviewed by Cursor Bugbot for commit b8500e9. Bugbot is set up for automated code reviews on this repo. Configure here. |
…V_POPULATE_*) Eliminate the Go-runtime ↔ UFFD STW-deadlock class without changing the two-process harness or production code: in executeRead/executeWrite, run the actual fault through a syscall (MADV_POPULATE_READ / _WRITE) instead of the implicit fault triggered by bytes.Equal / copy. Once madvise returns the page is resident, so the subsequent Go memory ops never fault. Why a syscall is necessary: a plain Go load against a uffd-MISSING page parks the OS thread in TASK_KILLABLE while the runtime still considers the goroutine _Grunning, so any concurrent STW (GC, ReadMemStats, etc.) waits for the thread to reach a safe point — forever. Going through unix.Madvise puts the goroutine in _Gsyscall before the kernel parks the thread, which is preemptible by STW. MADV_POPULATE_* (Linux 5.14+) is the correct syscall for this: its fault path uses get_user_pages (no FOLL_REMOTE), so UFFD fires normally — unlike pread/pwrite on /proc/self/mem, whose access_remote_vm path sets FAULT_FLAG_REMOTE and FOLL_FORCE and intentionally bypasses UFFD (so a debugger can't deadlock on a uffd-controlled page). The parent's mmap and UFFDIO_REGISTER are unchanged. The cross-process RPC layer is unchanged. No production code is touched.
The STW-deadlock hazard those tests were guarding against is gone after the previous commit (madvise pre-faulting), so the workaround in caf0377 and the older serialisations on TestFaultedShortCircuitOrdering, TestRemoveThenWriteGated, and TestWriteThenRemoveGated are no longer needed. This: - Reverts caf0377 by re-adding t.Parallel() on TestStaleSourceRaceMissingAndRemove and TestNoMadviseDeadlockWithInflightCopy (both outer and inner subtests). - Drops the //nolint:paralleltest,tparallel comments on TestFaultedShortCircuitOrdering, TestRemoveThenWriteGated, and TestWriteThenRemoveGated and re-adds t.Parallel() on each. The race tests now also pre-fault their direct page reads via MADV_POPULATE_READ for the same reason: an ad-hoc page[0] load could otherwise re-introduce the deadlock on a removed page.
736d083 to
b8500e9
Compare
|
Folded into #2520. The two commits (madvise pre-fault + restore t.Parallel) now live there. |
Stacked on #2520. Eliminates the Go-runtime UFFD STW-deadlock class without splitting the harness further or removing parallelism.
Problem
Test goroutines in the parent read UFFD-registered memory via plain Go (
bytes.Equal(memoryArea[...], ...)). When the page is missing, the kernel parks the OS thread in TASK_KILLABLE and Linux refuses to deliver SIGURG. Go's STW (e.g. triggered by a concurrent goroutine allocating during a barrier hold) waits for that thread to reach a safe point forever.Fix
Drive the actual fault through a syscall —
unix.Madvise(page, MADV_POPULATE_READ/WRITE)— so the goroutine is in_Gsyscallwhile the kernel is parked. The Go runtime'sentersyscalltransitions the goroutine to_Gsyscallbefore the syscall, detaching it from a P; STW does not wait on_Gsyscallgoroutines, so GC completes immediately even if the kernel parks the thread inside the pagefault for arbitrarily long. Oncemadvisereturns, the page is resident and the subsequent plain Gobytes.Equal/copydoes not fault.Why MADV_POPULATE_* and not pread/pwrite on /proc/self/mem
/proc/self/memgoes throughaccess_remote_vm->get_user_pages_remotewithFOLL_FORCE | FOLL_REMOTE, which intentionally bypasses UFFD (so a debugger cannot deadlock on a UFFD-controlled page). Empirically,preadon a UFFD-MISSING page returns EIO.MADV_POPULATE_READ/MADV_POPULATE_WRITE(Linux 5.14+) useget_user_pages(noFOLL_REMOTE), so UFFD fires normally.Knock-on cleanups
t.Parallel()onTestStaleSourceRaceMissingAndRemoveandTestNoMadviseDeadlockWithInflightCopy(reverts the workaround in feat(uffd): handle UFFD_EVENT_REMOVE; track per-page state; race-safe COPY #2520'scaf0377dc).//nolint:paralleltest,tparallelrationale onTestFaultedShortCircuitOrdering,TestRemoveThenWriteGated, andTestWriteThenRemoveGated. Pure cleanup — the hazard is gone.Notes
unix.Madvise(MADV_DONTNEED)already runs in syscall mode, so REMOVE-trigger paths needed no changes.page[0]after a barrier-released read) also pre-faults viaMADV_POPULATE_READso an ad-hoc load on aremovedpage cannot re-introduce the deadlock.