Skip to content

Commit b2be3e1

Browse files
authored
cargo-pgrx: inject -Wl,--no-gc-sections to keep .pgrxsc sections on aarch64 Linux (pgcentralfoundation#2280)
Closes pgcentralfoundation#2279 (hopefully) Background ---------- The "one compile" schema pipeline embeds SQL entity metadata in a `.pgrxsc` ELF section made of `#[used]` statics scattered across every codegen unit. Rust lowers `#[used]` to `@llvm.used`, which is supposed to translate to `SHF_GNU_RETAIN` so `ld --gc-sections` (which rustc passes by default on ELF cdylibs) leaves the section alone. That contract has three load-bearing links: LLVM must emit the flag, binutils must honor it, and LTO must not mask the difference. The chain breaks on aarch64 Linux under non-LTO release builds. When it breaks, every `.pgrxsc` input section is dropped, the output section vanishes, and `cargo pgrx schema` fails with: no embedded pgrx schema section found; expected `.pgrxsc` on ELF/PE or `__DATA,__pgrxsc` on Mach-O pglinter 0.18.0 hit exactly this on their arm64 `release-lto-off` build while their amd64 `release` (lto = "fat") build kept working — fat LTO preserves `@llvm.used` globals via the linker plugin, sidestepping the `SHF_GNU_RETAIN` path entirely. The asymmetry is profile-driven, not architecture-driven; arm64 just happens to be where we see it first. Fix --- cargo-pgrx now injects a top-level cargo `--config` on every cargo invocation it spawns: target.'cfg(all(target_family = "unix", not(target_os = "macos")))'.rustflags = ["-C", "link-arg=-Wl,--no-gc-sections"] That disables section GC for the cdylib link step on Linux/BSDs (where the retention contract is fragile) and leaves macOS (Mach-O, `ld64`, uses `-dead_strip` and doesn't speak `-Wl,...`) and Windows MSVC (`link.exe`, also doesn't speak `-Wl,...`) alone. The injection is wired into the `cargo()` helper in `cargo-pgrx/src/cargo.rs`, which is the single chokepoint that every cargo subprocess in the crate goes through — both `Cargo::into_command()` (schema, test, run, pgrx_target) and the hand-rolled `Command` in `install.rs::build_extension`. The cost is a few KB of unreferenced code surviving in the final .so, which is a trade we'll happily make for reliable schema generation across the whole LLVM/binutils matrix. Users who set `RUSTFLAGS`/`CARGO_ENCODED_RUSTFLAGS` continue to override per cargo's rustflags precedence. A new env var `CARGO_PGRX_DISABLE_GC_SECTIONS_WORKAROUND` provides an explicit escape hatch. Why inject instead of touch `pgrx`'s own build.rs ------------------------------------------------- `cargo:rustc-link-arg-cdylib` from a build script only affects the package the build script belongs to — it doesn't propagate to dependents. `pgrx` is an rlib, so emitting link args from its build.rs is effectively a no-op at extension-link time. Injecting from cargo-pgrx at the cargo invocation boundary is the one place where we can fix this for every extension without requiring a source change in each one. CI regression coverage ---------------------- New `pgrx_tests_arm64` job in `.github/workflows/tests.yml` runs on `ubuntu-24.04-arm` (free native arm64 runner, same as pglinter uses) and executes: cd pgrx-unit-tests && cargo pgrx schema pg17 --release with `CARGO_PROFILE_RELEASE_LTO=false` and `CARGO_PROFILE_RELEASE_CODEGEN_UNITS=16` pinned so the regression test stays faithful to the pglinter failure shape even if the workspace `[profile.release]` is edited later. pgrx-unit-tests has dozens of `#[pg_extern]` / `#[pg_trigger]` / `#[pg_aggregate]` / etc. entries, so every schema entity kind contributes to `.pgrxsc` and the test exercises every producer path. If the `--no-gc-sections` injection ever regresses, this job fails loudly with the exact error string users would see. Changes ------- - cargo-pgrx/src/cargo.rs: `pgrx_injected_config_args()` helper + unit test guarding the `--no-gc-sections` and non-macOS scope - cargo-pgrx/README.md: document `CARGO_PGRX_DISABLE_GC_SECTIONS_WORKAROUND` alongside existing `PGRX_*` env vars - .github/workflows/tests.yml: `pgrx_tests_arm64` job pinned to the non-LTO release profile shape
1 parent 0486b47 commit b2be3e1

5 files changed

Lines changed: 217 additions & 2 deletions

File tree

.github/workflows/tests.yml

Lines changed: 108 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -302,6 +302,114 @@ jobs:
302302
- name: Stop sccache server
303303
run: sccache --stop-server || true
304304

305+
pgrx_tests_arm64:
306+
name: Linux arm64 native tests
307+
needs: lintck
308+
runs-on: ubuntu-24.04-arm
309+
if: "!contains(github.event.head_commit.message, 'nogha')"
310+
env:
311+
PG_VER: ${{ matrix.postgres }}
312+
RUST_TOOLCHAIN: 'stable'
313+
314+
# Regression coverage for the `.pgrxsc` ELF-section retention contract on
315+
# aarch64 Linux. The bug this is guarding against: without LTO, `#[used]`
316+
# statics in `.pgrxsc` rely on `SHF_GNU_RETAIN` surviving through LLVM →
317+
# binutils → `ld --gc-sections`. That chain has broken in the wild on
318+
# aarch64 (see pglinter 0.18.0 failure), dropping the whole section and
319+
# producing "no embedded pgrx schema section found" at schema-generation
320+
# time. Running `cargo pgrx schema --release` against pgrx-unit-tests on
321+
# a native aarch64 runner directly exercises the broken path.
322+
strategy:
323+
matrix:
324+
postgres: [ 17 ]
325+
326+
steps:
327+
- uses: actions/checkout@v4
328+
329+
- name: Set up prerequisites and environment
330+
run: |
331+
sudo apt-get update -y -qq --fix-missing
332+
333+
echo "----- Remove old postgres / clang / llvm (best effort) -----"
334+
# Each regex can legitimately match zero packages on the arm64
335+
# runner image (e.g. no pre-installed `mono-llvm*`), so we treat
336+
# every removal as best-effort rather than failing the whole job.
337+
for pattern in \
338+
'^postgres.*' \
339+
'^libpq.*' \
340+
'^clang.*' \
341+
'^llvm.*' \
342+
'^libclang.*' \
343+
'^libllvm.*' \
344+
'^mono-llvm.*'; do
345+
sudo apt remove -y "$pattern" || true
346+
done
347+
348+
echo "----- Install system dependencies -----"
349+
sudo apt-get install -y \
350+
build-essential \
351+
clang \
352+
libclang-dev \
353+
llvm-dev \
354+
gcc \
355+
libssl-dev \
356+
libz-dev \
357+
make \
358+
pkg-config \
359+
zlib1g-dev
360+
361+
"$TOOL_DIR"/rustup.sh
362+
363+
echo "----- Print env -----"
364+
env
365+
echo ""
366+
367+
- name: Setup release Postgres apt repo
368+
run: |
369+
sudo apt-get install -y wget gnupg
370+
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
371+
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
372+
sudo apt-get update -y -qq --fix-missing
373+
374+
- name: Install Postgres
375+
run: |
376+
sudo apt-get install -y \
377+
postgresql-$PG_VER \
378+
postgresql-server-dev-$PG_VER
379+
380+
- name: Set up PostgreSQL permissions
381+
run: sudo chmod a+rwx `/usr/lib/postgresql/$PG_VER/bin/pg_config --pkglibdir` `/usr/lib/postgresql/$PG_VER/bin/pg_config --sharedir`/extension /var/run/postgresql/
382+
383+
- name: Cache cargo registry
384+
uses: actions/cache@v4
385+
continue-on-error: false
386+
with:
387+
path: |
388+
~/.cargo/registry
389+
~/.cargo/git
390+
key: pgrx-arm64-cargo-${{ runner.os }}-${{ runner.arch }}-${{ hashFiles('**/Cargo.lock', '.github/workflows/tests.yml') }}
391+
392+
- name: Install cargo-pgrx
393+
run: cargo install --path cargo-pgrx/ --debug --force
394+
395+
- name: Run 'cargo pgrx init' against system-level ${{ matrix.postgres }}
396+
run: cargo pgrx init --pg$PG_VER /usr/lib/postgresql/$PG_VER/bin/pg_config
397+
398+
# Reproduces the pglinter 0.18.0 failure mode: release build with LTO
399+
# off and many codegen units, where `.pgrxsc` retention depends on
400+
# `SHF_GNU_RETAIN` surviving the whole LLVM → binutils → `ld
401+
# --gc-sections` chain. The env vars below pin the profile shape so
402+
# this regression test keeps exercising the broken path even if
403+
# someone later edits `[profile.release]` in the workspace Cargo.toml.
404+
# pgrx-unit-tests has a large, diverse set of schema entries, so if
405+
# the injected `--no-gc-sections` workaround ever regresses, this step
406+
# fails with "no embedded pgrx schema section found".
407+
- name: Exercise cargo pgrx schema --release against pgrx-unit-tests
408+
env:
409+
CARGO_PROFILE_RELEASE_LTO: "false"
410+
CARGO_PROFILE_RELEASE_CODEGEN_UNITS: "16"
411+
run: cd pgrx-unit-tests && cargo pgrx schema pg$PG_VER --release
412+
305413
cargo_pgrx_init:
306414
name: cargo pgrx init
307415
runs-on: ubuntu-22.04

cargo-pgrx/README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -61,6 +61,7 @@ Options:
6161
- `PGRX_BUILD_VERBOSE` - Set to true to enable verbose "build.rs" output -- useful for debugging build issues
6262
- `HTTPS_PROXY` - If set during `cargo pgrx init`, it will download the Postgres sources using these proxy settings. For more details refer to the [env_proxy crate documentation](https://docs.rs/env_proxy/*/env_proxy/fn.for_url.html).
6363
- `PGRX_IGNORE_RUST_VERSIONS` - Set to true to disable the `rustc` version check we have when performing schema generation (schema generation requires the same version of `rustc` be used to build `cargo-pgrx` as the crate in question).
64+
- `CARGO_PGRX_DISABLE_GC_SECTIONS_WORKAROUND` - Set to any value to stop `cargo-pgrx` from injecting `-C link-arg=-Wl,--no-gc-sections` on non-macOS Unix cdylib builds. By default, `cargo-pgrx` passes this flag (scoped to `cfg(all(target_family = "unix", not(target_os = "macos")))`) to cargo as `--config target.<cfg>.rustflags=[...]` only for cdylib-producing invocations — `cargo pgrx install`, `cargo pgrx package`, and the non-`--test` path of `cargo pgrx schema`. This prevents GNU `ld --gc-sections` from dropping the `.pgrxsc` section where the embedded SQL schema metadata lives — a problem that surfaces on non-LTO builds (notably aarch64 Linux) as `no embedded pgrx schema section found`. The workaround is intentionally not applied to `cargo pgrx test` / `cargo pgrx schema --test` because those also link a test binary whose undefined Postgres `extern "C"` references are expected to be GC'd away. Disable this only if you are explicitly managing `rustflags` yourself and understand the retention contract.
6465

6566
## First Time Initialization
6667

cargo-pgrx/src/cargo.rs

Lines changed: 91 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,10 @@ pub struct Cargo {
2424
profile: CargoProfile,
2525
// use a BTreeMap for deterministic order of iteration
2626
more_args: BTreeMap<String, Vec<String>>,
27+
// Top-level cargo args placed before the subcommand (e.g. `--config ...`).
28+
// Kept separate so callers can opt into cdylib-only tweaks without touching
29+
// subcommand-specific arg ordering.
30+
top_level_args: Vec<String>,
2731
}
2832

2933
impl Cargo {
@@ -73,11 +77,26 @@ impl Cargo {
7377
self
7478
}
7579

80+
/// Apply the `.pgrxsc` retention workaround for cdylib-producing
81+
/// invocations. See `pgrx_cdylib_config_args` for why this is only safe
82+
/// to use when the subcommand only links a cdylib (never for `test`,
83+
/// which also links a test binary that needs normal `--gc-sections` to
84+
/// prune undefined Postgres `extern "C"` references).
85+
pub fn with_pgrx_cdylib_workaround(mut self) -> Self {
86+
self.top_level_args.extend(pgrx_cdylib_config_args());
87+
self
88+
}
89+
7690
#[track_caller]
7791
pub fn into_command(self) -> process::Command {
7892
let mut cmd = cargo();
7993

80-
// subcommand *must* go first
94+
// top-level cargo args (e.g. `--config ...`) go before the subcommand.
95+
for arg in &self.top_level_args {
96+
cmd.arg(arg);
97+
}
98+
99+
// subcommand *must* go first among subcommand-relative args
81100
if self.subcmd != "" {
82101
cmd.arg(&self.subcmd);
83102
} else {
@@ -94,6 +113,7 @@ impl Cargo {
94113
package,
95114
subcmd: _,
96115
more_args,
116+
top_level_args: _,
97117
} = self;
98118

99119
// set most-interesting flags first, like profile, target, and manifest-path
@@ -178,6 +198,54 @@ pub(crate) fn cargo() -> std::process::Command {
178198
std::process::Command::new(cargo)
179199
}
180200

201+
/// Cargo top-level `--config` arguments that force the linker to retain the
202+
/// `.pgrxsc` section when building the extension cdylib.
203+
///
204+
/// The schema metadata that drives `cargo pgrx schema` lives in a `.pgrxsc`
205+
/// ELF section composed of `#[used]` statics scattered across every CGU. Rust
206+
/// lowers `#[used]` to `@llvm.used`, which on ELF is supposed to mark the
207+
/// containing section with `SHF_GNU_RETAIN` so the linker's default
208+
/// `--gc-sections` pass leaves it alone. That contract is fragile in the wild:
209+
/// it depends on LLVM emitting the flag, binutils honoring it, and LTO not
210+
/// masking the difference. When it breaks, every `.pgrxsc` contribution is
211+
/// dropped and `cargo pgrx schema` fails with "no embedded pgrx schema section
212+
/// found" — which is exactly what we've seen on non-LTO aarch64 Linux builds.
213+
///
214+
/// Passing `-Wl,--no-gc-sections` as a link-arg on non-macOS Unix sidesteps the
215+
/// whole retention contract. The cost is a few KB of unreferenced code surviving
216+
/// in the final cdylib, which is a trade we'll happily make.
217+
///
218+
/// Critically, these args are only injected for cdylib-producing invocations
219+
/// (`cargo build --lib` inside cargo-pgrx install/package/schema paths). They
220+
/// MUST NOT be injected for `cargo test` invocations: rustc uses `--gc-sections`
221+
/// on test binaries to prune the many unreachable `extern "C"` references to
222+
/// Postgres symbols that pgrx-pg-sys declares. Keeping all those call sites
223+
/// retained makes the linker refuse to produce the test executable with a
224+
/// wall of "undefined symbol: CurrentMemoryContext" errors. The test-harness
225+
/// flow (`cargo pgrx test`) still picks up the workaround because the harness
226+
/// re-invokes `cargo-pgrx install` internally, and that path goes through
227+
/// `build_extension` which does apply the workaround.
228+
///
229+
/// This is scoped via a `cfg(...)` target predicate so it only fires when
230+
/// building for a GNU-ld-ish target; macOS (Mach-O, `ld64`, uses `-dead_strip`
231+
/// and doesn't understand this flag) and Windows (MSVC `link.exe`, doesn't
232+
/// speak `-Wl,...`) are excluded.
233+
///
234+
/// Users who set `RUSTFLAGS` or `CARGO_ENCODED_RUSTFLAGS` override this per
235+
/// cargo's rustflags precedence rules. Setting
236+
/// `CARGO_PGRX_DISABLE_GC_SECTIONS_WORKAROUND=1` in the environment also
237+
/// suppresses the injection entirely.
238+
pub(crate) fn pgrx_cdylib_config_args() -> Vec<String> {
239+
if std::env::var_os("CARGO_PGRX_DISABLE_GC_SECTIONS_WORKAROUND").is_some() {
240+
return Vec::new();
241+
}
242+
vec![
243+
"--config".to_string(),
244+
r#"target.'cfg(all(target_family = "unix", not(target_os = "macos")))'.rustflags = ["-C", "link-arg=-Wl,--no-gc-sections"]"#
245+
.to_string(),
246+
]
247+
}
248+
181249
/// Set some environment variables for use downstream (in `pgrx-test` for
182250
/// example). Does nothing if already set.
183251
pub(crate) fn initialize() {
@@ -256,3 +324,25 @@ impl CargoProfile {
256324
}
257325
}
258326
}
327+
328+
#[cfg(test)]
329+
mod tests {
330+
use super::pgrx_cdylib_config_args;
331+
332+
#[test]
333+
fn cdylib_config_args_carry_no_gc_sections_workaround() {
334+
let args = pgrx_cdylib_config_args();
335+
assert_eq!(args.len(), 2, "expected `--config <value>` pair, got {args:?}");
336+
assert_eq!(args[0], "--config");
337+
assert!(
338+
args[1].contains("--no-gc-sections"),
339+
"cdylib --config should disable section GC on non-macOS Unix: {}",
340+
args[1]
341+
);
342+
assert!(
343+
args[1].contains("not(target_os = \"macos\")"),
344+
"cdylib --config must exclude macOS: {}",
345+
args[1]
346+
);
347+
}
348+
}

cargo-pgrx/src/command/install.rs

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -305,6 +305,13 @@ pub(crate) fn build_extension(
305305
let flags = std::env::var("PGRX_BUILD_FLAGS").unwrap_or_default();
306306

307307
let mut command = crate::cargo::cargo();
308+
// These `--config` args go before the subcommand. They retain `.pgrxsc`
309+
// on non-macOS Unix cdylib links so schema generation can find it; safe
310+
// here because this path only ever builds the cdylib (`--lib`). See
311+
// `pgrx_cdylib_config_args` for the full rationale.
312+
for arg in crate::cargo::pgrx_cdylib_config_args() {
313+
command.arg(arg);
314+
}
308315
command.arg("build");
309316
command.arg("--lib");
310317

cargo-pgrx/src/command/schema.rs

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -310,7 +310,16 @@ fn first_build(
310310
cargo.subcommand("build").flag("--lib")
311311
};
312312

313-
let cargo = cargo.profile(profile.clone()).target(target.map(|t| t.to_owned()));
313+
let mut cargo = cargo.profile(profile.clone()).target(target.map(|t| t.to_owned()));
314+
315+
// `cargo build --lib` only ever links the cdylib, so it's safe to apply
316+
// the `.pgrxsc` retention workaround. `cargo test --no-run` also links
317+
// the test binary, which rustc needs `--gc-sections` to prune undefined
318+
// Postgres `extern "C"` references from; applying the workaround there
319+
// surfaces those as linker errors. See `pgrx_cdylib_config_args`.
320+
if !is_test {
321+
cargo = cargo.with_pgrx_cdylib_workaround();
322+
}
314323

315324
let mut command = cargo.into_command();
316325

0 commit comments

Comments
 (0)