feat: add pre_seed phase — S3 zip download before preboot#1189
Open
chaodu-agent wants to merge 1 commit into
Open
feat: add pre_seed phase — S3 zip download before preboot#1189chaodu-agent wants to merge 1 commit into
chaodu-agent wants to merge 1 commit into
Conversation
Adds a new [pre_seed] config section that runs before pre_boot hooks. Downloads up to 5 zip archives from S3 and extracts them in order to $HOME (or a custom target), implementing a layer system where later archives overwrite earlier ones. Closes #1188
Collaborator
Author
法師團隊 Review — PR #1189
|
| # | Severity | Finding | Location | Reviewer(s) |
|---|---|---|---|---|
| 1 | 🔴 Critical | spawn_blocking + timeout race condition — tokio::time::timeout only cancels the await, not the blocking unzip thread. With on_failure = "warn", the loop advances to the next layer while the timed-out extraction is still writing to $HOME, breaking the sequential-overwrite contract and producing nondeterministic boot state. |
pre_seed.rs:39-55, 99 |
擺渡, Z渡 |
| 2 | 🟡 Important | Memory double-allocation — bytes.to_vec() copies the entire S3 body a second time. Should pass Bytes handle directly (zero-copy clone) to the blocking task. |
pre_seed.rs:98-99 |
擺渡, Z渡, 普渡 |
| 3 | 🟡 Important | No zip size cap — S3 object is fully collected into memory with no limit. A large or highly-compressed zip can OOM the process on boot path (resource-constrained ECS/K8s pods). Add MAX_ZIP_BYTES guard similar to MAX_SCRIPT_SIZE in hooks. |
pre_seed.rs:88-93 |
擺渡, Z渡, 普渡 |
| 4 | 🟡 Important | Missing region/endpoint_url override — PreSeedConfig lacks region and endpoint_url fields that AwsSecretsConfig provides. This blocks LocalStack and VPC endpoint testing. |
config.rs:210 |
擺渡 |
| 5 | 🟡 Important | Duplicated parse_s3_uri — identical function exists in config.rs. Extract to shared util to avoid drift. |
pre_seed.rs:122-135 |
普渡 |
| 6 | 🟢 Praise | Correct use of enclosed_name() for Zip Slip prevention. |
pre_seed.rs:113 |
普渡, 擺渡 |
| 7 | 🟢 Praise | S3 Client initialized once and reused across layers. | pre_seed.rs:32-33 |
擺渡 |
| 8 | 🟢 Praise | on_failure semantics consistent with existing hooks module. |
pre_seed.rs:54-62 |
普渡 |
| 9 | 🟢 Praise | Unit tests cover URI parsing, extraction, overwrite semantics, empty config, max source count. | pre_seed.rs:165+ |
Z渡 |
Suggested Fix for 🔴 #1
Extract to a temp directory first, then atomically rename/move into target only on success. If timeout fires, the incomplete temp dir is cleaned up — no partial writes pollute $HOME. This also makes the sequential-overwrite contract deterministic regardless of on_failure mode.
Housekeeping
- Branch needs rebase onto latest
main— current diff includes merged feat(discord): add /auth slash command for device-flow auth #1185 (/authslash command) commits.
Next Steps
1️⃣ Approve PR
2️⃣ 請 contributor 修改後再 review
3️⃣ 關閉 PR
4️⃣ 我自己來 fix,push 後讓法師團隊 review 直到完全修正
Collaborator
Author
|
LGTM ✅ — Clean, well-structured implementation of the pre_seed phase with proper security mitigations and good test coverage. What This PR DoesImplements a How It Works
Findings
What's Good (🟢)
Baseline Check
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements #1188 — adds a
[pre_seed]config section that downloads and extracts zip archives from S3 before thepre_boothook runs. This seeds the agent environment (configs, tools, memory files) without requiring AWS CLI in the container image.Changes
crates/openab-core/src/config.rs—PreSeedConfigstruct (sources, target, timeout, on_failure)crates/openab-core/src/pre_seed.rs— new module: S3 download + zip extraction with layer semanticscrates/openab-core/src/lib.rs— exportpre_seed(gated bypre-seedfeature)crates/openab-core/Cargo.toml— addzipdep +pre-seedfeatureCargo.toml— forwardpre-seedfeature toopenab-coresrc/main.rs— callpre_seed::run()beforepre_boothookconfig.toml.example— add commented[pre_seed]exampledocs/hooks.md— lifecycle diagram + pre_seed docsdocs/config-reference.md—[pre_seed]field tableKey Design Decisions
pre-seedfeature (default-on), compiles away when unusedLifecycle
Testing
parse_s3_uri,extract_zip,run_empty,run_too_manycargo verify-project✅cargo metadataresolves ✅Closes #1188