fix: add network policies and scope Vault permissions per service #425

Open
Flegma wants to merge 4 commits into main from audit/413-network-policies

@Flegma
Contributor

Flegma commented Apr 2, 2026

Summary

Network Policies — Implements network segmentation for the 5stack namespace:

| Policy | Effect |
| --- | --- |
| default-deny-ingress | Blocks all ingress traffic by default |
| allow-ingress-to-services | NGINX ingress → web, api, hasura, minio, typesense |
| allow-timescaledb-ingress | Only hasura + api → timescaledb:5432 |
| allow-redis-ingress | Only api + connector → redis:6379 |
| allow-hasura-ingress | Only api + web + ingress → hasura:8080 |
| allow-api-ingress | Only ingress + connector → api:5585 |
| allow-connector-ingress | Only api → connector:8585 |
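
As a rough illustration of the per-service rows, allow-redis-ingress could look something like the sketch below. The `app: redis`, `app: api`, and `app: connector` labels are assumptions made for this example (the thread only confirms the `app: game-server` and `app: postgres-backup` labels), so the actual manifests in the PR may differ.

```yaml
# Sketch only: the label values below are assumed, not copied from the PR diff.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-redis-ingress
  namespace: 5stack
spec:
  # Pods this policy applies to; selecting them isolates them for ingress.
  podSelector:
    matchLabels:
      app: redis
  policyTypes:
    - Ingress
  ingress:
    # Allow only api and connector pods in the same namespace, on the Redis port.
    - from:
        - podSelector:
            matchLabels:
              app: api
        - podSelector:
            matchLabels:
              app: connector
      ports:
        - protocol: TCP
          port: 6379
```

The two `podSelector` entries under `from` are OR-ed, so traffic from either source is accepted; anything else reaching the selected pods is dropped.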

Vault Policy — Replaced wildcard path "*" with explicit per-service read-only paths:

  • Each service secret path (kv/data/api, kv/data/redis, etc.) gets read, list only
  • Removed create, update, delete capabilities from external-secrets role
  • Matches the exact paths used by migrate_secrets_to_vault in setup-env.sh

Test plan

  • kubectl kustomize base builds successfully (7 NetworkPolicies generated)
  • All services can communicate as expected after applying policies
  • External-secrets operator can still read secrets from Vault
  • External-secrets CANNOT write/delete secrets

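Note: Network policies only take effect when something in the cluster enforces them (Calico, Cilium, etc.). K3s ships an embedded network policy controller alongside its default Flannel; if that controller has been disabled (--disable-network-policy), --flannel-backend=none + Calico may be needed.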

Closes #413

@lukepolo
Contributor

lukepolo commented Apr 9, 2026

looks fine but i need to test before merging

@Flegma
Contributor Author

Flegma commented Apr 23, 2026

@lukepolo gentle nudge — this has been waiting on your local test since April 9. Sprint 5 cleanup is stalled on #424, #425, #427, #428 landing. Let me know if anything's blocking the test on your end.

@lukepolo
Contributor

these break game node server

@lukepolo
Contributor

this MR needs more work~

Flegma added 4 commits April 26, 2026 17:47
Network policies:
- Default-deny ingress for 5stack namespace
- Allow ingress controller to reach web, api, hasura, minio, typesense
- TimescaleDB: only reachable from hasura and api
- Redis: only reachable from api and connector
- Hasura: only from api, web, and ingress
- API: only from ingress and connector
- Connector: only from api

Vault:
- Replace wildcard path "*" with explicit per-service read-only paths
  matching the kv/data/* paths used by migrate_secrets_to_vault
- External-secrets can only read specific service secrets, not
  create/update/delete or access arbitrary vault paths

Closes #413

Game server pods (labeled app: game-server) need WebSocket access to
the API for match event communication. Without this, match events
would be blocked by the default-deny policy.

Per code review, adds 4 critical/important missing policies:
- Hasura → API: needed for auth/event/action webhooks
- Backup CronJob → TimescaleDB + MinIO: needed for pg_dump + S3 upload
- API → MinIO: needed for S3 operations (demos, assets)
- API → Typesense: needed for player search indexing

Also adds app: postgres-backup label to backup CronJob pod template
so it can be selected by network policies.

Drop default-deny-ingress and allow-ingress policies. Without a
default-deny in place, only pods explicitly selected by allow-internal
become restricted — game-server pods (and any other unselected pod)
remain wide-open, so CS2 client traffic and connector/RCON paths are
not affected.

Per-service ingress restrictions (TimescaleDB, Redis, Hasura, API,
MinIO, Typesense, connector) still apply.

Flegma force-pushed the audit/413-network-policies branch from 9b0fa10 to aeef066 on April 26, 2026 17:59

@Flegma
Contributor Author

Flegma commented Apr 26, 2026

@lukepolo scoped this down in aeef066 — dropped default-deny-ingress and the allow-ingress umbrella policy.

What's left:

  • Per-service ingress restrictions for TimescaleDB, Redis, Hasura, API, MinIO, Typesense, connector (each only accepts traffic from the pods that need it)
  • Vault setup script scoped per service
  • postgres-backup pod label for the backup CronJob

What this means for game traffic: without default-deny, only pods selected by an allow rule become restricted. Game-server pods aren't selected by any policy here, so CS2 client connections, connector → game-server RCON, and any other game-server traffic flow remain wide-open exactly as before. Should unblock the game node server you saw breaking.
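
For context, this follows from how Kubernetes evaluates NetworkPolicy: a pod becomes isolated for ingress only once at least one policy with `Ingress` in its `policyTypes` selects it. A default-deny policy is typically written with an empty selector, roughly as below (a generic sketch, not necessarily the exact manifest from the earlier commits):

```yaml
# Generic default-deny-ingress: the empty podSelector selects every pod in
# the namespace, and having no ingress rules means no traffic is allowed in.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: 5stack
spec:
  podSelector: {}
  policyTypes:
    - Ingress
```

With a policy like this removed, any pod that no remaining policy's `podSelector` matches (game-server pods included) stays non-isolated and keeps accepting all ingress.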

If full mesh policies (covering game-server, web, etc.) are something you want later, happy to file a separate issue once we've sorted out what game-server pod ingress actually needs to look like.

Development

Successfully merging this pull request may close these issues.

[Infrastructure] Add network policies & scope Vault permissions per service