Knowledge-graph-driven AWS infrastructure builder. Describe what you want to build — CloudForge derives the architecture from validated cloud patterns, generates production-grade Terraform, and provisions real resources with zero persistent cloud credentials.
Built for a hackathon. Shipped end-to-end.
Most "AI infrastructure" tools send your requirements to an LLM and hope the output is reasonable. CloudForge doesn't do that.
Architecture decisions are derived from a knowledge graph of validated cloud patterns. Your PRD is processed through a RAG pipeline — NFRs are extracted, relevant patterns are retrieved from the graph, ranked, and validated via graph traversal. The LLM is the last step: it explains a graph-derived answer. It never invents one.
User PRD
↓
[Agent 1 — PRD Refinement]
Extracts NFRs, asks clarifying questions (traffic tiers, compliance, SLAs)
Multi-choice options + freeform input. Persists structured requirements JSON.
↓
[Agent 2 — Architecture Planner: 7-node LangGraph]
RAG retrieval from knowledge graph of cloud patterns
Graph community routing → Kuzu graph traversal → ranked pattern selection
Load simulation, failure-mode analysis, compliance mapping
Rule-based test suite (SPOF detection, cascade risk, latency budget)
Human-in-the-loop interrupt before proceeding (uses LangGraph interrupt() API)
LLM explains the graph-derived architecture — never generates it blindly
↓
[Agent 3 — Code & Terraform Generator: multi-subgraph LangGraph]
Generates Terraform HCL (main.tf, variables.tf, outputs.tf) per service
Generates application code (Python/TypeScript) per service
Validation loop: terraform fmt → terraform validate → TFLint → Checkov
LLM-driven fix loop (up to 3 retries per file)
Subgraph: code_generation_loop → tsc/AST validation → test generation
↓
[Deploy — CloudFormation provisioning]
Intermediate infrastructure representation → cloud-specific template render
Temporary credentials via IAM role assumption (AWS STS AssumeRole)
Streams per-resource events live via SSE
Generated code committed directly to user's GitHub repo via GitHub App OAuth
Architecture decisions are never free-form LLM outputs. The system maintains a graph of validated cloud patterns. When a PRD comes in:
- NFRs are extracted and embedded
- Relevant patterns are retrieved from the graph based on semantic similarity to the NFR set
- Graph traversal validates pattern compatibility (no conflicting services, no SPOF introduced, latency budget respected)
- Only after graph validation does the LLM run — to explain the architecture in human language
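The retrieve, rank, and validate steps above could be sketched roughly as follows. This is an illustration only, not CloudForge's code: the `Pattern` type, the two-dimensional toy embeddings, and the `conflicts_with` field are all hypothetical, and an in-memory list stands in for the Kuzu graph.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass(frozen=True)
class Pattern:
    """Hypothetical pattern node: a name, an embedding, and conflict edges."""
    name: str
    embedding: tuple[float, ...]
    conflicts_with: frozenset[str] = field(default_factory=frozenset)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_patterns(nfr_embedding, patterns, top_k=2):
    """Rank patterns by similarity, then greedily keep mutually compatible ones."""
    ranked = sorted(
        patterns, key=lambda p: cosine(nfr_embedding, p.embedding), reverse=True
    )
    chosen = []
    for candidate in ranked:
        clash = any(
            candidate.name in kept.conflicts_with
            or kept.name in candidate.conflicts_with
            for kept in chosen
        )
        if not clash:
            chosen.append(candidate)
        if len(chosen) == top_k:
            break
    return [p.name for p in chosen]


# Toy data: the monolith ranks second but conflicts with the top pattern.
PATTERNS = [
    Pattern("serverless-api", (0.9, 0.1), frozenset({"ec2-monolith"})),
    Pattern("ec2-monolith", (0.8, 0.2)),
    Pattern("multi-az-database", (0.5, 0.5)),
]

selected = select_patterns((1.0, 0.0), PATTERNS)
```

Here the second-ranked pattern is skipped because it conflicts with the first, so compatibility can override raw similarity. The LLM would only see `selected` after this step.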
The rule engine (analysis/arch_rules.py) runs deterministic checks: SPOF identification, cascade blast radius, latency budget, over-provisioning. These gates are not probabilistic.
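To make "deterministic" concrete, here is a minimal sketch of one such check, SPOF identification. It assumes the architecture is available as a plain adjacency dict; `find_spofs`, `reachable`, and the sample topology are hypothetical and are not taken from analysis/arch_rules.py.

```python
def reachable(graph, start, removed=None):
    """Return the set of nodes reachable from start, skipping `removed`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node == removed or node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


def find_spofs(graph, entry, targets):
    """A node is a SPOF if removing it cuts the entry point off from any target."""
    spofs = []
    for node in sorted(graph):
        if node == entry or node in targets:
            continue
        if not set(targets) <= reachable(graph, entry, removed=node):
            spofs.append(node)
    return spofs


# Two API replicas behind a load balancer, but a single queue in front of the DB.
TOPOLOGY = {
    "alb": ["api-a", "api-b"],
    "api-a": ["queue"],
    "api-b": ["queue"],
    "queue": ["db"],
    "db": [],
}

spofs = find_spofs(TOPOLOGY, entry="alb", targets={"db"})
```

The same input always yields the same violations, which is what lets a check like this act as a hard gate rather than a suggestion.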
Five specialised agents, each with its own state, context, and tools:
| Agent | Role |
|---|---|
| PRD Refinement | Extracts NFRs, disambiguates requirements via multi-choice Q&A |
| Service Discovery | Maps NFRs to concrete AWS services |
| Architecture Planner | Graph traversal + simulation |
| Resilience Simulator | Per-failure-mode blast radius analysis |
| Code Generator | Terraform HCL + application code with validation loops |
State is typed (Pydantic / TypedDict) and passed through the full pipeline. Each stage is persisted to MongoDB per session. No single monolithic prompt.
Generated infrastructure is provider-agnostic by design. An intermediate infrastructure representation (topology graph) is built from the architecture recommendation. At deploy time this is rendered into cloud-specific Terraform templates.
Supports AWS today. Architected to extend to GCP and Azure without changing the generation pipeline — the providers/factory.py factory handles provider dispatch, and the agent pipeline never references AWS directly.
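Provider dispatch of this kind is often a small registry keyed by provider name. The sketch below shows the general shape under that assumption; the `Provider` interface, `AwsProvider`, `get_provider`, and the topology format are hypothetical and do not describe the actual contents of providers/factory.py.

```python
from abc import ABC, abstractmethod


class Provider(ABC):
    """Hypothetical interface each cloud backend would implement."""

    @abstractmethod
    def render(self, topology: dict) -> str:
        ...


class AwsProvider(Provider):
    def render(self, topology: dict) -> str:
        # Map each provider-neutral node kind to an AWS resource type.
        kinds = {"function": "aws_lambda_function", "table": "aws_dynamodb_table"}
        return "\n".join(
            f'resource "{kinds[node["kind"]]}" "{node["id"]}" {{}}'
            for node in topology["nodes"]
        )


# Adding GCP or Azure would mean one new class and one new registry entry.
_REGISTRY: dict[str, type[Provider]] = {"aws": AwsProvider}


def get_provider(name: str) -> Provider:
    """Look up and instantiate the renderer for a provider name."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unsupported provider: {name}") from None


topology = {"nodes": [{"id": "orders", "kind": "table"}]}
rendered = get_provider("aws").render(topology)
```

Because callers only ever touch `get_provider` and `render`, nothing upstream needs to know which cloud is being targeted.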
CloudForge never stores long-term cloud credentials.
- At deploy time, temporary credentials are obtained via IAM role assumption (AWS STS AssumeRole)
- Credentials that do need to be stored (for user-initiated deployments) are encrypted at rest with Fernet (AES-128-CBC) and decrypted only at deploy time
- All credentials are discarded after use — zero persistent access
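The role-assumption step might look something like the sketch below. The `assume_role` call and its `RoleArn`, `RoleSessionName`, and `DurationSeconds` parameters are the real STS API as exposed by Boto3; everything else (the function name, the session name, the example ARN, and the `StubSts` stand-in that lets the snippet run offline) is an assumption for illustration.

```python
def assume_deploy_role(sts_client, role_arn, session_name="cloudforge-deploy"):
    """Exchange a role ARN for short-lived credentials via STS AssumeRole."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=900,  # 15 minutes, the shortest lifetime STS allows
    )
    creds = response["Credentials"]
    # Key names match the keyword arguments boto3.client() accepts.
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


class StubSts:
    """Stand-in for boto3.client("sts") so the sketch runs without AWS access."""

    def assume_role(self, **kwargs):
        self.last_call = kwargs
        return {
            "Credentials": {
                "AccessKeyId": "ASIA-EXAMPLE",
                "SecretAccessKey": "example-secret",
                "SessionToken": "example-token",
            }
        }


stub = StubSts()
creds = assume_deploy_role(stub, "arn:aws:iam::123456789012:role/ExampleDeployRole")
```

With a real client, `sts_client` would be `boto3.client("sts")` and the returned dict could be passed as `boto3.client("cloudformation", **creds)`. The credentials expire on their own, so nothing needs to be revoked afterwards.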
Every agent stage streams output via Server-Sent Events. Users see PRD refinement, architecture reasoning, file-by-file code generation, and live deploy logs as they happen — not after.
// NFR extracted
{"type": "constraint", "chip": {"label": "99.9% uptime", "category": "availability"}}
// Architecture derived from graph
{"type": "complete", "architecture_diagram": {"nodes": [...], "connections": [...]}}
// File generated
{"type": "file", "path": "main.tf", "content": "...", "language": "hcl"}
// Resource live
{"type": "node_status", "nodeId": "lambda-1", "status": "live"}

Generated scaffolds are committed directly to the user's own GitHub repository under their identity via GitHub App OAuth. No intermediary storage: code goes from generation straight into the user's repo.
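For the streamed events listed earlier (constraint, complete, file, node_status), a consumer might decode the stream along these lines. The sketch assumes each event is sent as a standard SSE `data:` field carrying one JSON object; the `parse_sse` helper is hypothetical, and it simplifies the SSE rules by stripping all surrounding whitespace from each data line.

```python
import json


def parse_sse(raw: str):
    """Yield the decoded JSON payload of each `data:` event in an SSE body."""
    for block in raw.split("\n\n"):  # a blank line terminates an event
        data_lines = [
            line[len("data:"):].strip()
            for line in block.splitlines()
            if line.startswith("data:")
        ]
        if data_lines:
            yield json.loads("\n".join(data_lines))


# Two events shaped like the examples in this README.
stream = (
    'data: {"type": "file", "path": "main.tf", "content": "...", "language": "hcl"}\n\n'
    'data: {"type": "node_status", "nodeId": "lambda-1", "status": "live"}\n\n'
)

events = list(parse_sse(stream))
live_nodes = [
    e["nodeId"]
    for e in events
    if e["type"] == "node_status" and e["status"] == "live"
]
```

Dispatching on the `type` field keeps one stream usable for every stage, from requirement chips to deploy status.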
| Frontend layer | Technology |
|---|---|
| Framework | Next.js 15 (App Router), React 19, TypeScript strict |
| Canvas | @xyflow/react v12 |
| State | Zustand v5 with persist middleware |
| Animations | Framer Motion v11 |
| Styling | Tailwind CSS v4 |
| API layer | Native fetch + custom streamSSE() for SSE |

| Backend layer | Technology |
|---|---|
| Framework | FastAPI 0.135+ (Python 3.12+) |
| Database | MongoDB (Motor async driver) |
| Agent framework | LangGraph v1.1.3+ with subgraphs + interrupt() API |
| LLM (local) | Ollama (codeqwen:latest), used for requirements extraction |
| LLM (cloud) | Claude Sonnet via langchain-anthropic, used for architecture + code |
| IaC validation | TFLint, Checkov (CIS AWS Benchmark), terraform validate |
| Cloud SDK | Boto3 (STS AssumeRole, CloudFormation) |
| Auth | PyJWT, bcrypt, Fernet encryption |
State (AgentState, Pydantic): prd_text, follow_up_questions, questions_with_options, plan_markdown, plan_json, status, research_results, user_answers
Graph:
user_input → research → web_search → information_gate →
[needs clarification] → await_user (interrupt) → loop
[enough info] → plan → acceptance → END
Multi-choice clarification: 2–4 predefined options per question (traffic tiers, compliance frameworks, availability SLAs) + freeform custom input. Selections are normalised to user_answers for next-iteration context.
State (ArchitecturePlannerState, TypedDict): prd, NFR fields, architecture_diagram, arch_test_passed, arch_test_violations, user_accepted, accept_iteration_count
Graph:
START → architecture → service_discovery → arch_simulator →
resilience_simulator → compliance → arch_test →
[CRITICAL violations, iteration < 3] → architecture (retry)
[passed OR max iterations] → present_architecture (human interrupt)
[user_accepted] → END
Seven sub-agents: architecture_agent, service_discovery_agent, arch_simulator, resilience_simulator, compliance_agent, arch_test_agent, present_architecture_node.
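The branch after arch_test in the graph above can be expressed as a small routing function. This is a sketch under assumptions: the `arch_test_violations` and `arch_test_passed` field names come from the state listing, while the `severity` key on each violation and the `iteration` retry counter are hypothetical.

```python
from typing import TypedDict


class ArchitecturePlannerState(TypedDict, total=False):
    # Field names from the state listing above; the types are assumptions.
    arch_test_passed: bool
    arch_test_violations: list[dict]
    iteration: int  # hypothetical retry counter, not named in the README


MAX_ITERATIONS = 3


def route_after_arch_test(state: ArchitecturePlannerState) -> str:
    """Return the name of the node that should run after arch_test."""
    critical = any(
        v.get("severity") == "CRITICAL"
        for v in state.get("arch_test_violations", [])
    )
    if critical and state.get("iteration", 0) < MAX_ITERATIONS:
        return "architecture"  # retry with the violations as feedback
    return "present_architecture"  # passed, or retries exhausted


retry = route_after_arch_test(
    {"arch_test_violations": [{"severity": "CRITICAL"}], "iteration": 1}
)
give_up = route_after_arch_test(
    {"arch_test_violations": [{"severity": "CRITICAL"}], "iteration": 3}
)
```

In LangGraph, a function like this would typically be registered with `add_conditional_edges("arch_test", route_after_arch_test)`, so the loop bound lives in ordinary code rather than in a prompt.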
State (AgentState, TypedDict): services, connections, tf_files, task_list, code_files, test_files, artifacts, current_phase
Phases: parsing → tf_generation → tf_validation → orchestration → assembly → done
Subgraphs:
- tf_validation_loop: parallel validators, error extraction, LLM fix, re-validate (3 retries)
- code_generation_loop: per-service generate → validate → test → fix
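The validate, fix, re-validate cycle with a retry cap could be structured like the sketch below. It is an illustration only: the validators and the fixer are toy stand-ins for the real tools (terraform validate, TFLint, Checkov) and the LLM call, all function names are hypothetical, and the validators run sequentially here rather than in parallel.

```python
def validate_with_fixes(content, validators, fix, max_retries=3):
    """Run every validator; on errors ask `fix` for a new version, up to max_retries times."""
    for attempt in range(max_retries + 1):
        errors = [msg for validate in validators for msg in validate(content)]
        if not errors:
            return content, attempt  # attempt == number of fixes applied
        if attempt < max_retries:
            content = fix(content, errors)
    raise RuntimeError(f"still failing after {max_retries} fixes: {errors}")


def no_tabs(text):
    """Toy formatter check."""
    return ["tabs found"] if "\t" in text else []


def has_provider(text):
    """Toy structural check."""
    return [] if "provider " in text else ["missing provider block"]


def toy_fix(text, errors):
    """Stand-in for the LLM: patches exactly the errors it was given."""
    if "tabs found" in errors:
        text = text.replace("\t", "  ")
    if "missing provider block" in errors:
        text = 'provider "aws" {}\n' + text
    return text


draft = 'resource "aws_s3_bucket" "b" {\n\tbucket = "x"\n}'
fixed, fixes_used = validate_with_fixes(draft, [no_tabs, has_provider], toy_fix)
```

Passing the extracted error messages to the fixer, rather than asking for a fresh file, keeps each retry targeted. The hard cap guarantees the loop terminates even when the fixer makes no progress.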
All long-running workflows return SSE streams. Auth via Bearer JWT.
| Method | Path | Description |
|---|---|---|
| POST | /auth/register | Register (5/min rate limit) |
| POST | /auth/login | Login → access + refresh tokens |
| POST | /auth/refresh | Refresh access token |
| GET | /auth/github/login | GitHub OAuth |
| POST | /projects/ | Create project |
| GET | /projects/ | List projects |
| POST | /workflows/prd/v2/start/{id} | SSE: Agent 1 |
| POST | /workflows/prd/v2/respond/{id} | Submit answers |
| POST | /workflows/prd/v2/accept/{id} | Accept PRD |
| POST | /workflows/architecture/v2/start/{id} | SSE: Agent 2 |
| POST | /workflows/architecture/v2/accept/{id} | Accept architecture |
| POST | /workflows/build/start/{id} | SSE: Agent 3 |
| POST | /workflows/deploy/start/{id} | SSE: Deploy |
| POST | /workflows/deploy/{deployment_id}/rollback | Rollback |
| GET/PUT | /files/{project_id}/content | Read / write generated files |
| GET | /history/builds | Build history |
| GET | /history/deployments | Deployment history |
| Collection | Purpose |
|---|---|
| users | Accounts: bcrypt passwords, encrypted GitHub tokens |
| projects | Project metadata, stage, encrypted cloud credentials |
| prd_conversations | Agent 1 sessions: full message history, plan JSON |
| architectures | Agent 2 sessions: diagram, test results, violations |
| builds | Agent 3 outputs: all generated artifacts |
| deployments | CloudFormation stacks: logs, resource statuses, outputs |
# Backend
cd backend
uv sync
ollama serve &   # start the Ollama server first (skip if it already runs as a service)
ollama pull codeqwen:latest
uv run uvicorn app.main:app --reload --port 8000
# Frontend
cd frontend
npm install && npm run dev

Required env vars: JWT_SECRET_KEY, FERNET_KEY, ANTHROPIC_API_KEY. See backend/.env.sample.
- JWT access tokens (24h) + refresh tokens (30d)
- bcrypt password hashing
- Fernet (AES-128-CBC) for cloud credential storage
- IAM role assumption — no persistent AWS access keys
- Rate limiting on all auth endpoints (SlowAPI)
- Server refuses to start with the default JWT_SECRET_KEY
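A startup guard for the JWT secret can be as small as the sketch below. The placeholder value `change-me` and the `require_jwt_secret` helper are hypothetical; the real default and where the check lives are not stated in this README.

```python
DEFAULT_JWT_SECRET = "change-me"  # hypothetical placeholder value


def require_jwt_secret(env):
    """Return the configured secret, or raise so the server never boots with the default."""
    secret = env.get("JWT_SECRET_KEY", "")
    if not secret or secret == DEFAULT_JWT_SECRET:
        raise RuntimeError(
            "JWT_SECRET_KEY is unset or still the default; refusing to start"
        )
    return secret


# In an application this would be called once at startup with os.environ.
configured = require_jwt_secret({"JWT_SECRET_KEY": "a-long-random-value"})
```

Failing at startup turns a silent misconfiguration into an immediate, visible error instead of tokens signed with a publicly known key.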