README.md

Lines changed: 119 additions & 182 deletions
Large diffs are not rendered by default.

images/eps-assist-me-flowchart.png

78.6 KB
Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
# Bedrock Logging Config Function
2+
3+
CloudFormation custom resource Lambda that configures Bedrock model invocation logging to CloudWatch.
4+
5+
## What This Is
6+
7+
A bridge resource.
8+
AWS CloudFormation currently has no native resource type for configuring Bedrock logging. This Lambda bridges that gap by using the Bedrock API directly during stack deployment/update/deletion.
9+
10+
## What This Is Not
11+
12+
- Not a runtime dependency - it only executes during CDK/CloudFormation deployments
13+
- Not the log consumer - it only tells Bedrock *where* to send logs
14+
15+
## Architecture Overview
16+
17+
```mermaid
flowchart LR
    CloudFormation -->|Create/Update/Delete| ConfigLambda[bedrockLoggingConfigFunction]
    ConfigLambda -->|Put/Delete Logging Config| BedrockAPI[Bedrock API]
```
22+
23+
## Environment Variables
24+
25+
Configured by CDK based on stack parameters.
26+
27+
| Variable | Purpose |
|---|---|
| `ENABLE_LOGGING` | Toggle for enabling/disabling logs (`true` or `false`) |
| `CLOUDWATCH_LOG_GROUP_NAME` | Destination CloudWatch Log Group |
| `CLOUDWATCH_ROLE_ARN` | IAM Role allowing Bedrock to write to CloudWatch |
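
As a rough illustration of how these three variables could drive the custom resource, here is a minimal sketch. It is not the function's actual code. It assumes that a stack `Delete` or `ENABLE_LOGGING=false` removes the logging configuration and that anything else writes it. The function names `build_logging_config` and `apply_logging_config` are hypothetical, and the Bedrock client is passed in so the logic can be exercised without AWS access. Sending the result back to CloudFormation is not shown.

```python
def build_logging_config(env):
    """Build the loggingConfig payload from the environment variables above."""
    return {
        "cloudWatchConfig": {
            "logGroupName": env["CLOUDWATCH_LOG_GROUP_NAME"],
            "roleArn": env["CLOUDWATCH_ROLE_ARN"],
        }
    }


def apply_logging_config(bedrock, request_type, env):
    """Put or delete the account-level Bedrock logging configuration.

    bedrock      -- a boto3 "bedrock" client, or a test double with the same methods
    request_type -- the CloudFormation RequestType: "Create", "Update" or "Delete"
    env          -- a mapping such as os.environ
    """
    enabled = env.get("ENABLE_LOGGING", "false").lower() == "true"

    # Assumption: disabling logging and deleting the stack both clear the config.
    if request_type == "Delete" or not enabled:
        bedrock.delete_model_invocation_logging_configuration()
        return "deleted"

    bedrock.put_model_invocation_logging_configuration(
        loggingConfig=build_logging_config(env)
    )
    return "put"
```

In a real Lambda the client would come from `boto3.client("bedrock")` and `env` would be `os.environ`.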
32+
33+
## Known Constraints
34+
35+
- Bedrock model invocation logging is a single setting per AWS account and region, so this function changes it for the *entire account* in the region where the stack is deployed. If another stack also manages Bedrock logging in the same account and region, the two will overwrite each other's configuration.

packages/cdk/README.md

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,49 @@
1+
# CDK Infrastructure
2+
3+
AWS Cloud Development Kit (CDK) application defining the EPS Assist Me infrastructure.
4+
5+
## What This Is
6+
7+
The single source of truth for the project's cloud resources.
8+
Provisions the entire bot ecosystem in one deployable stack.
9+
10+
## Architecture
11+
12+
Provisions:
13+
14+
- **API Gateway** - Receives Slack events
15+
- **Lambda Functions** - `slackBotFunction`, `preprocessingFunction`, `syncKnowledgeBaseFunction`, `notifyS3UploadFunction`, `bedrockLoggingConfigFunction`
16+
- **Amazon Bedrock** - Knowledge Base and Data Source configuration
17+
- **OpenSearch Serverless** - Vector database for RAG document embeddings
18+
- **S3 Buckets** - Raw and processed document storage with event notifications
19+
- **DynamoDB** - Bot session state and feedback storage
20+
- **SQS** - Queue for asynchronous processing of document events
21+
- **IAM Roles** - Least-privilege access across services
22+
23+
## Project Structure
24+
25+
- `bin/` CDK app entry point (`EpsAssistMeApp.ts`)
26+
- `constructs/` Reusable Level 3 (L3) constructs (e.g. `RestApiGateway`, `LambdaFunction`, `DynamoDbTable`)
27+
- `resources/` L2/L1 definitions grouped by domain (e.g. `VectorKnowledgeBaseResources`, `OpenSearchResources`)
28+
- `stacks/` The actual CloudFormation stack definition (`EpsAssistMeStack`)
29+
- `prompts/` Text templates used to construct Bedrock prompts (System, User, Reformulation)
30+
31+
## Environment Variables
32+
33+
These are CDK context values, set in `cdk.json` or passed on the CLI with `--context`.
34+
35+
| Variable | Purpose |
|---|---|
| `accountId` | Target AWS Account ID |
| `stackName` | CloudFormation stack name |
| `versionNumber` | Stack version |
| `commitId` | Hash for tagging |
| `logRetentionInDays` | CloudWatch retention policy |
| `slackBotToken` | The OAuth token from Slack |
| `slackSigningSecret` | The signing secret from Slack |
44+
45+
## Deployment Notes
46+
47+
- Deployment uses context variables passed during synthesis (`cdk synth --context...`)
48+
- OpenSearch Serverless collections can take around 5-10 minutes to provision
49+
- Bedrock data source ingestion depends on IAM permissions, which can occasionally be slow to propagate on a first deploy
Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
# Preprocessing Function
2+
3+
Lambda that converts raw uploaded documents into Markdown format for Bedrock Knowledge Base ingestion.
4+
Runs sequentially when new documents land in the raw S3 bucket prefix.
5+
6+
## What This Is
7+
8+
A document standardisation step.
9+
Converts `.pdf`, `.doc`, `.docx`, `.xls`, `.xlsx`, and `.csv` files into `.md`.
10+
Passes through `.txt`, `.md`, `.html`, and `.json` files untouched.
11+
12+
Output is written to the processed S3 bucket prefix, ready for ingestion.
13+
14+
## Architecture Overview
15+
16+
```mermaid
flowchart LR
    S3Raw["S3 (raw/)"] -->|event| Preprocessing[preprocessingFunction]
    Preprocessing -->|convert/copy| S3Processed["S3 (processed/)"]
```
21+
22+
Downloads from `raw/`, converts to Markdown locally (in a secure temp directory), and uploads to `processed/`.
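
The routing described above (convert some extensions, pass others through, write under the processed prefix) could look roughly like the sketch below. It is illustrative only and is not taken from `converter.py` or `handler.py`. The function name `plan_record`, the `"skip"` action for unsupported files, and the destination key layout are assumptions. The one AWS-specific detail is that object keys in S3 event records are URL-encoded, so they are decoded before use.

```python
import posixpath
from urllib.parse import unquote_plus

# Extension lists taken from the "What This Is" section above.
CONVERT_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".csv"}
PASSTHROUGH_EXTENSIONS = {".txt", ".md", ".html", ".json"}


def plan_record(record, raw_prefix="raw/", processed_prefix="processed/"):
    """Return (action, source_key, destination_key) for one S3 event record.

    action is "convert", "copy" or "skip"; destination_key is None when skipping.
    """
    # S3 event notifications URL-encode the key (spaces arrive as "+").
    key = unquote_plus(record["s3"]["object"]["key"])

    if not key.startswith(raw_prefix):
        return ("skip", key, None)

    relative = key[len(raw_prefix):]
    stem, ext = posixpath.splitext(relative)
    ext = ext.lower()

    if ext in CONVERT_EXTENSIONS:
        # Assumption: converted files keep their path and swap the extension for .md.
        return ("convert", key, f"{processed_prefix}{stem}.md")
    if ext in PASSTHROUGH_EXTENSIONS:
        return ("copy", key, f"{processed_prefix}{relative}")
    return ("skip", key, None)
```

A handler would loop over `event["Records"]`, call `plan_record` on each, and then download, convert or copy, and upload accordingly.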
23+
24+
## Project Structure
25+
26+
- `app/handler.py` Lambda entry point. Processes S3 records.
27+
- `app/config/config.py` Configuration and environment variables.
28+
- `app/services/` Conversion logic (`converter.py`) and S3 helpers (`s3_client.py`).
29+
- `app/cli.py` Local CLI wrapper for convert-docs.
30+
- `tests/` Unit tests.
31+
32+
## Environment Variables
33+
34+
Set by CDK.
35+
36+
| Variable | Purpose |
|---|---|
| `DOCS_BUCKET_NAME` | S3 bucket containing the documents |
| `RAW_PREFIX` | Prefix where raw uploads land (e.g. `raw/`) |
| `PROCESSED_PREFIX` | Prefix where Markdown output goes (e.g. `processed/`) |
| `AWS_ACCOUNT_ID` | AWS Account ID |
42+
43+
## Running Tests
44+
45+
```bash
cd packages/preprocessingFunction
PYTHONPATH=. poetry run python -m pytest
```
49+
50+
Or from the repo root:
51+
52+
```bash
make test
```
55+
56+
## Known Constraints
57+
58+
- Complex PDFs (heavy formatting, multi-column layouts) may produce imperfect Markdown
59+
- Runs sequentially per uploaded file - large batch uploads may take time to process
Lines changed: 51 additions & 153 deletions
Original file line numberDiff line numberDiff line change
@@ -1,179 +1,77 @@
11
# Slack Bot Function
22

3-
AWS Lambda function that handles Slack interactions for the EPS Assist Me bot. Provides AI-powered responses to user queries about the NHS EPS API using Amazon Bedrock Knowledge Base.
3+
Lambda that handles all Slack interactions for the EPS Assist Me bot.
4+
Receives events from Slack, queries Bedrock Knowledge Base, returns AI-generated responses.
45

5-
## Architecture
6+
## What This Is
67

7-
- **Slack Bolt Framework**: Handles Slack events and interactions
8-
- **Amazon Bedrock**: RAG-based AI responses using knowledge base
9-
- **DynamoDB**: Session management and feedback storage
10-
- **Async Processing**: Self-invoking Lambda for long-running AI queries
8+
The core bot logic. Handles:
119

12-
## User Interaction Patterns
10+
- `@mentions` in public channels
11+
- direct messages
12+
- thread follow-ups (no re-mention needed)
13+
- feedback (Yes/No buttons and `feedback:` text prefix)
1314

14-
### Starting Conversations
15+
One Lambda. Uses a self-invoking async pattern to handle heavy processing while still acknowledging events within Slack's 3-second response window.
1516

16-
**Public Channels** - Mention the bot:
17-
```
18-
#general channel:
19-
User: "@eps-bot What is EPS API?"
20-
Bot: "EPS API is the Electronic Prescription Service..."
21-
```
17+
## What This Is Not
2218

23-
**Direct Messages** - Send message directly:
24-
```
25-
DM to @eps-bot:
26-
User: "How do I authenticate with EPS?"
27-
Bot: "Authentication requires..."
28-
```
19+
- Not the infrastructure - that's in `packages/cdk/`
20+
- Not the document ingestion pipeline - that's `preprocessingFunction` and `syncKnowledgeBaseFunction`
21+
- Not the upload notifier - that's `notifyS3UploadFunction`
2922

30-
### Follow-up Questions
31-
32-
**In Channel Threads** - No @mention needed after initial conversation:
33-
```
34-
#general channel thread:
35-
User: "@eps-bot What is EPS API?" ← Initial mention required
36-
Bot: "EPS API is..."
37-
User: "Can you explain more about authentication?" ← No mention needed
38-
Bot: "Authentication works by..."
39-
User: "What about error handling?" ← Still no mention needed
40-
```
41-
42-
**In DMs** - Continue messaging naturally:
43-
```
44-
DM conversation:
45-
User: "How do I authenticate?"
46-
Bot: "Use OAuth 2.0..."
47-
User: "What scopes do I need?" ← Natural follow-up
48-
Bot: "Required scopes are..."
49-
```
50-
51-
### Providing Feedback
52-
53-
**Button Feedback** - Click Yes/No on bot responses:
54-
```
55-
Bot: "EPS API requires OAuth authentication..."
56-
[Yes] [No] ← Click buttons
57-
```
23+
## Architecture Overview
5824

59-
**Text Feedback** - Use "feedback:" prefix anytime (applies to most recent bot response):
60-
```
61-
Bot: "EPS API requires OAuth authentication..."
62-
User: "feedback: This was very helpful, thanks!"
63-
User: "feedback: Could you add more error code examples?"
64-
User: "feedback: The authentication section needs clarification"
25+
```mermaid
26+
flowchart LR
27+
SlackEvent[Slack Event] -->|3s timeout| Handler
28+
Handler -->|async| SelfInvoke[Self-invoke]
29+
SelfInvoke --> Bedrock[Bedrock KB]
30+
Bedrock --> Response[Slack Response]
31+
Handler --> DynamoDB[DynamoDB]
6532
```
6633

67-
## Handler Architecture
34+
- **Slack Bolt** for event handling
35+
- **Bedrock Knowledge Base** for RAG responses with guardrails
36+
- **DynamoDB** for session state and feedback storage
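
The self-invoke step in the diagram can be sketched as follows. This is a simplified illustration rather than the code in `app/handler.py`, which uses Slack Bolt. The `async_processing` flag, the payload shape, and the injected `lambda_client` and `process` parameters are assumptions made to keep the example self-contained. The only AWS call used is Lambda `invoke` with `InvocationType="Event"`, which queues the invocation and returns without waiting for it to finish.

```python
import json

ASYNC_FLAG = "async_processing"  # hypothetical marker for the second invocation


def handle(event, context, lambda_client, process):
    """Acknowledge quickly, then do the slow work in a second invocation.

    lambda_client -- a boto3 "lambda" client, or a test double with invoke()
    process       -- callable doing the slow part (Bedrock query, Slack reply)
    """
    if event.get(ASYNC_FLAG):
        # Second pass: no Slack deadline applies here.
        process(event["slack_event"])
        return {"statusCode": 200}

    # First pass: hand the event to a fresh asynchronous invocation of this
    # same function, then return straight away so Slack gets its acknowledgement.
    lambda_client.invoke(
        FunctionName=context.function_name,
        InvocationType="Event",
        Payload=json.dumps({ASYNC_FLAG: True, "slack_event": event}).encode("utf-8"),
    )
    return {"statusCode": 200, "body": ""}
```

A real entry point would keep the standard `(event, context)` signature and create the client once at module level. The dependencies are passed in here only so that the flow can be tested without AWS.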
6837

69-
- **`mention_handler`**: Processes @mentions in public channels
70-
- **`dm_message_handler`**: Handles direct messages to the bot
71-
- **`thread_message_handler`**: Manages follow-up replies in existing threads
72-
- **`feedback_handler`**: Processes Yes/No button clicks
38+
## Project Structure
7339

74-
### Conversation Flow
75-
```
76-
Channel:
77-
User: "@eps-bot What is EPS?" ← mention_handler
78-
Bot: "EPS is..." [Yes] [No]
79-
80-
├─ User clicks [Yes] ← feedback_handler
81-
│ Bot: "Thank you for your feedback."
82-
83-
├─ User clicks [No] ← feedback_handler
84-
│ Bot: "Please provide feedback:"
85-
│ User: "feedback: Need more examples" ← thread_message_handler
86-
│ Bot: "Thank you for your feedback."
87-
88-
└─ User: "Tell me more" ← thread_message_handler
89-
Bot: "More details..." [Yes] [No]
90-
91-
DM:
92-
User: "How do I authenticate?" ← dm_message_handler
93-
Bot: "Use OAuth..." [Yes] [No]
94-
User clicks [Yes/No] ← feedback_handler
95-
Bot: "Thank you for your feedback."
96-
User: "feedback: Could be clearer" ← dm_message_handler
97-
Bot: "Thank you for your feedback."
98-
User: "What scopes?" ← dm_message_handler
99-
```
40+
- `app/handler.py` Lambda entry point.
41+
- `app/core/` Configuration and environment variables.
42+
- `app/services/` Business logic - Bedrock client, DynamoDB, Slack client, prompt loading, AI processing.
43+
- `app/slack/` Event handlers - mentions, DMs, threads, feedback.
44+
- `app/utils/` Shared utilities.
45+
- `tests/` Unit tests.
10046

101-
## Conversation Flow Rules
47+
## Environment Variables
10248

103-
1. **Public channels**: Must @mention bot to start conversation
104-
2. **Threads**: After initial @mention, no further mentions needed
105-
3. **DMs**: No @mention required, direct messaging
106-
4. **Feedback restrictions**:
107-
- Only available on most recent bot response
108-
- Cannot vote twice on same message (Yes/No)
109-
- Cannot rate old messages after conversation continues
110-
5. **Text feedback**: Use "feedback:" prefix anytime in conversation (multiple comments allowed)
111-
- Feedback applies to the most recent bot message in the conversation
49+
Set by CDK. Don't hardcode these.
11250

113-
## Technical Implementation
51+
| Variable | Purpose |
|---|---|
| `SLACK_BOT_TOKEN_PARAMETER` | Parameter Store path for bot token |
| `SLACK_SIGNING_SECRET_PARAMETER` | Parameter Store path for signing secret |
| `SLACK_BOT_STATE_TABLE` | DynamoDB table name |
| `KNOWLEDGEBASE_ID` | Bedrock Knowledge Base ID |
| `RAG_MODEL_ID` | Bedrock model ARN |
| `GUARD_RAIL_ID` | Bedrock guardrail ID |
11459

115-
### Event Processing Flow
116-
```
117-
Slack Event → Handler (3s timeout) → Async Lambda → Bedrock → Response
118-
```
60+
## Running Tests
11961

120-
### Data Storage
121-
- **Sessions**: 30-day TTL for conversation continuity
122-
- **Q&A Pairs**: 90-day TTL for feedback correlation
123-
- **Feedback**: 90-day TTL for analytics
124-
- **Event Dedup**: 1-hour TTL for retry handling
125-
126-
### Privacy Features
127-
- **Automatic cleanup**: Q&A pairs without feedback are deleted when new messages arrive (reduces data retention by 70-90%)
128-
- **Data minimisation**: Configurable TTLs automatically expire old data
129-
- **Secure credentials**: Slack tokens stored in AWS Parameter Store
130-
131-
### Feedback Protection
132-
- **Latest message only**: Users can only rate the most recent bot response in each conversation
133-
- **Duplicate prevention**: Users cannot vote twice on the same message (Yes/No buttons)
134-
- **Multiple text feedback**: Users can provide multiple detailed comments using "feedback:" prefix
135-
136-
## Configuration
137-
138-
### Environment Variables
139-
- `SLACK_BOT_TOKEN_PARAMETER`: Parameter Store path for bot token
140-
- `SLACK_SIGNING_SECRET_PARAMETER`: Parameter Store path for signing secret
141-
- `SLACK_BOT_STATE_TABLE`: DynamoDB table name
142-
- `KNOWLEDGEBASE_ID`: Bedrock Knowledge Base ID
143-
- `RAG_MODEL_ID`: Bedrock model ARN
144-
- `GUARD_RAIL_ID`: Bedrock guardrail ID
145-
146-
### DynamoDB Schema
147-
```
148-
Primary Key: pk (partition key), sk (sort key)
149-
150-
Sessions: pk="thread#C123#1234567890", sk="session"
151-
Q&A Pairs: pk="qa#thread#C123#1234567890#1234567891", sk="turn"
152-
Feedback: pk="feedback#thread#C123#1234567890#1234567891", sk="user#U123"
153-
Text Notes: pk="feedback#thread#C123#1234567890#1234567891", sk="user#U123#note#1234567892"
62+
```bash
63+
cd packages/slackBotFunction
64+
PYTHONPATH=. poetry run python -m pytest
15465
```
15566

156-
## Development
67+
Or from the repo root:
15768

158-
### Local Testing
15969
```bash
160-
# Install dependencies
161-
npm install
162-
163-
# Run tests
164-
npm test
165-
166-
# Deploy to dev environment
167-
make cdk-deploy STACK_NAME=your-dev-stack
70+
make test
16871
```
16972

170-
### Debugging
171-
- Check CloudWatch logs for Lambda execution details
172-
- Monitor DynamoDB for session and feedback data
173-
174-
## Monitoring
175-
176-
- **CloudWatch Logs**: `/aws/lambda/{stack-name}-SlackBotFunction`
177-
- **DynamoDB Metrics**: Built-in AWS metrics for table operations
73+
## Known Constraints
17874

179-
**Note**: No automated alerts configured. Uses AWS built-in metrics and manual log review.
75+
- Slack expects an acknowledgement within 3 seconds. The quick acknowledgement is mandatory; how the remaining work runs in the background (here, the async self-invoke pattern) is an architectural choice.
76+
- Bedrock guardrails can block legitimate queries if they hit content filters - check CloudWatch logs
77+
- Session state lives in DynamoDB with TTLs - conversations expire after 30 days
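
To make the 30-day expiry concrete, here is a small sketch of how a session item with a TTL might be built. It is an assumption-laden example, not the project's code. The `pk`/`sk` format is borrowed from the schema in the previous version of this README, the attribute name `ttl` is hypothetical, and `session_item` and `is_expired` are illustrative helpers. DynamoDB TTL expects a Number attribute holding an epoch timestamp in seconds, and expired items can remain readable for a while before DynamoDB removes them, so readers may want to filter on the timestamp as well.

```python
import time

SESSION_TTL_DAYS = 30
SECONDS_PER_DAY = 24 * 60 * 60


def session_item(channel_id, thread_ts, now=None):
    """Build a session item whose TTL falls 30 days after `now` (epoch seconds)."""
    now = int(time.time()) if now is None else int(now)
    return {
        "pk": f"thread#{channel_id}#{thread_ts}",  # assumed key format
        "sk": "session",
        "ttl": now + SESSION_TTL_DAYS * SECONDS_PER_DAY,  # hypothetical attribute name
    }


def is_expired(item, now=None):
    """Treat an item as expired once its TTL has passed, even if still stored."""
    now = int(time.time()) if now is None else int(now)
    return item["ttl"] <= now
```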
