Merged
35 commits
5504541
feat(compiler): _read_entity_briefs for entity plan context
KylinMountain May 30, 2026
7181d57
test(compiler): parity tests for _read_entity_briefs
KylinMountain May 30, 2026
efacb6f
feat(compiler): _write_entity with type/aliases frontmatter
KylinMountain May 30, 2026
71a4a14
test(compiler): assert source ordering in _write_entity; count=1 in _…
KylinMountain May 30, 2026
97f1c51
feat(lint): include entities/ in wikilink whitelist
KylinMountain May 30, 2026
3c8aa93
feat(compiler): summary<->entity backlinks
KylinMountain May 30, 2026
ff1345e
test(compiler): restore assertion erroneously deleted in 3c8aa93
KylinMountain May 30, 2026
385defd
feat(compiler): index.md Entities section
KylinMountain May 30, 2026
41cda0f
feat(compiler): remove_doc_from_entity_pages + index cleanup
KylinMountain May 30, 2026
04d2bc9
feat(compiler): plan prompt + parser for entities group
KylinMountain May 30, 2026
ad45439
fix(compiler): related entities must not downgrade index labels
KylinMountain May 30, 2026
5008a14
feat(schema): declare entities/ page type and taxonomy
KylinMountain May 30, 2026
1e82214
feat(query): point who/what questions at entities/
KylinMountain May 30, 2026
3242844
docs(readme): document entities/ page type
KylinMountain May 30, 2026
ff3fafb
feat(cli): scaffold entities/ in init and count it in status
KylinMountain May 30, 2026
a7a06ed
fix(compiler): resolve entity-page review findings (dangling links + …
claude May 30, 2026
b882ee9
fix(compiler): add [[entities/X]] whitelist rule + restore concept-to…
KylinMountain May 31, 2026
d1dc637
feat(entities): shared page-dir constants + surface entities in list/…
KylinMountain May 31, 2026
bd81f7e
feat(entities): remove preview lists entity-page actions (#1)
KylinMountain May 31, 2026
3d7c842
docs(entities): document entity pages in shipped openkb skill (#8)
KylinMountain May 31, 2026
022aad4
fix(compiler): don't write raw JSON body on empty LLM content
KylinMountain May 31, 2026
1e2d5e0
fix(compiler): graceful scalar plan + rebuild malformed entity frontm…
KylinMountain May 31, 2026
b245128
fix(compiler): keep ## Entities before ## Explorations; drop dead par…
KylinMountain May 31, 2026
2f09fad
test(compiler): cover empty-content skip, scalar plan, malformed enti…
KylinMountain May 31, 2026
0a27c04
fix(compiler): silence spurious 'hand-edited' warning on backlink sec…
KylinMountain May 31, 2026
58f8edc
feat(cli): add `recompile` command to re-run compile on indexed docs
KylinMountain May 31, 2026
d10055d
test(cli): recompile dispatch/dry-run/skip/refresh-schema
KylinMountain May 31, 2026
1416b1f
docs(readme): document openkb recompile
KylinMountain May 31, 2026
c39baf0
fix(cli): recompile --refresh-schema no-ops when AGENTS.md absent; ti…
KylinMountain May 31, 2026
b9d0dc7
fix(compiler): drop non-existent 'related' slugs so they don't create…
KylinMountain May 31, 2026
19d0f61
fix: remove-preview detects JSON-quoted sources; _write_entity preser…
KylinMountain Jun 1, 2026
e30e40a
feat(cli): rename remove --keep-empty-concepts → --keep-empty (covers…
KylinMountain Jun 1, 2026
cea1b5e
feat(compiler): config-driven entity types (entity_types overrides th…
KylinMountain Jun 1, 2026
9555889
fix(compiler): harden config-driven entity types (crash-proof + compl…
KylinMountain Jun 1, 2026
4eb3797
refactor: move entity-type resolution to config layer + co-locate rem…
KylinMountain Jun 1, 2026
11 changes: 8 additions & 3 deletions README.md
@@ -109,6 +109,7 @@ wiki/ │ ← the foundation
├── sources/ Full-text conversions
├── summaries/ Per-document summaries
├── concepts/ Cross-document synthesis ← the good stuff
├── entities/ Specific named things (people, orgs, places, products)
├── explorations/ Saved query results
└── reports/ Lint reports
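The directory layout above, together with the "scaffold entities/ in init and count it in status" commit, could be sketched roughly as follows. This is a minimal illustration, not the project's code: `PAGE_DIRS`, `scaffold_wiki`, and `count_pages` are hypothetical names, and the assumption that each page is a single `.md` file directly inside its directory is mine.

```python
from pathlib import Path

# Page directories as listed in the README tree (hypothetical constant name).
PAGE_DIRS = ("sources", "summaries", "concepts", "entities", "explorations", "reports")


def scaffold_wiki(wiki_root: Path) -> None:
    """Create every page directory under wiki_root; safe to call repeatedly."""
    for name in PAGE_DIRS:
        (wiki_root / name).mkdir(parents=True, exist_ok=True)


def count_pages(wiki_root: Path) -> dict[str, int]:
    """Count Markdown pages per directory (assumes one .md file per page)."""
    return {
        name: sum(1 for _ in (wiki_root / name).glob("*.md"))
        for name in PAGE_DIRS
    }
```

Keeping the directory names in one shared tuple means a new page type such as `entities/` only has to be declared once for both scaffolding and counting to pick it up.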
@@ -136,9 +137,10 @@ Short docs are read in full by the LLM. Long PDFs are indexed by PageIndex into
When you add a document, the LLM:

1. Generates a **summary** page
-2. Reads existing **concept** pages
+2. Reads existing **concept** and **entity** pages
3. Creates or updates concepts with cross-document synthesis
-4. Updates the **index** and **log**
+4. Creates or updates **entity** pages (people, orgs, places, products)
+5. Updates the **index** and **log**

A single source might touch 10-15 wiki pages. Knowledge accumulates: each document enriches the existing wiki rather than sitting in isolation.
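
To make the "one source touches many pages" point concrete, here is a small sketch that lists the wiki pages a single add could write under the updated steps. Everything in it is an assumption for illustration: the `pages_touched` helper, the `<slug>.md` file naming, and the `log.md` filename are hypothetical and not confirmed by the diff.

```python
from pathlib import PurePosixPath


def pages_touched(doc_slug, concept_slugs, entity_slugs, wiki_root="wiki"):
    """List the wiki pages one added document could write (hypothetical helper)."""
    root = PurePosixPath(wiki_root)
    # Step 1: one summary page per document.
    pages = [root / "summaries" / f"{doc_slug}.md"]
    # Step 3: concept pages created or updated.
    pages += [root / "concepts" / f"{slug}.md" for slug in concept_slugs]
    # Step 4: entity pages created or updated.
    pages += [root / "entities" / f"{slug}.md" for slug in entity_slugs]
    # Step 5: index and log (the log filename is an assumption).
    pages += [root / "index.md", root / "log.md"]
    return [str(page) for page in pages]
```

With five concepts and four entities, for example, this yields twelve pages, which falls inside the 10-15 range the README describes.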

@@ -152,7 +154,8 @@ OpenKB commands fall into two layers: the **wiki foundation** (compile + manage
|---|---|
| `openkb init` | Initialize a new knowledge base (interactive) |
| <code>openkb&nbsp;add&nbsp;&lt;file_or_dir_or_URL&gt;</code> | Add documents and compile to wiki. URL ingest auto-detects PDF (saved as `.pdf` → PageIndex / markitdown) vs HTML (trafilatura main-content extract → `.md`) |
-| <code>openkb&nbsp;remove&nbsp;&lt;doc&gt;</code> | Remove a document and clean up its wiki pages, images, registry, and PageIndex state (use `--dry-run` to preview, `--keep-raw` / `--keep-empty-concepts` to retain artifacts) |
+| <code>openkb&nbsp;remove&nbsp;&lt;doc&gt;</code> | Remove a document and clean up its wiki pages, images, registry, and PageIndex state (use `--dry-run` to preview, `--keep-raw` / `--keep-empty` to retain artifacts) |
| <code>openkb&nbsp;recompile&nbsp;[&lt;doc&gt;]&nbsp;[--all]</code> | Re-run the current compile pipeline on already-indexed docs (e.g. to backfill the `entities/` layer) without re-indexing. Regenerates summaries and rewrites concept pages — manual edits are overwritten. Use `--dry-run` to preview, `--refresh-schema` to also update `wiki/AGENTS.md` |
| `openkb watch` | Watch `raw/` and auto-compile new files |
| `openkb lint` | Run structural + knowledge health checks |
| `openkb list` | List indexed documents and concepts |
@@ -268,6 +271,8 @@ language: en # Wiki output language
pageindex_threshold: 20 # PDF pages threshold for PageIndex
```

`entity_types` (optional): a YAML list overriding the entity-type vocabulary used for entity pages; omit it to use the default `person`, `organization`, `place`, `product`, `work`, `event`, `other`.

Model names use `provider/model` LiteLLM [format](https://docs.litellm.ai/docs/providers) (OpenAI models can omit the prefix):

| Provider | Model example |
9 changes: 9 additions & 0 deletions config.yaml.example
@@ -1,3 +1,12 @@
model: gpt-5.4 # LLM model (any LiteLLM-supported provider)
language: en # Wiki output language
pageindex_threshold: 20 # PDF pages threshold for PageIndex

# Optional: override the entity-type vocabulary used for entity pages.
# Omit this key to use the default 7 types
# (person, organization, place, product, work, event, other).
# entity_types:
# - person
# - organization
# - dataset
# - model
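
The override semantics described here (use `entity_types` when present, otherwise fall back to the seven defaults) might look something like the sketch below. It is an assumption-laden illustration rather than the project's implementation: `resolve_entity_types` and `DEFAULT_ENTITY_TYPES` are hypothetical names, and the choices to lowercase, de-duplicate, and fall back on malformed values are my guesses at what "crash-proof" could mean in the commit titles.

```python
from typing import Any, Mapping

# Default vocabulary as listed in the README and config.yaml.example.
DEFAULT_ENTITY_TYPES = (
    "person", "organization", "place", "product", "work", "event", "other",
)


def resolve_entity_types(config: Mapping[str, Any]) -> list[str]:
    """Return the entity-type vocabulary for a parsed config (hypothetical helper)."""
    raw = config.get("entity_types")
    # Missing key or a non-list value (e.g. a scalar): use the defaults.
    if not isinstance(raw, (list, tuple)):
        return list(DEFAULT_ENTITY_TYPES)
    seen: set[str] = set()
    resolved: list[str] = []
    for item in raw:
        if not isinstance(item, str):
            continue  # skip non-string entries instead of raising
        name = item.strip().lower()  # normalization is an assumption
        if name and name not in seen:
            seen.add(name)
            resolved.append(name)
    # A list with no usable entries also falls back to the defaults.
    return resolved or list(DEFAULT_ENTITY_TYPES)
```

With the commented-out example above enabled, `resolve_entity_types({"entity_types": ["person", "organization", "dataset", "model"]})` would return exactly those four types, replacing the default list rather than extending it.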