repowise analyses codebases written in many languages. Each language goes through a multi-stage pipeline: file traversal, AST parsing, import resolution, call graph construction, and heritage (inheritance) extraction. Not every language has reached full coverage yet --- this page documents exactly what works today, what is coming next, and how to add a new language.
Every language falls into one of five tiers. The tier determines which pipeline stages produce meaningful output.
| Stage | Full | Partial | Scaffolded | Traversal | Config / Data |
|---|---|---|---|---|---|
| File discovery & git history | Y | Y | Y | Y | Y |
| AST symbol extraction | Y | Y | -- | -- | -- |
| Import resolution | Y | Y | -- | -- | -- |
| Call graph edges | Y | partial | -- | -- | -- |
| Heritage (extends/implements) | Y | partial | -- | -- | -- |
| Named bindings | Y | -- | -- | -- | -- |
| Dead code detection | Y | Y | Y | Y | -- |
| Semantic search & wiki pages | Y | Y | Y | Y | Y |
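The matrix above can also be read as data. The following is a minimal sketch, not repowise's actual representation: the stage keys, the `TIER_SUPPORT` table, and `stages_for` are hypothetical names chosen for illustration, and the flags are transcribed directly from the table.

```python
# Stage names in the same order as the rows of the tier table.
STAGES = [
    "file_discovery", "ast_symbols", "import_resolution", "call_graph",
    "heritage", "named_bindings", "dead_code", "semantic_search",
]

# One flag per stage: Y = supported, p = partial, - = no meaningful output.
TIER_SUPPORT = {
    "full":       "YYYYYYYY",
    "partial":    "YYYpp-YY",
    "scaffolded": "Y-----YY",
    "traversal":  "Y-----YY",
    "config":     "Y------Y",
}


def stages_for(tier, include_partial=True):
    """Return the stages that produce output for a tier."""
    allowed = {"Y", "p"} if include_partial else {"Y"}
    return [s for s, flag in zip(STAGES, TIER_SUPPORT[tier]) if flag in allowed]


print(stages_for("config"))
print(stages_for("partial", include_partial=False))
```

Passing `include_partial=False` keeps only the stages a tier fully supports, which is the stricter reading of the table.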
Languages with complete pipeline coverage: AST parsing, import resolution, call resolution, named bindings, heritage extraction, and docstrings.
| Language | Extensions | Entry Points | Import Style |
|---|---|---|---|
| Python | `.py` `.pyi` | `main.py` `app.py` `__main__.py` `manage.py` `wsgi.py` `asgi.py` | `import x` / `from x import y` |
| TypeScript | `.ts` `.tsx` | `index.ts` `main.ts` `app.ts` `server.ts` | `import { x } from 'y'` / `require()` with tsconfig path aliases, npm/yarn/pnpm workspaces, and optional `.vue`/`.svelte`/`.astro` SFC probing |
| JavaScript | `.js` `.jsx` `.mjs` `.cjs` | `index.js` `main.js` `app.js` `server.js` | `import` / `require()` |
| Java | `.java` | `Main.java` `Application.java` | `import pkg.Class` |
| Go | `.go` | `main.go` `cmd/main.go` | `import "path"` with multi-module `go.mod` discovery (longest-prefix match) |
| Rust | `.rs` | `main.rs` `lib.rs` | `use crate::` / `use super::` / `use self::` with `Cargo.toml` |
| C++ | `.cpp` `.cc` `.cxx` `.h` `.hpp` `.hxx` | `main.cpp` `main.cc` | `#include` with `compile_commands.json` resolution |
| C# | `.cs` | `Program.cs` `Startup.cs` | `using Acme.Domain` / `global using` / `using static` / `using Alias = X.Y.Z` with `.csproj` / `.sln` / `Directory.Build.props` resolution |
All eight languages support:
- Tree-sitter AST parsing with dedicated `.scm` query files
- Three-tier call resolution (same-file, cross-file, global stem match)
- Named binding extraction (mapping imported names to source symbols)
- Heritage extraction (class/interface/trait/record inheritance chains)
- Docstring extraction (Python, JSDoc, GoDoc, Rustdoc, Javadoc, Doxygen, XML doc)
- Framework-aware edges (Django, FastAPI, Flask for Python; tsconfig path aliases for TS/JS; pytest fixture detection; ASP.NET controllers / minimal API / EF Core DbContext for C#; Spring Boot DI + `@Bean` factories for Java/Kotlin; Rails routes + ActiveRecord relationships; Laravel routes + service providers + Eloquent; Express `app.use(router)` + NestJS `@Module` arrays; Gin/Echo/Chi router → handler files for Go; Axum/Actix `.route` → handler files for Rust)
- Per-language dynamic-hint extractors (Django/Pytest/Node for Python+JS/TS; .NET DI/Activator/InternalsVisibleTo for C#; Spring `getBean`/`@Bean` factories for Java/Kotlin; Ruby `send`/`const_get`/`define_method`/`delegate`; PHP `call_user_func`/`ReflectionClass`/container `get`; Scala `Class.forName`/`given`/`implicit val`; Swift `NSClassFromString`/`Selector`/`#selector`/KVC; C function-pointer assignment + `dlopen`/`dlsym`; Luau `game:GetService`/`setmetatable __index`; Go `reflect.TypeOf`/`plugin.Open`/`plugin.Lookup`)
- For C# only: MSBuild project graph (`<ProjectReference>`/`<PackageReference>`), namespace → file mapping across projects, `global using`/`using static`/`using alias` propagation, ASP.NET HTTP and gRPC-dotnet contract extraction in workspace mode, cross-repo `<ProjectReference>` and internal-NuGet detection
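To illustrate the three-tier call resolution named in the list above, here is a rough sketch. It is not repowise's implementation: `resolve_call` and its dictionary inputs are hypothetical. The third tier is approximated as a repository-wide name lookup that is accepted only when unambiguous; this page does not specify how the real global stem match ranks candidates.

```python
def resolve_call(name, caller_file, symbols_by_file, imports_by_file):
    """Resolve a called name to (defining_file, tier), or None.

    symbols_by_file: {file: set of symbol names defined in it}
    imports_by_file: {file: list of files it imports}
    Both structures are illustrative assumptions.
    """
    # Tier 1: a symbol defined in the calling file wins.
    if name in symbols_by_file.get(caller_file, set()):
        return (caller_file, "same-file")

    # Tier 2: look in the files the caller imports, in import order.
    for imported in imports_by_file.get(caller_file, []):
        if name in symbols_by_file.get(imported, set()):
            return (imported, "cross-file")

    # Tier 3: repository-wide lookup, accepted only if exactly one file matches.
    candidates = sorted(f for f, syms in symbols_by_file.items() if name in syms)
    if len(candidates) == 1:
        return (candidates[0], "global")
    return None


symbols = {"a.py": {"helper"}, "b.py": {"load"}, "c.py": {"orphan"}}
imports = {"a.py": ["b.py"]}
print(resolve_call("load", "a.py", symbols, imports))
```

Ordering the tiers from most to least specific means a local definition shadows an imported one, and a guess from the global tier is only used when nothing closer matched.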
Languages with AST parsing, symbol extraction, import resolution, call resolution, named bindings, heritage extraction (including Ruby mixins, Rust derive, Swift extension conformance, PHP trait use), and docstrings. Each language has a dedicated import resolver.
| Language | Extensions | Entry Points | Import Style |
|---|---|---|---|
| C | `.c` | `main.c` | `#include` with `compile_commands.json` (shares C++ grammar) |
| Kotlin | `.kt` `.kts` | `Main.kt` `Application.kt` | `import com.example.Foo` with Gradle `settings.gradle(.kts)` subprojects + `sourceSets` overrides |
| Ruby | `.rb` | `main.rb` `app.rb` `config.ru` | `require 'mod'` / `require_relative './mod'` plus Rails / Zeitwerk autoloading (gated on `config/application.rb`) |
| Swift | `.swift` | `main.swift` `App.swift` | `import Foundation` with SPM `Package.swift` `targets:` → directory mapping |
| Scala | `.scala` | `Main.scala` `App.scala` | `import pkg.{A, B => C}` with SBT `build.sbt` / Mill `build.sc` multi-project parsing |
| PHP | `.php` | `index.php` `public/index.php` | `use Foo\Bar\Baz` with `composer.json` `autoload.psr-4` longest-prefix resolution |
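The longest-prefix resolution mentioned for PHP can be sketched as follows. This is an illustration of the PSR-4 mapping rule rather than repowise's resolver: `resolve_psr4` is a hypothetical helper, and it assumes each prefix maps to a single directory string (in `composer.json` a prefix may also map to a list of directories, which this sketch does not handle).

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def resolve_psr4(fqcn: str, psr4: Dict[str, str]) -> Optional[str]:
    """Map a fully qualified class name to a file path via PSR-4 prefixes."""
    fqcn = fqcn.lstrip("\\")

    # Pick the longest namespace prefix that the class name starts with.
    best = None
    for prefix in psr4:
        if fqcn.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    if best is None:
        return None

    # The remainder of the name becomes the path below the mapped directory.
    relative = fqcn[len(best):].replace("\\", "/") + ".php"
    return str(PurePosixPath(psr4[best]) / relative)


autoload = {"App\\": "src/", "App\\Tests\\": "tests/"}
print(resolve_psr4("App\\Tests\\FooTest", autoload))
print(resolve_psr4("App\\Http\\Kernel", autoload))
```

Choosing the longest prefix matters when one namespace is nested inside another, as with `App\Tests\` inside `App\` here. The same idea applies to the multi-module `go.mod` discovery listed for Go.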
Non-code files included in the file tree and wiki. Special handlers extract endpoints or targets where applicable.
| Language | Extensions / Filenames | Special Handler |
|---|---|---|
| OpenAPI | YAML/JSON with `openapi` or `swagger` key | Extracts API paths and schemas |
| Dockerfile | `Dockerfile` | Extracts stages and exposed ports |
| Makefile | `Makefile` `GNUmakefile` | Extracts targets |
| Protobuf | `.proto` | -- |
| GraphQL | `.graphql` `.gql` | -- |
| Terraform | `.tf` `.hcl` | -- |
| YAML | `.yaml` `.yml` | -- |
| JSON | `.json` | -- |
| TOML | `.toml` | -- |
| Markdown | `.md` `.mdx` | -- |
| SQL | `.sql` | -- |
| Shell | `.sh` `.bash` `.zsh` | -- |
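As an example of what a special handler might do, the sketch below pulls build stages and exposed ports out of a Dockerfile with two regular expressions. It is an assumption-laden illustration: `summarize_dockerfile` is a hypothetical name, the output shape is invented, and line continuations and `ARG` substitution are ignored.

```python
import re

# FROM [--flag=value ...] image [AS name]
_FROM = re.compile(r"^\s*FROM\s+(?:--\S+\s+)*(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE)
# EXPOSE port[/protocol] ...
_EXPOSE = re.compile(r"^\s*EXPOSE\s+(.+)$", re.IGNORECASE)


def summarize_dockerfile(text):
    """Return the build stages and exposed ports declared in a Dockerfile."""
    stages, ports = [], []
    for line in text.splitlines():
        match = _FROM.match(line)
        if match:
            stages.append({"image": match.group(1), "name": match.group(2)})
            continue
        match = _EXPOSE.match(line)
        if match:
            ports.extend(match.group(1).split())
    return {"stages": stages, "ports": ports}


example = """\
FROM python:3.12 AS build
RUN pip install .
FROM gcr.io/distroless/python3
EXPOSE 8080 9090/udp
"""
print(summarize_dockerfile(example))
```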
| Language | Extensions | Entry Points | Import Style |
|---|---|---|---|
| Luau | `.luau` `.lua` | `init.luau` `init.lua` | `require(script.Parent.X)` / `require(script.X)` / `require(game.Service.Path)` / `require("rel/path")` |
AST parsing, symbol extraction (functions, Luau type aliases), and
require(...) call capture are wired. Import resolution handles string
literals and script/script.Parent relative instance paths. Absolute
Roblox instance paths (game.<Service>...) currently register as external
nodes and are the target of a follow-up that reads Rojo's
default.project.json tree mapping — see issue #52.
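The three kinds of `require(...)` argument described above can be told apart with a small classifier. This is only a sketch of the distinction the text draws: `classify_require` and its return labels are hypothetical, and it looks at the argument text alone without resolving anything.

```python
import re


def classify_require(arg: str) -> str:
    """Classify the argument text of a Luau require(...) call."""
    arg = arg.strip()
    # Quoted string literal, e.g. "rel/path"
    if len(arg) >= 2 and arg[0] in "\"'" and arg[-1] == arg[0]:
        return "string"
    # Instance path relative to the current script, e.g. script.Parent.X
    if re.match(r"script(\.|$)", arg):
        return "relative"
    # Absolute Roblox instance path, e.g. game.ReplicatedStorage.Util
    if re.match(r"game(\.|:|$)", arg):
        return "external"
    return "unknown"


for sample in ['"rel/path"', "script.Parent.X", "game.ReplicatedStorage.Util"]:
    print(sample, "->", classify_require(sample))
```

The `external` label mirrors the current behaviour described above, where absolute `game.` paths are registered as external nodes.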
These languages are tracked in git history (blame, hotspot analysis, co-change detection) but have no AST parsing or dedicated support. Files appear in the wiki as traversal-level entries.
Objective-C, Elixir, Erlang, R, Dart, Zig, Julia, Clojure, Elm, Haskell, OCaml, F#, Crystal, Nim, D
```text
File discovered by FileTraverser
        |
        v
Extension/filename -> LanguageTag (via LanguageRegistry)
        |
        +-- Config/data language?  -> empty ParsedFile (passthrough)
        +-- Special format?        -> special_handlers.py (OpenAPI/Dockerfile/Makefile)
        +-- Has grammar?           -> tree-sitter AST parsing
        |
        v
.scm query extracts:
    @symbol.def / @symbol.name          -> Symbol nodes
    @import.statement / @import.module  -> Import edges
    @call.target / @call.receiver       -> Call edges
        |
        v
Per-language extractors:
    - Named bindings (import name -> source symbol)
    - Heritage (extends/implements/traits)
    - Docstrings (Python, JSDoc, GoDoc, Rustdoc, Javadoc)
    - Visibility (public/private/protected)
        |
        v
GraphBuilder resolves imports:
    Python:  dotted module paths, __init__.py, src/ layout
    TS/JS:   relative paths, tsconfig aliases, node_modules
    Go:      go.mod module path stripping
    Rust:    crate::/self::/super::, mod.rs probing
    C/C++:   compile_commands.json include directories
    Other:   stem-map fallback (filename matching)
        |
        v
Graph analysis:
    PageRank, community detection, dead code, execution flows
```
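The first branch of the flow above, choosing how a discovered file is handled, could look roughly like this. It is a simplified sketch under stated assumptions: the lookup tables are a tiny hypothetical subset, `route` is an invented name, and OpenAPI detection is omitted because it depends on file contents rather than the file name.

```python
from pathlib import PurePosixPath

# Hypothetical subsets of the registry, for illustration only.
EXTENSION_TAGS = {".py": "python", ".go": "go", ".yaml": "yaml", ".json": "json", ".ex": "elixir"}
FILENAME_TAGS = {"Dockerfile": "dockerfile", "Makefile": "makefile", "GNUmakefile": "makefile"}
SPECIAL_TAGS = {"dockerfile", "makefile"}
CONFIG_TAGS = {"yaml", "json"}
GRAMMAR_TAGS = {"python", "go"}


def route(path: str) -> str:
    """Decide which pipeline branch a file takes, based on its name."""
    p = PurePosixPath(path)
    # Exact filenames take precedence over extensions.
    tag = FILENAME_TAGS.get(p.name) or EXTENSION_TAGS.get(p.suffix.lower())
    if tag is None:
        return "unknown"
    if tag in SPECIAL_TAGS:
        return "special_handler"
    if tag in CONFIG_TAGS:
        return "passthrough"
    if tag in GRAMMAR_TAGS:
        return "tree_sitter"
    return "traversal_only"


for sample in ["src/app.py", "Dockerfile", "config/settings.yaml", "lib/foo.ex"]:
    print(sample, "->", route(sample))
```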
The pipeline is fully modular. Language identity data lives in the
centralised LanguageRegistry, per-language extraction logic lives in
extractors/, and per-language import resolution lives in resolvers/.
Adding a new language touches these places:
Edit packages/core/src/repowise/core/ingestion/languages/registry.py and
add a new LanguageSpec(...) entry to the _SPECS tuple:
```python
LanguageSpec(
    tag="mylang",
    display_name="MyLang",
    extensions=frozenset({".ml"}),
    grammar_package="tree_sitter_mylang",   # PyPI package name
    scm_file="mylang.scm",                  # query file name
    heritage_node_types=frozenset({"class_declaration"}),
    entry_point_patterns=("main.ml",),
    manifest_files=("mylang.toml",),
    shebang_tokens=("mylang",),
    builtin_calls=frozenset({"print", "len"}),   # filter from call graph
    builtin_parents=frozenset({"Object"}),       # filter from heritage
    color_hex="#AB47BC",
)
```

Add `"mylang"` to the `LanguageTag` `Literal` type in
`packages/core/src/repowise/core/ingestion/models.py`.
Create packages/core/src/repowise/core/ingestion/queries/mylang.scm using
tree-sitter S-expression syntax. Follow the capture-name conventions:
| Capture | Purpose | Required? |
|---|---|---|
| `@symbol.def` | Full definition node (line numbers, kind lookup) | Yes |
| `@symbol.name` | Name identifier | Yes |
| `@symbol.params` | Parameter list | No |
| `@symbol.modifiers` | Decorators / visibility modifiers | No |
| `@symbol.receiver` | Go-style method receiver | No |
| `@import.statement` | Full import node | Yes |
| `@import.module` | Module path being imported | Yes |
| `@call.target` | Function/method being called | No (enables call graph) |
| `@call.receiver` | Object the call is made on | No |
| `@call.arguments` | Call arguments | No |
Look at existing .scm files for examples --- python.scm and
typescript.scm are good starting points.
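When drafting a new query file, a quick text-level check for the required captures in the table above can catch omissions early. The helper below is a hypothetical convenience, not part of repowise. It scans the `.scm` text with a regular expression instead of compiling the query, and its comment stripping would misfire on a `;` inside a string literal.

```python
import re

# Captures marked "Yes" in the Required? column.
REQUIRED_CAPTURES = {"@symbol.def", "@symbol.name", "@import.statement", "@import.module"}


def missing_captures(scm_text: str) -> set:
    """Return required capture names that never appear in the query text."""
    # Drop ';' comments so commented-out patterns do not count.
    code = "\n".join(line.split(";", 1)[0] for line in scm_text.splitlines())
    found = set(re.findall(r"@[A-Za-z_][\w.]*", code))
    return REQUIRED_CAPTURES - found


draft = """
; functions
(function_definition name: (identifier) @symbol.name) @symbol.def
(import_statement) @import.statement
"""
print(missing_captures(draft))
```

The node names in `draft` are illustrative; a real query must use the node types of the language's own grammar.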
Add a parser configuration to `LANGUAGE_CONFIGS` in
`packages/core/src/repowise/core/ingestion/parser.py`:

```python
"mylang": LanguageConfig(
    symbol_node_types={
        "function_definition": "function",
        "class_definition": "class",
    },
    import_node_types=["import_statement"],
    export_node_types=[],
    visibility_fn=public_by_default,  # from extractors.visibility
    parent_extraction="nesting",
    parent_class_types=frozenset({"class_definition"}),
    entry_point_patterns=["main.ml"],
),
```

Add the grammar package to `pyproject.toml`:
```toml
[project]
dependencies = [
    # ...
    "tree-sitter-mylang>=0.23,<1",
]
```

For full-tier support, add an `extract_mylang_bindings()` function in
`packages/core/src/repowise/core/ingestion/extractors/bindings.py` and
register it in the `extract_import_bindings()` dispatcher. Without this,
imports are still resolved, but call resolution at the named-binding level
won't work.
Add a _extract_mylang_heritage() function in
packages/core/src/repowise/core/ingestion/extractors/heritage.py and
register it in the HERITAGE_EXTRACTORS dict. Without this, inheritance
chains won't appear in the graph.
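The register-in-a-dict pattern described above can be sketched as follows. The real extractors receive tree-sitter nodes and their signatures are not shown on this page, so this example substitutes a plain dictionary for the AST node. The edge tuple shape and the `builtin_parents` parameter of `extract_heritage` are assumptions made for illustration.

```python
def _extract_mylang_heritage(node):
    """Toy extractor: yields (child, parent, relation) edges.

    `node` is a plain dict standing in for an AST node.
    """
    return [(node["name"], base, "extends") for base in node.get("bases", [])]


# Registration: one entry per language tag.
HERITAGE_EXTRACTORS = {
    "mylang": _extract_mylang_heritage,
}


def extract_heritage(tag, node, builtin_parents=frozenset()):
    """Dispatch to the language's extractor and drop built-in parents."""
    extractor = HERITAGE_EXTRACTORS.get(tag)
    if extractor is None:
        # No extractor registered: the language contributes no heritage edges.
        return []
    return [edge for edge in extractor(node) if edge[1] not in builtin_parents]


dog = {"name": "Dog", "bases": ["Animal", "Object"]}
print(extract_heritage("mylang", dog, builtin_parents=frozenset({"Object"})))
```

The empty result for an unregistered tag corresponds to the behaviour noted above: without an extractor, inheritance chains simply do not appear in the graph.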
If the language has a non-trivial import system, create a resolver in
packages/core/src/repowise/core/ingestion/resolvers/mylang.py and
register it in the _RESOLVERS dict in resolvers/__init__.py. For simple
languages, the generic stem-map fallback (matching by filename) works out
of the box.
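A stem-map fallback of the kind mentioned above might work along these lines. This is a guess at the general technique, not the contents of `generic.py`: the function names are hypothetical, module specifiers are assumed to carry no file extension, and returning `None` on ambiguity is a design choice of this sketch.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath


def build_stem_map(paths):
    """Index repository files by filename stem ('src/a/util.ml' -> 'util')."""
    stem_map = defaultdict(list)
    for path in paths:
        stem_map[PurePosixPath(path).stem].append(path)
    return stem_map


def resolve_by_stem(module, stem_map):
    """Resolve an import to a file by matching its last path segment."""
    # Split on common separators: '.', '/', '\' and ':' (as in '::').
    segments = [s for s in re.split(r"[./\\:]+", module) if s]
    if not segments:
        return None
    candidates = stem_map.get(segments[-1], [])
    # Only accept an unambiguous match.
    return candidates[0] if len(candidates) == 1 else None


files = ["src/utils/helpers.ml", "src/main.ml", "tests/main.ml"]
stems = build_stem_map(files)
print(resolve_by_stem("utils.helpers", stems))
print(resolve_by_stem("main", stems))
```

Matching on the stem alone is cheap and language-agnostic, which is why it suits languages without a dedicated resolver, but it cannot distinguish files that share a name.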
```bash
# Run the parser tests
pytest tests/ -k "mylang or sample_repo" -x

# Index a real project
repowise init /path/to/mylang-project
```

No changes are needed to `traverser.py`, `dead_code.py`,
`page_generator.py`, `cost_estimator.py`, or any other consumer file ---
they all derive their language sets from the registry automatically.
The language pipeline is fully modular. Per-language code lives in dedicated subpackages — adding a new language means dropping a file into each subpackage rather than editing monoliths.
```text
ingestion/
  languages/              # LanguageRegistry + LanguageSpec (identity data)
  extractors/             # Per-language AST extraction
    visibility.py         # symbol visibility (public/private/protected)
    signatures.py         # human-readable signature building
    docstrings.py         # module + symbol docstring extraction
    bindings/             # import name + alias binding extraction (per-lang)
      __init__.py         # extract_import_bindings dispatcher
      python.py ts_js.py go.py rust.py java.py kotlin.py
      ruby.py csharp.py swift.py scala.py php.py cpp.py
    heritage/             # inheritance/interface/trait extraction (per-lang)
      __init__.py         # extract_heritage + HERITAGE_EXTRACTORS dispatcher
      python.py ts_js.py java.py go.py rust.py cpp.py
      kotlin.py ruby.py swift.py csharp.py scala.py php.py
  resolvers/              # Per-language import resolution
    python.py             # dotted imports, __init__.py, src/ layout
    typescript.py         # multi-ext probe, tsconfig aliases
    go.py                 # go.mod module path stripping
    rust.py               # crate::/self::/super::, mod.rs probing
    cpp.py                # compile_commands.json include paths
    kotlin.py             # package-to-directory mapping
    ruby.py               # require/require_relative resolution
    csharp.py / dotnet/   # namespace-based + MSBuild project graph
    swift.py              # module import resolution
    scala.py              # package-to-directory mapping
    php.py                # namespace/PSR-4 resolution
    generic.py            # stem-matching fallback
  framework_edges.py      # Django, FastAPI, Flask, pytest, ASP.NET detection
  dynamic_hints/          # Per-language dynamic-edge extractors
    base.py               # DynamicHintExtractor + DynamicEdge
    registry.py           # HintRegistry
    django.py pytest_hints.py node.py dotnet.py
    spring.py ruby.py php.py scala.py swift.py c.py luau.py go.py
  parser.py               # ASTParser (language-agnostic orchestration)
  graph.py                # GraphBuilder (import/call/heritage resolution)
analysis/
  dead_code/              # Dead code detection (Phase 1 split)
    __init__.py           # re-exports DeadCodeAnalyzer + dataclasses
    analyzer.py           # DeadCodeAnalyzer class + four detection passes
    models.py             # DeadCodeKind, DeadCodeFindingData, DeadCodeReport
    constants.py          # never-flag globs, framework decorators, fixtures
    dynamic_markers.py    # per-language source-text dynamic markers
```
Adding a new language requires zero changes to parser.py, graph.py,
traverser.py, or any analysis core file. New language work consists of
adding one file to each per-language subpackage and registering it in the
relevant __init__.py dispatcher / dict.
| Language | Target Tier | Status |
|---|---|---|
| Dart | Good | Planned — tree-sitter-dart available |
| Elixir | Good | Planned — tree-sitter-elixir available |
| F# | Good | Planned — tree-sitter-f-sharp available |