The stack overflow in resolve_class_fully_inner (caused by unbounded
mutual recursion between the lazy resolution pipeline and virtual
member providers) has been fixed via topological sort with eager cache
pre-population (ER1) plus recursion/depth guards throughout the
resolution pipeline (ER2). Virtual member application is now fully
iterative (ER3): providers read from the resolved-class cache instead
of resolving dependencies recursively, and topological population
order guarantees all dependencies are already cached. The depth guards
from ER2 remain as safety nets that never fire in practice.
Incremental repopulation on file change (ER4) ensures evicted classes
are eagerly re-populated in dependency order. String interning (ER6)
makes all symbol name comparisons pointer-sized and gives identity
hashing on hot-path lookups.
| Label | Scale |
|---|---|
| Impact | Critical, High, Medium-High, Medium, Low-Medium, Low |
| Effort | Low (≤ 1 day), Medium (2-5 days), Medium-High (1-2 weeks), High (2-4 weeks), Very High (> 1 month) |
PHPantom's diagnostic pipeline calls resolve_class_fully_inner
(which runs resolve_class_with_inheritance + virtual member
application + interface merging) every time a variable's type needs
to be resolved during analysis. On the shared project (~36k PHP
files), a single analyze run triggers 251,032 calls to
resolve_class_with_inheritance in the diagnostic phase alone.
The resolved-class cache mitigates this for repeated lookups of the
same class, but the forward walker (build_diagnostic_scopes) still
pays the cost of cache-key construction, lock acquisition, and
context setup on every expression. On examples/demo.php (~6 000 lines,
~300 classes), build_diagnostic_scopes alone takes 35.6 seconds
(78% of total analysis time in debug builds). The diagnostic
collectors add another 9.8 seconds (21%). Everything else (init,
parse, eager populate) is under 1 second.
Mago (mago/crates/codex/src/) uses a fundamentally different
architecture:
- Separated method storage. `ClassLikeMetadata.methods` is an `AtomSet` (just names). The actual method metadata lives in a global `CodebaseMetadata.function_likes: HashMap<(Atom, Atom), FunctionLikeMetadata>` keyed by `(declaring_class, method_name)`. During inheritance, Mago copies only lightweight `Atom` identifiers (interned string pointers). It never clones method metadata.
- One-pass population. `populate_codebase()` toposorts classes, then walks each one iteratively, merging trait/parent/interface members by copying `MethodIdentifier` atoms. After population the `CodebaseMetadata` is immutable. The analyzer just reads from it.
- O(1) method resolution. The analyzer resolves a method call with two hash lookups: get class metadata, get method by identifier. No inheritance walk, no merging, no virtual member application. Everything was pre-computed.
- String interning (`Atom`). All names are interned. Equality is a pointer comparison. HashMap lookups use identity hashing.
- Single-pass analysis. The analyzer walks the AST once, resolving types and emitting diagnostics inline. There is no separate "build scope cache" pass followed by diagnostic collectors. Each statement is analyzed as it is encountered.
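
The separated-storage idea can be sketched in a few lines of Rust. This is a simplified illustration, not Mago's actual code: `Atom` is approximated by `&'static str` instead of a real interner, and the `declaring_method_ids` field and the `declare` / `inherit` / `get_method` helpers are hypothetical names chosen for the example.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for an interned string. A real interner would hand out a
/// pointer-sized handle; `&'static str` keeps this sketch dependency-free.
type Atom = &'static str;

/// Method metadata, stored exactly once per declaration.
#[derive(Debug)]
struct FunctionLikeMetadata {
    return_type: &'static str,
}

/// Per-class data holds only identifiers, never method metadata.
#[derive(Default)]
struct ClassLikeMetadata {
    /// Names of all methods visible on the class, inherited ones included.
    methods: HashSet<Atom>,
    /// Method name -> class that declares it (hypothetical field name).
    declaring_method_ids: HashMap<Atom, Atom>,
}

#[derive(Default)]
struct CodebaseMetadata {
    class_likes: HashMap<Atom, ClassLikeMetadata>,
    function_likes: HashMap<(Atom, Atom), FunctionLikeMetadata>,
}

impl CodebaseMetadata {
    /// Declare `method` directly on `class`.
    fn declare(&mut self, class: Atom, method: Atom, return_type: &'static str) {
        let meta = self.class_likes.entry(class).or_default();
        meta.methods.insert(method);
        meta.declaring_method_ids.insert(method, class);
        self.function_likes
            .insert((class, method), FunctionLikeMetadata { return_type });
    }

    /// Copy identifiers from `parent` to `child`. No metadata is cloned.
    /// Assumes the parent is already populated (topological order).
    fn inherit(&mut self, child: Atom, parent: Atom) {
        let inherited: Vec<(Atom, Atom)> = self
            .class_likes
            .get(parent)
            .map(|p| p.declaring_method_ids.iter().map(|(m, c)| (*m, *c)).collect())
            .unwrap_or_default();
        let meta = self.class_likes.entry(child).or_default();
        for (method, declaring_class) in inherited {
            // A method already declared on the child overrides the parent's.
            if meta.methods.insert(method) {
                meta.declaring_method_ids.insert(method, declaring_class);
            }
        }
    }

    /// Two hash lookups: class metadata, then method by identifier.
    fn get_method(&self, class: Atom, method: Atom) -> Option<&FunctionLikeMetadata> {
        let declaring = self.class_likes.get(class)?.declaring_method_ids.get(method)?;
        self.function_likes.get(&(*declaring, method))
    }
}
```

After `declare("Model", "save", "bool")` and `inherit("User", "Model")`, a lookup of `save` on `User` resolves through the `("Model", "save")` key, so the metadata exists once no matter how many subclasses inherit it.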
Impact: High · Effort: High
Restructure PHPantom's class/method storage to match Mago's architecture. This is the single highest-impact change for analysis performance: it eliminates the per-expression resolution cost that dominates analysis time.
- Phase 1: MethodStore infrastructure. Added `Backend.method_store: MethodStore` (an `Arc<RwLock<HashMap<(String, String), Arc<MethodInfo>>>>`) populated alongside `fqn_index` in `update_ast_inner` and `parse_and_cache_content_versioned`. Eviction on re-parse via `evict_methods_for_fqns`.
- Phase 2: Centralised lookup helpers. Added `ClassInfo::get_method()`, `get_method_ci()`, `get_method_arc()`, and `has_method()`. Migrated all ~60 non-test call sites from `.methods.iter().find(|m| m.name == X)` / `.methods.iter().any(|m| m.name == X)` to these helpers. Future phases can swap the linear scan for an O(1) index without touching call sites again.
- Phase 3: Arc-wrapped methods. Changed `ClassInfo.methods` from `SharedVec<MethodInfo>` to `SharedVec<Arc<MethodInfo>>`. Inheritance merge now uses `Arc::clone` (refcount bump) when no generic substitution is needed, avoiding deep clones of the full `MethodInfo` struct. When substitution IS needed, `Arc::make_mut` gives copy-on-write semantics.
- Phase 3.5: Atom-based FQN and cache keys. Changed `ClassInfo::fqn()` to return `Atom` (interned string, Copy, pointer-sized equality) instead of allocating a new `String` on every call. Changed `ResolvedClassCacheKey` from `(String, Vec<String>)` to `(Atom, Vec<String>)`. This eliminates the FQN string allocation on every cache lookup (previously ~251K allocations per diagnostic pass on the `shared` project). Cache key construction for the common non-generic case is now allocation-free (just a Copy of the `Atom` + an empty `Vec::new()`).
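
The Phase 3 behaviour (refcount bump by default, copy-on-write only when substitution changes something) can be illustrated with a minimal sketch. The `MethodInfo` shape, the `merge_inherited` function, and the `(from, to)` substitution pair are assumptions made for the example; PHPantom's real types carry far more fields.

```rust
use std::sync::Arc;

#[derive(Clone, Debug, PartialEq)]
struct MethodInfo {
    name: String,
    return_type: String,
}

/// Merge parent methods into a child. `substitute` is a hypothetical
/// generic substitution, e.g. `Some(("T", "User"))` for `extends Repo<User>`.
fn merge_inherited(
    parent: &[Arc<MethodInfo>],
    substitute: Option<(&str, &str)>,
) -> Vec<Arc<MethodInfo>> {
    parent
        .iter()
        .map(|method| {
            // Refcount bump only; the MethodInfo itself is not copied.
            let mut shared = Arc::clone(method);
            if let Some((from, to)) = substitute {
                if shared.return_type == from {
                    // Copy-on-write: the parent still holds a reference,
                    // so make_mut clones the MethodInfo before mutating.
                    Arc::make_mut(&mut shared).return_type = to.to_string();
                }
            }
            shared
        })
        .collect()
}
```

Methods untouched by substitution remain pointer-identical to the parent's entries (`Arc::ptr_eq` holds), and the parent's own `MethodInfo` is never modified.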
Reference implementation: mago/crates/codex/src/
The goal is to make the resolved codebase metadata immutable after
population so that the diagnostic/analysis pass never calls
resolve_class_with_inheritance at all. Instead, it reads from a
pre-populated, flat metadata store using O(1) lookups.
Phase 4 is broken into sub-phases:
Added `method_index: AtomMap<u32>` and `indexed_method_count: u32`
to `ClassInfo`. The `get_method`, `get_method_arc`, and `has_method`
helpers use O(1) hash lookup when the index is valid, falling back
to linear scan when the class has been mutated after indexing
(detected via `indexed_method_count` mismatch). The index is rebuilt
once in `resolve_class_fully_inner` right before the resolved class
is cached, ensuring all cached classes have a valid index while
classes under construction safely use the linear-scan fallback.
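
A minimal sketch of the index-with-fallback scheme follows. It assumes a plain `HashMap<String, u32>` in place of `AtomMap<u32>` and a reduced `ClassInfo`; the `rebuild_method_index` name is hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug)]
struct MethodInfo {
    name: String,
}

#[derive(Default)]
struct ClassInfo {
    methods: Vec<MethodInfo>,
    /// Method name -> position in `methods`; trusted only while the
    /// recorded count still matches `methods.len()`.
    method_index: HashMap<String, u32>,
    indexed_method_count: u32,
}

impl ClassInfo {
    /// Rebuild the index; intended to run once, just before caching.
    fn rebuild_method_index(&mut self) {
        self.method_index.clear();
        for (i, m) in self.methods.iter().enumerate() {
            // First declaration wins, matching what a linear `find` returns.
            self.method_index.entry(m.name.clone()).or_insert(i as u32);
        }
        self.indexed_method_count = self.methods.len() as u32;
    }

    fn get_method(&self, name: &str) -> Option<&MethodInfo> {
        if self.indexed_method_count as usize == self.methods.len() {
            // Index is valid: one hash lookup plus one slice access.
            let idx = *self.method_index.get(name)?;
            self.methods.get(idx as usize)
        } else {
            // Class was mutated after indexing: fall back to a linear scan.
            self.methods.iter().find(|m| m.name == name)
        }
    }

    fn has_method(&self, name: &str) -> bool {
        self.get_method(name).is_some()
    }
}
```

A count comparison only detects insertions and removals, not an in-place rename, so this sketch relies on the same convention as the text: cached classes are not mutated after the index is built.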
Fixed a bug where build_diagnostic_scopes ran twice per file in
release builds: once explicitly in the analyze loop, and again
inside collect_slow_diagnostics. Added an early-return guard that
skips the walk when the scope cache is already populated. This cut
slow time from 9.4s to 0.6s on examples/demo.php.
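
The guard itself is a small pattern. The sketch below uses a hypothetical `ScopeCache` type with a `walks` counter added purely to make the effect observable; the real scope cache and walker are more involved.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct ScopeCache {
    /// Hypothetical cache: file URI -> scopes produced by the forward walk.
    scopes_by_file: HashMap<String, Vec<String>>,
    /// Number of times the expensive walk actually ran.
    walks: usize,
}

impl ScopeCache {
    /// Run the walk only when the cache has no entry for `uri`.
    fn build_diagnostic_scopes(&mut self, uri: &str, source: &str) -> &[String] {
        if !self.scopes_by_file.contains_key(uri) {
            self.walks += 1;
            // Stand-in for the real forward walk over the AST.
            let scopes = source.lines().map(|l| l.trim().to_string()).collect();
            self.scopes_by_file.insert(uri.to_string(), scopes);
        }
        &self.scopes_by_file[uri]
    }
}
```

Calling `build_diagnostic_scopes` twice for the same URI leaves `walks` at 1, which mirrors the explicit call in the analyze loop followed by the call inside `collect_slow_diagnostics`.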
Impact: Critical. Effort: Medium-High.
Fresh profiling (perf, release build, examples/demo.php) shows the dominant cost is ClassInfo cloning and allocation churn:
| Symbol | Self % | Notes |
|---|---|---|
| `_int_malloc` | 7.3% | heap allocation |
| `__memmove_avx_unaligned_erms` | 5.7% | memcpy from clones |
| `ClassInfo::clone` | 4.2% | deep-clone of ClassInfo |
| `_int_free_chunk` | 4.2% | heap deallocation |
| `cfree` | 3.6% | free() |
| `drop_in_place<ClassInfo>` | 3.3% | destructor |
| `core::fmt::write` | 3.0% | string formatting |
| `malloc_consolidate` | 2.8% | allocator bookkeeping |
| `Vec::clone` | 2.4% | vector cloning |
| `String::clone` | 2.2% | string cloning |
Combined: ~38% of total CPU in allocation, cloning, and dropping.
The root cause: `type_hint_to_classes_typed` returns
`Vec<ClassInfo>` (owned). Every caller that has an
`Arc<ClassInfo>` must deep-clone the struct (which contains
`SharedVec<Arc<MethodInfo>>`, `SharedVec<PropertyInfo>`, multiple
`Vec<Atom>`, `AtomMap` fields, etc.) just to pass it through
the resolution pipeline.
Plan:
- ✓ Changed `ResolvedType.class_info` from `Option<ClassInfo>` to `Option<Arc<ClassInfo>>`. This propagates Arc sharing through the entire variable resolution pipeline. Added `from_arc()` and `from_both_arc()` constructors for zero-clone creation.
- ✓ Changed `type_hint_to_classes_typed` to return `Vec<Arc<ClassInfo>>`. Callers that need mutation (generic substitution, scope injection) clone only when necessary via `Arc::make_mut` or `Arc::unwrap_or_clone`.
- ✓ `ResolvedType::into_arced_classes` now returns inner `Arc`s directly (zero-cost, no `Arc::new` wrapping). `into_classes` kept for the few callers needing owned `ClassInfo`.
- ✓ `resolve_rhs_method_call_inner` and `resolve_rhs_property_access` now use `Vec<Arc<ClassInfo>>` for owner classes, eliminating deep clones on every method/property access in the hot path. Remaining `Arc::unwrap_or_clone` sites (~40) are in cold paths (definition lookup, hover, narrowing) or require deeper signature changes.
- ✓ `find_class_by_name` callers in `resolver.rs` now use `Arc::clone` + `from_arc` (refcount bump) instead of `ClassInfo::clone` (deep copy). 11 call sites converted.
Expected impact: Eliminates ~10-15% of total CPU (clone + drop + associated malloc/free). Combined with the double-walk fix, should bring examples/demo.php from ~9.5s to ~5-6s.
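
The difference between the hot path (shared pointers) and the cold path (owned value on demand) can be sketched as follows. `ClassIndex`, `type_hint_to_classes`, and `into_owned` are hypothetical stand-ins for the example; `into_owned` spells out what `Arc::unwrap_or_clone` does, using `Arc::try_unwrap` so the sketch also builds on older toolchains.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Clone, Debug)]
struct ClassInfo {
    name: String,
    methods: Vec<String>,
}

struct ClassIndex {
    by_name: HashMap<String, Arc<ClassInfo>>,
}

impl ClassIndex {
    /// Hot path: hands out shared pointers; nothing is deep-cloned.
    /// The `A|B` union syntax handled here is a simplification.
    fn type_hint_to_classes(&self, hint: &str) -> Vec<Arc<ClassInfo>> {
        hint.split('|')
            .filter_map(|part| self.by_name.get(part.trim()))
            .map(Arc::clone)
            .collect()
    }
}

/// Cold path: take ownership, cloning only if the Arc is still shared.
fn into_owned(class: Arc<ClassInfo>) -> ClassInfo {
    Arc::try_unwrap(class).unwrap_or_else(|shared| (*shared).clone())
}
```

Every element returned by `type_hint_to_classes` is pointer-identical to the indexed entry, so the cost per resolved class is one atomic increment instead of a copy of every field.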
Impact: Medium. Effort: Low-Medium.
3% self-time in core::fmt::write + 1.4% in format_inner +
1.3% in Ustr::from (atom interning). These come from:
- `format!("{}\\{}", ns, name)` in `ClassInfo::fqn()`, called on every resolution. Already mitigated by returning `Atom`, but the initial intern still allocates.
- Type string construction during template substitution.
- `name.to_string()` calls throughout the resolution pipeline.
Plan: audit hot-path format!() calls and replace with
pre-computed or cached values where possible.
Measured after Phase 4a + double-walk fix:
| Phase | Wall time | Notes |
|---|---|---|
| Init + parse + eager populate | < 1s | negligible |
| `build_diagnostic_scopes` | 8.9s | forward walker |
| Fast diagnostics | 0.0s | |
| Slow diagnostics | 0.6s | (was 9.4s before double-walk fix) |
| Total | ~9.5s | target: < 3s |
Instrumentation data from the scope walk:
- 8,726 calls to `resolve_rhs_expression` (8.3s top-level time)
- 5,559 resolved-class cache hits, 396 misses
- Subject pipeline fallback: 1 call (negligible)
The 8.3s is dominated by allocation churn (ClassInfo cloning) inside the type resolution pipeline, not by resolution logic.
Impact: Medium-High. Effort: Medium.
Once Phase 4d (or earlier) makes the resolved codebase metadata immutable after population, the re-entrant resolution that currently requires thread-local recursion guards cannot occur. This phase removes those guards and the depth caps they protect:
- `RESOLVING` set and `MAX_RESOLVE_DEPTH` (30) in `virtual_members/mod.rs`. Thread-local set of class FQNs currently being resolved. When a class is already in the set, re-entrant calls return a partial result (base inheritance only, no virtual members). This produces non-deterministic results: whichever class in a mutual dependency (e.g. Model/Builder) is resolved first gets full virtual members, the other gets degraded. After eager population, all classes are resolved before any consumer queries them, so re-entry cannot happen.
- `RESOLVE_DEPTH` and `MAX_RESOLVE_TARGET_DEPTH` (60) in `completion/resolver.rs`. Thread-local depth counter for `resolve_target_classes_expr_inner`. Guards against mutual recursion between subject resolution, call-return resolution, and variable resolution. The limit of 60 (vs typical chain depth of 5-10) indicates the recursion is caused by accidental re-entry into class resolution, not by the problem size. Once class resolution is a cache lookup, this re-entry path vanishes.
- LSP server eager population. The `analyse.rs` CLI already runs `populate_from_sorted` before diagnostics. The LSP server's Tokio threads do not. Extend eager population to run on file change in the LSP server (incrementally, not full re-population) so that interactive requests also benefit from pre-resolved metadata.
How the reference projects avoid this problem:
- Mago: topologically sorts classes (`codex/src/populator/sorter.rs`) using a DFS with `visited` + `visiting` sets. Cycles are broken silently when `visiting.contains(&class_like)`. Each class is then populated exactly once by `populate_class_like_metadata_iterative` (non-recursive, assumes dependencies are done). No re-entrant resolution is possible.
- PHPStan: member lookup on `ClassReflection` delegates to `PhpClassReflectionExtension`, which calls PHP's native `ReflectionClass` (already resolved by the runtime). Each class has a single canonical instance via `MemoizingReflectionProvider` (keyed by lowercase name), so re-entrant lookups hit the same cached object. Explicit cycle guards exist only for specific features: `$resolvingTypeAliasImports` for `@type-import` cycles, `$inferClassConstructorPropertyTypesInProcess` for constructor property inference.
- Phpactor: three independent cycle-protection layers. (1) `ClassHierarchyResolver::doResolve()` passes a `$resolved` map by value; if a class name is already a key, recursion stops. (2) Every reflection object carries a `$visited` array through its constructor; `reflectClassLike()` throws `CycleDetected` on re-entry. (3) Direct self-reference guards in `parent()` and `ancestors()`.
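
The `visited` + `visiting` approach described for Mago can be sketched in Rust as below. This is an illustration of the technique under simplifying assumptions, not Mago's sorter: classes are plain string names, `deps` maps each class to its parents/interfaces/traits, and `sort_class_likes` is a hypothetical name. The DFS recurses only as deep as the dependency chain, which is bounded by inheritance depth rather than by expression nesting.

```rust
use std::collections::{HashMap, HashSet};

/// Order classes so every dependency comes before the class that uses it.
/// Back-edges are skipped, which breaks cycles silently.
fn sort_class_likes<'a>(deps: &HashMap<&'a str, Vec<&'a str>>) -> Vec<&'a str> {
    fn visit<'a>(
        class: &'a str,
        deps: &HashMap<&'a str, Vec<&'a str>>,
        visited: &mut HashSet<&'a str>,
        visiting: &mut HashSet<&'a str>,
        sorted: &mut Vec<&'a str>,
    ) {
        if visited.contains(class) || visiting.contains(class) {
            return; // already emitted, or a cycle: break it here
        }
        visiting.insert(class);
        if let Some(children) = deps.get(class) {
            for &dep in children {
                visit(dep, deps, visited, visiting, sorted);
            }
        }
        visiting.remove(class);
        visited.insert(class);
        sorted.push(class); // all dependencies are already in `sorted`
    }

    let mut roots: Vec<&str> = deps.keys().copied().collect();
    roots.sort_unstable(); // deterministic output regardless of hash order
    let (mut visited, mut visiting, mut sorted) =
        (HashSet::new(), HashSet::new(), Vec::new());
    for class in roots {
        visit(class, deps, &mut visited, &mut visiting, &mut sorted);
    }
    sorted
}
```

With a mutual dependency such as Model/Builder, each class still appears exactly once in the output, and which edge of the cycle is dropped depends only on the sorted root order, so the result is the same on every run.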
Success criteria:
- `RESOLVING`, `RESOLVE_DEPTH`, `MAX_RESOLVE_DEPTH`, and `MAX_RESOLVE_TARGET_DEPTH` are removed from the codebase.
- `mark_resolving` / `unmark_resolving` functions are removed.
- No test regressions.
- LSP server runs eager population incrementally on file change.
Impact: Medium. Effort: Low.
After Phase 4e eliminates re-entrant class resolution and P20
eliminates exponential forward-walker blowup, the 32 MB stack
threads in analyse.rs (index workers, eager population, diagnostic
workers) and the 16 MB stack threads in references/mod.rs (parallel
parsing) should no longer be necessary.
- Run the full test suite and `analyse` on the largest available project with default 8 MB stacks.
- Remove each `stack_size` call that no longer causes overflow.
- The reference parsing threads (16 MB) may still be needed for pathological PHP files with extreme nesting; verify separately.
Note: Mago's build.rs uses a 36 MB stack for prelude parsing, so
inflated stacks are not inherently wrong for one-off build tasks.
The problem is needing them for every runtime analysis thread.
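
For the verification step, a small helper makes it easy to run the same work with and without an explicit stack size. The sketch below uses only `std::thread::Builder`; `run_on_worker`, `nested_depth`, and the thread name are hypothetical. Threads spawned through `std::thread::Builder` without `stack_size` get the standard library's own default (currently 2 MiB on Tier-1 platforms unless `RUST_MIN_STACK` overrides it), so the sketch passes 8 MB explicitly when that is the size under test.

```rust
use std::thread;

/// 8 MB, the target stack size named in the checklist above.
const TARGET_STACK_BYTES: usize = 8 * 1024 * 1024;

/// Run `work` on a dedicated thread, optionally with an explicit stack size.
/// `None` leaves the choice to the standard library's default.
fn run_on_worker<T, F>(stack_bytes: Option<usize>, work: F) -> thread::Result<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let mut builder = thread::Builder::new().name("diagnostic-worker".to_string());
    if let Some(bytes) = stack_bytes {
        builder = builder.stack_size(bytes);
    }
    builder
        .spawn(work)
        .expect("spawning the worker thread failed")
        .join()
}

/// Hypothetical stand-in for a recursive resolution chain of a given depth.
fn nested_depth(remaining: u32) -> u32 {
    if remaining == 0 {
        0
    } else {
        1 + nested_depth(remaining - 1)
    }
}
```

A stack overflow aborts the whole process rather than surfacing as an `Err` from `join`, so a check like this belongs in a separate test binary or CI job rather than inside the server.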
Success criteria:
- All `stack_size` calls in `analyse.rs` removed.
- `references/mod.rs` `PARSE_STACK_SIZE` reduced to 8 MB or removed.
- No stack overflows on the full test suite or the largest available project.
- `phpantom_lsp analyze --project-root . examples/demo.php` completes in under 2 seconds (release build).
- Analysis time for examples/demo.php (~300 classes) approaches Mago's speed (sub-second).
- No test regressions in the existing test suite.
- Memory usage for the populated metadata is within 2x of the current `ast_map` + `class_index` combined size.
- The diagnostic/analysis pass never calls `resolve_class_with_inheritance`. All class metadata is pre-populated and immutable during analysis.
- All thread-local recursion guards (`RESOLVING`, `RESOLVE_DEPTH`) and depth caps (`MAX_RESOLVE_DEPTH`, `MAX_RESOLVE_TARGET_DEPTH`) are removed (Phase 4e).
- All inflated `stack_size` calls are removed or reduced to default (Phase 4f).