feat(intrinsics): shim old adapters onto Adapter dataclass; call_intrinsic rewritten (Epic #929 Phase 1)#1269
Conversation
770181a to
9292fee
Compare
…all_intrinsic (Epic generative-computing#929 Phase 1) Old classes (IntrinsicAdapter, EmbeddedIntrinsicAdapter, CustomIntrinsicAdapter) now inherit from the new Adapter dataclass introduced in Phase 0 (PR generative-computing#1158). Each emits DeprecationWarning on construction and exposes an Identity triple, making isinstance(x, Adapter) return True alongside their existing type checks. call_intrinsic is rewritten to go through backend.resolve_adapter() and backend.adapter_scope() (no-op stub in Phase 1), removing the direct IntrinsicAdapter / EmbeddedIntrinsicAdapter branching from stdlib code. Requirement rerouting in OpenAIBackend and LocalHFBackend is updated to use Identity.capability for adapter lookup instead of isinstance + AdapterType checks. This removes the last internal callers that branched on the old class hierarchy and unblocks Phase 3 (issues generative-computing#1137, generative-computing#1138, generative-computing#1139, generative-computing#1140, generative-computing#1141). Closes generative-computing#1136 Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Fix test_huggingface_unit.py: remove dead patch of get_adapter_for_intrinsic (removed from huggingface.py imports in parent commit) and pre-populate _added_adapters with a proper Identity so the new capability-based lookup finds the adapter. Add two unit tests to test_shims.py covering the find-existing and no-model-id error paths of resolve_adapter. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Docstring quality gate (--fail-on-quality) requires the Returns: prefix to match the return type annotation exactly. Annotation is _AdapterCore; docstring incorrectly said Adapter. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Add AdapterMixin._find_adapter(capability, adapter_types) to centralise the capability-based lookup that was repeated inline four times across huggingface.py and openai.py. resolve_adapter's duplicate search loops also collapse to single _find_adapter calls. Clarify IntrinsicAdapter and EmbeddedIntrinsicAdapter docstrings: replace the bare identity/io_contract/weights Attributes entries with a .. note:: explaining these are Phase 1 internal scaffolding not meant for consumer use. Update test fixtures to wire _find_adapter through the real AdapterMixin implementation, and fix a misleading comment in test_huggingface_unit.py. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
The inline next(...) example was superseded by the _find_adapter helper extracted in the previous commit. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
4d9ef93 to
7804ec3
Compare
…pter ordering EmbeddedIntrinsicAdapter previously emitted DeprecationWarning before validating the technology argument, so an invalid call (e.g. technology='qlora') would both warn and raise. Validation now runs first. Add a Note: section to _find_adapter documenting that insertion-order is used and the Phase 0 preference-ordering semantics are not preserved; Phase 2 (issue generative-computing#1138) will revisit if multiple entries per capability become possible. Test updated: test_embedded_invalid_technology now asserts that no DeprecationWarning fires on an invalid-technology construction attempt. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
jakelorocco
left a comment
There was a problem hiding this comment.
Please ensure all intrinsic tests pass before merging this PR as well.
216 unit/integration intrinsic tests pass locally and on GPU. The five tests flagged below are pre-existing flaky tests on Ran each of the five suspicious tests 10× on
No regressions. The two tests with near-zero pass rates on main ( The pass-rate improvements for |
…cope from call_intrinsic _find_adapter previously accepted set[str] and returned the first hit in _added_adapters insertion order, ignoring any caller-specified type preference. Change the signature to tuple[str, ...] and iterate preferred types first so aLoRA is returned before LoRA when both are registered for the same capability, matching Phase 0's get_adapter_for_intrinsic behaviour. The HF backend was also discarding the preference order at the call site by building a set from action.adapter_types before passing to _find_adapter. Both call sites in huggingface.py now pass a tuple so the preference flows end-to-end from call_intrinsic through to adapter selection. Remove the adapter_scope wrapper from call_intrinsic. The HF backend already handles adapter activation inside _generate_with_adapter_lock under the generation lock, immediately before generate_func is called. Activating at the call_intrinsic level would be outside that lock and could race with concurrent async requests. Add a comment explaining this so the pattern is clear for Phase 2 implementors. Add test_find_adapter_honours_type_preference_order to verify aLoRA wins over LoRA regardless of insertion order — the previous set-based implementation would have passed all prior tests since none registered both types for the same capability. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
|
@jakelorocco Above is a summary of running multiple tests on lsf/cuda both from upstream/main and this branch. The code changes here do not seem to cause a regression, so arguably could be merged (which would free up other changes) However the fact the baseline is failing tests is a concern so I will link independently at the two issues mentioned above (particularly the failures on cuda) |
…d_adapter Missed in the initial fix — openai.py had the same set[str] call sites as huggingface.py at lines 486 and 571. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
|
@jakelorocco I believe this is now clean. A full run did show up some intrinsics failures. some captured by existing issues (including xfails). But also anbother batch which I’ve just fixed in #1292 I’m ok to merge in either order, but logically getting 1292 in first, rebasing this, then retesting would the safest - happy to do that once 1292 is in. |
…mple
The "After (Phase 1)" snippet showed {"alora"} (set) but _find_adapter
takes tuple[str, ...] for preference-ordered iteration. A set has no
guaranteed order, defeating the aLoRA-before-LoRA preference the method
provides.
Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…ment resolve_adapter contracts _ShimWeightsBinding methods were silent no-ops while _ShimIOContract already raised NotImplementedError. The asymmetry means Phase 2 lifecycle code accidentally calling WeightsBinding.activate/deactivate on an un-migrated shim would silently succeed rather than surface the gap. Raise consistently. Also adds three inline comments to resolve_adapter: - warns that warnings.catch_warnings() is not async/thread-safe (Phase 2 gap) - notes the embedded branch is only valid for backends whose add_adapter accepts the full Adapter type (not LocalHFBackend) - documents that the AdapterType.LORA default is intentional (mirrors the pre-Phase-1 _util.py default) and is Phase 2 scope to improve Adds test_resolve_adapter_lazy_creates_and_returns covering the previously untested lazy-creation branch of resolve_adapter. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
|
added some review fixes. As per above, will re-run full intrinsics tests again after 1292 merges |
…tes_and_returns The test patched fetch_intrinsic_metadata but resolve_adapter calls IntrinsicAdapter without a config_dict, causing __init__ to fall through to intrinsics.obtain_io_yaml() and make a real HF Hub request. Mock obtain_io_yaml and builtins.open to keep this unit test fully offline, consistent with the hf_skip() pattern from PR generative-computing#1199. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
|
@markstur @AngeloDanducci @akihikokuroda — this PR has been open for 4 days. I've just pushed a fix for the CI failure (429 flake in |
|
Summary from Bob. PR #1269 Review — Epic #929 Phase 1 This PR correctly executes Phase 1 of Epic #929. The three legacy adapter classes become deprecation shims that satisfy isinstance(x, Adapter) against the new _core.Adapter dataclass while preserving all existing behaviour. The call_intrinsic rewrite is a clean ~50-line reduction. The aLoRA preference-ordering fix in _find_adapter is a genuine correctness improvement, independently validated by the GPU campaign. All CI checks pass. Concerns (non-blocking)
The comment on lines 431–433 explains the LORA default but doesn't link to the tracking issue. A caller that separately registers an aLoRA adapter and then calls call_intrinsic will always get the LORA one, since lazy creation always produces AdapterType.LORA. Suggest adding: # TODO(#1138): prefer aLoRA when catalog lists it
Lines 408–413 correctly flag the async/thread-safety gap but don't name the concrete failure mode: under asyncio, two concurrent first-time call_intrinsic calls could interleave their catch_warnings() contexts, causing a DeprecationWarning to leak to user code during concurrent lazy-registration. Low-severity (cosmetic), but worth naming explicitly before Phase 2 work begins.
Lines 1–9 still describe IntrinsicAdapter as a primary concrete subclass and get_adapter_for_intrinsic as a primary API — both are now legacy/deprecated. The primary new surface (resolve_adapter, _find_adapter) isn't mentioned. Easy fix to include before merge. Nits _ShimIOContract includes "Phase 2 (issue #1137)" in its NotImplementedError messages. _ShimWeightsBinding (lines 99–108) just says "WeightsBinding not yet implemented" — no phase or issue reference. Inconsistent; a Phase 2 contributor hitting that error gets no navigation hint. # Suggested:
raise NotImplementedError("Phase 2 (issue #1138) — WeightsBinding not yet implemented")
IntrinsicAdapter.init calls _AdapterCore.init(self, ...) directly at the end of its body. Python only invokes post_init automatically from the dataclass-generated init — a direct call skips it. Acceptable Phase 1 trade-off, but worth a short comment warning future maintainers. Praise Frozen-dataclass bypass is clean. object.setattr/object.delattr overrides on the shim classes are the right approach — they don't affect the production _core.Adapter dataclass and are well-documented in the class docstrings. The mutability tests verify it. call_intrinsic rewrite is minimal and architecturally correct. The key invariant — adapter activation must stay inside the backend's generation lock — is preserved and the comment explains the race condition that motivates it. Test coverage is thorough. 18 new unit tests covering all three shim classes, isinstance conformance, Identity correctness, legacy-attribute preservation, frozen-bypass, AdapterMixin interface, and the ordering fix. GPU campaign results are transparently reported with pre-existing failure attribution properly tracked in #1286, #1291, #1292. |
markstur
left a comment
There was a problem hiding this comment.
Some concerns, nits, PRAISE in a comment... from Bob.
See inline for mine, but basically just a docstring to note deprecation, and then there's that async problem that is already well-commented. Worth considering fixing those now, but I understand there is follow-up coming soon so... Approve
| # add_adapter is idempotent so the double-registration hazard is benign, but the | ||
| # filter race is a known Phase-1 gap. Phase 2 (#1138) introduces a proper lock. | ||
| # Suppress DeprecationWarning: the shim constructors warn user-facing code, | ||
| # not internal registration paths. |
There was a problem hiding this comment.
the race is commented above and understood.
Fyi -- this is the AI suggestion:
Suggestion: Even in Phase 1, consider adding a lock:
Add at class level:
_resolve_lock: asyncio.Lock = field(default_factory=asyncio.Lock, init=False)
# In resolve_adapter():
async with self._resolve_lock:
with warnings.catch_warnings():
warnings.simplefilter("ignore", DeprecationWarning)
# ... existing logicDecision to defer this is intentional.
| stub_backend._generate_with_adapter_lock = ( | ||
| lambda adapter_name, generate_func, *args: generate_func(*args) | ||
| ) | ||
| stub_backend._find_adapter = lambda cap, types=None: AdapterMixin._find_adapter( |
There was a problem hiding this comment.
nit: add_adapter and resolve_adapter should be stubbed to make the test helper robust and prevent confusing failures for future test authors.
pre-populating _added_adapters is either a workaround or author's preference.
| """Adapter classes for adding fine-tuned modules to inference backends. | ||
|
|
||
| Defines the abstract ``Adapter`` base class and its concrete subclasses | ||
| ``LocalHFAdapter`` (for locally loaded Hugging Face models) and ``IntrinsicAdapter`` |
There was a problem hiding this comment.
stale docstring should mention deprecation
Why
If you use
IntrinsicAdapter,EmbeddedIntrinsicAdapter, orCustomIntrinsicAdapterdirectly, you will start seeingDeprecationWarningafter this PR merges. These classes are being retired as part of Epic #929 — the newAdapterdataclass (introduced in Phase 0, PR #1158) is the replacement. Phase 1 (this PR) bridges the gap: the old classes still work exactly as before, but each one now inherits fromAdapterand warns on construction so you know to migrate.Most users reach adapter functions through the high-level wrappers in
mellea.stdlib.components.intrinsic(e.g.check_certainty,check_answerability). Those callers are unaffected — they go throughcall_intrinsic, which was rewritten to use the newAdapterMixin.resolve_adapterAPI and no longer touches the deprecated classes directly.What changed
(a) Legacy adapter classes become deprecation shims —
IntrinsicAdapter,EmbeddedIntrinsicAdapter, andCustomIntrinsicAdapternow:DeprecationWarning(stacklevel=2)on construction (pointing to the caller's frame, not library internals)Adapterdataclass, soisinstance(x, Adapter)returnsTruePrivate
_ShimIOContractand_ShimWeightsBindingstubs satisfy the newAdapterprotocol in Phase 1; the real implementations come in Phase 2 (issues #1137, #1138).(b) Internal callers switch to the new API —
call_intrinsicis rewritten to usebackend.resolve_adapter(name)(no-op stub in Phase 1), removing ~50 lines of manual adapter-loading logic.AdapterMixingains three new methods:resolve_adapter,adapter_scope(context manager), and_find_adapter(private helper that centralises the capability-based lookup previously duplicated inline acrossLocalHFBackendandOpenAIBackend). Adapter activation remains inside the backend's generation lock, not at thecall_intrinsiclevel — this avoids a race condition when Phase 2 implements real activation.User impact
check_answerability,check_certainty, etc.IntrinsicAdapterdirectlyDeprecationWarningon construction — code still worksEmbeddedIntrinsicAdapterdirectlyDeprecationWarningon construction — code still worksCustomIntrinsicAdapterDeprecationWarningon construction — code still worksisinstance(x, IntrinsicAdapter)/isinstance(x, EmbeddedIntrinsicAdapter)_find_adapternow acceptstuple[str, ...]and iterates preferred types first; the HF and OpenAI backends passtuple(notset) so aLoRA is always preferred over LoRA when both are registeredTo silence the warnings now, filter
DeprecationWarningfrommellea.backends.adapters. Migration guide to the newAdapterdataclass will land with Phase 4 (issue #1144).Where this fits in Epic #929
What changed (files)
mellea/backends/adapters/adapter.py_ShimIOContract/_ShimWeightsBinding;AdapterMixin.resolve_adapter,adapter_scope,_find_adaptermellea/stdlib/components/intrinsic/_util.pycall_intrinsicrewritten;adapter_scoperemoved (activation is backend's responsibility)mellea/backends/openai.py_find_adapter;tuplecall sitesmellea/backends/huggingface.py_find_adapter;tuplecall sitesdocs/dev/requirement_aLoRA_rerouting.mdtest/backends/test_adapters/test_shims.py_find_adapterordering testtest/backends/test_adapters/test_embedded_adapter.pyTesting
GPU test campaign
216 unit/integration intrinsic tests pass. Five GPU-only tests were investigated with a 10-run campaign on the same GPU cluster, comparing
upstream/main(317d5d9) against this branch (f6fb2e7):test_run_transformers[context_relevance_alora]test_run_transformers[uncertainty_alora]test_run_transformers[requirement_check_alora]test_run_transformers[context-attribution]test_groundedness_e2e_string_documentsNo regressions from this PR. The pass-rate improvements for
context-attributionandgroundednessare a side-effect of the_find_adapterordering fix — aLoRA is now selected consistently, producing better citation quality when it works.The low pass rates on both branches are pre-existing issues, tracked and fixed in #1292:
context_relevance_alora/uncertainty_alora(0/10): peft aLoRA offsets disabled due toinputs=vsinput_ids=call — see test: context_relevance_alora and uncertainty_alora fail on CPU — peft cannot compute aLoRA offsets without input_ids #1286requirement_check_alora/context-attribution/groundedness_e2e: score non-determinism and parser bug — see test: GPU score non-determinism causes flaky failures in context-attribution, requirement_check_alora, groundedness_e2e #1291Attribution
Fixes #1136