Summary
The megaparsec-based UPLC textual parser in plutus-core 1.45.0.0 fails to parse a syntactically valid (case (constr 0 b1..b8) (lam ...)) expression when the single case branch has a body of approximately 490 lines. The reported error position is misleading: it points to the start of the outer (lam, apparently because of megaparsec backtracking.
Reproduction
File: the HTLC scenario from Unisay/scalus-cape-submissions, commit 0846510184d98b054c4405c4d036154444467d9f, file src/htlc/htlc.uplc.
- Size: 51 700 bytes
- SHA-256: abd5609f2da11b37cdb71d8314b34d94012bee1f3b2a3b43793a715228818493
- Program header: (program 1.1.0 ...)
Trigger structure (outer program body, lines 1–5 of the file):
(program 1.1.0 (case (constr 0
(force (force (builtin chooseList)))
(force (force (builtin fstPair))) (force (builtin headList))
(force (builtin ifThenElse)) (force (builtin mkCons))
(force (force (builtin sndPair))) (force (builtin tailList))
(force (builtin trace))) (lam __builtin_ChooseList
(lam __builtin_FstPair
...~490 lines of body...
))))
How to reproduce:
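{-# LANGUAGE ImportQualifiedPost #-}
import UntypedPlutusCore.Parser (parseProgram)
import Data.Text.IO qualified as T

main :: IO ()
main = do
  -- Read the UPLC source as Text and run the textual parser on it.
  src <- T.readFile "htlc.uplc"
  case parseProgram src of
    Left err -> print err       -- prints the misleading 5:32 error
    Right _  -> putStrLn "OK"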
Error:
test:5:32:
5 | (force (builtin trace))) (lam __builtin_ChooseList
  |                          ^
unexpected '('
expecting ')'
What was ruled out
- Paren balance: 437 opens = 437 closes. File is well-formed.
- case/constr support: parser version 1.45.0.0 does support case/constr (language version 1.1.0). The fibonacci Scalus 0.16.0 UPLC file in the same repo, which also uses (force (case (constr 0 ...) ...)) but with a small body, parses and verifies fine.
- Constants: removing the (con string …) and (con data …) constants does not change the failure.
- Structure in isolation: a minimal (case (constr 0 b1..b8) (lam x1 (lam x2 ... (lam x8 small-body)))) parses fine.
- Hidden characters / encoding: no BOM, no CRLF, plain UTF-8.
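The paren-balance check above can be reproduced with a few lines of Haskell. This is a minimal sketch, not the script that was actually used: parenCounts and balanced are hypothetical names, and the code assumes that string constants are double-quoted with backslash escapes, so that parentheses inside them are not counted.

```haskell
-- | Count '(' and ')' that occur outside double-quoted string literals.
-- Assumption: string constants use "..." with backslash escapes.
parenCounts :: String -> (Int, Int)
parenCounts = go False (0, 0)
  where
    go _ acc [] = acc
    -- Inside a string: skip an escaped character (e.g. \" or \\).
    go True acc ('\\' : _ : cs) = go True acc cs
    -- A quote toggles the "inside string" flag.
    go inStr acc ('"' : cs) = go (not inStr) acc cs
    -- Any other character inside a string is ignored.
    go True acc (_ : cs) = go True acc cs
    go False (o, c) ('(' : cs) = go False (o + 1, c) cs
    go False (o, c) (')' : cs) = go False (o, c + 1) cs
    go False acc (_ : cs) = go False acc cs

-- | True when the number of opens equals the number of closes.
balanced :: String -> Bool
balanced s = let (o, c) = parenCounts s in o == c
```

Applied to the contents of htlc.uplc (for example via readFile), parenCounts should return the same pair of counts reported above if the assumptions about string syntax hold.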
Root cause hypothesis
caseTerm in UntypedPlutusCore/Parser.hs uses many term for branches. When term starts parsing the (lam __builtin_ChooseList ...) branch and then fails somewhere deep inside the ~490-line body, megaparsec's many backtracking causes the error to be reported at the outermost ( of the failed branch (line 5:32) rather than at the actual failure site. The actual parser error is lost and the reported position is misleading.
The underlying issue may be one of:
- A missing try combinator on one of the sub-parsers inside term, causing megaparsec to incorrectly drop progress and backtrack to the many term boundary.
- An unrecoverable error inside the branch body: the parser cannot finish parsing (lam __builtin_ChooseList BODY) because of some construct inside BODY, but the reported location is wrong.
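To illustrate how a deep failure can surface at the start of the enclosing branch, here is a toy model of the interaction between many and try. It is a sketch using only base, not megaparsec and not the plutus-core parser: Failure, satisfy, try', many', afterInput and term are all hypothetical names, and whether the real term parser wraps its alternatives in try is an assumption. In this model, many' stops silently when its item fails without consuming input, and try' makes every failure look non-consuming, so the deep error is discarded and the closing-paren check then fails at the start of the branch.

```haskell
import Data.Char (isAlpha)

-- | A failure: whether input was consumed first, where it happened, and why.
data Failure = Failure { consumed :: Bool, offset :: Int, message :: String }
  deriving (Eq, Show)

-- | A parser takes the whole input and a start offset.
type Parser a = String -> Int -> Either Failure (a, Int)

-- | Match one character; fails without consuming input.
satisfy :: String -> (Char -> Bool) -> Parser Char
satisfy what ok input i = case drop i input of
  (c : _) | ok c -> Right (c, i + 1)
  (c : _) -> Left (Failure False i ("unexpected " ++ show c ++ ", expecting " ++ what))
  [] -> Left (Failure False i ("unexpected end of input, expecting " ++ what))

-- | Modelled on 'try': a failure is reported as non-consuming.
try' :: Parser a -> Parser a
try' p input i = case p input i of
  Left f -> Left f { consumed = False }
  r -> r

-- | Modelled on 'many': stop silently on a non-consuming failure,
-- propagate a consuming one.
many' :: Parser a -> Parser [a]
many' p input i = case p input i of
  Right (x, j) -> do
    (xs, k) <- many' p input j
    Right (x : xs, k)
  Left f
    | consumed f -> Left f
    | otherwise -> Right ([], i)

-- | Mark failures of a parser that runs after some input was already consumed.
afterInput :: Parser a -> Parser a
afterInput p input i = case p input i of
  Left f -> Left f { consumed = True }
  r -> r

-- | term ::= letter | '(' term* ')'. Returns the number of letters parsed.
-- The 'wrap' argument is applied to every sub-term (either 'id' or 'try'').
term :: (Parser Int -> Parser Int) -> Parser Int
term wrap input i =
  case satisfy "letter or '('" (\c -> isAlpha c || c == '(') input i of
    Left f -> Left f
    Right ('(', j) -> do
      (ns, k) <- afterInput (many' (wrap (term wrap))) input j
      (_, l) <- afterInput (satisfy "')'" (== ')')) input k
      Right (sum ns, l)
    Right (_, j) -> Right (1, j)
```

On the input "(a(b(c!)))", where the only invalid character is the '!' at offset 6, term id reports the failure at offset 6, while term try' reports "unexpected '(', expecting ')'" at offset 2, the opening paren of the outermost failed branch. That is the same shape as the 5:32 error above, which suggests the reported position alone cannot locate the real failure site.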
Impact
UPLC files generated by Scalus 0.16.0's toUplcOptimized() for contracts with 8+ forced builtins and a large body (e.g. HTLC, other Cardano V3 validators) can fail to parse under plutus-core ^>=1.45. The workaround is to use toUplc() (no CaseConstrApply optimization), which produces larger but parseable UPLC.
This blocks UPLC-CAPE benchmark submissions for complex Scalus 0.16.0 contracts.
Environment
- plutus-core: 1.45.0.0
- Parser: UntypedPlutusCore.Parser.parseProgram (megaparsec-based)
- UPLC language version: 1.1.0 (Plutus V3 / Conway)
- Confirmed via cabal repl in UPLC-CAPE repo