Implement interpolated strings via String.Concat#19971
A string-typed interpolated string is lowered to System.String.Concat of its parts rather than the reflection-based printf engine: a string-typed hole is passed through directly, any other plain hole is converted with `string x`, an aligned/formatted hole with `String.Format(InvariantCulture, ...)`, and a printf-specifier hole with `sprintf`. This removes the reflection dependency on the common path, so these interpolations become trim- and NativeAOT-compatible.

This generalizes and replaces the language-version-gated String.Concat optimization (dotnet#16556), which only handled all-string holes: the lowering now applies to every string-typed interpolation, ungated. The reflection path is used only for PrintfFormat/FormattableString-typed interpolation.

The syntax tree now carries each hole's formatting explicitly, so a printf specifier no longer leaks into an adjacent literal and alignment is no longer a fake tuple:

    type SynInterpolatedStringPart =
        | String of value: string * range: range
        | FillExpr of fillExpr: SynExpr * formatting: SynInterpolationFormatting

    type SynInterpolationFormatting =
        | DotNet of alignment: SynExpr option * format: Ident option
        | Printf of specifier: string * range: range

Behavioural change: plain `{x}` holes now render with invariant culture (the F# `string` operator) rather than the current thread culture, matching `string`.

Adds a NativeAOT regression test under tests/AheadOfTime/NativeAOT.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
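As a rough illustration of the lowering described above, the following hand-written F# spells out what each kind of hole might become. The bindings `name`, `x`, and `price` are made up for the example, and the shape of the expression is an assumption for illustration, not the compiler's actual output.

```fsharp
open System
open System.Globalization

// Hypothetical values used only for this sketch.
let name = "world"
let x = 42
let price = 3.14159

// Hand-written approximation of how
//   $"hello {name}, x = {x}, price = {price:F2}, padded = %5d{x}"
// could be lowered: every part becomes a string, then all parts are concatenated.
let parts: string[] =
    [| "hello "
       name                      // string-typed hole: passed through directly
       ", x = "
       string x                  // plain non-string hole: the 'string' operator
       ", price = "
       String.Format(CultureInfo.InvariantCulture, "{0:F2}", box price) // formatted hole
       ", padded = "
       sprintf "%5d" x |]        // printf-specifier hole: still goes through sprintf

let lowered = String.Concat(parts)
```

Only the last element depends on the printf engine; the others use calls that need no reflection, which is the property the commit message relies on.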
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
    // print $"answer = %d{x}"
    // print $"hello %s{name}"
%d and %s (unlike %.2f below) serve effectively only as a type annotation. Would it be difficult to identify this, strip the specifiers, and have these 2 lines also be lowered to concat?
I assume the list is https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/plaintext-formatting, minus %a and %t.
The instructions for clean interpolated string usage could be "avoid format strings (Printf-style specifiers); use dotnet specifiers instead".
This could be modified to add "except you can use this whitelist" by doing what you suggest. It makes the instruction more complex so I wouldn't recommend that.
It could also be modified to "avoid using the %A printf-style format specifier" by working systematically through the list.
If these printf-style usages are important then going the whole way would be best.
I believe that the current implementation of the feature already converts $"hello %s{name}" into String.Concat, so changing it so that it no longer does wouldn't be an improvement?
@Numpsy, it applies only if every single hole is a string.
@charlesroddie, I was wrong in my assumption, it's not always just a type annotation :/
> $"{true}";;
val it: string = "True"
> $"%b{true}";;
val it: string = "true"
Guess we could still do it by replicating the run-time logic of printf, but maybe leave it out of this PR.
> it applies only if every single hole is a string

Yes, my point was that the %s{name} case currently compiles into String.Concat and doesn't produce any AOT warnings, but the comment here says it will produce warnings.
I've updated %s so there is no regression. I'll investigate adding more format specifiers to the AOT-supported list.
OK the status is that we have the legacy printf in F#, and it seeps into interpolated strings via printf format specifiers. Printf format specifiers are actually more complex than the simple list above, with various modifications possible. There is code which type checks according to a format specifier (CheckFormatStrings.ParseFormatString). But there is no single place which contains a spec, no type which defines a valid format specifier in the compiler, and no function which generates code based on a valid format specifier. It's all internal to printf and even the specs would involve looking at the printf code to see the behaviour.
So I think this prevents making significant progress here.
I suggest creating a language suggestion for discussion of this, which would define an "F# string format specifier" which would have all the nice things above that are currently missing.
It's very feasible to implement all this cleanly but it's a medium piece of work and requires a name for the feature without the term "printf" in it and with corresponding changes to public docs. So not for this PR in my opinion.
> $"{true}";;
val it: string = "True"
> $"%b{true}";;
val it: string = "true"
I could add a change to make $"%b{true}" return True, in the spirit of matching `string` behaviour, as this PR already does for `string`'s culture-independence.
Rewrite TcInterpolatedStringViaConcat to type-check each interpolation
part in place and convert it to a string expression, then String.Concat
them. This removes the parallel 'holeIsString' bool list and the
flat-fillExprs/dense-parts interleave entirely: 'build' now walks a
single list (the parts), threading only tpenv.
- Plain '{x}' holes are built directly in the typed tree: a string is
passed through raw (matching dotnet#16556's lean IL), anything else is
converted via the 'string' operator, emitted through a new
string_operator_info intrinsic + mkCallStringOperator helper.
- Aligned/formatted and printf holes are checked from a small synthesized
String.Format/sprintf expression, so name resolution still does the BCL
work.
- The function-value warning is re-homed per-hole.
Known follow-ups: ill-typed formatted holes currently report their error
twice (the formatted arm type-checks the hole once for the warning and
again inside String.Format); the warning wants to move to its own pass
over hole types, which also removes that double check.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
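The single-list walk described in this commit can be pictured with a small runtime model. This is only a sketch: `Part`, `Formatting`, `convertPart`, and `render` are hypothetical, simplified stand-ins (the real code works on typed-tree expressions, not boxed values), and printf holes are left out.

```fsharp
open System
open System.Globalization

// Hypothetical, simplified stand-ins for the compiler's part and formatting types.
type Formatting =
    | Plain
    | DotNet of alignment: int option * format: string option

type Part =
    | Text of string
    | Hole of value: obj * formatting: Formatting

// Convert one part to a string, following the per-hole rules listed above.
let convertPart (part: Part) : string =
    match part with
    | Text text -> text
    | Hole ((:? string as s), Plain) -> s      // string hole: passed through raw
    | Hole (value, Plain) -> string value      // anything else: the 'string' operator
    | Hole (value, DotNet (alignment, format)) ->
        // Build a composite format item such as "{0,6:F2}".
        let a = match alignment with Some n -> "," + string n | None -> ""
        let f = match format with Some spec -> ":" + spec | None -> ""
        String.Format(CultureInfo.InvariantCulture, "{0" + a + f + "}", value)

// Walk the single list of parts and concatenate the converted strings.
let render (parts: Part list) : string =
    parts |> List.map convertPart |> String.concat ""

let example =
    render
        [ Text "x = "
          Hole (box 42, Plain)
          Text ", price = "
          Hole (box 3.14159, DotNet (Some 6, Some "F2")) ]
```

Because each part is converted where it is encountered, no separate list of "is this hole a string" flags has to be kept in step with the parts.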
A bare '%s' on a string hole now lowers to a String.Concat passthrough
(reflection-free, AOT-clean), like a plain '{string}' hole, instead of
routing through sprintf. This fixes the regression where '$"%s{name}"'
went through the printf engine. Other specifiers (and '%5s' etc.) still
format via sprintf, and the '%s' string constraint is still enforced.
convertHole now returns a (string expression, may-be-null) pair: only a
raw string passthrough can be null, so a lone such arg coalesces via the
'string' operator (e.g. $"%s{null}" and $"{(s: string)}" -> ""), while
multi-arg cases rely on String.Concat mapping null to "".
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
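The two null cases mentioned in this commit rest on ordinary library behaviour, which can be checked in isolation. A minimal sketch, with `s`, `lone`, and `multi` as names invented for the example:

```fsharp
open System

// A string-typed value that happens to be null.
let s: string = null

// Lone hole: the 'string' operator coalesces null to the empty string.
let lone = string s

// Several parts: String.Concat treats a null argument as the empty string.
let multi = String.Concat("a", s, "b")
```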
The formatted/aligned hole path lowered via a synthesized String.Format that re-type-checked the hole expression a second time, which could duplicate an error in the hole and leak a spurious 'Format' overload-resolution error. Bind the already-checked, boxed hole value to a temporary and reference that from the synthesized String.Format instead.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
    // Format the already-checked hole via a synthesized 'String.Format', binding its boxed value
    // to a temporary so the hole is not type-checked a second time. Re-checking 'synFill' would
    // duplicate any error in it; boxing to 'obj' keeps the 'Format' overload unambiguous (so a
    // hole that already failed to check doesn't also leak a confusing 'Format' overload error).
    let boxedFill = mkCallBox g m fillTy fill
    let tmpVal, _ = mkLocal mSynth "interpHole" (tyOfExpr g boxedFill)
    let envInner = AddLocalVal g cenv.tcSink mSynth tmpVal env
    let tmpRef = SynExpr.Ident(mkSynId mSynth tmpVal.LogicalName)
This code was added in e83ddfb. It replaces the previous shortcut, which had two minor issues: it type-checked the hole twice and sometimes gave duplicate error messages. Worth considering whether the benefits of the additional code are worth the cost.
These printf specifiers act only as a type annotation - their value renders the
same through the 'string' operator as through the specifier - so constrain the
hole to the specifier's type and render it with 'string', as for a plain '{x}'
hole, instead of routing through the reflection-based 'sprintf'. This makes them
NativeAOT compatible. '%u' is excluded as it reinterprets a signed value as
unsigned and so does not match 'string'.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
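Whether a specifier acts only as a type annotation can be checked by comparing `sprintf` with the `string` operator on the same value. The sketch below picks `%d`, `%b`, and `%u` as examples; which specifiers this commit actually covers is not shown here, so the selection is an assumption.

```fsharp
// '%d' on an int renders the same text as 'string', so it only constrains the type.
let sameForInt = (sprintf "%d" 42 = string 42)

// '%b' differs in casing ("true" versus "True"), as shown in the review thread.
let sameForBool = (sprintf "%b" true = string true)

// '%u' reinterprets a signed value as unsigned, so it is expected to differ
// from 'string' for negative input.
let sameForUnsigned = (sprintf "%u" (-1) = string (-1))
```

A specifier qualifies for the `string`-based lowering only when this comparison holds for every value of the constrained type, which is why `%u` is left on the `sprintf` path.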
This PR implements fsharp/fslang-suggestions#1108 (comment), so that interpolated strings, where possible, are implemented via `String.Concat`. The benefits are:
`DefaultInterpolatedStringHandler` fsharp/fslang-suggestions#1108 (comment), with perf measurements)
There are some problems with the current implementation, and this PR takes a moderate approach, resolving some of them but retaining others to keep a broadly backwards-compatible implementation.
`string` or `PrintfFormat<...>`. The latter is an unfortunate addition designed to allow expressions like `printf $"..."`, working around the lack of `print` functions (fsharp/fslang-suggestions#1092). This should be marked obsolete when `print` functions are added.
a) The parser handled these only partially: it left the specifier (e.g. `%f`) inside the preceding string component, leaving the compiler step to locate and process it. In this PR the specifier is split off and attached to its hole in a parser helper instead. So it's improved but still messy.
b) Any hole with these expressions goes through the previous reflection-based route.
`string` are culture-independent. This is adjusted to match existing behaviour, using the `string` function. This is in keeping with similar changes that have moved towards `string` behaviour. (See the related discussion of `string` vs `ToString` behaviour in fsharp/fslang-suggestions#919.)
A NativeAOT test
A test under `tests/AheadOfTime/NativeAOT` (wired into the Windows trimming CI job) AOT-publishes a program that uses interpolation. Plain and .NET-format holes (`{x}`, `{x:F2}`, `{x,6}`) are AOT compatible. Printf-format holes (`%d{x}`) still route through `sprintf`, so they remain reflection-based and fail AOT with IL2026/IL2070/IL3050; see the commented-out examples in the test.
This would be a good place to add other currently AOT-incompatible expressions for similar future fixes.
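For orientation, a program of the kind such a test might publish could look like the sketch below. The actual test source is not shown on this page, so the bindings and layout are assumptions; only the three hole forms and the commented-out printf hole come from the description and the review thread.

```fsharp
open System

// Hypothetical values used only for this sketch.
let x = 42
let price = 3.14159

// Plain and .NET-format holes: described above as AOT compatible.
let plain: string = $"{x}"
let formatted: string = $"{price:F2}"
let aligned: string = $"{x,6}"

// A printf-format hole that does real formatting stays on the sprintf path,
// so it is left commented out here, as in the test described above.
// let viaPrintf: string = $"%.2f{price}"

Console.WriteLine(String.Concat(plain, " ", formatted, aligned))
```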
Notes on the LowerInterpolatedStringToConcat feature flag
This work extends and supersedes the work under the LowerInterpolatedStringToConcat flag, an "optimization that lowers string interpolation into a call to concat iff there are at most 4 string parts and all fill expressions are strings".
It is unclear why this feature was gated, since it is just an optimization, and it is equally unclear what to do with the gate here. Options:
`string` in culture-independence) is a fix to existing behaviour.