Implement interpolated strings via String.Concat #19971

Open
charlesroddie wants to merge 9 commits into dotnet:main from charlesroddie:InterpolatedStringCollector
@charlesroddie charlesroddie commented Jun 18, 2026

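This PR implements fsharp/fslang-suggestions#1108 (comment), so that interpolated strings, where possible, are implemented via String.Concat. The benefits include removing the reflection dependency on the common path, which makes these interpolations trim- and NativeAOT-compatible.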

There are some problems with the current implementation, and this PR takes a moderate approach, resolving some of them but retaining others to keep a broadly backwards-compatible implementation.

  1. The type of an interpolated string is either string or PrintfFormat<...>. The latter is an unfortunate addition designed to allow expressions like printf $"...", working around the lack of print functions (fsharp/fslang-suggestions#1092). This should be marked obsolete when print functions are added.
  2. Printf format expressions are allowed in holes. This creates two problems:
    a) The parser handled these only partially: it left the specifier (e.g. %f) inside the preceding string component, leaving the compiler step to locate and process it. In this PR the specifier is split off and attached to its hole in a parser helper instead. So it's improved but still messy.
    b) Any hole with these expressions goes through the previous reflection-based route.
  3. The rendering of strings was culture-dependent. This was likely unintentional, since F# string functions like string are culture-independent. This PR changes the rendering to match the existing behaviour of the string function. This is in keeping with similar changes that have moved towards string behaviour. (See the related discussion of string vs ToString behaviour in fsharp/fslang-suggestions#919.)
  4. The previous syntax tree was too permissive: it could express a hole formatted by both a printf specifier and a .NET alignment/format, which the language does not allow. This is adjusted to the following:
    type SynInterpolatedStringPart =
        | String of value: string * range: range
        | FillExpr of fillExpr: SynExpr * formatting: SynInterpolationFormatting

    type SynInterpolationFormatting =
        | DotNet of alignment: SynExpr option * format: Ident option
        | Printf of specifier: string * range: range
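
To make the shape of the new tree concrete, here is a minimal sketch of how one interpolated string might be represented with these two types. The `range`, `Ident` and `SynExpr` definitions are simplified, hypothetical stand-ins for the compiler's real types, the column numbers are invented, and representing a plain hole as `DotNet(None, None)` is an assumption rather than something this PR states.

```fsharp
// Hypothetical stand-ins for the compiler's real types, so the sketch is self-contained.
type range = { StartCol: int; EndCol: int }
type Ident = Ident of string
type SynExpr =
    | Var of string
    | IntConst of int

// The two types from the PR description (formatting first, so it is in scope).
type SynInterpolationFormatting =
    | DotNet of alignment: SynExpr option * format: Ident option
    | Printf of specifier: string * range: range

type SynInterpolatedStringPart =
    | String of value: string * range: range
    | FillExpr of fillExpr: SynExpr * formatting: SynInterpolationFormatting

// Invented column ranges, for illustration only.
let r startCol endCol = { StartCol = startCol; EndCol = endCol }

// A possible tree for: $"total: {x,6:F2} count: %d{n}"
let parts =
    [ String("total: ", r 2 9)
      FillExpr(Var "x", DotNet(Some(IntConst 6), Some(Ident "F2")))
      String(" count: ", r 17 25)
      FillExpr(Var "n", Printf("%d", r 25 27)) ]

// Each hole carries exactly one kind of formatting, so the match is exhaustive
// and a "printf plus .NET format" hole cannot be constructed.
let describe part =
    match part with
    | String(value, _) -> "literal:" + value
    | FillExpr(_, DotNet(None, None)) -> "plain hole"
    | FillExpr(_, DotNet _) -> ".NET-formatted hole"
    | FillExpr(_, Printf(spec, _)) -> "printf hole " + spec
```

Because the formatting is a single union value per hole, the invalid combination mentioned in item 4 has no representation at all.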

A NativeAOT test

A test under tests/AheadOfTime/NativeAOT (wired into the Windows trimming CI job) AOT-publishes a program that uses interpolation. Plain and .NET-format holes ({x}, {x:F2}, {x,6}) are AOT compatible. Printf-format holes (%d{x}) still route through sprintf, so they remain reflection-based and fail AOT with IL2026/IL2070/IL3050 — see the commented-out examples in the test.

This would be a good place to add other currently AOT-incompatible expressions for similar future fixes.
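
For orientation, a program in the spirit of that test might look like the sketch below. This is not the actual tests/AheadOfTime/NativeAOT/Program.fs; the values and names are hypothetical, and it only exercises the hole kinds the description lists as AOT compatible.

```fsharp
open System

let x = 42

// Plain, aligned and .NET-format holes: described above as AOT compatible.
let plain : string = $"x = {x}"
let aligned : string = $"[{x,6}]"
let formatted : string = $"{x:D5}"

Console.WriteLine(plain)
Console.WriteLine(aligned)
Console.WriteLine(formatted)

// Printf-format holes that still route through sprintf stay commented out,
// because (per the description) they would fail the AOT publish:
// let y = 3.14159
// Console.WriteLine($"value = %.2f{y}")
```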

Notes on the LowerInterpolatedStringToConcat feature flag

This work extends and supersedes the work under the LowerInterpolatedStringToConcat flag, an "optimization that lowers string interpolation into a call to concat iff there are at most 4 string parts and all fill expressions are strings".

It is unclear why this feature was gated, since it is just an optimization, and it is equally unclear what to do with the gate here. Options:

  • Apply the behaviour in this PR unconditionally. This is what I would recommend as the behaviour change (matching string in culture-independence) is a fix to existing behaviour.
  • Restore the existing LowerInterpolatedStringToConcat to apply to the current feature. The name is actually closer to the current feature than to the previous gated feature. This would require keeping the legacy code paths; if you ask me to do that, I would re-add them in a file clearly marked obsolete, to keep the current code clean.
  • Add a new gate term. The ultra-bureaucratic solution.

A string-typed interpolated string is lowered to System.String.Concat of its
parts rather than the reflection-based printf engine: a string-typed hole is
passed through directly, any other plain hole is converted with `string x`, an
aligned/formatted hole with `String.Format(InvariantCulture, ...)`, and a
printf-specifier hole with `sprintf`. This removes the reflection dependency on
the common path, so these interpolations become trim- and NativeAOT-compatible.
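
As a rough illustration of the paragraph above, the following hand-written F# approximates what each kind of hole is described as lowering to. It is a sketch at source level, not the typed-tree code the compiler actually emits, and the variable names are hypothetical.

```fsharp
open System
open System.Globalization

let name = "world"
let n = 7
let price = 3.14159

// $"hi {name}, n={n}": a string hole is passed through, any other plain hole goes via `string`.
let plainHoles = String.Concat("hi ", name, ", n=", string n)

// $"p={price,8:F2}": an aligned/formatted hole goes via String.Format with InvariantCulture.
// Boxing the argument selects the (IFormatProvider, string, obj) overload.
let formattedHole =
    String.Concat("p=", String.Format(CultureInfo.InvariantCulture, "{0,8:F2}", box price))

// $"n=%5d{n}": a printf-specifier hole still goes via sprintf.
let printfHole = String.Concat("n=", sprintf "%5d" n)
```

Only the last form touches the printf engine, which is why the common path becomes reflection-free.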

This generalizes and replaces the language-version-gated String.Concat
optimization (dotnet#16556), which only handled all-string holes: the lowering now
applies to every string-typed interpolation, ungated. The reflection path is
used only for PrintfFormat/FormattableString-typed interpolation.

The syntax tree now carries each hole's formatting explicitly, so a printf
specifier no longer leaks into an adjacent literal and alignment is no longer a
fake tuple:

    type SynInterpolatedStringPart =
        | String of value: string * range: range
        | FillExpr of fillExpr: SynExpr * formatting: SynInterpolationFormatting

    type SynInterpolationFormatting =
        | DotNet of alignment: SynExpr option * format: Ident option
        | Printf of specifier: string * range: range

Behavioural change: plain `{x}` holes now render with invariant culture (the F#
`string` operator) rather than the current thread culture, matching `string`.
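
The difference can be seen without interpolation at all. The sketch below contrasts the `string` operator with a culture-sensitive `ToString`; the choice of de-DE is just an example, and its output depends on the globalization data available on the machine.

```fsharp
open System.Globalization

let x = 1234.5

// The F# `string` operator formats with the invariant culture.
let viaStringOperator = string x

// A culture-sensitive conversion; de-DE typically uses ',' as the decimal separator.
let german = CultureInfo.GetCultureInfo "de-DE"
let viaGermanCulture = x.ToString(german)

// After this change, a plain $"{x}" hole is described as matching viaStringOperator
// regardless of the current thread culture.
```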

Adds a NativeAOT regression test under tests/AheadOfTime/NativeAOT.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
github-actions Bot commented Jun 18, 2026

✅ No release notes required

@github-actions github-actions Bot added ⚠️ Affects-Test-Tooling Tooling check: PR touches test framework infrastructure ⚠️ Affects-Compiler-Output Tooling check: PR touches IL emission or codegen ⚠️ Affects-Bootstrap Tooling check: PR touches compiler bootstrap chain ⚠️ Affects-Build-Infra Tooling check: PR touches build infrastructure labels Jun 18, 2026
@github-actions

🔍 Tooling Safety Check — Affects-Build-Infra, Affects-Bootstrap, Affects-Compiler-Output, Affects-Test-Tooling
Affects-Build-Infra: new .fsproj with PublishAot, custom compiler paths, Versions.props import
Affects-Bootstrap: modifies pars.fsy grammar and core compiler checking
Affects-Compiler-Output: changes interpolated string lowering from printf to String.Concat
Affects-Test-Tooling: new check.ps1/check.cmd scripts wired into AheadOfTime CI suite

Generated by PR Tooling Safety Check · opus46 5.8M ·

charlesroddie force-pushed the InterpolatedStringCollector branch from e2debcd to f571df9 on June 18, 2026 21:51
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread tests/AheadOfTime/NativeAOT/Program.fs Outdated
Comment on lines +17 to +18
// print $"answer = %d{x}"
// print $"hello %s{name}"

Contributor

%d and %s (unlike %.2f below) serve effectively only as a type annotation. Would it be difficult to identify this, strip the specifiers, and have these 2 lines also be lowered to concat?

Author

I assume the list is https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/plaintext-formatting, minus %a and %t.

The instructions for clean interpolated string usage could be "avoid format strings (Printf-style specifiers); use dotnet specifiers instead".

This could be modified to add "except you can use this whitelist" by doing what you suggest. It makes the instruction more complex so I wouldn't recommend that.

It could also be modified to "avoid using the %A printf-style format specifier" by working systematically through the list.

If these printf-style usages are important then going the whole way would be best.


I believe that the current implementation of the feature already converts $"hello %s{name}" into string.Concat, so changing it so that it no longer does would be a regression rather than an improvement?

@kerams kerams Jun 19, 2026

Contributor

@Numpsy, it applies only if every single hole is a string.

@charlesroddie, I was wrong in my assumption, it's not always just a type annotation :/

> $"{true}";;
val it: string = "True"

> $"%b{true}";;
val it: string = "true"

Guess we could still do it by replicating the run-time logic of printf, but maybe leave it out of this PR.


it applies only if every single hole is a string

Yes, my point was that the %s{name} case currently compiles into string.concat and doesn't produce any AOT warnings, but the comment here says it will produce warnings.

@charlesroddie charlesroddie Jun 20, 2026

Author

I've updated %s so there is no regression. I'll investigate adding more format specifiers to the AOT-supported list.

@charlesroddie charlesroddie Jun 20, 2026

Author

OK the status is that we have the legacy printf in F#, and it seeps into interpolated strings via printf format specifiers. Printf format specifiers are actually more complex than the simple list above, with various modifications possible. There is code which type checks according to a format specifier (CheckFormatStrings.ParseFormatString). But there is no single place which contains a spec, no type which defines a valid format specifier in the compiler, and no function which generates code based on a valid format specifier. It's all internal to printf and even the specs would involve looking at the printf code to see the behaviour.

So I think this prevents making significant progress here.

I suggest creating a language suggestion for discussion of this, which would define an "F# string format specifier" which would have all the nice things above that are currently missing.

It's very feasible to implement all this cleanly but it's a medium piece of work and requires a name for the feature without the term "printf" in it and with corresponding changes to public docs. So not for this PR in my opinion.

Author

> $"{true}";;
val it: string = "True"

> $"%b{true}";;
val it: string = "true"

I could add a change to make $"%b{true}" return True, in the spirit of matching string behaviour, as this PR already does by matching string's culture-independence.

charlesroddie and others added 6 commits June 19, 2026 17:27
Rewrite TcInterpolatedStringViaConcat to type-check each interpolation
part in place and convert it to a string expression, then String.Concat
them. This removes the parallel 'holeIsString' bool list and the
flat-fillExprs/dense-parts interleave entirely: 'build' now walks a
single list (the parts), threading only tpenv.

- Plain '{x}' holes are built directly in the typed tree: a string is
  passed through raw (matching dotnet#16556's lean IL), anything else is
  converted via the 'string' operator, emitted through a new
  string_operator_info intrinsic + mkCallStringOperator helper.
- Aligned/formatted and printf holes are checked from a small synthesized
  String.Format/sprintf expression, so name resolution still does the BCL
  work.
- The function-value warning is re-homed per-hole.

Known follow-ups: ill-typed formatted holes currently report their error
twice (the formatted arm type-checks the hole once for the warning and
again inside String.Format); the warning wants to move to its own pass
over hole types, which also removes that double check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A bare '%s' on a string hole now lowers to a String.Concat passthrough
(reflection-free, AOT-clean), like a plain '{string}' hole, instead of
routing through sprintf. This fixes the regression where '$"%s{name}"'
went through the printf engine. Other specifiers (and '%5s' etc.) still
format via sprintf, and the '%s' string constraint is still enforced.

convertHole now returns a (string expression, may-be-null) pair: only a
raw string passthrough can be null, so a lone such arg coalesces via the
'string' operator (e.g. $"%s{null}" and $"{(s: string)}" -> ""), while
multi-arg cases rely on String.Concat mapping null to "".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
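
The null handling described in this commit message relies on two library behaviours that can be checked directly. This is a minimal sketch of those behaviours only, not of convertHole itself; the name `missing` is hypothetical.

```fsharp
open System

let missing : string = null

// Multi-argument case: String.Concat treats a null argument as the empty string.
let multi = String.Concat("a", missing, "b")

// Lone-argument case: the F# `string` operator maps null to "".
let lone = string missing
```

Since the lone passthrough has no surrounding String.Concat call to absorb the null, coalescing it through `string` keeps the result a non-null string in both cases.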
The formatted/aligned hole path lowered via a synthesized String.Format that
re-type-checked the hole expression a second time, which could duplicate an
error in the hole and leak a spurious 'Format' overload-resolution error. Bind
the already-checked, boxed hole value to a temporary and reference that from the
synthesized String.Format instead.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +7607 to +7614
// Format the already-checked hole via a synthesized 'String.Format', binding its boxed value
// to a temporary so the hole is not type-checked a second time. Re-checking 'synFill' would
// duplicate any error in it; boxing to 'obj' keeps the 'Format' overload unambiguous (so a
// hole that already failed to check doesn't also leak a confusing 'Format' overload error).
let boxedFill = mkCallBox g m fillTy fill
let tmpVal, _ = mkLocal mSynth "interpHole" (tyOfExpr g boxedFill)
let envInner = AddLocalVal g cenv.tcSink mSynth tmpVal env
let tmpRef = SynExpr.Ident(mkSynId mSynth tmpVal.LogicalName)

Author

This code was added in e83ddfb. It replaces the previous shortcut, which had two minor issues: it sometimes gave duplicate error messages, and it type-checked the hole twice. It is worth considering whether the benefits justify the additional code.

These printf specifiers act only as a type annotation - their value renders the
same through the 'string' operator as through the specifier - so constrain the
hole to the specifier's type and render it with 'string', as for a plain '{x}'
hole, instead of routing through the reflection-based 'sprintf'. This makes them
NativeAOT compatible. '%u' is excluded as it reinterprets a signed value as
unsigned and so does not match 'string'.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
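
The reason for excluding '%u' can be reproduced with `sprintf` alone. The sketch below uses '%d' as an example of a specifier that renders the same as `string`; the commit message does not list which specifiers qualify, so treating '%d' as one of them is an assumption.

```fsharp
let n = -1

// '%d' on an int renders the same text as the `string` operator.
let viaDecimal = sprintf "%d" n
let viaString = string n

// '%u' reinterprets the signed value as unsigned, so it differs from `string`.
let viaUnsigned = sprintf "%u" n
```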