Interprocedural propagation of slot refinements #57651

Draft
wants to merge 5 commits into
base: master

serenity4
Contributor

This PR aims to propagate slot refinement information interprocedurally, which allows inference to narrow down the type of a slot as follows:

subfunc(x) = x::Float64 + 1
function topfunc(x) # `topfunc` will be inferred with `x::Any`
    y = subfunc(x)
    return sin(x), cos(y)
end
src = code_typed1(topfunc, (Any,); optimize = false)

Before:

CodeInfo(
1 ─ %1 = Main.subfunc::Core.Const(subfunc)
│        (y = (%1)(x))::Float64
│   %3 = Main.sin::Core.Const(sin)
│   %4 =   dynamic (%3)(x)::Any
│   %5 = Main.cos::Core.Const(cos)
│   %6 = y::Float64
│   %7 =   dynamic (%5)(%6)::Float64
│   %8 =   builtin Core.tuple(%4, %7)::Tuple{Any, Float64}
└──      return %8
)

After:

CodeInfo(
1 ─ %1 = Main.subfunc::Core.Const(subfunc)
│        (y = (%1)(x))::Float64
│   %3 = Main.sin::Core.Const(sin)
│   %4 =   dynamic (%3)(x)::Float64
│   %5 = Main.cos::Core.Const(cos)
│   %6 = y::Float64
│   %7 =   dynamic (%5)(%6)::Float64
│   %8 =   builtin Core.tuple(%4, %7)::Tuple{Float64, Float64}
└──      return %8
)
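
To reproduce the comparison outside the Compiler test suite, here is a minimal sketch. It assumes `code_typed1` is a test helper that unwraps the single result of `code_typed`, so the public reflection call is used directly instead. The types printed depend on whether the build includes this PR.

```julia
subfunc(x) = x::Float64 + 1
function topfunc(x)
    y = subfunc(x)
    return sin(x), cos(y)
end

# `code_typed` returns one `CodeInfo => return type` pair per matching method;
# `only` unwraps the single pair we expect here.
src, rettype = only(code_typed(topfunc, (Any,); optimize = false))
println(rettype)
```

Without the change the first tuple element is expected to stay `Any`; with it, the refinement on `x` coming out of `subfunc` should narrow the `sin(x)` call as well.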

The general mechanism is a form of reverse type inference (in the sense that it looks at how the program uses a value to deduce its type), providing an upper bound on types which inference may then use within control-flow paths assumed to be error-free. For example, it should eventually enable accurate inference for code such as

f1(x::Number) = x
f2(x::Signed) = x
f2(x::Symbol) = x
f3(x::Integer) = x::Int64
f3(x::AbstractFloat) = x::Float64

g1(x) = x::AbstractString
g2(x) = isa(x, String) ? x : throw(ArgumentError("Expected a String"))

function f(x, cond::Bool) # assume `f` will be inferred with `x::Any`
    if cond
        f1(x) # implies x::Number (assuming a method match)
        f2(x) # implies x::Signed (assuming a method match)
        f3(x) # implies x::Integer -> x::Int64 (assuming a method match and a non-throwing type assert)
    else
        g1(x) # implies x::AbstractString (assuming a method match and a non-throwing type assert)
        g2(x) # implies x::String (assuming a method match and a non-throwing execution)
    end
    x # inferred to x::Union{Int64, String}
end

code_typed(f, (Any, Bool); optimize = false)
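
The narrowing described in the comments of `f` can be mimicked by hand with `typeintersect`. This is only a toy model of the reasoning, not the Compiler's implementation: `narrow` is a hypothetical helper, and the accepted argument types are written out manually rather than read from the method tables.

```julia
# Each call that is assumed to succeed intersects the current bound on `x`
# with the set of argument types the callee accepts.
narrow(bound, accepted) = typeintersect(bound, accepted)

# `cond == true` branch
then_bound = Any
then_bound = narrow(then_bound, Number)                         # f1(x::Number)
then_bound = narrow(then_bound, Union{Signed, Symbol})          # f2 has two methods
then_bound = narrow(then_bound, Union{Integer, AbstractFloat})  # f3 method match
then_bound = narrow(then_bound, Int64)                          # x::Int64 inside f3(::Integer)

# `cond == false` branch
else_bound = Any
else_bound = narrow(else_bound, AbstractString)                 # g1: x::AbstractString
else_bound = narrow(else_bound, String)                         # g2: isa(x, String) or throw

# At the join point, the two branch bounds are merged with a union.
merged = Union{then_bound, else_bound}
println(merged)
```

Intersection models "every assumption on this path holds", while the union at the join models "either path may have been taken".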

A local version has already been implemented for slots through the slot refinement mechanism, where e.g. type asserts and method matches (see footnote 1) within the same function are used to tighten slot types (see #55229). In addition to being limited to slots, the current implementation only propagates the information forward from the point where it is obtained. While future work may attempt to lift these limitations, this PR only intends to propagate this information interprocedurally, so that refinements for slots within a callee may be mapped back to slot refinements within the caller.

We may distinguish between two types of refinement:

  • Local refinement, which contains type assumptions made along a program path (e.g. based on isa branches or type assertions made in a given block).
  • Global refinement, which contains assumptions that hold regardless of control-flow.

Global refinement is the only information that can be reliably propagated across function boundaries: it allows a caller to tighten the type of the arguments passed to a callee, under the assumption that the call does not throw. It may be derived from local refinements by merging them across control-flow paths, similar to how return type inference merges return types. As an example, an isa conditional globally tightens the type of its input only if it is used to branch and one of the branches throws.
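
One way to picture the merge is sketched below. It is a hypothetical model, not the PR's data structures: each control-flow path is a named tuple holding its local slot refinements and a flag saying whether the path ends in a throw, and `global_refinement` is an illustrative name.

```julia
# Merge local refinements into a global one: for each slot, take the union of
# its type over all non-throwing paths. A path that says nothing about a slot
# contributes `Any` for it.
function global_refinement(paths)
    live = filter(p -> !p.throws, paths)
    merged = Dict{Symbol, Type}()
    for p in live, name in keys(p.slots)
        haskey(merged, name) && continue
        merged[name] = Union{[get(q.slots, name, Any) for q in live]...}
    end
    return merged
end

# isa(x, String) ? x : throw(...)  -> the failing branch throws
with_throw = [
    (slots = Dict{Symbol, Type}(:x => String), throws = false),
    (slots = Dict{Symbol, Type}(), throws = true),
]

# isa(x, String) ? x : nothing     -> both branches return normally
without_throw = [
    (slots = Dict{Symbol, Type}(:x => String), throws = false),
    (slots = Dict{Symbol, Type}(), throws = false),
]

println(global_refinement(with_throw)[:x])
println(global_refinement(without_throw)[:x])
```

Only the first case tightens `x` globally; in the second, the unrefined branch widens the union back to `Any`.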

Note that because the core assumption for this type refinement mechanism is that the execution does not throw, this information would have to be discarded at try block boundaries.
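
A small illustration of why the information cannot cross a try boundary (`guarded` is a hypothetical function written for this example):

```julia
subfunc(x) = x::Float64 + 1

function guarded(x)
    try
        subfunc(x)   # may throw a TypeError, which is swallowed below
    catch
    end
    return x         # reached even when `x` is not a Float64
end

println(guarded("text"))
```

Execution continues past the failing type assert, so nothing learned inside the try block about `x` can be assumed after it.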

Footnotes

  1. Relying on a finite set of method matches to deduce type information about their potential arguments may introduce more sources of invalidation.

Member

topolarity commented Mar 5, 2025

reverse type inference

I've always been a bit confused by this - why is it reverse? I thought the more-precise type-information only applies from the type-assert onward

edit: I see now. The key condition is:

within control-flow paths assumed to be error-free

So within a nothrow region, a type-assert implies that the object was of that type all along (since the user has told us it would be UB if it weren't, as we would throw())

Comment on lines +610 to +614
inferred = refinements[j].typ
sigt ⊑ inferred ? sigt : inferred ⊑ sigt ? inferred : begin
    # Core.println("sigt and inferred are not ordered: ", sigt, ", ", inferred)
    sigt
end
Contributor Author

serenity4 commented Mar 5, 2025

This is where refinement information obtained from callees is hooked into the slot refinement mechanism. Setting this to sigt restores the previous behavior by discarding interprocedural information.
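
The selection in the snippet above can be approximated outside the Compiler as follows. This is a sketch only: plain subtyping `<:` stands in for the lattice order `⊑`, and `pick_tighter` is a hypothetical name.

```julia
# Keep whichever of the two types is more precise; if they are not ordered,
# fall back to the signature type, as the snippet does.
pick_tighter(sigt, inferred) =
    sigt <: inferred ? sigt :
    inferred <: sigt ? inferred :
    sigt

println(pick_tighter(Any, Float64))
println(pick_tighter(Int, Integer))
println(pick_tighter(Int, String))
```

Always returning `sigt` corresponds to the previous behavior, where information from callees is discarded.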

Unfortunately, enabling this integration currently breaks bootstrapping, with

Compiling the compiler. This may take several minutes ...
error during bootstrap:
LoadError(at "Base_compiler.jl" line 3: BoundsError(a=Array{TypeVar, 1}(dims=(1,), mem=Memory{TypeVar}(8, 0x7f31788dbf80)[
  𝕃ᵢ,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>]), i=(0,)))
throw_boundserror at ./essentials.jl:15
getindex at ./essentials.jl:950 [inlined]
_apply_type_tfunc at ./../usr/share/julia/Compiler/src/tfuncs.jl:1926
#apply_type_tfunc#105 at ./../usr/share/julia/Compiler/src/tfuncs.jl:1764
unknown function (ip: 0x7f317b6ca5b9) at (unknown file)
_jl_invoke at /home/serenity4/julia/wip3/src/gf.c:3500 [inlined]
ijl_apply_generic at /home/serenity4/julia/wip3/src/gf.c:3700
unknown function (ip: 0x7f31841a020d) at (unknown file)
_jl_invoke at /home/serenity4/julia/wip3/src/gf.c:3500 [inlined]

although setting this to sigt for the build and then revising the changes in afterwards shows no errors when running the Compiler test suite.

Comment on lines +4295 to +4299
    postdominates(postdomtree, currbb, 1) && apply_refinement!(𝕃ᵢ, refinements.slot, refinements.typ, frame.refinements, changes)
elseif refinements isa Vector{SlotRefinement}
    for refinement in refinements
        apply_refinement!(𝕃ᵢ, refinement.slot, refinement.typ, currstate, changes)
        postdominates(postdomtree, currbb, 1) && apply_refinement!(𝕃ᵢ, refinement.slot, refinement.typ, frame.refinements, changes)
Contributor Author

Note: this is WIP and will be replaced with actual merging of refinements from different control-flow paths.
For now, it only takes information from blocks that every control-flow path will inevitably traverse.
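
The "blocks that every path traverses" condition is a postdominance check against the entry block. The sketch below computes postdominator sets for a toy CFG with a simple fixpoint iteration; it is not the Compiler's postdominator tree, the `succs` graph is made up, and the two-argument `postdominates` here is a hypothetical stand-in for the three-argument call in the diff.

```julia
# Toy CFG: block => successor blocks. Block 1 is the entry, block 4 the only exit.
succs = Dict(1 => [2, 3], 2 => [4], 3 => [4], 4 => Int[])

function postdominators(succs)
    blocks = sort!(collect(keys(succs)))
    # Exits are postdominated only by themselves; every other block starts
    # from the full set, which the loop below shrinks to a fixpoint.
    pdom = Dict(b => (isempty(succs[b]) ? Set([b]) : Set(blocks)) for b in blocks)
    changed = true
    while changed
        changed = false
        for b in blocks
            isempty(succs[b]) && continue
            # A block is postdominated by itself and by whatever
            # postdominates all of its successors.
            common = reduce(intersect, [pdom[s] for s in succs[b]])
            updated = union(Set([b]), common)
            if updated != pdom[b]
                pdom[b] = updated
                changed = true
            end
        end
    end
    return pdom
end

# `a` postdominates `b` if every path from `b` to the exit goes through `a`.
postdominates(pdom, a, b) = a in pdom[b]

pdom = postdominators(succs)
println(sort!(collect(pdom[1])))
```

In this graph only blocks 1 and 4 postdominate the entry, so refinements found in the branch blocks 2 and 3 would not be recorded as frame-level refinements under the current WIP rule.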
