Fix Float32/16 raised to integer typemin #57487

Closed
wants to merge 427 commits into from

Conversation

kuszmaul

Fixes #57464
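
For context, a minimal sketch of the integer edge case that makes `typemin` exponents awkward. That this overflow is what triggers the `DomainError` in the Float32/16 power path is an assumption, not something stated in this PR; the variable names are illustrative only.

```julia
# Negating typemin wraps around in two's-complement arithmetic,
# so code that handles negative exponents via `-n` still sees a negative value.
n = typemin(Int64)
neg_overflows = (-n == n)      # negation wraps back to typemin
still_negative = (-n < 0)      # so the "now positive" assumption fails
widened = -Int128(n)           # widening first yields the true magnitude
println((neg_overflows, still_negative, widened))
```

Any fix therefore has to treat `typemin` separately or avoid negating in the original integer width.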

Pangoraw and others added 30 commits October 24, 2023 10:44
…#51785)

[Docs](https://docs.julialang.org/en/v1/manual/control-flow/#else-Clauses)
state:

> The try, catch, else, and finally clauses each introduce their own
> scope blocks.

But this is currently not the case for `else` blocks:

```julia
julia> try
       catch
       else
           z = 1
       end
1

julia> z
1
```

This change actually makes `else` blocks have their own scope block:

```julia
julia> try
       catch
       else
           z = 1
       end
1

julia> z
ERROR: UndefVarError: `z` not defined
```

(cherry picked from commit 17a36ee)
Fixes JuliaLang#51771

The convert method that asserts in JuliaLang#51771 is arguably still faulty
though.

(cherry picked from commit cf00550)
…Lang#51843)

Otherwise `--heap-size-hint` will become a no-op.

Likely a merge bug from JuliaLang#51661.
It seems this case has already been fixed by other improvements, so we
no longer need this hack, which is now causing problems.

Fixes JuliaLang#51694
…ion. (JuliaLang#51840)

This allows other users of LLVM to use opaque pointers with their
contexts.

Co-authored-by: Jameson Nash <[email protected]>
Restores the method whose removal was probably causing problems.

(cherry picked from commit 14d9c7c)
Restores the method whose removal was probably causing problems.

(cherry picked from commit f6f1ee9)
This shouldn't be needed because `ldd` should do it itself.

(cherry picked from commit 5b34cdf)
Backported PRs:
- [x] JuliaLang#50932 <!-- types: fix hash values of Vararg -->
- [x] JuliaLang#50975 <!-- Use rr-safe `nopl; rdtsc` sequence -->
- [x] JuliaLang#50989 <!-- fix incorrect results in `expm1(::Union{Float16,
Float32})` -->
- [x] JuliaLang#51284 <!-- Avoid infinite loop when doing SIGTRAP in arm64-apple
-->
- [x] JuliaLang#51332 <!-- Add s4 field to Xoshiro -->
- [x] JuliaLang#51397 <!-- call Pkg precompile hook in latest world -->
- [x] JuliaLang#51405 <!-- Remove fallback that assigns a module to inlined
frames. -->
- [x] JuliaLang#51491 <!-- Throw clearer ArgumentError for strip with two string
args -->
- [x] JuliaLang#51531 <!-- fix `_tryonce_download_from_cache` (busybox.exe
download error) -->
- [x] JuliaLang#51541 <!-- Fix string index error in tab completion code -->
- [x] JuliaLang#51530 <!-- Don't mark nonlocal symbols as hidden -->
- [x] JuliaLang#51557 <!-- Fix last startup & shutdown precompiles -->
- [x] JuliaLang#51512 <!-- avoid limiting Type{Any} to Type -->
- [x] JuliaLang#51595 <!-- reset `maxprobe` on `empty!` -->
- [x] JuliaLang#51582 <!-- Aggressive constprop in LinearAlgebra.wrap -->
- [x] JuliaLang#51592 <!-- correctly track element pointer in heap snapshot -->
- [x] JuliaLang#51326 <!-- complete false & true more generally as vals -->
- [x] JuliaLang#51376 <!-- make `hash(::Xoshiro)` compatible with `==` -->
- [x] JuliaLang#51557 <!-- Fix last startup & shutdown precompiles -->
- [x] JuliaLang#51845 
- [x] JuliaLang#51840 
- [x] JuliaLang#50663 <!-- Fix Expr(:loopinfo) codegen -->
- [x] JuliaLang#51863 <!-- LLVM 15.0.7-9 -->

Contains multiple commits, manual intervention needed:

- [ ] JuliaLang#51035 <!-- refactor GC scanning code to reflect jl_binding_t are
now first class -->
- [ ] JuliaLang#51092 <!-- inference: fix bad effects for recursion -->

Non-merged PRs with backport label:
- [ ] JuliaLang#51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
- [ ] JuliaLang#51414 <!-- improvements on GC scheduler shutdown -->
- [ ] JuliaLang#51366 <!-- Handle infix operators in REPL completion -->
- [ ] JuliaLang#50919 <!-- Code loading: do the "skipping mtime check for stdlib"
check regardless of the value of `ispath(f)` -->
- [ ] JuliaLang#50824 <!-- Add some aliasing warnings to docstrings for mutating
functions in Base -->
- [ ] JuliaLang#49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
)

`jl_errorexception_type` is undefined at the point we (fail to) load a
sysimg.

(cherry picked from commit 20a5fa7)
This is required now once Distributed is not in the sysimage.

Fixes https://github.com/JuliaLang/julia/issues/51756

(cherry picked from commit 795d8d7)
…1848)

This aligns their behavior with manual calls to `finalize(o)`, and
prepares for a future time in which these functions are always run on a
separate thread. This means that they can wait to acquire locks in this
context, which otherwise would have been denied to them.

(cherry picked from commit c54a3f2)
Can cause spurious warnings about not closing these properly and
unexpected events to appear after `close` returns.

(cherry picked from commit d0c4284)
…dispatch (JuliaLang#51995)

The artifacts dict is not lowered to `ensure_artifact_installed`, which
causes the ".toml" to be loaded at runtime for lazy artifacts.

(cherry picked from commit 9bc6994)
This bumps Statistics to the latest commit of the release-1.10 branch in
order to backport JuliaStats/Statistics.jl#153.

See JuliaData/DataFrames.jl#3383. Cc: @bkamins
@George9000
…51213)

This avoids crashes where we run the destructors because C++ is fun
and runs destructors before thread exit.

(cherry picked from commit 3d88550)
…JuliaLang#50207)

Suggested by @vchuravy.

---------

Co-authored-by: Jameson Nash <[email protected]>
(cherry picked from commit 2adf54a)
Two changes wrapped into one: `Base.copymutable` => `Base.copymutable` &
`collect` and `Base.copymutable` => `similar` & words.

Followup for JuliaLang#52086 and JuliaLang#46104; also fixes JuliaLang#51932 (though we still may
want to make `copymutable` public at some point)

---------

Co-authored-by: Jameson Nash <[email protected]>
(cherry picked from commit 42c088b)
This fixes a whole bunch of small but annoying bugs, as described in the
JuliaSyntax-0.4.7 release notes

https://github.com/JuliaLang/JuliaSyntax.jl/releases/tag/v0.4.7

I've been careful about cutting the JuliaSyntax-0.4.7 release from
nonbreaking changes, so we should be able to backport this to 1.10.

---

Extended notes about compatibility
* The public keyword in
JuliaLang/JuliaSyntax.jl#320 is released in
JuliaSyntax-0.4.7 but JuliaSyntax is multi-version aware so this is
disabled when used as the default parser in Julia 1.10, but is enabled
in 1.11-DEV. So should be backportable.
* We aim for parsing to `Expr` to always be stable in JuliaSyntax and
independent of the host Julia `VERSION`, but we're not fully there yet
for 1.11 / 1.10 due to
JuliaLang/JuliaSyntax.jl#377. Thus some
careful management of the JuliaSyntax-0.4.x branch for now.

(cherry picked from commit 85d7cca)
…aces (JuliaLang#51520)

On AMDGPU, this was generating an `addrspace(10)` pointer to an `alloca`,
which is illegal and led to other issues.

(cherry picked from commit af9a7af)
)

Fixes JuliaLang#51985

Ensure that the REPL completions escape and unescape text correctly,
using the correct functions, and accounting for exactly what the user
has currently typed.

The old broken method is left around for Pkg, since it has an
over-reliance on it returning incorrect answers. Once Pkg is fixed, we
can delete that code.

Co-authored-by: Jameson Nash <[email protected]>
(cherry picked from commit 5edcdc5)
d-netto and others added 28 commits September 4, 2024 11:14
) (#184)

`%M` is the format specifier for the minutes, not the month (which
should be `%m`), and it was used twice.

Also, on macOS `Libc.strptime` internally calls `mktime` which depends
on the local timezone. We now temporarily set `TZ=UTC` to avoid
depending on the local timezone.
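
A small sketch of the specifier mix-up, assuming a platform where `Libc.strptime` is available (it is not on Windows). The date string and variable names are made up for illustration; `TmStruct` stores the month zero-based.

```julia
# `%m` parses the month, `%M` parses the minutes.
# TZ is pinned to UTC so nothing depends on the local timezone.
good, bad = withenv("TZ" => "UTC") do
    g = Libc.strptime("%Y-%m-%d %H:%M", "2024-06-15 12:30")
    # Wrong: `%M` used where the month was intended, so the month is never set.
    b = Libc.strptime("%Y-%M-%d %H:%M", "2024-06-15 12:30")
    (g, b)
end
println((good.month, good.min))  # month field is zero-based (5 means June)
println((bad.month, bad.min))
```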

Fix JuliaLang#55827.

Co-authored-by: Mosè Giordano <[email protected]>
* Add heartbeat pause/resume capability

* Add check to avoid negative sleep duration

* Disable heartbeats in `jl_print_task_backtraces()`

`jl_print_task_backtraces()` can take long enough that there can
be heartbeat loss, which can trigger printing task backtraces
again, unless it is called from the heartbeat thread which takes
care of that possible problem.

* Pause heartbeats for GC

* Address review comment

* Address review comment
…g#55826) (#189)

Additional GC observability tool.

This will help us to diagnose why some of our servers are triggering so
many full GCs in certain circumstances.
Similar to `--trace-compile`, emit the `precompile` statement for a method
once, but only when it is dynamically dispatched.

For this, we rename the `precompiled` field in `jl_method_instance_t` to
`flags` and use bit 0 as `precompiled` and bit 1 as `dispatched`.

When the method is dispatched, the `dispatched` bit is set to 1 and the
precompile statement is emitted. This check is done in
`jl_gf_invoke_by_method` and in the slow path (cache miss) of
`jl_apply_generic`.
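
The flag handling can be sketched as follows. This is a Julia model of the idea only; the real change lives in the C runtime, and the constant and function names here are hypothetical.

```julia
# Model of the `flags` byte: bit 0 = precompiled, bit 1 = dispatched.
const PRECOMPILED_BIT = 0x01
const DISPATCHED_BIT  = 0x02

# Returns (new_flags, emit): `emit` is true only on the first dynamic dispatch,
# which is when the precompile statement would be written out.
function on_dynamic_dispatch(flags::UInt8)
    already = (flags & DISPATCHED_BIT) != 0
    return flags | DISPATCHED_BIT, !already
end

flags = 0x00
flags, emit1 = on_dynamic_dispatch(flags)  # first dispatch: emit
flags, emit2 = on_dynamic_dispatch(flags)  # later dispatches: stay silent
println((flags, emit1, emit2))
```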
…#192)

There was a missing re-assignment of old = -1; at the end of that loop
which means in the ABA case, we accidentally actually acquire the lock
on the thread despite not actually having stopped the thread; or in the
counter-case, we try to run through this logic with old==-1 on the next
iteration, and that isn't valid either (jl_thread_suspend_and_get_state
should return failure and the loop will abort too early).

Fix JuliaLang#56046

Co-authored-by: Jameson Nash <[email protected]>
One limitation of sampling CPU/thread profiles, as is currently done in
Julia, is that they primarily capture samples from CPU-intensive tasks.

If many tasks are performing IO or contending for concurrency primitives
like semaphores, these tasks won’t appear in the profile, as they aren't
scheduled on OS threads sampled by the profiler.

A wall-time profiler, like the one implemented in this PR, samples tasks
regardless of OS thread scheduling. This enables profiling of IO-heavy
tasks and detecting areas of heavy contention in the system.

Co-developed with @nickrobinson251.
Instead of always updating it. This should speed up loading only
method specializations.
…Lang#54634) (#199)

This avoids a: `error: non-private labels cannot appear between
.cfi_startproc / .cfi_endproc pairs` error.
That error was introduced in https://reviews.llvm.org/D155245#4657075
see also llvm/llvm-project#72802

(cherry picked from commit a4e793e)
(cherry picked from commit 3f35094)

Co-authored-by: Gabriel Baraldi <[email protected]>
* Optionally disallow defining new methods and drop backedges
… counter -- per (module, method name) pair (JuliaLang#53719) (#179)

As mentioned in JuliaLang#53716, we've
been noticing that `precompile` statements lists from one version of our
codebase often don't apply cleanly in a slightly different version.

That's because a lot of nested and anonymous function names have a
global numeric suffix which is incremented every time a new name is
generated, and these numeric suffixes are not very stable across
codebase changes.

To solve this, this PR makes the numeric suffixes a bit more fine
grained: every pair of (module, top-level/outermost function name) will
have its own counter, which should make nested function names a bit more
stable across different versions.

This PR applies @JeffBezanson's idea of making the symbol name changes
directly in `current-julia-module-counter`.

Here is an example:

```Julia
julia> function foo(x)
           function bar(y)
               return x + y
           end
       end
foo (generic function with 1 method)

julia> f = foo(42)
(::var"#bar#foo##0"{Int64}) (generic function with 1 method)
```
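
A rough model of the per-pair counter, written in Julia for illustration. The actual change is made in the lowering code (`current-julia-module-counter`); the `next_suffix!` and `nested_name` helpers are hypothetical, and the name format only mimics the example above.

```julia
# One counter per (module, outermost function name) pair instead of one global counter.
function next_suffix!(counters::Dict{Tuple{Symbol,Symbol},Int}, mod::Symbol, outer::Symbol)
    n = get(counters, (mod, outer), 0)
    counters[(mod, outer)] = n + 1
    return n
end

# Build a name shaped like the `#bar#foo##0` seen in the example.
nested_name(counters, mod, outer, inner) =
    string("#", inner, "#", outer, "##", next_suffix!(counters, mod, outer))

counters = Dict{Tuple{Symbol,Symbol},Int}()
a = nested_name(counters, :Main, :foo, :bar)
b = nested_name(counters, :Main, :foo, :bar)
c = nested_name(counters, :Main, :baz, :qux)  # unaffected by the closures in `foo`
println((a, b, c))
```

Adding or removing closures inside `foo` shifts only the suffixes under `foo`, which is why names elsewhere stay stable across codebase changes.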

Co-authored-by: Diogo Netto <[email protected]>
* Add per-task metrics (JuliaLang#56320)

Close JuliaLang#47351 (builds on top of
JuliaLang#48416)

Adds two per-task metrics:
- running time = amount of time the task was actually running (according
to our scheduler). Note: currently inclusive of GC time, but would be
good to be able to separate that out (in a future PR)
- wall time = amount of time between the scheduler becoming aware of
this task and the task entering a terminal state (i.e. done or failed).

We record running time in `wait()`, where the scheduler stops running
the task as well as in `yield(t)`, `yieldto(t)` and `throwto(t)`, which
bypass the scheduler. Other places where a task stops running (for
`Channel`, `ReentrantLock`, `Event`, `Timer` and `Semaphore`) are all
implemented in terms of `wait(Condition)`, which in turn calls `wait()`.
`LibuvStream` similarly calls `wait()`.

This should capture everything (albeit, slightly over-counting task CPU
time by including any enqueuing work done before we hit `wait()`).

The various metrics counters could be a separate inlined struct if we
think that's a useful abstraction, but for now I've just put them
directly in `jl_task_t`. They are all atomic, except the
`metrics_enabled` flag itself (which we now have to check on task
start/switch/done even if metrics are not enabled) which is set on task
construction and marked `const` on the julia side.

In future PRs we could add more per-task metrics, e.g. compilation time,
GC time, allocations, potentially a wait-time breakdown (time waiting on
locks, channels, in the scheduler run queue, etc.), potentially the
number of yields.

Perhaps in future there could be ways to enable this on a per-thread and
per-task basis. And potentially in future these same timings could be
used by `@time` (e.g. writing this same timing data to a ScopedValue
like in JuliaLang#55103 but only for tasks
lexically scoped to inside the `@time` block).

Timings are off by default but can be turned on globally via starting
Julia with `--task-metrics=yes` or calling
`Base.Experimental.task_metrics(true)`. Metrics are collected for all
tasks created when metrics are enabled. In other words,
enabling/disabling timings via `Base.Experimental.task_metrics` does not
affect existing `Task`s, only new `Task`s.

The other new APIs are `Base.Experimental.task_running_time_ns(::Task)`
and `Base.Experimental.task_wall_time_ns(::Task)` for retrieving the new
metrics. These are safe to call on any task (including the current task,
or a task running on another thread). All these are in
`Base.Experimental` to give us room to change up the APIs as we add more
metrics in future PRs (without worrying about release timelines).
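
A usage sketch of the APIs named above. It assumes a Julia build that actually ships them, so it is guarded with `isdefined`; the `measure_task` helper is hypothetical.

```julia
# Run `f` in a fresh task with metrics enabled and return its timings,
# or `nothing` when this Julia build lacks the experimental API.
function measure_task(f)
    isdefined(Base.Experimental, :task_metrics) || return nothing
    Base.Experimental.task_metrics(true)   # only affects tasks created from now on
    try
        t = Task(f)
        schedule(t)
        wait(t)
        return (running_ns = Base.Experimental.task_running_time_ns(t),
                wall_ns = Base.Experimental.task_wall_time_ns(t))
    finally
        Base.Experimental.task_metrics(false)
    end
end

result = measure_task(() -> sum(rand(10^5)))
println(result)
```

Because metrics are fixed at task construction, the task has to be created after the call to `task_metrics(true)`.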

cc @NHDaly @kpamnany @d-netto

---------

Co-authored-by: Pete Vilter <[email protected]>
Co-authored-by: K Pamnany <[email protected]>
Co-authored-by: Nathan Daly <[email protected]>
Co-authored-by: Valentin Churavy <[email protected]>

* Address review comments

---------

Co-authored-by: Pete Vilter <[email protected]>
Co-authored-by: K Pamnany <[email protected]>
Co-authored-by: Nathan Daly <[email protected]>
Co-authored-by: Valentin Churavy <[email protected]>
…uliaLang#56814) (#200)

I propose a change in the implementation of the `ReentrantLock` to
improve its overall throughput for short critical sections and fix the
quadratic wake-up behavior where each unlock schedules **all** waiting
tasks on the lock's wait queue.

This implementation follows the same principles of the `Mutex` in the
[parking_lot](https://github.com/Amanieu/parking_lot/tree/master) Rust
crate which is based on the Webkit
[WTF::ParkingLot](https://webkit.org/blog/6161/locking-in-webkit/)
class. Only the basic working principle is implemented here, further
improvements such as eventual fairness will be proposed separately.

The gist of the change is that we add one extra state to the lock,
essentially going from:
```
0x0 => The lock is not locked
0x1 => The lock is locked by exactly one task. No other task is waiting for it.
0x2 => The lock is locked and some other task tried to lock but failed (conflict)
```
To:
```
```

In the current implementation we must schedule all tasks to cause a
conflict (state 0x2) because on unlock we only notify any task if the
lock is in the conflict state. This behavior means that with high
contention and a short critical section the tasks will be effectively
spinning in the scheduler queue.

With the extra state the proposed implementation has enough information
to know if there are other tasks to be notified or not, which means we
can always notify one task at a time while preserving the optimized path
of not notifying if there are no tasks waiting. To improve throughput
for short critical sections we also introduce a bounded amount of
spinning before attempting to park.
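
To illustrate the "notify one task at a time" idea, here is a pure model of an unlock decision. It is not a reconstruction of the state table in this PR: the bit layout and names are assumptions borrowed from how parking_lot-style mutexes are commonly described.

```julia
# Hypothetical encoding, for illustration only.
const LOCKED_BIT = 0x01   # some task holds the lock
const PARKED_BIT = 0x02   # at least one task is parked on the wait queue

# Returns (new_state, wake_one) for an unlock in the given state.
function unlock_transition(state::UInt8, waiters_left_after_wake::Int)
    (state & LOCKED_BIT) != 0 || error("unlock of an unlocked lock")
    if (state & PARKED_BIT) == 0
        return 0x00, false            # fast path: nobody to notify
    end
    # Wake exactly one task; keep PARKED_BIT only if others remain queued.
    new_state = waiters_left_after_wake > 0 ? PARKED_BIT : 0x00
    return new_state, true
end

println(unlock_transition(0x01, 0))
println(unlock_transition(0x03, 2))
```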

Not spinning on the scheduler queue greatly reduces the CPU utilization
of the following example:

```julia
function example()
    lock = ReentrantLock()
    @sync begin
        for i in 1:10000
            Threads.@spawn begin
                @lock lock begin
                    sleep(0.001)
                end
            end
        end
    end
end

@time example()
```

Current:
```
28.890623 seconds (101.65 k allocations: 7.646 MiB, 0.25% compilation time)
```

![image](https://github.com/user-attachments/assets/dbd6ce57-c760-4f5a-b68a-27df6a97a46e)

Proposed:
```
22.806669 seconds (101.65 k allocations: 7.814 MiB, 0.35% compilation time)
```

![image](https://github.com/user-attachments/assets/b0254180-658d-4493-86d3-dea4c500b5ac)

In a micro-benchmark where 8 threads contend for a single lock with a
very short critical section we see a ~2x improvement.

Current:
```
8-element Vector{Int64}:
 6258688
 5373952
 6651904
 6389760
 6586368
 3899392
 5177344
 5505024
Total iterations: 45842432
```

Proposed:
```
8-element Vector{Int64}:
 12320768
 12976128
 10354688
 12845056
  7503872
 13598720
 13860864
 11993088
Total iterations: 95453184
```
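
The micro-benchmark itself is not included in this description. A sketch of what such a contention benchmark could look like, with a hypothetical `contention_benchmark` helper and arbitrary task count and duration:

```julia
# Each task repeatedly takes the same lock for a very short critical section
# and counts how many iterations it completed before the deadline.
function contention_benchmark(ntasks::Int, seconds::Float64)
    lk = ReentrantLock()
    counts = zeros(Int, ntasks)
    deadline = time() + seconds
    @sync for i in 1:ntasks
        Threads.@spawn begin
            n = 0
            while time() < deadline
                lock(lk) do
                    n += 1
                end
            end
            counts[i] = n   # each task writes only its own slot
        end
    end
    return counts
end

counts = contention_benchmark(4, 0.2)
println(counts)
println("Total iterations: ", sum(counts))
```

Numbers from a sketch like this depend heavily on the thread count (`julia -t N`) and the machine, so they are not comparable to the figures quoted above.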

~~In the uncontended scenario the extra bookkeeping causes a 10%
throughput reduction:~~
EDIT: I reverted `_trylock` to the simple case to recover the uncontended
throughput, and now both implementations are in the same ballpark
(without hurting the above numbers).

In the uncontended scenario:

Current:
```
Total iterations: 236748800
```

Proposed:
```
Total iterations: 237699072
```

Closes JuliaLang#56182

Co-authored-by: André Guedes <[email protected]>
…JuliaLang#57004) (#204)

Fixes JuliaLang#56889.

Before this PR, an exception thrown while constructing the objects to
log (the `msg`) would be caught and logged. However, an exception thrown
while _printing_ the msg to an IO would _not_ be caught, and can abort
the program. This breaks the promise that enabling verbose debug logging
shouldn't introduce new crashes.

After this PR, an exception thrown during `handle_message` is caught and
logged, just like an exception during `msg` construction:

```julia
julia> struct Foo end

julia> Base.show(::IO, ::Foo) = error("oh no")

julia> begin
           # Unexpectedly, the exception thrown while printing `Foo()` escapes
           @info Foo()
           # So we never reach this line! :'(
           println("~~~~~ ALL DONE ~~~~~~~~")
       end
┌ Error: Exception while generating log record in module Main at REPL[10]:3
│   exception =
│    oh no
│    Stacktrace:
│      [1] error(s::String)
│        @ Base ./error.jl:44
│      [2] show(::IOBuffer, ::Foo)
│        @ Main ./REPL[9]:1
...
│     [30] repl_main
│        @ ./client.jl:593 [inlined]
│     [31] _start()
│        @ Base ./client.jl:568
└ @ Main REPL[10]:3
~~~~~ ALL DONE ~~~~~~~~
```

This PR respects the change made in
JuliaLang#36600 to keep the codegen as
small as possible, by putting the new try/catch into a no-inlined
function, so that we don't have to introduce a new try/catch in the
macro-generated code body.
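
The pattern can be sketched as below. This is a simplified stand-in, not the code in the logging internals: `handle_message_nothrow` and `Bar` are hypothetical names, and the "logger" is just an `IOBuffer`.

```julia
# Keep the try/catch in one non-inlined helper so call sites stay small.
@noinline function handle_message_nothrow(handler, io::IO, msg)
    try
        handler(io, msg)
    catch err
        # Report the failure instead of letting it escape to the caller.
        println(io, "Exception while handling log message: ", sprint(showerror, err))
    end
    return nothing
end

struct Bar end
Base.show(::IO, ::Bar) = error("oh no")   # printing a `Bar` always throws

buf = IOBuffer()
handle_message_nothrow((io, m) -> print(io, m), buf, Bar())
out = String(take!(buf))
println(out)
println("still running")
```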

---------

Co-authored-by: Jameson Nash <[email protected]>
---------

Co-authored-by: Jameson Nash <[email protected]>
Co-authored-by: Nick Robinson <[email protected]>
…7045) (#208)

This is still a work in progress, but it should help determine what a
straggler thread was doing during the stop-the-world phase and why it
failed to reach a safepoint in a timely manner.

We've encountered long TTSP issues in production, and this tool should
provide a valuable means to accurately diagnose them.
…215)

Minor tweak to the error message: embed the exit code of the Julia child
process that failed to compile the package.
@kuszmaul kuszmaul requested a review from a team as a code owner February 21, 2025 04:37
@kuszmaul kuszmaul closed this Feb 21, 2025
Development

Successfully merging this pull request may close these issues.

Float32/16 raised to an integer's typemin throws a DomainError