@mtlprintf #418

tgymnich · 2024-09-16T11:55:12Z

Implement @mtlprintf and friends using os_log

TODO:

Printing floats does not work: the vararg calling convention converts them to doubles, which our IR checker then catches
Version check the @mtlprintf macro
Add tests
Capturing and logging are mutually exclusive
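The float item in the TODO list comes from C's default argument promotions: a `float` passed through a variadic parameter is widened to `double`. A minimal CPU-side sketch of that promotion, using Julia's `@ccall` against libc's `printf` (this is only an illustration on the host, not the device code path in this PR):

```julia
# Variadic arguments come after the `;` in @ccall and need an explicit C type.
# A C `float` cannot travel through `...` unchanged; it is passed as `double`.
x = 1.5f0               # a Float32 value
promoted = Float64(x)   # the value a variadic callee would actually receive

# Call libc's printf with the Float32 converted to Cdouble at the call site.
n = @ccall printf("%.1f\n"::Cstring; x::Cdouble)::Cint
```

On a GPU target without double support, that widened argument is presumably what the IR checker flags.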

depends on: JuliaGPU/GPUCompiler.jl#630
notify: #226

tgymnich · 2024-09-16T14:00:44Z

@maleadt Any idea how we can implement the version check for the @mtlprintf macro? I know we could check the air version inside the kernel but I'd like to avoid that.

Also can we get rid of the double check in check_ir?

christiangnrd · 2024-09-16T18:02:28Z

Would it be worth benchmarking the performance difference between having logging active vs not?

tgymnich · 2024-09-17T10:37:13Z

> Would it be worth benchmarking the performance difference between having logging active vs not?

@christiangnrd Sure. I don't expect there to be much overhead besides allocation of the log buffer and checking it for logs after running a kernel. But we might want to look into only conditionally adding MTLLogState since logging also prevents GPU frame capture.

maleadt · 2024-09-17T14:24:41Z

> @maleadt Any idea how we can implement the version check for the @mtlprintf macro? I know we could check the air version inside the kernel but I'd like to avoid that.

Given that the macro expands way too early, I don't think there's anything we can do but check in the kernel. Why are you opposed to that? GPUCompiler.jl has infrastructure to optimize those checks away; see e.g. how CUDA.jl exposes the device capability and PTX ISA version to the kernel.

tgymnich · 2024-09-17T23:14:10Z

We could also wrap the macro and accompanying functions in if Metal.macos_version() >= v"15".

christiangnrd · 2024-09-18T00:11:56Z

If we do that, we should have definitions in both cases and give an informative error if Metal.macos_version() < v"15".
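A minimal sketch of the "definitions in both cases" idea, runnable without a GPU. The names `demo_macos_version` and `@demo_printf` are hypothetical stand-ins for `Metal.macos_version()` and `@mtlprintf`; the version is hard-coded so the fallback branch is the one that gets defined:

```julia
# Hypothetical stand-in for Metal.macos_version(); hard-coded for illustration.
demo_macos_version() = v"14.6"

@static if demo_macos_version() >= v"15"
    # Supported platform: the macro expands to a real print call.
    macro demo_printf(fmt)
        :(print($(esc(fmt))))
    end
else
    # Unsupported platform: the macro still exists, but its expansion
    # raises an informative error instead of failing with UndefVarError.
    macro demo_printf(fmt)
        :(error("@demo_printf requires macOS 15 or higher"))
    end
end

demo_kernel() = @demo_printf("Hello\n")
```

Calling `demo_kernel()` here throws the informative error. The limitation of this approach is that the decision is taken from the host OS version when the macro is defined, not from the Metal version a kernel is compiled for.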

maleadt · 2024-09-18T10:29:54Z

Actually, looks like I provided the run-time queries already:

Metal.jl/src/device/intrinsics/version.jl, lines 64 to 65 in 6c82916:

    @device_function @inline metal_version() = SimpleVersion(metal_major(), metal_minor())
    @device_function @inline air_version() = SimpleVersion(air_major(), air_minor())

So we can just use that in the generated code, generating an error when emitting code for an older platform. That of course depends on #416 for meaningful reporting, but we'll get there.

I'd rather not simply check based on the macOS version during macro expansion, since we might want to target older Metal versions than the system supports.
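To illustrate checking inside the generated code rather than at macro expansion, here is a host-only sketch. Everything in it is hypothetical: `DemoVersion` stands in for `SimpleVersion`, `demo_metal_version()` for the device-side `metal_version()` query, and `DEMO_TARGET` for whatever Metal version the kernel is being compiled for.

```julia
# Hypothetical stand-in for the device-side SimpleVersion type.
struct DemoVersion
    major::Int
    minor::Int
end
Base.isless(a::DemoVersion, b::DemoVersion) = (a.major, a.minor) < (b.major, b.minor)

# Hypothetical compilation target, adjustable independently of the host OS.
const DEMO_TARGET = Ref(DemoVersion(3, 2))
demo_metal_version() = DEMO_TARGET[]

# The macro always expands; the version check lives in the emitted code.
macro demo_println(msg)
    quote
        if demo_metal_version() < DemoVersion(3, 2)
            error("@demo_println requires Metal 3.2 (macOS 15) or higher")
        else
            println($(esc(msg)))
        end
    end
end

demo_hello() = @demo_println("Hello from the sketch")
```

With the target set to `DemoVersion(3, 1)`, `demo_hello()` errors even though the macro expanded on a newer host; with `DemoVersion(3, 2)` it prints. In a real kernel the version query would be a compile-time constant, so the unused branch could be optimized away.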

src/device/intrinsics/output.jl

christiangnrd

Looks good! However do you know what's causing the tests to hang?

lib/mtl/command_queue.jl

docs/src/usage/kernel.md

tgymnich · 2024-09-19T09:55:15Z

> Looks good! However do you know what's causing the tests to hang?

@christiangnrd The hangs are caused by this one line:

@print_and_throw "@mtlprintf requires Metal 3.2 (macOS 15) or higher"

test/output.jl

christiangnrd · 2024-09-21T17:32:18Z

@maleadt Could we have one of the Apple Silicon runners upgraded to Sequoia so the output tests don't get ignored? Edit: All the runners are running 13.3.1. Should we also have one on macOS 14?

I would also like to see #420 merged first (with benchmarks run on macOS 15) to see how big the impact of enabling logging is.

tgymnich · 2024-09-21T19:38:09Z

@christiangnrd I recently made changes so that logging (e.g. MTLLogState and friends) is only enabled whenever we actually use the feature.

christiangnrd · 2024-09-21T21:22:44Z

Just pushed a whitespace-only formatting commit

christiangnrd · 2024-09-21T21:25:51Z

> @christiangnrd I recently made changes so that logging (e.g. MTLLogState and friends) is only enabled whenever we actually use the feature.

In that case I still think we should be able to test on macOS 15, but I think we should merge this as soon as it's ready.

maleadt · 2024-09-23T14:43:24Z

I've upgraded one of the workers to macOS 15:

See the macos_version tag which can be used to select on this.

maleadt · 2024-09-24T05:51:04Z

> @christiangnrd The hangs are caused by this one line:
> @print_and_throw "@mtlprintf requires Metal 3.2 (macOS 15) or higher"

How did this get fixed?

maleadt · 2024-09-24T09:10:24Z

I've also updated one of the juliaecosystem workers to 15.0, so we can revert to that queue.

christiangnrd · 2024-09-24T11:37:30Z

> How did this get fixed?

I assume by no longer running when macos_version() < 15. I think I got ahead of myself with the review.

.buildkite/pipeline.yml

maleadt · 2024-09-24T11:52:15Z

> I assume by no longer running when macos_version() < 15.

Right; but that's not great. It means that any kernel using logging output will first generate a non-fatal error message on the host, and then hang in the kernel? Or would it also hang when, on macOS 15, we use the (hypothetical, but useful) @metal metal=v"3.1"?

EDIT: suggested capability implemented here: #430

christiangnrd · 2024-09-24T14:11:43Z

The following code hangs in the REPL, but not when run using include or when called from the terminal:

using Metal
function f()
    @mtlprintln("Testing...")
    return
end
@metal f()

maleadt · 2024-09-24T14:15:32Z

> The following code hangs in the REPL, but not when run using include or when called from the terminal

Isn't that because in the REPL we force synchronization via an AST transform hook? What happens if you synchronize manually?

christiangnrd · 2024-09-24T14:31:21Z

@maleadt When I add Metal.synchronize() to the end of the file it hangs in all situations.

github-actions

Metal Benchmarks

Benchmark suite	Current: `0840aa4`	Previous: `8652754`	Ratio
`latency/precompile`	`4599693584` ns	`4401680834` ns	`1.04`
`latency/ttfp`	`6702643541.5` ns	`6678542687` ns	`1.00`
`latency/import`	`722647167` ns	`721498042` ns	`1.00`
`integration/metaldevrt`	`715958` ns	`708167` ns	`1.01`
`integration/byval/slices=1`	`1498958.5` ns	`1530625` ns	`0.98`
`integration/byval/slices=3`	`11746791` ns	`11010542` ns	`1.07`
`integration/byval/reference`	`1489417` ns	`1585084` ns	`0.94`
`integration/byval/slices=2`	`2602291.5` ns	`2472708` ns	`1.05`
`kernel/indexing`	`464895.5` ns	`454333` ns	`1.02`
`kernel/indexing_checked`	`466812.5` ns	`455667` ns	`1.02`
`kernel/launch`	`8417` ns	`8459` ns	`1.00`
`array/construct`	`27659.666666666668` ns	`27638.916666666664` ns	`1.00`
`array/broadcast`	`460729.5` ns	`464625` ns	`0.99`
`array/random/randn/Float32`	`804708` ns	`813083` ns	`0.99`
`array/random/randn!/Float32`	`610958` ns	`634041` ns	`0.96`
`array/random/rand!/Int64`	`552250` ns	`552750` ns	`1.00`
`array/random/rand!/Float32`	`581958.5` ns	`577083` ns	`1.01`
`array/random/rand/Int64`	`795125` ns	`800833.5` ns	`0.99`
`array/random/rand/Float32`	`599209` ns	`583709` ns	`1.03`
`array/copyto!/gpu_to_gpu`	`639042` ns	`643166.5` ns	`0.99`
`array/copyto!/cpu_to_gpu`	`585875.5` ns	`600020.5` ns	`0.98`
`array/copyto!/gpu_to_cpu`	`736041.5` ns	`777166.5` ns	`0.95`
`array/accumulate/1d`	`1332458` ns	`1334916` ns	`1.00`
`array/accumulate/2d`	`1420438` ns	`1419167` ns	`1.00`
`array/iteration/findall/int`	`2084291.5` ns	`2072542` ns	`1.01`
`array/iteration/findall/bool`	`1812750` ns	`1854833` ns	`0.98`
`array/iteration/findfirst/int`	`1687750` ns	`1674333` ns	`1.01`
`array/iteration/findfirst/bool`	`1644416.5` ns	`1643833` ns	`1.00`
`array/iteration/scalar`	`3675458.5` ns	`3625334` ns	`1.01`
`array/iteration/logical`	`3255666` ns	`3281021` ns	`0.99`
`array/iteration/findmin/1d`	`1615416` ns	`1572104` ns	`1.03`
`array/iteration/findmin/2d`	`1319125` ns	`1325292` ns	`1.00`
`array/reductions/reduce/1d`	`1048770.5` ns	`1055583` ns	`0.99`
`array/reductions/reduce/2d`	`691041.5` ns	`690959` ns	`1.00`
`array/reductions/mapreduce/1d`	`1052625` ns	`1057604.5` ns	`1.00`
`array/reductions/mapreduce/2d`	`694708` ns	`700416.5` ns	`0.99`
`array/permutedims/4d`	`836583` ns	`846917` ns	`0.99`
`array/permutedims/2d`	`846937.5` ns	`856979.5` ns	`0.99`
`array/permutedims/3d`	`922750` ns	`916917` ns	`1.01`
`array/copy`	`610166` ns	`610041` ns	`1.00`
`metal/synchronization/stream`	`14208` ns	`14667` ns	`0.97`
`metal/synchronization/context`	`14500` ns	`14916` ns	`0.97`

This comment was automatically generated by workflow using github-action-benchmark.
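The Ratio column is consistent with Current divided by Previous, rounded to two decimals, so values above 1 mean the current commit was slower. A small sketch reproducing a few rows from the table above:

```julia
# (current, previous) timings in nanoseconds, copied from rows of the table.
rows = [
    "latency/precompile"          => (4599693584, 4401680834),
    "integration/byval/slices=3"  => (11746791, 11010542),
    "integration/byval/reference" => (1489417, 1585084),
]

# Ratio as it appears to be reported: current / previous, two decimals.
ratio(current, previous) = round(current / previous; digits = 2)

for (name, (current, previous)) in rows
    println(rpad(name, 30), ratio(current, previous))
end
```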

tgymnich · 2024-09-27T13:11:08Z

I opened an issue for the hang: #433

.buildkite/pipeline.yml

maleadt · 2024-10-01T20:20:15Z

On the assumption that the conditional @print_and_throw generating just a trap on macOS 14 is what caused the hangs here, I simplified the logic to make the kernel launch code simply error when using unsupported logging. However, that does not fix the issue. In fact, even on my now-upgraded macOS 15 installation, a simple kernel doing I/O hangs...

julia> using Metal

julia> function kernel()
           @mtlprint("Hello, World\n")
           return
       end
kernel (generic function with 1 method)

julia> Metal.@sync @metal kernel()
Hello, World

# hang

The Metal.@sync is there just to illustrate what the AST transform hook is doing behind the scenes. So I guess we'll have to fix #433 first, with the above being another data point that the generated IR is not necessarily the issue (which #433 (comment) already hinted at).

tgymnich self-assigned this Sep 16, 2024

tgymnich marked this pull request as ready for review September 16, 2024 12:12

tgymnich mentioned this pull request Sep 17, 2024

Add exceptions to check_ir_values JuliaGPU/GPUCompiler.jl#630

Merged

christiangnrd mentioned this pull request Sep 17, 2024

Add benchmarking CI #419

Closed

2 tasks

tgymnich force-pushed the os-log branch from e9b543a to 72b20b1 Compare September 17, 2024 22:47

tgymnich force-pushed the os-log branch 3 times, most recently from 4ee3467 to b43bcb1 Compare September 18, 2024 13:01

christiangnrd mentioned this pull request Sep 18, 2024

Add Benchmarking CI #420

Merged

2 tasks

tgymnich force-pushed the os-log branch from f9bccdc to c6d53e4 Compare September 18, 2024 21:56

tgymnich requested review from christiangnrd and maleadt September 18, 2024 21:56

christiangnrd reviewed Sep 18, 2024

src/device/intrinsics/output.jl (outdated)

christiangnrd reviewed Sep 19, 2024

lib/mtl/command_queue.jl

docs/src/usage/kernel.md

tgymnich force-pushed the os-log branch from 6d9b3a4 to af92a28 Compare September 21, 2024 16:01

christiangnrd approved these changes Sep 21, 2024

christiangnrd reviewed Sep 21, 2024

test/output.jl (outdated)

christiangnrd reviewed Sep 24, 2024

.buildkite/pipeline.yml (outdated)

tgymnich force-pushed the os-log branch from ec5d38d to 0840aa4 Compare September 26, 2024 15:27

github-actions bot reviewed Sep 26, 2024

christiangnrd reviewed Sep 27, 2024

.buildkite/pipeline.yml (outdated)

maleadt force-pushed the os-log branch 3 times, most recently from f70ccac to 95e47f1 Compare October 1, 2024 20:15

maleadt added the enhancement and kernels labels Oct 1, 2024

christiangnrd mentioned this pull request Oct 12, 2024

Metal 3.1 and 3.2 #373

Open

tgymnich removed the enhancement label Oct 18, 2024

tgymnich and others added 2 commits October 27, 2024 11:38

Implement @mtlprintf using os_log

08a9e00

Simplify logic to work around hang.

d4cd19f

tgymnich force-pushed the os-log branch from 95e47f1 to d4cd19f Compare October 27, 2024 10:39

Merge branch 'main' into os-log

a392e5d
