[Bug] Tuning Metal fails: Could not find any valid schedule for task #17279

Open
gmeeker opened this issue Aug 17, 2024 · 0 comments
Labels: needs-triage, type: bug

gmeeker commented Aug 17, 2024

Expected behavior

Tune retina-face-resnet50-fixed.onnx from this repo:

https://github.com/gmeeker/RetinaFace

This is a fixed-size-input version of this model:
https://github.com/discipleofhamilton/RetinaFace

Actual behavior

[Task 30/37] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (27/27) | 552.19 s
WARNING:root:Could not find any valid schedule for task Task(func_name=conv2d_nchw_winograd.cuda, args=(('TENSOR', (1, 256, 64, 64), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_winograd.cuda', ('TENSOR', (1, 256, 64, 64), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')). A file containing the errors has been written to /var/folders/bd/rc6mzcg1423fzylm2vd6qsd00000gn/T/tvm_tuning_errors_h172s9it.log.

In the log:

RPCError: Error caught from RPC call:
[21:48:30] [...]/src/runtime/metal/metal_module.mm:130: InternalError: Check failed: (state != nil) is false: cannot get state: for function default_function_kernelThread group memory requested is more than MAX allowed
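
To see how many trials fail for this reason rather than some other one, the error file can be tallied with a few lines of Python. This is only a rough sketch: it assumes each failure shows up as a single line containing "InternalError", and it matches on the exact message text quoted above. The function name `tally_failures` and the two bucket names are made up for illustration.

```python
from collections import Counter

# Message text copied from the log excerpt above.
OVERFLOW_MARKER = "Thread group memory requested is more than MAX allowed"

def tally_failures(log_text):
    """Count InternalError lines, split into threadgroup overflows and everything else."""
    counts = Counter()
    for line in log_text.splitlines():
        if "InternalError" not in line:
            continue  # skip traceback lines and RPC wrappers
        bucket = "threadgroup_memory" if OVERFLOW_MARKER in line else "other"
        counts[bucket] += 1
    return counts

# Sample input built from the two log lines quoted in this report.
sample = (
    "RPCError: Error caught from RPC call:\n"
    "[21:48:30] [...]/src/runtime/metal/metal_module.mm:130: InternalError: "
    "Check failed: (state != nil) is false: cannot get state: for function "
    "default_function_kernelThread group memory requested is more than MAX allowed\n"
)
counts = tally_failures(sample)
print(dict(counts))
```

In practice `log_text` would come from reading the tvm_tuning_errors_*.log file named in the warning.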

Also, this issue is very frequent on Intel Macs, to the point where Metal targets are slower than CPU.

TVM 0.17.0's Metal timer may have made this more prevalent, but I believe that's irrelevant and earlier versions were just not tuning properly.
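
For reference, the kind of check that would reject these configurations before they reach the device can be sketched as below. Everything here is an assumption for illustration: the 32 KiB limit is a placeholder (the real value is reported per device by Metal's `maxThreadgroupMemoryLength`), the buffer shapes are hypothetical and not taken from the winograd schedule, and `threadgroup_bytes` and `fits` are not TVM functions.

```python
from math import prod

# Bytes per element for the dtypes used in this sketch.
DTYPE_BYTES = {"float32": 4, "float16": 2}

# Placeholder limit; query the actual device rather than trusting this number.
ASSUMED_LIMIT_BYTES = 32 * 1024

def threadgroup_bytes(shared_buffers):
    """Sum the sizes of (shape, dtype) buffers placed in threadgroup memory."""
    return sum(prod(shape) * DTYPE_BYTES[dtype] for shape, dtype in shared_buffers)

def fits(shared_buffers, limit_bytes=ASSUMED_LIMIT_BYTES):
    """True if the kernel's threadgroup buffers stay within the limit."""
    return threadgroup_bytes(shared_buffers) <= limit_bytes

# Hypothetical tile configurations, two shared buffers each.
small = [((16, 8, 16), "float32"), ((16, 8, 16), "float32")]
large = [((16, 16, 32), "float32"), ((16, 16, 32), "float32")]

print(threadgroup_bytes(small), fits(small))
print(threadgroup_bytes(large), fits(large))
```

If most of the 27 sampled configurations for a task land in the second category, the tuner has nothing valid left to measure, which would match the 0.00 GFLOPS result above.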

Environment

macOS 14.6.1
M1 2020 Mac Mini
Intel Mac: 2019 MacBook Pro, AMD 5500M
TVM 0.17.0

Steps to reproduce

tvmc tune --target metal --output retina-face-resnet50-autotuner_records.json retina-face-resnet50-fixed.onnx

Triage

  • needs-triage
  • backend:metal
gmeeker added the needs-triage and type: bug labels Aug 17, 2024