@@ -434,18 +434,18 @@ to the backend(s) targeted at export. To support multiple devices, such as
XNNPACK acceleration for Android and Core ML for iOS, export a separate PTE file
for each backend.

- To delegate to a backend at export time, ExecuTorch provides the `to_backend()`
- function in the `EdgeProgramManager` object, which takes a backend-specific
- partitioner object. The partitioner is responsible for finding parts of the
- computation graph that can be accelerated by the target backend, and
- `to_backend()` function will delegate matched part to given backend for
- acceleration and optimization. Any portions of the computation graph not
- delegated will be executed by the ExecuTorch operator implementations.
+ To delegate a model to a specific backend during export, ExecuTorch uses the
+ `to_edge_transform_and_lower()` function. This function takes the exported program
+ from `torch.export` and a backend-specific partitioner object. The partitioner
+ identifies parts of the computation graph that can be optimized by the target
+ backend. Within `to_edge_transform_and_lower()`, the exported program is
+ converted to an edge dialect program. The partitioner then delegates compatible
+ graph sections to the backend for acceleration and optimization. Any graph parts
+ not delegated are executed by ExecuTorch's default operator implementations.
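
A compact sketch of that flow is below. The toy model and shapes here are placeholders for illustration only; the nanoGPT-specific export and lowering code follows later in this section.

```python
import torch
from torch.export import export

from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge_transform_and_lower

# Toy stand-in for the real model.
model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 8),)

exported_program = export(model, example_inputs)

# Convert to the edge dialect and delegate compatible subgraphs in one call.
edge_manager = to_edge_transform_and_lower(
    exported_program,
    partitioner=[XnnpackPartitioner()],
)

# Anything the partitioner did not claim runs on the portable operators.
et_program = edge_manager.to_executorch()
```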

To delegate the exported model to a specific backend, we first need to import its
partitioner, as well as the edge compile config, from the ExecuTorch codebase, then
- call `to_backend` with an instance of partitioner on the `EdgeProgramManager`
- object `to_edge` function created.
+ call `to_edge_transform_and_lower`.

Here's an example of how to delegate nanoGPT to XNNPACK (if you're deploying to an Android phone for instance):

@@ -457,7 +457,7 @@ from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPar

# Model to be delegated to a specific backend should use that backend's edge compile config
from executorch.backends.xnnpack.utils.configs import get_xnnpack_edge_compile_config
- from executorch.exir import EdgeCompileConfig, to_edge
+ from executorch.exir import EdgeCompileConfig, to_edge_transform_and_lower

import torch
from torch.export import export
@@ -495,17 +495,14 @@ with torch.nn.attention.sdpa_kernel([SDPBackend.MATH]), torch.no_grad():

# Convert the model into a runnable ExecuTorch program.
# To be further lowered to the XNNPACK backend, `traced_model` needs an XNNPACK-specific edge compile config
edge_config = get_xnnpack_edge_compile_config()
- edge_manager = to_edge(traced_model, compile_config=edge_config)
-
- # Delegate exported model to Xnnpack backend by invoking `to_backend` function with Xnnpack partitioner.
- edge_manager = edge_manager.to_backend(XnnpackPartitioner())
+ # Convert to an edge program, then delegate the exported model to the XNNPACK backend
+ # by invoking `to_edge_transform_and_lower` with the XNNPACK partitioner.
+ edge_manager = to_edge_transform_and_lower(traced_model, partitioner=[XnnpackPartitioner()], compile_config=edge_config)

et_program = edge_manager.to_executorch()

# Save the XNNPACK-delegated ExecuTorch program to a file.
with open("nanogpt.pte", "wb") as file:
    file.write(et_program.buffer)
-
-
```

Additionally, update CMakeLists.txt to build and link the XNNPACK backend to
@@ -651,8 +648,8 @@ DuplicateDynamicQuantChainPass()(m)
traced_model = export(m, example_inputs)
```

- Additionally, add or update the `to_backend()` call to use `XnnpackPartitioner`. This instructs ExecuTorch to
- optimize the model for CPU execution via the XNNPACK backend.
+ Additionally, add or update the `to_edge_transform_and_lower()` call to use `XnnpackPartitioner`. This
+ instructs ExecuTorch to optimize the model for CPU execution via the XNNPACK backend.

```python
from executorch.backends.xnnpack.partition.xnnpack_partitioner import (
@@ -661,8 +658,8 @@ from executorch.backends.xnnpack.partition.xnnpack_partitioner import (
```

```python
- edge_manager = to_edge(traced_model, compile_config=edge_config)
- edge_manager = edge_manager.to_backend(XnnpackPartitioner())  # Lower to XNNPACK.
+ edge_config = get_xnnpack_edge_compile_config()
+ # Convert to edge dialect and lower to XNNPACK.
+ edge_manager = to_edge_transform_and_lower(traced_model, partitioner=[XnnpackPartitioner()], compile_config=edge_config)
et_program = edge_manager.to_executorch()
```

@@ -682,20 +680,20 @@ target_link_libraries(

For more information, see [Quantization in ExecuTorch](../quantization-overview.md).

## Profiling and Debugging
- After lowering a model by calling `to_backend()`, you may want to see what got delegated and what didn’t. ExecuTorch
+ After lowering a model by calling `to_edge_transform_and_lower()`, you may want to see what got delegated and what didn’t. ExecuTorch
provides utility methods to give insight into the delegation. You can use this information to gain visibility into
the underlying computation and diagnose potential performance issues, and model authors can use it to
structure the model in a way that is compatible with the target backend.

### Visualizing the Delegation

- The `get_delegation_info()` method provides a summary of what happened to the model after the `to_backend()` call:
+ The `get_delegation_info()` method provides a summary of what happened to the model after the `to_edge_transform_and_lower()` call:

```python
``` python
695
693
from executorch.devtools.backend_debug import get_delegation_info
696
694
from tabulate import tabulate
697
695
698
- # ... After call to to_backend (), but before to_executorch()
696
+ # ... After call to to_edge_transform_and_lower (), but before to_executorch()
699
697
graph_module = edge_manager.exported_program().graph_module
700
698
delegation_info = get_delegation_info(graph_module)
701
699
print (delegation_info.get_summary())
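# A sketch of the per-operator breakdown that the `tabulate` import above is
# for; the DataFrame helper on the returned DelegationInfo object is an
# assumption based on the devtools API, not shown verbatim in this diff.
op_support = delegation_info.get_operator_delegation_dataframe()
print(tabulate(op_support, headers="keys", tablefmt="fancy_grid"))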
@@ -762,7 +760,7 @@ Through the ExecuTorch Developer Tools, users are able to profile model executio

An ETRecord is an artifact generated at the time of export that contains model graphs and source-level metadata linking the ExecuTorch program to the original PyTorch model. You can view all profiling events without an ETRecord, though with an ETRecord, you will also be able to link each event to the types of operators being executed, module hierarchy, and stack traces of the original PyTorch source code. For more information, see [the ETRecord docs](../etrecord.md).

- In your export script, after calling `to_edge()` and `to_executorch()`, call `generate_etrecord()` with the `EdgeProgramManager` from `to_edge()` and the `ExecuTorchProgramManager` from `to_executorch()`. Make sure to copy the `EdgeProgramManager`, as the call to `to_backend()` mutates the graph in-place.
+ In your export script, after calling `to_edge_transform_and_lower()` and `to_executorch()`, call `generate_etrecord()` with the `EdgeProgramManager` from `to_edge_transform_and_lower()` and the `ExecuTorchProgramManager` from `to_executorch()`. Make sure to copy the `EdgeProgramManager`, as the call to `to_edge_transform_and_lower()` mutates the graph in-place.

```
# export_nanogpt.py
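
# A minimal sketch of the ETRecord step. The "etrecord.bin" path and the
# argument order are assumptions for illustration, not taken from this diff.
import copy

from executorch.devtools import generate_etrecord

# Copy the EdgeProgramManager before further lowering mutates the graph in place.
edge_manager_copy = copy.deepcopy(edge_manager)
et_program = edge_manager.to_executorch()

# Link the serialized program back to the original PyTorch source.
generate_etrecord("etrecord.bin", edge_manager_copy, et_program)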