51 support qnn context binaries #52

Merged · 28 commits · Jul 27, 2024

Commits
2eccb57
Add qnn::ModelImpl method to save context binary
ciaranbor Jul 19, 2024
e746150
Save context binary if loaded a shared library model
ciaranbor Jul 19, 2024
3638281
Pull resultant context binary from adb script
ciaranbor Jul 19, 2024
e75ac0d
Add qnn::Backend method to get system interface
ciaranbor Jul 19, 2024
bd204f2
Backend needs to load system library for context binary models
ciaranbor Jul 20, 2024
5c199fa
Add Backend method to get device handle
ciaranbor Jul 20, 2024
54bc346
Move QnnTensor operations to dedicated file, add more operations
ciaranbor Jul 20, 2024
7f95896
Add qnn::ModelImpl method to load model from context binary
ciaranbor Jul 20, 2024
23f6dcb
Modify qnn::ModelImpl constructor to load .bin files as context binaries
ciaranbor Jul 20, 2024
cfc10a1
Add .bin models to createModel functions
ciaranbor Jul 20, 2024
e01d77d
Move graph types to dedicated header
ciaranbor Jul 20, 2024
090463c
Move further QnnTensor operations to dedicated file
ciaranbor Jul 20, 2024
e2038c8
Move graph functions to graph helpers
ciaranbor Jul 20, 2024
0932630
Encapsulate all graph logic in GraphInfoHelper class
ciaranbor Jul 21, 2024
f49a786
Manage graphs memory with containers
ciaranbor Jul 21, 2024
b5c29e7
Move complex graph methods to source file
ciaranbor Jul 21, 2024
ca5387a
Manage graph memory cleanup
ciaranbor Jul 21, 2024
82b628f
Move more graph logic to graph.cpp
ciaranbor Jul 25, 2024
6c02403
Bump version
ciaranbor Jul 25, 2024
c9b1456
Generalise run_with_adb beyond examples
ciaranbor Jul 26, 2024
124a220
Move run_with_adb to scripts directory
ciaranbor Jul 26, 2024
cad7518
Support running tests on android devices using adb
ciaranbor Jul 26, 2024
556839b
Add tflite NPU test
ciaranbor Jul 26, 2024
b6eeb20
Add QNN shared library test
ciaranbor Jul 26, 2024
6c682f9
Fix setting delegate in QNN backend
ciaranbor Jul 26, 2024
02efdc0
Add QNN context binary test
ciaranbor Jul 26, 2024
04751dd
Move QNN tensorOps function implementations to source file
ciaranbor Jul 27, 2024
1d43380
Move GPU and NPU configuration instructions to HACKING.md
ciaranbor Jul 27, 2024
3 changes: 2 additions & 1 deletion CMakeLists.txt
@@ -81,7 +81,8 @@ if(edgerunner_ENABLE_NPU)
    target_sources(
        edgerunner_edgerunner
        PRIVATE source/qnn/model.cpp source/qnn/tensor.cpp
-               source/qnn/backend.cpp
+               source/qnn/backend.cpp source/qnn/graph.cpp
+               source/qnn/tensorOps.cpp
    )

find_package(qnn REQUIRED)
46 changes: 46 additions & 0 deletions HACKING.md
@@ -113,6 +113,43 @@ cause issues. See the link above for profiles documentation.
[conan]: https://conan.io/
[profile]: https://docs.conan.io/2/reference/config_files/profiles.html

#### Android

An example Android profile is bundled with this repository. It can be installed
into your local Conan home using:

```sh
conan config install profiles -tf profiles
```

Use it by adding `-pr android` to your `conan install` invocation.

#### GPU

For GPU support, add `-o gpu=True` to the `conan install` invocation.

> [!NOTE]
> The tensorflow-lite conan package disables GPU by default, so these steps
> will not currently work. I have patched the recipe locally to enable GPU
> support and will make this available on Conan Center or another repository
> soon. In the meantime, my custom recipe can be used as outlined
> [here](https://github.com/neuralize-ai/tensorflow-lite-conan). If you have
> previously `conan install`ed, remove the existing TFLite package(s) using
> `conan remove "tensorflow-lite"`. Make sure to create the TFLite package
> version required in [conanfile](/conanfile.py).

GPU support requires a functioning OpenCL installation. Refer to your OS
documentation for the steps to set this up correctly for your GPU vendor.

#### NPU

There is support for executing on Qualcomm NPUs (more hardware support is
upcoming). Since this involves Qualcomm's pre-compiled shared libraries, you
must use the Conan recipe I have created, available
[here](https://github.com/neuralize-ai/qnn-conan). Follow the instructions on
that repository, then follow the steps above with `-o with_npu=True` supplied
to the `conan install` invocation. Make sure to create the package version
required in [conanfile](/conanfile.py).

### Configure, build and test

If you followed the above instructions, then you can configure, build and test
@@ -134,6 +171,15 @@ the number of jobs to use, which should ideally be set to the number of
threads your CPU has. You may also want to add that to your preset using the
`jobs` property, see the [presets documentation][1] for more details.

For Android, the above `ctest` approach does not work. Instead, provided that `conan install` was invoked with an appropriate Android profile and Android-compatible presets are used, an additional `test-android` target is available and can be executed with:

```sh
cmake --build --preset=<preset> -t test-android
```

Ensure [adb](https://developer.android.com/tools/adb) is configured and a device
with USB debugging enabled is connected.

### Developer mode targets

These are targets you may invoke using the build command from above, with an
6 changes: 2 additions & 4 deletions example/CMakeLists.txt
@@ -28,8 +28,8 @@ function(add_example NAME)
if(ANDROID)
add_custom_target(
"run_${NAME}"
COMMAND "${CMAKE_SOURCE_DIR}/example/run_with_adb.sh" -b
"${CMAKE_BINARY_DIR}" -e "${NAME}"
COMMAND "${CMAKE_SOURCE_DIR}/scripts/run_with_adb.sh" -b
"${CMAKE_CURRENT_BINARY_DIR}" -e "${NAME}"
VERBATIM
)
else()
@@ -45,8 +45,6 @@ endfunction()

# NOTE: for Android, adb push fails on symlinks, push directly manually instead
if(ANDROID)
-   # file(COPY "${CMAKE_BINARY_DIR}/../runtimeLibs/" DESTINATION
-   # ${CMAKE_CURRENT_BINARY_DIR} )
foreach(dir ${CONAN_RUNTIME_LIB_DIRS})
file(GLOB_RECURSE shared_libs "${dir}/*.so")
file(COPY ${shared_libs} DESTINATION ${CMAKE_CURRENT_BINARY_DIR})
22 changes: 1 addition & 21 deletions example/README.md
@@ -39,19 +39,7 @@ For MacOS, replace "Unix Makefiles" with "Xcode".
> Examples require additional dependencies beyond the main library. As such, it
> is required to supply `-o examples=True` to the `conan install` command.

- In the examples below, for GPU support add `-o gpu=True` to the `conan install` command.
- > [!NOTE]
- > The tensorflow-lite conan package disables GPU by default and as such these
- steps will not work currently. I have patched the recipe locally to enable GPU
- support and will make this available on Conan Center or another repository
- soon. In the mean time, my custom recipe can be be used as outlined
- [here](https://github.com/neuralize-ai/tensorflow-lite-conan). If you have
- previously `conan install`ed, remove the existing TFLite package(s) using
- `conan remove "tensorflow-lite"`. Make sure to create the TFLite package
- version that is required in [conanfile](/conanfile.py).
-
- GPU support requires a functioning OpenCL installation. Refer to your OS
- documentation for the steps for setting this up correctly for your GPU vendor.
+ Refer to [HACKING](/HACKING.md) for further configuration options.

## Unix

@@ -118,11 +106,3 @@ cmake --build --preset=rel -t run_<example_name>
```

where `example_name` is the example filename without the extension (eg. `mobilenet_v3_small`).

- There is support for executing on Qualcomm NPUs (more hardware support is
- upcoming). Since this involves using Qualcomm's pre-compiled shared libraries,
- I have created a Conan recipe that must be used
- [here](https://github.com/neuralize-ai/qnn-conan). Follow the instructions on
- that repository and the steps above with `-o with_npu=True` supplied to the
- `conan install` invocation. Make sure to create the package version required
- in [conanfile](/conanfile.py).
19 changes: 18 additions & 1 deletion include/edgerunner/qnn/backend.hpp
@@ -35,8 +35,10 @@ class Backend {
/**
* @brief Constructor for the Backend class.
* @param delegate The delegate type for the backend (CPU, GPU, NPU).
+    * @param isContextBinary Whether the model will be loaded from a context
+    * binary.
     */
-    explicit Backend(DELEGATE delegate);
+    explicit Backend(DELEGATE delegate, bool isContextBinary);

Backend(const Backend&) = default;
Backend(Backend&&) = delete;
@@ -53,6 +55,15 @@
* @return Reference to the backend handle.
*/
auto getHandle() -> auto& { return m_backendHandle; }
+    /**
+     * @brief Returns a reference to the device handle.
+     *
+     * This function returns a reference to the device handle, allowing access
+     * to the underlying device handle object.
+     *
+     * @return Reference to the device handle.
+     */
+    auto getDeviceHandle() -> auto& { return m_deviceHandle; }

/**
* @brief Get the context for the backend.
@@ -66,6 +77,12 @@
*/
auto getInterface() -> auto& { return m_qnnInterface; }

+    /**
+     * @brief Get the QNN system interface.
+     * @return Reference to the QNN system interface.
+     */
+    auto getSystemInterface() -> auto& { return m_qnnSystemInterface; }

/**
* @brief Get the delegate type for the backend.
* @return The delegate type.
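The added `isContextBinary` flag and the new accessors together define how a backend is brought up for context-binary models. Below is a minimal sketch of the intended call pattern; this is hypothetical driver code, not part of the PR, and it assumes the `DELEGATE` enum (with an `NPU` member) lives in namespace `edge` in `edgerunner/model.hpp`:

```cpp
#include "edgerunner/model.hpp"  // assumed to provide edge::DELEGATE
#include "edgerunner/qnn/backend.hpp"

// Hypothetical bring-up for a model shipped as a QNN context binary.
// Passing isContextBinary = true tells the Backend to also load the
// QNN system library, which is required to deserialize context binaries.
inline void setUpContextBinaryBackend() {
    edge::qnn::Backend backend(edge::DELEGATE::NPU,
                               /* isContextBinary= */ true);

    // Accessors added in this PR: the context-binary load path needs
    // the device handle and the system interface in addition to the
    // regular QNN interface from getInterface().
    auto& deviceHandle = backend.getDeviceHandle();
    auto& systemInterface = backend.getSystemInterface();

    (void)deviceHandle;     // consumed by context creation in the loader
    (void)systemInterface;  // consumed when parsing context-binary metadata
}
```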
151 changes: 151 additions & 0 deletions include/edgerunner/qnn/graph.hpp
@@ -0,0 +1,151 @@
#pragma once

#include <cstdint>     // uint32_t
#include <cstring>
#include <filesystem>  // std::filesystem::path
#include <memory>      // std::unique_ptr
#include <vector>      // std::vector

#include <QnnCommon.h>
#include <QnnGraph.h>
#include <QnnInterface.h>
#include <QnnTypes.h>
#include <System/QnnSystemContext.h>
#include <dlfcn.h>
#include <nonstd/span.hpp>

#include "edgerunner/model.hpp"

namespace edge::qnn {

using GraphErrorT = enum GraphError {
GRAPH_NO_ERROR = 0,
GRAPH_TENSOR_ERROR = 1,
GRAPH_PARAMS_ERROR = 2,
GRAPH_NODES_ERROR = 3,
GRAPH_GRAPH_ERROR = 4,
GRAPH_CONTEXT_ERROR = 5,
GRAPH_GENERATION_ERROR = 6,
GRAPH_SETUP_ERROR = 7,
GRAPH_INVALID_ARGUMENT_ERROR = 8,
GRAPH_FILE_ERROR = 9,
GRAPH_MEMORY_ALLOCATE_ERROR = 10,
// Value selected to ensure 32 bits.
GRAPH_UNKNOWN_ERROR = 0x7FFFFFFF
};

using GraphInfoT = struct GraphInfo {
Qnn_GraphHandle_t graph;
char* graphName;
Qnn_Tensor_t* inputTensors;
uint32_t numInputTensors;
Qnn_Tensor_t* outputTensors;
uint32_t numOutputTensors;
};

using GraphConfigInfoT = struct GraphConfigInfo {
char* graphName;
const QnnGraph_Config_t** graphConfigs;
};

using ComposeGraphsFnHandleTypeT =
GraphErrorT (*)(Qnn_BackendHandle_t,
QnnInterface_ImplementationV2_16_t,
Qnn_ContextHandle_t,
const GraphConfigInfoT**,
const uint32_t,
GraphInfoT***,
uint32_t*,
bool,
QnnLog_Callback_t,
QnnLog_Level_t);

using FreeGraphInfoFnHandleTypeT = GraphErrorT (*)(GraphInfoT***, uint32_t);

class GraphsInfo {
public:
GraphsInfo() = default;

GraphsInfo(const GraphsInfo&) = delete;
GraphsInfo(GraphsInfo&&) = delete;
auto operator=(const GraphsInfo&) -> GraphsInfo& = delete;
auto operator=(GraphsInfo&&) -> GraphsInfo& = delete;

~GraphsInfo();

auto getPtr() -> GraphInfoT*** { return &m_graphsInfo; }

auto accessGraphs() -> auto& { return m_graphsInfo; }

auto setGraph() {
m_graphInfo = std::unique_ptr<GraphInfoT>(m_graphsInfo[0] /* NOLINT */);
}

auto getGraphsCountPtr() -> uint32_t* { return &m_graphsCount; }

auto getGraphCount() const { return m_graphsCount; }

auto accessGraphCount() -> auto& { return m_graphsCount; }

auto getGraph() -> auto& { return m_graphInfo->graph; }

auto accessGraph() -> auto& { return m_graphInfo; }

auto getInputs() -> nonstd::span<Qnn_Tensor_t> {
return {m_graphInfo->inputTensors, m_graphInfo->numInputTensors};
}

auto getOutputs() -> nonstd::span<Qnn_Tensor_t> {
return {m_graphInfo->outputTensors, m_graphInfo->numOutputTensors};
}

auto getNumInputs() const { return m_graphInfo->numInputTensors; }

auto getNumOutputs() const { return m_graphInfo->numOutputTensors; }

auto operator[](const size_t index) -> auto& {
return (*m_graphsInfo)[index] /* NOLINT */;
}

auto loadFromSharedLibrary(const std::filesystem::path& modelPath)
-> STATUS;

auto setComposeGraphsFnHandle(
ComposeGraphsFnHandleTypeT composeGraphsFnHandle) -> STATUS;

auto setFreeGraphInfoFnHandle(
FreeGraphInfoFnHandleTypeT freeGraphInfoFnHandle) -> STATUS;

auto composeGraphs(Qnn_BackendHandle_t& qnnBackendHandle,
QnnInterface_ImplementationV2_16_t& qnnInterface,
Qnn_ContextHandle_t& qnnContext) -> STATUS;

auto retrieveGraphFromContext(
QnnInterface_ImplementationV2_16_t& qnnInterface,
Qnn_ContextHandle_t& qnnContext) -> STATUS;

auto copyGraphsInfoV1(const QnnSystemContext_GraphInfoV1_t* graphInfoSrc,
GraphInfoT* graphInfoDst) -> bool;

auto copyGraphsInfo(const QnnSystemContext_GraphInfo_t* graphsInput,
uint32_t numGraphs) -> bool;

auto copyMetadataToGraphsInfo(
const QnnSystemContext_BinaryInfo_t* binaryInfo) -> bool;

private:
std::vector<GraphInfoT> m_graphs;
std::vector<GraphInfoT*> m_graphPtrs;

GraphInfoT** m_graphsInfo {};
uint32_t m_graphsCount {};

std::unique_ptr<GraphInfoT> m_graphInfo;

ComposeGraphsFnHandleTypeT m_composeGraphsFnHandle {};
FreeGraphInfoFnHandleTypeT m_freeGraphInfoFnHandle {};

void* m_libModelHandle {};

std::vector<Qnn_Tensor_t> m_inputTensors;
std::vector<Qnn_Tensor_t> m_outputTensors;
};

} // namespace edge::qnn
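To tie the pieces together, here is a rough sketch of how a caller such as `qnn::ModelImpl` is expected to drive `GraphsInfo` for the two load paths. This is assumed driver code, not part of the PR, and `edge::STATUS::SUCCESS`/`edge::STATUS::FAIL` are assumed members of the `STATUS` enum from `edgerunner/model.hpp`:

```cpp
#include <filesystem>

#include "edgerunner/qnn/graph.hpp"

// Sketch of the two load paths. A .so model is dlopen'ed and its
// compose entry point builds the graphs; a .bin model's graph metadata
// is recovered from the already-deserialized QNN context instead.
inline auto loadGraphs(edge::qnn::GraphsInfo& graphs,
                       const std::filesystem::path& modelPath,
                       Qnn_BackendHandle_t& backendHandle,
                       QnnInterface_ImplementationV2_16_t& qnnInterface,
                       Qnn_ContextHandle_t& qnnContext,
                       bool isContextBinary) -> edge::STATUS {
    if (isContextBinary) {
        // Context-binary path: no generated model library is involved.
        return graphs.retrieveGraphFromContext(qnnInterface, qnnContext);
    }

    // Shared-library path: resolve the compose/free entry points first,
    // then let the library compose the graphs into the QNN context.
    if (graphs.loadFromSharedLibrary(modelPath) != edge::STATUS::SUCCESS) {
        return edge::STATUS::FAIL;
    }
    return graphs.composeGraphs(backendHandle, qnnInterface, qnnContext);
}
```

Either way, the caller then reads model I/O through `getInputs()`/`getOutputs()`, which wrap the raw `Qnn_Tensor_t` arrays in `nonstd::span`s, and graph memory is released when the `GraphsInfo` object is destroyed.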