diff --git a/AI-GATEWAY.md b/AI-GATEWAY.md
new file mode 100644
index 0000000..a6c90df
--- /dev/null
+++ b/AI-GATEWAY.md
@@ -0,0 +1,204 @@
+---
+# You need to install the [VS Code Reveal extension](https://marketplace.visualstudio.com/items?itemName=evilz.vscode-reveal) and then click on 'slides' at the bottom to view in presentation mode
+title: AI Gateway
+theme: black
+enableMenu: true
+parallaxBackgroundImage: ../images/back.png
+parallaxBackgroundSize: 1500px 1024px
+
+---
+
+AI Gateway {style="font-size:60px"}
+
+
+
+
+---
+
+AI Gateway objectives
+
+* Aims to accelerate experimentation with advanced AI use cases {style="font-size:20px"}
+* Ensures control and governance over the consumption of AI services {style="font-size:20px"}
+* Paves the way for a confident deployment of Intelligent Apps into production {style="font-size:20px"}
+
+---
+
+AI Gateway toolchain
+
+
+
+--------------
+
+* Powered by VS Code running locally or in the cloud with GitHub Codespaces {style="font-size:20px"}
+* Jupyter Notebooks structure the step-by-step instructions {style="font-size:20px"}
+* Python scripts define the variables and execute OpenAI API calls, directly or with SDKs (a minimal sketch follows below) {style="font-size:20px"}
+* Bicep declaratively defines the infrastructure as code needed for the labs {style="font-size:20px"}
+* Azure CLI handles authentication with Azure and issues commands to the control plane {style="font-size:20px"}
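+
+A minimal sketch of such a direct API call, with hypothetical endpoint, deployment name, and key (the labs read these values from the Bicep deployment outputs): {style="font-size:20px"}
+
+```python
+import requests
+
+# Hypothetical values for illustration only
+endpoint = "https://<your-openai-resource>.openai.azure.com"
+deployment = "gpt-35-turbo"
+api_version = "2024-02-01"
+
+url = f"{endpoint}/openai/deployments/{deployment}/chat/completions?api-version={api_version}"
+response = requests.post(
+    url,
+    headers={"api-key": "<azure-openai-key>"},
+    json={"messages": [{"role": "user", "content": "Hello!"}]},
+)
+print(response.json()["choices"][0]["message"]["content"])
+```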
+
+---
+
+Request forwarding
+
+Playground to try forwarding requests to either an Azure OpenAI endpoint or a mock server {style="font-size:20px"}
+
+
+
+--------------
+
+* APIM authenticates to Azure OpenAI with its managed identity (user- or system-assigned). {style="font-size:20px"}
+* APIM is authorized to consume the Azure OpenAI API through Role-Based Access Control (RBAC). {style="font-size:20px"}
+* Zero impact on consumers using the API directly, with SDKs, or with orchestrators like LangChain: they only need to point to the APIM endpoint instead of the Azure OpenAI endpoint (see the sketch after this list). {style="font-size:20px"}
+* Keyless approach: API consumers use APIM subscription keys, and the Azure OpenAI keys are never used {style="font-size:20px"}
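+
+A minimal consumer-side sketch, assuming a hypothetical APIM gateway URL and an API configured to read the subscription key from the api-key header, so the SDK works unchanged: {style="font-size:20px"}
+
+```python
+from openai import AzureOpenAI
+
+client = AzureOpenAI(
+    azure_endpoint="https://<your-apim>.azure-api.net",  # APIM gateway URL instead of the Azure OpenAI endpoint
+    api_key="<apim-subscription-key>",                   # APIM subscription key; the Azure OpenAI key is never used
+    api_version="2024-02-01",
+)
+completion = client.chat.completions.create(
+    model="gpt-35-turbo",  # deployment name
+    messages=[{"role": "user", "content": "Hello!"}],
+)
+print(completion.choices[0].message.content)
+```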
+
+---
+
+Backend circuit breaking
+
+Playground to try the built-in backend circuit breaker functionality of APIM with either an Azure OpenAI endpoint or a mock server {style="font-size:20px"}
+
+
+
+--------------
+
+* The Azure OpenAI endpoint is configured as an APIM backend, promoting reuse across APIs and improved governance. {style="font-size:20px"}
+* Circuit breaking rules define controlled availability for the OpenAI endpoint. {style="font-size:20px"}
+* When the circuit breaks, APIM stops sending requests to Azure OpenAI and rejects calls until the circuit closes again (client-side handling is sketched below). {style="font-size:20px"}
+* Handles status code 429 (Too Many Requests) and any other status code returned by the Azure OpenAI service. {style="font-size:20px"}
+* Doesn’t need any policy configuration. The rules are just properties of the backend. {style="font-size:20px"}
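+
+A minimal client-side sketch (hypothetical endpoint and key) showing one way to back off while the circuit is open; the exact status code surfaced to the caller depends on how the backend rules are configured: {style="font-size:20px"}
+
+```python
+import time
+
+from openai import APIStatusError, AzureOpenAI
+
+client = AzureOpenAI(azure_endpoint="https://<your-apim>.azure-api.net",
+                     api_key="<apim-subscription-key>", api_version="2024-02-01")
+
+for attempt in range(5):
+    try:
+        completion = client.chat.completions.create(
+            model="gpt-35-turbo",
+            messages=[{"role": "user", "content": "Hello!"}])
+        print(completion.choices[0].message.content)
+        break
+    except APIStatusError as e:  # covers 429 and the errors returned while the circuit is open
+        print(f"Attempt {attempt + 1} failed with status {e.status_code}, backing off...")
+        time.sleep(2 ** attempt)
+```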
+
+---
+
+Backend pool load balancing
+
+Playground to try the built-in load balancing backend pool functionality of APIM {style="font-size:20px"}
+
+
+
+--------------
+
+* Spreads the load across multiple backends, each of which may have its own circuit breaker. {style="font-size:20px"}
+* Shifts the load from one set of backends to another for upgrades (blue-green deployment). {style="font-size:20px"}
+* Currently, the backend pool supports round-robin load balancing. {style="font-size:20px"}
+* Doesn’t need any policy configuration. The rules are just properties of the backend. {style="font-size:20px"}
+
+---
+
+Advanced load balancing
+
+Playground to try the advanced load balancing (based on a custom APIM policy) {style="font-size:20px"}
+
+
+
+--------------
+
+* Loads the load balancer configuration from an APIM named value (a hypothetical shape is sketched below). {style="font-size:20px"}
+* Uses APIM backends, so it can be combined with the built-in circuit breaker or chained with the backend pool. {style="font-size:20px"}
+* The policy doesn't have to be changed to add/modify endpoints or configure the load balancer. {style="font-size:20px"}
+* Dynamically supports any number of OpenAI endpoints. {style="font-size:20px"}
+* Supports advanced properties such as priority and weight, e.g. to prioritize Provisioned Throughput Unit (PTU) deployments. {style="font-size:20px"}
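+
+A hypothetical shape for that named value, sketched as the JSON the policy could read at runtime (property names are illustrative, not the lab's exact schema): {style="font-size:20px"}
+
+```python
+import json
+
+# Illustrative only: one PTU deployment served first, two pay-as-you-go deployments sharing the overflow
+backends = [
+    {"name": "openai-ptu",     "priority": 1, "weight": 100},
+    {"name": "openai-paygo-1", "priority": 2, "weight": 50},
+    {"name": "openai-paygo-2", "priority": 2, "weight": 50},
+]
+print(json.dumps(backends, indent=2))  # value to store in the APIM named value
+```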
+
+---
+
+Response streaming
+
+Playground to try response streaming with APIM and Azure OpenAI endpoints to explore the advantages and shortcomings associated with streaming {style="font-size:20px"}
+
+
+
+--------------
+
+* The client application receives the completion in chunks as it is being generated (see the streaming sketch below). {style="font-size:20px"}
+* Can improve the user experience for intelligent apps with a ChatGPT-like interface. {style="font-size:20px"}
+* Streamed responses don't include the usage field that reports how many tokens were consumed. {style="font-size:20px"}
+* For now, you also sacrifice APIM's built-in logging. {style="font-size:20px"}
+* Streaming in a production application also makes content moderation harder, as partial completions are more difficult to evaluate. {style="font-size:20px"}
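+
+A minimal streaming sketch with the OpenAI Python SDK (hypothetical endpoint and key): {style="font-size:20px"}
+
+```python
+from openai import AzureOpenAI
+
+client = AzureOpenAI(azure_endpoint="https://<your-apim>.azure-api.net",
+                     api_key="<apim-subscription-key>", api_version="2024-02-01")
+
+# Chunks are printed as they arrive; streamed responses carry no `usage` field,
+# so token consumption has to be estimated client-side (e.g. with tiktoken) if needed.
+stream = client.chat.completions.create(
+    model="gpt-35-turbo",
+    messages=[{"role": "user", "content": "Tell me a short story."}],
+    stream=True,
+)
+for chunk in stream:
+    if chunk.choices and chunk.choices[0].delta.content:
+        print(chunk.choices[0].delta.content, end="", flush=True)
+```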
+
+---
+
+Vector searching
+
+Playground to try the Retrieval Augmented Generation (RAG) pattern with Azure AI Search, Azure OpenAI embeddings and Azure OpenAI completions {style="font-size:20px"}
+
+
+
+--------------
+
+* Implements the popular RAG pattern. {style="font-size:20px"}
+* Uses Azure AI Search as a vector store. {style="font-size:20px"}
+* Uses Azure OpenAI to generate the embeddings. {style="font-size:20px"}
+* Supports keyword search, hybrid search, and semantic ranking. {style="font-size:20px"}
+* The completion is generated from the user prompt grounded with the AI Search results (see the sketch below). {style="font-size:20px"}
+* All the Azure OpenAI and AI Search APIs are served through APIM without using keys. {style="font-size:20px"}
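+
+A condensed sketch of the pattern with hypothetical endpoints, keys, index, and field names (in the lab the Azure OpenAI and AI Search calls go through APIM keylessly): {style="font-size:20px"}
+
+```python
+from azure.core.credentials import AzureKeyCredential
+from azure.search.documents import SearchClient
+from azure.search.documents.models import VectorizedQuery
+from openai import AzureOpenAI
+
+openai_client = AzureOpenAI(azure_endpoint="https://<your-apim>.azure-api.net",
+                            api_key="<apim-subscription-key>", api_version="2024-02-01")
+search_client = SearchClient("https://<your-search>.search.windows.net", "<index-name>",
+                             AzureKeyCredential("<search-key>"))
+
+question = "Which labs does the AI Gateway include?"
+
+# 1) Embed the user question
+vector = openai_client.embeddings.create(model="text-embedding-ada-002",
+                                         input=question).data[0].embedding
+
+# 2) Hybrid search: keyword + vector over a hypothetical 'contentVector' field
+results = search_client.search(search_text=question,
+                               vector_queries=[VectorizedQuery(vector=vector, fields="contentVector")],
+                               top=3)
+context = "\n".join(doc["content"] for doc in results)
+
+# 3) Generate the completion grounded on the retrieved context
+answer = openai_client.chat.completions.create(
+    model="gpt-35-turbo",
+    messages=[{"role": "system", "content": f"Answer using only this context:\n{context}"},
+              {"role": "user", "content": question}])
+print(answer.choices[0].message.content)
+```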
+
+---
+
+Built-in logging
+
+Playground to try the built-in logging capabilities of API Management {style="font-size:20px"}
+
+
+
+--------------
+
+* Requests are logged to Application Insights, and metrics are available in Azure Monitor. {style="font-size:20px"}
+* Doesn’t need any policy configuration. {style="font-size:20px"}
+* Enables tracking request/response details and token usage with the provided notebook. {style="font-size:20px"}
+* Metrics from the Azure OpenAI service can be correlated to provide a holistic view of service usage. {style="font-size:20px"}
+* The notebook can be easily customized to accommodate specific use cases. {style="font-size:20px"}
+* Enables the creation of Azure dashboards for a single pane of glass monitoring approach. {style="font-size:20px"}
+
+---
+
+SLM self-hosting
+
+Playground to try the self-hosted Phi-2 Small Language Model (SLM) through the APIM self-hosted gateway with OpenAI API compatibility {style="font-size:20px"}
+
+
+
+--------------
+
+* The APIM self-hosted gateway is a containerized version of the default managed gateway. {style="font-size:20px"}
+* Useful for scenarios where we need to self-host an open-source model from platforms such as Hugging Face. {style="font-size:20px"}
+* This playground uses Phi-2, an SLM small enough to try on a laptop. {style="font-size:20px"}
+* Both the APIM self-hosted gateway and the Phi-2 model can run in Docker containers or in a Kubernetes cluster (see the sketch below). {style="font-size:20px"}
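+
+A minimal sketch, assuming the self-hosted gateway is reachable on a hypothetical local URL and exposes the OpenAI-compatible chat completions API: {style="font-size:20px"}
+
+```python
+from openai import OpenAI
+
+# Hypothetical local address of the APIM self-hosted gateway fronting the Phi-2 container
+client = OpenAI(base_url="http://localhost:5000/openai", api_key="<apim-subscription-key>")
+
+completion = client.chat.completions.create(
+    model="phi-2",
+    messages=[{"role": "user", "content": "Explain what a Small Language Model is in one sentence."}],
+)
+print(completion.choices[0].message.content)
+```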
+
+---
+
+Summary
+
+
+* The AI Gateway concept provides a range of labs that enable experimentation with AI services backed by an API management strategy. {style="font-size:20px"}
+* The experimentation feeds into the architecture design and the landing zone that will go to production. {style="font-size:20px"}
+* The labs are built on Jupyter Notebooks for clear, documented instructions, together with Python scripts, Bicep IaC, and APIM policies. {style="font-size:20px"}
+* There is a backlog of experiments that we plan to implement to take this work further and enable more advanced use cases. Stay tuned 🙂 {style="font-size:20px"}
+
+---
+
+
+
+### Want to know more?
+
+[aka.ms/ai-gateway](https://aka.ms/ai-gateway)
+
+
+
+
+---
+
+
+
+
+
+
+## Thank You {style="margin-top: 20px;"}
\ No newline at end of file
diff --git a/AI-GATEWAY.pptx b/AI-GATEWAY.pptx
new file mode 100644
index 0000000..699da72
Binary files /dev/null and b/AI-GATEWAY.pptx differ
diff --git a/README.md b/README.md
index 4758253..08a0c6a 100644
--- a/README.md
+++ b/README.md
@@ -17,14 +17,14 @@ Acknowledging the rising dominance of Python, particularly in the realm of AI, a
| | | |
| ---- | ----- | ----------- |
-| [Request forwarding](labs/request-forwarding/request-forwarding.ipynb) | [](labs/request-forwarding/request-forwarding.ipynb) | Playground to try forwarding requests to either an Azure OpenAI endpoint or a mock server. APIM uses the system [managed identity](https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-use-managed-service-identity) to authenticate into the Azure OpenAI service. |
-| [Backend circuit breaking](labs/backend-circuit-breaking/backend-circuit-breaking.ipynb) | [](labs/backend-circuit-breaking/backend-circuit-breaking.ipynb) | Playground to try the built-in [backend circuit breaker functionality of APIM](https://learn.microsoft.com/en-us/azure/api-management/backends?tabs=bicep) to either an Azure OpenAI endpoints or a mock server. |
-| [Backend pool load balancing](labs/backend-pool-load-balancing/backend-pool-load-balancing.ipynb) | [](labs/backend-pool-load-balancing/backend-pool-load-balancing.ipynb) | Playground to try the built-in load balancing [backend pool functionality of APIM](https://learn.microsoft.com/en-us/azure/api-management/backends?tabs=bicep) to either a list of Azure OpenAI endpoints or mock servers. |
-| [Advanced load balancing](labs/advanced-load-balancing/advanced-load-balancing.ipynb) | [](labs/advanced-load-balancing/advanced-load-balancing.ipynb) | Playground to try the advanced load balancing (based on a custom [APIM policy](https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-policies)) to either a list of Azure OpenAI endpoints or mock servers. |
-| [Response streaming](labs/response-streaming/response-streaming.ipynb) | [](labs/response-streaming/response-streaming.ipynb) | Playground to try response streaming with APIM and Azure OpenAI endpoints to explore the advantages and shortcomings associated with [streaming](https://learn.microsoft.com/en-us/azure/api-management/how-to-server-sent-events#guidelines-for-sse). |
-| [Vector searching](labs/vector-searching/vector-searching.ipynb) | [](labs/vector-searching/vector-searching.ipynb) | Playground to try the [Retrieval Augmented Generation (RAG) pattern](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview) with Azure AI Search, Azure OpenAI embeddings and Azure OpenAI completions. All the endpoints are managed via APIM. |
-| [Built-in logging](labs/built-in-logging/built-in-logging.ipynb) | [](labs/built-in-logging/built-in-logging.ipynb) | Playground to try the [buil-in logging capabilities of API Management](https://learn.microsoft.com/en-us/azure/api-management/observability). The requests are logged into Application Insights and it's easy to track request/response details and token usage with provided notebook. |
-| [SLM self-hosting](labs/slm-self-hosting/slm-self-hosting.ipynb) | [](labs/slm-self-hosting/slm-self-hosting.ipynb) | Playground to try the self-hosted [phy-2 Small Language Model (SLM)](https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/) trough the [APIM self-hosted gateway](https://learn.microsoft.com/en-us/azure/api-management/self-hosted-gateway-overview) with OpenAI API compatibility. |
+| [Request forwarding](labs/request-forwarding/request-forwarding.ipynb) | [](labs/request-forwarding/request-forwarding.ipynb) | Playground to try forwarding requests to either an Azure OpenAI endpoint or a mock server. APIM uses the system [managed identity](https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-use-managed-service-identity) to authenticate into the Azure OpenAI service. |
+| [Backend circuit breaking](labs/backend-circuit-breaking/backend-circuit-breaking.ipynb) | [](labs/backend-circuit-breaking/backend-circuit-breaking.ipynb) | Playground to try the built-in [backend circuit breaker functionality of APIM](https://learn.microsoft.com/en-us/azure/api-management/backends?tabs=bicep) with either an Azure OpenAI endpoint or a mock server. |
+| [Backend pool load balancing](labs/backend-pool-load-balancing/backend-pool-load-balancing.ipynb) | [](labs/backend-pool-load-balancing/backend-pool-load-balancing.ipynb) | Playground to try the built-in load balancing [backend pool functionality of APIM](https://learn.microsoft.com/en-us/azure/api-management/backends?tabs=bicep) to either a list of Azure OpenAI endpoints or mock servers. |
+| [Advanced load balancing](labs/advanced-load-balancing/advanced-load-balancing.ipynb) | [](labs/advanced-load-balancing/advanced-load-balancing.ipynb) | Playground to try the advanced load balancing (based on a custom [APIM policy](https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-policies)) to either a list of Azure OpenAI endpoints or mock servers. |
+| [Response streaming](labs/response-streaming/response-streaming.ipynb) | [](labs/response-streaming/response-streaming.ipynb) | Playground to try response streaming with APIM and Azure OpenAI endpoints to explore the advantages and shortcomings associated with [streaming](https://learn.microsoft.com/en-us/azure/api-management/how-to-server-sent-events#guidelines-for-sse). |
+| [Vector searching](labs/vector-searching/vector-searching.ipynb) | [](labs/vector-searching/vector-searching.ipynb) | Playground to try the [Retrieval Augmented Generation (RAG) pattern](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview) with Azure AI Search, Azure OpenAI embeddings and Azure OpenAI completions. All the endpoints are managed via APIM. |
+| [Built-in logging](labs/built-in-logging/built-in-logging.ipynb) | [](labs/built-in-logging/built-in-logging.ipynb) | Playground to try the [built-in logging capabilities of API Management](https://learn.microsoft.com/en-us/azure/api-management/observability). The requests are logged into Application Insights and it's easy to track request/response details and token usage with the provided notebook. |
+| [SLM self-hosting](labs/slm-self-hosting/slm-self-hosting.ipynb) | [](labs/slm-self-hosting/slm-self-hosting.ipynb) | Playground to try the self-hosted [Phi-2 Small Language Model (SLM)](https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/) through the [APIM self-hosted gateway](https://learn.microsoft.com/en-us/azure/api-management/self-hosted-gateway-overview) with OpenAI API compatibility. |
### Backlog of experiments
* Developer tooling
@@ -34,6 +34,7 @@ Acknowledging the rising dominance of Python, particularly in the realm of AI, a
* Token rate limiting
* Cost tracking
* Content filtering
+* PII handling
* Prompt storing
* Function calling
* Prompt guarding
@@ -80,7 +81,14 @@ The [app.py](app.py) can be customized to tailor the Mock server to specific use
* [Run locally or deploy to Azure](mock-server/mock-server.ipynb)
-## 🥇 Resources
+## 🎒 Presenting the AI Gateway concept
+> [!TIP]
+> Install the [VS Code Reveal extension](https://marketplace.visualstudio.com/items?itemName=evilz.vscode-reveal), open AI-GATEWAY.md and click on 'slides' at the bottom to present the AI Gateway without leaving VS Code.
+
+> [!TIP]
+> Or just open the [AI-GATEWAY.pptx](AI-GATEWAY.pptx) for a plain old PowerPoint experience.
+
+## 🥇 Other resources
Numerous reference architectures, best practices and starter kits are available on this topic. Please refer to the resources provided if you need comprehensive solutions or a landing zone to initiate your project. We suggest leveraging the AI-Gateway labs to discover additional capabilities that can be integrated into the reference architectures.
diff --git a/images/advanced-load-balancing-small.gif b/images/advanced-load-balancing-small.gif
new file mode 100644
index 0000000..e1fb3e2
Binary files /dev/null and b/images/advanced-load-balancing-small.gif differ
diff --git a/images/back.png b/images/back.png
new file mode 100644
index 0000000..e46b3d3
Binary files /dev/null and b/images/back.png differ
diff --git a/images/backend-circuit-breaking-small.gif b/images/backend-circuit-breaking-small.gif
new file mode 100644
index 0000000..1f6309e
Binary files /dev/null and b/images/backend-circuit-breaking-small.gif differ
diff --git a/images/backend-pool-load-balancing-small.gif b/images/backend-pool-load-balancing-small.gif
new file mode 100644
index 0000000..4b84b42
Binary files /dev/null and b/images/backend-pool-load-balancing-small.gif differ
diff --git a/images/built-in-logging-small.gif b/images/built-in-logging-small.gif
new file mode 100644
index 0000000..1662e36
Binary files /dev/null and b/images/built-in-logging-small.gif differ
diff --git a/images/developer-tooling-small.gif b/images/developer-tooling-small.gif
new file mode 100644
index 0000000..4d437c0
Binary files /dev/null and b/images/developer-tooling-small.gif differ
diff --git a/images/request-forwarding-small.gif b/images/request-forwarding-small.gif
new file mode 100644
index 0000000..e7843f5
Binary files /dev/null and b/images/request-forwarding-small.gif differ
diff --git a/images/response-streaming-small.gif b/images/response-streaming-small.gif
new file mode 100644
index 0000000..ec13315
Binary files /dev/null and b/images/response-streaming-small.gif differ
diff --git a/images/slm-self-hosting-small.gif b/images/slm-self-hosting-small.gif
new file mode 100644
index 0000000..9b27930
Binary files /dev/null and b/images/slm-self-hosting-small.gif differ
diff --git a/images/toolchain.png b/images/toolchain.png
new file mode 100644
index 0000000..68ef7ee
Binary files /dev/null and b/images/toolchain.png differ
diff --git a/images/vector-searching-small.gif b/images/vector-searching-small.gif
new file mode 100644
index 0000000..028dd54
Binary files /dev/null and b/images/vector-searching-small.gif differ