Commit
Add clarifying information & remove mention of mock server, format
simonkurtz-MSFT committed Jan 15, 2025
1 parent cebe490 commit 30f0aad
Showing 2 changed files with 11 additions and 10 deletions.
2 changes: 1 addition & 1 deletion labs/backend-pool-load-balancing/README.MD
@@ -4,7 +4,7 @@

[![flow](../../images/backend-pool-load-balancing.gif)](backend-pool-load-balancing.ipynb)

Playground to try the built-in load balancing [backend pool functionality of APIM](https://learn.microsoft.com/azure/api-management/backends?tabs=bicep) to either a list of Azure OpenAI endpoints or mock servers.
Playground to try the built-in load balancing [backend pool functionality of APIM](https://learn.microsoft.com/azure/api-management/backends?tabs=bicep) to a list of Azure OpenAI endpoints.

### Result

19 changes: 10 additions & 9 deletions labs/backend-pool-load-balancing/backend-pool-load-balancing.ipynb
@@ -9,12 +9,13 @@
"## Backend pool Load Balancing lab\n",
"![flow](../../images/backend-pool-load-balancing.gif)\n",
"\n",
"Playground to try the built-in load balancing [backend pool functionality of APIM](https://learn.microsoft.com/azure/api-management/backends?tabs=bicep) to either a list of Azure OpenAI endpoints.\n",
"Playground to try the built-in load balancing [backend pool functionality of APIM](https://learn.microsoft.com/azure/api-management/backends?tabs=bicep) to a list of Azure OpenAI endpoints.\n",
"\n",
"Notes:\n",
"- The backend pool uses round-robin by default\n",
"- Priority and weight-based routing are also supported: Adjust the `priority` (the lower the number, the higher the priority) and `weight` parameters in the `openai_resources` variable\n",
"- The `retry` API Management policy initiates a retry to an available backend if an HTTP 429 status code is encountered\n",
"- **This is a typical prioritized PTU with fallback consumption scenario**. The lab specifically showcases how a priority 1 (highest) backend is exhausted before gracefully falling back to two equally-weighted priority 2 backends. \n",
"- The backend pool uses round-robin by default.\n",
"- Priority and weight-based routing are supported and can be adjusted by modifying `priority` (the lower the number, the higher the priority) and `weight` parameters in the `openai_resources` variable below.\n",
"- The `retry` API Management policy initiates a retry to an available backend if an HTTP 429 status code is encountered. This is transparent to the caller.\n",
"\n",
"### Result\n",
"![result](result.png)\n",
@@ -132,7 +133,7 @@
"with open(\"policy.xml\", 'r') as policy_xml_file:\n",
" policy_template_xml = policy_xml_file.read()\n",
" if \"{backend-id}\" in policy_template_xml:\n",
" policy_xml = policy_template_xml.replace(\"{backend-id}\", str(\"openai-backend-pool\" if len(openai_resources) > 1 else openai_resources[0].get(\"name\"))) \n",
" policy_xml = policy_template_xml.replace(\"{backend-id}\", str(\"openai-backend-pool\" if len(openai_resources) > 1 else openai_resources[0].get(\"name\")))\n",
" policy_xml_file.close()\n",
"if policy_xml is not None:\n",
" open(\"policy.xml\", 'w').write(policy_xml)\n",
@@ -152,12 +153,12 @@
" }\n",
"}\n",
"\n",
"# write the parameters to a file \n",
"# write the parameters to a file\n",
"with open('params.json', 'w') as bicep_parameters_file:\n",
" bicep_parameters_file.write(json.dumps(bicep_parameters))\n",
"\n",
"# run the deployment\n",
"output = utils.run(f\"az deployment group create --name {deployment_name} --resource-group {resource_group_name} --template-file main.bicep --parameters params.json\", \n",
"output = utils.run(f\"az deployment group create --name {deployment_name} --resource-group {resource_group_name} --template-file main.bicep --parameters params.json\",\n",
" f\"Deployment '{deployment_name}' succeeded\", f\"Deployment '{deployment_name}' failed\")\n",
"open(\"policy.xml\", 'w').write(policy_template_xml)\n",
"\n"
@@ -370,7 +371,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": ".venv",
"language": "python",
"name": "python3"
},
@@ -384,7 +385,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.8"
"version": "3.12.0"
}
},
"nbformat": 4,
Expand Down
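
The notes in the notebook's first markdown cell describe the routing behavior: a priority 1 backend is exhausted first, then traffic falls back to two equally-weighted priority 2 backends, with a retry on HTTP 429. The following is a minimal sketch of that behavior in plain Python, not APIM's actual implementation. The backend names, the `capacity` field (used here to simulate throttling), and all function names are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Backend:
    name: str      # hypothetical backend name
    priority: int  # the lower the number, the higher the priority
    weight: int    # relative share of traffic within one priority group
    capacity: int  # assumption: requests served before a simulated HTTP 429


def pick_backend(backends, excluded, rng):
    """Pick from the highest-priority group still available, by weight."""
    candidates = [b for b in backends if b.name not in excluded]
    if not candidates:
        return None
    best = min(b.priority for b in candidates)
    group = [b for b in candidates if b.priority == best]
    return rng.choices(group, weights=[b.weight for b in group], k=1)[0]


def send(backends, served, rng):
    """Send one request; on a simulated 429, retry against another backend."""
    excluded = set()
    while True:
        backend = pick_backend(backends, excluded, rng)
        if backend is None:
            return 429, None  # every backend is throttled
        if served[backend.name] < backend.capacity:
            served[backend.name] += 1
            return 200, backend.name
        excluded.add(backend.name)  # throttled: try the next candidate


def simulate(total_requests, seed=0):
    """Run a PTU-with-fallback scenario and return requests served per backend."""
    rng = random.Random(seed)
    backends = [
        Backend("ptu-1", priority=1, weight=100, capacity=5),
        Backend("paygo-1", priority=2, weight=50, capacity=1000),
        Backend("paygo-2", priority=2, weight=50, capacity=1000),
    ]
    served = {b.name: 0 for b in backends}
    for _ in range(total_requests):
        send(backends, served, rng)
    return served
```

Calling `simulate(25)` serves the first five requests from `ptu-1` and the remaining twenty from the two priority 2 backends; the exact split between those two depends on the random seed. The retry happens inside `send`, so the caller only sees the final status, which mirrors the note that the retry is transparent to the caller.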
