Commit a3b6341

Frcai patch 1 (Azure#1633)
* Add inferencing migration guide and scripts
* Add files via upload
* Update export-service-util.py
* Update migrate-service.sh
* Update README.md
* Update README.md
1 parent afe0e92 commit a3b6341

4 files changed: +339 -1 lines changed

README.md

+2 -1
@@ -22,6 +22,7 @@ directory|description
 -|-
 [`.github`](.github)|GitHub files like issue templates and actions workflows.
 [`cli`](cli)|Azure Machine Learning CLI v2 examples.
+[`migration`](migration)|Migration guide and scripts for migrating Azure Machine Learning resources and assets from v1 to v2.
 [`sdk`](sdk)|Azure Machine Learning Python SDK v2 examples.
 [`python-sdk`](python-sdk)|Azure Machine Learning Python SDK v1 examples.
 [`notebooks`](notebooks)|Jupyter notebooks with MLflow tracking to an Azure ML workspace.
@@ -39,4 +40,4 @@ This project has adopted the [Microsoft Open Source Code of Conduct](https://ope
 ## Reference
 
 - [Documentation](https://docs.microsoft.com/azure/machine-learning)
-
+
+72
@@ -0,0 +1,72 @@
# Migration steps for ACI/AKS webservice to Managed online endpoint

[Managed online endpoints](https://docs.microsoft.com/azure/machine-learning/concept-endpoints) help you deploy ML models in a turnkey manner. They run on powerful CPU and GPU machines in Azure in a scalable, fully managed way, and take care of serving, scaling, securing, and monitoring your models, freeing you from the overhead of setting up and managing the underlying infrastructure. For details, see [Deploy and score a machine learning model by using an online endpoint](https://docs.microsoft.com/azure/machine-learning/how-to-deploy-managed-online-endpoints).

You can deploy directly to the new compute target with your existing models and environments, or use the [scripts](https://github.com/Azure/azureml-examples/blob/main/migration/inferencing-migration/migrate-service.sh) (preview) that we provide to export the current services and then deploy them to the new compute. For customers who regularly create and delete ACI services, we strongly recommend the former approach. Note that the **scoring URL changes after migration**. For example, the scoring URL of an ACI web service looks like "http://aaaaaa-bbbbb-1111.westus.azurecontainer.io/score" and that of an AKS web service looks like "http://1.2.3.4:80/api/v1/service/aks-service/score", while the new one looks like "https://endpoint-name.westus.inference.ml.azure.com/score".

## Supported Scenarios and Differences

### Auth Mode
Managed online endpoints do not support the no-auth mode. If you migrate with the migration scripts below, a no-auth service is converted to key auth.
For key auth, the original keys are used. Token-based auth is also supported.
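
Because a migrated endpoint always requires authentication, clients that previously called a no-auth service need to start sending a key. The following is a minimal sketch of building such a request with the Python standard library. It assumes the key is passed as a bearer token in the `Authorization` header; the function name `build_scoring_request`, the `<ENDPOINT_KEY>` placeholder, and the payload shape are hypothetical, and the real payload depends on your scoring script.

```python
import json
import urllib.request


def build_scoring_request(scoring_url: str, key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for a key-authenticated endpoint."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumption: the endpoint key is sent as a bearer token.
        "Authorization": f"Bearer {key}",
    }
    return urllib.request.Request(scoring_url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_scoring_request(
        "https://endpoint-name.westus.inference.ml.azure.com/score",
        "<ENDPOINT_KEY>",  # placeholder, replace with your endpoint key
        {"data": [[1, 2, 3]]},  # hypothetical payload
    )
    print(request.get_method(), request.full_url)
    # To send the request: urllib.request.urlopen(request).read()
```

Building the request separately from sending it makes it easy to inspect the headers before pointing existing clients at the new scoring URL.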

### TLS
For an ACI service secured with HTTPS, you no longer need to provide your own certificates: all managed online endpoints are protected by TLS. Custom DNS names are not supported.

### Resource Requirements
[ContainerResourceRequirements](https://docs.microsoft.com/python/api/azureml-core/azureml.core.webservice.aci.containerresourcerequirements?view=azure-ml-py) is not supported; instead, choose a suitable [SKU](https://docs.microsoft.com/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list) for your inferencing workload.
The migration tool maps the CPU/memory request to the corresponding SKU. If you redeploy manually through CLI/SDK V2, we suggest the same SKU mapping for your new deployment.

| CPU request | Memory request in GB | SKU |
| :----| :---- | :---- |
| (0, 1] | (0, 1.2] | DS1 V2 |
| (1, 2] | (1.2, 1.7] | F2s V2 |
| (1, 2] | (1.7, 4.7] | DS2 V2 |
| (1, 2] | (4.7, 13.7] | E2s V3 |
| (2, 4] | (0, 5.7] | F4s V2 |
| (2, 4] | (5.7, 11.7] | DS3 V2 |
| (2, 4] | (11.7, 16] | E4s V3 |
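
If you redeploy manually, the table above can be read as a lookup over half-open intervals. The sketch below encodes the rows literally; the names `SKU_ROWS` and `suggest_sku` are hypothetical, and this is not necessarily how the migration tool itself performs the mapping. Combinations that the table does not list (for example, at most 1 CPU with more than 1.2 GB of memory) return `None` rather than a guess.

```python
from typing import Optional

# Rows of the table above: (cpu_low, cpu_high, mem_low, mem_high, sku).
# Each interval is open on the left and closed on the right.
SKU_ROWS = [
    (0, 1, 0, 1.2, "DS1 V2"),
    (1, 2, 1.2, 1.7, "F2s V2"),
    (1, 2, 1.7, 4.7, "DS2 V2"),
    (1, 2, 4.7, 13.7, "E2s V3"),
    (2, 4, 0, 5.7, "F4s V2"),
    (2, 4, 5.7, 11.7, "DS3 V2"),
    (2, 4, 11.7, 16, "E4s V3"),
]


def suggest_sku(cpu: float, memory_gb: float) -> Optional[str]:
    """Return the SKU whose CPU and memory intervals both contain the request."""
    for cpu_low, cpu_high, mem_low, mem_high, sku in SKU_ROWS:
        if cpu_low < cpu <= cpu_high and mem_low < memory_gb <= mem_high:
            return sku
    return None  # the table has no row for this combination


if __name__ == "__main__":
    print(suggest_sku(1, 1.0))
    print(suggest_sku(3, 12))
```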

### Network Isolation
For private workspace and VNET scenarios, see [Use network isolation with managed online endpoints (preview)](https://docs.microsoft.com/azure/machine-learning/how-to-secure-online-endpoint?tabs=model). Because there are many settings for your workspace and VNET, we strongly suggest that you redeploy through our new CLI instead of the script tool below.

## Not supported
1. [EncryptionProperties](https://docs.microsoft.com/python/api/azureml-core/azureml.core.webservice.aci.encryptionproperties?view=azure-ml-py) for ACI containers is not supported.
2. ACI webservices deployed through deploy_from_model and deploy_from_image are not supported by the migration tool. Redeploy them manually through CLI/SDK V2.

## Migration Steps

### With our [CLI](https://docs.microsoft.com/azure/machine-learning/how-to-deploy-managed-online-endpoints) or [SDK preview](https://docs.microsoft.com/azure/machine-learning/how-to-deploy-managed-online-endpoint-sdk-v2)
Redeploy manually with your model files and environment definition.
You can find our examples in [azureml-examples](https://github.com/Azure/azureml-examples). Specifically, see the [SDK example for managed online endpoint](https://github.com/Azure/azureml-examples/tree/main/sdk/endpoints/online/managed).

### With our migration tool (preview)
Follow these steps to use the scripts. Note that the new endpoint is created under the **same workspace**.

1. Use Linux or WSL to run the bash script.
2. Install [Python SDK V1](https://docs.microsoft.com/python/api/overview/azure/ml/install?view=azure-ml-py) to run the Python script.
3. Install [Azure CLI](https://docs.microsoft.com/cli/azure/install-azure-cli).
4. Clone this repository to your local environment: `git clone https://github.com/Azure/azureml-examples`.
5. Edit the subscription, resource group, workspace, and service name in migrate-service.sh, along with the new endpoint name and deployment name. We recommend a new endpoint name that differs from the previous service name; otherwise, the original service will not be displayed when you check your endpoints in the portal.
6. Execute the bash script. The new deployment takes about 5-10 minutes to finish.
7. After the deployment succeeds, you can verify the endpoint with the [invoke command](https://docs.microsoft.com/cli/azure/ml/online-endpoint?view=azure-cli-latest#az-ml-online-endpoint-invoke).

## Cost comparison
The following is a rough cost comparison, for your information only. Actual costs vary with your region, currency, and order type.
The monthly ACI cost is calculated as $29.5650 * X + $3.2485 * Y, where X is the CPU core request rounded up to the nearest whole number and Y is the memory request in GB rounded up to the nearest tenth.
Both the ACI and SKU costs are per month.

| CPU request | Memory request in GB | ACI costs range | SKU | SKU pay as you go | SKU 1 year reserved | SKU 3 year reserved |
| :----| :---- | :---- | :---- | :---- | :---- | :---- |
| (0, 1] | (0, 1.2] | ($29.565, $33.463] | DS1 V2 | $41.610 | $27.003 | $17.696 |
| (1, 2] | (1.2, 1.7] | ($63.028, $64.652] | F2s V2 | $61.758 | $36.500 | $22.638 |
| (1, 2] | (1.7, 4.7] | ($64.652, $74.398] | DS2 V2 | $83.220 | $54.086 | $35.391 |
| (1, 2] | (4.7, 13.7] | ($74.398, $103.634] | E2s V3 | $97.090 | $57.086 | $36.500 |
| (2, 3] | (0, 5.7] | ($88.695, $107.211] | F4s V2 | $123.37 | $73.000 | $45.275 |
| (3, 4] | (0, 5.7] | ($118.26, $136.776] | F4s V2 | $123.37 | $73.000 | $45.275 |
| (2, 3] | (5.7, 11.7] | ($107.211, $126.702] | DS3 V2 | $167.170 | $108.165 | $70.781 |
| (3, 4] | (5.7, 11.7] | ($136.776, $156.267] | DS3 V2 | $167.170 | $108.165 | $70.781 |
| (2, 3] | (11.7, 16] | ($126.702, $140.671] | E4s V3 | $194.180 | $114.165 | $73.000 |
| (3, 4] | (11.7, 16] | ($156.267, $170.236] | E4s V3 | $194.180 | $114.165 | $73.000 |
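
To estimate the ACI side of the comparison for a specific request, the formula above can be evaluated directly. This is a small sketch using `decimal` so that rounding up to whole cores and to tenths of a GB is exact; the function name `aci_monthly_cost` is hypothetical, and the two rates are the ones quoted in the formula, which vary by region, currency, and order type.

```python
from decimal import Decimal, ROUND_CEILING

CPU_RATE = Decimal("29.5650")    # USD per CPU core per month, from the formula above
MEMORY_RATE = Decimal("3.2485")  # USD per GB of memory per month, from the formula above


def aci_monthly_cost(cpu_request, memory_gb_request) -> Decimal:
    """Estimate the monthly ACI cost for a CPU/memory request."""
    # X: CPU cores rounded up to the nearest whole number.
    cores = Decimal(str(cpu_request)).to_integral_value(rounding=ROUND_CEILING)
    # Y: memory in GB rounded up to the nearest tenth.
    memory = Decimal(str(memory_gb_request)).quantize(Decimal("0.1"), rounding=ROUND_CEILING)
    return (CPU_RATE * cores + MEMORY_RATE * memory).quantize(Decimal("0.001"))


if __name__ == "__main__":
    print(aci_monthly_cost(1, 1.2))
    print(aci_monthly_cost(1.5, 1.65))
```

A request of 1 core and 1.2 GB yields the upper bound of the first table row, and any request that rounds up to 2 cores and 1.7 GB yields the upper bound of the second.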

## Contact us
Reach out to us at [email protected] if you have any questions or feedback.
export-service-util.py
@@ -0,0 +1,224 @@
import json
import argparse
import tempfile
from azureml.core import Workspace
from azureml.exceptions import WebserviceException
from azureml._model_management._util import _get_mms_url, get_requests_session
from azureml._model_management._constants import (
    AKS_WEBSERVICE_TYPE,
    ACI_WEBSERVICE_TYPE,
    UNKNOWN_WEBSERVICE_TYPE,
    MMS_SYNC_TIMEOUT_SECONDS,
)
from azureml.core.webservice import Webservice, AciWebservice, AksWebservice
from azureml._restclient.clientbase import ClientBase

MIGRATION_WEBSERVICE_TYPES = [AKS_WEBSERVICE_TYPE, ACI_WEBSERVICE_TYPE]


def export(
    ws: Workspace,
    service_name: str = None,
    timeout_seconds: int = None,
    show_output: bool = True,
):
    """
    Export a v1 service under the target workspace into a template and parameters.
    :param ws: Target workspace.
    :param service_name: Name of the service to be migrated.
    :param show_output: Whether to print output.
    :param timeout_seconds: Timeout for waiting on the export operation.
    :return: Tuple of (storage account name, export folder, v1 compute name).
    """
    # The service name is concatenated into the URL below, so it must be set.
    if not service_name:
        raise WebserviceException("A service name is required for export.")

    base_url = _get_mms_url(ws)
    mms_endpoint = base_url + "/services/" + service_name
    headers = {"Content-Type": "application/json"}
    headers.update(ws._auth_object.get_authentication_header())
    try:
        resp = ClientBase._execute_func(
            get_requests_session().get,
            mms_endpoint,
            headers=headers,
            timeout=MMS_SYNC_TIMEOUT_SECONDS,
        )
    except Exception as exc:
        raise WebserviceException(f"Cannot get service {service_name}") from exc

    if resp.status_code == 404:
        raise WebserviceException(f"Service {service_name} does not exist.")

    content = resp.content
    if isinstance(resp.content, bytes):
        content = resp.content.decode("utf-8")
    service = json.loads(content)
    if service["state"] != "Healthy":
        raise WebserviceException(
            f"service {service_name} is unhealthy, migration with this tool is not supported."
        )
    compute_type = service["computeType"]
    if compute_type.upper() not in MIGRATION_WEBSERVICE_TYPES:
        raise WebserviceException(
            'Invalid compute type "{}". Valid compute types are "{}"'.format(
                compute_type, ",".join(MIGRATION_WEBSERVICE_TYPES)
            )
        )
    # Default to the service name; AKS services report their compute name.
    compute_name = service_name
    if compute_type.upper() == AKS_WEBSERVICE_TYPE:
        compute_name = service["computeName"]

    mms_endpoint = base_url + "/services/export"
    export_payload = {"serviceName": service_name}
    try:
        resp = ClientBase._execute_func(
            get_requests_session().post,
            mms_endpoint,
            headers=headers,
            json=export_payload,
        )
    except Exception as exc:
        raise WebserviceException(f"Cannot export service {service_name}") from exc

    if resp.status_code == 202:
        service_entity = None
        if compute_type.upper() == AKS_WEBSERVICE_TYPE:
            service_entity = AksWebservice(ws, service_name)
        elif compute_type.upper() == ACI_WEBSERVICE_TYPE:
            service_entity = AciWebservice(ws, service_name)
        service_entity.state = "Exporting"
        service_entity._operation_endpoint = (
            _get_mms_url(service_entity.workspace)
            + f'/operations/{resp.content.decode("utf-8")}'
        )
        state, _, operation = service_entity._wait_for_operation_to_complete(
            show_output, timeout_seconds
        )
        if state == "Succeeded":
            export_folder = operation.get("resourceLocation").split("/")[-1]
            storage_account = service_entity.workspace.get_details().get(
                "storageAccount"
            )
            if show_output:
                print(
                    f"Services have been exported to storage account: {storage_account} \n"
                    f"Folder path: azureml/{export_folder}"
                )
            return storage_account.split("/")[-1], export_folder, compute_name
        # Fail loudly instead of returning None, which the caller cannot unpack.
        raise WebserviceException(
            f"Export of service {service_name} finished in state {state}."
        )
    else:
        raise WebserviceException(
            "Received bad response from Model Management Service:\n"
            "Response Code: {}\n"
            "Headers: {}\n"
            "Content: {}".format(resp.status_code, resp.headers, resp.content)
        )


def overwrite_parameters(
    parms: dict, endpoint_name: str = None, deployment_name: str = None
):
    """
    Overwrite parameters, write them to a temporary JSON file, and print its path.
    :param deployment_name: v2 online-deployment name. Default will be v1 service name.
    :param endpoint_name: v2 online-endpoint name. Default will be v1 service name.
    :param parms: parameters as dict: loaded v2 parameters.
    """
    properties = parms["onlineEndpointProperties"]["value"]
    traffic = parms["onlineEndpointPropertiesTrafficUpdate"]["value"]
    properties.pop("keys")
    traffic.pop("keys")
    if endpoint_name:
        parms["onlineEndpointName"]["value"] = endpoint_name

    # this is optional
    if deployment_name:
        parms["onlineDeployments"]["value"][0]["name"] = deployment_name
        traffic["traffic"][deployment_name] = traffic["traffic"].pop(
            list(traffic["traffic"].keys())[0]
        )

    # Dump the parms argument rather than relying on a module-level variable.
    temp_file = tempfile.NamedTemporaryFile(mode="w+", suffix=".json", delete=False)
    json.dump(parms, temp_file)
    temp_file.flush()
    print(temp_file.name)


if __name__ == "__main__":

    def parse_args():
        parser = argparse.ArgumentParser(description="Export v1 service script")
        parser.add_argument(
            "--export", action="store_true", help="use the script to export services"
        )
        parser.add_argument(
            "--overwrite-parameters",
            action="store_true",
            help="use the script to overwrite parameters",
        )
        parser.add_argument("-w", "--workspace", type=str, help="workspace name")
        parser.add_argument(
            "-g", "--resource-group", type=str, help="resource group name"
        )
        parser.add_argument("-s", "--subscription", type=str, help="subscription id")
        parser.add_argument(
            "-sn",
            "--service-name",
            default=None,
            type=str,
            help="service name to be migrated",
        )
        parser.add_argument(
            "-e",
            "--export-json",
            action="store_true",
            dest="export_json",
            help="show export result in json",
        )
        parser.add_argument(
            "-mp", "--parameters-path", type=str, help="parameters file path"
        )
        parser.add_argument(
            "-me",
            "--migrate-endpoint-name",
            type=str,
            default=None,
            help="v2 online-endpoint name, default is v1 service name",
        )
        parser.add_argument(
            "-md",
            "--migrate-deployment-name",
            type=str,
            default=None,
            help="v2 online-deployment name, default is v1 service name",
        )
        parser.set_defaults(compute_type=None)
        return parser.parse_args()

    # parse args
    args = parse_args()

    if args.export:
        workspace = Workspace.get(
            name=args.workspace,
            resource_group=args.resource_group,
            subscription_id=args.subscription,
        )
        storage_account, blob_folder, v1_compute = export(
            workspace, args.service_name, show_output=not args.export_json
        )
        if args.export_json:
            print(
                json.dumps(
                    {
                        "storage_account": storage_account,
                        "blob_folder": blob_folder,
                        "v1_compute": v1_compute,
                    }
                )
            )

    if args.overwrite_parameters:
        with open(args.parameters_path) as f:
            online_endpoint_deployment = json.load(f)
        overwrite_parameters(
            online_endpoint_deployment,
            args.migrate_endpoint_name,
            args.migrate_deployment_name,
        )
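
To see what `overwrite_parameters` does to an exported parameters file without touching Azure, the renaming logic can be exercised on a small in-memory dictionary. The sketch below is an illustration only: the top-level keys match the ones the script reads, but the sample values, the `rename_in_parameters` helper, and the names `my-v1-service`, `new-endpoint`, and `blue` are hypothetical. Unlike the script, it works on a copy and tolerates a missing `keys` entry.

```python
import copy


def rename_in_parameters(parms: dict, endpoint_name=None, deployment_name=None) -> dict:
    """Return a copy of parms with keys dropped and endpoint/deployment renamed."""
    result = copy.deepcopy(parms)
    properties = result["onlineEndpointProperties"]["value"]
    traffic = result["onlineEndpointPropertiesTrafficUpdate"]["value"]
    # Drop the exported keys (tolerating their absence, unlike the script).
    properties.pop("keys", None)
    traffic.pop("keys", None)
    if endpoint_name:
        result["onlineEndpointName"]["value"] = endpoint_name
    if deployment_name:
        result["onlineDeployments"]["value"][0]["name"] = deployment_name
        # Move the traffic share of the first deployment to the new name.
        old_name = next(iter(traffic["traffic"]))
        traffic["traffic"][deployment_name] = traffic["traffic"].pop(old_name)
    return result


# Hypothetical, heavily trimmed parameters; real exports contain more fields.
SAMPLE = {
    "onlineEndpointName": {"value": "my-v1-service"},
    "onlineEndpointProperties": {"value": {"authMode": "Key", "keys": {}}},
    "onlineEndpointPropertiesTrafficUpdate": {
        "value": {"keys": {}, "traffic": {"my-v1-service": 100}}
    },
    "onlineDeployments": {"value": [{"name": "my-v1-service"}]},
}

if __name__ == "__main__":
    updated = rename_in_parameters(SAMPLE, "new-endpoint", "blue")
    print(updated["onlineEndpointName"]["value"])
    print(updated["onlineEndpointPropertiesTrafficUpdate"]["value"]["traffic"])
```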
migrate-service.sh
@@ -0,0 +1,41 @@
#!/bin/bash

set -e

subscription_id="<SUBSCRIPTION_ID>"
resource_group="<RESOURCEGROUP_NAME>"
workspace_name="<WORKSPACE_NAME>"
v1_service_name="<SERVICE_NAME>" # name of your aci/aks service
local_dir="<LOCAL_PATH>"
online_endpoint_name="<NEW_ENDPOINT_NAME>"
online_deployment_name="<NEW_DEPLOYMENT_NAME>"

migrate_type="Managed"

# STEP1 Export services
echo 'Exporting services...'
output=$(python3 export-service-util.py --export --export-json -w "$workspace_name" -g "$resource_group" -s "$subscription_id" -sn "$v1_service_name" | tee /dev/tty)
read -r storage_account blob_folder v1_compute < <(echo "$output" | tail -n1 | jq -r '"\(.storage_account) \(.blob_folder) \(.v1_compute)"')

# STEP2 Download template & parameters files
echo 'Downloading files...'
az storage blob directory download -c azureml --account-name "$storage_account" -s "$blob_folder" -d "$local_dir" --recursive --subscription "$subscription_id" --only-show-errors 1> /dev/null

# STEP3 Overwrite parameters
echo 'Overwriting parameters...'
echo
params_file="$local_dir/$blob_folder/$v1_compute/$migrate_type/$v1_service_name.params.json"
template_file="$local_dir/$blob_folder/online.endpoint.template.json"
output=$(python3 export-service-util.py --overwrite-parameters -mp "$params_file" -me "$online_endpoint_name" -md "$online_deployment_name" | tee /dev/tty)
params=$(echo "$output" | tail -n1)

# STEP4 Deploy to managed online endpoints
echo
echo "Params have been saved to $params"
echo "Deploying $migrate_type service $online_endpoint_name..."
deployment_name="Migration-$online_endpoint_name-$(echo $RANDOM | md5sum | head -c 4)"
az deployment group create --name "$deployment_name" --resource-group "$resource_group" --template-file "$template_file" --parameters "$params" --subscription "$subscription_id"

# STEP5 Clean up exported files in blob storage
echo 'Cleaning up exported files in blob storage...'
az storage blob directory delete -c azureml --account-name "$storage_account" -d "$blob_folder" --recursive --subscription "$subscription_id" --only-show-errors 1> /dev/null
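
STEP1 and STEP3 of the script above rely on two conventions: the export result is the last line of the Python script's output, as JSON, and the downloaded files follow a fixed folder layout. The following sketch restates both in Python, which may help when debugging a failed run. The helper name `locate_exported_files` and the sample values are hypothetical; the JSON field names and path layout are taken from the script.

```python
import json
from pathlib import PurePosixPath


def locate_exported_files(export_output: str, local_dir: str, service_name: str,
                          migrate_type: str = "Managed"):
    """Parse the last output line as JSON and derive the params/template paths."""
    last_line = export_output.strip().splitlines()[-1]
    result = json.loads(last_line)
    base = PurePosixPath(local_dir) / result["blob_folder"]
    params_file = base / result["v1_compute"] / migrate_type / f"{service_name}.params.json"
    template_file = base / "online.endpoint.template.json"
    return str(params_file), str(template_file)


if __name__ == "__main__":
    # Hypothetical output: progress text followed by the JSON result line.
    sample_output = (
        "Exporting...\n"
        '{"storage_account": "mystorage", "blob_folder": "export-123", "v1_compute": "my-aks"}\n'
    )
    params_path, template_path = locate_exported_files(sample_output, "/tmp/export", "my-service")
    print(params_path)
    print(template_path)
```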
