lab | ||||
---|---|---|---|---|
|
In this module, students will learn how to secure a Synapse Analytics workspace and its supporting infrastructure. The student will observe the SQL Active Directory Admin, manage IP firewall rules, manage secrets with Azure Key Vault and access those secrets through a Key Vault linked service and pipeline activities. The student will understand how to implement column-level security, row-level security, and dynamic data masking when using dedicated SQL pools.
In this module, the student will be able to:
- Secure Azure Synapse Analytics supporting infrastructure
- Secure the Azure Synapse Analytics workspace and managed services
- Secure Azure Synapse Analytics workspace data
- Module 8 - End-to-end security with Azure Synapse Analytics
- Lab details
- Resource naming throughout this lab
- Lab setup and pre-requisites
- Exercise 0: Start the dedicated SQL pool
- Exercise 1 - Securing Azure Synapse Analytics supporting infrastructure
- Exercise 2 - Securing the Azure Synapse Analytics workspace and managed services
- Exercise 3 - Securing Azure Synapse Analytics workspace data
- Exercise 4: Cleanup
- Reference
- Other Resources
This lab will guide you through several security-related steps that cover an end-to-end security story for Azure Synapse Analytics. Some key take-aways from this lab are:
-
Leverage Azure Key Vault to store sensitive connection information, such as access keys and passwords for linked services as well as in pipelines.
-
Introspect the data that is contained within the SQL Pools in the context of potential sensitive/confidential data disclosure. Identify the columns representing sensitive data, then secure them by adding column-level security. Determine at the table level what data should be hidden from specific groups of users then define security predicates to apply row level security (filters) on the table. If desired, you also have the option of applying Dynamic Data Masking to mask sensitive data returned in queries on a column by column basis.
For the remainder of this guide, the following terms will be used for various ASA-related resources (make sure you replace them with actual names and values):
Azure Synapse Analytics Resource | To be referred to |
---|---|
Workspace resource group | WorkspaceResourceGroup |
Workspace / workspace name | Workspace |
Primary Storage Account | PrimaryStorage |
Default file system container | DefaultFileSystem |
SQL Pool | SqlPool01 |
SQL Serverless Endpoint | SqlServerless01 |
Active Directory Principal of New User | [email protected] |
SQL username of New User | newuser |
Azure Key Vault | KeyVault01 |
Azure Key Vault Private Endpoint Name | KeyVaultPrivateEndpointName |
Azure Subscription | WorkspaceSubscription |
Azure Region | WorkspaceRegion |
- You have successfully completed Module 0 to create your lab environment.
This lab uses the dedicated SQL pool. As a first step, make sure it is not paused. If so, start it by following these instructions:
-
Open Synapse Studio (https://web.azuresynapse.net/).
-
Select the Manage hub.
-
Select SQL pools in the left-hand menu (1). If the dedicated SQL pool is paused, hover over the name of the pool and select Resume (2).
-
When prompted, select Resume. It will take a minute or two to resume the pool.
Continue to the next exercise while the dedicated SQL pool resumes.
Azure Synapse Analytics (ASA) is a powerful solution that handles security for many of the resources that it creates and manages. In order to run ASA, however, some foundational security measures need to be put in place to ensure the infrastructure that it relies upon is secure. In this exercise, we will walk through securing the supporting infrastructure of ASA.
The SQL Active Directory Admin can be a user (the default) or group (best practice so that more than one user can be provided these permissions) security principal. The principal assigned to this will have administrative permissions to the SQL Pools contained in the workspace.
-
In the Azure Portal (https://portal.azure.com), browse to your lab resource group, and from the list of resources open your Synapse workspace (do not launch Synapse Studio).
-
From the left menu, select SQL Active Directory admin and observe who is listed as a SQL Active Directory Admin. Is it a user or group?
Having robust Internet security is a must for every technology system. One way to mitigate internet threat vectors is by reducing the number of public IP addresses that can access the Azure Synapse Analytics Workspace through the use of IP firewall rules. The Azure Synapse Analytics workspace will then delegate those same rules to all managed public endpoints of the workspace, including those for SQL pools and SQL Serverless endpoints.
-
In the Azure Portal, open the Synapse workspace (do not launch Studio).
-
From the left menu of the Azure Synapse Analytics page, select Firewalls.
-
Notice that an IP Firewall rule of
Allow All
has already been created for you in the lab environment. If you wanted to add your specific IP address you would instead select + Add Client IP from the taskbar menu (you do not need to do this now).Note: When connecting to Synapse from your local network, certain ports need to be open. To support the functions of Synapse Studio, ensure outgoing TCP ports 80, 443, and 1143, and UDP port 53 are open.
When dealing with connectivity to external data sources and services, sensitive connection information such as passwords and access keys should be properly handled. It is recommended that this type of information be stored in an Azure Key Vault. Leveraging Azure Key Vault not only protects against secrets being compromised, it also serves as a central source of truth; meaning that if a secret value needs to be updated (such as when cycling access keys on a storage account), it can be changed in one place and all services consuming this key will start pulling the new value immediately. Azure Key Vault encrypts and decrypts information transparently using 256-bit AES encryption, which is FIPS 140-2 compliant.
-
In the Azure Portal, open the resource group for this lab, and from the list of resources, select the Key vault resource.
-
From the left menu, under Settings, select Access Policies.
-
Observe that Managed Service Identity (MSI) representing your Synapse workspace (it has a name similar to
asaworkspaceNNNNNN
) has already been listed under Application and it has 4 selected Secret Management Operations. -
Select the drop-down that reads
4 selected
underSecret Management Operations
, observe that Get (which allows your workspace to retrieve the values of secrets from Key Vault) and List (which allows your workspace to enumerate secrets) are set.
Linked Services are synonymous with connection strings in Azure Synapse Analytics. Azure Synapse Analytics linked services provides the ability to connect to nearly 100 different types of external services ranging from Azure Storage Accounts to Amazon S3 and more. When connecting to external services, having secrets related to connection information is almost guaranteed. The best place to store these secrets is the Azure Key Vault. Azure Synapse Analytics provides the ability to configure all linked service connections with values from Azure Key Vault.
In order to leverage Azure Key Vault in linked services, you must first add asakeyvaultXX
as a linked service in Azure Synapse Analytics.
-
Navigate to Azure Synapse Studio (https://web.azuresynapse.net/) and sign in with the same user account you did in the Azure portal.
-
Select the Manage hub from the left menu.
-
Beneath External Connections, select Linked Services, observe that a Linked Service pointing to your Key Vault has been provided in the environment.
Since we have the Azure Key Vault set up as a linked service, we can leverage it when defining new linked services. Every New linked service provides the option to retrieve secrets from Azure Key Vault. The form requests the selection of the Azure Key Vault linked service, the secret name, and (optional) specific version of the secret.
It is recommended to store any secrets that are part of your pipeline in Azure Key Vault. In this task you will retrieve these values using a Web activity, just to show the mechanics. The second part of this task demonstrates using a Web activity in the pipeline to retrieve a secret from the Key Vault.
-
Return to the Azure portal.
-
Open the
asakeyvaultXX
Azure Key Vault resource, and select Secrets from the left menu. From the top toolbar, select + Generate/Import. -
Create a secret, with the name PipelineSecret and assign it a value of IsNotASecret, and select the Create button.
-
Open the secret that you just created, drill into the current version, and copy the value in the Secret Identifier field. Save this value in a text editor, or retain it in your clipboard for a future step.
-
Switch back to Synapse Studio, then select the Integrate hub from the left menu.
-
From the Integrate blade, select the + button and add a new Pipeline.
-
On the Pipeline tab, in the Activities pane search for Web and then drag an instance of a Web activity to the design area.
-
Select the Web1 web activity, and select the Settings tab. Fill out the form as follows:
-
URL: Paste the Key Vault Secret Identifier value you copied in step 4 above, then append
?api-version=7.1
to to the end of this value. For example, it should look something like:https://asakeyvaultNNNNN.vault.azure.net/secrets/PipelineSecret/f808d4fa99d84861872010f6c8d25c68?api-version=7.1
. -
Method: Select Get.
-
Expand the Advanced section, and for Authentication select Managed Identity. We have already established an Access Policy for the Managed Service Identity of our Synapse workspace, this means that the pipeline activity has permissions to access the key vault via an HTTP call.
-
Resource: Enter https://vault.azure.net
-
-
From the Activities pane, add a Set variable activity to the design surface of the pipeline.
-
On the design surface of the pipeline, select the Web1 activity and drag a Success activity pipeline connection (green box) to the Set variable1 activity.
-
With the pipeline selected in the designer (e.g., neither of the activities are selected), select the Variables tab and add a new String parameter named SecretValue.
-
Select the Set variable1 activity and select the Variables tab. Fill out the form as follows:
-
Debug the pipeline by selecting Debug from the toolbar menu. When it runs observe the inputs and outputs of both activities from the Output tab of the pipeline.
Note: On the Web1 activity, on the General tab there is a Secure Output checkbox that when checked will prevent the secret value from being logged in plain text, for instance in the pipeline run, you would see a masked value ***** instead of the actual value retrieved from the Key vault. Any activity that consumes this value should also have their Secure Input checkbox checked.
Transparent Data Encryption (TDE) is a feature of SQL Server that provides encryption and decryption of data at rest, this includes: databases, log files, and back ups. When using this feature with Synapse Analytics dedicated SQL pools, it will use a built-in symmetric Database Encryption Key (DEK) that is provided by the pool itself. With TDE, all stored data is encrypted on disk, when the data is requested, TDE will decrypt this data at the page level as it's read into memory, and vice-versa encrypting in-memory data before it gets written back to disk. As with the name, this happens transparently without affecting any application code. When creating a dedicated SQL pool through Synapse Analytics, Transparent Data Encryption is not enabled. The first part of this task will show you how to enable this feature.
-
In the Azure Portal, open your resource group, then locate and open the
SqlPool01
dedicated SQL pool resource. -
On the SQL pool resource screen, select Transparent data encryption from the left-hand menu. DO NOT turn on data encryption.
By default, this option is turned off. When you enable data encryption on this dedicated SQL pool, the pool is taken offline for a few minutes while TDE is applied.
It is important to identify data columns of that hold sensitive information. Types of sensitive could be social security numbers, email addresses, credit card numbers, financial totals, and more. Azure Synapse Analytics allows you define permissions that prevent users or roles select privileges on specific columns.
-
In Azure Synapse Studio, select Develop from the left menu.
-
From the Develop menu, expand the SQL scripts section, and select Lab 05 - Exercise 3 - Column Level Security.
-
In the toolbar menu, connect to the database on which you want to execute the query,
SQLPool01
. -
In the query window, run each step individually by highlighting the statement(s) in the step in the query window, and selecting the Run button from the toolbar (or enter
F5
). -
You may now close the script tab, when prompted choose to Discard all changes.
-
In Azure Synapse Studio, select Develop from the left menu.
-
From the Develop menu, expand the SQL scripts section, and select Lab05 - Exercise 3 - Row Level Security.
-
In the toolbar menu, connect to the database on which you want to execute the query,
SQLPool01
. -
In the query window, run each step individually by highlighting the statement(s) for the step in the query window, and selecting the Run button from the toolbar (or enter
F5
). -
You may now close the script tab, when prompted choose to Discard all changes.
-
In Azure Synapse Studio, select Develop from the left menu.
-
From the Develop menu, expand the SQL scripts section, and select Lab05 - Exercise 3 - Dynamic Data Masking.
-
In the toolbar menu, connect to the database on which you want to execute the query,
SQLPool01
. -
In the query window, run each step individually by highlighting the statement(s) for the step in the query window, and selecting the Run button from the toolbar (or enter
F5
). -
You may now close the script tab, when prompted choose to Discard all changes.
Complete these steps to free up resources you no longer need.
-
Open Synapse Studio (https://web.azuresynapse.net/).
-
Select the Manage hub.
-
Select SQL pools in the left-hand menu (1). Hover over the name of the dedicated SQL pool and select Pause (2).
-
When prompted, select Pause.