Study notes
This is a massive subject (it spans a whole Azure certification), so these notes only touch on it superficially.
Azure Machine Learning (Azure ML) provides several mechanisms that offer granular control over:
- network environment
- resources used within it
- data being accessed.
Azure Role-Based Access Control (Azure RBAC) is an authorization system that allows fine-grained access management of Azure Machine Learning resources. It enables you to manage team members' access to Azure cloud resources by assigning roles.
User roles
Azure RBAC roles can be assigned to individuals and groups.
The rights assigned to each group are allocated as a permission-based system, with set access and restrictions being clearly defined.
This control is applied at the workspace level and can only be changed by administrators or owners of the specific workspace within Azure Machine Learning.
- Owners
have full access to the workspace, including the ability to view, create, edit, or delete assets in a workspace. Owners can also change role assignments.
- Contributors
can view, create, edit, or delete assets in a workspace. For example, they can create an experiment, create or attach a compute cluster, submit a run, and deploy a web service.
- Readers
can only perform read-only actions in the workspace. They can list and view assets, including datastore credentials, but they can't create or update these assets.
- Custom roles
If the default roles do not meet your organization's need for more selective access control, you can create your own custom roles.
Custom roles are created by defining the permitted operations in [Actions] and the operations to exclude in [NotActions].
{
    "Name": "Data Scientist Custom",
    "IsCustom": true,
    "Description": "Can run experiment but can't create or delete compute or deploy production endpoints.",
    "Actions": [
        "Microsoft.MachineLearningServices/workspaces/*/read",
        "Microsoft.MachineLearningServices/workspaces/*/action",
        "Microsoft.MachineLearningServices/workspaces/*/delete",
        "Microsoft.MachineLearningServices/workspaces/*/write"
    ],
    "NotActions": [
        "Microsoft.MachineLearningServices/workspaces/delete",
        "Microsoft.MachineLearningServices/workspaces/write",
        "Microsoft.MachineLearningServices/workspaces/computes/*/write",
        "Microsoft.MachineLearningServices/workspaces/computes/*/delete",
        "Microsoft.Authorization/*",
        "Microsoft.MachineLearningServices/workspaces/computes/listKeys/action",
        "Microsoft.MachineLearningServices/workspaces/listKeys/action",
        "Microsoft.MachineLearningServices/workspaces/services/aks/write",
        "Microsoft.MachineLearningServices/workspaces/services/aks/delete",
        "Microsoft.MachineLearningServices/workspaces/endpoints/pipelines/write"
    ],
    "AssignableScopes": [
        "/subscriptions/<subscription_id>/resourceGroups/<resource_group_name>/providers/Microsoft.MachineLearningServices/workspaces/<workspace_name>"
    ]
}
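To get a feel for how Actions and NotActions combine, here is a minimal sketch that treats the effective permissions as everything matched by Actions minus anything matched by NotActions. It is a simplified local model, not Azure's real evaluator: the helper names (`_matches`, `is_allowed`) are made up for these notes, wildcards are approximated with Python's `fnmatch`, and matching is assumed to be case-insensitive.

```python
from fnmatch import fnmatchcase

PREFIX = "Microsoft.MachineLearningServices/workspaces"

# Copied from the custom role definition above.
ACTIONS = [
    f"{PREFIX}/*/read",
    f"{PREFIX}/*/action",
    f"{PREFIX}/*/delete",
    f"{PREFIX}/*/write",
]
NOT_ACTIONS = [
    f"{PREFIX}/delete",
    f"{PREFIX}/write",
    f"{PREFIX}/computes/*/write",
    f"{PREFIX}/computes/*/delete",
    "Microsoft.Authorization/*",
    f"{PREFIX}/computes/listKeys/action",
    f"{PREFIX}/listKeys/action",
    f"{PREFIX}/services/aks/write",
    f"{PREFIX}/services/aks/delete",
    f"{PREFIX}/endpoints/pipelines/write",
]


def _matches(operation, patterns):
    """Return True if the operation matches any wildcard pattern (case-insensitive)."""
    op = operation.lower()
    return any(fnmatchcase(op, pattern.lower()) for pattern in patterns)


def is_allowed(operation, actions=ACTIONS, not_actions=NOT_ACTIONS):
    """Allowed = matched by Actions and not excluded by NotActions."""
    return _matches(operation, actions) and not _matches(operation, not_actions)


if __name__ == "__main__":
    for op in (
        f"{PREFIX}/experiments/runs/write",
        f"{PREFIX}/services/aks/write",
        "Microsoft.Authorization/roleAssignments/write",
    ):
        print(op, "->", "allowed" if is_allowed(op) else "denied")
```

The point of the sketch is only the subtraction: a broad wildcard in Actions grants a lot, and NotActions carves specific operations back out of it.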
Now we get to Azure Active Directory (Azure AD).
Azure AD is a cloud-based identity and access management service that helps your employees sign in and access cloud resources on Azure.
There are four authentication workflows that you can use when connecting to the workspace (in Azure ML, the workspace is where access restrictions are enforced):
- Interactive
You use your account in Azure Active Directory to either manually authenticate or obtain an authentication token. Interactive authentication is used during experimentation and iterative development. It enables you to control access to resources (such as a web service) on a per-user basis.
- Service principal
You create a service principal account in Azure Active Directory, and use it to authenticate or obtain an authentication token. A service principal is used when you need an automated process to authenticate to the service. For example, a continuous integration and deployment script that trains and tests a model every time the training code changes needs ongoing access, and so would benefit from a service principal account.
- Azure CLI session
You use an active Azure CLI session to authenticate. Azure CLI authentication is used during experimentation and iterative development, or when you need an automated process to authenticate to the service using a pre-authenticated session. You can log in to Azure via the Azure CLI on your local workstation, without storing credentials in code or prompting the user to authenticate.
- Managed identity
When using the Azure Machine Learning SDK on an Azure Virtual Machine, you can use a managed identity for Azure. This workflow allows the VM to connect to the workspace using the managed identity, without storing credentials in code or prompting the user to authenticate. Azure Machine Learning compute clusters can also be configured to use a managed identity to access the workspace when training models.
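The four workflows listed above map onto authentication classes in the `azureml.core.authentication` module of the Azure ML SDK (v1). The sketch below is only one possible way to choose between them: the environment variable names (`AML_TENANT_ID`, `AML_SP_ID`, `AML_SP_PASSWORD`, `AML_USE_MANAGED_IDENTITY`, `AML_USE_CLI_SESSION`) and both helper functions are assumptions made for these notes, not part of the SDK.

```python
import os


def choose_auth_workflow(env):
    """Pick a workflow name from environment hints (hypothetical variable names)."""
    if all(env.get(k) for k in ("AML_TENANT_ID", "AML_SP_ID", "AML_SP_PASSWORD")):
        return "service_principal"
    if env.get("AML_USE_MANAGED_IDENTITY") == "1":
        return "managed_identity"
    if env.get("AML_USE_CLI_SESSION") == "1":
        return "azure_cli"
    # Fall back to a manual sign-in, as used during experimentation.
    return "interactive"


def build_auth(workflow, env):
    """Create the matching authentication object from the Azure ML SDK (v1)."""
    # Imported lazily so the selection logic above runs without azureml-core installed.
    from azureml.core.authentication import (
        AzureCliAuthentication,
        InteractiveLoginAuthentication,
        MsiAuthentication,
        ServicePrincipalAuthentication,
    )

    if workflow == "service_principal":
        return ServicePrincipalAuthentication(
            tenant_id=env["AML_TENANT_ID"],
            service_principal_id=env["AML_SP_ID"],
            service_principal_password=env["AML_SP_PASSWORD"],
        )
    if workflow == "managed_identity":
        return MsiAuthentication()
    if workflow == "azure_cli":
        return AzureCliAuthentication()
    return InteractiveLoginAuthentication()


if __name__ == "__main__":
    print("Selected workflow:", choose_auth_workflow(dict(os.environ)))
```

Assuming azureml-core is installed, the returned object could then be passed to the workspace, for example `Workspace.from_config(auth=build_auth(workflow, os.environ))`.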
Once you have a managed identity, you can request tokens via a token endpoint on a resource such as a virtual machine.
Azure Machine Learning compute clusters can use managed identities to authenticate access to Azure resources within Azure Machine Learning without including credentials in your code.
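As a rough sketch of what "requesting a token via a token endpoint" looks like on a virtual machine, the code below builds (but does not send) a request to the Azure Instance Metadata Service (IMDS) identity endpoint. The function name `build_token_request` is made up for these notes, and the example resource URI is just an illustration; the request can only succeed when sent from inside an Azure resource that has a managed identity.

```python
from urllib.parse import urlencode
from urllib.request import Request

# IMDS is reachable only from inside the Azure resource itself.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(resource, client_id=None):
    """Build a GET request asking IMDS for a token for the given resource.

    client_id is optional and selects a specific user-assigned identity.
    """
    params = {"api-version": "2018-02-01", "resource": resource}
    if client_id:
        params["client_id"] = client_id
    # The "Metadata: true" header is required by the endpoint.
    return Request(f"{IMDS_TOKEN_URL}?{urlencode(params)}", headers={"Metadata": "true"})


if __name__ == "__main__":
    request = build_token_request("https://management.azure.com/")
    print(request.get_method(), request.full_url)
```

On a VM with a managed identity, passing this request to `urllib.request.urlopen` should return a JSON document that includes an `access_token` field. In practice, `DefaultAzureCredential` from azure-identity handles this exchange for you.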
There are two types of managed identities:
- System-assigned
Some Azure services allow you to enable a managed identity directly on a service instance. When you enable a system-assigned managed identity, an identity is created in Azure AD tied to that service instance's lifecycle. By design, only that Azure resource can use this identity to request tokens from Azure AD, and when the resource is deleted, Azure automatically deletes the identity for you.
- User-assigned
You may also create a managed identity as a standalone Azure resource. You can create a user-assigned managed identity and assign it to one or more instances of an Azure service. The identity is managed separately from the resources that use it and will persist if a resource using it is removed. For simplicity, we recommend using system-assigned identities unless you require a custom access solution.
Azure Key Vault provides secure storage of generic secrets for applications in Azure-hosted environments. Any type of secret can be stored, as long as its value is no larger than 25 KB and it can be read and returned as a string. Secrets are named, and their content type (such as password or certificate) can optionally be stored alongside the value as a hint for interpreting it when it is retrieved.
Secrets stored in Azure Key Vault are encrypted, optionally at the hardware level. This is handled transparently, and requires no action from the user or the application requesting the secrets. They can also be temporarily disabled, and automatically activate or expire on a certain date.
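The constraints described above (a size limit, an enabled flag, and activation and expiry dates) can be modeled locally as a small sketch. This is not the Key Vault API: the class `SecretAttributes` and both helpers are hypothetical, the field names merely mirror the `enabled`, `not_before`, and `expires_on` properties exposed by azure-keyvault-secrets, and reading "25 KB" as 25 * 1024 bytes is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: the 25 KB limit is interpreted as 25 * 1024 bytes of UTF-8 data.
MAX_SECRET_BYTES = 25 * 1024


@dataclass
class SecretAttributes:
    """Local stand-in for the attributes attached to a secret."""
    enabled: bool = True
    not_before: Optional[datetime] = None  # secret activates at this time
    expires_on: Optional[datetime] = None  # secret expires at this time


def fits_size_limit(value: str) -> bool:
    """Check the encoded size of a secret value against the assumed limit."""
    return len(value.encode("utf-8")) <= MAX_SECRET_BYTES


def is_usable(attrs: SecretAttributes, now: datetime) -> bool:
    """A secret is usable if it is enabled, already active, and not yet expired."""
    if not attrs.enabled:
        return False
    if attrs.not_before is not None and now < attrs.not_before:
        return False
    if attrs.expires_on is not None and now >= attrs.expires_on:
        return False
    return True


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    rotating = SecretAttributes(expires_on=now + timedelta(days=30))
    print("fits:", fits_size_limit("xxxxxxxx"), "usable:", is_usable(rotating, now))
```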
When you create an Azure Machine Learning workspace, a Key Vault is created automatically. To view the Azure Key Vault associated with your workspace, open the workspace's Overview tab.
For example, you can use Azure Cloud Shell to set an environment variable holding the Key Vault's name and save a password to that vault:
# Export the name of the vault to an environment variable
export KEY_VAULT_NAME="<your-unique-keyvault-name>"
# Save a new secret called ExamplePassword
az keyvault secret set --vault-name "$KEY_VAULT_NAME" --name "ExamplePassword" --value "xxxxxxxx"
This password is stored securely and is encrypted. As an example, a Python application can read this secret using the azure-identity and azure-keyvault-secrets packages:
'''
Simple example of obtaining a secret from the keyvault.
Assumes azure-identity and azure-keyvault-secrets have been
pip installed
'''
import os
from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential
# Get the key vault name
keyVaultName = os.environ["KEY_VAULT_NAME"]
# Create a client to access the secret
credential = DefaultAzureCredential()
client = SecretClient(vault_url=f"https://{keyVaultName}.vault.azure.net", credential=credential)
# Get a secret and print it to the console
# Note that printing out passwords is bad practice and only
# performed here for learning purposes
retrieved_secret = client.get_secret("ExamplePassword")
print(f"Your secret is '{retrieved_secret.value}'")
Secure your Azure Machine Learning network
To secure the Azure Machine Learning workspace and compute resources, we will use a virtual network (VNet).
Azure VNet
- Fundamental building block for your private network in Azure.
- Enables Azure resources, such as Azure Blob Storage and Azure Container Registry, to securely communicate with each other, the internet, and on-premises networks.
- Is similar to a traditional network that you'd operate in your own data center, but brings with it additional benefits of Azure's infrastructure such as scale, availability, and isolation.
A virtual network (VNet) is composed of:
- IP address space
When creating a VNet, you must specify a custom private IP address space using public and private (RFC 1918) addresses.
- Subnets
Shown above as separate virtual machines (VMs), subnets enable you to segment the virtual network into one or more sub-networks and allocate a portion of the virtual network's address space to each subnet, enhancing security and performance.
- Network interfaces (NIC)
are the interconnection between a VM and a virtual network (VNet). When you create a VM in the Azure portal, a network interface is automatically created for you.
- Network security groups (NSG)
can contain multiple inbound and outbound security rules that enable you to filter traffic to and from resources by source and destination IP address, port, and protocol.
- Load balancers
can be configured to efficiently handle inbound and outbound traffic to VMs and VNets, while also offering metrics to monitor the health of VMs.
- Service endpoints
provide the identity of your virtual network to the Azure service. Once you enable service endpoints in your virtual network, you can add a virtual network rule to secure the Azure service resources to your virtual network. Service endpoints use public IP addresses.
- Private endpoints
are network interfaces that securely connect you to a service powered by Azure Private Link. A private endpoint uses a private IP address from your VNet, effectively bringing the Azure services into your VNet.
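To make the address space and subnet ideas above a bit more concrete, here is a small sketch using Python's standard `ipaddress` module. The address range `10.0.0.0/16`, the subnet names, and the helper `owning_subnet` are illustrative assumptions, not values taken from any real deployment.

```python
import ipaddress

# Hypothetical VNet address space taken from the RFC 1918 private ranges.
vnet = ipaddress.ip_network("10.0.0.0/16")

# Carve the VNet into /24 subnets and name the first two.
all_subnets = list(vnet.subnets(new_prefix=24))
subnets = {"training": all_subnets[0], "scoring": all_subnets[1]}


def owning_subnet(address, candidates):
    """Return the name of the subnet that contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in candidates.items():
        if ip in network:
            return name
    return None


if __name__ == "__main__":
    print("VNet is private:", vnet.is_private)
    for name, network in subnets.items():
        print(f"{name}: {network} ({network.num_addresses} addresses)")
    print("10.0.1.7 belongs to:", owning_subnet("10.0.1.7", subnets))
```

Each subnet gets a non-overlapping slice of the VNet's address space, which is what lets rules and routing be applied per subnet.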
The following methods can be used to connect to the secure workspace:
- Azure VPN gateway
Connects on-premises networks to the VNet over a private connection. Connection is made over the public internet. There are two types of VPN gateways that you might use:
- Point-to-site: Each client computer uses a VPN client to connect to the VNet.
- Site-to-site: A VPN device connects the VNet to your on-premises network.
- ExpressRoute
Connects on-premises networks into the cloud over a private connection. Connection is made using a connectivity provider.
- Azure Bastion
In this scenario, you create an Azure Virtual Machine (sometimes called a jump box) inside the VNet. You then connect to the VM using Azure Bastion. Bastion allows you to connect to the VM using either an RDP or SSH session from your local web browser. You then use the jump box as your development environment. Since it is inside the VNet, it can directly access the workspace.
Service tags
A service tag represents a group of IP address prefixes from a given Azure service.
Microsoft manages the address prefixes encompassed by the service tag and automatically updates the service tag as addresses change, minimizing the complexity of frequent updates to network security rules.
You can use service tags in place of specific IP addresses when you create security rules to define network access controls on network security groups or Azure Firewall.
By specifying the service tag name, such as ApiManagement, in the appropriate source or destination field of a rule, you can allow or deny the traffic for the corresponding service.
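The sketch below illustrates how a service tag can stand in for a list of IP prefixes inside a security rule. It is a toy model rather than Azure's implementation: the `Rule` class and the helper functions are hypothetical, and the prefixes assigned to `ApiManagement` are made-up documentation ranges, not the real ones Microsoft publishes. It follows the NSG convention that rules are processed in priority order (lowest number first) and that evaluation stops at the first match.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

# Hypothetical prefixes; real service tag ranges are maintained by Microsoft.
SERVICE_TAGS = {"ApiManagement": ["203.0.113.0/24"]}


@dataclass(frozen=True)
class Rule:
    priority: int        # lower numbers are evaluated first
    source: str          # a service tag name, a CIDR prefix, or "*"
    port: Optional[int]  # None means any port
    allow: bool


def source_matches(source, ip):
    """Expand a service tag into its prefixes, then test membership."""
    if source == "*":
        return True
    prefixes = SERVICE_TAGS.get(source, [source])
    return any(ip in ipaddress.ip_network(prefix) for prefix in prefixes)


def evaluate(rules, source_ip, port, default_allow=False):
    """Return the decision of the first matching rule in priority order."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if source_matches(rule.source, ip) and rule.port in (None, port):
            return rule.allow
    return default_allow


if __name__ == "__main__":
    rules = [
        Rule(priority=100, source="ApiManagement", port=443, allow=True),
        Rule(priority=200, source="*", port=None, allow=False),
    ]
    print("tagged source on 443:", evaluate(rules, "203.0.113.10", 443))
    print("other source on 443:", evaluate(rules, "198.51.100.5", 443))
```

If the prefixes behind the tag change, only the `SERVICE_TAGS` table changes; the rules themselves stay the same, which is the maintenance benefit service tags are meant to provide.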
Private endpoints & Private Link
The Azure Machine Learning workspace can use Azure Private Link to create a private endpoint behind the VNet.
- Private endpoint
a network interface connected to your virtual network and assigned a private IP address. It is used to connect privately and securely to a service powered by Azure Private Link or a Private Link Service that you or a partner might own.
- Private Link Service
your own service, powered by Azure Private Link, that runs behind an Azure Standard Load Balancer and is enabled for Private Link access. This service can be privately connected to and consumed using private endpoints deployed in the user's virtual network.