Set Up Unity Catalog
Unity Catalog is a data governance solution for Databricks, designed to provide a unified approach across all of your Databricks workspaces.
Important
The Azure Databricks workspace must be on the Premium pricing tier.
Prerequisite
To enable the Databricks account console and establish your first account admin, you need someone who has the Microsoft Entra ID (formerly Azure Active Directory) Global Administrator role.
For security purposes, only someone with the Global Administrator role has permission to assign the first account admin role. After completing these steps, you can remove the Global Administrator role from the Azure Databricks account.
- Create Resource Group
- Create Premium Tier Azure Databricks Workspace
- Create Azure Data Lake Storage Gen2 Storage Account and Container
- Create Access Connector for Azure Databricks
- Grant the Storage Blob Data Contributor role to the Access Connector for Azure Databricks on the Azure Data Lake Storage Gen2 Storage Account
- Enable Unity Catalog by creating a Metastore and assigning it to the Workspace
Note
If you do not create a new Access Connector and rely on default provisioning
instead, no managed identity is used on that Access Connector. The default
Access Connector name is unity-catalog-access-connector.
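The role grant listed in the steps above can also be scripted. The following is a minimal sketch, assuming the Azure CLI (`az`) is installed and signed in. The helper names and all placeholder values (subscription ID, resource group, storage account name, principal object ID) are hypothetical and not part of the original instructions. The code only builds the command so that it can be reviewed before anyone runs it.

```python
import shlex
from typing import List


def storage_account_scope(subscription_id: str, resource_group: str, account_name: str) -> str:
    """Return the Azure resource ID of a storage account, used as the role scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


def role_assignment_command(principal_object_id: str, scope: str) -> List[str]:
    """Build an `az role assignment create` command as an argument list.

    principal_object_id is assumed to be the object ID of the Access
    Connector's managed identity.
    """
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_object_id,
        "--assignee-principal-type", "ServicePrincipal",
        "--role", "Storage Blob Data Contributor",
        "--scope", scope,
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own before running the command.
    scope = storage_account_scope(
        "<subscription-id>", "<resource-group>", "<storage-account-name>"
    )
    command = role_assignment_command("<access-connector-principal-id>", scope)
    # Print a shell-quoted command line for review.
    print(shlex.join(command))
```

Returning an argument list rather than a single string keeps the role name, which contains spaces, as one argument if the command is later passed to `subprocess.run`.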
Getting Started
- In the Azure Databricks workspace, click Manage Account and log in to the account console.
- On the Data tab, click Create Metastore.
- Provide the information needed to create the metastore (one metastore per region):
  - Metastore name and region (best practice: choose the same region and resource group as the workspace)
  - Azure Data Lake Storage Gen2 path (example: abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path>)
  - Access Connector ID (the resource ID of your Access Connector)
- Assign the workspace to this metastore, then click Enable Unity Catalog.
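The storage path and the Access Connector ID requested by the metastore form both follow fixed formats, so they can be assembled from their parts. The following is a minimal sketch, assuming the `abfss://` URI scheme that Databricks uses for Azure Data Lake Storage Gen2 paths and the standard Azure resource ID layout. The function names and sample values are hypothetical.

```python
def adls_gen2_path(container: str, storage_account: str, path: str = "") -> str:
    """Build an abfss:// path for a container in an ADLS Gen2 storage account."""
    if not container or not storage_account:
        raise ValueError("container and storage_account are required")
    root = f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
    # Drop leading and trailing slashes so the result has exactly one separator.
    return root + path.strip("/")


def access_connector_id(subscription_id: str, resource_group: str, connector_name: str) -> str:
    """Return the Azure resource ID of an Access Connector for Azure Databricks."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Databricks/accessConnectors/{connector_name}"
    )


if __name__ == "__main__":
    # Placeholder values; replace them with the names of your own resources.
    print(adls_gen2_path("<container-name>", "<storage-account-name>", "<path>"))
    print(access_connector_id("<subscription-id>", "<resource-group>", "<connector-name>"))
```

Both values can then be pasted into the corresponding fields of the metastore form.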
External Catalog
Create storage credentials:
- In your Azure Databricks workspace, open Data Explorer, go to External Data, and select Storage Credentials.
- Click Add, select Add a storage credential, and then select Service Principal.
- Enter a storage credential name of your choice, provide your service principal information, and click Create.
Create external location:
- In Data Explorer, select External Locations and click Add an external location.
- Enter the external location name and the Azure Data Lake Storage Gen2 URL, select the storage credential you created, and click Create.
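An external location can also be created with SQL rather than through the UI. The following is a minimal sketch, assuming the Databricks SQL `CREATE EXTERNAL LOCATION` statement and a storage credential that already exists. The helper names, location name, and credential name are hypothetical. The code only builds the statement text and does not connect to a workspace.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def create_external_location_sql(location_name: str, url: str, credential_name: str) -> str:
    """Build a CREATE EXTERNAL LOCATION statement for an existing storage credential."""
    if "'" in url:
        # Reject quotes instead of escaping them, to keep the sketch simple.
        raise ValueError("url must not contain single quotes")
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {quote_identifier(location_name)} "
        f"URL '{url}' "
        f"WITH (STORAGE CREDENTIAL {quote_identifier(credential_name)})"
    )


if __name__ == "__main__":
    # Placeholder names; replace them with your own location, URL, and credential.
    statement = create_external_location_sql(
        "my_external_location",
        "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path>",
        "my_storage_credential",
    )
    print(statement)
```

In a Databricks notebook attached to a Unity Catalog-enabled workspace, the resulting string could be passed to `spark.sql(...)`, assuming the caller has permission to create external locations on the metastore.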