
Setup Unity Catalog

Unity Catalog is a data governance solution for Databricks, designed to provide a unified approach across all of your Databricks workspaces.

Important

The Azure Databricks workspace must be on the Premium pricing tier.


Prerequisite

To enable the Databricks account console and establish your first account admin, you need the help of someone who holds the Microsoft Entra ID (formerly Azure Active Directory) Global Administrator role.

For security purposes, only someone with the Global Administrator role has permission to assign the first account admin role. After completing these steps, you can remove the Global Administrator role from the Azure Databricks account.

  1. Create a resource group
  2. Create a Premium tier Azure Databricks workspace
  3. Create an Azure Data Lake Storage Gen2 storage account and a container
  4. Create an Access Connector for Azure Databricks
  5. Grant the Storage Blob Data Contributor role on the Azure Data Lake Storage Gen2 storage account to the Access Connector for Azure Databricks
  6. Enable Unity Catalog by creating a metastore and assigning it to the workspace
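
Steps 1 to 5 can also be scripted. The sketch below only builds the corresponding Azure CLI commands as argument lists and prints them; it does not run anything. It assumes the Azure CLI with its `databricks` extension is available, and every resource name, the region, and the IDs are hypothetical placeholders. Check the flags against `az <command> --help` before running any of the commands. Step 6 is done in the Databricks account console, as described under Getting Started.

```python
import shlex


def provisioning_commands(subscription_id, resource_group, location, workspace,
                          storage_account, container, connector,
                          connector_principal_id):
    """Return the az CLI commands for steps 1-5 as argument lists (nothing is executed)."""
    # Scope of the role assignment: the resource ID of the storage account.
    storage_scope = (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    return [
        # 1. Resource group
        ["az", "group", "create", "--name", resource_group, "--location", location],
        # 2. Premium tier workspace (needs the "databricks" CLI extension)
        ["az", "databricks", "workspace", "create", "--resource-group", resource_group,
         "--name", workspace, "--location", location, "--sku", "premium"],
        # 3. Storage account with hierarchical namespace (ADLS Gen2) and a container
        ["az", "storage", "account", "create", "--name", storage_account,
         "--resource-group", resource_group, "--location", location,
         "--sku", "Standard_LRS", "--kind", "StorageV2",
         "--enable-hierarchical-namespace", "true"],
        ["az", "storage", "container", "create", "--name", container,
         "--account-name", storage_account, "--auth-mode", "login"],
        # 4. Access Connector with a system-assigned managed identity
        ["az", "databricks", "access-connector", "create",
         "--resource-group", resource_group, "--name", connector,
         "--location", location, "--identity-type", "SystemAssigned"],
        # 5. Role assignment; the principal ID is only known after step 4 has run
        ["az", "role", "assignment", "create",
         "--assignee-object-id", connector_principal_id,
         "--assignee-principal-type", "ServicePrincipal",
         "--role", "Storage Blob Data Contributor", "--scope", storage_scope],
    ]


if __name__ == "__main__":
    # All values below are made-up placeholders.
    commands = provisioning_commands(
        subscription_id="00000000-0000-0000-0000-000000000000",
        resource_group="rg-uc", location="westeurope", workspace="dbw-uc",
        storage_account="stucmetastore", container="metastore",
        connector="ac-uc", connector_principal_id="<connector-principal-id>",
    )
    for argv in commands:
        print(shlex.join(argv))
```

Building argument lists rather than one long string keeps values such as the role name, which contains spaces, correctly quoted when printed or passed to a process runner.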

Note

If you do not create a new Access Connector and rely on default provisioning instead, no managed identity is used on that Access Connector. The default Access Connector name is unity-catalog-access-connector.


Getting Started

  • Go to your Azure Databricks workspace → Click Manage Account → Log in to the account console
  • Click the Data tab → Click Create Metastore
  • Provide the information needed to create the metastore (one metastore per region):
    • Metastore name and region (as a best practice, choose the same region and resource group as the resources created above)
    • Azure Data Lake Storage Gen2 path (Example: abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path>)
    • Access Connector ID (Resource ID of your Access Connector)
  • Assign the workspace to this metastore → Click Enable Unity Catalog
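
The two values the metastore form asks for follow fixed patterns, so they can be composed rather than copied by hand. The sketch below is a minimal illustration, assuming the `abfss://` scheme for the storage path and the standard Azure resource ID layout for the Access Connector; the function names and all sample values are hypothetical.

```python
def adls_gen2_path(container: str, storage_account: str, path: str = "") -> str:
    """Build an abfss:// URI for a container in an ADLS Gen2 storage account."""
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    path = path.strip("/")  # avoid doubled or trailing slashes
    return f"{base}/{path}" if path else base


def access_connector_id(subscription_id: str, resource_group: str,
                        connector_name: str) -> str:
    """Build the Azure resource ID of an Access Connector for Azure Databricks."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Databricks/accessConnectors/{connector_name}"
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(adls_gen2_path("metastore", "mystorageacct", "unity"))
    print(access_connector_id("00000000-0000-0000-0000-000000000000",
                              "rg-databricks", "my-access-connector"))
```

The resource ID can also be copied from the Access Connector's Properties page in the Azure portal; composing it is mainly useful when the setup is automated.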

External Catalog

Create storage credentials:

  • In your Azure Databricks workspace, go to Data Explorer → External Data → Select Storage Credentials
  • Click Add and then select Add a storage credential → Select Service Principal
  • Enter a storage credential name of your choice → Provide your service principal information → Click Create

Create external location:

  • In Data Explorer, select External Locations → Click Add an external location
  • Enter the external location name and the Azure Data Lake Storage Gen2 URL → Select the storage credential you created → Click Create
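
The same external location can be created with SQL instead of the UI. The sketch below only assembles a `CREATE EXTERNAL LOCATION` statement as a string; the helper name and the sample location, URL, and credential names are hypothetical, and it assumes the storage credential already exists. In a Databricks notebook the resulting string could be passed to `spark.sql(...)`.

```python
def create_external_location_sql(name: str, url: str, credential: str) -> str:
    """Return a CREATE EXTERNAL LOCATION statement for the given names and URL."""

    def quote_identifier(value: str) -> str:
        # Backtick-quote identifiers; embedded backticks are doubled.
        return "`" + value.replace("`", "``") + "`"

    if "'" in url:
        # Keep the sketch simple: refuse URLs that would break the string literal.
        raise ValueError("URL must not contain a single quote")

    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {quote_identifier(name)} "
        f"URL '{url}' "
        f"WITH (STORAGE CREDENTIAL {quote_identifier(credential)})"
    )


if __name__ == "__main__":
    # Placeholder names for illustration only.
    print(create_external_location_sql(
        "raw_zone",
        "abfss://raw@mystorageacct.dfs.core.windows.net/",
        "raw_cred",
    ))
```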

Read More