Skip to content

External Data Source

Database Scope Credential

A Database Credential is not mapped to a server login or database user. The credential is used by the database to access to the external location anytime the database is performing an operation that requires access.

List Credentials

SELECT * FROM [sys].[database_scoped_credentials];

Create Master Key

Create a new Master Key, ENCRYPTION to encrypt the credentials for the external data source.

-- Optional: Create MASTER KEY if not exists in database:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'P@ssW0rd'
GO

If the master key already exists on the database, you can use:

OPEN MASTER KEY DECRYPTION BY PASSWORD = 'P@ssW0rd';
...
CLOSE MASTER KEY;

Create Credential

CREATE DATABASE SCOPED CREDENTIAL <credential-name>
WITH IDENTITY = 'Managed Identity'
GO
-- authority-url: `https://login.microsoftonline.com/<tenant-id>/oauth2/token`
CREATE DATABASE SCOPED CREDENTIAL <credential-name>
WITH IDENTITY = '<client-id>@<authority-url>',
    SECRET = '<client-secret>'
GO
-- The secret value must remove the leading '?'
CREATE DATABASE SCOPED CREDENTIAL <credential-name>
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
    SECRET = 'sv=2018-03-28&ss=bfqt&...&sig=lQHczN...'
GO

And the permission of User, Managed Identity, or Service Principle that want to access data on target external data source should be any role in Storage Blob Data Owner/Contributor/Reader roles in order for the application to access the data via RBAC in Azure Portal.


External Data Source

List Data Source

SELECT * FROM [sys].[external_data_sources];

Create Data Source

CREATE EXTERNAL DATA SOURCE [<external-data-source>]
WITH(
    LOCATION = 'abfss://<container>@<storage-account>.dfs.core.windows.net',
    CREDENTIAL = <credential-name>,
    PUSHDOWN = ON,
    TYPE = HADOOP
);

Note

PolyBase data virtualization is used when the EXTERNAL DATA SOURCE is created with TYPE=HADOOP.

PUSHDOWN = ON | OFF is set to ON by default, meaning the ODBC Driver can leverage server-side processing for complex queries.

CREATE EXTERNAL DATA SOURCE [<external-data-source>]
WITH(
    LOCATION = 'abfss://<container>@<storage-account>.dfs.core.windows.net',
    CREDENTIAL = <credential-name>
);

For the LOCATION, it provide the connectivity protocol and path to the external data source. See More Supported Protocol

Note

If you want to use Azure AD for access an external data source you can use:

-- The Permission from this solution will up to user that want to access
-- target external data source.
CREATE EXTERNAL DATA SOURCE [<external-data-source>]
WITH (
    LOCATION  = 'https://<storage_account>.dfs.core.windows.net/<container>/<path>'
)

External File Format

List File Format

SELECT * FROM [sys].[external_file_formats]

Create File Format

CREATE EXTERNAL FILE FORMAT <parquet_snappy>
WITH (
    FORMAT_TYPE = PARQUET,
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);
CREATE EXTERNAL FILE FORMAT <skip_header_csv>
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS(
        FIELD_TERMINATOR    = ',',
        STRING_DELIMITER    = '"',
        FIRST_ROW           = 2,
        USE_TYPE_DEFAULT    = True
    )
);
CREATE EXTERNAL FILE FORMAT <json-format>
WITH (
    FORMAT_TYPE = JSON,
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);
CREATE EXTERNAL FILE FORMAT <delta-format>
WITH (
    FORMAT_TYPE = DELTA
);

Examples

CREATE EXTERNAL TABLE [<schema-name>].[<external-table-name>]
(
    [PurposeId] [varchar](max),
    [RetireOnDate] [datetime],
    [CreatedDate] [datetime]
)
WITH (
    DATA_SOURCE = [<data-source-name>],
    FILE_FORMAT = [<external-file-format>],
    LOCATION = N'/path/of/data/date=20240708'
)
GO

Example

USE [master];
CREATE LOGIN [username] WITH PASSWORD = 'P@ssW0rd';
GO

USE [database];
CREATE USER [username] FROM LOGIN [username];
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[credential-name] TO [username];
GO
IF NOT EXISTS (
    SELECT *
    FROM [sys].[external_data_sources]
    WHERE [name] = '<external-data-source-name>'
)
    CREATE EXTERNAL DATA SOURCE <external-data-source-name>
    WITH (
        CREDENTIAL = <credential-name>,
        LOCATION = 'abfss://<container>@<storage-account>.dfs.core.windows.net'
    )
GO
CREATE OR ALTER VIEW [CURATED].[<view-name>]
AS
    SELECT *
    FROM OPENROWSET(
        BULK '/delta_silver/<delta-table-name>',
        DATA_SOURCE = '<external-data-source-name>',
        FORMAT = 'DELTA'
) AS [r]
GO

GRANT SELECT ON OBJECT::[CURATED].[<view-name>] TO <user-name>
GO

Read Mores