
Init Script

An init script (initialization script) is a shell script that runs during startup of each cluster node, before the Apache Spark driver or executor JVM starts.

Danger

Legacy Global and Legacy Cluster-Named init scripts run before other init scripts. These init scripts might be present in workspaces created before February 21, 2023.

Prerequisite

Suppose we want to run some setup on every node before the cluster starts, such as setting the time zone:

init_script.sh
#!/bin/bash

echo "Start running init script: adb-default"
echo "Running on the driver? $DB_IS_DRIVER"
echo "Driver IP: $DB_DRIVER_IP"

timedatectl set-timezone Asia/Bangkok
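
The script above prints DB_IS_DRIVER but does not act on it. A minimal sketch of branching on that variable follows, so that a step runs only on the driver or only on the workers. It assumes the variable holds the string TRUE on the driver node; node_role is a hypothetical helper name introduced for illustration, not part of Databricks.

```shell
#!/bin/bash

# node_role is a hypothetical helper (not provided by Databricks).
# Assumption: DB_IS_DRIVER is set to the string "TRUE" on the driver node.
node_role() {
  if [ "${DB_IS_DRIVER:-}" = "TRUE" ]; then
    echo "driver"
  else
    # Any other value, or an unset variable, is treated as a worker
    echo "worker"
  fi
}

if [ "$(node_role)" = "driver" ]; then
  echo "Driver-only setup goes here"
else
  echo "Worker-only setup goes here"
fi
```

Keeping the check in one function means the rest of the script can branch on a readable role name instead of repeating the string comparison.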

Getting Started

Cluster-Scoped init scripts

To use the UI to configure a cluster to run an init script, complete the following steps:

  • On the cluster configuration page, click the Advanced Options toggle.
  • At the bottom of the page, click the Init Scripts tab.
  • In the Destination drop-down, select the Workspace type.
  • Specify a path to the init script, such as SYS/init_script.sh, and click Add.

Note

Each user has a Home directory configured under the /Users directory in the workspace. If a user with the name user@databricks.com stored an init script called my-init.sh in their home directory, the configured path would be /Users/user@databricks.com/my-init.sh.
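
For clusters defined as JSON rather than through the UI, the same workspace path can be expressed in the cluster specification. The sketch below prints such a fragment for a script in a user's home directory. The init_scripts / workspace / destination shape is assumed from the Clusters API and should be checked against the current API reference; workspace_init_script_json is a hypothetical helper name.

```shell
#!/bin/bash

# Hypothetical helper: print an init_scripts fragment for a workspace file
# stored in a user's home directory under /Users.
# Assumption: the JSON shape follows the Clusters API; verify before use.
workspace_init_script_json() {
  local user="$1" script="$2"
  printf '{"init_scripts": [{"workspace": {"destination": "/Users/%s/%s"}}]}\n' \
    "$user" "$script"
}

# Example with the user and file name from the note above
workspace_init_script_json "user@databricks.com" "my-init.sh"
```

The printed fragment would be merged into a full cluster definition; on its own it is not a complete request body.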


Cluster-Scoped with Shared Cluster

For shared access mode, you must add init scripts to the allowlist. See Allowlist libraries and init scripts on shared compute.

  • In your Azure Databricks workspace, click Catalog.
  • Click the gear icon to open the metastore details and permissions UI.
  • Select Allowed JARs/Init Scripts, then click Add.

Warning

Init scripts use the identity of the cluster owner.


Global init scripts

  • Go to the Admin Settings and click Global Init Scripts.
  • Click Add, name the script, and enter it by typing, pasting, or dragging a text file into the Script field.
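
As an alternative to the UI steps, a global init script can in principle be registered through the REST API. The sketch below only builds a request payload and does not send it. It assumes the Global Init Scripts API expects the script body Base64-encoded in a script field alongside name and enabled; those field names, and the helper global_init_payload, are assumptions to verify against the API reference.

```shell
#!/bin/bash

# Hypothetical helper: build a JSON payload for a global init script.
# Assumptions: the API takes "name", a Base64-encoded "script", and "enabled".
# The name is inserted as-is, so it must not contain quotes or backslashes.
global_init_payload() {
  local name="$1" script_file="$2"
  local encoded
  # Strip newlines so the encoded script is a single line on every platform
  encoded="$(base64 < "$script_file" | tr -d '\n')"
  # Created disabled so the script can be reviewed before it runs anywhere
  printf '{"name": "%s", "script": "%s", "enabled": false}\n' "$name" "$encoded"
}
```

For example, `global_init_payload adb-default init_script.sh > payload.json` would write a payload for the script shown earlier. Creating the script disabled is a deliberate choice here, since a global init script runs on every cluster in the workspace.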

Read More