Connect to Google Services
Authentication
BigQuery
Using JSON Encoding
import base64

# Read the service-account JSON key from a Databricks secret scope
credentials_json_str: str = dbutils.secrets.get(scope="<scope-name>", key="<secret-key-name>")

df = (
    spark.read
    .format("bigquery")
    # The connector expects the key as a base64-encoded string
    .option("credentials", base64.b64encode(credentials_json_str.encode()).decode("utf-8"))
    .option("parentProject", "<project-id>")
    .option("table", "<dataset>.<table-name>")
    .load()
)
df.show()
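The "credentials" option above must be the base64-encoded JSON key. A minimal sketch (using a hypothetical inline key string; in practice it comes from dbutils.secrets.get as shown above) to verify the encoding round-trips to valid JSON before handing it to Spark:

```python
import base64
import json

# Hypothetical service-account key for illustration only
credentials_json_str = '{"type": "service_account", "project_id": "my-project"}'

# Encode exactly as passed to the "credentials" option
encoded = base64.b64encode(credentials_json_str.encode()).decode("utf-8")

# Sanity check: decoding restores valid JSON with the expected fields
decoded = json.loads(base64.b64decode(encoded))
print(decoded["type"])
```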
Using GOOGLE_APPLICATION_CREDENTIALS
Using a Filepath
Point the connector at a service-account key file available on the cluster:
df = (
    spark.read
    .format("bigquery")
    # Path to the service-account key file on the cluster
    .option("credentialsFile", "</path/to/key/file>")
    .option("table", "<dataset>.<table-name>")
    .load()
)
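Instead of the credentialsFile option, the same key file can be exposed cluster-wide through the GOOGLE_APPLICATION_CREDENTIALS environment variable (for example, in the cluster's environment-variable settings); the path below is the same placeholder as above:

```shell
# Standard Google Cloud env var; the connector picks it up automatically
GOOGLE_APPLICATION_CREDENTIALS=</path/to/key/file>
```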
Access Token
# Globally
spark.conf.set("gcpAccessToken", "<access-token>")
# Per read/write
df = (
    spark.read
    .format("bigquery")
    .option("gcpAccessToken", "<access-token>")
    .option("table", "<dataset>.<table-name>")
    .load()
)
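One common way to obtain a short-lived token for the gcpAccessToken option is the gcloud CLI (assuming gcloud is installed and authenticated):

```shell
# Prints an OAuth2 access token for the active gcloud account
gcloud auth print-access-token
```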
References:
- https://docs.databricks.com/en/external-data/bigquery.html#step-2-set-up-databricks
- https://github.com/GoogleCloudDataproc/spark-bigquery-connector