Configure storage credentials for Delta Lake

Databricks stores data for Delta Lake tables in cloud object storage. Configuring access to cloud object storage requires permissions within the cloud account that contains your storage account. See Interact with external data on Databricks.

Pass storage credentials as DataFrame options

Delta Lake supports specifying storage credentials as options for DataFrameReader and DataFrameWriter. You might use this if you need to interact with data in several storage accounts governed by different access keys.

Note

This feature is available in Databricks Runtime 10.1 and above.

For example, you can pass your storage credentials through DataFrame options:

# Enable service account authentication for Google Cloud Storage.
spark.conf.set("google.cloud.auth.service.account.enable", "true")

# Read from the first location using the first service account's credentials.
df1 = spark.read \
  .option("fs.gs.auth.service.account.email", "<client-email-1>") \
  .option("fs.gs.project.id", "<project-id-1>") \
  .option("fs.gs.auth.service.account.private.key", "<private-key-1>") \
  .option("fs.gs.auth.service.account.private.key.id", "<private-key-id-1>") \
  .load("...")

# Read from the second location using a different service account.
df2 = spark.read \
  .option("fs.gs.auth.service.account.email", "<client-email-2>") \
  .option("fs.gs.project.id", "<project-id-2>") \
  .option("fs.gs.auth.service.account.private.key", "<private-key-2>") \
  .option("fs.gs.auth.service.account.private.key.id", "<private-key-id-2>") \
  .load("...")

# Write the combined data using a third set of credentials.
df1.union(df2).write \
  .mode("overwrite") \
  .option("fs.gs.auth.service.account.email", "<client-email-3>") \
  .option("fs.gs.project.id", "<project-id-3>") \
  .option("fs.gs.auth.service.account.private.key", "<private-key-3>") \
  .option("fs.gs.auth.service.account.private.key.id", "<private-key-id-3>") \
  .save("...")

// Enable service account authentication for Google Cloud Storage.
spark.conf.set("google.cloud.auth.service.account.enable", "true")

// Read from the first location using the first service account's credentials.
val df1 = spark.read
  .option("fs.gs.auth.service.account.email", "<client-email-1>")
  .option("fs.gs.project.id", "<project-id-1>")
  .option("fs.gs.auth.service.account.private.key", "<private-key-1>")
  .option("fs.gs.auth.service.account.private.key.id", "<private-key-id-1>")
  .load("...")

// Read from the second location using a different service account.
val df2 = spark.read
  .option("fs.gs.auth.service.account.email", "<client-email-2>")
  .option("fs.gs.project.id", "<project-id-2>")
  .option("fs.gs.auth.service.account.private.key", "<private-key-2>")
  .option("fs.gs.auth.service.account.private.key.id", "<private-key-id-2>")
  .load("...")

// Write the combined data using a third set of credentials.
df1.union(df2).write
  .mode("overwrite")
  .option("fs.gs.auth.service.account.email", "<client-email-3>")
  .option("fs.gs.project.id", "<project-id-3>")
  .option("fs.gs.auth.service.account.private.key", "<private-key-3>")
  .option("fs.gs.auth.service.account.private.key.id", "<private-key-id-3>")
  .save("...")
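
The same four options repeat for every reader and writer, so it may help to build them once per service account. The following is a minimal Python sketch rather than an official pattern: the helper name gcs_credential_options is hypothetical, and the placeholder values mirror the examples above. It relies on the options(**kwargs) method that PySpark's DataFrameReader and DataFrameWriter provide alongside option. The Spark calls are left as comments because they need a running cluster.

```python
def gcs_credential_options(client_email, project_id, private_key, private_key_id):
    """Build the per-account GCS options shown in the examples above.

    Hypothetical helper for illustration; it is not part of Delta Lake or PySpark.
    """
    return {
        "fs.gs.auth.service.account.email": client_email,
        "fs.gs.project.id": project_id,
        "fs.gs.auth.service.account.private.key": private_key,
        "fs.gs.auth.service.account.private.key.id": private_key_id,
    }


# Placeholder values, as in the examples above.
account_1 = gcs_credential_options(
    "<client-email-1>", "<project-id-1>", "<private-key-1>", "<private-key-id-1>"
)
account_3 = gcs_credential_options(
    "<client-email-3>", "<project-id-3>", "<private-key-3>", "<private-key-id-3>"
)

# In a notebook where `spark` is defined, the dictionaries can be unpacked
# into a reader or writer in a single call:
#
#   df1 = spark.read.options(**account_1).load("...")
#   df1.write.mode("overwrite").options(**account_3).save("...")
```

Keeping each account's options in one dictionary makes it harder to mix, say, the email of one service account with the private key of another.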