HIPAA compliance features

Databricks relies on the built-in features of GKE to enforce encryption at rest and encryption in transit within a cluster.

This feature requires your workspace to be on the Premium pricing tier. The purchase of the HIPAA Compliance Add-on is required if you process PHI data in your account.

Ensure that sensitive information is never entered in customer-defined input fields, such as workspace names, cluster names, and job names.
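As a rough illustration of that rule, the following sketch screens a proposed workspace, cluster, or job name before you submit it. The pattern set and the function name `find_sensitive_patterns` are hypothetical and are not part of any Databricks API. The two patterns shown (US Social Security number and email address) are only examples and do not amount to a complete PHI detector.

```python
import re

# Hypothetical, non-exhaustive patterns for illustration only.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def find_sensitive_patterns(name: str) -> list[str]:
    """Return the sorted labels of every pattern that matches the proposed name."""
    return sorted(
        label
        for label, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(name)
    )
```

A name such as `etl-nightly` yields an empty list, while a name that embeds an identifier in the `123-45-6789` format is flagged as `ssn`. A check like this can catch obvious mistakes, but it does not replace a review of your naming conventions.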

HIPAA overview

The Health Insurance Portability and Accountability Act of 1996 (HIPAA), the Health Information Technology for Economic and Clinical Health (HITECH) Act, and the regulations issued under HIPAA are a set of US healthcare laws. Among other provisions, these laws establish requirements for the use, disclosure, and safeguarding of protected health information (PHI).

HIPAA applies to covered entities and business associates that create, receive, maintain, transmit, or access PHI. When a covered entity or business associate engages the services of a cloud service provider (CSP), such as Databricks, the CSP becomes a business associate under HIPAA.

HIPAA regulations require that covered entities and their business associates enter into a contract called a Business Associate Agreement (BAA) to ensure the business associates will protect PHI adequately. Among other things, a BAA establishes the permitted and required uses and disclosures of PHI by the business associate, based on the relationship between the parties and the activities and services being performed by the business associate.

Does Databricks permit the processing of PHI data on Databricks?

Databricks permits the processing of PHI data if you have a BAA with Databricks. It is your responsibility to have a BAA in place with Databricks before you process PHI data. Contact your Databricks account team for more information.

Enable HIPAA on a workspace

HIPAA compliance features on the Google Cloud platform are enabled at the account level.

If you have a Google Cloud account and your account is not enabled for HIPAA, contact your Databricks account team to upgrade your account to include HIPAA compliance features. Note that enabling HIPAA compliance features for an account is permanent.

After your Databricks account is enabled for HIPAA on Google Cloud, workspaces in the account have HIPAA compliance features for all regions. To deploy a workspace without HIPAA compliance features, you must create a separate Databricks account.

Important

  • You are wholly responsible for ensuring your own compliance with all applicable laws and regulations. Information provided in Databricks online documentation does not constitute legal advice, and you should consult your legal advisor for any questions regarding regulatory compliance.

  • Databricks does not support the use of preview features for the processing of PHI on Google Cloud, with the exception of the features listed in Preview features that are supported for processing of PHI data.

Preview features that are supported for processing of PHI data

The following preview features are supported for processing of PHI:

Shared responsibility of HIPAA compliance

HIPAA compliance involves three parties with different responsibilities: Google, Databricks, and you. While each party has numerous responsibilities, the key responsibilities of each are listed below.

This article uses the Databricks terms control plane and compute plane, which are the two main parts of how Databricks works:

  • The Databricks control plane includes the backend services that Databricks manages in its own Google Cloud account.

  • The compute plane is where your data lake is processed. The classic compute plane includes a VPC in your Google Cloud account, and clusters of compute resources to process your notebooks, jobs, and pro or classic SQL warehouses.

Key responsibilities of Google include:

  • Perform its obligations as a business associate under your BAA with Google.

  • Provide you virtual machines under your contract with Google Cloud that support HIPAA compliance.

  • Provide encryption at rest and in-transit encryption within GKE clusters that is adequate under HIPAA.

  • Delete encryption keys and data when Databricks releases the VM instances.

Key responsibilities of Databricks include:

  • Encrypt PHI data in transit to or from the control plane.

  • Encrypt PHI data at rest in the control plane.

  • Deprovision VM instances when you indicate in Databricks that they are to be deprovisioned, for example through auto-termination or manual termination, so that Google Cloud can wipe them.

Your key responsibilities include:

  • Do not use preview features within Databricks to process PHI without our written permission. The exception is the set of preview features listed in Preview features that are supported for processing of PHI data, which you can use.

  • Follow security best practices, such as disabling unnecessary egress from the compute plane and using the Databricks secrets feature (or other similar functionality) to store access keys that provide access to PHI.

  • Enter into a business associate agreement with Google Cloud to cover all data processed within the VPC where the VM instances are deployed.

  • Do not do anything within a virtual machine that would violate HIPAA, such as directing Databricks to send unencrypted PHI to an endpoint.

  • Ensure that all data that may contain PHI is encrypted at rest when you store it in locations that the Databricks platform may interact with. You are responsible for ensuring the encryption (as well as performing backups) for your buckets that Databricks creates in your account for each workspace and all other data sources.

  • Ensure that all data that may contain PHI is encrypted in transit between Databricks and any of your data storage locations or external locations you access from a compute plane machine. For example, any APIs that you use in a notebook that might connect to an external data source must use appropriate encryption on any outgoing connections.
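One way to approach the encryption-in-transit responsibility in notebook code is to refuse any outgoing connection that is not HTTPS and to verify certificates on the ones that are. The following is a minimal sketch that uses only the Python standard library. The helper names `require_https` and `fetch_encrypted` are hypothetical, and whether TLS with default certificate verification is appropriate encryption for your workload is an assumption that you need to confirm against your own compliance requirements.

```python
import ssl
from urllib.parse import urlparse
from urllib.request import Request, urlopen


def require_https(url: str) -> str:
    """Return the URL unchanged if it uses HTTPS; raise ValueError otherwise."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        # The full URL is left out of the message in case it embeds sensitive values.
        raise ValueError(f"Refusing scheme {parsed.scheme!r}; HTTPS is required")
    if not parsed.netloc:
        raise ValueError("URL has no host")
    return url


def fetch_encrypted(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL over TLS with certificate and hostname verification."""
    # create_default_context() enables certificate and hostname checks.
    context = ssl.create_default_context()
    request = Request(require_https(url))
    with urlopen(request, timeout=timeout, context=context) as response:
        return response.read()
```

Calling `require_https("http://example.com/data")` raises `ValueError` before any data leaves the compute plane. Access keys for such endpoints belong in the Databricks secrets feature (or similar functionality), as noted in the list above, rather than in the notebook source.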