Security, compliance, and privacy for the data lakehouse

The architectural principles of the security, compliance, and privacy pillar are about protecting a Databricks application, customer workloads, and customer data from threats. As a starting point, the Databricks Security and Trust Center provides a good overview of the Databricks approach to security.

Figure: Security, compliance, and privacy lakehouse architecture diagram for Databricks.

Principles of security, compliance, and privacy

  1. Manage identity and access using least privilege

    Identity and access management (IAM) helps you ensure that the right people can access the right resources, and only those resources. IAM covers the following aspects of authentication and authorization: account management (including provisioning), identity governance, authentication, access control (authorization), and identity federation.

  2. Protect data in transit and at rest

    Classify your data into sensitivity levels and use mechanisms such as encryption, tokenization, and access control where appropriate.

  3. Secure your network and identify and protect endpoints

    Secure your network, and monitor and protect the network integrity of internal and external endpoints by using security appliances or cloud services such as firewalls.

  4. Review the Shared Responsibility Model

    Security and compliance are a shared responsibility between Databricks, the Databricks customer, and the cloud provider. It is important to understand which party is responsible for which part.

  5. Meet compliance and data privacy requirements

    You might have internal or external requirements to control where data is stored and processed. These requirements vary based on systems design objectives, industry regulatory concerns, national law, tax implications, and culture. You might also need to obfuscate or redact personally identifiable information (PII) to meet your regulatory requirements. Where possible, automate your compliance efforts.

  6. Monitor system security

    Use automated tools to monitor your application and infrastructure. To scan your infrastructure for vulnerabilities and detect security incidents, use automated scanning in your continuous integration and continuous deployment (CI/CD) pipelines.
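As an illustration of principles 2 and 5, the following is a minimal sketch of how classifying data into sensitivity levels can drive tokenization and PII redaction. It is not a Databricks API: the column names, the sensitivity levels, the `SENSITIVITY` mapping, and the helper functions are all hypothetical, and the email pattern is deliberately simple. A production setup would more likely rely on platform features and a managed key store rather than a hard-coded key.

```python
import hashlib
import hmac
import re

# Hypothetical classification of columns into sensitivity levels (an assumption
# for this sketch, not something defined by Databricks).
SENSITIVITY = {
    "order_id": "public",
    "email": "confidential",
    "notes": "internal",
}

# Deliberately simple email pattern; real PII detection needs broader coverage.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def tokenize(value: str, key: bytes) -> str:
    """Replace a value with a deterministic keyed token (HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def redact_emails(text: str) -> str:
    """Mask anything in free text that looks like an email address."""
    return EMAIL_PATTERN.sub("[REDACTED]", text)


def protect_record(record: dict, key: bytes) -> dict:
    """Apply the protection that matches each column's sensitivity level."""
    protected = {}
    for column, value in record.items():
        # Unclassified columns default to the most restrictive level.
        level = SENSITIVITY.get(column, "confidential")
        if level == "confidential":
            protected[column] = tokenize(str(value), key)
        elif level == "internal":
            protected[column] = redact_emails(str(value))
        else:
            protected[column] = value
    return protected
```

For example, calling `protect_record({"order_id": 42, "email": "ana@example.com", "notes": "Contact ana@example.com about refund"}, key)` leaves `order_id` unchanged, replaces `email` with a 64-character hexadecimal token, and masks the address inside `notes`. Defaulting unclassified columns to the most restrictive level is a design choice in this sketch: a column that nobody has classified yet is protected rather than exposed.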

Next: Best practices for security, compliance, and privacy

See Best practices for security, compliance & privacy.