Security, compliance, and privacy for the data lakehouse
The architectural principles of the security, compliance, and privacy pillar focus on protecting a Databricks application, customer workloads, and customer data from threats. As a starting point, the Databricks Security and Trust Center provides an overview of the Databricks approach to security.
Principles of security, compliance, and privacy
Manage identity and access using least privilege
Identity and access management (IAM) helps you ensure that the right people can access the right resources. IAM covers the following aspects of authentication and authorization: account management (including provisioning), identity governance, authentication, access control (authorization), and identity federation.
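As a minimal sketch of what least privilege means in practice, the following Python snippet models a deny-by-default authorization check: a principal can act on a resource only if an explicit grant exists. The `AccessPolicy` class, the principal and resource names, and the privilege strings are hypothetical and chosen for illustration; this is not a Databricks API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One explicit permission: a principal may use a privilege on a resource."""
    principal: str
    resource: str
    privilege: str


class AccessPolicy:
    """Hypothetical deny-by-default policy: nothing is allowed unless granted."""

    def __init__(self):
        self._grants = set()

    def grant(self, principal, resource, privilege):
        # Record exactly one privilege on exactly one resource.
        self._grants.add(Grant(principal, resource, privilege))

    def revoke(self, principal, resource, privilege):
        # Removing a grant that does not exist is a no-op.
        self._grants.discard(Grant(principal, resource, privilege))

    def is_allowed(self, principal, resource, privilege):
        # Access requires an exact matching grant; there are no wildcards.
        return Grant(principal, resource, privilege) in self._grants


# Illustrative names only (assumptions, not taken from the source).
policy = AccessPolicy()
policy.grant("analysts", "sales.orders", "SELECT")
```

With this policy, `policy.is_allowed("analysts", "sales.orders", "SELECT")` is true, while a request for any privilege or resource that was not explicitly granted is denied.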
Protect data in transit and at rest
Classify your data into sensitivity levels and use mechanisms such as encryption, tokenization, and access control where appropriate.
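One way to picture this is a small mapping from columns to sensitivity levels that drives which protection is applied. The sketch below is an assumption-laden illustration: the level names, the column classification, and the `protect_row` helper are hypothetical, and the "tokenization" shown is a keyed hash (HMAC-SHA256), which yields stable, non-reversible tokens rather than the reversible, vault-backed tokenization some systems use.

```python
import hashlib
import hmac
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


# Hypothetical classification of columns; real classifications are
# specific to your data and regulatory context.
COLUMN_SENSITIVITY = {
    "country": Sensitivity.PUBLIC,
    "order_total": Sensitivity.INTERNAL,
    "email": Sensitivity.CONFIDENTIAL,
}


def tokenize(value, key):
    """Replace a value with a keyed-hash token (deterministic, not reversible)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def protect_row(row, key):
    """Return a copy of the row with confidential columns tokenized."""
    protected = {}
    for column, value in row.items():
        # Unclassified columns are treated as confidential (fail closed).
        level = COLUMN_SENSITIVITY.get(column, Sensitivity.CONFIDENTIAL)
        if level is Sensitivity.CONFIDENTIAL:
            protected[column] = tokenize(str(value), key)
        else:
            protected[column] = value
    return protected
```

In a real deployment the key would come from a secret manager rather than from source code, and encryption and access control would apply in addition to tokenization.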
Secure your network and identify and protect endpoints
Secure your network, and monitor and protect the integrity of internal and external endpoints by using security appliances or cloud services such as firewalls.
Review the Shared Responsibility Model
Security and compliance are a shared responsibility between Databricks, the Databricks customer, and the cloud provider. It is important to understand which party is responsible for which part.
Meet compliance and data privacy requirements
You might have internal or external requirements to control where data is stored and processed. These requirements vary based on system design objectives, industry regulations, national law, tax implications, and culture. Be mindful that you might need to obfuscate or redact personally identifiable information (PII) to meet your regulatory requirements. Where possible, automate your compliance efforts.
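As a rough illustration of automated redaction, the following Python sketch masks two kinds of PII in free text. The patterns are simplified assumptions (a basic email shape and a US Social Security number shape), and `redact_pii` is a hypothetical helper; production redaction usually needs broader, locale-aware detection.

```python
import re

# Simplified patterns, assumed for illustration only.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_pii(text):
    """Replace email addresses and SSN-shaped numbers with placeholders."""
    text = EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)
    return SSN_PATTERN.sub("[REDACTED_SSN]", text)
```

For example, `redact_pii("Contact jane.doe@example.com, SSN 123-45-6789.")` returns `"Contact [REDACTED_EMAIL], SSN [REDACTED_SSN]."`. Running a step like this inside an ingestion job is one way to automate part of a compliance process.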
Monitor system security
Use automated tools to monitor your application and infrastructure. To scan your infrastructure for vulnerabilities and detect security incidents, use automated scanning in your continuous integration and continuous deployment (CI/CD) pipelines.
Next: Best practices for security, compliance, and privacy
See Best practices for security, compliance & privacy.