Column mask
clause
Applies to: Databricks SQL Databricks Runtime 12.2 LTS and above Unity Catalog only
Specifies a function that is applied to a column whenever rows are fetched from the table. All future queries from that column will receive the result of evaluating that function over the column in place of the column’s original value. This can be useful for fine-grained access control purposes wherein the function can inspect the identity or group memberships of the invoking user to decide whether to redact the value.
You can add column masks when you:
Create a table using CREATE TABLE.
Add columns to a table with ALTER TABLE … ADD COLUMN.
Altering a column with ALTER TABLE … ALTER COLUMN.
Important
The mask is applied as soon as each row is fetched from the data source. Any expressions, predicates, or ordering are applied after the masking. For example, joining between the masked column against another column from another table will use the masked values for the join comparison.
For more information on how to use column masks see Filter sensitive table data using row filters and column masks.
Parameters
-
A scalar SQL UDF with at least one parameter.
The first parameter of the SQL UDF maps 1:1 with the masked column. The masked column’s type must be castable to the SQL UDF parameter type. If
func_name
requires more parameters, arguments must be provided by theUSING COLUMNS
clause.The return type of the function must be castable to the data type of the masked column.
-
Optionally specifies additional columns of the masked column’s table to pass to
func_name
. Eachother_column_name
must be castable to the corresponding parameter offunc_name
.Use a column mask to selectively anonymize the value of
column_identifier
based on the user executing a query againsttable_name
, the value ofcolumn_identifier
and the optionalother_column
. constant_literal
Specifies a constant parameter with the type matching a function parameter. The following types are supported:
STRING
, numeric (INTEGER
,FLOAT,
DOUBLE
,DECIMAL
…),BOOLEAN
,INTERVAL
,NULL
.
Examples
You can find more examples in Filter sensitive table data using row filters and column masks.
-- Create a table with a masked column
> CREATE FUNCTION mask_ssn(ssn STRING) RETURN CASE WHEN is_member('HumanResourceDept') THEN ssn ELSE '***-**-****' END;
> CREATE TABLE persons(name STRING, ssn STRING MASK mask_ssn);
> INSERT INTO persons VALUES('James', '123-45-6789';
-- As a non-member of 'HumanResourceDept'
> SELECT * FROM persons;
James ***-**-****
-- As a member of 'HumanResourceDept'
> SELECT * FROM persons;
James 123-45-6789
-- Create a masking function with multiple parameters. When applied first parameter will be the column that masking function applies to
> CREATE FUNCTION mask_pii_regional(value STRING, region STRING)
RETURN IF(is_account_group_member(region || '_HumanResourceDept'), value, 'REDACTED');
-- Create a table with a masked column. Masking function first parameter will be the column that is masked.
-- The rest of the parameters should be specified in `USING COLUMNS (<columnList>)` clause
> CREATE TABLE persons(name STRING, address STRING MASK mask_pii_regional USING COLUMNS (region), region STRING);
> INSERT INTO persons('James', '160 Spear St, San Francisco', 'US')
-- As a non-member of 'US_HumanResourceDept'
> SELECT * FROM persons;
James | REDACTED | US
-- As a member of 'US_HumanResourceDept'
> SELECT * FROM persons;
James | 160 Spear St, San Francisco | US