User-defined functions (UDFs) in Unity Catalog


This feature is in Public Preview and requires Databricks Runtime 14.1 and above.

Databricks provides a SQL-native syntax to register custom functions to schemas governed by Unity Catalog. Standard CREATE FUNCTION SQL syntax is supported for all Unity Catalog compute on Databricks. On pro SQL warehouses and in Databricks Runtime 14.1 and above, you can include Python code in functions registered to Unity Catalog.


Views registered from queries that use UDFs must use supported compute. SQL warehouses and Databricks Runtime 14.0 and below cannot resolve views that contain UDFs registered to Unity Catalog.

For the complete SQL language reference, see CREATE FUNCTION (SQL and Python).

For information about how Unity Catalog manages permissions on functions, see CREATE FUNCTION.


Python UDFs registered as functions in Unity Catalog differ in scope and support from PySpark UDFs scoped to a notebook or SparkSession. See User-defined scalar functions - Python.

Custom SQL functions in Unity Catalog

When you create a SQL function using compute configured for Unity Catalog, the function is registered to the currently active schema by default. The following example demonstrates the syntax you might use to declare a target catalog and schema for a new function:

CREATE FUNCTION target_catalog.target_schema.roll_dice()
    COMMENT 'Roll a single 6 sided die'
    RETURN (rand() * 6)::INT + 1;

All users with sufficient privileges on the function can then use the function in compute environments configured for Unity Catalog, as in the following example:

SELECT target_catalog.target_schema.roll_dice()


You can use UDFs using LANGUAGE SQL to return tables or scalar values.

Register a Python UDF to Unity Catalog

In Databricks Runtime 14.1 and above, you can use the SQL CREATE FUNCTION statement to register scalar Python UDFs to Unity Catalog.


Only pro SQL warehouses support Python UDFs for Unity Catalog.

Python UDFs are designed to provide the full expressiveness of Python directly within SQL functions, allowing for customized operations such as advanced transformations, data masking, and hashing.

Python UDFs execute in a secure, isolated environment and do not have access to file systems or internal services.

Python UDFs running in shared access mode allow TCP/UDP network traffic over ports 80, 443, and 53.

See Which UDFs are most efficient?.


Syntax and semantics for Python UDFs in Unity Catalog differ from Python UDFs registered to the SparkSession. See User-defined scalar functions - Python.

Python UDFs for Unity Catalog use statements set off by double dollar signs ($$), as in the following code example:

CREATE FUNCTION target_catalog.target_schema.greet(s STRING)
AS $$
  return f"Hello, {s}"

The following example demonstrates using this function to return greeting statements for all names stored in the first_name column of a table named students:

SELECT target_catalog.target_schema.greet(first_name)
FROM students;

You can define any number of Python functions within a Python UDF, but must return a scalar value.

Python functions must handle NULL values independently, and all type mappings must follow Databricks SQL language mappings.

You can import standard Python libraries included by Databricks, but you cannot include custom libraries or external dependencies.

If no catalog or schema is specified, Python UDFs are registered to the current active schema.

The following example imports a library and uses multiple functions within a Python UDF:

CREATE FUNCTION roll_dice(num_dice INTEGER, num_sides INTEGER)
AS $$
  import numpy as np

  def roll_die(num_sides):
    return np.random.randint(num_sides) + 1

  def sum_dice(num_dice,num_sides):
    return sum([roll_die(num_sides) for x in range(num_dice)])

  return sum_dice(num_dice, num_sides)