Serverless compute limitations

Preview

This feature is in Private Preview. For information on eligibility and enablement, see Enable serverless compute.

This article explains the current limitations of serverless compute for notebooks and jobs. It starts with an overview of the most important considerations and then provides a comprehensive reference list of limitations.

Limitations overview

Before you create new workloads or migrate existing workloads to serverless compute, consider the following limitations:

  • Python and SQL are the only supported languages.

  • Only Spark Connect APIs are supported. Spark RDD APIs are not supported.

  • JAR libraries are not supported. For workarounds, see Best practices for serverless compute.

  • Serverless compute has unrestricted access for all workspace users.

  • Notebook tags are not supported.

  • For streaming, only incremental batch logic can be used. There is no support for default or time-based trigger intervals. See Streaming limitations.
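
For example, a minimal sketch of an incremental batch streaming write uses the availableNow trigger, which processes all currently available data and then stops. The table names and checkpoint path below are hypothetical:

    # Incremental batch: process everything available now, then stop.
    # The source table, target table, and checkpoint path are placeholders.
    df = spark.readStream.table("examples.default.source_events")

    (df.writeStream
       .option("checkpointLocation", "/Volumes/examples/default/checkpoints/source_events")
       .trigger(availableNow=True)  # default and time-based triggers are not supported
       .toTable("examples.default.target_events"))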

Limitations reference list

The following sections list the current limitations of serverless compute.

Serverless compute is based on the shared compute architecture. The most relevant limitations inherited from shared compute are listed below, along with additional serverless-specific limitations. For a full list of shared compute limitations, see Compute access mode limitations for Unity Catalog.

General limitations

  • Scala and R are not supported.

  • ANSI SQL is the default when writing SQL. Opt out of ANSI mode by setting spark.sql.ansi.enabled to false (an example follows this list).

  • Spark RDD APIs are not supported.

  • Spark Context (sc), spark.sparkContext, and sqlContext are not supported.

  • The web terminal is not supported.

  • No query can run longer than 48 hours.

  • You must use Unity Catalog to connect to external data sources. Use external locations to access cloud storage (an example follows this list).

  • Support for data sources is limited to AVRO, BINARYFILE, CSV, DELTA, JSON, KAFKA, ORC, PARQUET, TEXT, and XML.

  • User-defined functions (UDFs) cannot access the internet.

  • Individual rows must not exceed a maximum size of 128 MB.

  • The Spark UI is not available. Instead, use the query profile to view information about your Spark queries. See Query profile.

  • Python clients that use Databricks endpoints may encounter SSL verification errors such as “CERTIFICATE_VERIFY_FAILED”. To work around these errors, configure the client to trust the CA file located in /etc/ssl/certs/ca-certificates.crt. For example, run the following at the beginning of a serverless notebook or job:

        import os
        os.environ['SSL_CERT_FILE'] = '/etc/ssl/certs/ca-certificates.crt'

  • Cross-workspace API requests are not supported.
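
A minimal sketch of the ANSI mode opt-out mentioned above, run from a Python cell (the equivalent SQL statement is SET spark.sql.ansi.enabled = false):

    # Opt out of ANSI SQL mode for the current session.
    spark.conf.set("spark.sql.ansi.enabled", "false")

    # Confirm the setting took effect.
    print(spark.conf.get("spark.sql.ansi.enabled"))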
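
A similar sketch for accessing cloud storage through a Unity Catalog external location; the bucket and path shown are hypothetical and must be covered by an external location you have access to:

    # Read files from cloud storage governed by a Unity Catalog external location.
    # The storage path below is a placeholder.
    df = (spark.read
          .format("csv")
          .option("header", "true")
          .load("s3://example-bucket/landing/orders/"))

    display(df)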

Streaming limitations

Machine learning limitations

Notebooks limitations

  • Notebooks have access to 8 GB of memory, which cannot be configured.

  • Notebook-scoped libraries are not cached across development sessions.

  • When a notebook is shared among users, its TEMP tables and views are not shared.

  • Autocomplete and the Variable Explorer for DataFrames in notebooks are not supported.

Workflow limitations

  • The driver size for serverless compute for jobs is currently fixed and cannot be changed.

  • Task logs are not isolated per task run. Logs will contain the output from multiple tasks.

  • Task libraries are not supported for notebook tasks. Use notebook-scoped libraries instead. See Notebook-scoped Python libraries.
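
For example, a notebook task can install its own dependencies with the %pip magic command at the top of the notebook; the package name below is only an illustration:

    # Install a notebook-scoped library; it applies only to the current session.
    %pip install simplejson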

Compute-specific limitations

The following compute-specific features are not supported:

  • Compute policies

  • Compute-scoped init scripts

  • Compute-scoped libraries, including custom data sources and Spark extensions. Use notebook-scoped libraries instead.

  • Compute-level data access configurations, including instance profiles. As a consequence, accessing tables and files through the Hive metastore (HMS) on cloud paths, or with DBFS mounts that have no embedded credentials, does not work.

  • Instance pools

  • Compute event logs

  • Apache Spark compute configs and environment variables