ANSI_MODE

Applies to: check marked yes Databricks SQL

The ANSI_MODE configuration parameter controls key behaviors of built-in functions and cast operations.

This article describes ANSI mode in Databricks SQL. For ANSI compliance in Databricks Runtime, see ANSI compliance in Databricks Runtime.

Settings

  • TRUE

    Follows the SQL standard in how it deals with certain arithmetic operations and type conversions, similar to most databases and data warehouses. Following this standard promotes better data quality, integrity, and portability.

  • FALSE

    Databricks SQL uses Hive-compatible behavior.

You can set this parameter at the session level using the SET statement and at the global level using SQL configuration parameters or the SQL Warehouse API.

System default

The system default value is TRUE for accounts added on Databricks SQL 2022.35 and above.

Detailed description

The Databricks SQL reference documentation describes SQL standard behavior.

The following sections describe the differences between ANSI_MODE TRUE (ANSI mode) and FALSE (non-ANSI mode).

Operators

In non-ANSI mode, arithmetic operations performed on numeric types may return overflowed values or NULL, while in ANSI mode such operations return an error.

Operator

Description

Example

ANSI_MODE = true

ANSI_MODE = false

dividend / divisor

Returns dividend divided by divisor.

1/0

Error

NULL

- expr

Returns the negated value of expr.

-(-128y)

Error

-128y (Overflow)

expr1 - expr2

Returns the subtraction of expr2 from expr1.

-128y - 1y

Error

127y (Overflow)

expr1 + expr2

Returns the sum of expr1 and expr2.

127y + 1y

Error

-128y (Overflow)

dividend % divisor

Returns the remainder after dividend / divisor.

1 % 0

Error

NULL

multiplier * multiplicand

Returns multiplier multiplied by multiplicand.

100y * 100y

Error

16y (Overflow)

arrayExpr[index]

Returns the element of an arrayExpr at index.

Invalid array index

Error

NULL

mapExpr[key]

Returns the value of mapExpr for key.

Invalid map key

Error

NULL

divisor div dividend

Returns the integral part of the division of divisor by dividend.

1 div 0

Error

NULL

Functions

The behavior of some built-in functions can be different under ANSI mode vs non-ANSI mode under the conditions specified below.

Operator

Description

Condition

ANSI_MODE = true

ANSI_MODE = false

abs(expr)

Returns the absolute value of the numeric value in expr.

abs(-128y)

Error

-128y (Overflow)

element_at(mapExpr, key)

Returns the value of mapExpr for key.

Invalid map key

Error

NULL

element_at(arrayExpr, index)

Returns the element of an arrayExpr at index.

Invalid array index

Error

NULL

elt(index, expr1 [, …] )

Returns the nth expression.

Invalid index

Error

NULL

make_date(y,m,d)

Creates a date from year, month, and day fields.

Invalid result date

Error

NULL

make_timestamp(y,m,d,h,mi,s[,tz])

Creates a timestamp from fields.

Invalid result timestamp

Error

NULL

make_interval(y,m,w,d,h,mi,s)

Creates an interval from fields.

Invalid result interval

Error

NULL

mod(dividend, divisor)

Returns the remainder after dividend / divisor.

mod(1, 0)

Error

NULL

next_day(expr,dayOfWeek)

Returns the first date which is later than expr and named as in dayOfWeek.

Invalid day of week

Error

NULL

parse_url(url, partToExtract[, key])

Extracts a part from url.

Invalid URL

Error

NULL

pmod(dividend, divisor)

Returns the positive remainder after dividend / divisor.

pmod(1, 0)

Error

NULL

size(expr)

Returns the cardinality of expr.

size(NULL)

NULL

-1

to_date(expr[,fmt])

Returns expr cast to a date using an optional formatting.

Invalid expr or format string

Error

NULL

to_timestamp(expr[,fmt])

Returns expr cast to a timestamp using an optional formatting.

Invalid expr or format string

Error

NULL

to_unix_timestamp(expr[,fmt])

Returns the timestamp in expr as a UNIX timestamp.

Invalid expr or format string

Error

NULL

unix_timestamp([expr[, fmt]])

Returns the UNIX timestamp of current or specified time.

Invalid expr or format string

Error

NULL

Casting rules

The rules and behaviors regarding CAST are stricter in ANSI mode. They can be divided into the following three categories:

Compile-time conversion rules

Source type

Target type

Example

ANSI_MODE = true

ANSI_MODE = false

Boolean

Timestamp

cast(TRUE AS TIMESTAMP)

Error

1970-01-01 00:00:00.000001 UTC

Date

Boolean

cast(DATE'2001-08-09' AS BOOLEAN)

Error

NULL

Timestamp

Boolean

cast(TIMESTAMP'1970-01-01 00:00:00Z' AS BOOLEAN)

Error

FALSE

Integral numeric

Binary

cast(15 AS BINARY)

Error

binary representation

Runtime errors

Source type

Target type

Condition

Example

ANSI_MODE = true

ANSI_MODE = false

String

Non-string

Invalid input

cast('a' AS INTEGER)

Error

NULL

Array, Struct, Map

Array, Struct, Map

Invalid input

cast(ARRAY('1','2','3') AS ARRAY<DATE>)

Error

NULL

Numeric

Numeric

Overflow

cast(12345 AS BYTE)

Error

NULL

Numeric

Integral numeric

Truncation

cast(5.1 AS INTEGER)

Error

5

Note

For each of these casts you can use try_cast instead of cast to return NULL rather than of an error.

Implicit type coercion rules

Under ANSI_MODE = TRUE, Databricks SQL uses clear SQL data type casting rules for:

By contrast ANSI_MODE = FALSE is inconsistent and more lenient. For example:

  • When using a STRING type with any arithmetic operator, the string is implicitly cast to DOUBLE.

  • When comparing a STRING to any numeric type the string is implicitly cast to the type it compares to.

  • When performing a UNION, COALESCE, or other operations where a least common type must be found all types are cast to STRING if there is any STRING type present.

Databricks recommends using the explicit cast or try_cast function instead of relying on ANSI_MODE = FALSE.

Examples

> SET ansi_mode = true;

-- Protects against integral numeric overflow
> SELECT cast(12345 AS TINYINT);
  Casting 12345 to tinyint causes overflow

-- For invalid values raises errors instead of returning NULL.
> SELECT cast('a' AS INTEGER);
  Invalid input syntax for type numeric: a.
  To return NULL instead, use 'try_cast'

-- try_cast() is consistent for both modes
> SELECT try_cast('a' AS INTEGER);
  NULL

-- Does not allow ambiguous crosscasting.
> SELECT c1 + c2 FROM VALUES('5', '7.6') AS T(c1, c2);
  Cannot resolve '(T.c1 + T.c2)' due to data type mismatch:
  '(T.c1 + T.c2)' requires (numeric or interval day to second or interval year to month or interval) type, not string

-- Promotes STRING to least common type (STRING, INTEGER --> BIGINT) for arithmetic operation.
> SELECT typeof(5 - '3');
  bigint

-- Promotes STRING to least common type (INTEGER, STRING --> BIGINT) with runtime check
> SELECT c1 = c2 FROM VALUES(10, '10.1') AS T(c1, c2);
  Invalid input syntax for type numeric: 10.1. To return NULL instead, use 'try_cast'.

-- Promotes STRING to least common type (STRING, INTEGER --> BIGINT) for set operation with runtime check.
> SELECT typeof(c1) FROM (SELECT 5 UNION ALL SELECT '6') AS T(c1);
  bigint
  bigint
> SET ansi_mode = false;

-- Silent integral numeric overflow
> SELECT cast(12345 AS TINYINT);
  57

-- Returns NULL instead of an error
> SELECT cast('a' AS INTEGER);
  NULL

-- try_cast() is safe for both modes
> SELECT try_cast('a' AS INTEGER);
  NULL

-- Does allow ambiguous crosscasting using DOUBLE.
> SELECT c1 + c2 FROM VALUES('5', '7.6') AS T(c1, c2);
  12.6

-- Crosscasts STRING to DOUBLE for arithmetic operation.
> SELECT typeof(5 - '3');
  double

-- Implicitly casts STRING to INTEGER equating 10 with 10.1
> SELECT c1 = c2 FROM VALUES(10, '10.1') AS T(c1, c2);
  true

-- Promotes to string for set operations
> SELECT typeof(c1) FROM (SELECT 5 UNION ALL SELECT '6') AS T(c1);
  string
  string