CAST_INVALID_INPUT error class

The value <expression> of the type <sourceType> cannot be cast to <targetType> because it is malformed. Correct the value as per the syntax, or change its target type. Use try_cast to tolerate malformed input and return NULL instead. If necessary set <ansiConfig> to “false” to bypass this error.”

Parameters

  • expression: The expression that needs to be cast to targettype

  • sourceType: The data type of expression.

  • targetType: The target type of the cast operation.

  • ansiConfig: The configuration setting to alter ANSI mode.

Explanation

The expression cannot be cast to the targetType due to one of the following reasons:

  • expression is too big for the domain of the type. For example the number 1000 cannot be cast to TINYINT because that domain only ranges from -128 to +127.

  • expression contains characters which are not part of the type. For example a cannot be cast to any numeric type.

  • expression is formatted in a way the cast operation cannot parse. For example 1.0 and 1e1 cannot be cast to any integral numeric type.

The cast may not have been explicitly specified, but may have been injected implicitly by Databricks.

The context information provided with this error isolates the object and the expression within which the error occurred.

For a definition of the domain and accepted literal formats see the definition for the data type of tyopeName.

Mitigation

The mitigation for this error depends on the cause:

  • Is the value expected to comply with the domain and format of the specified typeName?

    Verify the input producing value and correct the data source.

  • Is the target of the cast too narrow?

    Widen the type by moving, for example, from DATE to TIMESTAMP, INT to BIGINT or DOUBLE.

  • Is the format of value incorrect?

    Consider using:

    These functions allow for a wide variety of formats you can specify.

    When casting numeric literals with decimal points (e.g. 1.0 or scientific notation (e.g. 1e0) consider double casting first to DECIMAL or DOUBLE and then to the exact numeric.

  • Is data with incorrect values expected, and should be tolerated by producing NULLs?

    Change the expression use or inject try_cast(value AS typeName). This function returns NULL when it is passed without a value that satisfies the type.

    If you cannot change the expression, as a last resort, you can temporarily disable ANSI mode using ansiConfig.

Examples

-- A view with a cast and string literals outside the domain of the target type
> CREATE OR REPLACE TEMPORARY VIEW v(c1) AS SELECT CAST(a AS SMALLINT) FROM VALUES('100'), ('50000') AS t(a);
> SELECT c1 FROM v;
 [CAST_INVALID_INPUT] The value '50000' of the type "STRING" cannot be cast to "SMALLINT" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL of VIEW v(line 1, position 8) ==
 SELECT CAST(a AS SMALLINT) FROM VALUES('100'), ('50000') A...
        ^^^^^^^^^^^^^^^^^^^

-- Widen the target type to match the domain of the input
> CREATE OR REPLACE TEMPORARY VIEW v(c1) AS SELECT cast(a AS INTEGER) FROM VALUES('100'), ('50000') AS t(a);
> SELECT c1 FROM v;
 100
 50000

-- The input data format does not match the target type
> SELECT cast(a AS INTEGER) FROM VALUES('1.0'), ('1e0') AS t(a);
 [CAST_INVALID_INPUT] The value '1.0' of the type "STRING" cannot be cast to "INT" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL(line 1, position 8) ==
 SELECT cast(a AS INTEGER) FROM VALUES('1.0'), ('1e0') AS ...
        ^^^^^^^^^^^^^^^^^^

-- Adjust the target type to the match the format if the format is indicative of the domain.
> SELECT cast(a AS DOUBLE) FROM VALUES('1.0'), ('1e0') AS t(a);
 1.0
 1.0

-- ALternatively double cast to preserver the target type
> SELECT cast(cast(a AS DOUBLE) AS INTEGER) FROM VALUES('1.0'), ('1e0') AS t(a);
 1
 1

-- The format of the numeric input contains display artifacts
> SELECT cast(a AS DECIMAL(10, 3)) FROM VALUES('12,345.30-'), ('12+') AS t(a);
 [CAST_INVALID_INPUT] The value '12,345.30-' of the type "STRING" cannot be cast to "DECIMAL(10,3)" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL(line 1, position 8) ==
 SELECT cast(a AS DECIMAL(10, 3)) FROM VALUES('$<123,45.30>'), ('...
        ^^^^^^^^^^^^^^^^^^^^^^^^^

-- Use to_number() to parse formatted values
> SELECT to_number(a, '9,999,999.999S') FROM VALUES('123,45.30-'), ('12+') AS t(a);
 -12345.300
 12.000

-- The format of a date input does not match the default format
> SELECT cast(geburtsdatum AS DATE) FROM VALUES('6.6.2000'), ('31.10.1970') AS t(geburtsdatum);
 [CAST_INVALID_INPUT] The value '6.6.2000' of the type "STRING" cannot be cast to "DATE" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL(line 1, position 8) ==
 SELECT cast(geburtsdatum AS DATE) FROM VALUES('6.6.2000'), ('31.1...
        ^^^^^^^^^^^^^^^^^^^^^^^^^^

-- Use to_date to parse the correct input format for a date
> SELECT to_date(geburtsdatum, 'dd.MM.yyyy') FROM VALUES('6.6.2000'), ('31.10.1970') AS t(geburtsdatum);
  2000-06-06
  1970-10-31

-- The type resolution of Databricks did not derive a sufficiently wide type, failing an implicit cast
> SELECT 12 * monthly AS yearly FROM VALUES ('1200'), ('1520.56') AS t(monthly);
 [CAST_INVALID_INPUT] The value '1520.56' of the type "STRING" cannot be cast to "BIGINT" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL(line 1, position 8) ==
 SELECT 12 * monthly AS yearly FROM VALUES ('1200'),...
        ^^^^^^^^^^^^

-- Explicitly declare the expected type
> SELECT 12 * cast(monthly AS DECIMAL(8, 2)) AS yearly FROM VALUES ('1200'), ('1520.56') AS t(monthly);
 14400.00
 18246.72

-- The input data is occasionally expected to incorrect
> SELECT cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer not to say') AS t(salary);
 [CAST_INVALID_INPUT] The value 'prefer not to say' of the type "STRING" cannot be cast to "DECIMAL(9,2)" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
SELECT cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer ...
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-- Use try_cast to tolerate incorrect input
> SELECT try_cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer not to say') AS t(salary);
 30000.00
 NULL

-- In Databricks SQL temporarily disable ANSI mode to tolerate incorrect input.
> SET ANSI_MODE = false;
> SELECT cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer not to say') AS t(salary);
 30000.00
 NULL
> SET ANSI_MODE = true;

-- In Databricks Runtime temporarily disable ANSI mode to tolerate incorrect input.
> SET spark.sql.ansi.enabled = false;
> SELECT cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer not to say') AS t(salary);
 30000.00
 NULL
> SET spark.sql.ansi.enabled = true;