SELECT clause

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime

Collects the columns to be returned from the subquery, including the execution of expressions, aggregations, and deduplication.

Syntax

SELECT [ hints ] [ ALL | DISTINCT ] { named_expression | star_clause } [, ...]

Parameters

  • hints

    Hints help the Databricks optimizer make better planning decisions. Databricks supports hints that influence selection of join strategies and repartitioning of the data.

  • ALL

    Select all matching rows from the table references. Enabled by default.

  • DISTINCT

    Select all matching rows from the table references after removing duplicates in results.

  • named_expression

    An expression with an optional assigned name.

    • expression

      A combination of one or more values, operators, and SQL functions that evaluates to a value.

    • column_alias

      An optional column identifier naming the expression result. If no column_alias is provided Databricks SQL derives one.

  • star_clause

    A shorthand to name all the referencable columns in the FROM clause or a specific table reference’s columns or fields in the FROM clause.

Examples

-- select all referencable columns from all tables
> SELECT * FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
  1   2   3   4

-- select all referencable columns from one table
> SELECT t2.* FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
  3   4

-- select all referencable columns from all tables except t2.c4
> SELECT * EXCEPT(c4) FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
  1   2   3

-- select all referencable columns from a table, except a nested field.
> SELECT * EXCEPT(c2.b) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
  1  { "a" : 2 }

-- Removing all fields results in an empty struct
> SELECT * EXCEPT(c2.b, c2.a) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
  1  { }

-- Overlapping names result in an error
> SELECT * EXCEPT(c2, c2.a) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
  Error: EXCEPT_OVERLAPPING_COLUMNS