Databricks Runtime 15.2 (Beta)

Beta

Databricks Runtime 15.2 is in Beta. The contents of the supported environments may change during the Beta. Changes can include the list of packages or versions of installed packages.

The following release notes provide information about Databricks Runtime 15.2, powered by Apache Spark 3.5.0.

Note

These release notes may include references to features that are not available on Google Cloud as of this release.

New features and improvements

Liquid clustering is GA

Support for liquid clustering is now generally available using Databricks Runtime 15.2 and above. See Use liquid clustering for Delta tables.
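
As a rough illustration, liquid clustering is declared with a CLUSTER BY clause on the table. The sketch below only assembles the SQL text; the table name `events` and its columns are hypothetical, and in a notebook you would pass the resulting string to `spark.sql`.

```python
def create_clustered_table_sql(table: str, schema: dict, cluster_by: list) -> str:
    """Build a CREATE TABLE statement that uses liquid clustering.

    `schema` maps column names to SQL types; `cluster_by` lists the clustering keys.
    All names passed in are illustrative, not taken from the release notes.
    """
    cols = ", ".join(f"{name} {dtype}" for name, dtype in schema.items())
    keys = ", ".join(cluster_by)
    return f"CREATE TABLE {table} ({cols}) CLUSTER BY ({keys})"


# Hypothetical table: cluster a small events table by its date column.
stmt = create_clustered_table_sql(
    "events",
    {"event_id": "BIGINT", "event_date": "DATE"},
    ["event_date"],
)
print(stmt)
```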

Type widening is in Public Preview

You can now enable type widening on tables backed by Delta Lake. Tables with type widening enabled allow changing the type of columns to a wider data type without rewriting underlying data files. See Type widening.
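
A minimal sketch of the two steps involved, assuming the `delta.enableTypeWidening` table property and the `ALTER COLUMN ... TYPE` syntax described in the type widening documentation. The helper only builds SQL text; the table and column names are hypothetical.

```python
def type_widening_sql(table: str, column: str, new_type: str) -> list:
    """Return the statements that enable type widening and then widen one column."""
    return [
        # Step 1: opt the table in to type widening (assumed property name).
        f"ALTER TABLE {table} SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true')",
        # Step 2: change the column to a wider type without rewriting data files.
        f"ALTER TABLE {table} ALTER COLUMN {column} TYPE {new_type}",
    ]


# Hypothetical example: widen an INT column to BIGINT.
statements = type_widening_sql("sales", "quantity", "BIGINT")
for s in statements:
    print(s)
```

In a notebook, each statement would be run in order with `spark.sql(s)`.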

Schema evolution clause added to SQL merge syntax

You can now add the WITH SCHEMA EVOLUTION clause to a SQL merge statement to enable schema evolution for the operation. See Schema evolution syntax for SQL.
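
For illustration, the clause sits directly after the MERGE keyword. The sketch below builds such a statement as text; the table names and the join key are hypothetical, and the statement would be executed with `spark.sql` in a notebook.

```python
def merge_with_schema_evolution_sql(target: str, source: str, key: str) -> str:
    """Build an upsert that lets columns present only in `source` be added to `target`."""
    return (
        f"MERGE WITH SCHEMA EVOLUTION INTO {target} AS t "
        f"USING {source} AS s ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )


# Hypothetical tables keyed on customer_id.
merge_stmt = merge_with_schema_evolution_sql("customers", "customer_updates", "customer_id")
print(merge_stmt)
```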

PySpark DataSources are available in Public Preview

PySpark DataSources can be created using the Python (PySpark) DataSource API, which enables reading from custom data sources and writing to custom data sinks in Apache Spark using Python. See What is a PySpark DataSource?
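
The following is a small sketch of a batch reader, assuming the `pyspark.sql.datasource` module that ships with runtimes supporting this API. The source name `counter`, the `rows` option, and both class names are made up for illustration. The import fallback exists only so the sketch can be loaded without PySpark installed.

```python
try:
    from pyspark.sql.datasource import DataSource, DataSourceReader
except ImportError:
    # Fallback so the sketch can be read and tested without PySpark.
    DataSource = DataSourceReader = object


class CounterReader(DataSourceReader):
    """Yields `rows` single-column tuples: (0,), (1,), ..."""

    def __init__(self, rows: int):
        self.rows = rows

    def read(self, partition):
        # Each yielded tuple becomes one row matching the declared schema.
        for i in range(self.rows):
            yield (i,)


class CounterDataSource(DataSource):
    """Hypothetical data source that produces a sequence of integers."""

    @classmethod
    def name(cls):
        # Short name used with spark.read.format(...)
        return "counter"

    def schema(self):
        return "id INT"

    def reader(self, schema):
        # Options arrive as strings, so convert explicitly.
        return CounterReader(int(self.options.get("rows", 3)))
```

On a cluster, the class would typically be registered with `spark.dataSource.register(CounterDataSource)` and then read with `spark.read.format("counter").option("rows", 5).load()`.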

applyInPandas and mapInPandas now available on Unity Catalog compute with shared access mode

As part of a Databricks Runtime 14.3 LTS maintenance release, applyInPandas and mapInPandas UDF types are now supported on shared access mode compute running Databricks Runtime 14.3 and above.

Use dbutils.widgets.getAll() to get all widgets in a notebook

Use dbutils.widgets.getAll() to get all widget values in a notebook. This is especially helpful when passing multiple widget values to a Spark SQL query.
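
One way to use this, sketched under the assumption that getAll() returns a mapping of widget names to values, is to hand that mapping to a parameterized query through the `args` parameter of `spark.sql`. The helper name and the query shown in the usage note are hypothetical; `spark` and `dbutils` are the objects a Databricks notebook provides.

```python
def query_with_widgets(spark, dbutils, query: str):
    """Run `query`, binding every notebook widget as a named parameter.

    Named markers such as :region in the query are filled from the widget
    with the same name. Widget values are strings.
    """
    params = dbutils.widgets.getAll()  # assumed shape: {"widget_name": "value", ...}
    return spark.sql(query, args=params)
```

In a notebook with widgets named `region` and `year`, a call could look like `query_with_widgets(spark, dbutils, "SELECT * FROM sales WHERE region = :region AND year = :year")`.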

Vacuum inventory support

You can now specify an inventory of files to consider when running the VACUUM command on a Delta table. See the OSS Delta docs for details.
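
As a hedged sketch, the OSS Delta syntax attaches the inventory with a USING INVENTORY clause that takes a table or a subquery. The column list below (path, length, isDir, modificationTime) is an assumption based on the inventory schema in the Delta documentation, and the table names are hypothetical. The helper only builds the SQL text.

```python
def vacuum_with_inventory_sql(table: str, inventory_table: str) -> str:
    """Build a VACUUM statement that considers only files listed in `inventory_table`."""
    return (
        f"VACUUM {table} USING INVENTORY "
        # Assumed inventory columns; check the Delta docs for the exact schema.
        f"(SELECT path, length, isDir, modificationTime FROM {inventory_table})"
    )


# Hypothetical table and inventory names.
vacuum_stmt = vacuum_with_inventory_sql("events", "events_file_inventory")
print(vacuum_stmt)
```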

Bug fixes

  • When displayed in the SQL UI, write commands in query plans incorrectly showed PhotonWriteStage as an operator. With this release, the UI is updated to show PhotonWriteStage as a stage. This is a UI change only, and does not affect how queries are run.

Library upgrades

  • Upgraded Python libraries:

    • GitPython from 3.1.42 to 3.1.43

    • google-api-core from 2.17.1 to 2.18.0

    • google-auth from 2.28.1 to 2.29.0

    • google-cloud-storage from 2.15.0 to 2.16.0

    • googleapis-common-protos from 1.62.0 to 1.63.0

    • ipywidgets from 8.0.4 to 7.7.2

    • mlflow-skinny from 2.11.1 to 2.11.3

    • s3transfer from 0.10.0 to 0.10.1

    • sqlparse from 0.4.4 to 0.5.0

    • typing_extensions from 4.7.1 to 4.10.0

  • Upgraded R libraries:

  • Upgraded Java libraries:

    • com.amazonaws.aws-java-sdk-autoscaling from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-cloudformation from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-cloudfront from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-cloudhsm from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-cloudsearch from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-cloudtrail from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-cloudwatch from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-cloudwatchmetrics from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-codedeploy from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-cognitoidentity from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-cognitosync from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-config from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-core from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-datapipeline from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-directconnect from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-directory from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-dynamodb from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-ec2 from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-ecs from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-efs from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-elasticache from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-elasticbeanstalk from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-elasticloadbalancing from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-elastictranscoder from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-emr from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-glacier from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-glue from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-iam from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-importexport from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-kinesis from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-kms from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-lambda from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-logs from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-machinelearning from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-opsworks from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-rds from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-redshift from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-route53 from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-s3 from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-ses from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-simpledb from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-simpleworkflow from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-sns from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-sqs from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-ssm from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-storagegateway from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-sts from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-support from 1.12.390 to 1.12.610

    • com.amazonaws.aws-java-sdk-workspaces from 1.12.390 to 1.12.610

    • com.amazonaws.jmespath-java from 1.12.390 to 1.12.610

    • io.delta.delta-sharing-client_2.12 from 1.0.4 to 1.0.5

    • org.roaringbitmap.RoaringBitmap from 0.9.45-databricks to 0.9.45

    • org.roaringbitmap.shims from 0.9.45-databricks to 0.9.45

Apache Spark

Databricks Runtime 15.2 includes Apache Spark 3.5.0. This release includes all Spark fixes and improvements included in Databricks Runtime 15.1, as well as the following additional bug fixes and improvements made to Spark:

  • [SPARK-48044] [SC-164284][PYTHON][CONNECT] Cache DataFrame.isStreaming

  • [SPARK-48018] [ES-1103097][SC-164103][SS][12.2][13.3][14.3][15.0][15.1][15.2] Fix null groupId causing missing param error when throwing KafkaException.couldNotReadOffsetRange

  • [SPARK-47973] [SC-163832][CORE] Log call site in SparkContext.stop() and later in SparkContext.assertNotStopped()

  • [SPARK-47941] [SC-163568] [SS] [Connect] Propagate ForeachBatch worker initialization errors to users for PySpark

  • [SPARK-47412] [SC-163455][SQL] Add Collation Support for LPad/RPad.

  • [SPARK-47907] [SC-163408][SQL] Put bang under a config

  • [SPARK-47956] [SC-163493][SQL] Sanity check for unresolved LCA reference

  • [SPARK-46820] [SC-157093][PYTHON] Fix error message regression by restoring new_msg

  • [SPARK-47602] [SPARK-47577][SPARK-47598][SPARK-47577] Core/MLLib/Resource managers: structured logging migration

  • [SPARK-47890] [SC-163324][CONNECT][PYTHON] Add variant functions to Scala and Python.

  • [SPARK-47805] [SC-163459][SS] Implementing TTL for MapState

  • [SPARK-47900] [SC-163326] Fix check for implicit (UTF8_BINARY) collation

  • [SPARK-47902] [SC-163316][SQL] Making Compute Current Time* expressions foldable

  • [SPARK-47845] [SC-163315][SQL][PYTHON][CONNECT] Support Column type in split function for scala and python

  • [SPARK-47754] [SC-162144][SQL] Postgres: Support reading multidimensional arrays

  • [SPARK-47416] [SC-163001][SQL] Add new functions to CollationBenchmark #90339

  • [SPARK-47839] [SC-163075][SQL] Fix aggregate bug in RewriteWithExpression

  • [SPARK-47821] [SC-162967][SQL] Implement is_variant_null expression

  • [SPARK-47883] [SC-163184][SQL] Make CollectTailExec.doExecute lazy with RowQueue

  • [SPARK-47390] [SC-163306][SQL] PostgresDialect distinguishes TIMESTAMP from TIMESTAMP_TZ

  • [SPARK-47924] [SC-163282][CORE] Add a DEBUG log to DiskStore.moveFileToBlock

  • [SPARK-47897] [SC-163183][SQL][3.5] Fix ExpressionSet performance regression in scala 2.12

  • [SPARK-47739] [SC-162851][SQL] Register logical avro type

  • [SPARK-47565] [SC-161786][PYTHON] PySpark worker pool crash resilience

  • [SPARK-47885] [SC-162989][PYTHON][CONNECT] Make pyspark.resource compatible with pyspark-connect

  • [SPARK-47887] [SC-163122][CONNECT] Remove unused import spark/connect/common.proto from spark/connect/relations.proto

  • [SPARK-47751] [SC-161991][PYTHON][CONNECT] Make pyspark.worker_utils compatible with pyspark-connect

  • [SPARK-47691] [SC-161760][SQL] Postgres: Support multi dimensional array on the write side

  • [SPARK-47371] [SC-162831] [SQL] XML: Ignore row tags found in CDATA

  • [SPARK-47895] [SC-163098][SQL] group by all should be idempotent

  • [SPARK-47617] [SC-162513][SQL] Add TPC-DS testing infrastructure for collations

  • [SPARK-47356] [SC-162858][SQL] Add support for ConcatWs & Elt (all collations)

  • [SPARK-47543] [SC-161234][CONNECT][PYTHON] Inferring dict as MapType from Pandas DataFrame to allow DataFrame creation

  • [SPARK-47863] [SC-162974][SQL] Fix startsWith & endsWith collation-aware implementation for ICU

  • [SPARK-47867] [SC-162966][SQL] Support variant in JSON scan.

  • [SPARK-47366] [SC-162475][SQL][PYTHON] Add VariantVal for PySpark

  • [SPARK-47803] [SC-162726][SQL] Support cast to variant.

  • [SPARK-47769] [SC-162841][SQL] Add schema_of_variant_agg expression.

  • [SPARK-47420] [SC-162842][SQL] Fix test output

  • [SPARK-47430] [SC-161178][SQL] Support GROUP BY for MapType

  • [SPARK-47357] [SC-162751][SQL] Add support for Upper, Lower, InitCap (all collations)

  • [SPARK-47788] [SC-162729][SS] Ensure the same hash partitioning for streaming stateful ops

  • [SPARK-47776] [SC-162291][SS] Disallow binary inequality collation be used in key schema of stateful operator

  • [SPARK-47855] [SC-162818][CONNECT] Add spark.sql.execution.arrow.pyspark.fallback.enabled in the unsupported list

  • [SPARK-47673] [SC-162824][SS] Implementing TTL for ListState

  • [SPARK-47818] [SC-162845][CONNECT] Introduce plan cache in SparkConnectPlanner to improve performance of Analyze requests

  • [SPARK-47694] [SC-162783][CONNECT] Make max message size configurable on the client side

  • [SPARK-47274] Revert “[SC-162479][PYTHON][SQL] Provide more usef…

  • [SPARK-47616] [SC-161193][SQL] Add User Document for Mapping Spark SQL Data Types from MySQL

  • [SPARK-47862] [SC-162837][PYTHON][CONNECT] Fix generation of proto files

  • [SPARK-47849] [SC-162724][PYTHON][CONNECT] Change release script to release pyspark-connect

  • [SPARK-47410] [SC-162518][SQL] Refactor UTF8String and CollationFactory

  • [SPARK-47807] [SC-162505][PYTHON][ML] Make pyspark.ml compatible with pyspark-connect

  • [SPARK-47707] [SC-161768][SQL] Special handling of JSON type for MySQL Connector/J 5.x

  • [SPARK-47765] Revert “[SC-162636][SQL] Add SET COLLATION to pars…

  • [SPARK-47081] [SC-162151][CONNECT][FOLLOW] Improving the usability of the Progress Handler

  • [SPARK-47289] [SC-161877][SQL] Allow extensions to log extended information in explain plan

  • [SPARK-47274] [SC-162479][PYTHON][SQL] Provide more useful context for PySpark DataFrame API errors

  • [SPARK-47765] [SC-162636][SQL] Add SET COLLATION to parser rules

  • [SPARK-47819] [SC-162633][CONNECT] Use asynchronous callback for execution cleanup

  • [SPARK-47828] [SC-162722][CONNECT][PYTHON] DataFrameWriterV2.overwrite fails with invalid plan

  • [SPARK-47812] [SC-162696][CONNECT] Support Serialization of SparkSession for ForEachBatch worker

  • [SPARK-47253] [SC-162698][CORE] Allow LiveEventBus to stop without the completely draining of event queue

  • [SPARK-47827] [SC-162625][PYTHON] Missing warnings for deprecated features

  • [SPARK-47733] [SC-162628][SS] Add custom metrics for transformWithState operator part of query progress

  • [SPARK-47784] [SC-162623][SS] Merge TTLMode and TimeoutMode into a single TimeMode.

  • [SPARK-47775] [SC-162319][SQL] Support remaining scalar types in the variant spec.

  • [SPARK-47736] [SC-162503][SQL] Add support for AbstractArrayType

  • [SPARK-47081] [SC-161758][CONNECT] Support Query Execution Progress

  • [SPARK-47682] [SC-162138][SQL] Support cast from variant.

  • [SPARK-47802] [SC-162478][SQL] Revert () from meaning struct() back to meaning *

  • [SPARK-47680] [SC-162318][SQL] Add variant_explode expression.

  • [SPARK-47809] [SC-162511][SQL] checkExceptionInExpression should check error for each codegen mode

  • [SPARK-41811] [SC-162470][PYTHON][CONNECT] Implement SQLStringFormatter with WithRelations

  • [SPARK-47693] [SC-162326][SQL] Add optimization for lowercase comparison of UTF8String used in UTF8_BINARY_LCASE collation

  • [SPARK-47541] [SC-162006][SQL] Collated strings in complex types supporting operations reverse, array_join, concat, map

  • [SPARK-46812] [SC-161535][CONNECT][PYTHON] Make mapInPandas / mapInArrow support ResourceProfile

  • [SPARK-47727] [SC-161982][PYTHON] Make SparkConf to root level for both SparkSession and SparkContext

  • [SPARK-47406] [SC-159376][SQL] Handle TIMESTAMP and DATETIME in MYSQLDialect

  • [SPARK-47081] Revert “[SC-161758][CONNECT] Support Query Executi…

  • [SPARK-47681] [SC-162043][SQL] Add schema_of_variant expression.

  • [SPARK-47783] [SC-162222] Add some missing SQLSTATEs and clean up the YY000 to use…

  • [SPARK-47634] [SC-161558][SQL] Add legacy support for disabling map key normalization

  • [SPARK-47746] [SC-162022] Implement ordinal-based range encoding in the RocksDBStateEncoder

  • [SPARK-47285] [SC-158340][SQL] AdaptiveSparkPlanExec should always use the context.session

  • [SPARK-47643] [SC-161534][SS][PYTHON] Add pyspark test for python streaming source

  • [SPARK-47582] [SC-161943][SQL] Migrate Catalyst logInfo with variables to structured logging framework

  • [SPARK-47558] [SC-162007][SS] State TTL support for ValueState

  • [SPARK-47358] [SC-160912][SQL][COLLATION] Improve repeat expression support to return correct datatype

  • [SPARK-47504] [SC-162044][SQL] Resolve AbstractDataType simpleStrings for StringTypeCollated

  • [SPARK-47719] Revert “[SC-161909][SQL] Change spark.sql.legacy.t…

  • [SPARK-47657] [SC-162010][SQL] Implement collation filter push down support per file source

  • [SPARK-47081] [SC-161758][CONNECT] Support Query Execution Progress

  • [SPARK-47744] [SC-161999] Add support for negative-valued bytes in range encoder

  • [SPARK-47713] [SC-162009][SQL][CONNECT] Fix a self-join failure

  • [SPARK-47310] [SC-161930][SS] Add micro-benchmark for merge operations for multiple values in value portion of state store

  • [SPARK-47700] [SC-161774][SQL] Fix formatting of error messages with treeNode

  • [SPARK-47752] [SC-161993][PS][CONNECT] Make pyspark.pandas compatible with pyspark-connect

  • [SPARK-47575] [SC-161402][SPARK-47576][SPARK-47654] Implement logWarning/logInfo API in structured logging framework

  • [SPARK-47107] [SC-161201][SS][PYTHON] Implement partition reader for python streaming data source

  • [SPARK-47553] [SC-161772][SS] Add Java support for transformWithState operator APIs

  • [SPARK-47719] [SC-161909][SQL] Change spark.sql.legacy.timeParserPolicy default to CORRECTED

  • [SPARK-47655] [SC-161761][SS] Integrate timer with Initial State handling for state-v2

  • [SPARK-47665] [SC-161550][SQL] Use SMALLINT to Write ShortType to MYSQL

  • [SPARK-47210] [SC-161777][SQL] Addition of implicit casting without indeterminate support

  • [SPARK-47653] [SC-161767][SS] Add support for negative numeric types and range scan key encoder

  • [SPARK-46743] [SC-160777][SQL] Count bug after constant folding

  • [SPARK-47525] [SC-154568][SQL] Support subquery correlation joining on map attributes

  • [SPARK-46366] [SC-151277][SQL] Use WITH expression in BETWEEN to avoid duplicate expressions

  • [SPARK-47563] [SC-161183][SQL] Add map normalization on creation

  • [SPARK-42040] [SC-161171][SQL] SPJ: Introduce a new API for V2 input partition to report partition statistics

  • [SPARK-47679] [SC-161549][SQL] Use HiveConf.getConfVars or Hive conf names directly

  • [SPARK-47685] [SC-161566][SQL] Restore the support for Stream type in Dataset#groupBy

  • [SPARK-47646] [SC-161352][SQL] Make try_to_number return NULL for malformed input

  • [SPARK-47366] [SC-161324][PYTHON] Add pyspark and dataframe parse_json aliases

  • [SPARK-47491] [SC-161176][CORE] Add slf4j-api jar to the class path first before the others of jars directory

  • [SPARK-47270] [SC-158741][SQL] Dataset.isEmpty projects CommandResults locally

  • [SPARK-47364] [SC-158927][CORE] Make PluginEndpoint warn when plugins reply for one-way message

  • [SPARK-47280] [SC-158350][SQL] Remove timezone limitation for ORACLE TIMESTAMP WITH TIMEZONE

  • [SPARK-47551] [SC-161542][SQL] Add variant_get expression.

  • [SPARK-47559] [SC-161255][SQL] Codegen Support for variant parse_json

  • [SPARK-47572] [SC-161351][SQL] Enforce Window partitionSpec is orderable.

  • [SPARK-47546] [SC-161241][SQL] Improve validation when reading Variant from Parquet

  • [SPARK-47543] [SC-161234][CONNECT][PYTHON] Inferring dict as MapType from Pandas DataFrame to allow DataFrame creation

  • [SPARK-47485] [SC-161194][SQL][PYTHON][CONNECT] Create column with collations in dataframe API

  • [SPARK-47641] [SC-161376][SQL] Improve the performance for UnaryMinus and Abs

  • [SPARK-47631] [SC-161325][SQL] Remove unused SQLConf.parquetOutputCommitterClass method

  • [SPARK-47674] [SC-161504][CORE] Enable spark.metrics.appStatusSource.enabled by default

  • [SPARK-47273] [SC-161162][SS][PYTHON] implement Python data stream writer interface.

  • [SPARK-47637] [SC-161408][SQL] Use errorCapturingIdentifier in more places

  • [SPARK-47497] Revert “Revert “[SC-160724][SQL] Make to_csv support the output of array/struct/map/binary as pretty strings””

  • [SPARK-47492] [SC-161316][SQL] Widen whitespace rules in lexer

  • [SPARK-47664] [SC-161475][PYTHON][CONNECT] Validate the column name with cached schema

  • [SPARK-47638] [SC-161339][PS][CONNECT] Skip column name validation in PS

  • [SPARK-47363] [SC-161247][SS] Initial State without state reader implementation for State API v2.

  • [SPARK-47447] [SC-160448][SQL] Allow reading Parquet TimestampLTZ as TimestampNTZ

  • [SPARK-47497] Revert “[SC-160724][SQL] Make to_csv support the output of array/struct/map/binary as pretty strings”

  • [SPARK-47434] [SC-160122][WEBUI] Fix statistics link in StreamingQueryPage

  • [SPARK-46761] [SC-159045][SQL] Quoted strings in a JSON path should support ? characters

  • [SPARK-46915] [SC-155729][SQL] Simplify UnaryMinus Abs and align error class

  • [SPARK-47431] [SC-160919][SQL] Add session level default Collation

  • [SPARK-47620] [SC-161242][PYTHON][CONNECT] Add a helper function to sort columns

  • [SPARK-47570] [SC-161165][SS] Integrate range scan encoder changes with timer implementation

  • [SPARK-47497] [SC-160724][SQL] Make to_csv support the output of array/struct/map/binary as pretty strings

  • [SPARK-47562] [SC-161166][CONNECT] Factor literal handling out of plan.py

  • [SPARK-47509] [SC-160902][SQL] Block subquery expressions in lambda and higher-order functions

  • [SPARK-47539] [SC-160750][SQL] Make the return value of method castToString be Any => UTF8String

  • [SPARK-47372] [SC-160905][SS] Add support for range scan based key state encoder for use with state store provider

  • [SPARK-47517] [SC-160642][CORE][SQL] Prefer Utils.bytesToString for size display

  • [SPARK-47243] [SC-158059][SS] Correct the package name of StateMetadataSource.scala

  • [SPARK-47367] [SC-160913][PYTHON][CONNECT] Support Python data sources with Spark Connect

  • [SPARK-47521] [SC-160666][CORE] Use Utils.tryWithResource during reading shuffle data from external storage

  • [SPARK-47474] [SC-160522][CORE] Revert SPARK-47461 and add some comments

  • [SPARK-47560] [SC-160914][PYTHON][CONNECT] Avoid RPC to validate column name with cached schema

  • [SPARK-47451] [SC-160749][SQL] Support to_json(variant).

  • [SPARK-47528] [SC-160727][SQL] Add UserDefinedType support to DataTypeUtils.canWrite

  • [SPARK-44708] Revert “[SC-160734][PYTHON] Migrate test_reset_index assert_eq to use assertDataFrameEqual”

  • [SPARK-47506] [SC-160740][SQL] Add support to all file source formats for collated data types

  • [SPARK-47256] [SC-160784][SQL] Assign names to error classes _LEGACY_ERROR_TEMP_102[4-7]

  • [SPARK-47495] [SC-160720][CORE] Fix primary resource jar added to spark.jars twice under k8s cluster mode

  • [SPARK-47398] [SC-160572][SQL] Extract a trait for InMemoryTableScanExec to allow for extending functionality

  • [SPARK-47479] [SC-160623][SQL] Optimize cannot write data to relations with multiple paths error log

  • [SPARK-47483] [SC-160629][SQL] Add support for aggregation and join operations on arrays of collated strings

  • [SPARK-47458] [SC-160237][CORE] Fix the problem with calculating the maximum concurrent tasks for the barrier stage

  • [SPARK-47534] [SC-160737][SQL] Move o.a.s.variant to o.a.s.types.variant

  • [SPARK-47396] [SC-159312][SQL] Add a general mapping for TIME WITHOUT TIME ZONE to TimestampNTZType

  • [SPARK-44708] [SC-160734][PYTHON] Migrate test_reset_index assert_eq to use assertDataFrameEqual

  • [SPARK-47309] [SC-157733][SC-160398][SQL] XML: Add schema inference tests for value tags

  • [SPARK-47007] [SC-160630][SQL] Add the MapSort expression

  • [SPARK-47523] [SC-160645][SQL] Replace deprecated JsonParser#getCurrentName with JsonParser#currentName

  • [SPARK-47440] [SC-160635][SQL] Fix pushing unsupported syntax to MsSqlServer

  • [SPARK-47512] [SC-160617][SS] Tag operation type used with RocksDB state store instance lock acquisition/release

  • [SPARK-47346] [SC-159425][PYTHON] Make daemon mode configurable when creating Python planner workers

  • [SPARK-47446] [SC-160163][CORE] Make BlockManager warn before removeBlockInternal

  • [SPARK-46526] [SC-156099][SQL] Support LIMIT over correlated subqueries where predicates only reference outer table

  • [SPARK-47461] [SC-160297][CORE] Remove private function totalRunningTasksPerResourceProfile from ExecutorAllocationManager

  • [SPARK-47422] [SC-160219][SQL] Support collated strings in array operations

  • [SPARK-47500] [SC-160627][PYTHON][CONNECT] Factor column name handling out of plan.py

  • [SPARK-47383] [SC-160144][CORE] Support spark.shutdown.timeout config

  • [SPARK-47342] [SC-159049] Revert “[SQL] Support TimestampNTZ for DB2 TIMESTAMP WITH TIME ZONE”

  • [SPARK-47486] [SC-160491][CONNECT] Remove unused private ArrowDeserializers.getString method

  • [SPARK-47233] [SC-154486][CONNECT][SS][2/2] Client & Server logic for Client side streaming query listener

  • [SPARK-47487] [SC-160534][SQL] Simplify code in AnsiTypeCoercion

  • [SPARK-47443] [SC-160459][SQL] Window Aggregate support for collations

  • [SPARK-47296] [SC-160457][SQL][COLLATION] Fail unsupported functions for non-binary collations

  • [SPARK-47380] [SC-160164][CONNECT] Ensure on the server side that the SparkSession is the same

  • [SPARK-47327] [SC-160069][SQL] Move sort keys concurrency test to CollationFactorySuite

  • [SPARK-47494] [SC-160495][Doc] Add migration doc for the behavior change of Parquet timestamp inference since Spark 3.3

  • [SPARK-47449] [SC-160372][SS] Refactor and split list/timer unit tests

  • [SPARK-46473] [SC-155663][SQL] Reuse getPartitionedFile method

  • [SPARK-47423] [SC-160068][SQL] Collations - Set operation support for strings with collations

  • [SPARK-47439] [SC-160115][PYTHON] Document Python Data Source API in API reference page

  • [SPARK-47457] [SC-160234][SQL] Fix IsolatedClientLoader.supportsHadoopShadedClient to handle Hadoop 3.4+

  • [SPARK-47366] [SC-159348][SQL] Implement parse_json.

  • [SPARK-46331] [SC-152982][SQL] Removing CodegenFallback from subset of DateTime expressions and version() expression

  • [SPARK-47395] [SC-159404] Add collate and collation to other APIs

  • [SPARK-47437] [SC-160117][PYTHON][CONNECT] Correct the error class for DataFrame.sort*

  • [SPARK-47174] [SC-154483][CONNECT][SS][1/2] Server side SparkConnectListenerBusListener for Client side streaming query listener

  • [SPARK-47324] [SC-158720][SQL] Add missing timestamp conversion for JDBC nested types

  • [SPARK-46962] [SC-158834][SS][PYTHON] Add interface for python streaming data source API and implement python worker to run python streaming data source

  • [SPARK-45827] [SC-158498][SQL] Move data type checks to CreatableRelationProvider

  • [SPARK-47342] [SC-158874][SQL] Support TimestampNTZ for DB2 TIMESTAMP WITH TIME ZONE

  • [SPARK-47399] [SC-159378][SQL] Disable generated columns on expressions with collations

  • [SPARK-47146] [SC-158247][CORE] Possible thread leak when doing sort merge join

  • [SPARK-46913] [SC-159149][SS] Add support for processing/event time based timers with transformWithState operator

  • [SPARK-47375] [SC-159063][SQL] Add guidelines for timestamp mapping in JdbcDialect#getCatalystType

  • [SPARK-47394] [SC-159282][SQL] Support TIMESTAMP WITH TIME ZONE for H2Dialect

  • [SPARK-45827] Revert “[SC-158498][SQL] Move data type checks to …

  • [SPARK-47208] [SC-159279][CORE] Allow overriding base overhead memory

  • [SPARK-42627] [SC-158021][SPARK-26494][SQL] Support Oracle TIMESTAMP WITH LOCAL TIME ZONE

  • [SPARK-47055] [SC-156916][PYTHON] Upgrade MyPy 1.8.0

  • [SPARK-46906] [SC-157205][SS] Add a check for stateful operator change for streaming

  • [SPARK-47391] [SC-159283][SQL] Remove the test case workaround for JDK 8

  • [SPARK-47272] [SC-158960][SS] Add MapState implementation for State API v2.

  • [SPARK-47375] [SC-159278][Doc][FollowUp] Fix a mistake in JDBC’s preferTimestampNTZ option doc

  • [SPARK-42328] [SC-157363][SQL] Remove _LEGACY_ERROR_TEMP_1175 from error classes

  • [SPARK-47375] [SC-159261][Doc][FollowUp] Correct the preferTimestampNTZ option description in JDBC doc

  • [SPARK-47344] [SC-159146] Extend INVALID_IDENTIFIER error beyond catching ‘-’ in an unquoted identifier and fix “IS ! NULL” et al.

  • [SPARK-47340] [SC-159039][SQL] Change “collate” in StringType typename to lowercase

  • [SPARK-47087] [SC-157077][SQL] Raise Spark’s exception with an error class in config value check

  • [SPARK-47327] [SC-158824][SQL] Fix thread safety issue in ICU Collator

  • [SPARK-47082] [SC-157058][SQL] Fix out-of-bounds error condition

  • [SPARK-47331] [SC-158719][SS] Serialization using case classes/primitives/POJO based on SQL encoder for Arbitrary State API v2.

  • [SPARK-47250] [SC-158840][SS] Add additional validations and NERF changes for RocksDB state provider and use of column families

  • [SPARK-47328] [SC-158745][SQL] Rename UCS_BASIC collation to UTF8_BINARY

  • [SPARK-47207] [SC-157845][CORE] Support spark.driver.timeout and DriverTimeoutPlugin

  • [SPARK-47370] [SC-158956][Doc] Add migration doc: TimestampNTZ type inference on Parquet files

  • [SPARK-47309] [SC-158827][SQL][XML] Add schema inference unit tests

  • [SPARK-47295] [SC-158850][SQL] Added ICU StringSearch for the startsWith and endsWith functions

  • [SPARK-47343] [SC-158851][SQL] Fix NPE when sqlString variable value is null string in execute immediate

  • [SPARK-46293] [SC-150117][CONNECT][PYTHON] Use protobuf transitive dependency

  • [SPARK-46795] [SC-154143][SQL] Replace UnsupportedOperationException by SparkUnsupportedOperationException in sql/core

  • [SPARK-46087] [SC-149023][PYTHON] Sync PySpark dependencies in docs and dev requirements

  • [SPARK-47169] [SC-158848][SQL] Disable bucketing on collated columns

  • [SPARK-42332] [SC-153996][SQL] Changing the require to a SparkException in ComplexTypeMergingExpression

  • [SPARK-45827] [SC-158498][SQL] Move data type checks to CreatableRelationProvider

  • [SPARK-47341] [SC-158825][Connect] Replace commands with relations in a few tests in SparkConnectClientSuite

  • [SPARK-43255] [SC-158026][SQL] Replace the error class _LEGACY_ERROR_TEMP_2020 by an internal error

  • [SPARK-47248] [SC-158494][SQL][COLLATION] Improved string function support: contains

  • [SPARK-47334] [SC-158716][SQL] Make withColumnRenamed reuse the implementation of withColumnsRenamed

  • [SPARK-46442] [SC-153168][SQL] DS V2 supports push down PERCENTILE_CONT and PERCENTILE_DISC

  • [SPARK-47313] [SC-158747][SQL] Added scala.MatchError handling inside QueryExecution.toInternalError

  • [SPARK-45827] [SC-158732][SQL] Add variant singleton type for Java

  • [SPARK-47337] [SC-158743][SQL][DOCKER] Upgrade DB2 docker image version to 11.5.8.0

  • [SPARK-47302] [SC-158609][SQL] Collate keyword as identifier

  • [SPARK-46817] [SC-154196][CORE] Fix spark-daemon.sh usage by adding decommission command

  • [SPARK-46739] [SC-153553][SQL] Add the error class UNSUPPORTED_CALL

  • [SPARK-47102] [SC-158253][SQL] Add the COLLATION_ENABLED config flag

  • [SPARK-46774] [SC-153925][SQL][AVRO] Use mapreduce.output.fileoutputformat.compress instead of deprecated mapred.output.compress in Avro write jobs

  • [SPARK-45245] [SC-146961][PYTHON][CONNECT] PythonWorkerFactory: Timeout if worker does not connect back.

  • [SPARK-46835] [SC-158355][SQL][Collations] Join support for non-binary collations

  • [SPARK-47131] [SC-158154][SQL][COLLATION] String function support: contains, startswith, endswith

  • [SPARK-46077] [SC-157839][SQL] Consider the type generated by TimestampNTZConverter in JdbcDialect.compileValue.

  • [SPARK-47311] [SC-158465][SQL][PYTHON] Suppress Python exceptions where PySpark is not in the Python path

  • [SPARK-47319] [SC-158599][SQL] Improve missingInput calculation

  • [SPARK-47316] [SC-158606][SQL] Fix TimestampNTZ in Postgres Array

  • [SPARK-47268] [SC-158158][SQL][Collations] Support for repartition with collations

  • [SPARK-47191] [SC-157831][SQL] Avoid unnecessary relation lookup when uncaching table/view

  • [SPARK-47168] [SC-158257][SQL] Disable parquet filter pushdown when working with non default collated strings

  • [SPARK-47236] [SC-158015][CORE] Fix deleteRecursivelyUsingJavaIO to skip non-existing file input

  • [SPARK-47238] [SC-158466][SQL] Reduce executor memory usage by making generated code in WSCG a broadcast variable

  • [SPARK-47249] [SC-158133][CONNECT] Fix bug where all connect executions are considered abandoned regardless of their actual status

  • [SPARK-47202] [SC-157828][PYTHON] Fix typo breaking datetimes with tzinfo

  • [SPARK-46834] [SC-158139][SQL][Collations] Support for aggregates

  • [SPARK-47277] [SC-158351][3.5] PySpark util function assertDataFrameEqual should not support streaming DF

  • [SPARK-47155] [SC-158473][PYTHON] Fix Error Class Issue

  • [SPARK-47245] [SC-158163][SQL] Improve error code for INVALID_PARTITION_COLUMN_DATA_TYPE

  • [SPARK-39771] [SC-158425][CORE] Add a warning msg in Dependency when a too large number of shuffle blocks is to be created.

  • [SPARK-47277] [SC-158329] PySpark util function assertDataFrameEqual should not support streaming DF

  • [SPARK-47293] [SC-158356][CORE] Build batchSchema with sparkSchema instead of append one by one

  • [SPARK-46732] [SC-153517][CONNECT] Make Subquery/Broadcast thread work with Connect’s artifact management

  • [SPARK-44746] [SC-158332][PYTHON] Add more Python UDTF documentation for functions that accept input tables

  • [SPARK-47120] [SC-157517][SQL] Null comparison push down data filter from subquery produces NPE in Parquet filter

  • [SPARK-47251] [SC-158291][PYTHON] Block invalid types from the args argument for sql command

  • [SPARK-47251] Revert “[SC-158121][PYTHON] Block invalid types from the args argument for sql command”

  • [SPARK-47015] [SC-157900][SQL] Disable partitioning on collated columns

  • [SPARK-46846] [SC-154308][CORE] Make WorkerResourceInfo extend Serializable explicitly

  • [SPARK-46641] [SC-156314][SS] Add maxBytesPerTrigger threshold

  • [SPARK-47244] [SC-158122][CONNECT] SparkConnectPlanner make internal functions private

  • [SPARK-47266] [SC-158146][CONNECT] Make ProtoUtils.abbreviate return the same type as the input

  • [SPARK-46961] [SC-158183][SS] Using ProcessorContext to store and retrieve handle

  • [SPARK-46862] [SC-154548][SQL] Disable CSV column pruning in the multi-line mode

  • [SPARK-46950] [SC-155803][CORE][SQL] Align not available codec error-class

  • [SPARK-46368] [SC-153236][CORE] Support readyz in REST Submission API

  • [SPARK-46806] [SC-154108][PYTHON] Improve error message for spark.table when argument type is wrong

  • [SPARK-47211] [SC-158008][CONNECT][PYTHON] Fix ignored PySpark Connect string collation

  • [SPARK-46552] [SC-151366][SQL] Replace UnsupportedOperationException by SparkUnsupportedOperationException in catalyst

  • [SPARK-47147] [SC-157842][PYTHON][SQL] Fix PySpark collated string conversion error

  • [SPARK-47144] [SC-157826][CONNECT][SQL][PYTHON] Fix Spark Connect collation error by adding collateId protobuf field

  • [SPARK-46575] [SC-153200][SQL][HIVE] Make HiveThriftServer2.startWithContext DevelopApi retriable and fix flakiness of ThriftServerWithSparkContextInHttpSuite

  • [SPARK-46696] [SC-153832][CORE] In ResourceProfileManager, function calls should occur after variable declarations

  • [SPARK-47214] [SC-157862][Python] Create UDTF API for ‘analyze’ method to differentiate constant NULL arguments and other types of arguments

  • [SPARK-46766] [SC-153909][SQL][AVRO] ZSTD Buffer Pool Support For AVRO datasource

  • [SPARK-47192] [SC-157819] Convert some _LEGACY_ERROR_TEMP_0035 errors

  • [SPARK-46928] [SC-157341][SS] Add support for ListState in Arbitrary State API v2.

  • [SPARK-46881] [SC-154612][CORE] Support spark.deploy.workerSelectionPolicy

  • [SPARK-46800] [SC-154107][CORE] Support spark.deploy.spreadOutDrivers

  • [SPARK-45484] [SC-146014][SQL] Fix the bug that uses incorrect parquet compression codec lz4raw

  • [SPARK-46791] [SC-154018][SQL] Support Java Set in JavaTypeInference

  • [SPARK-46332] [SC-150224][SQL] Migrate CatalogNotFoundException to the error class CATALOG_NOT_FOUND

  • [SPARK-47164] [SC-157616][SQL] Make Default Value From Wider Type Narrow Literal of v2 behave the same as v1

  • [SPARK-46664] [SC-153181][CORE] Improve Master to recover quickly in case of zero workers and apps

  • [SPARK-46759] [SC-153839][SQL][AVRO] Codec xz and zstandard support compression level for avro files

Databricks ODBC/JDBC driver support

Databricks supports ODBC/JDBC drivers released in the past 2 years. Download the most recently released drivers and upgrade (download ODBC, download JDBC).
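
As a rough illustration of the two-year support window described above, the following Python sketch checks whether a driver release date falls inside it. The function names are hypothetical, and reading "the past 2 years" as two calendar years counted back from today is an assumption, not a documented definition.

```python
from datetime import date


def window_start(today: date, years: int = 2) -> date:
    """Return the date `years` calendar years before `today`."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # `today` is Feb 29 and the target year is not a leap year.
        return today.replace(year=today.year - years, day=28)


def is_within_support_window(released: date, today: date, years: int = 2) -> bool:
    """True if `released` lies between the window start and `today`, inclusive."""
    return window_start(today, years) <= released <= today
```

For example, `is_within_support_window(date(2023, 1, 1), date(2024, 6, 1))` evaluates to `True` under this reading, while a release dated 2022-05-31 would fall just outside the window.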

System environment

  • Operating System: Ubuntu 22.04.4 LTS

  • Java: Zulu 8.74.0.17-CA-linux64

  • Scala: 2.12.15

  • Python: 3.11.0

  • R: 4.3.2

  • Delta Lake: 3.1.0

Installed Python libraries

Library

Version

asttokens

2.0.5

astunparse

1.6.3

azure-core

1.30.1

azure-storage-blob

12.19.1

azure-storage-file-datalake

12.14.0

backcall

0.2.0

black

23.3.0

blinker

1.4

boto3

1.34.39

botocore

1.34.39

cachetools

5.3.3

certifi

2023.7.22

cffi

1.15.1

chardet

4.0.0

charset-normalizer

2.0.4

click

8.0.4

cloudpickle

2.2.1

comm

0.1.2

contourpy

1.0.5

cryptography

41.0.3

cycler

0.11.0

Cython

0.29.32

databricks-sdk

0.20.0

dbus-python

1.2.18

debugpy

1.6.7

decorator

5.1.1

distlib

0.3.8

entrypoints

0.4

executing

0.8.3

facets-overview

1.1.1

filelock

3.13.1

fonttools

4.25.0

gitdb

4.0.11

GitPython

3.1.43

google-api-core

2.18.0

google-auth

2.29.0

google-cloud-core

2.4.1

google-cloud-storage

2.16.0

google-crc32c

1.5.0

google-resumable-media

2.7.0

googleapis-common-protos

1.63.0

grpcio

1.60.0

grpcio-status

1.60.0

httplib2

0.20.2

idna

3.4

importlib-metadata

6.0.0

ipyflow-core

0.0.198

ipykernel

6.25.1

ipython

8.15.0

ipython-genutils

0.2.0

ipywidgets

7.7.2

isodate

0.6.1

jedi

0.18.1

jeepney

0.7.1

jmespath

0.10.0

joblib

1.2.0

jupyter_client

7.4.9

jupyter_core

5.3.0

keyring

23.5.0

kiwisolver

1.4.4

launchpadlib

1.10.16

lazr.restfulclient

0.14.4

lazr.uri

1.0.6

matplotlib

3.7.2

matplotlib-inline

0.1.6

mlflow-skinny

2.11.3

more-itertools

8.10.0

mypy-extensions

0.4.3

nest-asyncio

1.5.6

numpy

1.23.5

oauthlib

3.2.0

packaging

23.2

pandas

1.5.3

parso

0.8.3

pathspec

0.10.3

patsy

0.5.3

pexpect

4.8.0

pickleshare

0.7.5

Pillow

9.4.0

pip

23.2.1

platformdirs

3.10.0

plotly

5.9.0

prompt-toolkit

3.0.36

proto-plus

1.23.0

protobuf

4.24.1

psutil

5.9.0

psycopg2

2.9.3

ptyprocess

0.7.0

pure-eval

0.2.2

pyarrow

14.0.1

pyasn1

0.4.8

pyasn1-modules

0.2.8

pyccolo

0.0.52

pycparser

2.21

pydantic

1.10.6

Pygments

2.15.1

PyGObject

3.42.1

PyJWT

2.3.0

pyodbc

4.0.38

pyparsing

3.0.9

python-dateutil

2.8.2

python-lsp-jsonrpc

1.1.1

pytz

2022.7

PyYAML

6.0

pyzmq

23.2.0

requests

2.31.0

rsa

4.9

s3transfer

0.10.1

scikit-learn

1.3.0

scipy

1.11.1

seaborn

0.12.2

SecretStorage

3.3.1

setuptools

68.0.0

six

1.16.0

smmap

5.0.1

sqlparse

0.5.0

ssh-import-id

5.11

stack-data

0.2.0

statsmodels

0.14.0

tenacity

8.2.2

threadpoolctl

2.2.0

tokenize-rt

4.2.1

tornado

6.3.2

traitlets

5.7.1

typing_extensions

4.10.0

tzdata

2022.1

ujson

5.4.0

unattended-upgrades

0.1

urllib3

1.26.16

virtualenv

20.24.2

wadllib

1.3.6

wcwidth

0.2.5

wheel

0.38.4

zipp

3.11.0
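
Because package versions may change during the Beta, it can be useful to compare a running environment against the versions listed above. The following is a minimal sketch using only the Python standard library. The `EXPECTED` mapping holds three pins copied from this list, and `find_mismatches` is a hypothetical helper name, not part of any Databricks API.

```python
from importlib.metadata import PackageNotFoundError, version

# A few pins copied from the list above; extend as needed.
EXPECTED = {
    "numpy": "1.23.5",
    "pandas": "1.5.3",
    "pyarrow": "14.0.1",
}


def find_mismatches(expected, lookup=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in sorted(find_mismatches(EXPECTED).items()):
        print(f"{name}: expected {wanted}, found {installed}")
```

The `lookup` parameter defaults to `importlib.metadata.version` and exists only so the comparison can be exercised without depending on what is actually installed.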

Installed R libraries

R libraries are installed from the Posit Package Manager CRAN snapshot.

Library

Version

arrow

14.0.0.2

askpass

1.2.0

assertthat

0.2.1

backports

1.4.1

base

4.3.2

base64enc

0.1-3

bigD

0.2.0

bit

4.0.5

bit64

4.0.5

bitops

1.0-7

blob

1.2.4

boot

1.3-28

brew

1.0-10

brio

1.1.4

broom

1.0.5

bslib

0.6.1

cachem

1.0.8

callr

3.7.3

caret

6.0-94

cellranger

1.1.0

chron

2.3-61

class

7.3-22

cli

3.6.2

clipr

0.8.0

clock

0.7.0

cluster

2.1.4

codetools

0.2-19

colorspace

2.1-0

commonmark

1.9.1

compiler

4.3.2

config

0.3.2

conflicted

1.2.0

cpp11

0.4.7

crayon

1.5.2

credentials

2.0.1

curl

5.2.0

data.table

1.15.0

datasets

4.3.2

DBI

1.2.1

dbplyr

2.4.0

desc

1.4.3

devtools

2.4.5

diagram

1.6.5

diffobj

0.3.5

digest

0.6.34

downlit

0.4.3

dplyr

1.1.4

dtplyr

1.3.1

e1071

1.7-14

ellipsis

0.3.2

evaluate

0.23

fansi

1.0.6

farver

2.1.1

fastmap

1.1.1

fontawesome

0.5.2

forcats

1.0.0

foreach

1.5.2

foreign

0.8-85

forge

0.2.0

fs

1.6.3

future

1.33.1

future.apply

1.11.1

gargle

1.5.2

generics

0.1.3

gert

2.0.1

ggplot2

3.4.4

gh

1.4.0

git2r

0.33.0

gitcreds

0.1.2

glmnet

4.1-8

globals

0.16.2

glue

1.7.0

googledrive

2.1.1

googlesheets4

1.1.1

gower

1.0.1

graphics

4.3.2

grDevices

4.3.2

grid

4.3.2

gridExtra

2.3

gsubfn

0.7

gt

0.10.1

gtable

0.3.4

hardhat

1.3.1

haven

2.5.4

highr

0.10

hms

1.1.3

htmltools

0.5.7

htmlwidgets

1.6.4

httpuv

1.6.14

httr

1.4.7

httr2

1.0.0

ids

1.0.1

ini

0.3.1

ipred

0.9-14

isoband

0.2.7

iterators

1.0.14

jquerylib

0.1.4

jsonlite

1.8.8

juicyjuice

0.1.0

KernSmooth

2.23-21

knitr

1.45

labeling

0.4.3

later

1.3.2

lattice

0.21-8

lava

1.7.3

lifecycle

1.0.4

listenv

0.9.1

lubridate

1.9.3

magrittr

2.0.3

markdown

1.12

MASS

7.3-60

Matrix

1.5-4.1

memoise

2.0.1

methods

4.3.2

mgcv

1.8-42

mime

0.12

miniUI

0.1.1.1

mlflow

2.10.0

ModelMetrics

1.2.2.2

modelr

0.1.11

munsell

0.5.0

nlme

3.1-163

nnet

7.3-19

numDeriv

2016.8-1.1

openssl

2.1.1

parallel

4.3.2

parallelly

1.36.0

pillar

1.9.0

pkgbuild

1.4.3

pkgconfig

2.0.3

pkgdown

2.0.7

pkgload

1.3.4

plogr

0.2.0

plyr

1.8.9

praise

1.0.0

prettyunits

1.2.0

pROC

1.18.5

processx

3.8.3

prodlim

2023.08.28

profvis

0.3.8

progress

1.2.3

progressr

0.14.0

promises

1.2.1

proto

1.0.0

proxy

0.4-27

ps

1.7.6

purrr

1.0.2

R6

2.5.1

ragg

1.2.7

randomForest

4.7-1.1

rappdirs

0.3.3

rcmdcheck

1.4.0

RColorBrewer

1.1-3

Rcpp

1.0.12

RcppEigen

0.3.3.9.4

reactable

0.4.4

reactR

0.5.0

readr

2.1.5

readxl

1.4.3

recipes

1.0.9

rematch

2.0.0

rematch2

2.1.2

remotes

2.4.2.1

reprex

2.1.0

reshape2

1.4.4

rlang

1.1.3

rmarkdown

2.25

RODBC

1.3-23

roxygen2

7.3.1

rpart

4.1.21

rprojroot

2.0.4

Rserve

1.8-13

RSQLite

2.3.5

rstudioapi

0.15.0

rversions

2.1.2

rvest

1.0.3

sass

0.4.8

scales

1.3.0

selectr

0.4-2

sessioninfo

1.2.2

shape

1.4.6

shiny

1.8.0

sourcetools

0.1.7-1

sparklyr

1.8.4

spatial

7.3-15

splines

4.3.2

sqldf

0.4-11

SQUAREM

2021.1

stats

4.3.2

stats4

4.3.2

stringi

1.8.3

stringr

1.5.1

survival

3.5-5

swagger

3.33.1

sys

3.4.2

systemfonts

1.0.5

tcltk

4.3.2

testthat

3.2.1

textshaping

0.3.7

tibble

3.2.1

tidyr

1.3.1

tidyselect

1.2.0

tidyverse

2.0.0

timechange

0.3.0

timeDate

4032.109

tinytex

0.49

tools

4.3.2

tzdb

0.4.0

urlchecker

1.0.1

usethis

2.2.2

utf8

1.2.4

utils

4.3.2

uuid

1.2-0

V8

4.4.1

vctrs

0.6.5

viridisLite

0.4.2

vroom

1.6.5

waldo

0.5.2

whisker

0.4.1

withr

3.0.0

xfun

0.41

xml2

1.3.6

xopen

1.0.0

xtable

1.8-4

yaml

2.3.8

zeallot

0.1.0

zip

2.3.1

Installed Java and Scala libraries (Scala 2.12 cluster version)

Group ID

Artifact ID

Version

antlr

antlr

2.7.7

com.amazonaws

amazon-kinesis-client

1.12.0

com.amazonaws

aws-java-sdk-autoscaling

1.12.610

com.amazonaws

aws-java-sdk-cloudformation

1.12.610

com.amazonaws

aws-java-sdk-cloudfront

1.12.610

com.amazonaws

aws-java-sdk-cloudhsm

1.12.610

com.amazonaws

aws-java-sdk-cloudsearch

1.12.610

com.amazonaws

aws-java-sdk-cloudtrail

1.12.610

com.amazonaws

aws-java-sdk-cloudwatch

1.12.610

com.amazonaws

aws-java-sdk-cloudwatchmetrics

1.12.610

com.amazonaws

aws-java-sdk-codedeploy

1.12.610

com.amazonaws

aws-java-sdk-cognitoidentity

1.12.610

com.amazonaws

aws-java-sdk-cognitosync

1.12.610

com.amazonaws

aws-java-sdk-config

1.12.610

com.amazonaws

aws-java-sdk-core

1.12.610

com.amazonaws

aws-java-sdk-datapipeline

1.12.610

com.amazonaws

aws-java-sdk-directconnect

1.12.610

com.amazonaws

aws-java-sdk-directory

1.12.610

com.amazonaws

aws-java-sdk-dynamodb

1.12.610

com.amazonaws

aws-java-sdk-ec2

1.12.610

com.amazonaws

aws-java-sdk-ecs

1.12.610

com.amazonaws

aws-java-sdk-efs

1.12.610

com.amazonaws

aws-java-sdk-elasticache

1.12.610

com.amazonaws

aws-java-sdk-elasticbeanstalk

1.12.610

com.amazonaws

aws-java-sdk-elasticloadbalancing

1.12.610

com.amazonaws

aws-java-sdk-elastictranscoder

1.12.610

com.amazonaws

aws-java-sdk-emr

1.12.610

com.amazonaws

aws-java-sdk-glacier

1.12.610

com.amazonaws

aws-java-sdk-glue

1.12.610

com.amazonaws

aws-java-sdk-iam

1.12.610

com.amazonaws

aws-java-sdk-importexport

1.12.610

com.amazonaws

aws-java-sdk-kinesis

1.12.610

com.amazonaws

aws-java-sdk-kms

1.12.610

com.amazonaws

aws-java-sdk-lambda

1.12.610

com.amazonaws

aws-java-sdk-logs

1.12.610

com.amazonaws

aws-java-sdk-machinelearning

1.12.610

com.amazonaws

aws-java-sdk-opsworks

1.12.610

com.amazonaws

aws-java-sdk-rds

1.12.610

com.amazonaws

aws-java-sdk-redshift

1.12.610

com.amazonaws

aws-java-sdk-route53

1.12.610

com.amazonaws

aws-java-sdk-s3

1.12.610

com.amazonaws

aws-java-sdk-ses

1.12.610

com.amazonaws

aws-java-sdk-simpledb

1.12.610

com.amazonaws

aws-java-sdk-simpleworkflow

1.12.610

com.amazonaws

aws-java-sdk-sns

1.12.610

com.amazonaws

aws-java-sdk-sqs

1.12.610

com.amazonaws

aws-java-sdk-ssm

1.12.610

com.amazonaws

aws-java-sdk-storagegateway

1.12.610

com.amazonaws

aws-java-sdk-sts

1.12.610

com.amazonaws

aws-java-sdk-support

1.12.610

com.amazonaws

aws-java-sdk-swf-libraries

1.11.22

com.amazonaws

aws-java-sdk-workspaces

1.12.610

com.amazonaws

jmespath-java

1.12.610

com.clearspring.analytics

stream

2.9.6

com.databricks

Rserve

1.8-3

com.databricks

databricks-sdk-java

0.17.1

com.databricks

jets3t

0.7.1-0

com.databricks.scalapb

compilerplugin_2.12

0.4.15-10

com.databricks.scalapb

scalapb-runtime_2.12

0.4.15-10

com.esotericsoftware

kryo-shaded

4.0.2

com.esotericsoftware

minlog

1.3.0

com.fasterxml

classmate

1.3.4

com.fasterxml.jackson.core

jackson-annotations

2.15.2

com.fasterxml.jackson.core

jackson-core

2.15.2

com.fasterxml.jackson.core

jackson-databind

2.15.2

com.fasterxml.jackson.dataformat

jackson-dataformat-cbor

2.15.2

com.fasterxml.jackson.dataformat

jackson-dataformat-yaml

2.15.2

com.fasterxml.jackson.datatype

jackson-datatype-joda

2.15.2

com.fasterxml.jackson.datatype

jackson-datatype-jsr310

2.16.0

com.fasterxml.jackson.module

jackson-module-paranamer

2.15.2

com.fasterxml.jackson.module

jackson-module-scala_2.12

2.15.2

com.github.ben-manes.caffeine

caffeine

2.9.3

com.github.fommil

jniloader

1.1

com.github.fommil.netlib

native_ref-java

1.1

com.github.fommil.netlib

native_ref-java

1.1-natives

com.github.fommil.netlib

native_system-java

1.1

com.github.fommil.netlib

native_system-java

1.1-natives

com.github.fommil.netlib

netlib-native_ref-linux-x86_64

1.1-natives

com.github.fommil.netlib

netlib-native_system-linux-x86_64

1.1-natives

com.github.luben

zstd-jni

1.5.5-4

com.github.wendykierp

JTransforms

3.1

com.google.code.findbugs

jsr305

3.0.0

com.google.code.gson

gson

2.10.1

com.google.crypto.tink

tink

1.9.0

com.google.errorprone

error_prone_annotations

2.10.0

com.google.flatbuffers

flatbuffers-java

23.5.26

com.google.guava

guava

15.0

com.google.protobuf

protobuf-java

2.6.1

com.helger

profiler

1.1.1

com.ibm.icu

icu4j

72.1

com.jcraft

jsch

0.1.55

com.jolbox

bonecp

0.8.0.RELEASE

com.lihaoyi

sourcecode_2.12

0.1.9

com.microsoft.azure

azure-data-lake-store-sdk

2.3.9

com.microsoft.sqlserver

mssql-jdbc

11.2.2.jre8

com.ning

compress-lzf

1.1.2

com.sun.mail

javax.mail

1.5.2

com.sun.xml.bind

jaxb-core

2.2.11

com.sun.xml.bind

jaxb-impl

2.2.11

com.tdunning

json

1.8

com.thoughtworks.paranamer

paranamer

2.8

com.trueaccord.lenses

lenses_2.12

0.4.12

com.twitter

chill-java

0.10.0

com.twitter

chill_2.12

0.10.0

com.twitter

util-app_2.12

7.1.0

com.twitter

util-core_2.12

7.1.0

com.twitter

util-function_2.12

7.1.0

com.twitter

util-jvm_2.12

7.1.0

com.twitter

util-lint_2.12

7.1.0

com.twitter

util-registry_2.12

7.1.0

com.twitter

util-stats_2.12

7.1.0

com.typesafe

config

1.4.3

com.typesafe.scala-logging

scala-logging_2.12

3.7.2

com.uber

h3

3.7.3

com.univocity

univocity-parsers

2.9.1

com.zaxxer

HikariCP

4.0.3

commons-cli

commons-cli

1.5.0

commons-codec

commons-codec

1.16.0

commons-collections

commons-collections

3.2.2

commons-dbcp

commons-dbcp

1.4

commons-fileupload

commons-fileupload

1.5

commons-httpclient

commons-httpclient

3.1

commons-io

commons-io

2.13.0

commons-lang

commons-lang

2.6

commons-logging

commons-logging

1.1.3

commons-pool

commons-pool

1.5.4

dev.ludovic.netlib

arpack

3.0.3

dev.ludovic.netlib

blas

3.0.3

dev.ludovic.netlib

lapack

3.0.3

info.ganglia.gmetric4j

gmetric4j

1.0.10

io.airlift

aircompressor

0.25

io.delta

delta-sharing-client_2.12

1.0.5

io.dropwizard.metrics

metrics-annotation

4.2.19

io.dropwizard.metrics

metrics-core

4.2.19

io.dropwizard.metrics

metrics-graphite

4.2.19

io.dropwizard.metrics

metrics-healthchecks

4.2.19

io.dropwizard.metrics

metrics-jetty9

4.2.19

io.dropwizard.metrics

metrics-jmx

4.2.19

io.dropwizard.metrics

metrics-json

4.2.19

io.dropwizard.metrics

metrics-jvm

4.2.19

io.dropwizard.metrics

metrics-servlets

4.2.19

io.netty

netty-all

4.1.96.Final

io.netty

netty-buffer

4.1.96.Final

io.netty

netty-codec

4.1.96.Final

io.netty

netty-codec-http

4.1.96.Final

io.netty

netty-codec-http2

4.1.96.Final

io.netty

netty-codec-socks

4.1.96.Final

io.netty

netty-common

4.1.96.Final

io.netty

netty-handler

4.1.96.Final

io.netty

netty-handler-proxy

4.1.96.Final

io.netty

netty-resolver

4.1.96.Final

io.netty

netty-tcnative-boringssl-static

2.0.61.Final-linux-aarch_64

io.netty

netty-tcnative-boringssl-static

2.0.61.Final-linux-x86_64

io.netty

netty-tcnative-boringssl-static

2.0.61.Final-osx-aarch_64

io.netty

netty-tcnative-boringssl-static

2.0.61.Final-osx-x86_64

io.netty

netty-tcnative-boringssl-static

2.0.61.Final-windows-x86_64

io.netty

netty-tcnative-classes

2.0.61.Final

io.netty

netty-transport

4.1.96.Final

io.netty

netty-transport-classes-epoll

4.1.96.Final

io.netty

netty-transport-classes-kqueue

4.1.96.Final

io.netty

netty-transport-native-epoll

4.1.96.Final

io.netty

netty-transport-native-epoll

4.1.96.Final-linux-aarch_64

io.netty

netty-transport-native-epoll

4.1.96.Final-linux-x86_64

io.netty

netty-transport-native-kqueue

4.1.96.Final-osx-aarch_64

io.netty

netty-transport-native-kqueue

4.1.96.Final-osx-x86_64

io.netty

netty-transport-native-unix-common

4.1.96.Final

io.prometheus

simpleclient

0.7.0

io.prometheus

simpleclient_common

0.7.0

io.prometheus

simpleclient_dropwizard

0.7.0

io.prometheus

simpleclient_pushgateway

0.7.0

io.prometheus

simpleclient_servlet

0.7.0

io.prometheus.jmx

collector

0.12.0

jakarta.annotation

jakarta.annotation-api

1.3.5

jakarta.servlet

jakarta.servlet-api

4.0.3

jakarta.validation

jakarta.validation-api

2.0.2

jakarta.ws.rs

jakarta.ws.rs-api

2.1.6

javax.activation

activation

1.1.1

javax.el

javax.el-api

2.2.4

javax.jdo

jdo-api

3.0.1

javax.transaction

jta

1.1

javax.transaction

transaction-api

1.1

javax.xml.bind

jaxb-api

2.2.11

javolution

javolution

5.5.1

jline

jline

2.14.6

joda-time

joda-time

2.12.1

net.java.dev.jna

jna

5.8.0

net.razorvine

pickle

1.3

net.sf.jpam

jpam

1.1

net.sf.opencsv

opencsv

2.3

net.sf.supercsv

super-csv

2.2.0

net.snowflake

snowflake-ingest-sdk

0.9.6

net.sourceforge.f2j

arpack_combined_all

0.1

org.acplt.remotetea

remotetea-oncrpc

1.1.2

org.antlr

ST4

4.0.4

org.antlr

antlr-runtime

3.5.2

org.antlr

antlr4-runtime

4.9.3

org.antlr

stringtemplate

3.2.1

org.apache.ant

ant

1.10.11

org.apache.ant

ant-jsch

1.10.11

org.apache.ant

ant-launcher

1.10.11

org.apache.arrow

arrow-format

15.0.0

org.apache.arrow

arrow-memory-core

15.0.0

org.apache.arrow

arrow-memory-netty

15.0.0

org.apache.arrow

arrow-vector

15.0.0

org.apache.avro

avro

1.11.3

org.apache.avro

avro-ipc

1.11.3

org.apache.avro

avro-mapred

1.11.3

org.apache.commons

commons-collections4

4.4

org.apache.commons

commons-compress

1.23.0

org.apache.commons

commons-crypto

1.1.0

org.apache.commons

commons-lang3

3.12.0

org.apache.commons

commons-math3

3.6.1

org.apache.commons

commons-text

1.10.0

org.apache.curator

curator-client

2.13.0

org.apache.curator

curator-framework

2.13.0

org.apache.curator

curator-recipes

2.13.0

org.apache.datasketches

datasketches-java

3.1.0

org.apache.datasketches

datasketches-memory

2.0.0

org.apache.derby

derby

10.14.2.0

org.apache.hadoop

hadoop-client-runtime

3.3.6

org.apache.hive

hive-beeline

2.3.9

org.apache.hive

hive-cli

2.3.9

org.apache.hive

hive-jdbc

2.3.9

org.apache.hive

hive-llap-client

2.3.9

org.apache.hive

hive-llap-common

2.3.9

org.apache.hive

hive-serde

2.3.9

org.apache.hive

hive-shims

2.3.9

org.apache.hive

hive-storage-api

2.8.1

org.apache.hive.shims

hive-shims-0.23

2.3.9

org.apache.hive.shims

hive-shims-common

2.3.9

org.apache.hive.shims

hive-shims-scheduler

2.3.9

org.apache.httpcomponents

httpclient

4.5.14

org.apache.httpcomponents

httpcore

4.4.16

org.apache.ivy

ivy

2.5.1

org.apache.logging.log4j

log4j-1.2-api

2.22.1

org.apache.logging.log4j

log4j-api

2.22.1

org.apache.logging.log4j

log4j-core

2.22.1

org.apache.logging.log4j

log4j-layout-template-json

2.22.1

org.apache.logging.log4j

log4j-slf4j2-impl

2.22.1

org.apache.orc

orc-core

1.9.2-shaded-protobuf

org.apache.orc

orc-mapreduce

1.9.2-shaded-protobuf

org.apache.orc

orc-shims

1.9.2

org.apache.thrift

libfb303

0.9.3

org.apache.thrift

libthrift

0.12.0

org.apache.ws.xmlschema

xmlschema-core

2.3.0

org.apache.xbean

xbean-asm9-shaded

4.23

org.apache.yetus

audience-annotations

0.13.0

org.apache.zookeeper

zookeeper

3.6.3

org.apache.zookeeper

zookeeper-jute

3.6.3

org.checkerframework

checker-qual

3.31.0

org.codehaus.jackson

jackson-core-asl

1.9.13

org.codehaus.jackson

jackson-mapper-asl

1.9.13

org.codehaus.janino

commons-compiler

3.0.16

org.codehaus.janino

janino

3.0.16

org.datanucleus

datanucleus-api-jdo

4.2.4

org.datanucleus

datanucleus-core

4.1.17

org.datanucleus

datanucleus-rdbms

4.1.19

org.datanucleus

javax.jdo

3.2.0-m3

org.eclipse.collections

eclipse-collections

11.1.0

org.eclipse.collections

eclipse-collections-api

11.1.0

org.eclipse.jetty

jetty-client

9.4.52.v20230823

org.eclipse.jetty

jetty-continuation

9.4.52.v20230823

org.eclipse.jetty

jetty-http

9.4.52.v20230823

org.eclipse.jetty

jetty-io

9.4.52.v20230823

org.eclipse.jetty

jetty-jndi

9.4.52.v20230823

org.eclipse.jetty

jetty-plus

9.4.52.v20230823

org.eclipse.jetty

jetty-proxy

9.4.52.v20230823

org.eclipse.jetty

jetty-security

9.4.52.v20230823

org.eclipse.jetty

jetty-server

9.4.52.v20230823

org.eclipse.jetty

jetty-servlet

9.4.52.v20230823

org.eclipse.jetty

jetty-servlets

9.4.52.v20230823

org.eclipse.jetty

jetty-util

9.4.52.v20230823

org.eclipse.jetty

jetty-util-ajax

9.4.52.v20230823

org.eclipse.jetty

jetty-webapp

9.4.52.v20230823

org.eclipse.jetty

jetty-xml

9.4.52.v20230823

org.eclipse.jetty.websocket

websocket-api

9.4.52.v20230823

org.eclipse.jetty.websocket

websocket-client

9.4.52.v20230823

org.eclipse.jetty.websocket

websocket-common

9.4.52.v20230823

org.eclipse.jetty.websocket

websocket-server

9.4.52.v20230823

org.eclipse.jetty.websocket

websocket-servlet

9.4.52.v20230823

org.fusesource.leveldbjni

leveldbjni-all

1.8

org.glassfish.hk2

hk2-api

2.6.1

org.glassfish.hk2

hk2-locator

2.6.1

org.glassfish.hk2

hk2-utils

2.6.1

org.glassfish.hk2

osgi-resource-locator

1.0.3

org.glassfish.hk2.external

aopalliance-repackaged

2.6.1

org.glassfish.hk2.external

jakarta.inject

2.6.1

org.glassfish.jersey.containers

jersey-container-servlet

2.40

org.glassfish.jersey.containers

jersey-container-servlet-core

2.40

org.glassfish.jersey.core

jersey-client

2.40

org.glassfish.jersey.core

jersey-common

2.40

org.glassfish.jersey.core

jersey-server

2.40

org.glassfish.jersey.inject

jersey-hk2

2.40

org.hibernate.validator

hibernate-validator

6.1.7.Final

org.ini4j

ini4j

0.5.4

org.javassist

javassist

3.29.2-GA

org.jboss.logging

jboss-logging

3.3.2.Final

org.jdbi

jdbi

2.63.1

org.jetbrains

annotations

17.0.0

org.joda

joda-convert

1.7

org.jodd

jodd-core

3.5.2

org.json4s

json4s-ast_2.12

3.7.0-M11

org.json4s

json4s-core_2.12

3.7.0-M11

org.json4s

json4s-jackson_2.12

3.7.0-M11

org.json4s

json4s-scalap_2.12

3.7.0-M11

org.lz4

lz4-java

1.8.0

org.mlflow

mlflow-spark_2.12

2.9.1

org.objenesis

objenesis

2.5.1

org.postgresql

postgresql

42.6.1

org.roaringbitmap

RoaringBitmap

0.9.45

org.roaringbitmap

shims

0.9.45

org.rocksdb

rocksdbjni

8.3.2

org.rosuda.REngine

REngine

2.1.0

org.scala-lang

scala-compiler_2.12

2.12.15

org.scala-lang

scala-library_2.12

2.12.15

org.scala-lang

scala-reflect_2.12

2.12.15

org.scala-lang.modules

scala-collection-compat_2.12

2.11.0

org.scala-lang.modules

scala-parser-combinators_2.12

1.1.2

org.scala-lang.modules

scala-xml_2.12

1.2.0

org.scala-sbt

test-interface

1.0

org.scalacheck

scalacheck_2.12

1.14.2

org.scalactic

scalactic_2.12

3.2.15

org.scalanlp

breeze-macros_2.12

2.1.0

org.scalanlp

breeze_2.12

2.1.0

org.scalatest

scalatest-compatible

3.2.15

org.scalatest

scalatest-core_2.12

3.2.15

org.scalatest

scalatest-diagrams_2.12

3.2.15

org.scalatest

scalatest-featurespec_2.12

3.2.15

org.scalatest

scalatest-flatspec_2.12

3.2.15

org.scalatest

scalatest-freespec_2.12

3.2.15

org.scalatest

scalatest-funspec_2.12

3.2.15

org.scalatest

scalatest-funsuite_2.12

3.2.15

org.scalatest

scalatest-matchers-core_2.12

3.2.15

org.scalatest

scalatest-mustmatchers_2.12

3.2.15

org.scalatest

scalatest-propspec_2.12

3.2.15

org.scalatest

scalatest-refspec_2.12

3.2.15

org.scalatest

scalatest-shouldmatchers_2.12

3.2.15

org.scalatest

scalatest-wordspec_2.12

3.2.15

org.scalatest

scalatest_2.12

3.2.15

org.slf4j

jcl-over-slf4j

2.0.7

org.slf4j

jul-to-slf4j

2.0.7

org.slf4j

slf4j-api

2.0.7

org.slf4j

slf4j-simple

1.7.25

org.threeten

threeten-extra

1.7.1

org.tukaani

xz

1.9

org.typelevel

algebra_2.12

2.0.1

org.typelevel

cats-kernel_2.12

2.1.1

org.typelevel

spire-macros_2.12

0.17.0

org.typelevel

spire-platform_2.12

0.17.0

org.typelevel

spire-util_2.12

0.17.0

org.typelevel

spire_2.12

0.17.0

org.wildfly.openssl

wildfly-openssl

1.1.3.Final

org.xerial

sqlite-jdbc

3.42.0.0

org.xerial.snappy

snappy-java

1.1.10.3

org.yaml

snakeyaml

2.0

oro

oro

2.0.8

pl.edu.icm

JLargeArrays

1.5

software.amazon.cryptools

AmazonCorrettoCryptoProvider

1.6.1-linux-x86_64

software.amazon.ion

ion-java

1.0.2

stax

stax-api

1.0.1