Databricks SQL Driver for Node.js

Note

This article covers the Databricks SQL Driver for Node.js, which is provided as-is and is not supported by Databricks through customer technical support channels. Questions and feature requests can be communicated through the Issues page of the databricks/databricks-sql-nodejs repo on GitHub.

The Databricks SQL Driver for Node.js is a Node.js library that allows you to use JavaScript code to run SQL commands on Databricks compute resources.

Requirements

Specify the connection variables

To access your cluster or SQL warehouse, the Databricks SQL Driver for Node.js uses connection variables named token, server_hostname and http_path, representing your Databricks personal access token and your cluster’s or SQL warehouse’s Server Hostname and HTTP Path values, respectively.

The Databricks personal access token value for token is similar to the following: dapi1ab2c34defabc567890123d4efa56789.

The Server Hostname value for server_hostname is similar to the following: 1234567890123456.7.gcp.databricks.com.

The HTTP Path value for http_path is similar to the following: for a cluster, sql/protocolv1/o/1234567890123456/1234-567890-abcdefgh; and for a SQL warehouse, /sql/1.0/endpoints/a1b234c5678901d2.

Note

As a security best practice, you should not hard code these connection variable values into your code. Instead, you should retrieve these connection variable values from a secure location. For example, the code example later in this article uses environment variables.

Query data

The following code example demonstrates how to call the Databricks SQL Driver for Node.js to run a basic SQL query on a Databricks compute resource. This command returns the first two rows from the diamonds table.

The diamonds table is included in Sample datasets. This table is also featured in Get started with Databricks as a data scientist.

This code example retrieves the token, server_hostname and http_path connection variable values from a set of environment variables. These environment variables have the following environment variable names:

  • DATABRICKS_TOKEN, which represents your Databricks personal access token from the requirements.

  • DATABRICKS_SERVER_HOSTNAME, which represents the Server Hostname value from the requirements.

  • DATABRICKS_HTTP_PATH, which represents the HTTP Path value from the requirements.

You can use other approaches to retrieving these connection variable values. Using environment variables is just one approach among many.

const { DBSQLClient } = require('@databricks/sql');

var token          = process.env.DATABRICKS_TOKEN;
var serverHostname = process.env.DATABRICKS_SERVER_HOSTNAME;
var httpPath       = process.env.DATABRICKS_HTTP_PATH;

if (!token || !serverHostname || !httpPath) {
  throw new Error("Cannot find Server Hostname, HTTP Path, or personal access token. " +
                  "Check the environment variables DATABRICKS_TOKEN, " +
                  "DATABRICKS_SERVER_HOSTNAME, and DATABRICKS_HTTP_PATH.");
}

const client = new DBSQLClient();
const utils  = DBSQLClient.utils;

client.connect(
  options = {
    token: token,
    host:  serverHostname,
    path:  httpPath
  }).then(
    async client => {
      const session = await client.openSession();

      const queryOperation = await session.executeStatement(
        statement = 'SELECT * FROM default.diamonds LIMIT 2',
        options   = {
          runAsync: true,
          maxRows:  10000 // This option enables the direct results feature.
        }
      );

      const result = await queryOperation.fetchAll({
        progress: false,
        callback: () => {},
      });

      await queryOperation.close();

      console.table(result);

      await session.close();
      await client.close();
}).catch((error) => {
  console.log(error);
});
import { DBSQLClient } from '@databricks/sql';
import IDBSQLSession from '@databricks/sql/dist/contracts/IDBSQLSession';
import IOperation from '@databricks/sql/dist/contracts/IOperation';

var serverHostname: string = process.env.DATABRICKS_SERVER_HOSTNAME || '';
var httpPath: string       = process.env.DATABRICKS_HTTP_PATH || '';
var token: string          = process.env.DATABRICKS_TOKEN || '';

if (serverHostname == '' || httpPath == '' || token == '') {
  throw new Error("Cannot find Server Hostname, HTTP Path, or personal access token. " +
                  "Check the environment variables DATABRICKS_SERVER_HOSTNAME, " +
                  "DATABRICKS_HTTP_PATH, and DATABRICKS_TOKEN.");
}

const client: DBSQLClient = new DBSQLClient();

client.connect(
  {
    host:  serverHostname,
    path:  httpPath,
    token: token
  }).then(
    async client => {
      const session: IDBSQLSession = await client.openSession();

      const queryOperation: IOperation = await session.executeStatement(
        'SELECT * FROM default.diamonds LIMIT 2',
        {
          runAsync: true,
          maxRows:  10000 // This option enables the direct results feature.
        }
      );

      const result = await queryOperation.fetchAll({
        progress: false,
        callback: () => {},
      });

      await queryOperation.close();

      console.table(result);

      await session.close();
      client.close();
}).catch((error) => {
  console.log(error);
});

Output:

┌─────────┬─────┬────────┬───────────┬───────┬─────────┬────────┬───────┬───────┬────────┬────────┬────────┐
│ (index) │ _c0 │ carat  │    cut    │ color │ clarity │ depth  │ table │ price │   x    │   y    │   z    │
├─────────┼─────┼────────┼───────────┼───────┼─────────┼────────┼───────┼───────┼────────┼────────┼────────┤
│    0    │ '1' │ '0.23' │  'Ideal'  │  'E'  │  'SI2'  │ '61.5' │ '55'  │ '326' │ '3.95' │ '3.98' │ '2.43' │
│    1    │ '2' │ '0.21' │ 'Premium' │  'E'  │  'SI1'  │ '59.8' │ '61'  │ '326' │ '3.89' │ '3.84' │ '2.31' │
└─────────┴─────┴────────┴───────────┴───────┴─────────┴────────┴───────┴───────┴────────┴────────┴────────┘

For additional examples, see the examples folder in the databricks/databricks-sql-nodejs repository on GitHub.

Additional resources