atoti_directquery_databricks.DatabricksConnectionInfo

class atoti_directquery_databricks.DatabricksConnectionInfo

Information needed to connect to a Databricks database.

__init__(url, /, *, auto_multi_column_array_conversion=None, heavy_load_url=None, lookup_mode='warn', max_sub_queries=500, password=None, query_timeout=datetime.timedelta(seconds=3600), time_travel='strict')

Create a Databricks connection info.

To aggregate native Databricks arrays, UDAFs (User Defined Aggregation Functions) provided by ActiveViam must be registered on the cluster. Native array aggregation is not supported on SQL warehouses.

Parameters:
  • url (str) – The JDBC connection string.

  • auto_multi_column_array_conversion (AutoMultiColumnArrayConversion | None) – When not None, multi-column array conversion is performed automatically.

  • heavy_load_url (str | None) – When not None, this JDBC connection string will be used instead of url for the heavy load phases (e.g. startup and refresh).

  • lookup_mode (Literal['allow', 'warn', 'deny']) – Whether lookup queries on the external database are allowed. Lookups can be very slow and expensive because the database may not enforce primary keys.

  • max_sub_queries (int) – Maximum number of subqueries performed when splitting a query into a multi-step query.

  • password (str | None) –

    The password to connect to the database.

    Passing it in this separate parameter keeps it from being logged alongside the connection string.

    If None, the password is expected to be present in url.

  • query_timeout (timedelta) – Timeout for queries performed on the external database.

  • time_travel (Literal[False, 'lax', 'strict']) –

    How to use Databricks’ time travel feature.

    Databricks does not support time travel with views, so the options are:

    • False: tables and views are queried on the latest state of the database.

    • "lax": tables are queried with time travel but views are queried without it.

    • "strict": tables are queried with time travel and querying a view raises an error.
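
The three time_travel options above can be summarized as a small decision function. The following is only an illustrative model of the documented behavior, not atoti's implementation; the function name uses_time_travel and the "table" / "view" labels are hypothetical and exist only for this sketch.

```python
from typing import Literal, Union

# The accepted values, as listed in the parameter description above.
TimeTravel = Union[Literal[False], Literal["lax", "strict"]]


def uses_time_travel(kind: Literal["table", "view"], time_travel: TimeTravel) -> bool:
    """Return True if an object of this kind would be queried with time travel.

    Hypothetical helper modelling the documented options; raises ValueError
    for the one combination the documentation describes as an error.
    """
    if kind not in ("table", "view"):
        raise ValueError(f"unknown kind: {kind!r}")
    if time_travel is False:
        # Everything is queried on the latest state of the database.
        return False
    if time_travel not in ("lax", "strict"):
        raise ValueError(f"unknown time_travel value: {time_travel!r}")
    if kind == "table":
        # Tables use time travel in both "lax" and "strict" modes.
        return True
    if time_travel == "lax":
        # Views are queried without time travel in "lax" mode.
        return False
    # "strict" mode: querying a view is an error.
    raise ValueError("views cannot be queried when time_travel='strict'")
```

In practice this means that a data model containing views needs time_travel="lax" or time_travel=False, since the default "strict" raises an error as soon as a view is queried.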

Example

>>> import os
>>> from atoti_directquery_databricks import DatabricksConnectionInfo
>>> connection_info = DatabricksConnectionInfo(
...     "jdbc:databricks://"
...     + os.environ["DATABRICKS_SERVER_HOSTNAME"]
...     + "/default;"
...     + "transportMode=http;"
...     + "ssl=1;"
...     + "httpPath="
...     + os.environ["DATABRICKS_HTTP_PATH"]
...     + ";"
...     + "AuthMech=3;"
...     + "UID=token;",
...     password=os.environ["DATABRICKS_AUTH_TOKEN"],
... )
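>>> # session is assumed to be an already-started atoti session; it is not created in this example.
>>> external_database = session.connect_to_external_database(connection_info)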
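
The string concatenation in the example above can be wrapped in a small helper. This is a minimal sketch using only the standard library: the function name build_databricks_jdbc_url is hypothetical, the property values are copied from the example, and the `default` path segment is assumed here to name the schema. The token is deliberately left out of the URL so that it can be passed through the password parameter.

```python
def build_databricks_jdbc_url(
    server_hostname: str,
    http_path: str,
    *,
    schema: str = "default",  # assumption: the path segment names the schema
) -> str:
    """Build a JDBC connection string shaped like the one in the example.

    Hypothetical helper; it does not include the token, which is meant to be
    passed separately as the password.
    """
    # Same properties, in the same order, as the concatenated example.
    properties = {
        "transportMode": "http",
        "ssl": "1",
        "httpPath": http_path,
        "AuthMech": "3",
        "UID": "token",
    }
    suffix = "".join(f"{key}={value};" for key, value in properties.items())
    return f"jdbc:databricks://{server_hostname}/{schema};{suffix}"
```

The result can then be passed as the url argument, for example DatabricksConnectionInfo(build_databricks_jdbc_url(hostname, http_path), password=token), where hostname, http_path and token are read from the environment as in the example.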