atoti_directquery_databricks.ConnectionConfig#
- final class atoti_directquery_databricks.ConnectionConfig#
Config to connect to a Databricks database.
Native array aggregation is not supported on SQL warehouses.
Example
>>> import os
>>> from atoti_directquery_databricks import ConnectionConfig
>>> connection_config = ConnectionConfig(
...     url="jdbc:databricks://"
...     + os.environ["DATABRICKS_SERVER_HOSTNAME"]
...     + "/default;"
...     + "transportMode=http;"
...     + "ssl=1;"
...     + "httpPath="
...     + os.environ["DATABRICKS_HTTP_PATH_CLUSTER_RUNTIME_15"]
...     + ";"
...     + "AuthMech=3;"
...     + "UID=token;"
...     + "EnableArrow=0;",
...     password=os.environ["DATABRICKS_AUTH_TOKEN"],
... )
>>> external_database = session.connect_to_external_database(connection_config)
- array_long_agg_function_name: str | None = None#
The name (if different from the default) of the UDAF performing atoti.agg.long() on native arrays.
Note
This function must be defined in Databricks and accessible to the role running the queries.
- array_short_agg_function_name: str | None = None#
The name (if different from the default) of the UDAF performing atoti.agg.short() on native arrays.
Note
This function must be defined in Databricks and accessible to the role running the queries.
- array_sum_agg_function_name: str | None = None#
The name (if different from the default) of the UDAF performing atoti.agg.sum() on native arrays.
Note
This function must be defined in Databricks and accessible to the role running the queries.
- array_sum_product_agg_function_name: str | None = None#
The name (if different from the default) of the UDAF performing atoti.agg.sum_product() on native arrays.
Note
This function must be defined in Databricks and accessible to the role running the queries.
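For illustration, here is a minimal sketch of overriding two of these UDAF names, assuming the attributes listed on this page can be passed as keyword arguments at construction time (as url and password are in the example above) and that the fully qualified names below are hypothetical UDAFs registered in Databricks:
>>> jdbc_url = (  # Same JDBC connection string as in the example above.
...     "jdbc:databricks://"
...     + os.environ["DATABRICKS_SERVER_HOSTNAME"]
...     + "/default;transportMode=http;ssl=1;httpPath="
...     + os.environ["DATABRICKS_HTTP_PATH_CLUSTER_RUNTIME_15"]
...     + ";AuthMech=3;UID=token;EnableArrow=0;"
... )
>>> connection_config = ConnectionConfig(
...     url=jdbc_url,
...     password=os.environ["DATABRICKS_AUTH_TOKEN"],
...     # Hypothetical UDAF names; they must exist in Databricks and be
...     # accessible to the role running the queries:
...     array_sum_agg_function_name="my_catalog.my_schema.array_sum_agg",
...     array_long_agg_function_name="my_catalog.my_schema.array_long_agg",
... )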
- auto_multi_column_array_conversion: AutoMultiColumnArrayConversion | None = None#
When not None, multi-column array conversion will be performed automatically.
- column_clustered_queries: 'all' | 'feeding' = 'feeding'#
Control which queries will use clustering columns.
- feeding_query_timeout: Duration = datetime.timedelta(seconds=3600)#
Timeout for queries performed on the external database during feeding phases.
The feeding phases are:
- the initial load to feed aggregate_providers and hierarchies;
- the refresh operations.
- feeding_url: str | None = None#
When not None, this JDBC connection string will be used instead of url for the feeding phases.
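A sketch combining the feeding-related settings, assuming a hypothetical feeding_jdbc_url built the same way as jdbc_url above but pointing at a compute target better suited to the heavy feeding queries:
>>> from datetime import timedelta
>>> connection_config = ConnectionConfig(
...     url=jdbc_url,  # Connection string used outside feeding phases.
...     feeding_url=feeding_jdbc_url,  # Hypothetical second JDBC string for feeding.
...     feeding_query_timeout=timedelta(hours=2),  # Allow slower initial loads.
...     password=os.environ["DATABRICKS_AUTH_TOKEN"],
... )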
- lookup_mode: 'allow' | 'warn' | 'deny' = 'warn'#
Whether lookup queries on the external database are allowed.
Lookup can be very slow and expensive as the database may not enforce primary keys.
- max_sub_queries: Annotated[int, Field(gt=0)] = 500#
Maximum number of sub-queries performed when splitting a query into multi-step queries.
- password: str | None = None#
The password to connect to the database.
Passing it in this separate attribute prevents it from being logged alongside the connection string.
If None, a password is expected to be present in url.
- query_timeout: Duration = datetime.timedelta(seconds=300)#
Timeout for queries performed on the external database outside feeding phases.
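A sketch tightening query behavior outside the feeding phases, again assuming keyword-argument construction and reusing jdbc_url from the sketch above:
>>> from datetime import timedelta
>>> connection_config = ConnectionConfig(
...     url=jdbc_url,
...     password=os.environ["DATABRICKS_AUTH_TOKEN"],
...     lookup_mode="deny",  # Fail instead of running potentially expensive lookups.
...     query_timeout=timedelta(minutes=10),  # Abort regular queries after 10 minutes.
... )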
- time_travel: Literal[False, 'lax', 'strict'] = 'strict'#
How to use Databricks’ time travel feature.
Databricks does not support time travel with views, so the options are:
- False: tables and views are queried on the latest state of the database.
- "lax": tables are queried with time travel but views are queried without it.
- "strict": tables are queried with time travel and querying a view raises an error.
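A sketch opting into the more permissive time travel behavior, reusing jdbc_url from the sketch above:
>>> connection_config = ConnectionConfig(
...     url=jdbc_url,
...     password=os.environ["DATABRICKS_AUTH_TOKEN"],
...     time_travel="lax",  # Tables with time travel, views without it.
... )
>>> external_database = session.connect_to_external_database(connection_config)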