atoti_directquery_synapse.SynapseTableOptions#

class atoti_directquery_synapse.SynapseTableOptions#
__init__(*, array_conversion=None, clustering_columns=(), keys=None)#

Additional options about the external table to create.

Parameters:
  • array_conversion (MultiColumnArrayConversion | MultiRowArrayConversion | None) – Option to convert some values spread over multiple columns or rows into array columns

  • clustering_columns (Sequence[str]) –

    The names of the columns used for clustering.

    Feeding aggregate providers from an external database can result in very large queries to be run on this database. Clustering columns split up queries made by DirectQuery to the external database when feeding aggregate providers.

  • keys (Sequence[str] | None) –

    The columns that will become keys of the table.

    Inserting a row containing key values equal to the ones of an existing row will replace the existing row with the new one.

    Key columns cannot have None as their default_value.

Example

>>> from atoti_directquery_synapse import SynapseTableOptions
>>> external_table = external_database.tables["TUTORIAL", "SALES"]
>>> table = session.add_external_table(
...     external_table,
...     table_name="sales_renamed",
...     options=SynapseTableOptions(keys=["SALE_ID"]),
... )

To get arrays from values stored on multiple rows:

>>> multi_row_external_table = external_database.tables[
...     "TUTORIAL", "MULTI_ROW_QUANTITY"
... ]
>>> multi_row_external_table.columns
['PRODUCT', 'INDEX', 'QUANTITY']
>>> multi_row_table = session.add_external_table(
...     multi_row_external_table,
...     table_name="Sales (Multi row array)",
...     options=SynapseTableOptions(
...         array_conversion=tt.MultiRowArrayConversion(
...             index_column="INDEX",
...             array_columns=["QUANTITY"],
...         ),
...     ),
... )
>>> multi_row_table.head()
                                 QUANTITY
PRODUCT
product_1  [10.0, 20.0, 15.0, 25.0, 10.0]
product_2  [50.0, 65.0, 55.0, 30.0, 80.0]

To get arrays from values stored on multiple columns:

>>> multi_column_external_table = external_database.tables[
...     "TUTORIAL", "MULTI_COLUMN_QUANTITY"
... ]
>>> multi_column_external_table.columns
['PRODUCT', 'QUANTITY_0', 'QUANTITY_1', 'QUANTITY_2', 'QUANTITY_3', 'QUANTITY_4']
>>> multi_column_table = session.add_external_table(
...     multi_column_external_table,
...     table_name="Sales (Multi column array)",
...     options=SynapseTableOptions(
...         keys=["PRODUCT"],
...         array_conversion=tt.MultiColumnArrayConversion(
...             column_prefixes=["QUANTITY"],
...         ),
...     ),
... )
>>> multi_column_table.head()
                                 QUANTITY
PRODUCT
product_1  [10.0, 20.0, 15.0, 25.0, 10.0]
product_2  [50.0, 65.0, 55.0, 30.0, 80.0]