Use DirectQuery#

This feature is not part of the community edition: it needs to be unlocked.

This guide walks you through the basics of DirectQuery by building an application on top of Snowflake.

We’ll see how to:

  • Connect to an external database

  • Create a data model based on this external database

  • Use this data model in a cube

[1]:
import atoti as tt
[2]:
session = tt.Session()

Connecting to the external database#

Once we have a session, we can connect to the external database:

[3]:
import os

from atoti_directquery_snowflake import SnowflakeConnectionInfo
[4]:
connection_info = SnowflakeConnectionInfo(
    f"jdbc:snowflake://{os.environ['SNOWFLAKE_ACCOUNT_IDENTIFIER']}.snowflakecomputing.com/?user={os.environ['SNOWFLAKE_USERNAME']}&database=TEST_RESOURCES&schema=TESTS",
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
external_database = session.connect_to_external_database(connection_info)

Only the type of connection info differs between the supported databases. For example, you can switch to ClickhouseConnectionInfo to connect to ClickHouse instead of Snowflake, as sketched below.
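
For instance, here is a minimal sketch of a ClickHouse connection. The URL format and environment variable below are illustrative assumptions; refer to the ClickHouse plugin documentation for the exact connection string:

# Hypothetical sketch: the URL format and credentials below are assumptions.
from atoti_directquery_clickhouse import ClickhouseConnectionInfo

clickhouse_connection_info = ClickhouseConnectionInfo(
    "clickhouse:http://localhost:8123/TEST_RESOURCES",  # illustrative URL
    password=os.environ["CLICKHOUSE_PASSWORD"],  # assumed credential variable
)
external_database = session.connect_to_external_database(clickhouse_connection_info)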

Discovering the external tables#

We can list the tables available on the Snowflake server, filtering on a schema name pattern:

[5]:
# The method below is private and subject to change in patch releases.
external_database.tables._filter(schema_pattern="TUTORIAL")
[5]:
  • SNOWFLAKE
    • TEST_RESOURCES
      • TUTORIAL
        1. MULTI_COLUMN_QUANTITY
        2. MULTI_ROW_QUANTITY
        3. PRODUCTS
        4. SALES
        5. SALES_BY_PRODUCT

Building the data model#

Among these, the two tables of interest to us are SALES and PRODUCTS. Let’s add them to the session and join them:

[6]:
from atoti_directquery_snowflake import SnowflakeTableOptions
[7]:
sales_table = session.add_external_table(
    external_database.tables["TEST_RESOURCES", "TUTORIAL", "SALES"],
    options=SnowflakeTableOptions(keys=["SALE_ID"]),
)
[8]:
products_table = session.add_external_table(
    external_database.tables["TEST_RESOURCES", "TUTORIAL", "PRODUCTS"],
    options=SnowflakeTableOptions(keys=["PRODUCT_ID"]),
)
[9]:
sales_table.join(products_table, sales_table["PRODUCT"] == products_table["PRODUCT_ID"])

Note that at any point you can inspect a table’s definition in the external database, as well as the definition Atoti derived from it:

[10]:
# Table definition in the external database:
external_database.tables["TEST_RESOURCES", "TUTORIAL", "SALES"]
[10]:
  • SALES
    • SALE_ID: String
    • DATE: LocalDate
    • SHOP: String
    • PRODUCT: String
    • QUANTITY: double
    • UNIT_PRICE: double
[11]:
# Table definition in the Atoti session:
sales_table
[11]:
  • SALES
    • SALE_ID
      • key: True
      • type: String
      • default_value: N/A
    • DATE
      • key: False
      • type: LocalDate
      • default_value: datetime.date(1970, 1, 1)
    • SHOP
      • key: False
      • type: String
      • default_value: N/A
    • PRODUCT
      • key: False
      • type: String
      • default_value: N/A
    • QUANTITY
      • key: False
      • type: double
      • default_value: 0.0
    • UNIT_PRICE
      • key: False
      • type: double
      • default_value: 0.0

Creating the cube#

The cube, its hierarchies, and its measures can be defined just as they would be on a session with an in-memory database:

[12]:
cube = session.create_cube(sales_table)
h, l, m = cube.hierarchies, cube.levels, cube.measures
[13]:
h["CATEGORY"] = [l["CATEGORY"], l["SUB_CATEGORY"]]
del h["SUB_CATEGORY"]
[14]:
m["Max Price"] = tt.agg.max(sales_table["UNIT_PRICE"])

Running a query#

Let’s make sure everything works by running a query:

[15]:
cube.query(m["Max Price"], levels=[l["SHOP"]])
[15]:
        Max Price
SHOP
shop_0     210.00
shop_1     300.00
shop_2     300.00
shop_3     210.00
shop_4     300.00
shop_5     150.00
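
Since we built a multilevel CATEGORY hierarchy earlier, we can also drill down along it; for example (output omitted):

# Query along the CATEGORY hierarchy, down to its SUB_CATEGORY level.
cube.query(m["Max Price"], levels=[l["SUB_CATEGORY"]])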

Refreshing the data#

If the external database receives updates, some of the data retrieved by DirectQuery will become out of date. You can manually trigger a refresh of the session to restore synchronization:

[16]:
# The method below is private and subject to change in patch releases.
session._synchronize_with_external_database()
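
After synchronizing, re-running the earlier query returns results that reflect the current state of the external database:

# The same query now reads the refreshed data.
cube.query(m["Max Price"], levels=[l["SHOP"]])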

Going further#

See the documentation for the full list of supported databases.