Data Querying

You can execute queries and get back the result as a pandas DataFrame.

There are two ways to query data:

  • with cube.query by passing a list of measure and level objects

  • with session.query_mdx by passing an MDX string

Querying current session

[1]:
import atoti as tt

session = tt.create_session()
store = session.read_csv("data/sales.csv", keys=["Sale ID"])
cube = session.create_cube(store, "Sales")

Querying a cube

[2]:
m = cube.measures
lvl = cube.levels
cube.query(m["Amount.SUM"], m["Quantity.SUM"], levels=[lvl["Product"]])
[2]:
Amount.SUM Quantity.SUM
Product
BED_24 20,100.00 134.00
BED_25 39,600.00 132.00
BED_26 51,350.00 130.00
BED_27 58,080.00 132.00
BED_3 19,800.00 132.00
... ... ...
TSH_52 3,120.00 130.00
TSH_53 3,168.00 132.00
TSH_7 2,620.00 131.00
TSH_8 2,640.00 132.00
TSH_9 2,860.00 130.00

61 rows × 2 columns

Condition in queries

A condition can be added to filter the query:

[3]:
cube.query(
    m["Amount.SUM"],
    m["Quantity.SUM"],
    levels=[lvl["Product"]],
    condition=(lvl["Shop"] == "shop_0"),
)
[3]:
Amount.SUM Quantity.SUM
Product
BED_24 300.00 2.00
BED_25 1,800.00 6.00
BED_26 790.00 2.00
BED_27 880.00 2.00
BED_3 300.00 2.00
... ... ...
TSH_52 48.00 2.00
TSH_53 48.00 2.00
TSH_7 60.00 3.00
TSH_8 60.00 3.00
TSH_9 88.00 4.00

61 rows × 2 columns

Use level.isin or hierarchy.isin to filter on several values:

[4]:
cube.query(
    m["Amount.SUM"],
    m["Quantity.SUM"],
    levels=[lvl["Product"]],
    condition=lvl["Product"].isin("TSH_7", "TSH_8", "TSH_9"),
)
[4]:
Amount.SUM Quantity.SUM
Product
TSH_7 2,620.00 131.00
TSH_8 2,640.00 132.00
TSH_9 2,860.00 130.00

Conditions on different hierarchies can be combined with the & operator:

[5]:
cube.query(
    m["Amount.SUM"],
    m["Quantity.SUM"],
    levels=[lvl["Shop"], lvl["Product"]],
    condition=(
        lvl["Product"].isin("TSH_7", "TSH_8", "TSH_9")
        & lvl["Shop"].isin("shop_0", "shop_1", "shop_2")
    ),
)
[5]:
Amount.SUM Quantity.SUM
Shop Product
shop_0 TSH_7 60.00 3.00
TSH_8 60.00 3.00
TSH_9 88.00 4.00
shop_1 TSH_7 40.00 2.00
TSH_8 80.00 4.00
TSH_9 132.00 6.00
shop_2 TSH_7 40.00 2.00
TSH_8 120.00 6.00
TSH_9 44.00 2.00

MDX Query

Any MDX query can be used to retrieve data as a pandas DataFrame:

[6]:
mdx_query = """
SELECT
  NON EMPTY [Measures].[Amount.SUM] ON COLUMNS,
  NON EMPTY [Sales].[Product].[Product].Members ON ROWS
  FROM (
    SELECT
    {
      [Sales].[Date].[ALL].[AllMember].[2020-05-14],
      [Sales].[Date].[ALL].[AllMember].[2020-05-15],
      [Sales].[Date].[ALL].[AllMember].[2020-05-16]
    } ON COLUMNS
    FROM [Sales]
  )
"""
[7]:
session.query_mdx(mdx_query)
[7]:
Amount.SUM
Product
BED_24 2,400.00
BED_25 4,800.00
BED_26 5,530.00
BED_27 7,040.00
BED_3 1,800.00
... ...
TSH_52 336.00
TSH_53 360.00
TSH_7 260.00
TSH_8 280.00
TSH_9 352.00

61 rows × 1 columns

Querying remote cube

It is possible to connect to an existing cube and query it. This cube can come from another atoti process or even a classic ActivePivot (version >=5.7).

To do that, open a query session, passing the auth parameter to indicate how to authenticate against this server. A query session is immutable: its structure cannot be changed like regular local atoti sessions.

For this example, we will connect to the session created before with atoti:

[8]:
existing_session = tt.open_query_session(
    session.url,  # Here you would put the base URL of the existing session.
    # if we needed authentication we could pass it like that
    # auth=tt.query.create_basic_authentication("admin", "admin"),
)
existing_session
[8]:
  • http://localhost:65323
    • Cubes
      • Sales
        • Dimensions
          • Sales
            • Date
              1. Date
            • Product
              1. Product
            • Sale ID
              1. Sale ID
            • Shop
              1. Shop
        • Measures
          • Amount.MEAN
            • formatter: #,###.00
          • Amount.SUM
            • formatter: #,###.00
          • Quantity.MEAN
            • formatter: #,###.00
          • Quantity.SUM
            • formatter: #,###.00
          • Unit price.MEAN
            • formatter: #,###.00
          • Unit price.SUM
            • formatter: #,###.00
          • contributors.COUNT
            • formatter: #,###

Authentication

atoti provides helpers for 2 authentication schemes:

Basic authentication with username and password

[9]:
auth = tt.query.create_basic_authentication("admin", "admin")

Token based authentication (for instance JWT)

[10]:
auth = tt.query.create_token_authentication("some_jwt")

But custom authentication can also be used: open_query_session’s auth parameter accepts a function taking the server URL and returning a dictionary of HTTP headers to include in the request.

For instance this lambda is similar to create_token_authentication and can be used to authenticate with a token:

[11]:
auth = lambda url: {"Authorization": f"Bearer {token}"}

Querying a remote cube

You can retrieve the cubes of the remote session to access their levels and measures and query it.

[12]:
existing_cube = existing_session.cubes["Sales"]
lvl = existing_cube.levels
m = existing_cube.measures
existing_cube.query(m["Amount.SUM"], m["Quantity.SUM"], levels=[lvl["Product"]])
[12]:
Amount.SUM Quantity.SUM
Product
BED_24 20,100.00 134.00
BED_25 39,600.00 132.00
BED_26 51,350.00 130.00
BED_27 58,080.00 132.00
BED_3 19,800.00 132.00
... ... ...
TSH_52 3,120.00 130.00
TSH_53 3,168.00 132.00
TSH_7 2,620.00 131.00
TSH_8 2,640.00 132.00
TSH_9 2,860.00 130.00

61 rows × 2 columns

MDX queries are also supported:

[13]:
existing_session.query_mdx(mdx_query)
[13]:
Amount.SUM
Product
BED_24 2,400.00
BED_25 4,800.00
BED_26 5,530.00
BED_27 7,040.00
BED_3 1,800.00
... ...
TSH_52 336.00
TSH_53 360.00
TSH_7 260.00
TSH_8 280.00
TSH_9 352.00

61 rows × 1 columns

And interactive visualizations too:

[14]:
existing_session.visualize()
Open the notebook in JupyterLab with the atoti extension installed and enabled to start editing this widget.