Bucketing

[1]:
import atoti as tt

session = tt.create_session()
# Load the CSV into a store keyed on ID, then build a cube on top of it.
store = session.read_csv("data/example.csv", keys=["ID"], store_name="First store")
cube = session.create_cube(store, "FirstCube")
store
[1]:
  • First store
    • ID
      • key: True
      • nullable: True
      • type: int
    • Date
      • key: False
      • nullable: True
      • type: LocalDate[yyyy-MM-dd]
    • Continent
      • key: False
      • nullable: True
      • type: string
    • Country
      • key: False
      • nullable: True
      • type: string
    • City
      • key: False
      • nullable: True
      • type: string
    • Color
      • key: False
      • nullable: True
      • type: string
    • Quantity
      • key: False
      • nullable: True
      • type: double
    • Price
      • key: False
      • nullable: True
      • type: double

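With Atoti, you can group the members of a column into weighted buckets. Here each color is assigned to a "hot" or "cold" bucket, and green is split evenly between the two: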

[2]:
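# Each row: [Color value, bucket name, weight]
buckets = [
    ["red", "hot", 1.0],
    ["blue", "cold", 1.0],
    ["green", "hot", 0.5],
    ["green", "cold", 0.5],
]

# Creates a "First Bucket" level and a "First Bucket_weight" measure on the cube.
cube.create_bucketing("First Bucket", ["Color"], buckets)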
[2]:
  • First Bucket
    • Color
      • key: True
      • nullable: True
      • type: string
    • First Bucket
      • key: True
      • nullable: True
      • type: string
    • First Bucket_weight
      • key: False
      • nullable: True
      • type: double
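
Because a member such as green can appear in several buckets, it can help to check the bucket definition before handing it to the cube. The sketch below is plain Python and does not use atoti. It sums the weights given to each color, assuming the intent is that every color's weights add up to 1 (the source does not say this is required). The helper name `weights_per_member` is hypothetical.

```python
from collections import defaultdict

# Same rows as in cell [2]: [Color value, bucket name, weight]
buckets = [
    ["red", "hot", 1.0],
    ["blue", "cold", 1.0],
    ["green", "hot", 0.5],
    ["green", "cold", 0.5],
]


def weights_per_member(rows):
    """Sum the weights assigned to each member across all buckets."""
    totals = defaultdict(float)
    for member, _bucket, weight in rows:
        totals[member] += weight
    return dict(totals)


totals = weights_per_member(buckets)
print(totals)  # {'red': 1.0, 'blue': 1.0, 'green': 1.0}
```

A color whose total differs from 1.0 would be either partly unassigned or counted more than once, which may or may not be what you want.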
[3]:
cube.query()
[3]:
   First Bucket_weight  Price.AVG  Price.SUM  Quantity.AVG  Quantity.SUM  contributors.COUNT
0                 None       None       None          None          None                  10
[4]:
cube.query(levels=cube.levels["First Bucket"])
[4]:
First Bucket  First Bucket_weight   Price.AVG  Price.SUM  Quantity.AVG  Quantity.SUM  contributors.COUNT
cold                         None  420.000000     2520.0        2500.0       15000.0                   6
hot                          None  428.333333     2570.0        2450.0       14700.0                   6
[5]:
cube.query(levels=[cube.levels["First Bucket"], cube.levels["Color"]])
[5]:
First Bucket  Color  First Bucket_weight  Price.AVG  Price.SUM  Quantity.AVG  Quantity.SUM  contributors.COUNT
cold          blue                   1.0      427.5     1710.0        2000.0        8000.0                   4
cold          green                  0.5      405.0      810.0        3500.0        7000.0                   2
hot           green                  0.5      405.0      810.0        3500.0        7000.0                   2
hot           red                    1.0      440.0     1760.0        1925.0        7700.0                   4
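
The bucket-level figures in output [4] are consistent with rolling up the per-color rows of output [5]. The sketch below re-derives them in plain Python from the numbers printed above. It is only an arithmetic check on the displayed values, not a description of how atoti computes them, and the helper name `roll_up` is hypothetical.

```python
# Rows copied from output [5]:
# (bucket, color, Price.SUM, Quantity.SUM, contributors.COUNT)
rows = [
    ("cold", "blue", 1710.0, 8000.0, 4),
    ("cold", "green", 810.0, 7000.0, 2),
    ("hot", "green", 810.0, 7000.0, 2),
    ("hot", "red", 1760.0, 7700.0, 4),
]


def roll_up(rows):
    """Aggregate the per-color rows up to the bucket level."""
    totals = {}
    for bucket, _color, price, quantity, count in rows:
        agg = totals.setdefault(
            bucket,
            {"Price.SUM": 0.0, "Quantity.SUM": 0.0, "contributors.COUNT": 0},
        )
        agg["Price.SUM"] += price
        agg["Quantity.SUM"] += quantity
        agg["contributors.COUNT"] += count
    for agg in totals.values():
        # The displayed averages match SUM divided by contributors.COUNT.
        agg["Price.AVG"] = agg["Price.SUM"] / agg["contributors.COUNT"]
        agg["Quantity.AVG"] = agg["Quantity.SUM"] / agg["contributors.COUNT"]
    return totals


per_bucket = roll_up(rows)

# Contributors counted once per color versus once per (bucket, color) pair.
per_color = {color: count for _bucket, color, _p, _q, count in rows}
distinct_contributors = sum(per_color.values())
bucketed_contributors = sum(count for *_rest, count in rows)

print(per_bucket["cold"]["Price.SUM"], per_bucket["hot"]["Price.SUM"])
print(distinct_contributors, bucketed_contributors)
```

The two contributor totals differ because the green rows appear in both buckets: the buckets report 6 + 6 = 12 contributors, while the unbucketed query in [3] reports 10.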