atoti.agg module¶
- atoti.agg.count_distinct(operand, *, scope=None)¶
Return a measure equal to the distinct count of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Price.DISTINCT_COUNT"] = tt.agg.count_distinct(table["Price"]) >>> cube.query(m["Price.DISTINCT_COUNT"]) Price.DISTINCT_COUNT 0 3
- Return type
- atoti.agg.long(operand, *, scope=None)¶
Return a measure equal to the sum of the positive values of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Quantity.LONG"] = tt.agg.long(table["Quantity"]) >>> cube.query(m["Quantity.LONG"]) Quantity.LONG 0 1,110
- Return type
- atoti.agg.max(operand, *, scope=None)¶
Return a measure equal to the maximum of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Maximum Price"] = tt.agg.max(table["Price"]) >>> cube.query(m["Maximum Price"]) Maximum Price 0 43.00
- Return type
- atoti.agg.max_member(measure, level)¶
Return a measure equal to the member maximizing the passed measure on the given level.
When multiple members maximize the passed measure, the first one (according to the comparator of the given level) is returned.
- Parameters
measure (
Union
[MeasureDescription
,MeasureConvertible
]) – The measure to maximize.level (
Level
) – The level on which the maximizing member is searched for.
Example
>>> df = pd.DataFrame( ... columns=["Continent", "City", "Price"], ... data=[ ... ("Europe", "Paris", 200.0), ... ("Europe", "Berlin", 150.0), ... ("Europe", "London", 240.0), ... ("North America", "New York", 270.0), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="City price table", ... ) >>> table.head() Continent City Price 0 Europe Paris 200.0 1 Europe Berlin 150.0 2 Europe London 240.0 3 North America New York 270.0 >>> cube = session.create_cube(table, mode="manual") >>> h, l, m = cube.hierarchies, cube.levels, cube.measures >>> h["Geography"] = [table["Continent"], table["City"]] >>> m["Price"] = tt.value(table["Price"]) >>> m["City with maximum price"] = tt.agg.max_member(m["Price"], l["City"])
At the given level, the measure is equal to the current member of the City level:
>>> cube.query(m["City with maximum price"], levels=[l["City"]]) City with maximum price Continent City Europe Berlin Berlin London London Paris Paris North America New York New York
At a level above it, the measure is equal to the city of each continent with the maximum price:
>>> cube.query(m["City with maximum price"], levels=[l["Continent"]]) City with maximum price Continent Europe London North America New York
At the top level, the measure is equal to the city with the maximum price across all continents:
>>> cube.query(m["City with maximum price"]) City with maximum price 0 New York
- Return type
- atoti.agg.mean(operand, *, scope=None)¶
Return a measure equal to the mean of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Quantity.MEAN"] = tt.agg.mean(table["Quantity"]) >>> cube.query(m["Quantity.MEAN"]) Quantity.MEAN 0 370.00
- Return type
- atoti.agg.median(operand, *, scope=None)¶
Return a measure equal to the median of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Median Price"] = tt.agg.median(table["Price"]) >>> cube.query(m["Median Price"]) Median Price 0 25.90
- Return type
- atoti.agg.min(operand, *, scope=None)¶
Return a measure equal to the minimum of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Minimum Price"] = tt.agg.min(table["Price"]) >>> cube.query(m["Minimum Price"]) Minimum Price 0 12.50
- Return type
- atoti.agg.min_member(measure, level)¶
Return a measure equal to the member minimizing the passed measure on the given level.
When multiple members minimize the passed measure, the first one (according to the comparator of the given level) is returned.
- Parameters
measure (
Union
[MeasureDescription
,MeasureConvertible
]) – The measure to minimize.level (
Level
) – The level on which the minimizing member is searched for.
Example
>>> df = pd.DataFrame( ... columns=["Continent", "City", "Price"], ... data=[ ... ("Europe", "Paris", 200.0), ... ("Europe", "Berlin", 150.0), ... ("Europe", "London", 240.0), ... ("North America", "New York", 270.0), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="City price table", ... ) >>> table.head() Continent City Price 0 Europe Paris 200.0 1 Europe Berlin 150.0 2 Europe London 240.0 3 North America New York 270.0 >>> cube = session.create_cube(table, mode="manual") >>> h, l, m = cube.hierarchies, cube.levels, cube.measures >>> h["Geography"] = [table["Continent"], table["City"]] >>> m["Price"] = tt.value(table["Price"]) >>> m["City with minimum price"] = tt.agg.min_member(m["Price"], l["City"])
At the given level, the measure is equal to the current member of the City level:
>>> cube.query(m["City with minimum price"], levels=[l["City"]]) City with minimum price Continent City Europe Berlin Berlin London London Paris Paris North America New York New York
At a level above it, the measure is equal to the city of each continent with the minimum price:
>>> cube.query(m["City with minimum price"], levels=[l["Continent"]]) City with minimum price Continent Europe Berlin North America New York
At the top level, the measure is equal to the city with the minimum price across all continents:
>>> cube.query(m["City with minimum price"]) City with minimum price 0 Berlin
- atoti.agg.prod(operand, *, scope=None)¶
Return a measure equal to the product of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Other.PROD"] = tt.agg.prod(table["Other"]) >>> cube.query(m["Other.PROD"]) Other.PROD 0 4
- Return type
- atoti.agg.quantile(operand, q, *, mode='inc', interpolation='linear', scope=None)¶
Return a measure equal to the requested quantile of the passed measure across the specified scope.
Here is how to obtain the same behavior as these standard quantile calculation methods:
R-1:
mode="centered"
andinterpolation="lower"
R-2:
mode="centered"
andinterpolation="midpoint"
R-3:
mode="simple"
andinterpolation="nearest"
R-4:
mode="simple"
andinterpolation="linear"
R-5:
mode="centered"
andinterpolation="linear"
R-6 (similar to Excel’s
PERCENTILE.EXC
):mode="exc"
andinterpolation="linear"
R-7 (similar to Excel’s
PERCENTILE.INC
):mode="inc"
andinterpolation="linear"
R-8 and R-9 are not supported
The formulae given for the calculation of the quantile index assume a 1-based indexing system.
- Parameters
measure – The measure to get the quantile of.
q (
Union
[float
,MeasureDescription
]) – The quantile to take. Must be between0
and1
. For instance,0.95
is the 95th percentile and0.5
is the median.mode (
Literal
[‘simple’, ‘centered’, ‘inc’, ‘exc’]) –The method used to calculate the index of the quantile. Available options are, when searching for the q quantile of a vector
X
:simple
:len(X) * q
centered
:len(X) * q + 0.5
exc
:(len(X) + 1) * q
inc
:(len(X) - 1) * q + 1
interpolation (
Literal
[‘linear’, ‘higher’, ‘lower’, ‘nearest’, ‘midpoint’]) –If the quantile index is not an integer, the interpolation decides what value is returned. The different options are, considering a quantile index
k
withi < k < j
for a sorted vectorX
:linear
:v = X[i] + (X[j] - X[i]) * (k - i)
lowest
:v = X[i]
highest
:v = X[j]
nearest
:v = X[i]
orv = X[j]
depending on which ofi
orj
is closest tok
midpoint
:v = (X[i] + X[j]) / 2
scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
- Return type
- atoti.agg.short(operand, *, scope=None)¶
Return a measure equal to the sum of the negative values of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Quantity.SHORT"] = tt.agg.short(table["Quantity"]) >>> cube.query(m["Quantity.SHORT"]) Quantity.SHORT 0 0
- Return type
- atoti.agg.square_sum(operand, *, scope=None)¶
Return a measure equal to the sum of the square of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Other.SQUARE_SUM"] = tt.agg.square_sum(table["Other"]) >>> cube.query(m["Other.SQUARE_SUM"]) Other.SQUARE_SUM 0 9
- Return type
- atoti.agg.std(operand, *, mode='sample', scope=None)¶
Return a measure equal to the standard deviation of the passed measure across the specified scope.
- Parameters
measure – The measure to get the standard deviation of.
mode (
Literal
[‘sample’, ‘population’]) –One of the supported modes:
The
sample
standard deviation, similar to Excel’sSTDEV.S
, is \(\sqrt{\frac{\sum_{i=0}^{n} (X_i - m)^{2}}{n - 1}}\) wherem
is the sample mean andn
the size of the sample. Use this mode if the data represents a sample of the population.The
population
standard deviation, similar to Excel’sSTDEV.P
is \(\sqrt{\frac{\sum_{i=0}^{n}(X_i - m)^{2}}{n}}\) wherem
is the mean of theXi
elements andn
the size of the population. Use this mode if the data represents the entire population.
scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
- Return type
- atoti.agg.sum(operand, *, scope=None)¶
Return a measure equal to the sum of the passed measure across the specified scope.
- Parameters
operand (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – The measure or table column to aggregate.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> df = pd.DataFrame( ... columns=["id", "Quantity", "Price", "Other"], ... data=[ ... ("a1", 100, 12.5, 1), ... ("a2", 10, 43, 2), ... ("a3", 1000, 25.9, 2), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Product", ... keys=["id"], ... ) >>> table.head() Quantity Price Other id a1 100 12.5 1 a2 10 43.0 2 a3 1000 25.9 2 >>> cube = session.create_cube(table) >>> m = cube.measures >>> m["Quantity.SUM"] = tt.agg.sum(table["Quantity"]) >>> cube.query(m["Quantity.SUM"]) Quantity.SUM 0 1,110
- Return type
- atoti.agg.sum_product(*factors, scope=None)¶
Return a measure equal to the sum product aggregation of the passed factors across the specified scope.
- Parameters
factors (
Union
[Column
,Operation
,MeasureDescription
,MeasureConvertible
]) – Column, Measure or Level to do the sum product of.scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
Example
>>> from datetime import date >>> df = pd.DataFrame( ... columns=["Date", "Category", "Price", "Quantity", "Array"], ... data=[ ... (date(2020, 1, 1), "TV", 300.0, 5, [10.0, 15.0]), ... (date(2020, 1, 2), "TV", 200.0, 1, [5.0, 15.0]), ... (date(2020, 1, 1), "Computer", 900.0, 2, [2.0, 3.0]), ... (date(2020, 1, 2), "Computer", 800.0, 3, [10.0, 20.0]), ... (date(2020, 1, 1), "TV", 500.0, 2, [3.0, 10.0]), ... ], ... ) >>> table = session.read_pandas( ... df, ... table_name="Date", ... ) >>> table.head() Date Category Price Quantity Array 0 2020-01-01 TV 300.0 5 doubleVector[2]{10.0, ...} 1 2020-01-02 TV 200.0 1 doubleVector[2]{5.0, ...} 2 2020-01-01 Computer 900.0 2 doubleVector[2]{2.0, ...} 3 2020-01-02 Computer 800.0 3 doubleVector[2]{10.0, ...} 4 2020-01-01 TV 500.0 2 doubleVector[2]{3.0, ...} >>> cube = session.create_cube(table) >>> h, l, m = cube.hierarchies, cube.levels, cube.measures >>> m["turnover"] = tt.agg.sum_product(table["Price"], table["Quantity"]) >>> cube.query(m["turnover"], levels=[l["Category"]]) turnover Category Computer 4,200.00 TV 2,700.00 >>> m["array sum product"] = tt.agg.sum_product(table["Price"], table["Array"]) >>> cube.query(m["array sum product"]) array sum product 0 doubleVector[2]{15300.0, ...}
- Return type
- atoti.agg.var(operand, *, mode='sample', scope=None)¶
Return a measure equal to the variance of the passed measure across the specified scope.
- Parameters
measure – The measure to get the variance of.
mode (
Literal
[‘sample’, ‘population’]) –One of the supported modes:
The
sample
variance, similar to Excel’sVAR.S
, is \(\frac{\sum_{i=0}^{n} (X_i - m)^{2}}{n - 1}\) wherem
is the sample mean andn
the size of the sample. Use this mode if the data represents a sample of the population.The
population
variance, similar to Excel’sVAR.P
is \(\frac{\sum_{i=0}^{n}(X_i - m)^{2}}{n}\) wherem
is the mean of theXi
elements andn
the size of the population. Use this mode if the data represents the entire population.
scope (
Optional
[Scope
]) – The scope of the aggregation. WhenNone
is specified, the natural aggregation scope is used: it contains all the data in the cube which coordinates match the ones of the currently evaluated member.
- Return type