Scenarios¶

[1]:

import pandas as pd

dataframe = pd.read_csv("data/example.csv")
dataframe

[1]:

	ID	Date	Continent	Country	City	Color	Quantity	Price
0	1	2019-01-01	Europe	France	Paris	red	1000.0	500.0
1	2	2019-01-02	Europe	France	Lyon	red	2000.0	400.0
2	3	2019-01-05	Europe	France	Paris	blue	3000.0	420.0
3	4	2018-01-01	Europe	France	Bordeaux	blue	1500.0	480.0
4	5	2019-01-01	Europe	UK	London	green	3000.0	460.0
5	6	2019-01-01	Europe	UK	London	red	2500.0	500.0
6	7	2019-01-02	Asia	China	Beijing	blue	2000.0	410.0
7	8	2019-01-05	Asia	China	HongKong	green	4000.0	350.0
8	9	2018-01-01	Asia	India	Dehli	red	2200.0	360.0
9	10	2019-01-01	Asia	India	Mumbai	blue	1500.0	400.0

[2]:

import atoti as tt

session = tt.create_session()
store = session.read_pandas(dataframe, keys=["ID"], store_name="First store")
cube = session.create_cube(store, "FirstCube")
cube.query()

[2]:

	Price.AVG	Price.SUM	Quantity.AVG	Quantity.SUM	contributors.COUNT
0	428.0	4280.0	2270.0	22700.0	10

Scenarios from modified sources¶

Load a modified version of your data into the store to compare it to the original version.

First, let’s modify the source dataset with pandas:

[3]:

dataframe.loc[(dataframe["City"] == "Paris"), "Color"] = "purple"
dataframe

[3]:

	ID	Date	Continent	Country	City	Color	Quantity	Price
0	1	2019-01-01	Europe	France	Paris	purple	1000.0	500.0
1	2	2019-01-02	Europe	France	Lyon	red	2000.0	400.0
2	3	2019-01-05	Europe	France	Paris	purple	3000.0	420.0
3	4	2018-01-01	Europe	France	Bordeaux	blue	1500.0	480.0
4	5	2019-01-01	Europe	UK	London	green	3000.0	460.0
5	6	2019-01-01	Europe	UK	London	red	2500.0	500.0
6	7	2019-01-02	Asia	China	Beijing	blue	2000.0	410.0
7	8	2019-01-05	Asia	China	HongKong	green	4000.0	350.0
8	9	2018-01-01	Asia	India	Dehli	red	2200.0	360.0
9	10	2019-01-01	Asia	India	Mumbai	blue	1500.0	400.0

We can now load this dataset into a new scenario:

[4]:

store.scenarios["Purple Paris"].load_pandas(dataframe)
store.scenarios["Purple Paris"]

[4]:

First store
- ID
  - key: True
  - nullable: False
  - type: long
- Date
  - key: False
  - nullable: True
  - type: LocalDate[yyyy-MM-dd]
- Continent
  - key: False
  - nullable: True
  - type: string
- Country
  - key: False
  - nullable: True
  - type: string
- City
  - key: False
  - nullable: True
  - type: string
- Color
  - key: False
  - nullable: True
  - type: string
- Quantity
  - key: False
  - nullable: False
  - type: double
- Price
  - key: False
  - nullable: False
  - type: double

If you want the store to appear in application’s Source Simulation widget, you will need to enable this parameter on the store

[5]:

store.source_simulation_enabled = True

Scenarios measure scaling¶

You can also perform simulations by modifying the value of certain measures. There are three available simulation methods:

multiply: multiplies the measure’s value with the provided weight
replace: replaces the measure’s value with the given value
add: increments the measure’s value with the provided value

[6]:

help(cube.setup_simulation)

Help on method setup_simulation in module atoti.cube:

setup_simulation(name: 'str', multiply: 'Optional[Collection[Measure]]' = None, replace: 'Optional[Collection[Measure]]' = None, add: 'Optional[Collection[Measure]]' = None, per: 'Optional[Sequence[Level]]' = None, base_scenario_name: 'str' = 'Base') -> 'Simulation' method of atoti.cube.Cube instance
    Create a simulation for the given measures.

    This creates a store to configure the simulation.
    You cannot use the same measure in several methods.

    You can create as many scenarios as you want for each simulation you create.

    Args:
        name: The name of the simulation
        multiply: Collection of measures whose values will be multiplied
        replace: Collection of measures whose values will be replaced
        add: Collection of measures whose values will be added (incremented)
        per: Sequence of levels to simulate on
        base_scenario_name: The name of the base scenario
    Returns:
        The simulation on which scenarios can be made

[7]:

lvl = cube.levels
m = cube.measures

Calling the cube.setup_simulation method will create a special type of store which controls the simulation. This store’s key fields are the levels you provide, as well as the column with the name of the simulation, which contains the scenario names.

[8]:

simulation = cube.setup_simulation(
    "First simulation",
    per=[lvl["Continent"]],
    multiply=[m["Quantity.SUM"], m["Quantity.AVG"]],
    replace=[m["Price.AVG"]],
)
simulation

[8]:

First simulation
- Levels
  1. Continent
- Scenarios
  1. Base
- multiply
  1. Quantity.SUM
  2. Quantity.AVG
- replace
  1. Price.AVG

[9]:

simulation.head()

[9]:

		First simulation_Quantity.SUM_multiply	First simulation_Quantity.AVG_multiply	First simulation_Price.AVG_replace	Priority
Continent	First simulation

In order to populate this simulation with scenarios, you can either populate the simulation the same way you would a regular store, with a CSV file or a DataFrame. Or you can create a scenario for this simulation and then insert rows into it either manually or from a source file.

When filling a scenario, (not the simulation), there are several things which are done automatically. For instance, you don’t need to specify the priority.

[10]:

asian_growth = simulation.scenarios["Asian growth"]
asian_growth += ("Asia", 1.05, 1.05, 20)
asian_growth += ("Europe", 0.9, 0.9, 0.85)
asian_growth

[10]:

Scenario(name='Asian growth', _simulation=Simulation(_name='First simulation', _levels=[Level(_name='Continent', _column_name='Continent', _data_type='string', _hierarchy=Hierarchy(_name='Continent', _levels={'Continent': ...}, _dimension='Hierarchies', _slicing=False), _comparator=None)], _multiply=[NamedMeasure(_name='Quantity.SUM', _folder=None, _formatter='DOUBLE[#,###.00]', _visible=True), NamedMeasure(_name='Quantity.AVG', _folder=None, _formatter='DOUBLE[#,###.00]', _visible=True)], _replace=[NamedMeasure(_name='Price.AVG', _folder=None, _formatter='DOUBLE[#,###.00]', _visible=True)], _add=[], _base_scenario='Base'))

Calling head on the scenario will display the rows of the Simulation handled by this scenario.

[11]:

cube.query(
    m["Quantity.SUM"],
    m["Quantity.AVG"],
    m["Price.AVG"],
    levels=[lvl["Continent"], lvl["First simulation"]],
)

[11]:

		Quantity.SUM	Quantity.AVG	Price.AVG
Continent	First simulation
Asia	Base	9700.0	2425.000000	380.00
Asia	Asian growth	10185.0	2546.250000	20.00
Europe	Base	13000.0	2166.666667	460.00
Europe	Asian growth	11700.0	1950.000000	0.85

You can create as many scenarios as you want. However, every scenario of a simulation uses the same methods. If you want to use a different method you must create a new simulation.

When loading a dataframe into the simulation, you can get the required headers from the simulation. The same is true for a scenario. For the Priority you can either use the provided ‘tt.simulation.Priority’ values, or any numeric value you want. If you use the provided values, they will be converted into integer values by the API. When loading a DataFrame into a scenario, the priority column is optional.

[12]:

growth = simulation.scenarios["Growth"]
rows = [
    ["Asia", tt.simulation.Priority.NORMAL, 1.1, 1.1, 22],
    ["Europe", tt.simulation.Priority.NORMAL, 1.2, 1.3, 15],
]
df = pd.DataFrame(rows, columns=growth.columns)
df

[12]:

	Continent	Priority	First simulation_Quantity.SUM_multiply	First simulation_Quantity.AVG_multiply	First simulation_Price.AVG_replace
0	Asia	Priority.NORMAL	1.1	1.1	22
1	Europe	Priority.NORMAL	1.2	1.3	15

[13]:

growth = simulation.scenarios["Growth"].load_pandas(df)
growth

[13]:

Scenario(name='Growth', _simulation=Simulation(_name='First simulation', _levels=[Level(_name='Continent', _column_name='Continent', _data_type='string', _hierarchy=Hierarchy(_name='Continent', _levels={'Continent': ...}, _dimension='Hierarchies', _slicing=False), _comparator=None)], _multiply=[NamedMeasure(_name='Quantity.SUM', _folder=None, _formatter='DOUBLE[#,###.00]', _visible=True), NamedMeasure(_name='Quantity.AVG', _folder=None, _formatter='DOUBLE[#,###.00]', _visible=True)], _replace=[NamedMeasure(_name='Price.AVG', _folder=None, _formatter='DOUBLE[#,###.00]', _visible=True)], _add=[], _base_scenario='Base'))

You can delete a simulation from the cube’s simulation dict

[14]:

del cube.simulations["First simulation"]

You can create simulations wihout any levels. These simulations can only have one value per scenario. For simulations whithout any fields and acting on only one measure, you can assign values to scenarios instead of inserting tuples.

[15]:

no_field_sim = cube.setup_simulation("No Field Sim", multiply=[m["Price.SUM"]])
no_field_sim.scenarios["- 10 %"] = 0.9
no_field_sim.scenarios["+ 20%"] = 1.2
no_field_sim.head()

[15]:

	No Field Sim_Price.SUM_multiply	Priority
No Field Sim
- 10 %	0.9	1.0
+ 20%	1.2	1.0

The priority rules are used to supersed the default behaviour between conflicting rules in the simulation. The basic rule is the more wildcard fields there are and the further to the left these wildcards are, the lower the priority. So in the following example, the rule on Europe overrides the one on France.

[16]:

priority_sim = cube.setup_simulation(
    "Priority Simulation",
    per=[lvl["Continent"], lvl["Country"]],
    multiply=[m["Quantity.AVG"]],
)
priority_scenario = priority_sim.scenarios["growth"]
priority_scenario += ("Europe", "*", 1.1)
priority_scenario += ("*", "France", 1.2)
priority_scenario.head()

[16]:

		Priority Simulation_Quantity.AVG_multiply	Priority
Continent	Country
Europe	*	1.1	1.0
*	France	1.2	1.0

[17]:

cube.query(
    m["Quantity.AVG"],
    levels=[lvl["Continent"], lvl["Country"], lvl["Priority Simulation"]],
)

[17]:

			Quantity.AVG
Continent	Country	Priority Simulation
Asia	China	Base	3000.0
	China	growth	3000.0
	India	Base	1850.0
	India	growth	1850.0
Europe	France	Base	1875.0
	France	growth	2062.5
	UK	Base	2750.0
	UK	growth	3025.0

Let’s imagine we wanted our rule on France to override the one on Europe. By default, we assign the tt.simulation.Priority.NORMAL value to any row for which the priority is not specified. If we give the rule on France a higher Priority than that, then it will override the other rules impacting France.

[18]:

priority_scenario += ("*", "France", 120, tt.simulation.Priority.IMPORTANT)
priority_scenario

[18]:

Scenario(name='growth', _simulation=Simulation(_name='Priority Simulation', _levels=[Level(_name='Continent', _column_name='Continent', _data_type='string', _hierarchy=Hierarchy(_name='Continent', _levels={'Continent': ...}, _dimension='Hierarchies', _slicing=False), _comparator=None), Level(_name='Country', _column_name='Country', _data_type='string', _hierarchy=Hierarchy(_name='Country', _levels={'Country': ...}, _dimension='Hierarchies', _slicing=False), _comparator=None)], _multiply=[NamedMeasure(_name='Quantity.AVG', _folder=None, _formatter='DOUBLE[#,###.00]', _visible=True)], _replace=[], _add=[], _base_scenario='Base'))

[19]:

cube.query(
    m["Quantity.AVG"],
    levels=[lvl["Continent"], lvl["Country"], lvl["Priority Simulation"]],
)

[19]:

			Quantity.AVG
Continent	Country	Priority Simulation
Asia	China	Base	3000.0
	China	growth	3000.0
	India	Base	1850.0
	India	growth	1850.0
Europe	France	Base	1875.0
	France	growth	225000.0
	UK	Base	2750.0
	UK	growth	3025.0