Scenarios
[1]:
import pandas as pd
dataframe = pd.read_csv("data/example.csv")
dataframe
[1]:
ID | Date | Continent | Country | City | Color | Quantity | Price | |
---|---|---|---|---|---|---|---|---|
0 | 1 | 2019-01-01 | Europe | France | Paris | red | 1000.0 | 500.0 |
1 | 2 | 2019-01-02 | Europe | France | Lyon | red | 2000.0 | 400.0 |
2 | 3 | 2019-01-05 | Europe | France | Paris | blue | 3000.0 | 420.0 |
3 | 4 | 2018-01-01 | Europe | France | Bordeaux | blue | 1500.0 | 480.0 |
4 | 5 | 2019-01-01 | Europe | UK | London | green | 3000.0 | 460.0 |
5 | 6 | 2019-01-01 | Europe | UK | London | red | 2500.0 | 500.0 |
6 | 7 | 2019-01-02 | Asia | China | Beijing | blue | 2000.0 | 410.0 |
7 | 8 | 2019-01-05 | Asia | China | HongKong | green | 4000.0 | 350.0 |
8 | 9 | 2018-01-01 | Asia | India | Dehli | red | 2200.0 | 360.0 |
9 | 10 | 2019-01-01 | Asia | India | Mumbai | blue | 1500.0 | 400.0 |
[2]:
import atoti as tt
session = tt.create_session()
store = session.read_pandas(dataframe, "First", keys=["ID"])
cube = session.create_cube(store, "FirstCube")
cube.query()
Welcome to atoti 0.4.0!
By using this community edition, you agree with the license available at https://www.atoti.io/eula.
Browse the official documentation at https://docs.atoti.io.
Join the community at https://www.atoti.io/register.
You can hide this message by setting the ATOTI_HIDE_EULA_MESSAGE environment variable to True.
[2]:
Price.MEAN | Price.SUM | Quantity.MEAN | Quantity.SUM | contributors.COUNT | |
---|---|---|---|---|---|
0 | 428.0 | 4280.0 | 2270.0 | 22700.0 | 10 |
Scenarios from modified sources
Load a modified version of your data into the store to compare it to the original version.
First, let’s modify the source dataset with pandas:
[3]:
dataframe.loc[(dataframe["City"] == "Paris"), "Color"] = "purple"
dataframe
[3]:
ID | Date | Continent | Country | City | Color | Quantity | Price | |
---|---|---|---|---|---|---|---|---|
0 | 1 | 2019-01-01 | Europe | France | Paris | purple | 1000.0 | 500.0 |
1 | 2 | 2019-01-02 | Europe | France | Lyon | red | 2000.0 | 400.0 |
2 | 3 | 2019-01-05 | Europe | France | Paris | purple | 3000.0 | 420.0 |
3 | 4 | 2018-01-01 | Europe | France | Bordeaux | blue | 1500.0 | 480.0 |
4 | 5 | 2019-01-01 | Europe | UK | London | green | 3000.0 | 460.0 |
5 | 6 | 2019-01-01 | Europe | UK | London | red | 2500.0 | 500.0 |
6 | 7 | 2019-01-02 | Asia | China | Beijing | blue | 2000.0 | 410.0 |
7 | 8 | 2019-01-05 | Asia | China | HongKong | green | 4000.0 | 350.0 |
8 | 9 | 2018-01-01 | Asia | India | Dehli | red | 2200.0 | 360.0 |
9 | 10 | 2019-01-01 | Asia | India | Mumbai | blue | 1500.0 | 400.0 |
We can now load this dataset into a new scenario:
[4]:
store.scenarios["Purple Paris"].load_pandas(dataframe)
store.scenarios["Purple Paris"]
[4]:
- First
  - ID
    - key: True
    - nullable: False
    - type: long
  - Date
    - key: False
    - nullable: True
    - type: LocalDate[yyyy-MM-dd]
  - Continent
    - key: False
    - nullable: True
    - type: string
  - Country
    - key: False
    - nullable: True
    - type: string
  - City
    - key: False
    - nullable: True
    - type: string
  - Color
    - key: False
    - nullable: True
    - type: string
  - Quantity
    - key: False
    - nullable: False
    - type: double
  - Price
    - key: False
    - nullable: False
    - type: double
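To get a feel for what comparing the "Purple Paris" scenario to the original data could show, here is a minimal plain-Python sketch. It does not call atoti: the helper `quantity_by_color` and the row tuples are illustrative assumptions that simply re-aggregate the ten rows shown above, before and after the pandas edit.

```python
from collections import defaultdict

# (City, Color, Quantity) for the ten rows of the example dataset shown above.
BASE_ROWS = [
    ("Paris", "red", 1000.0),
    ("Lyon", "red", 2000.0),
    ("Paris", "blue", 3000.0),
    ("Bordeaux", "blue", 1500.0),
    ("London", "green", 3000.0),
    ("London", "red", 2500.0),
    ("Beijing", "blue", 2000.0),
    ("HongKong", "green", 4000.0),
    ("Dehli", "red", 2200.0),
    ("Mumbai", "blue", 1500.0),
]

# Same edit as the pandas cell: every Paris row becomes purple.
PURPLE_PARIS_ROWS = [
    (city, "purple" if city == "Paris" else color, quantity)
    for city, color, quantity in BASE_ROWS
]


def quantity_by_color(rows):
    """Sum Quantity per Color (hypothetical helper, not an atoti call)."""
    totals = defaultdict(float)
    for _city, color, quantity in rows:
        totals[color] += quantity
    return dict(totals)


base = quantity_by_color(BASE_ROWS)
scenario = quantity_by_color(PURPLE_PARIS_ROWS)

# Print both versions side by side, one line per color.
for color in sorted(set(base) | set(scenario)):
    print(color, base.get(color, 0.0), scenario.get(color, 0.0))
```

The grand total is the same in both versions; only its split across colors changes, which is the kind of difference a scenario comparison is meant to surface.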
To make the store appear in the application’s Source Simulation widget, enable the following parameter on the store:
[5]:
store.source_simulation_enabled = True
Scenarios measure scaling
You can also run simulations by modifying the values of certain measures. Three simulation methods are available:
multiply: multiplies the measure’s value by the provided weight
replace: replaces the measure’s value with the provided value
add: increments the measure’s value by the provided value
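The effect of the three methods on a single measure value can be sketched in plain Python. The function `apply_method` is a hypothetical illustration of the descriptions above, not part of the atoti API.

```python
def apply_method(method, base_value, scenario_value):
    """Return the simulated value for one measure (illustrative only)."""
    if method == "multiply":
        # The scenario value acts as a weight on the base value.
        return base_value * scenario_value
    if method == "replace":
        # The base value is ignored and the scenario value is used as-is.
        return scenario_value
    if method == "add":
        # The scenario value is an increment on top of the base value.
        return base_value + scenario_value
    raise ValueError(f"Unknown simulation method: {method!r}")


price_sum = 4280.0  # Price.SUM from the first cube.query() output
print(apply_method("multiply", price_sum, 1.5))
print(apply_method("replace", price_sum, 100.0))
print(apply_method("add", price_sum, 20.0))
```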
[6]:
help(cube.setup_simulation)
Help on method setup_simulation in module atoti.cube:
setup_simulation(name: 'str', *, base_scenario: 'str' = 'Base', levels: 'Optional[Sequence[Level]]' = None, multiply: 'Optional[Collection[Measure]]' = None, replace: 'Optional[Collection[Measure]]' = None, add: 'Optional[Collection[Measure]]' = None) -> 'Simulation' method of atoti.cube.Cube instance
Create a simulation store for the given measures.
Simulations can have as many scenarios as desired.
The same measure cannot be passed in several methods.
Args:
name: The name of the simulation.
base_scenario: The name of the base scenario.
levels: The levels to simulate on.
multiply: Measures whose values will be multiplied.
replace: Measures whose values will be replaced.
add: Measures whose values will be added (incremented).
Returns:
The simulation on which scenarios can be made.
[7]:
lvl = cube.levels
m = cube.measures
Calling the cube.setup_simulation method creates a special type of store that controls the simulation. This store’s key fields are the levels you provide, plus a column named after the simulation, which contains the scenario names.
[8]:
simulation = cube.setup_simulation(
"First simulation",
levels=[lvl["Continent"]],
multiply=[m["Quantity.SUM"], m["Quantity.MEAN"]],
replace=[m["Price.MEAN"]],
)
simulation
[8]:
- First simulation
  - Levels
    - Continent
  - Scenarios
    - Base
  - multiply
    - Quantity.SUM
    - Quantity.MEAN
  - replace
    - Price.MEAN
[9]:
simulation.head()
[9]:
Continent | First simulation | First simulation_Quantity.SUM_multiply | First simulation_Quantity.MEAN_multiply | First simulation_Price.MEAN_replace | Priority |
---|---|---|---|---|---|
There are two ways to populate this simulation with scenarios. You can load data into the simulation the same way you would into a regular store, from a CSV file or a DataFrame. Alternatively, you can create a scenario for this simulation and then insert rows into it, either manually or from a source file.
When filling a scenario (rather than the simulation), several things are done automatically. For instance, you don’t need to specify the priority.
[10]:
asian_growth = simulation.scenarios["Asian growth"]
asian_growth += ("Asia", 1.05, 1.05, 20)
asian_growth += ("Europe", 0.9, 0.9, 0.85)
asian_growth.head()
[10]:
Continent | First simulation_Quantity.SUM_multiply | First simulation_Quantity.MEAN_multiply | First simulation_Price.MEAN_replace | Priority |
---|---|---|---|---|
Asia | 1.05 | 1.05 | 20.00 | 0.0 |
Europe | 0.90 | 0.90 | 0.85 | 0.0 |
Calling head on the scenario displays the rows of the simulation handled by this scenario.
[11]:
cube.query(
m["Quantity.SUM"],
m["Quantity.MEAN"],
m["Price.MEAN"],
levels=[lvl["Continent"], lvl["First simulation"]],
)
[11]:
Continent | First simulation | Quantity.SUM | Quantity.MEAN | Price.MEAN |
---|---|---|---|---|
Asia | Base | 9700.0 | 2425.000000 | 380.00 |
Asia | Asian growth | 10185.0 | 2546.250000 | 20.00 |
Europe | Base | 13000.0 | 2166.666667 | 460.00 |
Europe | Asian growth | 11700.0 | 1950.000000 | 0.85 |
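The "Asian growth" figures in this output can be checked by hand from the Base rows. The sketch below redoes that arithmetic in plain Python; the dictionaries and the `simulate` helper are illustrative assumptions that mirror the scenario rows, not atoti objects.

```python
import math

# Base aggregates per continent, copied from the query output.
BASE = {
    "Asia": {"Quantity.SUM": 9700.0, "Quantity.MEAN": 2425.0, "Price.MEAN": 380.0},
    "Europe": {"Quantity.SUM": 13000.0, "Quantity.MEAN": 13000.0 / 6, "Price.MEAN": 460.0},
}

# Method chosen for each measure when the simulation was set up.
METHODS = {"Quantity.SUM": "multiply", "Quantity.MEAN": "multiply", "Price.MEAN": "replace"}

# Rows inserted into the "Asian growth" scenario.
ASIAN_GROWTH = {
    "Asia": {"Quantity.SUM": 1.05, "Quantity.MEAN": 1.05, "Price.MEAN": 20.0},
    "Europe": {"Quantity.SUM": 0.9, "Quantity.MEAN": 0.9, "Price.MEAN": 0.85},
}


def simulate(base, scenario):
    """Apply each measure's method to the base values (hypothetical helper)."""
    result = {}
    for continent, measures in base.items():
        result[continent] = {}
        for measure, value in measures.items():
            parameter = scenario[continent][measure]
            if METHODS[measure] == "multiply":
                result[continent][measure] = value * parameter
            else:  # "replace"
                result[continent][measure] = parameter
    return result


simulated = simulate(BASE, ASIAN_GROWTH)
for continent, measures in simulated.items():
    print(continent, {name: round(value, 6) for name, value in measures.items()})
```

Note that Price.MEAN is replaced outright, which is why Europe shows 0.85 rather than a scaled version of 460.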
You can create as many scenarios as you want. However, every scenario of a simulation uses the same methods. If you want to use a different method you must create a new simulation.
When loading a DataFrame into the simulation, you can get the required headers from the simulation. The same is true for a scenario. For the priority, you can use any numeric value you want. When loading a DataFrame into a scenario, the priority column is optional.
[12]:
growth = simulation.scenarios["Growth"]
normal_priority = 0
rows = [
["Asia", 1.1, 1.1, 22, normal_priority],
["Europe", 1.2, 1.3, 15, normal_priority],
]
df = pd.DataFrame(rows, columns=growth.columns)
df
[12]:
Continent | First simulation_Quantity.SUM_multiply | First simulation_Quantity.MEAN_multiply | First simulation_Price.MEAN_replace | Priority | |
---|---|---|---|---|---|
0 | Asia | 1.1 | 1.1 | 22 | 0 |
1 | Europe | 1.2 | 1.3 | 15 | 0 |
[13]:
growth = simulation.scenarios["Growth"].load_pandas(df)
growth.head()
[13]:
Continent | First simulation_Quantity.SUM_multiply | First simulation_Quantity.MEAN_multiply | First simulation_Price.MEAN_replace | Priority |
---|---|---|---|---|
Asia | 1.1 | 1.1 | 22.0 | 0.0 |
Europe | 1.2 | 1.3 | 15.0 | 0.0 |
You can delete a simulation by removing it from the cube’s simulations dict:
[14]:
del cube.simulations["First simulation"]
You can create simulations without any levels. These simulations can only have one value per scenario. For simulations without any fields that act on only one measure, you can assign values to scenarios directly instead of inserting tuples.
[15]:
no_field_sim = cube.setup_simulation("No Field Sim", multiply=[m["Price.SUM"]])
no_field_sim.scenarios["- 10 %"] = 0.9
no_field_sim.scenarios["+ 20%"] = 1.2
no_field_sim.head()
[15]:
No Field Sim | No Field Sim_Price.SUM_multiply | Priority |
---|---|---|
- 10 % | 0.9 | 0.0 |
+ 20% | 1.2 | 0.0 |
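Because this simulation has no levels, each scenario carries a single weight that applies to the whole cube. As a rough plain-Python illustration (not an atoti query), the weights above would scale the Price.SUM of 4280.0 from the first query as follows; the variable names are assumptions made for this sketch.

```python
import math

BASE_PRICE_SUM = 4280.0  # Price.SUM returned by the first cube.query()

# One weight per scenario, as assigned to no_field_sim.scenarios above.
scenario_weights = {"- 10 %": 0.9, "+ 20%": 1.2}

# With the multiply method, each scenario scales the base value by its weight.
simulated_price_sum = {
    name: BASE_PRICE_SUM * weight for name, weight in scenario_weights.items()
}

for name, value in simulated_price_sum.items():
    print(name, round(value, 2))
```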
The priority is used to supersede the default behaviour between conflicting rules in the simulation. By default, the more wildcard fields a rule has, and the further to the left these wildcards are, the lower its priority. So in the following example, the rule on Europe overrides the one on France. To use a wildcard, simply pass None as the value for the column(s) you want the wildcard to be on.
[16]:
priority_sim = cube.setup_simulation(
"Priority Simulation",
levels=[lvl["Continent"], lvl["Country"]],
multiply=[m["Quantity.MEAN"]],
)
priority_scenario = priority_sim.scenarios["growth"]
priority_scenario += ("Europe", None, 1.1)
priority_scenario += (None, "France", 1.2)
priority_scenario.head()
[16]:
Continent | Country | Priority Simulation_Quantity.MEAN_multiply | Priority |
---|---|---|---|
Europe | NaN | 1.1 | 0.0 |
NaN | France | 1.2 | 0.0 |
[17]:
cube.query(
m["Quantity.MEAN"],
levels=[lvl["Continent"], lvl["Country"], lvl["Priority Simulation"]],
)
[17]:
Continent | Country | Priority Simulation | Quantity.MEAN |
---|---|---|---|
Asia | China | Base | 3000.0 |
Asia | China | growth | 3000.0 |
Asia | India | Base | 1850.0 |
Asia | India | growth | 1850.0 |
Europe | France | Base | 1875.0 |
Europe | France | growth | 2062.5 |
Europe | UK | Base | 2750.0 |
Europe | UK | growth | 3025.0 |
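France ends up at 2062.5 (1875 × 1.1) because the Europe rule wins over the France rule by default. One possible reading of the default ordering described above can be sketched in plain Python. The functions `specificity`, `matches` and `winning_rule` are hypothetical, and the tie-breaking used by the actual engine may differ from this sketch.

```python
def specificity(rule_key):
    """Rank a rule: fewer wildcards rank higher, then wildcards further right.

    A wildcard is represented by None, as in the scenario rows above.
    """
    wildcard_positions = tuple(
        index for index, value in enumerate(rule_key) if value is None
    )
    return (-len(wildcard_positions), wildcard_positions)


def matches(rule_key, member):
    """A rule applies when every non-wildcard field equals the member's field."""
    return all(r is None or r == m for r, m in zip(rule_key, member))


def winning_rule(rules, member):
    """Return the (key, weight) rule with the highest rank, or None if none applies."""
    candidates = [rule for rule in rules if matches(rule[0], member)]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: specificity(rule[0]))


# (Continent, Country) keys and their multiply weights, both at priority 0.
rules = [
    (("Europe", None), 1.1),
    ((None, "France"), 1.2),
]

print(winning_rule(rules, ("Europe", "France")))
print(winning_rule(rules, ("Europe", "UK")))
print(winning_rule(rules, ("Asia", "China")))
```

Both rules have one wildcard, but the France rule has its wildcard in the leftmost column, so under this reading it ranks lower and the Europe weight of 1.1 is applied.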
To override the Europe rule with one specific to France, we can give the France rule a higher priority:
[18]:
priority_scenario += (
None,
"France",
120,
normal_priority + 1,
)
priority_scenario.head()
[18]:
Continent | Country | Priority Simulation_Quantity.MEAN_multiply | Priority |
---|---|---|---|
Europe | NaN | 1.1 | 0.0 |
NaN | France | 120.0 | 1.0 |
[19]:
cube.query(
m["Quantity.MEAN"],
levels=[lvl["Continent"], lvl["Country"], lvl["Priority Simulation"]],
)
[19]:
Continent | Country | Priority Simulation | Quantity.MEAN |
---|---|---|---|
Asia | China | Base | 3000.0 |
Asia | China | growth | 3000.0 |
Asia | India | Base | 1850.0 |
Asia | India | growth | 1850.0 |
Europe | France | Base | 1875.0 |
Europe | France | growth | 225000.0 |
Europe | UK | Base | 2750.0 |
Europe | UK | growth | 3025.0 |
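The new France value is 1875 × 120 = 225000, while the UK still gets the Europe weight. The plain-Python sketch below reproduces these numbers under one assumption: among the rules matching a member, the one with the highest explicit priority wins. The names `rules` and `simulated_mean` are illustrative, and ties between equal priorities are deliberately not modelled here.

```python
import math

# Base Quantity.MEAN per (Continent, Country), copied from the query output.
BASE_QUANTITY_MEAN = {
    ("Asia", "China"): 3000.0,
    ("Asia", "India"): 1850.0,
    ("Europe", "France"): 1875.0,
    ("Europe", "UK"): 2750.0,
}

# Rule key -> (weight, priority). None stands for a wildcard.
rules = {}
rules[("Europe", None)] = (1.1, 0)
rules[(None, "France")] = (1.2, 0)
# Inserting a row with the same key replaces the previous France rule,
# which matches the head() output showing a single France row.
rules[(None, "France")] = (120.0, 1)


def simulated_mean(member):
    """Scale the base mean by the matching rule with the highest priority."""
    base_value = BASE_QUANTITY_MEAN[member]
    matching = [
        (weight, priority)
        for key, (weight, priority) in rules.items()
        if all(k is None or k == m for k, m in zip(key, member))
    ]
    if not matching:
        return base_value  # no rule applies, the value is unchanged
    weight, _priority = max(matching, key=lambda rule: rule[1])
    return base_value * weight


for member in BASE_QUANTITY_MEAN:
    print(member, round(simulated_mean(member), 2))
```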