0.6.0 (July 20, 2021)#

Highlights:

The main theme of this release is simplification.

From the session configuration to simulations through data loading, many aspects of the API have been revised. The goal was to increase the API’s intuitiveness and consistency, in turn making Atoti easier to learn and use.

This release also comes with breaking changes, most of them being the result of:

Naming standardization (e.g. Table instead of Store).
Removal of non-Atoti-specific features that can be replicated just as well with standard Python functions or popular libraries (e.g. data sampling and file watching).

A lot of performance improvement work also happened behind the scenes. In particular, creating and joining tables and defining new measures will be quicker than before.

Added#

atoti.Session.link() and atoti_query.QuerySession.link() to display a link to the session in JupyterLab.
atoti.UserContentStorageConfig to back the user content storage with a remote database (issue #277).
Connect with Excel and Watch local files how-tos.

User interface#

Filters can be saved.

Like dashboards, filters are saved in the user content storage configured through atoti.create_session()’s user_content_storage parameter.
Saved dashboards listed in the app’s home page show a thumbnail instead of a blank card.
Ability to duplicate a dashboard page by right-clicking on its tab.

Changed#

All the changes are BREAKING.

Config#

atoti.create_session()’s config parameter expects a plain Python object following the structure of SessionConfig rather than an object created with atoti.config.create_config() or a path to a config file.
```
  session = tt.create_session(
-   config=tt.config.create_config(port=9090)
+   config={"port": 9090}
  )
```
The metadata_db config parameter has been renamed user_content_storage. Existing metadata.mv.db files must be renamed content.mv.db.
Other options have been changed:
- default_locale and i18n_directory have been regrouped under atoti.create_session()’s i18n parameter.
- java_args has been renamed java_options.
- jwt_key_pair has been renamed jwt.
- max_memory has been removed. Pass an -Xmx option to atoti.create_session()’s java_options parameter instead.
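
The renames above can be applied mechanically when the old options are available as a plain dict. The sketch below is a hypothetical helper, not part of Atoti: the name migrate_config_options and the assumption that java_options holds a list of JVM option strings are illustrative. It deliberately leaves default_locale and i18n_directory untouched, since the layout of the i18n value is not described here.

```python
from typing import Any, Dict

# Old option name -> new option name, as listed in the changes above.
RENAMED_OPTIONS = {
    "metadata_db": "user_content_storage",
    "java_args": "java_options",
    "jwt_key_pair": "jwt",
}


def migrate_config_options(old: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of ``old`` using the 0.6.0 option names (hypothetical helper)."""
    new = {RENAMED_OPTIONS.get(key, key): value for key, value in old.items()}
    # max_memory is gone: express it as a JVM -Xmx option instead.
    max_memory = new.pop("max_memory", None)
    if max_memory is not None:
        # Assumption: java_options is a list of JVM option strings.
        new["java_options"] = [*new.get("java_options", []), f"-Xmx{max_memory}"]
    return new


old_options = {"port": 9090, "java_args": ["-verbose:gc"], "max_memory": "8G"}
new_options = migrate_config_options(old_options)
```

The resulting dict could then be passed as the config argument of atoti.create_session(), after checking it against the structure of SessionConfig.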

Data loading#

The atoti.store.Store class has been renamed atoti.Table. Likewise, the atoti.Session.stores property has become atoti.Session.tables, the store_name parameter has become table_name, and atoti.Session.create_store() has become atoti.Session.create_table().
int and long table columns, unless they are keys, automatically become measures instead of levels. With this change, all the numeric columns behave the same.

atoti.Table.load_pandas() and atoti.Table.append() do not automatically infer date types. date or datetime objects must be used.

- table.append("2020-02-01", "France", "Paris", "id-113", 111.)
+ table.append(datetime.date(2020, 2, 1), "France", "Paris", "id-113", 111.)
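
When the dates arrive as ISO 8601 strings, they can be converted explicitly before being appended. This is a minimal sketch using only the standard library; the raw_row tuple is made-up sample data, and the final call is left as a comment because it needs an existing table.

```python
import datetime

# A row as it might come from a text source: the date is still a string.
raw_row = ("2020-02-01", "France", "Paris", "id-113", 111.0)

# Convert the ISO 8601 string explicitly; the remaining values are unchanged.
row = (datetime.date.fromisoformat(raw_row[0]), *raw_row[1:])

# With an existing table, the converted row would then be passed as:
# table.append(*row)
```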

atoti.Session.read_csv() and atoti.Table.load_csv()’s sep and array_sep parameters have been renamed separator and array_separator.
atoti.Session.read_csv() and atoti.Session.read_parquet()’s table_name parameter is required when the path argument is a glob pattern.

atoti.Session.read_csv() and atoti.Table.load_csv()’s path parameter no longer accepts a directory. Use a glob instead.

  session.read_csv(
-   path="path/to/sales/",
+   path="path/to/sales/*.csv",
    table_name="Sales"
  )

When passing a directory to atoti.Session.read_parquet() and atoti.Table.load_parquet()’s path parameter, only Parquet files in this directory will be loaded, not the ones in possible subdirectories. Use a glob to load Parquet files in subdirectories.
```
  session.read_parquet(
-   path="path/to/sales/",
+   path="path/to/sales/**/*.parquet",
    table_name="Sales"
  )
```
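
To preview which files a pattern would pick up before loading them, the same patterns can be tried with pathlib. This sketch assumes that pathlib's glob semantics are close enough to the ones Atoti applies, which is worth verifying on a real directory; the file names are made up.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    sales = Path(tmp) / "sales"
    (sales / "2021").mkdir(parents=True)
    (sales / "a.parquet").touch()
    (sales / "2021" / "b.parquet").touch()

    # Files directly inside the directory only.
    top_level = sorted(
        path.relative_to(sales).as_posix() for path in sales.glob("*.parquet")
    )
    # "**" also descends into subdirectories.
    recursive = sorted(
        path.relative_to(sales).as_posix() for path in sales.glob("**/*.parquet")
    )
```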
atoti.Session.read_sql() and atoti.Table.load_sql() expect the username and password to be in the connection string passed to the url parameter instead of being passed as dedicated parameters. The url parameter has also been made keyword-only.
```
  table.load_sql(
-   "h2:/file/path",
    "SELECT * FROM MYTABLE;",
+   url="h2:/file/path;USER=username;PASSWORD=passwd",
-   username="username",
-   password="passwd"
  )
```
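
For code that still receives the credentials separately, the connection string can be assembled before the call. The helper below is hypothetical and only covers the H2-style ;USER=...;PASSWORD=... syntax shown above; other databases place credentials differently in their URLs.

```python
def h2_url_with_credentials(url: str, username: str, password: str) -> str:
    """Append H2-style USER and PASSWORD settings to a connection string."""
    return f"{url};USER={username};PASSWORD={password}"


url = h2_url_with_credentials("h2:/file/path", "username", "passwd")
```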
atoti.Session.read_numpy() (array, columns, store_name, *, keys=None, in_all_scenarios=True, partitioning=None, hierarchized_columns=None, **kwargs) -> (array, *, columns, table_name, keys=None, partitioning=None, hierarchized_columns=None, **kwargs).
```
  session.read_numpy(
-   np_array, ["Id", "Country", "City", "Price"], "Prices",
+   np_array, columns=["Id", "Country", "City", "Price"], table_name="Prices",
  )
```
atoti.Session.start_transaction() () -> (scenario_name="Base"). Transactions only accept loading operations which impact the scenario they are started on.

Querying#

The levels parameter of atoti.Cube.query() and atoti_query.QueryCube.query does not accept a single level anymore. If a value is passed, it must be a sequence of levels. No more choice overload.
```
  cube.query(
    m["contributors.COUNT"],
-   levels=l["Product"]
+   levels=[l["Product"]]
  )
```
Searching for the regular expression levels=([^[][^,)]+) and replacing it with levels=[$1] can help adapt most single-level calls to the new syntax.
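
The same search and replace can be scripted with Python's re module. This is a sketch rather than a complete migration tool: wrap_single_level is a hypothetical helper, and the rewritten source should still be reviewed since the pattern only covers most occurrences.

```python
import re

# Same pattern as above, with the "[" inside the character class escaped.
SINGLE_LEVEL = re.compile(r"levels=([^\[][^,)]+)")


def wrap_single_level(source: str) -> str:
    """Wrap a single level passed to ``levels=`` in a list."""
    # Python's re module writes the first group as \1 rather than $1.
    return SINGLE_LEVEL.sub(r"levels=[\1]", source)


old_call = 'cube.query(m["contributors.COUNT"], levels=l["Product"])'
new_call = wrap_single_level(old_call)
```

Calls that already pass a list start with [ right after levels= and are therefore left unchanged, so the helper can be run more than once on the same file.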

Passing no measures to atoti.Cube.query() and atoti_query.QueryCube.query queries no measures instead of querying all visible measures (issue #220).

  # Query all visible measures on all products:
  cube.query(
+   *[measure for measure in cube.measures.values() if measure.visible],
    levels=[l["Product"]]
  )

Simulations#

atoti.Cube.setup_simulation() has been replaced with atoti.Cube.create_parameter_simulation(). Instead of intercepting the regular aggregation flow of an existing measure like setup_simulation(), create_parameter_simulation() creates a new parameter measure that can be used to define new measures or redefine existing ones. Levels created by setup_simulation() were in the Measure simulations dimension but the levels created by create_parameter_simulation() have the same name, hierarchy name, and dimension name as the simulation.

  turnover = tt.agg.sum(table["Unit price"] * table["Quantity"])
- m["Turnover"] = turnover
- simulation = cube.setup_simulation(
-   "Country Simulation",
-   levels=[l["Country"]],
-   multiply=[m["Turnover"]]
- )
+ simulation = cube.create_parameter_simulation(
+   "Country Simulation",
+   measure_name="Country parameter",
+   default_value=1,
+   levels=[l["Country"]]
+ )
+ m["Turnover"] = tt.agg.sum(
+   turnover * m["Country parameter"],
+   scope=tt.scope.origin(l["Country"])
+ )
- simulation.scenarios["France boost"] += ("France", 1.15)
+ simulation += ("France boost", "France", 1.15)
  cube.query(
    m["Turnover"],
    levels=[l["Country Simulation"], l["Country"]]
  )

Other#

atoti.Column objects are no longer automatically converted into measures. Columns and measures cannot be used together in calculations without first converting the column into a measure. atoti.value() can be used to convert a table column to a measure.
```
- m["Quantity.VALUE"] = table["Quantity"]
+ m["Quantity.VALUE"] = tt.value(table["Quantity"])
- m["Final Price"] = m["Unit Price"] * table["Rate"]
+ m["Final Price"] = m["Unit Price"] * tt.value(table["Rate"])
```
atoti.experimental.create_date_hierarchy() (name, cube, column, *, levels={'Day': 'd', 'Month': 'M', 'Year': 'Y'}) -> (name, *, cube, column, levels={'Day': 'd', 'Month': 'M', 'Year': 'y'}).
atoti.Cube.create_store_column_parameter_hierarchy() has been renamed atoti.Cube.create_parameter_hierarchy_from_column(). The created hierarchy is slicing by default.

atoti.Cube.create_static_parameter_hierarchy(name, members, *, indices=None, data_type=None, index_measure=None) has been changed to atoti.Cube.create_parameter_hierarchy_from_members(name, members, *, data_type=None, index_measure_name=None). The sorted() function can be used as a replacement for the indices parameter to change the order of the members.

- cube.create_static_parameter_hierarchy(
+ cube.create_parameter_hierarchy_from_members(
    "Date",
-   ["2020/01/30", "2020/02/27", "2020/03/30", "2020/04/30"],
-   indices=[3, 2, 1, 0],
+   sorted(["2020/01/30", "2020/02/27", "2020/03/30", "2020/04/30"], reverse=True),
-   index_measure="Date Index",
+   index_measure_name="Date Index",
  )
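
When the old indices do not correspond to a simple reverse sort, the members can be reordered by their former index before being passed to the new method. This sketch rests on two assumptions that are not stated above: that indices was a permutation of range(len(members)), and that a member's index is now its position in the list. The helper name order_by_indices is hypothetical.

```python
from typing import List, Sequence


def order_by_indices(members: Sequence[str], indices: Sequence[int]) -> List[str]:
    """Order ``members`` so that each one sits at the position of its old index."""
    pairs = sorted(zip(indices, members), key=lambda pair: pair[0])
    return [member for _, member in pairs]


members = ["2020/01/30", "2020/02/27", "2020/03/30", "2020/04/30"]
ordered = order_by_indices(members, [3, 2, 1, 0])
```

For the example above, the result is the same list as sorted(members, reverse=True).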

atoti.parent_value()’s on parameter has been removed in favor of degrees which has become required.

  tt.parent_value(
    m["Price.SUM"],
-   on=h["Date"],
+   degrees={h["Date"]: 1}
  )

The following value of atoti.date_shift()’s method parameter has been renamed next.

  tt.date_shift(
    m["Price.SUM"],
    h["Date"],
    offset="1D",
-   method="following"
+   method="next"
  )

atoti.date_shift() (measure, on, offset, *, method='exact') -> (measure, on, *, offset, method='exact').
atoti.rank() (measure, hierarchy, ascending=True, apply_filters=True) -> (measure, hierarchy, *, ascending=True, apply_filters=True).
Measure has been renamed atoti.MeasureDescription and atoti.NamedMeasure has been renamed atoti.Measure.
The REST API for tables only grants write access to users with ROLE_ADMIN, and the configured atoti_plus.security.Role.restrictions are used to limit which lines each user can read (issue #270). The REST API does not allow modification of tables created with atoti.Cube.create_parameter_simulation() and atoti.Cube.create_parameter_hierarchy_from_members().
comparator.ASC has become atoti.comparator.ASCENDING and comparator.DESC has become atoti.comparator.DESCENDING.

Deprecated#

Support for remote content servers. Configure the user content storage with a JDBC atoti.UserContentStorageConfig.url instead.

Removed#

All the removals are BREAKING.

Config#

Global configuration mechanism. Only the object passed to atoti.create_session()’s config parameter will be used.
cache_cloud_files config parameter.
accent_color and frame_color branding parameters.

Data loading#

Watching CSV and Parquet files. This removes the watch parameter of load_*() and read_*() methods. The Watch local files how-to shows an alternative.
Sampling of data loading operations. This removes the atoti.Session.load_all_data() method and the sampling_mode parameter of atoti.config.create_config() as well as of methods such as atoti.Session.read_csv() and atoti.Session.read_parquet().

The recommended alternative for projects loading large amounts of data into their session is to:
1. Extract a smaller and meaningful dataset and work with it when iterating on the declaration of the data model.
2. Perform the large data loading operations after the tables and cubes of the session have reached their final structures.
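
Step 1 can be done with the standard library alone. The sketch below writes a random sample of a CSV file to a new file; write_sample is a hypothetical helper, it reads the whole source file into memory, and a random sample is only one possible reading of "meaningful" (a filter on a date or a region may suit a project better).

```python
import csv
import random
from pathlib import Path


def write_sample(source: Path, target: Path, size: int, seed: int = 0) -> int:
    """Copy the header and up to ``size`` randomly chosen rows of a CSV file."""
    with source.open(newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        rows = list(reader)
    rng = random.Random(seed)  # Fixed seed so the sample is reproducible.
    sample = rng.sample(rows, min(size, len(rows)))
    with target.open("w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(sample)
    return len(sample)
```

The sampled file can then be loaded with atoti.Session.read_csv() while the data model is being designed, and swapped for the full file once the tables and cubes are stable.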

truncate parameter of atoti.Table’s loading methods. The same behavior can be replicated by calling atoti.Table.drop() first.

- store.load_pandas(df, truncate=True)
+ with session.start_transaction():
+     table.drop()
+     table.load_pandas(df)

in_all_scenarios parameter from all methods of the API.
- atoti.Session.read_csv() and all other atoti.Session.read_*() methods only load data in the Base scenario.
- atoti.Table.load_csv() and all other atoti.Table.load_*() methods only load data in the current scenario of the table.
- Data loaded into parameter tables created by atoti.Cube.create_parameter_simulation() or atoti.Cube.create_parameter_hierarchy_from_members() is always loaded into all the existing scenarios.

atoti.store.StoreScenarios.load_csv().

  from pathlib import Path
  base_directory = Path("path/to/base/directory")
- store.scenarios.load_csv(base_directory)
+ for scenario_path in base_directory.iterdir():
+     if scenario_path.is_dir():
+         table.scenarios[scenario_path.name].load_csv(f"{scenario_path.resolve()}/**.csv")

This can be combined with the Watch local files how-to to generate new scenarios in real-time.
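
As a rough illustration of that idea, new scenario directories can also be detected by polling the base directory. This is a standard-library sketch, not the approach taken by the how-to, and new_scenario_names is a hypothetical helper.

```python
from pathlib import Path
from typing import Set


def new_scenario_names(base_directory: Path, known: Set[str]) -> Set[str]:
    """Return the names of subdirectories that are not in ``known`` yet."""
    current = {path.name for path in base_directory.iterdir() if path.is_dir()}
    return current - known
```

Calling it periodically, loading each returned scenario with the loop above, and adding the names to known afterwards would keep the table scenarios in sync with the directory.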

Simulations#

Measure simulations and Source simulation editors in the app.
atoti.Table’s source_simulation_enabled property since it was only used by the Source simulation editor.

Other#

atoti.Session.url and atoti.Session.excel_url, dropping the need for atoti.config.create_config()’s url_pattern parameter (issue #260).

There is no reliable way for the session to know if it is exposed through a reverse proxy, a load balancer, a Docker container, or anything else that would make the public URL used to access the session and the hostname/IP of the machine hosting the session different. In these situations, url_pattern had to be defined to prevent atoti.Session.url and atoti.Session.excel_url from being unusable. However, it is simpler to build the session URL directly from the known network setup and atoti.Session.port than using url_pattern.
- In JupyterLab, atoti.Session.link() can be used instead, even when Jupyter Server is not running locally.
- For a session running locally, atoti.Session.url can be replaced with f"http://localhost:{session.port}".
- The Connect with Excel guide shows how to connect Excel without atoti.Session.excel_url.
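
Building the URL from the known network setup can be as small as the following sketch. The helper session_url and the example host name are hypothetical; behind a reverse proxy, the public scheme, host, and port may all differ from the ones the session listens on.

```python
def session_url(port: int, host: str = "localhost", scheme: str = "http") -> str:
    """Build a session URL from a known network setup (hypothetical helper)."""
    return f"{scheme}://{host}:{port}"


# For a session running locally, session.port would be passed here.
local_url = session_url(9090)
```
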
atoti.Hierarchy.name can no longer be changed to rename the hierarchy. See Hierarchies for information on renaming hierarchies.

atoti.Table.shape.

- table.shape
+ {"columns": len(table.columns), "rows": len(table)}

Support for Python 3.7.0. Python versions >= 3.7.1 are supported.
Following deprecated functions and methods:
- atoti.agg._single_value().
- atoti.agg._stop().
- atoti.config.create_role(), atoti.config.create_basic_user(), and atoti.config.create_kerberos_user(). Use atoti.Session.security instead.
- atoti.Session.logs_tail().
```
- lines = session.logs_tail(n=10)
+ with open(session.logs_path) as logs:
+   lines = logs.readlines()[-10:]
```
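
For large log files, reading every line just to keep the last few can be avoided with a bounded deque. This is an optional variation on the replacement above, using only the standard library; tail is a hypothetical helper name.

```python
from collections import deque
from pathlib import Path
from typing import List, Union


def tail(path: Union[str, Path], n: int = 10) -> List[str]:
    """Return the last ``n`` lines of a text file."""
    with open(path) as lines:
        # A bounded deque keeps only the most recent n lines while iterating.
        return list(deque(lines, maxlen=n))
```

With a running session, it would be called as tail(session.logs_path, n=10).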

Fixed#

Measure names including a comma (,) are rejected instead of silently renamed (issue #271).
Measures combining atoti.scope.cumulative() and atoti.shift() (issue #295).
atoti.value() when no name was specified for the created measure (issue #281, issue #287).
Handling of glob patterns containing parentheses (issue #285).
hierarchized_columns of joined tables not taken into account (issue #303).
Handling of multiline strings in dataframes passed to atoti.Session.read_pandas() and atoti.Table.load_pandas() (issue #186).
Handling of null arrays when aggregating table columns.
Errors occurring inside transactions leaving the session in an unstable state.