0.6.0 (July 20, 2021)#
Highlights:
The main theme of this release is simplification.
From the session configuration to simulations through data loading, many aspects of the API have been revised. The goal was to increase the API’s intuitiveness and consistency, in turn making Atoti easier to learn and use.
This release also comes with breaking changes, most of them being the result of:
Naming standardization (e.g. Table instead of Store).
Removal of non-Atoti-specific features that can be replicated just as well with standard Python functions or popular libraries (e.g. data sampling and file watching).
A lot of performance improvement work also happened behind the scenes. In particular, creating and joining tables and defining new measures will be quicker than before.
Added#
atoti.Session.link() and atoti_query.QuerySession.link to display a link to the session in JupyterLab.
atoti.UserContentStorageConfig to back the user content storage with a remote database (issue #277).
Connect with Excel and Watch local files how-tos.
User interface#
Filters can be saved. Like dashboards, filters are saved in the storage configured through atoti.create_session()’s user_content_storage parameter.
Saved dashboards listed in the app’s home page show a thumbnail instead of a blank card.
Ability to duplicate a dashboard page by right-clicking on its tab.
Changed#
All the changes are BREAKING.
Config#
atoti.create_session()’s config parameter expects a plain Python object following the structure of SessionConfig rather than an object created with atoti.config.create_config() or a path to a config file.
      session = tt.create_session(
    -     config=tt.config.create_config(port=9090)
    +     config={"port": 9090}
      )
The metadata_db config parameter has been renamed user_content_storage. Existing metadata.mv.db files must be renamed content.mv.db.
Other options have been changed:
default_locale and i18n_directory have been regrouped under atoti.create_session()’s i18n parameter.
java_args has been renamed java_options.
jwt_key_pair has been renamed jwt.
max_memory has been removed. Pass an -Xmx option to atoti.create_session()’s java_options parameter instead.
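The renames above can be applied mechanically to an existing set of options. The following is a minimal sketch, not part of Atoti: `migrate_config` is a hypothetical helper, and it assumes that java_args was a list of strings, that max_memory held a value such as "8G", and that the keys nested under i18n keep their previous names (check SessionConfig before relying on that).

```python
from typing import Any, Dict


def migrate_config(old: Dict[str, Any]) -> Dict[str, Any]:
    """Translate 0.5-style config keyword arguments into a 0.6-style config dict.

    Hypothetical helper written for illustration; it is not provided by Atoti.
    """
    new = dict(old)  # work on a shallow copy so the input is left untouched

    # Straight renames listed in the changelog.
    for old_key, new_key in (
        ("metadata_db", "user_content_storage"),
        ("java_args", "java_options"),
        ("jwt_key_pair", "jwt"),
    ):
        if old_key in new:
            new[new_key] = new.pop(old_key)

    # max_memory is gone: express it as an -Xmx JVM option instead.
    if "max_memory" in new:
        java_options = list(new.get("java_options", []))
        java_options.append(f"-Xmx{new.pop('max_memory')}")
        new["java_options"] = java_options

    # Locale options are regrouped under "i18n".
    # Assumption: the nested key names are kept as they were.
    i18n = {k: new.pop(k) for k in ("default_locale", "i18n_directory") if k in new}
    if i18n:
        new["i18n"] = i18n

    return new


old_config = {
    "port": 9090,
    "max_memory": "8G",
    "java_args": ["-verbose:gc"],
    "metadata_db": "./metadata",
}
new_config = migrate_config(old_config)
print(new_config)
# The result would then be passed as tt.create_session(config=new_config).
```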
Data loading#
The atoti.store.Store class has been renamed atoti.Table. Same thing for the atoti.Session.stores property which has become atoti.Session.tables, the store_name parameter which has become table_name, and atoti.Session.create_store() which has become atoti.Session.create_table().
int and long table columns, unless they are keys, automatically become measures instead of levels. With this change, all the numeric columns behave the same.
atoti.Table.load_pandas() and atoti.Table.append() no longer automatically infer date types. date or datetime objects must be used.
    - table.append("2020-02-01", "France", "Paris", "id-113", 111.)
    + table.append(datetime.date(2020, 2, 1), "France", "Paris", "id-113", 111.)
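When rows still arrive with dates as ISO-formatted strings, the conversion can be done with the standard library before appending. This is a small sketch; `raw_row` is made-up sample data and the commented `table.append` call assumes an existing atoti.Table named `table`.

```python
import datetime

# Sample row as it might come from an upstream system, date still a string.
raw_row = ("2020-02-01", "France", "Paris", "id-113", 111.0)

# Parse the ISO date explicitly since it is no longer inferred.
row = (datetime.date.fromisoformat(raw_row[0]), *raw_row[1:])
print(row)

# With `table` an existing atoti.Table (assumption), the row would be added with:
# table.append(*row)
```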
atoti.Session.read_csv() and atoti.Table.load_csv()’s sep and array_sep parameters have been renamed separator and array_separator.
atoti.Session.read_csv() and atoti.Session.read_parquet()’s table_name parameter is required when the path argument is a glob pattern.
atoti.Session.read_csv() and atoti.Table.load_csv()’s path parameter no longer accepts a directory. Use a glob instead.
      session.read_csv(
    -     path="path/to/sales/",
    +     path="path/to/sales/*.csv",
          table_name="Sales"
      )
When passing a directory to atoti.Session.read_parquet() and atoti.Table.load_parquet()’s path parameter, only Parquet files in this directory will be loaded, not the ones in possible subdirectories. Use a glob to load Parquet files in subdirectories.
      session.read_parquet(
    -     path="path/to/sales/",
    +     path="path/to/sales/**/*.parquet",
          table_name="Sales"
      )
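To see which files a pattern would select before handing it to a loading method, the pattern can be tried out with Python's own glob module. This sketch only illustrates the difference between a top-level pattern and a recursive one; Atoti resolves globs itself and its matching rules may differ in details, so treat the output as indicative.

```python
import glob
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Build a throwaway directory tree: one file at the top, one in a subdirectory.
    sales = Path(root) / "sales"
    (sales / "2021").mkdir(parents=True)
    (sales / "a.parquet").touch()
    (sales / "2021" / "b.parquet").touch()

    # Files directly inside the directory only.
    top_level = sorted(Path(p).name for p in glob.glob(f"{sales}/*.parquet"))

    # Files in the directory and in any subdirectory (Python semantics).
    recursive = sorted(
        Path(p).name for p in glob.glob(f"{sales}/**/*.parquet", recursive=True)
    )

print(top_level)
print(recursive)
```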
atoti.Session.read_sql() and atoti.Table.load_sql() expect the username and password to be in the connection string passed to the url parameter instead of being passed as dedicated parameters. The url parameter has also been made keyword-only.
      table.load_sql(
    -     "h2:file:/file/path",
          "SELECT * FROM MYTABLE;",
    +     url="h2:file:/file/path;USER=username;PASSWORD=passwd",
    -     username="username",
    -     password="passwd"
      )
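Code that used to receive the credentials separately can build the connection string in one place. The helper below is hypothetical and only covers the H2-style `;USER=...;PASSWORD=...` suffix shown in the example above; other databases use different connection string syntaxes.

```python
def h2_url_with_credentials(url: str, username: str, password: str) -> str:
    """Append H2-style USER/PASSWORD settings to a connection string.

    Hypothetical helper for illustration; it does no escaping of special characters.
    """
    return f"{url};USER={username};PASSWORD={password}"


url = h2_url_with_credentials("h2:file:/file/path", "username", "passwd")
print(url)

# With `table` an existing atoti.Table (assumption):
# table.load_sql("SELECT * FROM MYTABLE;", url=url)
```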
atoti.Session.read_numpy() (array, columns, store_name, *, keys=None, in_all_scenarios=True, partitioning=None, hierarchized_columns=None, **kwargs) -> (array, *, columns, table_name, keys=None, partitioning=None, hierarchized_columns=None, **kwargs).
      session.read_numpy(
    -     np_array, ["Id", "Country", "City", "Price"], "Prices",
    +     np_array, columns=["Id", "Country", "City", "Price"], table_name="Prices",
      )
atoti.Session.start_transaction() () -> (scenario_name="Base"). Transactions only accept loading operations which impact the scenario they are started on.
Querying#
The levels parameter of atoti.Cube.query() and atoti_query.QueryCube.query does not accept a single level anymore. If a value is passed, it must be a sequence of levels. No more choice overload.
      cube.query(
          m["contributors.COUNT"],
    -     levels=l["Product"]
    +     levels=[l["Product"]]
      )
Searching for the regular expression levels=([^[][^,)]+) and replacing it with levels=[$1] can help adapt most occurrences of the single level calls to the new syntax.
Passing no measures to atoti.Cube.query() and atoti_query.QueryCube.query queries no measures instead of querying all visible measures (issue #220).
      # Query all visible measures on all products:
      cube.query(
    +     *[measure for measure in cube.measures.values() if measure.visible],
          levels=[l["Product"]]
      )
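The search-and-replace suggested above for single-level calls can also be scripted. Here is a sketch using Python's re module, where the `$1` back-reference is written `\1`; `wrap_single_level` is a hypothetical helper name. As the changelog says, the expression covers most occurrences, not all: a level whose name contains a comma or a closing parenthesis would be cut short.

```python
import re

# Same expression as in the changelog: a levels= argument not starting with "[".
SINGLE_LEVEL = re.compile(r"levels=([^[][^,)]+)")


def wrap_single_level(source: str) -> str:
    """Wrap single-level `levels=` arguments in a list."""
    return SINGLE_LEVEL.sub(r"levels=[\1]", source)


old_call = 'cube.query(m["contributors.COUNT"], levels=l["Product"])'
new_call = wrap_single_level(old_call)
print(new_call)

# Calls that already pass a list are left alone, so the rewrite can be re-run safely.
assert wrap_single_level(new_call) == new_call
```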
Simulations#
atoti.Cube.setup_simulation() has been replaced with atoti.Cube.create_parameter_simulation(). Instead of intercepting the regular aggregation flow of an existing measure like setup_simulation(), create_parameter_simulation() creates a new parameter measure that can be used to define new measures or redefine existing ones. Levels created by setup_simulation() were in the Measure simulations dimension but the levels created by create_parameter_simulation() have the same name, hierarchy name, and dimension name as the simulation.
      turnover = tt.agg.sum(table["Unit price"] * table["Quantity"])
    - m["Turnover"] = turnover
    - simulation = cube.setup_simulation(
    -     "Country Simulation",
    -     levels=[l["Country"]],
    -     multiply=[m["Turnover"]]
    - )
    + simulation = cube.create_parameter_simulation(
    +     "Country Simulation",
    +     measure_name="Country parameter",
    +     default_value=1,
    +     levels=[l["Country"]]
    + )
    + m["Turnover"] = tt.agg.sum(
    +     turnover * m["Country parameter"],
    +     scope=tt.scope.origin(l["Country"])
    + )
    - simulation.scenarios["France boost"] += ("France", 1.15)
    + simulation += ("France boost", "France", 1.15)
      cube.query(
          m["Turnover"],
          levels=[l["Country Simulation"], l["Country"]]
      )
Other#
atoti.Column objects are no longer automatically converted into measures. Columns and measures cannot be used together in calculations without first converting the column into a measure. atoti.value() can be used to convert a table column to a measure.
    - m["Quantity.VALUE"] = table["Quantity"]
    + m["Quantity.VALUE"] = tt.value(table["Quantity"])
    - m["Final Price"] = m["Unit Price"] * table["Rate"]
    + m["Final Price"] = m["Unit Price"] * tt.value(table["Rate"])
atoti.experimental.create_date_hierarchy() (name, cube, column, *, levels={'Day': 'd', 'Month': 'M', 'Year': 'Y'}) -> (name, *, cube, column, levels={'Day': 'd', 'Month': 'M', 'Year': 'y'}).
atoti.Cube.create_store_column_parameter_hierarchy() has been renamed atoti.Cube.create_parameter_hierarchy_from_column(). The created hierarchy is slicing by default.
atoti.Cube.create_static_parameter_hierarchy(name, members, *, indices=None, data_type=None, index_measure=None) has been changed to atoti.Cube.create_parameter_hierarchy_from_members(name, members, *, data_type=None, index_measure_name=None). The sorted() function can be used as a replacement for the indices parameter to change the order of the members.
    - cube.create_static_parameter_hierarchy(
    + cube.create_parameter_hierarchy_from_members(
          "Date",
    -     ["2020/01/30", "2020/02/27", "2020/03/30", "2020/04/30"],
    -     store_name="Date",
    -     indices=[3, 2, 1, 0],
    +     sorted(["2020/01/30", "2020/02/27", "2020/03/30", "2020/04/30"], reverse=True),
    -     index_measure="Date Index",
    +     index_measure_name="Date Index",
      )
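Since the member order is now simply the order of the list passed in, any ordering can be prepared with sorted() beforehand. This is a short sketch with made-up member lists; `calendar_order` is an illustrative reference list, not an Atoti concept.

```python
# Reverse lexicographic order, as in the changelog example.
dates = ["2020/01/30", "2020/02/27", "2020/03/30", "2020/04/30"]
newest_first = sorted(dates, reverse=True)
print(newest_first)

# A custom order can be expressed with a key function.
months = ["March", "January", "February"]
calendar_order = ["January", "February", "March"]
by_calendar = sorted(months, key=calendar_order.index)
print(by_calendar)
```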
atoti.parent_value()’s on parameter has been removed in favor of degrees which has become required.
      tt.parent_value(
          m["Price.SUM"],
    -     on=h["Date"],
    +     degrees={h["Date"]: 1}
      )
atoti.date_shift()’s following method has been renamed next.
      tt.date_shift(
          m["Price.SUM"],
          h["Date"],
          offset="1D",
    -     method="following"
    +     method="next"
      )
atoti.date_shift() (measure, on, offset, *, method='exact') -> (measure, on, *, offset, method='exact').
atoti.rank() (measure, hierarchy, ascending=True, apply_filters=True) -> (measure, hierarchy, *, ascending=True, apply_filters=True).
Measure has been renamed atoti.MeasureDescription and atoti.NamedMeasure has been renamed atoti.Measure.
The REST API for tables only grants write access to users with ROLE_ADMIN, and the configured role restrictions (atoti_plus.security.Role.restrictions) are used to limit which lines each user can read (issue #270). The REST API does not allow modification of tables created with atoti.Cube.create_parameter_simulation() and atoti.Cube.create_parameter_hierarchy_from_members().
comparator.ASC has become atoti.comparator.ASCENDING and comparator.DESC has become atoti.comparator.DESCENDING.
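Several of the signature changes in this section move parameters behind the `*` marker, which makes them keyword-only. The sketch below uses a stand-in function with the same parameter list as the 0.6.0 atoti.date_shift() signature to show the effect on callers; it is not the Atoti implementation and only returns its arguments.

```python
def date_shift(measure, on, *, offset, method="exact"):
    """Stand-in mirroring the 0.6.0 parameter list; it only echoes its arguments."""
    return {"measure": measure, "on": on, "offset": offset, "method": method}


# A 0.5-style call passing offset positionally is now rejected by Python itself.
try:
    date_shift("Price.SUM", "Date", "1D")
    positional_accepted = True
except TypeError:
    positional_accepted = False
print(positional_accepted)

# The same call with offset passed by name works.
shifted = date_shift("Price.SUM", "Date", offset="1D")
print(shifted)
```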
Deprecated#
Support for remote content servers. Configure the user content storage with a JDBC atoti.UserContentStorageConfig.url instead.
Removed#
All the removals are BREAKING.
Config#
Global configuration mechanism. Only the object passed to atoti.create_session()’s config parameter will be used.
cache_cloud_files config parameter.
accent_color and frame_color branding parameters.
Data loading#
Watching CSV and Parquet files. This removes the watch parameter of load_*() and read_*() methods. The Watch local files how-to shows an alternative.
Sampling of data loading operations. This removes the atoti.Session.load_all_data() method and the sampling_mode parameter of atoti.config.create_config() as well as of methods such as atoti.Session.read_csv() and atoti.Session.read_parquet().
The recommended alternative for projects loading large amounts of data into their session is to:
Extract a smaller and meaningful dataset and work with it when iterating on the declaration of the data model.
Perform the large data loading operations after the tables and cubes of the session have reached their final structures.
truncate parameter of atoti.Table’s loading methods. The same behavior can be replicated by calling atoti.Table.drop() first.
    - store.load_pandas(df, truncate=True)
    + with session.start_transaction():
    +     table.drop()
    +     table.load_pandas(df)
in_all_scenarios parameter from all methods of the API.
atoti.Session.read_csv() and all other atoti.Session.read_*() methods only load data in the Base scenario.
atoti.Table.load_csv() and all other atoti.Table.load_*() methods only load data in the current scenario of the table.
Data loaded into parameter tables created by atoti.Cube.create_parameter_simulation() or atoti.Cube.create_parameter_hierarchy_from_members() is always loaded into all the existing scenarios.
atoti.store.StoreScenarios.load_csv().
      base_directory = Path("path/to/base/directory")
    - store.scenarios.load_csv(base_directory)
    + for scenario_path in base_directory.iterdir():
    +     if scenario_path.is_dir():
    +         table.scenarios[scenario_path.name].load_csv(f"{scenario_path.resolve()}/**.csv")
This can be combined with the Watch local files how-to to generate new scenarios in real-time.
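The directory walk used in the replacement above can be checked without a session by computing the scenario names and patterns first. In this sketch, `scenario_patterns` is a hypothetical helper and the directory layout is created on the fly for illustration; the commented loop assumes an existing atoti.Table named `table`.

```python
import tempfile
from pathlib import Path
from typing import Dict


def scenario_patterns(base_directory: Path) -> Dict[str, str]:
    """Map each subdirectory name (the scenario) to a CSV glob pattern inside it."""
    return {
        scenario_path.name: f"{scenario_path.resolve()}/**.csv"
        for scenario_path in sorted(base_directory.iterdir())
        if scenario_path.is_dir()  # plain files next to the scenario folders are skipped
    }


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "Base").mkdir()
    (base / "France boost").mkdir()
    (base / "README.txt").touch()
    patterns = scenario_patterns(base)

print(sorted(patterns))

# With `table` an existing atoti.Table (assumption):
# for name, pattern in patterns.items():
#     table.scenarios[name].load_csv(pattern)
```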
Simulations#
Measure simulations and Source simulation editors in the app.
atoti.Table’s source_simulation_enabled property since it was only used by the Source simulation editor.
Other#
atoti.Session.url and atoti.Session.excel_url, dropping the need for atoti.config.create_config()’s url_pattern parameter (issue #260).
There is no reliable way for the session to know if it is exposed through a reverse proxy, a load balancer, a Docker container, or anything else that would make the public URL used to access the session and the hostname/IP of the machine hosting the session different. In these situations, url_pattern had to be defined to prevent atoti.Session.url and atoti.Session.excel_url from being unusable. However, it is simpler to build the session URL directly from the known network setup and atoti.Session.port than using url_pattern.
In JupyterLab, atoti.Session.link() can be used instead, even when Jupyter Server is not running locally.
For a session running locally, atoti.Session.url can be replaced with f"http://localhost:{session.port}".
The Connect with Excel guide shows how to connect Excel without atoti.Session.excel_url.
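Building the URL from the known network setup can be wrapped in a small function. This is a sketch only: `session_url` is a hypothetical helper, the host name below is a placeholder, and behind a reverse proxy the public port may not be the session's own port.

```python
def session_url(port: int, host: str = "localhost", scheme: str = "http") -> str:
    """Build a URL from a known host and port (hypothetical helper)."""
    return f"{scheme}://{host}:{port}"


# Local session: equivalent to f"http://localhost:{session.port}".
print(session_url(9090))

# Session reached through a known public host name (placeholder value).
print(session_url(443, host="atoti.example.com", scheme="https"))
```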
atoti.Hierarchy.name can no longer be changed to rename the hierarchy. See Hierarchies for information on renaming hierarchies.
atoti.Table.shape.
    - table.shape
    + {"columns": len(table.columns), "rows": len(table)}
Support for Python 3.7.0. Python versions >= 3.7.1 are supported.
The following deprecated functions and methods:
atoti.agg._single_value().
atoti.agg._stop().
atoti.config.create_role(), atoti.config.create_basic_user(), and atoti.config.create_kerberos_user(). Use atoti.Session.security instead.
atoti.Session.logs_tail().
    - lines = session.logs_tail(n=10)
    + with open(session.logs_path) as logs:
    +     lines = logs.readlines()[-10:]
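The readlines() replacement above loads the whole log file into memory. For large logs, a bounded deque keeps only the last lines while streaming through the file. This sketch uses a hypothetical `tail` helper and a temporary file standing in for the session's log file.

```python
import tempfile
from collections import deque
from pathlib import Path
from typing import List


def tail(path, n: int = 10) -> List[str]:
    """Return the last n lines of a text file without keeping the rest in memory."""
    with open(path) as logs:
        return list(deque(logs, maxlen=n))


with tempfile.TemporaryDirectory() as tmp:
    # Temporary file standing in for the file at session.logs_path.
    log_path = Path(tmp) / "server.log"
    log_path.write_text("".join(f"line {i}\n" for i in range(25)))
    lines = tail(log_path, n=3)

print(lines)
```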
Fixed#
Measure names including , are rejected instead of silently renamed (issue #271).
Measures combining atoti.scope.cumulative() and atoti.shift() (issue #295).
atoti.value() when no name was specified for the created measure (issue #281, issue #287).
Handling of glob patterns containing parentheses (issue #285).
hierarchized_columns of joined tables not taken into account (issue #303).
Handling of multiline strings in dataframes passed to atoti.Session.read_pandas() and atoti.Table.load_pandas() (issue #186).
Handling of null arrays when aggregating table columns.
Errors occurring inside transactions leaving the session in an unstable state.