Changelog¶

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

0.5.0 (2020-12-08)¶

Highlights:

Some functionalities have been moved to plugin packages to lighten the core atoti package.
The web application and the JupyterLab extension have been rewritten from scratch to provide better performance and a simpler experience. atoti’s JupyterLab extension leverages JupyterLab 3’s federated extension system, meaning that Node.js and the rebuilding of JupyterLab are no longer required for its installation. It is also distributed as an atoti plugin instead of a separate npm or Conda package.
Data can be loaded from more sources: Amazon S3, Azure Blob Storage, Google Cloud Storage, and SQL databases. On-the-fly decompression of CSV files stored in .gz, .tar.gz, or .zip archives has also been added.
The name of the default dimension of a hierarchy has changed from Hierarchies to the name of the store on which the hierarchy is based.

Added¶

Plugins bringing additional features:
- atoti-azure to load CSV and Parquet files from Azure Blob Storage.
- atoti-gcp to load CSV and Parquet files from Google Cloud Storage.
- atoti-jupyterlab to make interactive visualizations on top of atoti cubes in JupyterLab. It enables the session.Session.visualize() and query.session.QuerySession.visualize() methods.
- atoti-sql to load results of SQL queries into atoti stores.
Reports about the data loaded into stores, including number of lines, errors and duration. A warning is now issued in the notebook if an error occurred during loading (#58 and #64).
Support for path parameters in endpoint()’s route parameter.
Hierarchy visibility can be toggled through the visible attribute.
Support for reading .gz, .tar.gz and .zip files containing compressed CSV(s) (#123).
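As a rough illustration of what reading CSVs out of these three archive types involves, here is a sketch that uses only Python's standard library. It is not atoti's loader: `read_compressed_csv` is a hypothetical helper, and it assumes that archive members are UTF-8 files whose names end in `.csv`.

```python
import csv
import gzip
import io
import tarfile
import zipfile
from pathlib import Path


def read_compressed_csv(path):
    """Return every CSV row found in a .gz, .tar.gz or .zip archive."""
    name = Path(path).name
    rows = []
    # .tar.gz must be tested before .gz because it also ends with ".gz".
    if name.endswith(".tar.gz"):
        with tarfile.open(path, "r:gz") as archive:
            for member in archive.getmembers():
                if member.isfile() and member.name.endswith(".csv"):
                    text = archive.extractfile(member).read().decode("utf-8")
                    rows.extend(csv.reader(io.StringIO(text)))
    elif name.endswith(".gz"):
        # A plain .gz file holds a single compressed CSV.
        with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
            rows.extend(csv.reader(handle))
    elif name.endswith(".zip"):
        with zipfile.ZipFile(path) as archive:
            for member in archive.namelist():
                if member.endswith(".csv"):
                    text = archive.read(member).decode("utf-8")
                    rows.extend(csv.reader(io.StringIO(text)))
    else:
        raise ValueError(f"unsupported archive: {name}")
    return rows
```

Rows come back as lists of strings, header included. Type inference is left out of the sketch.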
array.n_lowest_indices() and array.n_greatest_indices() to retrieve the indices of the lowest or greatest values of an array measure (#153).
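To make "indices of the lowest or greatest values" concrete, the following sketch computes them on a plain Python list. The two helpers are hypothetical stand-ins, not the atoti functions (which operate on array measures). The order of the returned indices and the handling of ties are assumptions of this sketch.

```python
def n_lowest_indices(values, n):
    """Indices of the n smallest elements, smallest value first."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    return order[:n]


def n_greatest_indices(values, n):
    """Indices of the n largest elements, largest value first."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:n]


prices = [30.0, 10.0, 50.0, 20.0]
print(n_lowest_indices(prices, 2))    # [1, 3]
print(n_greatest_indices(prices, 2))  # [2, 0]
```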
array.prod() to do the product of all the elements of an array (#113).
value() to create a measure based on the value of a store column.
hierarchized_columns parameter to select which columns of a store are converted into hierarchies. It is available in these methods:
config.create_ldap_authentication() to set up LDAP authentication in Atoti+.
Support for multiple hierarchies in total().
Support for negative values in array indexing (#149).
Support for named_measure.NamedMeasure representing booleans in where()’s condition parameter (#94).
cube.Cube.create_store_column_parameter_hierarchy() to create parameter hierarchies from existing store columns.
array.quantile_index() returning the index of the desired quantile.
Measure description can be changed (#167).
Runtime type checking on all the public API functions.
branding, extra_jars, https, and same_site parameters to create_config().

Experimental¶

The atoti.experimental module groups new features that may undergo breaking changes in minor and/or patch releases. Its initial content is:

atoti.experimental.distributed to create distributed clusters of atoti cubes.
atoti.experimental.finance.irr() to compute an internal rate of return.
atoti.experimental.stats providing the probability distribution functions pdf, cdf and ppf for Normal, Chi-square, Student’s t, Beta and F distributions.
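For readers unfamiliar with the quantity, an internal rate of return is the discount rate at which the net present value of a series of cash flows is zero. The sketch below finds it by bisection in plain Python. It is not the implementation behind atoti.experimental.finance.irr(): the function names, the search interval and the assumption of equally spaced periods are all choices made for this illustration.

```python
def net_present_value(rate, cash_flows):
    """Discount each cash flow by its period index (period 0 is today)."""
    return sum(
        flow / (1.0 + rate) ** period for period, flow in enumerate(cash_flows)
    )


def internal_rate_of_return(cash_flows, low=-0.99, high=10.0, tolerance=1e-9):
    """Find the rate where the NPV crosses zero, using bisection."""
    value_low = net_present_value(low, cash_flows)
    value_high = net_present_value(high, cash_flows)
    if value_low * value_high > 0:
        raise ValueError("no sign change of the NPV in the search interval")
    for _ in range(200):
        middle = (low + high) / 2.0
        value_middle = net_present_value(middle, cash_flows)
        if abs(value_middle) < tolerance:
            return middle
        # Keep the half of the interval that still brackets the root.
        if value_low * value_middle < 0:
            high = middle
        else:
            low, value_low = middle, value_middle
    return (low + high) / 2.0


# Invest 100 today, receive 110 in one period: the rate is close to 10%.
print(round(internal_rate_of_return([-100.0, 110.0]), 6))
```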

Changed¶

BREAKING: The web application requires a new initial file structure in the metadata DB. Metadata DBs created in previous versions are not compatible with this version and will have to be recreated.
BREAKING: Cube.visualize() has been replaced with session.Session.visualize() that requires the atoti-jupyterlab plugin. Widgets made with Cube.visualize() will have to be rebuilt with the new JupyterLab extension.
BREAKING: AWS S3 and Kafka loading are no longer supported in the base package; they require the plugins atoti-aws and atoti-kafka respectively.
BREAKING: Hierarchies are put in a dimension with the same name as the store which feeds their levels.
BREAKING: math functions have been moved to the atoti.math module: math.abs(), math.ceil(), math.cos(), math.exp(), math.floor(), math.log(), math.log10(), math.max(), math.min(), math.round(), math.sin(), math.sqrt() and math.tan().
BREAKING: parent_value()’s degree parameter has been replaced by a degrees mapping to support multiple hierarchies.
BREAKING: comparator.first_members()’s members parameter has been made variadic instead of accepting a collection.
BREAKING: atoti.types module and AtotiType class have been respectively renamed atoti.type and DataType. Array and nullable types have also been renamed for improved grammar and consistency.
BREAKING: session.Session.endpoint()’s method parameter has been made keyword-only.
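The variadic and keyword-only changes above affect call sites in a way that plain Python can illustrate. The two functions below are hypothetical stand-ins with simplified signatures, not the atoti functions. The `"GET"` default in particular is an assumption of this sketch.

```python
def first_members_stub(*members):
    """Stand-in with a variadic parameter: members are passed one by one."""
    return list(members)


def endpoint_stub(route, *, method="GET"):
    """Stand-in with a keyword-only parameter: method must be named."""
    return (method, route)


# A collection that used to be passed as a single argument is now unpacked.
legacy_members = ["Paris", "London"]
print(first_members_stub(*legacy_members))  # ['Paris', 'London']

# The keyword-only parameter must be spelled out at the call site.
print(endpoint_stub("/hello", method="POST"))  # ('POST', '/hello')

try:
    endpoint_stub("/hello", "POST")  # positional use is rejected
except TypeError as error:
    print("TypeError:", error)
```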
BREAKING: level.Level.data_type’s type changed from str to DataType.
BREAKING: The constructors of the following classes are no longer part of the API and have been replaced by factory functions:
- config.SessionConfiguration → config.create_config()
- config.BasicAuthentication → config.create_basic_authentication()
- config.BasicUser → config.create_basic_user()
- config.OidcAuthentication → config.create_oidc_authentication()
- query.basic_auth.BasicAuthentication → query.create_basic_authentication()
BREAKING: cube.Cube.create_parameter_hierarchy() has been renamed cube.Cube.create_static_parameter_hierarchy().
BREAKING: Store names inferred from file paths are capitalized.
BREAKING: Key columns cannot be nullable anymore and are automatically made non-nullable. String and date columns are also inferred as non-nullable.
BREAKING: config.create_config()’s inherit parameter has been renamed inherit_global_config.
BREAKING: JSON responses generated from endpoint() are no longer encapsulated into an object with data and status keys.
BREAKING: .VALUE measures are no longer automatically created from numeric columns of joined stores.
The first MDX query run by an atoti widget in JupyterLab is no longer executed in Python. Instead, the query is executed client-side like before 0.4.3 and the call to visualize() will block until this first query is done.
session.Session.query_mdx() and query.session.QuerySession.query_mdx() support any MDX SELECT query (more than 2 axes, measures on rows, or totals). Empty measure values will also be kept as None in the resulting DataFrame instead of being converted to NaN.
ROLE_USER is no longer automatically added to the role mapping of config.create_oidc_authentication() and must be given explicitly.
atoti’s Conda package depends on jdk4py so the installation of the openjdk Conda package is no longer required.
The data loaded while a sampling mode is active is now consistent between store and cube manipulations.

Deprecated¶

BREAKING: agg.stop() moved to agg._stop() as its behavior can be replicated with where():

- m["Stopped price"] = tt.agg.stop(m["Price"], lvl["Product"], lvl["Shop"])
+ m["Stopped price"] = tt.where(
+     (lvl["Product"] != None) & (lvl["Shop"] != None), m["Price"]
+ )

BREAKING: agg.single_value() moved to agg._single_value() as value() can be used for its main use-case: creating a measure based on the value of a store column.

Fixed¶

HTML entities are correctly encoded in widget snapshots (#148).
Issue with boolean type in Parquet files (#157).
Issue when passing a measure to the n parameter of array.n_lowest(), array.nth_lowest(), array.n_greatest() and array.nth_greatest() (#159).
agg.count_distinct() support for measure and scope parameters.
Issue with date_shift() and date_diff() when applied on a date level with the default "N/A" member (#180).

0.4.3 (2020-09-01)¶

Added¶

agg.sum_product() returns a measure equal to the sum of the product of two measures or fields.
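As a reminder of the arithmetic, a sum product multiplies two sequences element by element and adds up the results. The sketch below shows this on plain Python lists. `sum_product` here is a hypothetical helper illustrating the conventional definition, not the atoti aggregation function, which works on measures or fields.

```python
def sum_product(left, right):
    """Sum of the element-wise products of two equally long sequences."""
    if len(left) != len(right):
        raise ValueError("both sequences must have the same length")
    return sum(a * b for a, b in zip(left, right))


quantities = [2, 3, 4]
prices = [10.0, 1.5, 0.25]
# 2 * 10.0 + 3 * 1.5 + 4 * 0.25
print(sum_product(quantities, prices))  # 25.5
```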
The modulo % operation between measures (#48).
Support reading Parquet files from AWS S3 (#27).
store.Store.drop() removes rows from a store by specifying column names and values.
atoti widgets are snapshotted as SVG images in the notebook on save. These images will appear in HTML exports or in GitHub previews of the notebook.
Context menu actions for widgets of the atoti JupyterLab extension:
- Undo and Redo (#71).
- Convert to atoti Widget Below available in DataFrames returned by session.Session.query_mdx() and cube.Cube.query() to start an interactive exploration from the same MDX query (#49).
- Open state editor to navigate to the notebook cell metadata editor (#104).

Changed¶

Java is no longer required in the pip installation since the atoti Python wheel now relies on jdk4py.
The Measure simulations and Source simulation editors have been redesigned. They now both have their dedicated page instead of being widgets embeddable in dashboards. The Measure simulations editor also issues more minimal updates to the underlying store (#70).
Upgraded from ActiveUI SDK 4.3.8 to 4.3.11 (#30, #59, #74, #79, #84, #87 and #110).
Upgraded from ActivePivot 5.9.1 to 5.9.2 (#98 and #99).
session.Session.query_mdx(), cube.Cube.query(), query.session.QuerySession.query_mdx(), and query.cube.QueryCube.query() return DataFrames displaying formatted values and providing a style attribute reflecting the potential styling properties of the corresponding cell set (#91).
The first MDX query run by an atoti widget in JupyterLab is now executed in Python and its resulting cell set is output to the corresponding notebook cell. This ensures that the data displayed by the widget reflects the expected state of the cube without having to block the IPython kernel in a fragile way.
Added ability to append rows in all scenarios of a store with store.Store.append().

Removed¶

The creation of calculated measures and the action to add a measure computing the difference between two table columns have been disabled in the JupyterLab widget extension to incentivize creating these measures in Python instead. These features are still available in the app (#21).

Deprecated¶

The constructors of config.SessionConfiguration, config.BasicAuthentication, config.BasicUser and config.OidcAuthentication have been made private and using them is deprecated. They have been replaced by factory functions: config.create_config(), config.create_basic_authentication(), config.create_basic_user() and config.create_oidc_authentication().
query.basic_auth.BasicAuthentication, replaced by query.create_basic_authentication().

Fixed¶

Issue with dates not correctly converted to Python datetime in store.Store.head() (#97).
Issue getting vector element with a measure containing long (#115).
Issue with aggregation function not preserved when creating a simulated measure (#121).

0.4.2 (2020-07-15)¶

Added¶

Kafka streaming data source through store.Store.load_kafka().
session.Session.endpoint() decorator adds HTTP endpoints to the session from a Python callback.
array.sort() has a new ascending parameter. True by default, it controls the sorting order.
scope.cumulative() has a new dense parameter. False by default, it controls whether all of a level’s members are included in the cumulative aggregation, even those for which the underlying measure has no values.
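The effect of a dense cumulative aggregation is easiest to see on a small example. The sketch below computes a running sum over an ordered list of members, some of which have no value. It is a plain Python model of the idea, not atoti's implementation. `cumulative_sum` is a hypothetical helper, and reporting the running total unchanged for members without a value is an assumption.

```python
def cumulative_sum(members, values, *, dense=False):
    """Running sum over ordered members; values maps member -> number."""
    running = 0
    result = {}
    for member in members:
        has_value = member in values
        if has_value:
            running += values[member]
        # Sparse mode only reports members that carry a value themselves.
        if has_value or dense:
            result[member] = running
    return result


months = ["Jan", "Feb", "Mar", "Apr"]
sales = {"Jan": 1, "Mar": 3}
print(cumulative_sum(months, sales))
# {'Jan': 1, 'Mar': 4}
print(cumulative_sum(months, sales, dense=True))
# {'Jan': 1, 'Feb': 1, 'Mar': 4, 'Apr': 4}
```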
Atoti+ now supports i18n. en-US is the only locale supported by default but additional locales can be made available by providing custom translation files. These can be configured with config.create_config(). A good starting point for adding new locales is to use the template containing all the translatable items, which can be obtained by using session.Session.export_translations_template().
The Gauss error function math.erf() and its complementary math.erfc() (#92).
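For reference, the two functions are related by erfc(x) = 1 - erf(x), and the error function gives the normal cumulative distribution directly. The sketch below checks both facts with the `math` module of Python's standard library, which works on floats. It does not use the atoti functions, and `normal_cdf` is a helper defined only for this illustration.

```python
import math


def normal_cdf(x, mean=0.0, std=1.0):
    """Normal cumulative distribution expressed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


x = 0.5
# erfc is the complement of erf.
print(math.isclose(math.erf(x) + math.erfc(x), 1.0))  # True
# Half of the probability mass lies below the mean.
print(normal_cdf(0.0))  # 0.5
```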

Changed¶

The name_attribute parameter used to select the displayed username when using an OpenID Connect provider can be configured.
The scope parameter used to select the requested scopes when using an OpenID Connect provider can be configured. The openid scope is always passed by default.

Fixed¶

Missing images in the tutorial (#80).
Wrong results when using where() (#17).
Issue when reading pandas DataFrame with NaN (#77).
hierarchy.Hierarchy.isin() and level.Level.isin() can be used with more than 2 values (#93).
Issue with array types not displayed correctly in the stores schema.
Issue when joining a column of type int to a column of type long if the store is based on a parquet file (#76).

0.4.1 (2020-06-17)¶

Added¶

New tutorial exploring the main basic features of atoti.
rank() returns a measure ranking the members of a given hierarchy based on the value of another measure.
array.prefix_sum() performs the prefix sum of array measures.
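A prefix sum replaces each element with the sum of all elements up to and including it. The sketch below shows the operation on a plain Python list with `itertools.accumulate`. It illustrates the arithmetic only, not the atoti function, which applies to array measures.

```python
from itertools import accumulate


def prefix_sum(values):
    """Element i of the result is the sum of values[0] through values[i]."""
    return list(accumulate(values))


print(prefix_sum([1, 2, 3, 4]))  # [1, 3, 6, 10]
```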
Hierarchies can have the same name if they are in different dimensions. To avoid conflicts, a hierarchy can be accessed via a tuple containing the dimension and the hierarchy: cube.hierarchies["Product", "Size"].
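The tuple syntax relies on standard Python behavior: `mapping["Product", "Size"]` passes the tuple `("Product", "Size")` as the key. The sketch below models that lookup with a hypothetical `HierarchiesStub` class. It is not atoti's class, and raising a `KeyError` when a bare name is ambiguous is an assumption of this sketch.

```python
class HierarchiesStub:
    """Toy mapping keyed by (dimension, hierarchy name)."""

    def __init__(self):
        self._by_key = {}

    def add(self, dimension, name, hierarchy):
        self._by_key[(dimension, name)] = hierarchy

    def __getitem__(self, key):
        # obj["Product", "Size"] arrives here as the tuple ("Product", "Size").
        if isinstance(key, tuple):
            return self._by_key[key]
        matches = [h for (_, name), h in self._by_key.items() if name == key]
        if len(matches) != 1:
            raise KeyError(
                f"{len(matches)} hierarchies named {key!r}; use (dimension, name)"
            )
        return matches[0]


hierarchies = HierarchiesStub()
hierarchies.add("Product", "Size", "product size hierarchy")
hierarchies.add("Shop", "Size", "shop size hierarchy")
hierarchies.add("Product", "Color", "product color hierarchy")

print(hierarchies["Product", "Size"])  # product size hierarchy
print(hierarchies["Color"])            # product color hierarchy
```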

Changed¶

Bumped the minimal required version of JupyterLab to 2.1.
Upgraded from ActiveUI SDK 4.3.7 to 4.3.8.
Better messages for known Java errors (#43).
The Auth0 support in Atoti+ has been replaced by the more general OpenID Connect authentication protocol. The structure of the configuration can be seen in the configuration tutorial.

Fixed¶

filter()’s measure parameter accepts any value that can be converted to a measure (#22).
filter() and where() support inequalities on dates as conditions.
Issue when loading data into a scenario with truncate set to True (#53).
Issue with agg.quantile() combined with scope.origin().
Issue when aggregating .VALUE measures using any of the agg.xxx functions (#52).
Type issue that sometimes happened when chaining operators such as array.quantile() and date_shift().
Blinking cell updates not appearing in pivot tables with real time queries.

0.4.0 (2020-05-25)¶

Added¶

agg.max_member() and agg.min_member() return a measure equal to the member reaching the corresponding extremum of the passed measure on the given level.
hierarchy.Hierarchy.isin(), query.hierarchy.QueryHierarchy.isin(), level.Level.isin(), and query.level.QueryLevel.isin() create conditions expressing that a hierarchy or a level should be on one of the given members.
stores.Stores.schema and cube.Cube.schema: SVG graphs of, respectively, all the session’s stores and the stores used by a cube.
Pip installation guide.
store.StoreScenarios.load_csv() loads a directory of CSV files into a store, automatically generating scenarios based on the directory’s structure.
total() returns the total value on each hierarchy member.
session.Session.create_store() creates an empty store from a schema.
Exponentiation operation between measures: measure_a ** measure_b.

Changed¶

BREAKING: Hierarchies, levels, and measures can no longer be passed by name, instances of the corresponding class are expected instead.
BREAKING: create_session()’s port, max_memory, java_args and sampling_mode parameters and the ATOTI_URL_PATTERN environment variable have been moved to the config.SessionConfiguration changing these signatures:
- create_session(): (name='Unnamed', sampling_mode=SamplingMode(name='limit_lines', parameters=[10000]), port=None, max_memory=None, java_args=None, config=None, **kwargs) → (name='Unnamed', *, config=None)
- config.create_config(): (inherit=True, metadata_db=None, roles=None, authentication=None, properties=None) → (*, inherit=True, port=None, url_pattern=None, metadata_db=None, roles=None, authentication=None, sampling_mode=None, max_memory=None, java_args=None)
BREAKING: New structure for the authentication configuration in YAML as shown in the Auth0 section of the configuration tutorial.
BREAKING: config.BasicUser.roles and config.Auth0Authentication.role_mapping do not accept role instances anymore, only role names.
BREAKING: The wildcard value in measure simulations has been changed from * to None.
BREAKING: session.Session.read_pandas(), session.Session.read_spark() and session.Session.read_numpy() require a name for the created store:
- session.Session.read_numpy(): (data, columns, store_name, keys, in_all_scenarios=True, partitioning=None, sep='|') → (array, columns, store_name, *, keys=None, in_all_scenarios=True, partitioning=None, **kwargs)
- session.Session.read_pandas(): (dataframe, keys=None, store_name=None, partitioning=None, types=None, **kwargs) → (dataframe, store_name, *, keys=None, in_all_scenarios=True, partitioning=None, types=None, **kwargs)
- session.Session.read_spark(): (dataframe, keys=None, store_name=None, partitioning=None) → (dataframe, store_name, *, keys=None, in_all_scenarios=True, partitioning=None)
BREAKING: simulation.Scenario.insert(row) and store.Store.insert_rows(rows) have been renamed simulation.Scenario.append() and store.Store.append(). They take a variadic *rows parameter and in-place addition of a single row is still supported with +=.
BREAKING: percentile and variance functions have been renamed quantile and var:
- agg.percentile(measure, percentile_value, mode='inc', interpolation='linear', scope=None) → agg.quantile(measure, q, *, mode='inc', interpolation='linear', scope=None)
- array.percentile(measure, percentile_value, mode='inc', interpolation='linear') → array.quantile(measure, q, *, mode='inc', interpolation='linear')
- agg.variance(measure, mode='sample', scope=None) → agg.var(measure, *, mode='sample', scope=None)
- array.variance(measure, mode='sample') → array.var(measure, *, mode='sample')
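To give an idea of what a quantile with inclusive mode and linear interpolation computes, here is a plain Python sketch. It assumes that `mode='inc'` corresponds to the common inclusive definition, where the position in the sorted data is `q * (n - 1)`. That reading is an assumption of this illustration, and `quantile_inc_linear` is a hypothetical helper rather than the atoti function.

```python
import math


def quantile_inc_linear(values, q):
    """Inclusive quantile with linear interpolation between neighbors."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    # Interpolate between the two closest ranks around the position.
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


print(quantile_inc_linear([4, 1, 3, 2], 0.5))  # 2.5
```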
BREAKING: avg has been renamed mean with .MEAN suffix for automatically created measures instead of .AVG:
- agg.avg(measure, scope=None) → agg.mean(measure, *, scope=None)
- array.avg() → array.mean()
BREAKING: Some function signatures have changed:
- cube.Cube.create_static_parameter_hierarchy: (level_and_hierarchy_name, members, indices=None, slicing=True, index_measure='', level_type=None) → (name, members, *, data_type=None, index_measure=None, indices=None, store_name=None) where slicing has been removed since it can be set afterwards through: hierarchy.Hierarchy.slicing.
- parent_value(): (measure, on_hierarchies=None, top_value=None) → (measure, on, *, apply_filters=False, degree=1, total_value=None). The two new parameters default to values equivalent to the previous behavior; see the function documentation for more details.
- scope.cumulative(): (level, partitioning=None, window=range(-2147483648, 0), exclude_self=False) → (level, *, partitioning=None, window=range(-2147483648, 0), exclude_self=False) where window can also accept a tuple of two time offsets to perform a rolling time period aggregation.
- simulation.Scenario.load_csv(): (file, delimiter=',') → (path, *, sep=',')
BREAKING: Some other function signatures have changed only to adopt keyword-only parameters (denoted by a * in the parameter list):
Upgraded from ActiveUI SDK 4.3.5 to 4.3.7. Pivot tables support new Tree, Pivot, and Table layouts, the latter making the Tabular View widget redundant so it has been removed from the available widgets.
session.Session.read_pandas(), store.Store.load_pandas(), simulation.Simulation.load_pandas(), and simulation.Scenario.load_pandas() automatically load columns made of numerical Python lists or Numpy one-dimensional ndarrays as arrays.
Stores without key columns are partitioned on their non-numerical columns by default.
Changed the behavior of agg.single_value() aggregation function to be more consistent with other aggregation functions (#40).
Cube names are not restricted to alphanumeric strings without spaces anymore.
The path parameter of all CSV loading functions accepts glob patterns (e.g. /path/**/*.csv).
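To preview which files a pattern such as /path/**/*.csv would select, Python's standard `glob` module can expand it. This sketch assumes that atoti's matching behaves like `glob.glob` with `recursive=True`, where `**` matches any number of nested directories. That equivalence is not confirmed by this changelog, and `matching_csv_files` is a hypothetical helper.

```python
import glob


def matching_csv_files(pattern):
    """Expand a glob pattern, letting ** span nested directories."""
    return sorted(glob.glob(pattern, recursive=True))


# Example call (the directory is a placeholder):
# matching_csv_files("/path/**/*.csv")
```

With `recursive=True`, the pattern matches CSV files directly under the root directory as well as those in its subdirectories.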

Removed¶

BREAKING: simulation.Priority. Directly pass numbers to rank simulation rules instead.
BREAKING: Cube.create_bucketing() has moved to Cube._setup_bucketing() and is not part of the public API anymore. It might change in future releases without notice.
BREAKING: config.create_config()’s properties parameter. max_memory can be passed directly as a named-parameter instead. The other properties have been removed.
BREAKING: pow(measure_a, measure_b) replaced by measure_a ** measure_b.

Fixed¶

Inability to install atoti alongside Python > 3.7 when using Conda.
Issue with filter() not being aggregated correctly (#17, #28).
Metadata DBs created in atoti can be used in Atoti+ and vice versa (#15).
Inability to create some measures or hierarchies after some partial joins (#4, #10).
Inability to load CSV folders from AWS S3 storage.
Slow read of files on AWS S3 when anonymous due to multiple timeouts in the credentials provider (#26).
Inability to use wildcards on field types other than strings.
Inability to use numeric levels for measure simulations.

0.3.1 (2020-04-14)¶

First public release of atoti.