Changelog
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
0.5.0 (2020-12-08)
Highlights:
Some functionalities have been moved to plugin packages to lighten the core atoti package.
The web application and the JupyterLab extension have been rewritten from scratch to provide better performance and a simpler experience. atoti’s JupyterLab extension leverages JupyterLab 3’s federated extension system, meaning that Node.js and the rebuilding of JupyterLab are no longer required for its installation. It is also distributed as an atoti plugin instead of a separate npm or Conda package.
Data can be loaded from more sources: Amazon S3, Azure Blob Storage, Google Cloud Storage, and SQL databases. On-the-fly decompression of CSV files stored in .gz, .tar.gz, or .zip archives has also been added.
The name of the default dimension of a hierarchy has changed from Hierarchies to the name of the store on which the hierarchy is based.
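As an illustration of what on-the-fly decompression means, the sketch below reads a gzip-compressed CSV with Python's standard library without extracting it to disk first. It is not atoti's loader; the file name and columns are made up for the example.

```python
import csv
import gzip
import os
import tempfile


def read_gzipped_csv(path):
    """Return the rows of a .gz-compressed CSV as a list of dicts.

    The archive is decompressed as a stream while it is parsed, so no
    intermediate uncompressed file is written.
    """
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Build a small hypothetical archive to read back.
tmp_dir = tempfile.mkdtemp()
archive_path = os.path.join(tmp_dir, "sales.csv.gz")
with gzip.open(archive_path, mode="wt", newline="", encoding="utf-8") as handle:
    writer = csv.writer(handle)
    writer.writerow(["Product", "Price"])
    writer.writerow(["apple", "1.5"])
    writer.writerow(["pear", "2.0"])

rows = read_gzipped_csv(archive_path)
print(rows)
```

Values come back as strings here; type inference is a separate step that this sketch leaves out.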
Added
Plugins bringing additional features:
atoti-azure to load CSV and Parquet files from Azure Blob Storage.
atoti-gcp to load CSV and Parquet files from Google Cloud Storage.
atoti-jupyterlab to make interactive visualizations on top of atoti cubes in JupyterLab. It enables the session.Session.visualize() and query.session.QuerySession.visualize() methods.
atoti-sql to load results of SQL queries into atoti stores.
Reports about the data loaded into stores, including the number of lines, errors and duration. A warning is now issued in the notebook if an error occurred during the loading (#58 and #64).
Support for path parameters in endpoint()’s route parameter.
Hierarchy visibility can be toggled through the visible attribute.
Support for reading .gz, .tar.gz and .zip files containing compressed CSV(s) (#123).
array.n_lowest_indices() and array.n_greatest_indices() to retrieve the indices of the lowest or greatest values of an array measure (#153).
array.prod() to compute the product of all the elements of an array (#113).
value() to create a measure based on the value of a store column.
hierarchized_columns parameter to select which columns of a store are converted into hierarchies. It is available in these methods:
config.create_ldap_authentication() to set up LDAP authentication in Atoti+.
Support for multiple hierarchies in total().
Support for negative values in array indexing (#149).
Support for named_measure.NamedMeasure representing booleans in where()’s condition parameter (#94).
cube.Cube.create_store_column_parameter_hierarchy() to create parameter hierarchies from existing store columns.
array.quantile_index() returning the index of the desired quantile.
Measure description can be changed (#167).
Runtime type checking on all the public API functions.
branding, extra_jars, https, and same_site parameters to create_config().
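To make the semantics of array.n_lowest_indices() and array.n_greatest_indices() concrete, here is a plain-Python sketch over a list. The helper names mirror the atoti functions but are stand-ins, and the ordering of the returned indices and the tie-breaking rule are assumptions rather than documented atoti behavior.

```python
def n_lowest_indices(values, n):
    """Indices of the n smallest values, smallest first (ties keep index order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    return order[:n]


def n_greatest_indices(values, n):
    """Indices of the n largest values, largest first (ties keep index order)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:n]


prices = [7.0, 3.0, 9.0, 1.0, 5.0]
lowest = n_lowest_indices(prices, 2)
greatest = n_greatest_indices(prices, 2)
print(lowest, greatest)
```

The returned positions can then be used to look up related data, for example the dates or scenario labels stored in a parallel array.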
Experimental
The atoti.experimental module groups new features that may go through breaking changes in minor and/or patch releases.
Its initial content is:
atoti.experimental.distributed to create distributed clusters of atoti cubes.
atoti.experimental.finance.irr() to compute an internal rate of return.
atoti.experimental.stats providing the probability distribution functions pdf, cdf and ppf for Normal, Chi-square, Student’s t, Beta and F distributions.
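For readers unfamiliar with the pdf, cdf and ppf terminology, the sketch below evaluates the three functions for a standard Normal distribution using Python's statistics.NormalDist. It only illustrates the mathematical definitions; it does not use atoti.experimental.stats, whose signatures are not shown here.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# pdf: density of the distribution at a point.
density_at_zero = standard_normal.pdf(0.0)

# cdf: probability of drawing a value less than or equal to the point.
probability_below_zero = standard_normal.cdf(0.0)

# ppf (percent point function): inverse of the cdf, called inv_cdf in the
# standard library.
median = standard_normal.inv_cdf(0.5)

# Applying ppf after cdf gives back the starting point, up to rounding.
round_trip = standard_normal.inv_cdf(standard_normal.cdf(1.25))

print(density_at_zero, probability_below_zero, median, round_trip)
```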
Changed
BREAKING: The web application requires a new initial file structure in the metadata DB. Metadata DBs created in previous versions are not compatible with this version and will have to be recreated.
BREAKING: Cube.visualize() has been replaced with session.Session.visualize() that requires the atoti-jupyterlab plugin. Widgets made with Cube.visualize() will have to be rebuilt with the new JupyterLab extension.
BREAKING: AWS S3 and Kafka loading are no longer supported in the base package; they require the plugins atoti-aws and atoti-kafka respectively.
BREAKING: Hierarchies are put in a dimension with the same name as the store which feeds their levels.
BREAKING: math functions have been moved to the atoti.math module: math.abs(), math.ceil(), math.cos(), math.exp(), math.floor(), math.log(), math.log10(), math.max(), math.min(), math.round(), math.sin(), math.sqrt() and math.tan().
BREAKING: parent_value()’s degree parameter has been replaced by a degrees mapping to support multiple hierarchies.
BREAKING: comparator.first_members()’s members parameter has been made variadic instead of accepting a collection.
BREAKING: atoti.types module and AtotiType class have been respectively renamed atoti.type and DataType. Array and nullable types have also been renamed for improved grammar and consistency.
BREAKING: session.Session.endpoint()’s method parameter has been made keyword-only.
BREAKING: level.Level.data_type’s type changed from str to DataType.
BREAKING: The constructors of the following classes are no longer part of the API and have been replaced by factory functions:
config.SessionConfiguration → config.create_config()
config.BasicAuthentication → config.create_basic_authentication()
config.BasicUser → config.create_basic_user()
config.OidcAuthentication → config.create_oidc_authentication()
query.basic_auth.BasicAuthentication → query.create_basic_authentication()
BREAKING: cube.Cube.create_parameter_hierarchy() has been renamed cube.Cube.create_static_parameter_hierarchy().
BREAKING: Store names inferred from file paths are capitalized.
BREAKING: Key columns cannot be nullable anymore and are automatically made non nullable. String and date columns are also inferred as non nullable.
BREAKING: config.create_config()’s inherit parameter has been renamed inherit_global_config.
BREAKING: JSON responses generated from endpoint() are no longer encapsulated into an object with data and status keys.
BREAKING: .VALUE measures are no longer automatically created from numeric columns of joined stores.
The first MDX query run by an atoti widget in JupyterLab is no longer executed in Python. Instead, the query is executed client-side like before 0.4.3 and the call to visualize() will block until this first query is done.
session.Session.query_mdx() and query.session.QuerySession.query_mdx() support any MDX SELECT query (more than 2 axes, measures on rows, or totals). Empty measure values will also be kept as None in the resulting DataFrame instead of being converted to NaN.
ROLE_USER is no longer automatically added to the role mapping of config.create_oidc_authentication() and must be given explicitly.
atoti’s Conda package depends on jdk4py so the installation of the openjdk Conda package is no longer required.
The data loaded while a sampling mode is active is now consistent between store and cube manipulations.
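Two of the breaking changes in this list are purely about Python calling conventions: a parameter becoming variadic (as for comparator.first_members()) and a parameter becoming keyword-only (as for session.Session.endpoint()’s method). The toy functions below are hypothetical stand-ins, not atoti code, and the "GET" default is an assumption made for the example; they only show how call sites have to change.

```python
def first_members(*members):
    """Hypothetical stand-in: members are passed one by one, not as a list."""
    return list(members)


def endpoint(route, *, method="GET"):
    """Hypothetical stand-in: `method` can only be passed by keyword."""
    return f"{method} {route}"


# Variadic call; an existing collection is unpacked with `*`.
ordered = first_members("gold", "silver", "bronze")
medals = ["gold", "silver", "bronze"]
unpacked = first_members(*medals)

# Keyword-only call; passing the method positionally raises a TypeError.
described = endpoint("/ping", method="POST")
try:
    endpoint("/ping", "POST")
    positional_rejected = False
except TypeError:
    positional_rejected = True

print(ordered, unpacked, described, positional_rejected)
```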
Deprecated
BREAKING: agg.stop() moved to agg._stop() as its behavior can be replicated with where():
- m["Stopped price"] = tt.agg.stop(m["Price"], lvl["Product"], lvl["Shop"])
+ m["Stopped price"] = tt.where(
+     (lvl["Product"] != None) & (lvl["Shop"] != None), m["Price"]
+ )
BREAKING: agg.single_value() moved to agg._single_value() as value() can be used for its main use-case: creating a measure based on the value of a store column.
Fixed
HTML entities are correctly encoded in widget snapshots (#148).
Issue with boolean type in Parquet files (#157).
Issue when passing a measure to the n parameter of array.n_lowest(), array.nth_lowest(), array.n_greatest() and array.nth_greatest() (#159).
agg.count_distinct() support for measure and scope parameters.
Issue with date_shift() and date_diff() when applied on a date level with the default "N/A" member (#180).
0.4.3 (2020-09-01)
Added
agg.sum_product() returns a measure equal to the sum of the products of two measures or fields.
The modulo % operation between measures (#48).
Support reading Parquet files from AWS S3 (#27).
store.Store.drop() removes rows from a store by specifying column names and values.
atoti widgets are snapshotted as SVG images in the notebook on save. These images will appear in HTML exports or in GitHub previews of the notebook.
Context menu actions for widgets of the atoti JupyterLab extension:
Undo and Redo (#71).
Convert to atoti Widget Below available in DataFrames returned by session.Session.query_mdx() and cube.Cube.query() to start an interactive exploration from the same MDX query (#49).
Open state editor to navigate to the notebook cell metadata editor (#104).
Changed
Java is no longer required in the pip installation since the atoti Python wheel now relies on jdk4py.
The Measure simulations and Source simulation editors have been redesigned. They now both have their dedicated page instead of being widgets embeddable in dashboards. The Measure simulations editor also issues more minimal updates to the underlying store (#70).
Upgraded from ActiveUI SDK 4.3.8 to 4.3.11 (#30, #59, #74, #79, #84, #87 and #110).
session.Session.query_mdx(), cube.Cube.query(), query.session.QuerySession.query_mdx(), and query.cube.QueryCube.query() return DataFrames displaying formatted values and providing a style attribute reflecting the potential styling properties of the corresponding cell set (#91).
The first MDX query run by an atoti widget in JupyterLab is now executed in Python and its resulting cell set is output to the corresponding notebook cell. This ensures that the data displayed by the widget reflects the expected state of the cube without having to block the IPython kernel in a fragile way.
Added ability to append rows in all scenarios of a store with store.Store.append().
Removed
The creation of calculated measures and the action to add a measure computing the difference between two table columns have been disabled in the JupyterLab widget extension to incentivize creating these measures in Python instead. These features are still available in the app (#21).
Deprecated
The constructors of config.SessionConfiguration, config.BasicAuthentication, config.BasicUser and config.OidcAuthentication have been made private and using them is deprecated. They have been replaced by factory functions: config.create_config(), config.create_basic_authentication(), config.create_basic_user() and config.create_oidc_authentication().
query.basic_auth.BasicAuthentication, replaced by query.create_basic_authentication().
Fixed
Issue with dates not correctly converted to Python datetime in store.Store.head() (#97).
Issue getting vector element with a measure containing long (#115).
Issue with aggregation function not preserved when creating a simulated measure (#121).
0.4.2 (2020-07-15)
Added
Kafka streaming data source through store.Store.load_kafka().
session.Session.endpoint() decorator adds HTTP endpoints to the session from a Python callback.
array.sort() has a new ascending parameter. True by default, it selects the sorting order.
scope.cumulative() has a new dense parameter. False by default, it controls whether to include all of a level’s members in the cumulative aggregation, even those for which the underlying measure has no values.
Atoti+ now supports i18n. en-US is the only locale supported by default but additional locales can be made available by providing custom translation files. These can be configured with config.create_config(). A good starting point for adding new locales is to use the template containing all the translatable items, which can be obtained by using session.Session.export_translations_template().
The Gauss error function math.erf() and its complementary math.erfc() (#92).
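As a reminder of what the two functions compute, the sketch below uses the error function from Python's standard math module (not atoti's math.erf(), which operates on measures) to check the identity erfc(x) = 1 - erf(x) and the link with the standard Normal cumulative distribution.

```python
import math


def normal_cdf(x):
    """Standard Normal CDF expressed with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


x = 0.75
erf_x = math.erf(x)
erfc_x = math.erfc(x)

# The complementary error function is defined as 1 - erf(x).
identity_gap = abs(erfc_x - (1.0 - erf_x))

cdf_at_zero = normal_cdf(0.0)
print(erf_x, erfc_x, identity_gap, cdf_at_zero)
```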
Changed
The name_attribute parameter used to select the displayed username when using an OpenID Connect provider can be configured.
The scope parameter used to select the requested scopes when using an OpenID Connect provider can be configured. The openid scope is always passed by default.
Fixed
Missing images in the tutorial (#80).
Issue when reading pandas DataFrame with NaN (#77).
hierarchy.Hierarchy.isin() and level.Level.isin() can be used with more than 2 values (#93).
Issue with array types not displayed correctly in the stores schema.
Issue when joining a column of type int to a column of type long if the store is based on a Parquet file (#76).
0.4.1 (2020-06-17)
Added
New tutorial exploring the main basic features of atoti.
rank() returns a measure ranking the members of a given hierarchy based on the value of another measure.
array.prefix_sum() performs the prefix sum of array measures.
Hierarchies can have the same name if they are in different dimensions. To avoid conflicts, a hierarchy can be accessed via a tuple containing the dimension and the hierarchy: cube.hierarchies["Product", "Size"].
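A prefix sum replaces each element of an array with the sum of all elements up to and including it. The plain-Python sketch below shows the operation on a list; it illustrates the concept only and does not call array.prefix_sum(), which works on array measures.

```python
from itertools import accumulate


def prefix_sum(values):
    """Return the running totals of `values`."""
    return list(accumulate(values))


daily_pnl = [10, -4, 7, 3]
running_pnl = prefix_sum(daily_pnl)
print(running_pnl)  # [10, 6, 13, 16]
```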
Changed
Bumped the minimal required version of JupyterLab to 2.1.
Upgraded from ActiveUI SDK 4.3.7 to 4.3.8.
Better messages for known Java errors (#43).
The Auth0 support in Atoti+ has been replaced by the more general OpenID Connect authentication protocol. The structure of the configuration can be seen in the configuration tutorial.
Fixed
filter()’s measure parameter accepts any value that can be converted to a measure (#22).
filter() and where() support inequalities on dates as conditions.
Issue when loading data into a scenario with truncate set to True (#53).
Issue with agg.quantile() combined with scope.origin().
Issue when aggregating .VALUE measures using any of the agg.xxx functions (#52).
Type issue that sometimes happened when chaining operators such as array.quantile() and date_shift().
Blinking cell updates not appearing in pivot tables with real time queries.
0.4.0 (2020-05-25)
Added
agg.max_member() and agg.min_member() return a measure equal to the member reaching the corresponding extremum of the passed measure on the given level.
hierarchy.Hierarchy.isin(), query.hierarchy.QueryHierarchy.isin(), level.Level.isin(), and query.level.QueryLevel.isin() create conditions expressing that a hierarchy or a level should be on one of the given members.
stores.Stores.schema and cube.Cube.schema: SVG graphs of, respectively, all the session’s stores and the stores used by a cube.
store.StoreScenarios.load_csv() loads a directory of CSV files into a store, automatically generating scenarios based on the directory’s structure.
total() returns the total value on each hierarchy member.
session.Session.create_store() creates an empty store from a schema.
Exponentiation operation between measures: measure_a ** measure_b.
Changed
BREAKING: Hierarchies, levels, and measures can no longer be passed by name; instances of the corresponding class are expected instead.
BREAKING: create_session()’s port, max_memory, java_args and sampling_mode parameters and the ATOTI_URL_PATTERN environment variable have been moved to the config.SessionConfiguration, changing these signatures:
create_session(): (name='Unnamed', sampling_mode=SamplingMode(name='limit_lines', parameters=[10000]), port=None, max_memory=None, java_args=None, config=None, **kwargs) → (name='Unnamed', *, config=None)
config.create_config(): (inherit=True, metadata_db=None, roles=None, authentication=None, properties=None) → (*, inherit=True, port=None, url_pattern=None, metadata_db=None, roles=None, authentication=None, sampling_mode=None, max_memory=None, java_args=None)
BREAKING: New structure for the authentication configuration in YAML as shown in the configuration tutorial.
BREAKING: config.BasicUser.roles and config.Auth0Authentication.role_mapping do not accept role instances anymore, only role names.
BREAKING: The wildcard value in measure simulations has been changed from * to None.
BREAKING: session.Session.read_pandas(), session.Session.read_spark() and session.Session.read_numpy() require a name for the created store:
session.Session.read_numpy(): (data, columns, store_name, keys, in_all_scenarios=True, partitioning=None, sep='|') → (array, columns, store_name, *, keys=None, in_all_scenarios=True, partitioning=None, **kwargs)
session.Session.read_pandas(): (dataframe, keys=None, store_name=None, partitioning=None, types=None, **kwargs) → (dataframe, store_name, *, keys=None, in_all_scenarios=True, partitioning=None, types=None, **kwargs)
session.Session.read_spark(): (dataframe, keys=None, store_name=None, partitioning=None) → (dataframe, store_name, *, keys=None, in_all_scenarios=True, partitioning=None)
BREAKING: simulation.Scenario.insert(row) and store.Store.insert_rows(rows) have been renamed simulation.Scenario.append() and store.Store.append(). They take a variadic *rows parameter and in-place addition of a single row is still supported with +=.
BREAKING: percentile and variance functions have been renamed quantile and var:
agg.percentile(measure, percentile_value, mode='inc', interpolation='linear', scope=None) → agg.quantile() with signature (measure, q, *, mode='inc', interpolation='linear', scope=None)
array.percentile(measure, percentile_value, mode='inc', interpolation='linear') → array.quantile() with signature (measure, q, *, mode='inc', interpolation='linear')
agg.variance(measure, mode='sample', scope=None) → agg.var() with signature (measure, *, mode='sample', scope=None)
array.variance(measure, mode='sample') → array.var() with signature (measure, *, mode='sample')
BREAKING: avg has been renamed mean, with the .MEAN suffix for automatically created measures instead of .AVG:
agg.avg(measure, scope=None) → agg.mean() with signature (measure, *, scope=None)
array.avg() → array.mean()
BREAKING: Some function signatures have changed:
cube.Cube.create_static_parameter_hierarchy: (level_and_hierarchy_name, members, indices=None, slicing=True, index_measure='', level_type=None) → (name, members, *, data_type=None, index_measure=None, indices=None, store_name=None) where slicing has been removed since it can be set afterwards through hierarchy.Hierarchy.slicing.
parent_value(): (measure, on_hierarchies=None, top_value=None) → (measure, on, *, apply_filters=False, degree=1, total_value=None). The two new parameters default to values equivalent to the previous behavior; see the function documentation for more details.
scope.cumulative(): (level, partitioning=None, window=range(-2147483648, 0), exclude_self=False) → (level, *, partitioning=None, window=range(-2147483648, 0), exclude_self=False) where window can also accept a tuple of two time offsets to perform a rolling time period aggregation.
simulation.Scenario.load_csv(): (file, delimiter=',') → (path, *, sep=',')
BREAKING: Some other function signatures have changed only to adopt keyword-only parameters (denoted by a * in the parameter list):
agg.single_value
Upgraded from ActiveUI SDK 4.3.5 to 4.3.7. Pivot tables support new Tree, Pivot, and Table layouts, the latter making the Tabular View widget redundant so it has been removed from the available widgets.
session.Session.read_pandas(), store.Store.load_pandas(), simulation.Simulation.load_pandas(), and simulation.Scenario.load_pandas() automatically load columns made of numerical Python lists or NumPy one-dimensional ndarrays as arrays.
Stores without key columns are partitioned on their non-numerical columns by default.
Changed the behavior of agg.single_value() aggregation function to be more consistent with other aggregation functions (#40).
Cube names are not restricted to alphanumeric strings without spaces anymore.
The path parameter of all CSV loading functions accepts glob patterns (e.g. /path/**/*.csv).
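To show what a pattern such as /path/**/*.csv selects, the sketch below uses Python's standard glob module on a temporary directory tree. The directory layout is invented for the example, and atoti's own matching rules may differ in details from those of glob.

```python
import glob
import os
import tempfile

# Hypothetical layout: CSV files at several depths plus one non-CSV file.
root = tempfile.mkdtemp()
relative_paths = [
    "top.csv",
    os.path.join("2020", "january.csv"),
    os.path.join("2020", "q1", "march.csv"),
    os.path.join("2020", "notes.txt"),
]
for relative_path in relative_paths:
    full_path = os.path.join(root, relative_path)
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    with open(full_path, "w", encoding="utf-8") as handle:
        handle.write("a,b\n1,2\n")

# With recursive=True, `**` matches zero or more nested directories.
pattern = os.path.join(root, "**", "*.csv")
matches = sorted(
    os.path.relpath(path, root) for path in glob.glob(pattern, recursive=True)
)
print(matches)
```

The three CSV files are selected regardless of their depth, while notes.txt is left out.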
Removed
BREAKING: simulation.Priority. Directly pass numbers to rank simulation rules instead.
BREAKING: Cube.create_bucketing() has moved to Cube._setup_bucketing() and is not part of the public API anymore. It might change in future releases without notice.
BREAKING: config.create_config()’s properties parameter. max_memory can be passed directly as a named-parameter instead. The other properties have been removed.
BREAKING: pow(measure_a, measure_b) replaced by measure_a ** measure_b.
Fixed
Inability to install atoti alongside Python > 3.7 when using Conda.
Issue with filter() not being aggregated correctly (#17, #28).
Metadata DBs created in atoti can be used in Atoti+ and reciprocally (#15).
Inability to create some measures or hierarchies after some partial joins (#4, #10).
Inability to load CSV folders from AWS S3 storage.
Slow read of files on AWS S3 when anonymous due to multiple timeouts in the credentials provider (#26).
Inability to use wildcards on field types other than strings.
Inability to use numeric levels for measure simulations.
0.3.1 (2020-04-14)
First public release of atoti.