Reference#

Plugins#

Some atoti features require large additional libraries and might not be useful in every project. To keep the core library as light as possible, these features are packaged into separate plugins that can be installed when needed.

Available plugins#

  • atoti-plus: Grants access to the Atoti+ features.

Data loading#

  • atoti-kafka: Load real-time Kafka streams into atoti tables.

  • atoti-sql: Load results of SQL queries into atoti tables.

Cloud storage#

  • atoti-aws: Load CSV and Parquet files from AWS S3 into atoti tables.

  • atoti-azure: Load CSV and Parquet files from Azure Blob Storage into atoti tables.

  • atoti-gcp: Load CSV and Parquet files from Google Cloud Storage into atoti tables.

These connectors open tens of HTTP connections to the cloud storage to transfer the data in parallel, then transparently reassemble the blocks directly in memory. They can load up to 300 GB in about 5 minutes (roughly 1 GB/s). Several factors affect the overall download speed:

  • Bandwidth of the network interface.

  • Speed of the CPU cores, since HTTPS connections and client-side encryption consume CPU resources.

  • File size: small files download slowly (under 60 MB/s).

  • Type (hot/cold) of the storage: hot storage is faster.

  • Data locality: best when the host running atoti and the data are in the same cloud region.
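The parallel transfer described above can be illustrated with a minimal Python sketch. This is not the connectors' actual implementation: `split_ranges`, `parallel_download`, and the `fetch_range` callback are hypothetical names, and the block size and connection count are arbitrary assumptions. The sketch splits an object into byte ranges, fetches the ranges concurrently, and writes each block at its offset in a preallocated in-memory buffer.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(total_size: int, block_size: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges covering total_size bytes."""
    return [
        (start, min(start + block_size, total_size) - 1)
        for start in range(0, total_size, block_size)
    ]


def parallel_download(
    fetch_range: Callable[[int, int], bytes],  # hypothetical: returns bytes start..end inclusive
    total_size: int,
    block_size: int = 8 * 1024 * 1024,  # assumed block size, not an atoti setting
    max_connections: int = 32,  # assumed connection count, not an atoti setting
) -> bytes:
    """Fetch all blocks concurrently and reassemble them in memory."""
    buffer = bytearray(total_size)

    def fetch_into_buffer(byte_range: Tuple[int, int]) -> None:
        start, end = byte_range
        block = fetch_range(start, end)
        if len(block) != end - start + 1:
            raise ValueError(f"short read for range {start}-{end}")
        # Each worker writes to its own disjoint slice of the buffer.
        buffer[start : end + 1] = block

    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # list() consumes the iterator so that worker exceptions are raised here.
        list(pool.map(fetch_into_buffer, split_ranges(total_size, block_size)))
    return bytes(buffer)


if __name__ == "__main__":
    # Stand-in for a remote object; a real fetch_range would issue an
    # HTTP request with a "Range: bytes=start-end" header.
    payload = bytes(range(256)) * 1000

    def fake_fetch(start: int, end: int) -> bytes:
        return payload[start : end + 1]

    result = parallel_download(fake_fetch, len(payload), block_size=4096, max_connections=8)
    assert result == payload
```

With this structure, throughput depends on how many ranges are in flight at once and on how fast each one completes, which is why bandwidth, CPU speed, file size, and data locality all matter.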

DirectQuery#

See the "Use DirectQuery" guide to learn how to use DirectQuery.

Data visualization#

  • atoti-jupyterlab: Visualize the data in an atoti session with interactive widgets in JupyterLab.

Installation#

A plugin can be installed as a Python package or as a Conda package.

For instance, to install the Kafka plugin:

Python package#

pip install atoti-kafka

Multiple plugins can be installed with the “extras” syntax:

pip install "atoti[kafka,sql]"

Conda package#

conda install atoti-kafka
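
After installation, one way to check which plugins are present in the current environment is to query the installed package metadata. The sketch below uses only the standard library's `importlib.metadata`. The helper name `installed_plugins` is hypothetical, and the package list simply repeats the plugin names given on this page.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, List, Optional

# Plugin package names as listed on this page.
PLUGIN_PACKAGES: List[str] = [
    "atoti-plus",
    "atoti-kafka",
    "atoti-sql",
    "atoti-aws",
    "atoti-azure",
    "atoti-gcp",
    "atoti-jupyterlab",
]


def installed_plugins(packages: List[str] = PLUGIN_PACKAGES) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, installed_version in installed_plugins().items():
        print(f"{name}: {installed_version or 'not installed'}")
```

This only reports what the package manager has recorded. It does not verify that a plugin loads correctly inside an atoti session.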