atoti.Table.load()

Table.load(data, /)

Load data into the table.

This is a blocking operation: the method will not return until all the data is loaded.

Parameters:

data (Table | DataFrame | DataLoad) – The data to load.

Return type:

None

Example

>>> from datetime import date
>>> table = session.create_table(
...     "Sales",
...     data_types={
...         "ID": "String",
...         "Product": "String",
...         "Price": "int",
...         "Quantity": "int",
...         "Date": "LocalDate",
...     },
...     keys={"ID"},
... )

Loading an Arrow table:

>>> import pyarrow as pa
>>> arrow_table = pa.Table.from_pydict(
...     {
...         "ID": pa.array(["ab", "cd"]),
...         "Product": pa.array(["phone", "watch"]),
...         "Price": pa.array([699, 349]),
...         "Quantity": pa.array([1, 2]),
...         "Date": pa.array([date(2024, 3, 5), date(2024, 12, 12)]),
...     }
... )
>>> table.load(arrow_table)
>>> table.head().sort_index()
   Product  Price  Quantity       Date
ID
ab   phone    699         1 2024-03-05
cd   watch    349         2 2024-12-12

Loading a pandas DataFrame:

>>> import pandas as pd
>>> pandas_dataframe = pd.DataFrame(
...     {
...         "ID": ["ef", "gh"],
...         "Product": ["laptop", "book"],
...         "Price": [2599, 19],
...         "Quantity": [3, 5],
...         "Date": [date(2023, 8, 10), date(2024, 1, 13)],
...     }
... )
>>> table.load(pandas_dataframe)
>>> table.head().sort_index()
   Product  Price  Quantity       Date
ID
ab   phone    699         1 2024-03-05
cd   watch    349         2 2024-12-12
ef  laptop   2599         3 2023-08-10
gh    book     19         5 2024-01-13

Loading a NumPy array by converting it to a pandas DataFrame:

>>> import numpy as np
>>> numpy_array = np.asarray(
...     [
...         ["ij", "watch", 299, 1, date(2022, 7, 20)],
...         ["kl", "keyboard", 69, 1, date(2023, 5, 8)],
...     ],
...     dtype=object,
... )
>>> table.load(pd.DataFrame(numpy_array, columns=list(table)))
>>> table.head(10).sort_index()
     Product  Price  Quantity       Date
ID
ab     phone    699         1 2024-03-05
cd     watch    349         2 2024-12-12
ef    laptop   2599         3 2023-08-10
gh      book     19         5 2024-01-13
ij     watch    299         1 2022-07-20
kl  keyboard     69         1 2023-05-08
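
The conversion step above can be inspected on its own, without a session. This sketch, which assumes only NumPy and pandas, shows what the intermediate DataFrame looks like and how `infer_objects()` can narrow object-typed columns. The column names are hardcoded for illustration (the example above derives them from the table), and the extra `infer_objects()` call is optional: the example above loads the converted DataFrame directly.

```python
from datetime import date

import numpy as np
import pandas as pd

# Hardcoded for illustration; the example above uses list(table) instead.
columns = ["ID", "Product", "Price", "Quantity", "Date"]

numpy_array = np.asarray(
    [
        ["ij", "watch", 299, 1, date(2022, 7, 20)],
        ["kl", "keyboard", 69, 1, date(2023, 5, 8)],
    ],
    dtype=object,  # Keeps the mixed Python values intact instead of coercing them.
)

raw = pd.DataFrame(numpy_array, columns=columns)
# Columns built from an object-typed array generally stay object-typed
# (the exact result can vary by pandas version).
print(raw.dtypes)

# infer_objects() is a soft conversion: object columns that hold only
# Python integers become integer columns, other columns are left alone.
typed = raw.infer_objects()
print(typed.dtypes)
```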

Loading a Spark DataFrame by converting it to a pandas DataFrame:

>>> from pyspark.sql import Row, SparkSession
>>> spark = SparkSession.builder.getOrCreate()
>>> spark_dataframe = spark.createDataFrame(
...     [
...         Row(
...             ID="mn",
...             Product="glasses",
...             Price=129,
...             Quantity=2,
...             Date=date(2021, 3, 3),
...         ),
...         Row(
...             ID="op",
...             Product="battery",
...             Price=49,
...             Quantity=2,
...             Date=date(2024, 11, 7),
...         ),
...     ]
... )
>>> table.load(spark_dataframe.toPandas())
>>> spark.stop()
>>> table.head(10).sort_index()
     Product  Price  Quantity       Date
ID
ab     phone    699         1 2024-03-05
cd     watch    349         2 2024-12-12
ef    laptop   2599         3 2023-08-10
gh      book     19         5 2024-01-13
ij     watch    299         1 2022-07-20
kl  keyboard     69         1 2023-05-08
mn   glasses    129         2 2021-03-03
op   battery     49         2 2024-11-07

The += operator is available as syntactic sugar to load a single row, expressed either as a tuple or as a Mapping:

>>> table += ("qr", "mouse", 29, 3, date(2024, 11, 7))
>>> table.head(10).sort_index()
     Product  Price  Quantity       Date
ID
ab     phone    699         1 2024-03-05
cd     watch    349         2 2024-12-12
ef    laptop   2599         3 2023-08-10
gh      book     19         5 2024-01-13
ij     watch    299         1 2022-07-20
kl  keyboard     69         1 2023-05-08
mn   glasses    129         2 2021-03-03
op   battery     49         2 2024-11-07
qr     mouse     29         3 2024-11-07
>>> table += {  # The order of the keys does not matter.
...     "Product": "screen",
...     "Quantity": 1,
...     "Price": 599,
...     "Date": date(2023, 5, 8),
...     "ID": "st",
... }
>>> table.head(10).sort_index()
     Product  Price  Quantity       Date
ID
ab     phone    699         1 2024-03-05
cd     watch    349         2 2024-12-12
ef    laptop   2599         3 2023-08-10
gh      book     19         5 2024-01-13
ij     watch    299         1 2022-07-20
kl  keyboard     69         1 2023-05-08
mn   glasses    129         2 2021-03-03
op   battery     49         2 2024-11-07
qr     mouse     29         3 2024-11-07
st    screen    599         1 2023-05-08
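
The two row forms carry the same information: a tuple relies on column order, while a mapping relies on column names. The sketch below illustrates that equivalence in plain Python. `row_as_tuple` is a hypothetical helper written for this illustration, not part of atoti, and its rejection of missing keys is a deliberately strict choice rather than a statement about how atoti treats incomplete rows.

```python
from collections.abc import Mapping, Sequence
from datetime import date
from typing import Any


def row_as_tuple(row: Mapping[str, Any], columns: Sequence[str]) -> tuple[Any, ...]:
    """Order the values of ``row`` by ``columns`` (hypothetical helper)."""
    missing = [name for name in columns if name not in row]
    unexpected = [name for name in row if name not in columns]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")
    return tuple(row[name] for name in columns)


# Column order assumed to match the table created above.
columns = ["ID", "Product", "Price", "Quantity", "Date"]

row = {  # Same mapping as above, keys in arbitrary order.
    "Product": "screen",
    "Quantity": 1,
    "Price": 599,
    "Date": date(2023, 5, 8),
    "ID": "st",
}

ordered = row_as_tuple(row, columns)
print(ordered)
```

The resulting tuple has the same shape as the one used with `+=` in the first example of this section.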