atoti.Session.read_arrow()

Session.read_arrow(table, /, *, table_name, keys=(), partitioning=None, types={}, default_values={}, **kwargs)

Read an Arrow Table into a table.

Parameters:

table (Table) – The Arrow Table to load.
table_name (str) – The name of the table to create.
keys (Collection[str]) – The columns that will become keys of the table.
partitioning (str | None) – The description of how the data will be split across partitions of the table.

Default rules:
- Only non-joined tables are automatically partitioned.
- Tables are automatically partitioned by hashing their key columns. If there are no key columns, all the dictionarized columns are hashed.
- Joined tables can only use a sub-partitioning of the table referencing them.
- Automatic partitioning is done modulo the number of available cores.

Example

hash4(country) splits the data across 4 partitions based on the country column’s hash value.
types (Mapping[str, DataType]) – Types for some or all columns of the table. Types for unspecified columns are inferred from the Arrow DataTypes.
default_values (Mapping[str, ConstantValue | None]) – Mapping from each column name to the default value of that column.
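
To make the hash4(country) partitioning description above more concrete, here is a minimal pure-Python sketch of hash partitioning in general. It is not atoti's implementation: the helper hash_partition is hypothetical, and zlib.crc32 is only a stand-in for whatever hash function atoti uses internally. It illustrates one property only: rows sharing the same value in the partitioning column always land in the same partition.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, column, partition_count):
    """Group rows by a hash of rows[column], taken modulo partition_count.

    Hypothetical helper for illustration; not part of atoti.
    """
    partitions = defaultdict(list)
    for row in rows:
        # zlib.crc32 is an assumed stand-in for the real hash function.
        digest = zlib.crc32(str(row[column]).encode("utf-8"))
        partitions[digest % partition_count].append(row)
    return dict(partitions)


rows = [
    {"country": "France", "city": "Paris"},
    {"country": "France", "city": "Lyon"},
    {"country": "Japan", "city": "Tokyo"},
    {"country": "Brazil", "city": "Recife"},
]

# Analogous in spirit to "hash4(country)": 4 partitions keyed on country.
partitions = hash_partition(rows, "country", 4)
```

Some of the 4 partitions may stay empty when the column has few distinct values, which is one reason the choice of partitioning column matters.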

Return type:

Table

Example

>>> import pyarrow as pa
>>> arrow_table = pa.Table.from_arrays(
...     [
...         pa.array(["phone", "headset", "watch"]),
...         pa.array([600.0, 80.0, 250.0]),
...     ],
...     names=["Product", "Price"],
... )
>>> arrow_table
pyarrow.Table
Product: string
Price: double
----
Product: [["phone","headset","watch"]]
Price: [[600,80,250]]
>>> table = session.read_arrow(
...     arrow_table, keys=["Product"], table_name="Arrow"
... )
>>> table.head().sort_index()
         Price
Product
headset   80.0
phone    600.0
watch    250.0