atoti.tables.Tables.data_transaction()
- Tables.data_transaction(scenario_name=None, *, allow_nested=True)
Create a data transaction to batch several data loading operations.
It is more efficient than doing each loading operation one after the other.
It also avoids exposing possibly incorrect intermediate states (e.g. when loading new data requires dropping existing rows first).
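The atomicity described above can be illustrated with a toy, pure-Python sketch (this is not atoti's implementation; `ToyTable` and its snapshot mechanism are invented for illustration): if an exception interrupts the batch, the table is restored to its pre-transaction state, so no incorrect intermediate state survives.

```python
from contextlib import contextmanager


class ToyTable:
    """Hypothetical stand-in for a table supporting transactional batching."""

    def __init__(self):
        self.rows = {}

    @contextmanager
    def data_transaction(self):
        # A cheap dict copy stands in for a real snapshot/rollback mechanism.
        snapshot = dict(self.rows)
        try:
            yield
        except Exception:
            # Roll the table back to its pre-transaction state, then re-raise.
            self.rows = snapshot
            raise


table = ToyTable()
table.rows["Paris"] = 200.0
try:
    with table.data_transaction():
        del table.rows["Paris"]       # drop existing rows first...
        table.rows["Berlin"] = 150.0  # ...then load replacement data
        raise ValueError("load failed mid-transaction")
except ValueError:
    pass
# The half-applied changes were rolled back: "Paris" is intact,
# "Berlin" was never committed.
```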
Note

Data transactions cannot be mixed with:

- Long-running operations such as load_kafka().
- Data model operations such as join(), read_parquet(), or defining a new measure.
- Operations on parameter tables created from create_parameter_hierarchy_from_members() and create_parameter_simulation().
- Operations on source scenarios other than the one the transaction was started on.
- Parameters:
allow_nested (bool) – Whether to allow starting this transaction inside an already running one. When False, an error is raised if this transaction is started while another transaction is already running, regardless of the outer transaction’s allow_nested value. The benefit of passing False is that changes made in this transaction are guaranteed, unless rolled back, to be visible to statements outside the transaction. The drawback is that it prevents splitting transaction steps into small composable functions. When nested transactions are allowed, changes made by inner transactions contribute transparently to the outer transaction.
scenario_name (str | None) – The name of the source scenario impacted by all the table operations inside the transaction.
- Return type:
AbstractContextManager[None]
Example
>>> df = pd.DataFrame(
...     columns=["City", "Price"],
...     data=[
...         ("Berlin", 150.0),
...         ("London", 240.0),
...         ("New York", 270.0),
...         ("Paris", 200.0),
...     ],
... )
>>> table = session.read_pandas(df, keys=["City"], table_name="Cities")
>>> cube = session.create_cube(table)
>>> extra_df = pd.DataFrame(
...     columns=["City", "Price"],
...     data=[
...         ("Singapore", 250.0),
...     ],
... )
>>> with session.tables.data_transaction():
...     table += ("New York", 100.0)
...     table.drop(table["City"] == "Paris")
...     table.load_pandas(extra_df)
>>> table.head().sort_index()
           Price
City
Berlin     150.0
London     240.0
New York   100.0
Singapore  250.0
>>> table.drop()
Nested transactions allowed:
>>> def composable_function(session):
...     table = session.tables["Cities"]
...     with session.tables.data_transaction():
...         table += ("Paris", 100.0)
>>> # The function can be called in isolation:
>>> composable_function(session)
>>> table.head().sort_index()
       Price
City
Paris  100.0
>>> with session.tables.data_transaction(
...     allow_nested=False  # No-op because this is the outer transaction.
... ):
...     table.drop()
...     table += ("Berlin", 200.0)
...     # The function can also be called inside another transaction and will contribute to it:
...     composable_function(session)
...     table += ("New York", 150.0)
>>> table.head().sort_index()
          Price
City
Berlin    200.0
New York  150.0
Paris     100.0
Nested transactions not allowed:
>>> def not_composable_function(session):
...     table = session.tables["Cities"]
...     with session.tables.data_transaction(allow_nested=False):
...         table.drop()
...         table += ("Paris", 100.0)
...     assert table.row_count == 1
>>> # The function can be called in isolation:
>>> not_composable_function(session)
>>> with session.tables.data_transaction():
...     table.drop()
...     table += ("Berlin", 200.0)
...     # This is a programming error, the function cannot be called inside another transaction:
...     not_composable_function(session)
...     table += ("New York", 150.0)
Traceback (most recent call last):
    ...
RuntimeError: Cannot start this transaction inside another transaction since nesting is not allowed.
>>> # The last transaction was rolled back:
>>> table.head().sort_index()
       Price
City
Paris  100.0