Berkeley DB Reference Guide:
Transaction Subsystem


Berkeley DB and transactions

The transaction subsystem makes operations atomic, consistent, isolated, and durable in the face of system and application failures. To attain these properties, the data must be properly logged and locked. Berkeley DB contains all the components necessary to transaction-protect the Berkeley DB access methods, and other forms of data may be protected if they are logged and locked appropriately.

The transaction subsystem is created, initialized, and opened by calls to DBENV->open with the DB_INIT_TXN flag specified. Note that enabling transactions automatically enables logging but does not enable locking, because a single thread of control that needs atomicity and recoverability does not require locking.

The txn_begin function starts a transaction, returning an opaque handle to a transaction. If the parent parameter to txn_begin is non-NULL, then the new transaction is a child of the designated parent transaction.

The txn_abort function ends the designated transaction and causes all updates performed by the transaction to be undone. The database is left in the same state that existed before the call to txn_begin. If the aborting transaction has any child transactions associated with it (even ones that have already been committed), they are also aborted. Any transactions that are unresolved (i.e., neither committed nor aborted) when the application or system fails are aborted during recovery.

The txn_commit function ends the designated transaction and makes all the updates performed by the transaction permanent, even in the face of application or system failure. If the committing transaction is a parent, then all of its child transactions that individually committed or had not been resolved are also committed.
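The parent and child rules for txn_abort and txn_commit can be pictured with a small toy model. The sketch below does not call Berkeley DB at all: the toy_txn type, the toy_state values, and the toy_abort and toy_commit functions are hypothetical names invented for illustration, and the fixed limit of four children is an arbitrary simplification. It assumes that a child which was explicitly aborted stays aborted when its parent commits.

```c
#include <stddef.h>

/* Toy model only: none of these names belong to the Berkeley DB API. */
enum toy_state { TOY_ACTIVE, TOY_COMMITTED, TOY_ABORTED };

struct toy_txn {
    enum toy_state state;
    struct toy_txn *children[4];   /* fixed fan-out keeps the sketch short */
    size_t nchildren;
};

/* Abort a transaction and every descendant, even children that had
 * already committed on their own. */
void toy_abort(struct toy_txn *t)
{
    size_t i;

    t->state = TOY_ABORTED;
    for (i = 0; i < t->nchildren; i++)
        toy_abort(t->children[i]);
}

/* Commit a transaction. Children that committed individually or are still
 * unresolved are committed with it; aborted children are left alone. */
void toy_commit(struct toy_txn *t)
{
    size_t i;

    t->state = TOY_COMMITTED;
    for (i = 0; i < t->nchildren; i++)
        if (t->children[i]->state != TOY_ABORTED)
            toy_commit(t->children[i]);
}
```

In this model, committing a parent with one committed, one unresolved, and one aborted child leaves the first two committed and the third aborted, while a later abort of the same parent marks all three aborted.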

Transactions are identified by 32-bit unsigned integers. The ID associated with any transaction can be obtained using the txn_id function. If an application is maintaining information outside of Berkeley DB that it wishes to transaction-protect, it should use this transaction ID as the locking ID.

The txn_checkpoint function causes a transaction checkpoint. A checkpoint is performed relative to a specific log sequence number (LSN), referred to as the checkpoint LSN. When a checkpoint completes successfully, all data buffers whose updates are described by LSNs less than the checkpoint LSN have been written to disk. This, in turn, means that log records with LSNs less than the checkpoint LSN are no longer necessary for normal recovery (although they would be required for catastrophic recovery should the database files be lost), and all log files containing only records prior to the checkpoint LSN may be safely archived and removed.
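The rule for which log files may be removed can be sketched as a comparison against the checkpoint LSN. The code below is a toy model and does not use Berkeley DB types: toy_lsn, toy_lsn_cmp, and toy_log_file_removable are hypothetical names, and the sketch assumes that an LSN can be treated as a log file number plus a byte offset within that file. It applies only the rule stated above and ignores any other constraint a real deployment might impose.

```c
#include <stdint.h>

/* Toy stand-in for a log sequence number: a log file number plus a byte
 * offset within that file. Hypothetical type, not the library's own. */
struct toy_lsn {
    uint32_t file;
    uint32_t offset;
};

/* Order two LSNs: negative, zero, or positive, in the style of strcmp. */
int toy_lsn_cmp(struct toy_lsn a, struct toy_lsn b)
{
    if (a.file != b.file)
        return a.file < b.file ? -1 : 1;
    if (a.offset != b.offset)
        return a.offset < b.offset ? -1 : 1;
    return 0;
}

/* A log file holds only pre-checkpoint records when its number is lower
 * than the file number of the checkpoint LSN. The file containing the
 * checkpoint LSN may also hold later records, so it is kept. */
int toy_log_file_removable(uint32_t file_number, struct toy_lsn checkpoint)
{
    return file_number < checkpoint.file;
}
```

With a checkpoint LSN in file 3, this model reports files 1 and 2 as removable and keeps file 3 and anything after it.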

Completing a transaction checkpoint may require writing a buffer that is currently in use (i.e., is actively being read or written by some transaction). In this case, txn_checkpoint cannot write the buffer, because doing so might cause an inconsistent version of the page to be written to disk. Instead of completing successfully, it returns the error code DB_INCOMPLETE. In such cases, the checkpoint can simply be retried after a short delay.
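The retry-after-delay pattern can be sketched as a bounded loop. To stay self-contained, the code below does not call txn_checkpoint itself. TOY_INCOMPLETE is a stand-in for DB_INCOMPLETE with an arbitrary value, and the callback types, toy_checkpoint_with_retry, and the toy_busy_checkpoint demo stub are all hypothetical names. An application would wrap its own checkpoint call in a function of the toy_checkpoint_fn shape and supply a pause hook that sleeps briefly.

```c
#include <stddef.h>

/* Stand-in for the library's DB_INCOMPLETE code; the value is arbitrary
 * and chosen only for this sketch. */
#define TOY_INCOMPLETE (-1)

/* Hypothetical callback: returns 0 on success, TOY_INCOMPLETE when a
 * buffer was in use, or any other non-zero value for a hard error. */
typedef int (*toy_checkpoint_fn)(void *arg);

/* Hypothetical pause hook, so a caller can plug in a short sleep. */
typedef void (*toy_pause_fn)(unsigned attempt);

/* Try the checkpoint up to max_attempts times, pausing between tries. */
int toy_checkpoint_with_retry(toy_checkpoint_fn checkpoint, void *arg,
                              toy_pause_fn pause, unsigned max_attempts)
{
    unsigned attempt;
    int ret = TOY_INCOMPLETE;

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        ret = checkpoint(arg);
        if (ret != TOY_INCOMPLETE)
            return ret;             /* success or a hard error: stop */
        if (pause != NULL && attempt < max_attempts)
            pause(attempt);         /* short delay before the next try */
    }
    return ret;                     /* still incomplete after all tries */
}

/* Demo stub: reports TOY_INCOMPLETE until the counter behind arg is 0. */
int toy_busy_checkpoint(void *arg)
{
    int *busy_rounds = arg;

    if (*busy_rounds > 0) {
        (*busy_rounds)--;
        return TOY_INCOMPLETE;
    }
    return 0;
}
```

Bounding the number of attempts is a design choice of this sketch, made so that a buffer that stays busy cannot stall the caller indefinitely. The caller decides what to do when the final result is still incomplete.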

The length of time required to run normal recovery is directly proportional to the interval between successive checkpoints. If the interval between checkpoints is long, then a large number of updates that are recorded in the log may not yet be written to disk, and recovery may take longer to run. If the interval is short, recovery time will be shorter, but data is written to disk more frequently. Often, the checkpoint interval will be tuned for each specific application.
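The proportionality can be made concrete with a deliberately simple linear model. The function below is an illustration, not a measurement or a Berkeley DB interface: its name and its three parameters (checkpoint interval, rate at which log data is generated, and rate at which recovery can replay log data) are assumptions, and real recovery time depends on factors the model ignores.

```c
/* Toy linear model, not a measurement: all three inputs are hypothetical.
 * Returns the worst-case time, in seconds, to replay the log written
 * since the most recent checkpoint. */
double toy_worst_case_recovery_seconds(double checkpoint_interval_s,
                                       double log_bytes_per_s,
                                       double replay_bytes_per_s)
{
    /* In the worst case, a full interval of log must be replayed. */
    double log_to_replay = checkpoint_interval_s * log_bytes_per_s;

    return log_to_replay / replay_bytes_per_s;
}
```

Under this model, doubling the checkpoint interval while holding the other two rates fixed doubles the worst-case recovery time. That is the trade-off weighed against more frequent disk writes when the interval is tuned.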

The txn_stat function returns information about the status of the transaction subsystem. It is the programmatic interface used by the db_stat utility.

The transaction system is closed by a call to DBENV->close.

Finally, the entire transaction system may be removed using the DBENV->remove interface.


Copyright Sleepycat Software