The second component of the infrastructure is performing checkpoints of the log files. As transactions commit, change records are written into the log files, but the actual changes to the database are not necessarily written to disk. When a checkpoint is performed, the changes to the database that are part of committed transactions are written into the backing database file.
Performing checkpoints is necessary for two reasons. First, you can only remove the Berkeley DB log files from your system after a checkpoint. Second, the more frequently you checkpoint, the less time database recovery takes after a system or application failure, because recovery has less of the log to process.
Once the database pages are written, log files can be archived and removed from the system because they will never be needed for anything other than recovery from catastrophic failure. In addition, recovery after system or application failure only has to redo or undo changes made since the last checkpoint, because changes before the checkpoint have all been flushed to the filesystem.
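The relationship between checkpoints and recovery work can be illustrated with a toy model. The sketch below is not Berkeley DB code: the toy_log structure and the toy_log_append, toy_checkpoint, and toy_recover functions are hypothetical names invented for this illustration, and the "database" is reduced to a single integer. It only shows that recovery needs to replay the records written after the last checkpoint.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are Berkeley DB APIs. */
#define TOY_MAX_LOG 64

struct toy_log {
    int    records[TOY_MAX_LOG]; /* one entry per committed change */
    size_t nrecords;             /* number of records written so far */
    size_t checkpoint;           /* records[0..checkpoint) are on disk */
};

/* Append a change record; returns 0 on success, -1 if the log is full. */
int
toy_log_append(struct toy_log *log, int change)
{
    assert(log != NULL);
    if (log->nrecords == TOY_MAX_LOG)
        return (-1);
    log->records[log->nrecords++] = change;
    return (0);
}

/* Checkpoint: apply every logged but unflushed change to the "file". */
void
toy_checkpoint(struct toy_log *log, int *db_on_disk)
{
    size_t i;

    for (i = log->checkpoint; i < log->nrecords; i++)
        *db_on_disk += log->records[i];
    log->checkpoint = log->nrecords;
}

/*
 * Recovery: replay only the records after the last checkpoint.
 * Returns the number of records that had to be replayed.
 */
size_t
toy_recover(const struct toy_log *log, int *db_on_disk)
{
    size_t i, replayed;

    replayed = 0;
    for (i = log->checkpoint; i < log->nrecords; i++) {
        *db_on_disk += log->records[i];
        replayed++;
    }
    return (replayed);
}
```

In this model, if five changes are logged, a checkpoint is taken, and two more changes are logged before a failure, toy_recover replays two records rather than seven. The first five records are the ones that could be archived and removed.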
Berkeley DB provides a separate utility, db_checkpoint, which can be used to perform checkpoints. Alternatively, applications can write their own checkpoint utility using the underlying txn_checkpoint function. The following code fragment checkpoints the database environment every 60 seconds:
    int
    main(int argc, char *argv[])
    {
        extern char *optarg;
        extern int optind;
        DB *db_cats, *db_color, *db_fruit;
        DB_ENV *dbenv;
        pthread_t ptid;
        int ch;

        while ((ch = getopt(argc, argv, "")) != EOF)
            switch (ch) {
            case '?':
            default:
                usage();
            }
        argc -= optind;
        argv += optind;

        env_dir_create();
        env_open(&dbenv);

        /* Start a checkpoint thread. */
        if ((errno = pthread_create(
            &ptid, NULL, checkpoint_thread, (void *)dbenv)) != 0) {
            fprintf(stderr,
                "txnapp: failed spawning checkpoint thread: %s\n",
                strerror(errno));
            exit (1);
        }

        /* Open database: Key is fruit class; Data is specific type. */
        db_open(dbenv, &db_fruit, "fruit", 0);

        /* Open database: Key is a color; Data is an integer. */
        db_open(dbenv, &db_color, "color", 0);

        /*
         * Open database:
         *    Key is a name; Data is: company name, address, cat breeds.
         */
        db_open(dbenv, &db_cats, "cats", 1);

        add_fruit(dbenv, db_fruit, "apple", "yellow delicious");

        add_color(dbenv, db_color, "blue", 0);
        add_color(dbenv, db_color, "blue", 3);

        add_cat(dbenv, db_cats,
            "Amy Adams",
            "Sleepycat Software",
            "394 E. Riding Dr., Carlisle, MA 01741, USA",
            "abyssinian",
            "bengal",
            "chartreaux",
            NULL);

        return (0);
    }

    void *
    checkpoint_thread(void *arg)
    {
        DB_ENV *dbenv;
        int ret;

        dbenv = arg;
        dbenv->errx(dbenv, "Checkpoint thread: %lu", (u_long)pthread_self());

        /* Checkpoint once a minute. */
        for (;; sleep(60))
            switch (ret = txn_checkpoint(dbenv, 0, 0, 0)) {
            case 0:
            case DB_INCOMPLETE:
                break;
            default:
                dbenv->err(dbenv, ret, "checkpoint thread");
                exit (1);
            }

        /* NOTREACHED */
    }

As checkpoints can be quite expensive, choosing how often to perform a checkpoint is a common tuning parameter for Berkeley DB applications.
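One way to reason about this tuning is to make the checkpoint conditional on how much log has been written or how much time has passed. The sketch below is a standalone model, not part of Berkeley DB: toy_checkpoint_due is a hypothetical helper. Its kbyte and min arguments are named after the second and third arguments of txn_checkpoint, on the assumption that those arguments act as "more than this many kilobytes of log" and "more than this many minutes" thresholds, and that passing zero for both requests an unconditional checkpoint, as the fragment above does. Check the txn_checkpoint documentation for your release before relying on that reading.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not a Berkeley DB function.
 *
 * kb_logged:       kilobytes of log written since the last checkpoint
 * minutes_elapsed: minutes since the last checkpoint
 * kbyte, min:      thresholds; a value of 0 disables that threshold
 *
 * Returns true if a checkpoint should be performed now.
 */
bool
toy_checkpoint_due(unsigned long kb_logged, unsigned long minutes_elapsed,
    unsigned long kbyte, unsigned long min)
{
    /* No thresholds set: always checkpoint (assumed behavior). */
    if (kbyte == 0 && min == 0)
        return (true);

    /* Enough log data has accumulated. */
    if (kbyte != 0 && kb_logged > kbyte)
        return (true);

    /* Enough time has passed. */
    if (min != 0 && minutes_elapsed > min)
        return (true);

    return (false);
}
```

With thresholds of 500 kilobytes and 5 minutes, for example, this model skips the checkpoint when only 100 kilobytes have been logged in the last minute, so a lightly loaded application does not pay for checkpoints it does not need, while a busy one still bounds the amount of log that recovery must process.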