Cursor operations

A database cursor refers to a single key/data pair in the database. It supports traversal of the database and is the only way to access individual duplicate data items. Cursors are used for operating on collections of records, for iterating over a database, and for saving handles to individual records, so that they can be modified after they have been read.

The DB->cursor() method opens a cursor into a database. Upon return, the cursor is uninitialized; cursor positioning occurs as part of the first cursor operation.

Once a database cursor has been opened, records may be retrieved (DBC->get()), stored (DBC->put()), and deleted (DBC->del()).

Additional operations supported by the cursor handle include duplication (DBC->dup()), equality join (DB->join()), and a count of duplicate data items (DBC->count()). Cursors are eventually closed using DBC->close().

For more information on the operations supported by the cursor handle, see the Database Cursors and Related Methods section in the Berkeley DB C API Reference Guide.

Retrieving records with a cursor

The DBC->get() method retrieves records from the database using a cursor. The DBC->get() method takes a flag which controls how the cursor is positioned within the database and returns the key/data item associated with that positioning. Similar to DB->get(), DBC->get() may also take a supplied key and retrieve the data associated with that key from the database. There are several flags that you can set to customize retrieval.

Cursor position flags

DB_FIRST, DB_LAST

Return the first (last) record in the database.

DB_NEXT, DB_PREV

Return the next (previous) record in the database.

DB_NEXT_DUP

Return the next record in the database, if it is a duplicate data item for the current key. For Heap databases, this flag always results in the cursor returning the DB_NOTFOUND error.

DB_NEXT_NODUP, DB_PREV_NODUP

Return the next (previous) record in the database that is not a duplicate data item for the current key.

DB_CURRENT

Return the record from the database to which the cursor currently refers.
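The difference between DB_NEXT_DUP and DB_NEXT_NODUP is easiest to see on a small sorted data set. The following is a minimal sketch in plain C that models a cursor as an index into a sorted array of key/data pairs. It does not call Berkeley DB: struct pair, NOT_FOUND, next_dup() and next_nodup() are hypothetical names invented for this illustration, and the caller is assumed to pass a valid position.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a database record; not a Berkeley DB type. */
struct pair {
    const char *key;
    const char *data;
};

/* Hypothetical "no such record" result, playing the role of DB_NOTFOUND. */
#define NOT_FOUND (-1)

/* A tiny sorted data set in which the key "red" has three duplicates. */
static const struct pair records[] = {
    { "blue",   "blueberry"  },
    { "red",    "apple"      },
    { "red",    "raspberry"  },
    { "red",    "strawberry" },
    { "yellow", "peach"      },
    { "yellow", "pear"       },
};
static const int nrecords = (int)(sizeof(records) / sizeof(records[0]));

/* Model of DB_NEXT_DUP: step forward only if the next record has the same key. */
int
next_dup(int pos)
{
    if (pos + 1 < nrecords &&
        strcmp(records[pos + 1].key, records[pos].key) == 0)
        return (pos + 1);
    return (NOT_FOUND);
}

/* Model of DB_NEXT_NODUP: skip the remaining duplicates of the current key. */
int
next_nodup(int pos)
{
    int i;

    for (i = pos + 1; i < nrecords; i++)
        if (strcmp(records[i].key, records[pos].key) != 0)
            return (i);
    return (NOT_FOUND);
}
```

Starting from the first "red" record (index 1), next_dup() moves to the second "red" record, while next_nodup() jumps straight to the first "yellow" record. From the last "red" record, next_dup() reports NOT_FOUND because the following record has a different key.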

Retrieving specific key/data pairs

DB_SET

Return the record from the database that matches the supplied key. In the case of duplicates the first duplicate is returned and the cursor is positioned at the beginning of the duplicate list. The user can then traverse the duplicate entries for the key.

DB_SET_RANGE

Return the smallest record in the database greater than or equal to the supplied key. This functionality permits partial key matches and range searches in the Btree access method.

DB_GET_BOTH

Return the record from the database that matches both the supplied key and data items. This is particularly useful when there are large numbers of duplicate records for a key, as it allows the cursor to easily be positioned at the correct place for traversal of some part of a large set of duplicate records.

DB_GET_BOTH_RANGE

If used on a database configured for sorted duplicates, this returns the smallest record in the database greater than or equal to the supplied key and data items. If used on a database that is not configured for sorted duplicates, this flag behaves identically to DB_GET_BOTH.
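To make the difference between an exact match (DB_SET) and a range match (DB_SET_RANGE) concrete, the following sketch models both lookups over a sorted array of keys. It is plain C and does not use the Berkeley DB API: set_range(), set_exact() and NOT_FOUND are hypothetical names, and strcmp() ordering is assumed as the key comparison function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "no such record" result, playing the role of DB_NOTFOUND. */
#define NOT_FOUND (-1)

/* Keys in sorted order, as a Btree would present them to a cursor. */
static const char *const keys[] = {
    "apple", "blueberry", "peach", "pear", "raspberry", "strawberry",
};
static const int nkeys = (int)(sizeof(keys) / sizeof(keys[0]));

/*
 * Model of DB_SET_RANGE: binary search for the position of the smallest
 * key that is greater than or equal to the supplied key.
 */
int
set_range(const char *key)
{
    int lo, hi, mid;

    lo = 0;
    hi = nkeys;
    while (lo < hi) {
        mid = lo + (hi - lo) / 2;
        if (strcmp(keys[mid], key) < 0)
            lo = mid + 1;   /* keys[mid] is too small; look to the right. */
        else
            hi = mid;       /* keys[mid] is a candidate; look to the left. */
    }
    return (lo < nkeys ? lo : NOT_FOUND);
}

/* Model of DB_SET: succeed only if the supplied key is present. */
int
set_exact(const char *key)
{
    int pos;

    pos = set_range(key);
    if (pos != NOT_FOUND && strcmp(keys[pos], key) == 0)
        return (pos);
    return (NOT_FOUND);
}
```

With this data, the partial key "pea" fails under set_exact() but positions set_range() on "peach", the smallest key that sorts at or after it. A traversal could then continue forward from that position to implement a range search.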

Retrieving based on record numbers

DB_SET_RECNO

If the underlying database is a Btree, and was configured so that it is possible to search it by logical record number, retrieve a specific record based on a record number argument.

DB_GET_RECNO

If the underlying database is a Btree, and was configured so that it is possible to search it by logical record number, return the record number for the record to which the cursor refers.

Special-purpose flags

DB_CONSUME

Read-and-delete: the first record (the head) of the queue is returned and deleted. The underlying database must be a Queue.

DB_RMW

Read-modify-write: acquire write locks instead of read locks during retrieval. This can enhance performance in threaded applications by reducing the chance of deadlock.

In all cases, the cursor is repositioned by a DBC->get() operation to point to the newly-returned key/data pair in the database.

The following is a code example showing a cursor walking through a database and displaying the records it contains to the standard output:

  #include <stdio.h>
  #include <string.h>
  #include <db.h>

  /* progname is assumed to be defined elsewhere by the application. */
  extern const char *progname;

  int
  display(char *database)
  {
      DB *dbp;
      DBC *dbcp;
      DBT key, data;
      int close_db, close_dbc, ret, t_ret;

      close_db = close_dbc = 0;

      /* Create the database handle. */
      if ((ret = db_create(&dbp, NULL, 0)) != 0) {
          fprintf(stderr,
              "%s: db_create: %s\n", progname, db_strerror(ret));
          return (1);
      }
      close_db = 1;

      /* Turn on additional error output. */
      dbp->set_errfile(dbp, stderr);
      dbp->set_errpfx(dbp, progname);

      /* Open the database. */
      if ((ret = dbp->open(dbp, NULL, database, NULL,
          DB_UNKNOWN, DB_RDONLY, 0)) != 0) {
          dbp->err(dbp, ret, "%s: DB->open", database);
          goto err;
      }

      /* Acquire a cursor for the database. */
      if ((ret = dbp->cursor(dbp, NULL, &dbcp, 0)) != 0) {
          dbp->err(dbp, ret, "DB->cursor");
          goto err;
      }
      close_dbc = 1;

      /* Initialize the key/data return pair. */
      memset(&key, 0, sizeof(key));
      memset(&data, 0, sizeof(data));

      /* Walk through the database and print out the key/data pairs. */
      while ((ret = dbcp->get(dbcp, &key, &data, DB_NEXT)) == 0)
          printf("%.*s : %.*s\n",
              (int)key.size, (char *)key.data,
              (int)data.size, (char *)data.data);
      if (ret == DB_NOTFOUND)
          ret = 0;    /* Reaching the end of the database is not an error. */
      else
          dbp->err(dbp, ret, "DBC->get");

  err:
      /* Close the handles, keeping the first error encountered. */
      if (close_dbc && (t_ret = dbcp->close(dbcp)) != 0) {
          dbp->err(dbp, t_ret, "DBC->close");
          if (ret == 0)
              ret = t_ret;
      }
      if (close_db && (t_ret = dbp->close(dbp, 0)) != 0) {
          fprintf(stderr,
              "%s: DB->close: %s\n", progname, db_strerror(t_ret));
          if (ret == 0)
              ret = t_ret;
      }
      return (ret == 0 ? 0 : 1);
  }

Storing records with a cursor

The DBC->put() method stores records into the database using a cursor. In general, DBC->put() takes a key and inserts the associated data into the database, at a location controlled by a specified flag.

There are several flags that you can set to customize storage:

DB_AFTER

Create a new record, immediately after the record to which the cursor refers.

DB_BEFORE

Create a new record, immediately before the record to which the cursor refers.

DB_CURRENT

Replace the data part of the record to which the cursor refers.

DB_KEYFIRST

Create a new record as the first of the duplicate records for the supplied key.

DB_KEYLAST

Create a new record, as the last of the duplicate records for the supplied key.

In all cases, the cursor is repositioned by a DBC->put() operation to point to the newly inserted key/data pair in the database.

The following is a code example showing a cursor storing two data items in a database that supports duplicate data items:

  #include <string.h>
  #include <db.h>

  int
  store(DB *dbp)
  {
      DBC *dbcp;
      DBT key, data;
      int ret, t_ret;

      /*
       * The DB handle for a Btree database supporting duplicate data
       * items is the argument; acquire a cursor for the database.
       */
      if ((ret = dbp->cursor(dbp, NULL, &dbcp, 0)) != 0) {
          dbp->err(dbp, ret, "DB->cursor");
          return (ret);   /* No cursor was opened, so nothing to close. */
      }

      /* Initialize the key. */
      memset(&key, 0, sizeof(key));
      key.data = "new key";
      key.size = (u_int32_t)strlen(key.data) + 1;

      /* Initialize the data to be the first of two duplicate records. */
      memset(&data, 0, sizeof(data));
      data.data = "new key's data: entry #1";
      data.size = (u_int32_t)strlen(data.data) + 1;

      /* Store the first of the two duplicate records. */
      if ((ret = dbcp->put(dbcp, &key, &data, DB_KEYFIRST)) != 0) {
          dbp->err(dbp, ret, "DBC->put");
          goto err;
      }

      /* Initialize the data to be the second of two duplicate records. */
      data.data = "new key's data: entry #2";
      data.size = (u_int32_t)strlen(data.data) + 1;

      /*
       * Store the second of the two duplicate records. No duplicate
       * record sort function has been specified, so we explicitly
       * store the record as the last of the duplicate set.
       */
      if ((ret = dbcp->put(dbcp, &key, &data, DB_KEYLAST)) != 0)
          dbp->err(dbp, ret, "DBC->put");

  err:
      /* Close the cursor, keeping the first error encountered. */
      if ((t_ret = dbcp->close(dbcp)) != 0) {
          dbp->err(dbp, t_ret, "DBC->close");
          if (ret == 0)
              ret = t_ret;
      }
      return (ret);
  }

Note

If you are using the Heap access method and you are creating a new record in the database, then the key that you provide to the DBC->put() method should be empty. The DBC->put() method will return the record’s ID (RID) in the key. The RID is automatically created for you when Heap database records are created.

Deleting records with a cursor

The DBC->del() method deletes records from the database using a cursor. The DBC->del() method deletes the record to which the cursor currently refers. In all cases, the cursor position is unchanged after a delete.

Duplicating a cursor

Once a cursor has been initialized (for example, by a call to DBC->get()), it can be thought of as identifying a particular location in a database. The DBC->dup() method permits an application to create a new cursor that has the same locking and transactional information as the cursor from which it is copied, and which optionally refers to the same position in the database.

In order to maintain a cursor position when an application is using locking, locks are maintained on behalf of the cursor until the cursor is closed. In cases when an application is using locking without transactions, cursor duplication is often required to avoid self-deadlocks. For further details, refer to Berkeley DB Transactional Data Store locking conventions.

Equality Join

Berkeley DB supports “equality” (also known as “natural”) joins on secondary indices. An equality join is a method of retrieving data from a primary database using criteria stored in a set of secondary indices. It requires that the data be organized as a primary database, which contains the primary key and primary data field, and a set of secondary indices. Each of the secondary indices is indexed by a different secondary key, and, for each key in a secondary index, there is a set of duplicate data items that match the primary keys in the primary database.

For example, let’s assume the need for an application that will return the names of stores in which one can buy fruit of a given color. We would first construct a primary database that lists types of fruit as the key item, and the store where you can buy them as the data item:

Primary key:    Primary data:
apple           Convenience Store
blueberry       Farmer’s Market
peach           Shopway
pear            Farmer’s Market
raspberry       Shopway
strawberry      Farmer’s Market

We would then create a secondary index with the key color, and, as the data items, the names of fruits of different colors.

Secondary key:  Secondary data:
blue            blueberry
red             apple
red             raspberry
red             strawberry
yellow          peach
yellow          pear

This secondary index would allow an application to look up a color, and then use the data items to look up the stores where the colored fruit could be purchased. For example, by first looking up blue, the data item blueberry could be used as the lookup key in the primary database, returning Farmer’s Market.

Your data must be organized in the following manner in order to use the DB->join() method:

  1. The actual data should be stored in the database represented by the DB object used to invoke this method. Generally, this DB object is called the primary.
  2. Secondary indices should be stored in separate databases, whose keys are the values of the secondary indices and whose data items are the primary keys corresponding to the records having the designated secondary key value. It is acceptable (and expected) that there may be duplicate entries in the secondary indices.

    These duplicate entries should be sorted for performance reasons, although it is not required. For more information see the DB_DUPSORT flag to the DB->set_flags() method.

The DB->join() method examines the duplicate lists for a set of secondary keys. When it finds a data item that appears as a data item for all of the secondary keys, it uses that data item as a lookup key into the primary database and returns the associated data item.

If there were another secondary index that had as its key the cost of the fruit, a similar lookup could be done on stores where inexpensive fruit could be purchased:

Secondary key:  Secondary data:
expensive       blueberry
expensive       peach
expensive       pear
expensive       strawberry
inexpensive     apple
inexpensive     pear
inexpensive     raspberry

The DB->join() method provides equality join functionality. While not strictly cursor functionality, in that it is not a method off a cursor handle, it is more closely related to the cursor operations than to the standard DB operations.

It is also possible to do lookups based on multiple criteria in a single operation. For example, it is possible to look up fruits that are both red and expensive in a single operation. If the same fruit appeared as a data item in both the color and expense indices, then that fruit name would be used as the key for retrieval from the primary index, and would then return the store where expensive, red fruit could be purchased.
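The fruit example can be worked through in code. The following is a minimal sketch of the equality join logic in plain C, using in-memory arrays in place of the primary database and the two secondary indices. It does not call DB->join(): struct entry, has_pair(), primary_lookup() and join_stores() are hypothetical names used only for illustration, and the sketch returns only the primary data (the store name) for each match.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one key/data pair; not a Berkeley DB type. */
struct entry {
    const char *key;
    const char *data;
};

/* Primary database: fruit -> store. */
static const struct entry primary[] = {
    { "apple",      "Convenience Store" },
    { "blueberry",  "Farmer's Market"   },
    { "peach",      "Shopway"           },
    { "pear",       "Farmer's Market"   },
    { "raspberry",  "Shopway"           },
    { "strawberry", "Farmer's Market"   },
};

/* Secondary index: color -> fruit (the primary key). */
static const struct entry by_color[] = {
    { "blue", "blueberry" }, { "red", "apple" }, { "red", "raspberry" },
    { "red", "strawberry" }, { "yellow", "peach" }, { "yellow", "pear" },
};

/* Secondary index: cost -> fruit (the primary key). */
static const struct entry by_cost[] = {
    { "expensive", "blueberry" }, { "expensive", "peach" },
    { "expensive", "pear" }, { "expensive", "strawberry" },
    { "inexpensive", "apple" }, { "inexpensive", "pear" },
    { "inexpensive", "raspberry" },
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Return 1 if the index holds the pair (key, data), otherwise 0. */
static int
has_pair(const struct entry *index, size_t n,
    const char *key, const char *data)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(index[i].key, key) == 0 &&
            strcmp(index[i].data, data) == 0)
            return (1);
    return (0);
}

/* Look up a fruit in the primary database; return NULL if it is absent. */
static const char *
primary_lookup(const char *fruit)
{
    size_t i;

    for (i = 0; i < COUNT(primary); i++)
        if (strcmp(primary[i].key, fruit) == 0)
            return (primary[i].data);
    return (NULL);
}

/*
 * For each fruit listed under `color`, keep it only if it is also listed
 * under `cost`, then fetch its store from the primary. Stores up to `max`
 * store names in `out` and returns the number stored.
 */
size_t
join_stores(const char *color, const char *cost,
    const char **out, size_t max)
{
    const char *store;
    size_t i, found;

    found = 0;
    for (i = 0; i < COUNT(by_color) && found < max; i++) {
        if (strcmp(by_color[i].key, color) != 0)
            continue;
        if (!has_pair(by_cost, COUNT(by_cost), cost, by_color[i].data))
            continue;
        if ((store = primary_lookup(by_color[i].data)) != NULL)
            out[found++] = store;
    }
    return (found);
}
```

Calling join_stores("red", "expensive", out, 4) finds one match: strawberry is the only fruit listed under both keys, and the primary lookup yields Farmer's Market. The linear scans keep the sketch short; they say nothing about how Berkeley DB performs the join internally.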

Example

Consider the following three databases:

personnel

  • key = SSN
  • data = record containing name, address, phone number, job title

lastname

  • key = lastname
  • data = SSN

jobs

  • key = job title
  • data = SSN

Consider the following query:

  Return the personnel records of all people named smith with the job title manager.

This query finds all the records in the primary database (personnel) for which the criteria lastname=smith and job title=manager are both true.

Assume that all databases have been properly opened and have the handles: pers_db, name_db, job_db. We also assume that we have an active transaction to which the handle txn refers.

  DBC *name_curs, *job_curs, *join_curs;
  DBC *carray[3];
  DBT key, data;
  int ret, tret;

  /* Start with every cursor unset so the cleanup code can test for NULL. */
  name_curs = NULL;
  job_curs = NULL;
  join_curs = NULL;

  memset(&key, 0, sizeof(key));
  memset(&data, 0, sizeof(data));

  /* Position the name cursor on the duplicate list for "smith". */
  if ((ret =
      name_db->cursor(name_db, txn, &name_curs, 0)) != 0)
      goto err;
  key.data = "smith";
  key.size = sizeof("smith");
  if ((ret =
      name_curs->get(name_curs, &key, &data, DB_SET)) != 0)
      goto err;

  /* Position the job cursor on the duplicate list for "manager". */
  if ((ret = job_db->cursor(job_db, txn, &job_curs, 0)) != 0)
      goto err;
  key.data = "manager";
  key.size = sizeof("manager");
  if ((ret =
      job_curs->get(job_curs, &key, &data, DB_SET)) != 0)
      goto err;

  /* Build the NULL-terminated cursor list and create the join cursor. */
  carray[0] = name_curs;
  carray[1] = job_curs;
  carray[2] = NULL;
  if ((ret =
      pers_db->join(pers_db, carray, &join_curs, 0)) != 0)
      goto err;

  while ((ret =
      join_curs->get(join_curs, &key, &data, 0)) == 0) {
      /* Process record returned in key/data. */
  }

  /*
   * If we exited the loop because we ran out of records,
   * then it has completed successfully.
   */
  if (ret == DB_NOTFOUND)
      ret = 0;

  err:
  /* Close every cursor that was opened, keeping the first error. */
  if (join_curs != NULL &&
      (tret = join_curs->close(join_curs)) != 0 && ret == 0)
      ret = tret;
  if (name_curs != NULL &&
      (tret = name_curs->close(name_curs)) != 0 && ret == 0)
      ret = tret;
  if (job_curs != NULL &&
      (tret = job_curs->close(job_curs)) != 0 && ret == 0)
      ret = tret;
  return (ret);

The name cursor is positioned at the beginning of the duplicate list for smith and the job cursor is positioned at the beginning of the duplicate list for manager. The join cursor is returned from the join method. This code then loops over the join cursor, retrieving the personnel record for each match until there are no more.

Data item count

Once a cursor has been initialized to refer to a particular key in the database, it can be used to determine the number of data items that are stored for that key. The DBC->count() method returns this number of data items. The returned value is always one, unless the database supports duplicate data items, in which case it may be any number of items.
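As an illustration of what the count represents, the following sketch models a positioned cursor as an index into a sorted array of key/data pairs and counts the records that share the key at that position. It is plain C rather than a call to DBC->count(): struct pair and count_at() are hypothetical names, and the caller is assumed to pass a valid position.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a database record; not a Berkeley DB type. */
struct pair {
    const char *key;
    const char *data;
};

/* A sorted data set in which some keys have duplicate data items. */
static const struct pair records[] = {
    { "blue",   "blueberry"  },
    { "red",    "apple"      },
    { "red",    "raspberry"  },
    { "red",    "strawberry" },
    { "yellow", "peach"      },
    { "yellow", "pear"       },
};
static const int nrecords = (int)(sizeof(records) / sizeof(records[0]));

/* Count the data items stored for the key at cursor position `pos`. */
int
count_at(int pos)
{
    int i, n;

    n = 0;
    for (i = 0; i < nrecords; i++)
        if (strcmp(records[i].key, records[pos].key) == 0)
            n++;
    return (n);
}
```

The result depends only on the key at the cursor position, not on which duplicate the cursor refers to: any position within the "red" duplicates gives 3, while the single "blue" record gives 1.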

Cursor close

The DBC->close() method closes the DBC cursor, after which the cursor may no longer be used. Although cursors are implicitly closed when the database they point to is closed, it is good programming practice to explicitly close cursors. In addition, in transactional systems, cursors may not exist outside of a transaction and so must be explicitly closed.