A Brief Introduction to Using Berkeley DB (C Interface)

谷梁振

2023-12-01

This article is reprinted from http://www.opensourceproject.org.cn/article.php?id=817

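1. Opening a Database
You must first call db_create() to initialize a DB handle; you can then open the database with the handle's open() method. By default, DB does not create the database if it does not exist. To override this default, specify the DB_CREATE flag in the open() call.
The following code shows how to open a database: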
#include <db.h>
...
DB *dbp;           /* DB structure handle */
u_int32_t flags;   /* database open flags */
int ret;           /* function return value */
/* Initialize the structure. This
* database is not opened in an environment,
* so the environment pointer is NULL. */
ret = db_create(&dbp, NULL, 0);
if (ret != 0) {
/* Error handling goes here */
}
/* Database open flags */
flags = DB_CREATE;    /* If the database does not exist,
                       * create it.*/
/* open the database */
ret = dbp->open(dbp,        /* DB structure pointer */
                NULL,       /* Transaction pointer */
                "my_db.db", /* On-disk file that holds the database. */
                NULL,       /* Optional logical database name */
                DB_BTREE,   /* Database access method */
                flags,      /* Open flags */
                0);         /* File mode (using defaults) */
if (ret != 0) {
/* Error handling goes here */
}
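In the open() arguments, DB_BTREE is one of DB's access methods; the others are DB_HASH, DB_RECNO, and DB_QUEUE.
DB stores data as key/value pairs. With DB_BTREE and DB_HASH the key can be any value, while with DB_RECNO and DB_QUEUE the key must be a logical record number. DB_BTREE suits smaller data sets, whose working set can usually be held in memory, while DB_HASH suits very large data sets. Size here refers to the data set being worked on rather than to an individual record.
DB_QUEUE and DB_RECNO identify each record by a number (the record ID). With DB_QUEUE these logical numbers do not change when records are deleted; with DB_RECNO they can change.
Besides DB_CREATE, the open flags can be an OR combination of the following:
- DB_EXCL
Exclusive create: the open fails if the database already exists. Only meaningful in combination with DB_CREATE.
- DB_RDONLY
Open the database read-only.
- DB_TRUNCATE
Truncate the on-disk database file, discarding its data.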
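2. Closing a Database
Before closing a database, make sure all cursors are closed. Closing the database automatically flushes cached data to disk. To flush cached data manually, call DB->sync().
Sample code: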
#include <db.h>
...
DB *dbp;           /* DB struct handle */
...
/*
* Database open and access operations
* happen here.
*/
...
/* When we're done with the database, close it. */
if (dbp != NULL)
    dbp->close(dbp, 0);
3. Database Management Methods
- DB->get_open_flags(): get the flags the database was opened with
The database must already be open.
#include <db.h>
...
DB *dbp;
u_int32_t open_flags;
/* Database open and subsequent operations omitted for clarity */
dbp->get_open_flags(dbp, &open_flags);
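- DB->remove(): delete a database
If the database parameter is NULL, the entire file named in the call is removed. Do not remove a database that has open handles. The handle used for the call must itself not have been opened, and it cannot be used again after the call returns.
#include <db.h>
...
DB *dbp;
/* db_create() omitted for clarity; the handle must not be opened */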
dbp->remove(dbp,                   /* Database pointer */
            "mydb.db",             /* Database file to remove */
            NULL,                  /* Database to remove. This is
                                    * NULL so the entire file is
                                    * removed. */
           0);                     /* Flags. None used. */
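- DB->rename(): rename a database
If the database parameter is NULL, the entire file named in the call is renamed. Do not rename a database that has open handles. The handle used for the call must itself not have been opened, and it cannot be used again after the call returns.
#include <db.h>
...
DB *dbp;
/* db_create() omitted for clarity; the handle must not be opened */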
dbp->rename(dbp,                    /* Database pointer */
             "mydb.db",             /* Database file to rename */
             NULL,                  /* Database to rename. This is
                                     * NULL so the entire file is
                                     * renamed. */
            "newdb.db",             /* New database file name */
            0);                     /* Flags. None used. */
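4. Error Handling Functions
- set_errcall()
When an error occurs in DB, the callback function registered with set_errcall() is invoked. The error prefix and error message are passed to the callback, which is responsible for displaying them.
- set_errfile()
Sets the C library FILE * to which error messages are written.
- set_errpfx()
Sets the prefix for DB error messages.
- err()
Generates an error message. The message is sent to the callback registered with set_errcall(); if no callback is set, it is sent to the file specified by set_errfile(); if that is not set either, it is sent to standard error (stderr).
The error message consists of the prefix string (set with set_errpfx()), an optional printf-style formatted message, the DB error message for the error value, and a trailing newline.
- errx()
Same as err(), except that the DB message associated with the error value is not appended to the error string.
In addition, db_strerror() returns the error string for a given error number directly.
For example, to send error messages to a callback function, first define the callback.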
/*
* Function called to handle any database error messages
* issued by DB.
*/
void
my_error_handler(const char *error_prefix, char *msg)
{
/*
   * Put your code to handle the error prefix and error
   * message here. Note that one or both of these parameters
   * may be NULL depending on how the error message is issued
   * and how the DB handle is configured.
   */
}
Then register it.
#include <db.h>
#include <stdio.h>
...
DB *dbp;
int ret;
/*
* Create a database and initialize it for error
* reporting.
*/
ret = db_create(&dbp, NULL, 0);
if (ret != 0) {
        fprintf(stderr, "%s: %s\n", "my_program",
          db_strerror(ret));
        return(ret);
}
/* Set up error handling for this database */
dbp->set_errcall(dbp, my_error_handler);
dbp->set_errpfx(dbp, "my_example_program");
Generate an error message:
ret = dbp->open(dbp,
                NULL,
                "mydb.db",
                NULL,
                DB_BTREE,
                DB_CREATE,
                0);
if (ret != 0) {
    dbp->err(dbp, ret,
      "Database open failed: %s", "mydb.db");
    return(ret);
}
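5. DB Environments
A DB environment encapsulates one or more databases. The usual pattern is to open the environment first and then open databases within it. Databases created or opened this way are located relative to the environment's home directory. Simple embedded uses may not need an environment, but one is required for multiple database files, multi-threaded and multi-process access, transactions, high availability (replication), and the logging subsystem.
The following code shows how to create/open a DB environment.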
#include <db.h>
...
DB_ENV *myEnv;            /* Env structure handle */
DB *dbp;                  /* DB structure handle */
u_int32_t db_flags;       /* database open flags */
u_int32_t env_flags;      /* env open flags */
int ret;                  /* function return value */
/*
   Create an environment object and initialize it for error
   reporting.
*/
ret = db_env_create(&myEnv, 0);
if (ret != 0) {
    fprintf(stderr, "Error creating env handle: %s\n", db_strerror(ret));
    return -1;
}
/* Open the environment. */
env_flags = DB_CREATE |    /* If the environment does not exist,
                            * create it. */
            DB_INIT_MPOOL; /* Initialize the in-memory cache, which
                            * databases opened in the environment need. */
ret = myEnv->open(myEnv,              /* DB_ENV ptr */
                  "/export1/testEnv", /* env home directory */
                  env_flags,          /* Open flags */
                  0);                 /* File mode (default) */
if (ret != 0) {
    fprintf(stderr, "Environment open failed: %s\n", db_strerror(ret));
    return -1;
}
Once the environment is open, you can create/open databases in it.
/*
* Initialize the DB structure. Pass the pointer
* to the environment in which this DB is opened.
*/
ret = db_create(&dbp, myEnv, 0);
if (ret != 0) {
/* Error handling goes here */
}
/* Database open flags */
db_flags = DB_CREATE;    /* If the database does not exist,
                          * create it.*/
/* open the database */
ret = dbp->open(dbp,        /* DB structure pointer */
                NULL,       /* Transaction pointer */
                "my_db.db", /* On-disk file that holds the database. */
                NULL,       /* Optional logical database name */
                DB_BTREE,   /* Database access method */
                db_flags,   /* Open flags */
                0);         /* File mode (using defaults) */
if (ret != 0) {
/* Error handling goes here */
}
Before closing the environment, you must first close the databases opened in it.
/*
* Close the database and environment
*/
if (dbp != NULL) {
    dbp->close(dbp, 0);
}
if (myEnv != NULL) {
    myEnv->close(myEnv, 0);
}
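6. DB Records
A DB record has two parts, the key and the data, each wrapped in a DBT structure. A DBT has a void * field that points to the data and a field that gives its length, so a key or data item can be any contiguous block of memory of arbitrary length.
The following code shows how to assign values to a key and a data item.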
#include <db.h>
#include <string.h>
...
DBT key, data;
float money = 122.45;
char *description = "Grocery bill.";
/* Zero out the DBTs before using them. */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
key.data = &money;
key.size = sizeof(float);
data.data = description;
data.size = strlen(description) + 1;
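To retrieve the key and data values, simply assign the void * member of each DBT to a suitable variable. The example above is a slight exception: on some systems floating-point values must be properly aligned in memory, so to be safe we supply our own memory for the float rather than using the memory DB returns. The DB_DBT_USERMEM flag makes this possible.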
#include <db.h>
#include <string.h>
...
float money;
DBT key, data;
char *description;
/* Initialize the DBTs */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
key.data = &money;
key.ulen = sizeof(float);
key.flags = DB_DBT_USERMEM;
/* Database retrieval code goes here */
/*
* Money is set into the memory that we supplied.
*/
description = data.data;
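The alignment concern can be illustrated without DB at all. The sketch below is a hypothetical helper, not part of the DB API: struct blob merely stands in for the data and size fields of a DBT, and blob_to_float() is an invented name. It copies the bytes into a properly aligned float with memcpy() instead of dereferencing a cast pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two DBT fields used here; the real DBT
 * is declared in <db.h>. */
struct blob {
    void *data;
    unsigned int size;
};

/* Copy a float out of a byte buffer without assuming the buffer is
 * aligned. Returns 0 on success, -1 if the buffer does not hold
 * exactly one float. */
int
blob_to_float(const struct blob *b, float *out)
{
    if (b == NULL || b->data == NULL || b->size != sizeof(float))
        return (-1);
    memcpy(out, b->data, sizeof(float));
    return (0);
}
```

The same pattern applies to any fixed-size value read back from a record: check the size first, then memcpy() into a local variable.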
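7. Reading and Writing Records, and Data Persistence
By default, DB does not support duplicate records (several data items stored under one key). DB->put() and DB->get() handle record reads and writes in this case. For duplicate records, use a cursor.
Records are written with DB->put(), which accepts several flags. DB_NOOVERWRITE forbids writing a key that already exists: if the key is already present, the call returns DB_KEYEXIST, even if the database supports duplicate records.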
#include <db.h>
#include <string.h>
...
char *description = "Grocery bill.";
DBT key, data;
DB *my_database;
int ret;
float money;
/* Database open omitted for clarity */
money = 122.45;
/* Zero out the DBTs before using them. */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
key.data = &money;
key.size = sizeof(float);
data.data = description;
data.size = strlen(description) +1;
ret = my_database->put(my_database, NULL, &key, &data, DB_NOOVERWRITE);
if (ret == DB_KEYEXIST) {
    my_database->err(my_database, ret,
      "Put failed because key %f already exists", money);
}
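Records are read with DB->get(). If the database supports duplicate records, it returns only the first record that matches the key. Duplicates can also be fetched with a bulk get by specifying the DB_MULTIPLE flag in the DB->get() call.
By default DB->get() returns the record that matches the supplied key. To require a match on both key and data, specify the DB_GET_BOTH flag. If no record matches, the method returns DB_NOTFOUND.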
#include <db.h>
#include <string.h>
...
DBT key, data;
DB *my_database;
float money;
char *description;
/* Database open omitted for clarity */
money = 122.45;
/* Zero out the DBTs before using them. */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
/*
* Use our own memory to retrieve the float.
* For data alignment purposes.
*/
key.data = &money;
key.size = sizeof(float);   /* length of the key being looked up */
key.ulen = sizeof(float);
key.flags = DB_DBT_USERMEM;
my_database->get(my_database, NULL, &key, &data, 0);
/*
* Money is set into the memory that we supplied.
*/
description = data.data;
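To delete a record, call DB->del(). If the key has duplicates, all records matching the key are deleted. To delete a single record from a duplicate set, consider using a cursor. To delete every record in the database, call DB->truncate().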
#include <db.h>
#include <string.h>
...
DBT key;
DB *my_database;
float money = 122.45;
/* Database open omitted for clarity */
/* Zero out the DBTs before using them. */
memset(&key, 0, sizeof(DBT));
key.data = &money;
key.size = sizeof(float);
my_database->del(my_database, NULL, &key, 0);
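By default, modifications are cached, so another process that has the database open does not see them immediately; cached data is written to disk automatically when the database is closed. Call DB->sync() to force cached data to disk, or use transactions to protect against losing cached data.
If you do not want cached data written when the database is closed, pass the DB_NOSYNC flag to DB->close().
If the system crashes and transactions were not in use, call DB->verify() to check the database. If verification fails, use the db_dump utility (its -r and -R options control how aggressively it salvages) to recover as much data as possible.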
8. DB Cursors
Cursors are managed through the DBC structure. Call DB->cursor() to open one.
#include <db.h>
...
DB *my_database;
DBC *cursorp;
/* Database open omitted for clarity */
/* Get a cursor */
my_database->cursor(my_database, NULL, &cursorp, 0);
To close a cursor, call DBC->c_close(). All cursors must be closed before the database is closed; otherwise the results are unpredictable.
#include <db.h>
...
DB *my_database;
DBC *cursorp;
/* Database and cursor open omitted for clarity */
if (cursorp != NULL)
    cursorp->c_close(cursorp);
if (my_database != NULL)
    my_database->close(my_database, 0);
To iterate over the records in a database, call DBC->c_get() with the DB_NEXT flag.
#include <db.h>
#include <string.h>
...
DB *my_database;
DBC *cursorp;
DBT key, data;
int ret;
/* Database open omitted for clarity */
/* Get a cursor */
my_database->cursor(my_database, NULL, &cursorp, 0);
/* Initialize our DBTs. */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
/* Iterate over the database, retrieving each record in turn. */
while ((ret = cursorp->c_get(cursorp, &key, &data, DB_NEXT)) == 0) {
        /* Do interesting things with the DBTs here. */
}
if (ret != DB_NOTFOUND) {
        /* Error handling goes here */
}
/* Cursors must be closed */
if (cursorp != NULL)
    cursorp->c_close(cursorp);
if (my_database != NULL)
    my_database->close(my_database, 0);
To iterate from the end of the database to the beginning, use DB_PREV instead of DB_NEXT.
#include <db.h>
#include <string.h>
...
DB *my_database;
DBC *cursorp;
DBT key, data;
int ret;
/* Database open omitted for clarity */
/* Get a cursor */
my_database->cursor(my_database, NULL, &cursorp, 0);
/* Initialize our DBTs. */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
/* Iterate over the database, retrieving each record in turn. */
while ((ret = cursorp->c_get(cursorp, &key,
      &data, DB_PREV)) == 0) {
        /* Do interesting things with the DBTs here. */
}
if (ret != DB_NOTFOUND) {
        /* Error handling goes here */
}
/* Cursors must be closed */
if (cursorp != NULL)
    cursorp->c_close(cursorp);
if (my_database != NULL)
    my_database->close(my_database, 0);
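9. Searching for Records with a Cursor
DBC->c_get() can search for a record that matches a key, or a key and data. If the database supports sorted duplicate sets, partial matches are also possible. On success, the key and data parameters are filled with the record that was found and the cursor is positioned on it. If the search fails, the cursor state is unchanged and DB_NOTFOUND is returned.
DBC->c_get() accepts the following flags:
- DB_SET
Moves the cursor to the first record that matches the key.
- DB_SET_RANGE
Identical to DB_SET unless the BTree access method is in use. With BTree, the cursor moves to the first record whose key is greater than or equal to the given key. The comparison uses the comparison function you supply, or lexicographic order by default.
For example, suppose the database stores these keys:
Alabama
Alaska
Arizona
Searching for the key Al moves the cursor to Alabama, Alas moves it to Alaska, and Ar moves it to Arizona, the last record.
- DB_GET_BOTH
Moves the cursor to the first record that matches both the key and the data.
- DB_GET_BOTH_RANGE
Similar to DB_SET_RANGE: the key is matched first, then the data. Note that the DBC->c_get() reference describes the key match as exact, with only the data matched as a range within a sorted duplicate set, so check the partial-key rows below against your DB version.
Suppose the database stores the following key/data pairs:
Alabama/Athens
Alabama/Florence
Alaska/Anchorage
Alaska/Fairbanks
Arizona/Avondale
Arizona/Florence
The search key/data and the resulting cursor position are then related as follows:
Search key   Search data   Cursor position
---------------------------------------------
Al           Fl            Alabama/Florence
Ar           Fl            Arizona/Florence
Al           Fa            Alaska/Fairbanks
Al           A             Alabama/Athens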
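The "first key greater than or equal to the search key" rule behind DB_SET_RANGE can be modelled without DB. The sketch below is only an illustration under stated assumptions: the keys sit in a sorted in-memory array, strcmp() stands in for the default lexicographic comparison, and first_key_at_or_after() is a hypothetical helper, not a DB function.

```c
#include <stddef.h>
#include <string.h>

/* Return the index of the first key in a sorted array that is greater
 * than or equal to `search`, or -1 if every key sorts before it. */
int
first_key_at_or_after(const char *const *keys, size_t nkeys,
    const char *search)
{
    size_t lo = 0, hi = nkeys;

    /* Binary search for the lower bound of `search`. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (strcmp(keys[mid], search) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < nkeys ? (int)lo : -1);
}
```

With the keys Alabama, Alaska, Arizona, searching for Al, Alas, and Ar yields indexes 0, 1, and 2, which matches the cursor positions described above; a search key that sorts after every stored key yields -1, the analogue of DB_NOTFOUND.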
Suppose a database maps US states (key) to cities (data). The following fragment positions the cursor and then lets you process the key/data pair:
#include <db.h>
#include <string.h>
...
DBC *cursorp;
DBT key, data;
DB *dbp;
int ret;
char *search_data = "Fa";
char *search_key = "Al";
/* database open omitted for clarity */
/* Get a cursor */
dbp->cursor(dbp, NULL, &cursorp, 0);
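/* Zero out the DBTs, then set up the search key and data */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));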
key.data = search_key;
key.size = strlen(search_key) + 1;
data.data = search_data;
data.size = strlen(search_data) + 1;
/*
* Position the cursor to the first record in the database whose
* key and data begin with the correct strings.
*/
ret = cursorp->c_get(cursorp, &key, &data, DB_GET_BOTH_RANGE);
if (!ret) {
    /* Do something with the data */
} else {
    /* Error handling goes here */
}
/* Close the cursor */
if (cursorp != NULL)
    cursorp->c_close(cursorp);
/* Close the database */
if (dbp != NULL)
    dbp->close(dbp, 0);
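10. Working with Duplicate Records Using Cursors
Duplicate records are different data items that share the same key. Only the BTree and Hash access methods support duplicates. Duplicates can be read with a cursor; the relevant DBC->c_get() flags are:
- DB_NEXT, DB_PREV
Gets the next/previous record in the database, whether or not it is a duplicate.
- DB_GET_BOTH_RANGE
Positions the cursor on a specific record, whether or not it is a duplicate.
- DB_NEXT_NODUP, DB_PREV_NODUP
Gets the next/previous record that is not a duplicate of the current one.
- DB_NEXT_DUP
Gets the next record that shares the current record's key. Returns DB_NOTFOUND if there is none.
For example, the following fragment positions the cursor on a record and then displays it and all of its duplicates.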
#include <db.h>
#include <stdio.h>
#include <string.h>
...
DB *dbp;
DBC *cursorp;
DBT key, data;
int ret;
char *search_key = "Al";
/* database open omitted for clarity */
/* Get a cursor */
dbp->cursor(dbp, NULL, &cursorp, 0);
/* Zero out the DBTs, then set up the search key */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
key.data = search_key;
key.size = strlen(search_key) + 1;
/*
 * Position the cursor on the first record in the database whose
 * key matches the search key.
 */
ret = cursorp->c_get(cursorp, &key, &data, DB_SET);
/* Stop on DB_NOTFOUND and on any other error. */
while (ret == 0) {
    printf("key: %s, data: %s\n", (char *)key.data, (char *)data.data);
    ret = cursorp->c_get(cursorp, &key, &data, DB_NEXT_DUP);
}
/* Close the cursor */
if (cursorp != NULL)
    cursorp->c_close(cursorp);
/* Close the database */
if (dbp != NULL)
    dbp->close(dbp, 0);
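11. Inserting Records with a Cursor
After a cursor insert, the cursor is positioned on the newly inserted record. A cursor write is transaction-protected only if the cursor was opened inside a transaction, that is, a transaction handle was passed to DB->cursor(); the examples here pass NULL, so they are not protected. The relevant DBC->c_put() flags are:
- DB_NODUPDATA
Returns DB_KEYEXIST if the key/data pair already exists. This flag applies only to databases configured for sorted duplicates, which means BTree or Hash.
- DB_KEYFIRST
If the database supports duplicates and the key already exists, the new record is inserted at the front of the duplicate list.
- DB_KEYLAST
If the database supports duplicates and the key already exists, the new record is inserted at the end of the duplicate list.
For example: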
#include <db.h>
#include <string.h>
...
DB *dbp;
DBC *cursorp;
DBT data1, data2, data3;
DBT key1, key2;
char *key1str = "My first string";
char *data1str = "My first data";
char *key2str = "A second string";
char *data2str = "My second data";
char *data3str = "My third data";
int ret;
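/* Zero out the DBTs before using them. */
memset(&key1, 0, sizeof(DBT));
memset(&key2, 0, sizeof(DBT));
memset(&data1, 0, sizeof(DBT));
memset(&data2, 0, sizeof(DBT));
memset(&data3, 0, sizeof(DBT));
/* Set up our DBTs */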
key1.data = key1str;
key1.size = strlen(key1str) + 1;
data1.data = data1str;
data1.size = strlen(data1str) + 1;
key2.data = key2str;
key2.size = strlen(key2str) + 1;
data2.data = data2str;
data2.size = strlen(data2str) + 1;
data3.data = data3str;
data3.size = strlen(data3str) + 1;
/* Database open omitted */
/* Get the cursor */
dbp->cursor(dbp, NULL, &cursorp, 0);
/*
* Assuming an empty database, this first put places
* "My first string"/"My first data" in the first
* position in the database
*/
ret = cursorp->c_put(cursorp, &key1,
&data1, DB_KEYFIRST);
/*
* This put places "A second string"/"My second data" in the
* the database according to its key sorts against the key
* used for the currently existing database record. Most likely
* this record would appear first in the database.
*/
ret = cursorp->c_put(cursorp, &key2,
&data2, DB_KEYFIRST); /* Added according to sort order */
/*
* If duplicates are not allowed, the currently existing record that
* uses "key2" is overwritten with the data provided on this put.
* That is, the record "A second string"/"My second data" becomes
* "A second string"/"My third data"
*
* If duplicates are allowed, then "My third data" is placed in the
* duplicates list according to how it sorts against "My second data".
*/
ret = cursorp->c_put(cursorp, &key2,
&data3, DB_KEYFIRST); /* If duplicates are not allowed, record
                         * is overwritten with new data. Otherwise,
                         * the record is added to the beginning of
                         * the duplicates list.
                         */
12. Deleting Records with a Cursor
Call DBC->c_del().
#include <db.h>
#include <string.h>
...
DB *dbp;
DBC *cursorp;
DBT key, data;
char *key1str = "My first string";
int ret;
/* Database open omitted */
/* Get the cursor */
dbp->cursor(dbp, NULL, &cursorp, 0);
/* Initialize our DBTs before filling them in. */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
/* Set up the search key */
key.data = key1str;
key.size = strlen(key1str) + 1;
/* Delete every record that matches the key, one at a time. */
while ((ret = cursorp->c_get(cursorp, &key,
               &data, DB_SET)) == 0) {
    cursorp->c_del(cursorp, 0);
}
/* Cursors must be closed */
if (cursorp != NULL)
    cursorp->c_close(cursorp);
if (dbp != NULL)
    dbp->close(dbp, 0);
13. Replacing Records with a Cursor
Call DBC->c_put() with the DB_CURRENT flag.
#include <db.h>
#include <string.h>
...
DB *dbp;
DBC *cursorp;
DBT key, data;
char *key1str = "My first string";
char *replacement_data = "replace me";
int ret;
/* Database open omitted */
/* Get the cursor */
dbp->cursor(dbp, NULL, &cursorp, 0);
/* Initialize our DBTs before filling them in. */
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
/* Set up the search key */
key.data = key1str;
key.size = strlen(key1str) + 1;
/* Position the cursor */
ret = cursorp->c_get(cursorp, &key, &data, DB_SET);
if (ret == 0) {
    data.data = replacement_data;
    data.size = strlen(replacement_data) + 1;
    cursorp->c_put(cursorp, &key, &data, DB_CURRENT);
}
/* Cursors must be closed */
if (cursorp != NULL)
    cursorp->c_close(cursorp);
if (dbp != NULL)
    dbp->close(dbp, 0);
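Note: if you want to replace a record that belongs to a sorted duplicate set and you are not using a custom sort function, the best approach is to delete the record and then insert the new one.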
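14. Secondary Databases
Suppose you have a user database, DBUser, whose key is the user ID and whose data holds information about the user, such as name, age, and address. Given a user ID you can find that user's information quickly. But one day your boss asks you to look up a user by name, with the ID unknown. With what we have covered so far, the only option is to scan the whole database and compare records one by one, which is a lot of work for little reward.
Secondary databases exist to meet this need. You can build a secondary database on the name field of DBUser, which amounts to creating a name-based index for DBUser. Querying the secondary database by name quickly locates the corresponding record in DBUser.
The basic steps for using a secondary database are: create the database, open it, and associate it with the primary database. You must also supply a callback function that builds the secondary key from a primary database record.
When closing, close the secondary database before the primary database. This is especially important in multi-threaded applications.
The following code shows how to open a secondary database: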
#include <db.h>
...
DB *dbp, *sdbp;    /* Primary and secondary DB handles */
u_int32_t flags;   /* Primary database open flags */
int ret;           /* Function return value */
/* Primary */
ret = db_create(&dbp, NULL, 0);
if (ret != 0) {
/* Error handling goes here */
}
/* Secondary */
ret = db_create(&sdbp, NULL, 0);
if (ret != 0) {
/* Error handling goes here */
}
/* Usually we want to support duplicates for secondary databases */
ret = sdbp->set_flags(sdbp, DB_DUPSORT);
if (ret != 0) {
/* Error handling goes here */
}

/* Database open flags */
flags = DB_CREATE;    /* If the database does not exist,
                       * create it.*/
/* open the primary database */
ret = dbp->open(dbp,        /* DB structure pointer */
                NULL,       /* Transaction pointer */
                "my_db.db", /* On-disk file that holds the database. */
                NULL,       /* Optional logical database name */
                DB_BTREE,   /* Database access method */
                flags,      /* Open flags */
                0);         /* File mode (using defaults) */
if (ret != 0) {
/* Error handling goes here */
}
/* open the secondary database */
ret = sdbp->open(sdbp,          /* DB structure pointer */
                 NULL,          /* Transaction pointer */
                 "my_secdb.db", /* On-disk file that holds the database. */
                 NULL,          /* Optional logical database name */
                 DB_BTREE,      /* Database access method */
                 flags,         /* Open flags */
                 0);            /* File mode (using defaults) */
if (ret != 0) {
/* Error handling goes here */
}
/* Now associate the secondary to the primary */
dbp->associate(dbp,            /* Primary database */
               NULL,           /* TXN id */
               sdbp,           /* Secondary database */
               get_sales_rep, /* Callback used for key creation. Not
                                * defined in this example. See the next
                                * section. */
               0);              /* Flags */
The code to close the primary and secondary databases:
/* Close the secondary before the primary */
if (sdbp != NULL)
    sdbp->close(sdbp, 0);
if (dbp != NULL)
    dbp->close(dbp, 0);
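15. Working with Secondary Databases
You must supply a callback function that builds the key for the secondary database. The key can be derived from the primary record's key or data, depending on what the application needs.
Suppose the primary database stores the following data: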
typedef struct vendor {
    char name[MAXFIELD];             /* Vendor name */
    char street[MAXFIELD];           /* Street name and number */
    char city[MAXFIELD];             /* City */
    char state[3];                   /* Two-digit US state code */
    char zipcode[6];                 /* US zipcode */
    char phone_number[13];           /* Vendor phone number */
    char sales_rep[MAXFIELD];        /* Name of sales representative */
    char sales_rep_phone[MAXFIELD]; /* Sales rep's phone number */
} VENDOR;
To query the primary database by sales_rep, the callback could be written like this:
#include <db.h>
#include <string.h>
...
int
get_sales_rep(DB *sdbp,          /* secondary db handle */
              const DBT *pkey,   /* primary db record's key */
              const DBT *pdata, /* primary db record's data */
              DBT *skey)         /* secondary db record's key */
{
    VENDOR *vendor;
    /* First, extract the structure contained in the primary's data */
    vendor = pdata->data;
    /* Now set the secondary key's data to be the representative's name */
    memset(skey, 0, sizeof(DBT));
    skey->data = vendor->sales_rep;
    skey->size = strlen(vendor->sales_rep) + 1;
    /* Return 0 to indicate that the record can be created/updated. */
    return (0);
}
If the callback returns DB_DONOTINDEX or any other non-zero value, the record is not indexed.
The association must then be established with associate():
dbp->associate(dbp,            /* Primary database */
               NULL,           /* TXN id */
               sdbp,           /* Secondary database */
               get_sales_rep, /* Callback used for key creation. */
               0);             /* Flags */
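Reading a secondary database is much like reading a primary database. The difference is that DB->get() returns the secondary key together with the primary record's data, and DB->pget() additionally returns the primary record's key.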
#include <db.h>
#include <string.h>
...
DB *my_secondary_database;
DBT key; /* Used for the search key */
DBT pkey, pdata; /* Used to return the primary key and data */
char *search_name = "John Doe";
/* Primary and secondary database opens omitted for brevity */
/* Zero out the DBTs before using them. */
memset(&key, 0, sizeof(DBT));
memset(&pkey, 0, sizeof(DBT));
memset(&pdata, 0, sizeof(DBT));
key.data = search_name;
key.size = strlen(search_name) + 1;
/* Returns the key from the secondary database, and the data from the
* associated primary database entry.
*/
my_secondary_database->get(my_secondary_database, NULL,
&key, &pdata, 0);
/* Returns the key from the secondary database, and the key and data
* from the associated primary database entry.
*/
my_secondary_database->pget(my_secondary_database, NULL,
&key, &pkey, &pdata, 0);
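Normally you do not modify a secondary database directly; DB maintains it for you. You can, however, delete a record from a secondary database directly, in which case the corresponding primary record is deleted as well. For example: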
#include <db.h>
#include <string.h>
...
DB *dbp, *sdbp;    /* Primary and secondary DB handles */
DBT key;           /* DBTs used for the delete */
int ret;           /* Function return value */
char *search_name = "John Doe"; /* Name to delete */
/* Primary */
ret = db_create(&dbp, NULL, 0);
if (ret != 0) {
/* Error handling goes here */
}
/* Secondary */
ret = db_create(&sdbp, NULL, 0);
if (ret != 0) {
/* Error handling goes here */
}
/* Usually we want to support duplicates for secondary databases */
ret = sdbp->set_flags(sdbp, DB_DUPSORT);
if (ret != 0) {
/* Error handling goes here */
}
/* open the primary database */
ret = dbp->open(dbp,        /* DB structure pointer */
                NULL,       /* Transaction pointer */
                "my_db.db", /* On-disk file that holds the database.
                             * Required. */
                NULL,       /* Optional logical database name */
                DB_BTREE,   /* Database access method */
                0,          /* Open flags */
                0);         /* File mode (using defaults) */
if (ret != 0) {
/* Error handling goes here */
}
/* open the secondary database */
ret = sdbp->open(sdbp,          /* DB structure pointer */
                 NULL,          /* Transaction pointer */
                 "my_secdb.db", /* On-disk file that holds the database.
                                 * Required. */
                 NULL,          /* Optional logical database name */
                 DB_BTREE,      /* Database access method */
                 0,             /* Open flags */
                 0);            /* File mode (using defaults) */
if (ret != 0) {
/* Error handling goes here */
}
/* Now associate the secondary to the primary */
dbp->associate(dbp,            /* Primary database */
               NULL,           /* TXN id */
               sdbp,           /* Secondary database */
               get_sales_rep, /* Callback used for key creation. */
               0);             /* Flags */
/*
* Zero out the DBT before using it.
*/
memset(&key, 0, sizeof(DBT));
key.data = search_name;
key.size = strlen(search_name) + 1;
/* Now delete the secondary record. This causes the associated primary
* record to be deleted. If any other secondary databases have secondary
* records referring to the deleted primary record, then those secondary
* records are also deleted.
*/
sdbp->del(sdbp, NULL, &key, 0);
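Secondary databases can also be searched and traversed with cursors, but you cannot call DBC->c_get() with DB_GET_BOTH or related flags; use DBC->c_pget() instead. In that case, both the primary key and the secondary key must match the key parameters given to DBC->c_pget() for the primary record to be returned.
For example, the following code searches the secondary database for a person's name and then deletes every record in the primary and secondary databases that uses that name.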
#include <db.h>
#include <string.h>
...
DB *sdbp;          /* Secondary DB handle */
DBC *cursorp;      /* Cursor */
DBT key, data;     /* DBTs used for the delete */
char *search_name = "John Doe"; /* Name to delete */
/* Primary and secondary database opens omitted for brevity. */
/* Get a cursor on the secondary database */
sdbp->cursor(sdbp, NULL, &cursorp, 0);
/*
* Zero out the DBT before using it.
*/
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
key.data = search_name;
key.size = strlen(search_name) + 1;

/* Position the cursor */
while (cursorp->c_get(cursorp, &key, &data, DB_SET) == 0)
    cursorp->c_del(cursorp, 0);
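16. Database Joins
Suppose you have an automobile database that stores each car's color, make, price, type, and so on. To make lookups easier, you have built secondary databases on color, make, and type. To find the cars that match a given color, make, and type at the same time, you need a database join.
A join is set up as follows:
- Open a cursor on each secondary database; all of them must be associated with the same primary database.
- Position each cursor on a record that satisfies its criterion, for example the color cursor on red and the make cursor on Hongqi.
- Build a NULL-terminated array of cursors and place each of the cursors above in it.
- Obtain the join cursor by calling DB->join() with the cursor array.
- Iterate over the matching records until the return value is non-zero.
- Close the join cursor.
- Close all the other cursors.
For example: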
#include <db.h>
#include <string.h>
...
DB *automotiveDB;
DB *automotiveColorDB;
DB *automotiveMakeDB;
DB *automotiveTypeDB;
DBC *color_curs, *make_curs, *type_curs, *join_curs;
DBC *carray[4];    /* three cursors plus the terminating NULL */
DBT key, data;
int ret;
char *the_color = "red";
char *the_type = "minivan";
char *the_make = "Toyota";
/* Database and secondary database opens omitted for brevity.
* Assume a primary database handle:
*   automotiveDB
* Assume 3 secondary database handles:
*   automotiveColorDB -- secondary database based on automobile color
*   automotiveMakeDB -- secondary database based on the manufacturer
*   automotiveTypeDB -- secondary database based on automobile type
*/
/* initialize pointers and structures */
color_curs = NULL;
make_curs = NULL;
type_curs = NULL;
join_curs = NULL;
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
/* open the cursors */
if (( ret =
    automotiveColorDB->cursor(automotiveColorDB, NULL,
      &color_curs, 0)) != 0) {
        /* Error handling goes here */
}
if (( ret =
    automotiveMakeDB->cursor(automotiveMakeDB, NULL,
      &make_curs, 0)) != 0) {
        /* Error handling goes here */
}
if (( ret =
    automotiveTypeDB->cursor(automotiveTypeDB, NULL,
      &type_curs, 0)) != 0) {
        /* Error handling goes here */
}
/* Position the cursors */
key.data = the_color;
key.size = strlen(the_color) + 1;
if ((ret = color_curs->c_get(color_curs, &key, &data, DB_SET)) != 0) {
    /* Error handling goes here */
}
key.data = the_make;
key.size = strlen(the_make) + 1;
if ((ret = make_curs->c_get(make_curs, &key, &data, DB_SET)) != 0) {
    /* Error handling goes here */
}
key.data = the_type;
key.size = strlen(the_type) + 1;
if ((ret = type_curs->c_get(type_curs, &key, &data, DB_SET)) != 0) {
    /* Error handling goes here */
}
/* Set up the cursor array; DB->join() expects it to end with NULL */
carray[0] = color_curs;
carray[1] = make_curs;
carray[2] = type_curs;
carray[3] = NULL;
/* Create the join */
if ((ret = automotiveDB->join(automotiveDB, carray, &join_curs, 0)) != 0) {
    /* Error handling goes here */
}
/* Iterate using the join cursor */
while ((ret = join_curs->c_get(join_curs, &key, &data, 0)) == 0) {
    /* Do interesting things with the key and data */
}
/*
* If we exited the loop because we ran out of records,
* then it has completed successfully.
*/
if (ret == DB_NOTFOUND) {
    /*
     * Close all our cursors and databases as is appropriate, and
     * then exit with a normal exit status (0).
     */
}
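Conceptually, a join returns the primary records whose keys appear in the match set of every secondary cursor. The sketch below models that idea only; it is not how DB implements DB->join(). It assumes that each secondary lookup has already produced an ascending array of integer primary keys, and intersect_sorted() is a hypothetical helper.

```c
#include <stddef.h>

/* Intersect two ascending arrays of primary keys (plain ints here).
 * Writes the common keys to `out`, which must have room for at least
 * the smaller of the two inputs, and returns how many were written. */
size_t
intersect_sorted(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, n = 0;

    while (i < na && j < nb) {
        if (a[i] < b[j])
            i++;
        else if (a[i] > b[j])
            j++;
        else {
            /* Key present in both sets: keep it. */
            out[n++] = a[i];
            i++;
            j++;
        }
    }
    return (n);
}
```

For three criteria, intersect the first two sets and then intersect the result with the third. If the red cars have keys {1, 3, 5, 8}, the Toyotas {3, 4, 5, 9}, and the minivans {2, 5, 8}, only key 5 survives both steps.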
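17. Configuring the Database - Page Size
The first thing to consider is the page size, which can range from 512 bytes to 64K. Set it with DB->set_pagesize() when the database is created; the value must be a power of two.
When a single page cannot hold a record, the remaining data is stored on another page, called an overflow page. Too many overflow pages hurt performance. The number of overflow pages can be checked with DB->stat() or with the db_stat command-line utility. For the BTree access method, a page should ideally hold 4 records.
DB locks at page granularity. If a page holds too many records, reads and writes contend for the same page lock more often, which also lowers performance. A practical approach is to start with a reasonable page size and reduce it if too much lock contention occurs.
Lock conflicts can be examined with DB_ENV->lock_stat() or with the db_stat utility. The number of locks that could not be obtained because of conflicts is stored in the st_nconflicts field.
Page size also affects the efficiency of I/O between memory and disk, particularly when the cache is too small to hold the entire working data set.
In general, first make the page size match the file system's block size. If a page cannot hold 4 records (for BTree), consider increasing it. From there, larger pages are better until lock contention becomes frequent.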
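The rule of thumb above can be sketched as a small calculation. This is a rough model, not DB code: it ignores per-page and per-record overhead, takes the file system block size and a typical record size as inputs you measure yourself, and both function names are hypothetical. The result would then be passed to DB->set_pagesize() before the database is created.

```c
#define MIN_PAGESIZE 512u     /* smallest page size, per the text above */
#define MAX_PAGESIZE 65536u   /* largest page size (64K) */

/* Return 1 if `size` is a power of two within [512, 64K], else 0. */
int
is_valid_pagesize(unsigned int size)
{
    return (size >= MIN_PAGESIZE && size <= MAX_PAGESIZE &&
        (size & (size - 1)) == 0);
}

/* Start from the file system block size and double the page until
 * roughly four records of `record_size` bytes fit, or the 64K limit
 * is reached. Returns 0 if `fs_block` is not a valid page size. */
unsigned int
suggest_pagesize(unsigned int fs_block, unsigned int record_size)
{
    unsigned int page = fs_block;

    if (!is_valid_pagesize(fs_block))
        return (0);
    while (page < MAX_PAGESIZE && page / 4 < record_size)
        page *= 2;
    return (page);
}
```

With a 4096-byte block, 100-byte records keep the page at 4096, while 3000-byte records push it to 16384. Lock contention is not modelled here; that still has to be observed with db_stat as described above.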
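18. Configuring the Database - Choosing the Cache Size
The cache size is closely tied to the number of disk I/O operations, so it also needs attention. Change it with DB->set_cachesize() or DB_ENV->set_cachesize(); the value must be a power of two. Fortunately, the cache size is set when the database is opened, so it is easy to change later.
The best way to settle on a cache size is to pick a value and run the application in the production environment. Watch the amount of disk I/O: if performance is poor, disk I/O is high, and there is memory to spare, increase the cache size.
Use db_stat with the -m option to check cache efficiency; the closer the percentage is to 100%, the better.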
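The percentage in question is the share of page requests that were served from the cache. A minimal sketch of that arithmetic follows; cache_hit_percent() is a hypothetical helper, and the assumption is that the hit and miss counts come from the memory pool statistics (for example the st_cache_hit and st_cache_miss fields reported through DB_ENV->memp_stat()).

```c
/* Percentage of page requests satisfied from the cache.
 * Returns 0.0 when there have been no requests at all. */
double
cache_hit_percent(unsigned long hits, unsigned long misses)
{
    /* Sum in floating point so large counters cannot wrap around. */
    double total = (double)hits + (double)misses;

    if (total == 0.0)
        return (0.0);
    return (100.0 * (double)hits / total);
}
```

A value that stays well below 100 under a realistic workload, combined with spare memory, is the signal described above for trying a larger cache.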
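19. BTree Configuration
By default BTree sorts keys lexicographically. Call DB->set_bt_compare() to install your own comparison function. Its prototype is:
int (*function)(DB *db, const DBT *key1, const DBT *key2)
Return a positive number if key1 > key2, 0 if key1 = key2, and a negative number otherwise.
For example, a comparison function for integer keys: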
int
compare_int(DB *dbp, const DBT *a, const DBT *b)
{
    int ai, bi;
    /*
     * Returns:
     * < 0 if a < b
     * = 0 if a = b
     * > 0 if a > b
     */
    memcpy(&ai, a->data, sizeof(int));
    memcpy(&bi, b->data, sizeof(int));
    /* Compare without subtracting, since ai - bi can overflow. */
    return ((ai > bi) - (ai < bi));
}
It is installed as follows:
#include <db.h>
#include <stdio.h>
#include <string.h>
...
DB *dbp;
int ret;
/* Create a database */
ret = db_create(&dbp, NULL, 0);
if (ret != 0) {
        fprintf(stderr, "%s: %s\n", "my_program",
          db_strerror(ret));
        return(-1);
}
/* Set up the btree comparison function for this database */
dbp->set_bt_compare(dbp, compare_int);
/* Database open call follows sometime after this. */
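BTree supports duplicate records, where several records share one key. Duplicates can be sorted or unsorted; sorted duplicates are the usual choice.
For sorted duplicates, call DB->set_dup_compare() to set the function that orders them. Its prototype is the same as that of the key comparison function, but it compares the data portion of the records.
For a database with unsorted duplicates, a record inserted with DB->put() goes at the end of the duplicate set. With a cursor, placement depends on the flag passed to DBC->c_put():
- DB_AFTER
The data parameter of DBC->c_put() is inserted as a duplicate under the key of the record the cursor currently points to; the key parameter passed to DBC->c_put() is ignored. The duplicate is inserted after the cursor's record. This flag is ignored for sorted duplicates.
- DB_BEFORE
Same as above, except that the new record is inserted before the cursor position.
- DB_KEYFIRST
If the key passed to DBC->c_put() already exists and the database supports unsorted duplicates, the new record is inserted at the front of the duplicate list.
- DB_KEYLAST
Same as above, except that the new record is inserted at the end of the duplicate list.
When creating a database, call DB->set_flags() to specify whether it supports duplicates: DB_DUP selects unsorted duplicates and DB_DUPSORT selects sorted duplicates.
The following code shows how to create a database that supports duplicate records:
#include <db.h>
#include <stdio.h>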
...
DB *dbp;
FILE *error_file_pointer;
int ret;
char *program_name = "my_prog";
char *file_name = "mydb.db";
/* Variable assignments omitted for brevity */
/* Initialize the DB handle */
ret = db_create(&dbp, NULL, 0);
if (ret != 0) {
    fprintf(error_file_pointer, "%s: %s\n", program_name,
        db_strerror(ret));
    return(ret);
}
/* Set up error handling for this database */
dbp->set_errfile(dbp, error_file_pointer);
dbp->set_errpfx(dbp, program_name);
/*
* Configure the database for sorted duplicates
*/
ret = dbp->set_flags(dbp, DB_DUPSORT);
if (ret != 0) {
    dbp->err(dbp, ret, "Attempt to set DUPSORT flag failed.");
    dbp->close(dbp, 0);
    return(ret);
}
/* Now open the database */
ret = dbp->open(dbp,        /* Pointer to the database */
                NULL,       /* Txn pointer */
                file_name, /* File name */
                NULL,       /* Logical db name (unneeded) */
                DB_BTREE,   /* Database type (using btree) */
                DB_CREATE, /* Open flags */
                0);         /* File mode. Using defaults */
if (ret != 0) {
    dbp->err(dbp, ret, "Database '%s' open failed.", file_name);
    dbp->close(dbp, 0);
    return(ret);
}
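This article is based on the Berkeley DB 4.3 Getting Started Guide and is meant only as an introduction. For fuller coverage and advanced topics, see the Berkeley DB documentation: Programmer's Tutorial and Reference Guide.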
