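In the search revamp project I'm working on, we use CLucene for full-text search. Apache accepts the user's search request, parses it, hands it to CLucene, and returns the results to the user.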
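A couple of days ago I found that the CLucene initialization done at Apache startup was failing every time. I tried a much simpler test program and it failed the same way. After rebuilding the index into a new directory, it worked. But if I then mv'ed the new index directory onto the path of the original, failing index directory, it failed again. Stepping through the code, I found that the exception is thrown while constructing the IndexSearcher.
Digging further, the cause is that the internal lock acquisition does not succeed.
The stack trace is as follows: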
#0 0x00002aaaab51bf45 in raise () from /lib64/libc.so.6
#1 0x00002aaaab51d340 in abort () from /lib64/libc.so.6
#2 0x00002aaaabb28ae4 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
#3 0x00002aaaabb26c26 in ?? () from /usr/lib64/libstdc++.so.6
#4 0x00002aaaabb26c53 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5 0x00002aaaabb26d3a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6 0x00002aaaab885841 in lucene::store::LuceneLock::obtain (this=<value optimized out>, lockWaitTimeout=<value optimized out>) at ../src/CLucene/store/Lock.cpp:18
#7 0x00002aaaab862256 in runAndReturn (directory=<value optimized out>, closeDirectory=true) at ./CLucene/store/Lock.h:81
#8 lucene::index::IndexReader::open (directory=<value optimized out>, closeDirectory=true) at ../src/CLucene/index/IndexReader.cpp:101
#9 0x00002aaaab86235c in lucene::index::IndexReader::open (path=<value optimized out>) at ../src/CLucene/index/IndexReader.cpp:74
#10 0x00002aaaab8509e1 in lucene::search::IndexSearcher::IndexSearcher (this=<value optimized out>, path=0x6aae78 "/webservice/server/search/pipi/index/defindex")
at ../src/CLucene/search/IndexSearcher.cpp:113
#11 0x00002aaaab82e4fa in CPPSearcher::Init (this=0x6abca0) at /home/ppstat/packet/trunk/PpSearch/search/PipiSearcher.cpp:91
#12 0x00002aaaab82e5af in CPPSearcher::InitPath (this=0x6abca0, _indexer_dir="/webservice/server/search/pipi/index/defindex", _cityLevel_dir="/webservice/server/search/pipi/city_config")
at /home/ppstat/packet/trunk/PpSearch/search/PipiSearcher.cpp:60
#13 0x00002aaaab831f1e in CPPSearcherManager::Init (this=0x2aaaaba4e100, _indexer_dir="/webservice/server/search/pipi/index/defindex", _cityLevel_dir=
"/webservice/server/search/pipi/city_config", _template_path="/webservice/server/search/pipi/template/pipi.tpl") at /home/ppstat/packet/trunk/PpSearch/search/PPSearcherManager.cpp:48
#14 0x00002aaaab81cb7e in InitPath (_indexer_dir="/webservice/server/search/pipi/index/defindex", _cityLevel_path="/webservice/server/search/pipi/city_config", _template_path=
"/webservice/server/search/pipi/template/pipi.tpl") at /home/ppstat/packet/trunk/PpSearch/search/api_search.cpp:21
#15 0x00002aaaab81c409 in module_init (pchild=0x65c3b8, s=0x5a5200) at mod_search.cpp:124
#16 0x0000000000436bfd in ap_run_child_init (pchild=0x65c3b8, s=0x5a5200) at config.c:153
#17 0x0000000000463aef in child_main (child_num_arg=0) at worker.c:1161
#18 0x0000000000463e56 in make_child (s=0x5a5200, slot=0) at worker.c:1306
#19 0x0000000000463f16 in startup_children (number_to_start=20) at worker.c:1375
#20 0x0000000000464d27 in ap_mpm_run (_pconf=0x59c138, plog=<value optimized out>, s=0x5a5200) at worker.c:1742
#21 0x000000000042444d in main (argc=2, argv=0x7fffffffc898) at main.c:753
(gdb) quit
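The failure ends up in lucene::store::LuceneLock::obtain. I searched around and found nothing on this, so I read the code.
It polls at a fixed interval to check whether the lock can be obtained, that is, whether the index can be loaded: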
bool LuceneLock::obtain(int64_t lockWaitTimeout) {
    bool locked = obtain();
    int maxSleepCount = (int)(lockWaitTimeout / LOCK_POLL_INTERVAL);
    int sleepCount = 0;
    while (!locked) {
        if (sleepCount++ == maxSleepCount) {
            _CLTHROWA(CL_ERR_IO,"Lock obtain timed out");
        }
        _LUCENE_SLEEP(LOCK_POLL_INTERVAL);
        locked = obtain();
    }
    return locked;
}
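To make the retry behavior concrete, here is a minimal standalone sketch of the same poll-until-timeout pattern. It is not CLucene code: `obtainWithTimeout`, `tryObtain`, `timeoutMs` and `pollMs` are hypothetical names introduced for illustration, and `std::runtime_error` stands in for the CLucene exception. Note that in the backtrace above, `__cxa_throw` leads straight into `std::terminate`, which suggests nothing on the call path caught the timeout exception.

```cpp
#include <chrono>
#include <functional>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for LuceneLock::obtain(lockWaitTimeout).
// tryObtain plays the role of the non-blocking obtain().
// Assumes pollMs > 0.
bool obtainWithTimeout(const std::function<bool()>& tryObtain,
                       long timeoutMs, long pollMs) {
    bool locked = tryObtain();                    // first attempt, no waiting
    const long maxSleepCount = timeoutMs / pollMs;
    long sleepCount = 0;
    while (!locked) {
        if (sleepCount++ == maxSleepCount) {
            // Give up once the allowed number of sleeps has been used.
            throw std::runtime_error("Lock obtain timed out");
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(pollMs));
        locked = tryObtain();                     // retry after each sleep
    }
    return locked;
}
```

With a timeout of 30 ms and a poll interval of 10 ms, the sketch makes one initial attempt plus three retries before throwing. A caller that wraps the call in try/catch can report the failure instead of letting the process abort.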
That in turn calls:
bool FSDirectory::FSLock::obtain() {
    if (disableLocks)
        return true;
    if ( !Misc::dir_Exists(lockDir) ){
        //todo: should construct directory using _mkdirs... have to write replacement
        if ( _mkdir(lockDir) == -1 ){
            char* err = _CL_NEWARRAY(char,34+strlen(lockDir)+1); //34: len of "Couldn't create lock directory: "
            strcpy(err,"Couldn't create lock directory: ");
            strcat(err,lockDir);
            _CLTHROWA_DEL(CL_ERR_IO, err );
        }
    }
    int32_t r = _open(lockFile, O_RDWR | O_CREAT | O_RANDOM | O_EXCL,
        _S_IREAD | _S_IWRITE); //must do this or file will be created Read only
    if ( r < 0 )
        return false;
    else{
        _close(r);
        return true;
    }
}
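The key part is _open. Here lockFile is a file path: the directory is /tmp and the file name is lucene-xxxxx-commit.lock. The xxxxx part appears to be derived from the index directory path, which would explain why moving a fresh index onto the old path still failed.
Because of O_CREAT | O_EXCL, _open fails when that file already exists, so obtain() returns false, the lock is never acquired, and creating the IndexSearcher fails.
I looked in /tmp and found several of these files. No wonder initialization failed at IndexSearcher no matter how I restarted.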
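The exclusive-create behavior is easy to reproduce outside CLucene. The following is a small POSIX sketch, not the library's code: `tryCreateLockFile` and `releaseLockFile` are hypothetical helpers, and the assumption that releasing the lock means deleting the file is mine, based on the fact that obtain() closes the descriptor immediately and only the file's existence carries the lock.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <string>

// Hypothetical helper mirroring the _open call in FSLock::obtain():
// returns true only if this call created the file.
bool tryCreateLockFile(const std::string& path) {
    // O_CREAT | O_EXCL makes open() fail if the file already exists.
    int fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return false;          // typically errno == EEXIST
    ::close(fd);               // the lock is the file's existence, not the fd
    return true;
}

// Assumed release step: deleting the file frees the lock.
bool releaseLockFile(const std::string& path) {
    return ::unlink(path.c_str()) == 0;
}
```

Calling `tryCreateLockFile` twice on the same path succeeds the first time and fails the second, and it keeps failing until the file is removed, even if the process that created it is long gone.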
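The likely cause is that at some point the process died while holding the lock, for example after the _open that creates the lock file but before the lock was released. The file was left behind,
and from then on every attempt to obtain the lock fails.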
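One way to recover is to look for leftover lock files before starting the server. Below is a rough POSIX sketch under stated assumptions: the helper names are hypothetical, the lucene-...-commit.lock pattern is taken from the file name seen above, and the lock directory is passed in by the caller. Deleting a lock is only safe when no other process is actually using the index.

```cpp
#include <dirent.h>
#include <unistd.h>
#include <cstddef>
#include <string>
#include <vector>

// True for names shaped like "lucene-<something>-commit.lock".
bool looksLikeCommitLock(const std::string& name) {
    const std::string prefix = "lucene-";
    const std::string suffix = "-commit.lock";
    return name.size() >= prefix.size() + suffix.size()
        && name.compare(0, prefix.size(), prefix) == 0
        && name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Lists matching lock files directly inside lockDir (non-recursive).
std::vector<std::string> findCommitLocks(const std::string& lockDir) {
    std::vector<std::string> found;
    DIR* dir = ::opendir(lockDir.c_str());
    if (dir == nullptr)
        return found;                       // unreadable directory: nothing found
    while (struct dirent* entry = ::readdir(dir)) {
        if (looksLikeCommitLock(entry->d_name))
            found.push_back(lockDir + "/" + entry->d_name);
    }
    ::closedir(dir);
    return found;
}

// Removes matching lock files and returns how many were deleted.
// Only call this when no process is using the index.
std::size_t removeCommitLocks(const std::string& lockDir) {
    std::size_t removed = 0;
    for (const std::string& path : findCommitLocks(lockDir)) {
        if (::unlink(path.c_str()) == 0)
            ++removed;
    }
    return removed;
}
```

For the setup described here, one would call `findCommitLocks("/tmp")` first to see what is there, and only then decide whether `removeCommitLocks("/tmp")` is appropriate.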
PS: I hope this helps anyone who is working with CLucene and runs into the same error.