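I recently had to rewrite an old project. Because the data set is large, search is handled by Coreseek, an extended version of Sphinx, which takes query load off the database. This post walks through installing and using Coreseek on CentOS.
1. Introduction:
Coreseek is a Chinese full-text search engine released as open source under the GPLv2 license. It is built on Sphinx and distributed independently, and it focuses on Chinese search and information processing. Typical uses include industry/vertical search, forum and site search, database search, document and literature retrieval, information retrieval, and data mining.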
2. Prerequisites:
yum install make gcc gcc-c++ libtool autoconf automake imake mysql-devel libxml2-devel expat-devel
3. Download and extract:
cd /opt
wget https://down.itbiancheng.com/uploads/soft/coreseek-4.1-beta.tar.gz
tar -xzvf coreseek-4.1-beta.tar.gz
cd coreseek-4.1-beta
4. Install mmseg:
cd mmseg-3.2.14
./bootstrap
./configure --prefix=/opt/coreseek-4.1
make
make install
make clean
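Before moving on, it can help to confirm that mmseg landed under the install prefix. The sketch below is a hypothetical helper (check_mmseg_prefix is not part of Coreseek). It checks only the paths this tutorial relies on later: the uni.lib dictionary under etc/, and the include/mmseg and lib directories passed to the csft configure script. It assumes `make install` puts uni.lib in <prefix>/etc, as the configuration in step 6 expects.

```shell
#!/bin/sh
# Hypothetical helper: report which paths used later in this tutorial
# are missing under an install prefix. This is not an exhaustive list
# of what `make install` writes.
check_mmseg_prefix() (
    prefix=$1
    missing=0
    for entry in etc/uni.lib include/mmseg lib; do
        if [ ! -e "$prefix/$entry" ]; then
            echo "missing: $prefix/$entry"
            missing=1
        fi
    done
    # Exit status 0 means every entry was found, 1 means something is missing.
    exit "$missing"
)
```

Usage: `check_mmseg_prefix /opt/coreseek-4.1` prints nothing and returns 0 when all three entries exist.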
5. Install csft:
Before building, edit configure.ac in the csft-4.1 directory. Around line 13, change
AM_INIT_AUTOMAKE([-Wall -Werror foreign])
to
AM_INIT_AUTOMAKE([-Wall foreign])
Then build and install:
cd ../csft-4.1/
./buildconf.sh
./configure --prefix=/opt/coreseek-4.1 --without-unixodbc --with-mmseg --with-mmseg-includes=/opt/coreseek-4.1/include/mmseg/ --with-mmseg-libs=/opt/coreseek-4.1/lib/ --with-mysql
make
make install
make clean
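The configure.ac change in this step can also be scripted rather than made by hand. The sketch below is one possible way to do it, not part of the Coreseek build system: strip_werror is a hypothetical function name, and it assumes the AM_INIT_AUTOMAKE call sits on a single line, as shown above. It removes -Werror only from that line and leaves a .bak copy of the original file.

```shell
#!/bin/sh
# Hypothetical helper: drop -Werror from the AM_INIT_AUTOMAKE line of the
# given configure.ac. The untouched original is kept as <file>.bak.
strip_werror() (
    file=$1
    cp "$file" "$file.bak" || exit 1
    # Only lines containing AM_INIT_AUTOMAKE are edited; other uses of
    # -Werror elsewhere in the file are left alone.
    sed '/AM_INIT_AUTOMAKE/s/ -Werror//' "$file.bak" > "$file"
)
```

Usage: run `strip_werror configure.ac` from the directory that holds the file, before `./buildconf.sh`.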
6. Configuration and usage:
(1) Configure the data source:
cd /opt/coreseek-4.1/etc
cp sphinx-min.conf.dist csft.conf
vim /opt/coreseek-4.1/etc/csft.conf
Replace its contents with the following:
source lkeyw
{
type = mysql
sql_host = localhost
sql_user = root
sql_pass = 123456
sql_db = new_itbiancheng
sql_port = 3306 # optional, default is 3306
sql_sock = /tmp/mysql.sock
sql_query_pre = SET NAMES utf8 # set the correct character set for queries
sql_query_range = SELECT MIN(id),MAX(id) FROM web_lkeyw
sql_range_step = 1000
sql_query = SELECT ID,ID as testid,title,url,updatetime FROM web_lkeyw WHERE ID BETWEEN $start AND $end;
sql_attr_uint = testid
sql_attr_timestamp = updatetime
sql_field_string = title
sql_field_string = url
#sql_query_info = SELECT ID,title,content,updatetime,url_link FROM web_lkeyw WHERE ID=$id
}
index lkeyw
{
source = lkeyw
path = /opt/coreseek/lkeyw
docinfo = extern
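# Required: directory holding the dictionary; it must contain the uni.lib dictionary file
charset_dictpath = /opt/coreseek-4.1/etc/
# Required
charset_type = zh_cn.utf-8
# Required: disables the built-in single-character (unigram) tokenization so it does not interfere with Chinese word segmentation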
ngram_len = 0
mlock = 0
morphology = none
min_word_len = 1
stopwords = /opt/coreseek/stopwords-cn.txt /opt/coreseek/stopwords-en.txt
html_strip = 1
html_remove_elements = style, script
preopen = 1
ondisk_dict = 1
inplace_enable = 1 # in-place inversion: reduces disk footprint while building the index
inplace_hit_gap = 1M
inplace_docinfo_gap = 1M
}
indexer
{
mem_limit = 256M
max_iops = 50 # cap on I/O operations per second
write_buffer = 4M # a larger write buffer means fewer disk writes
}
searchd
{
listen = 9313
log = /opt/coreseek/searchd.log
#query_log = /opt/coreseek/query.log
query_log_format = sphinxql # default plain
binlog_path = /opt/coreseek/
read_timeout = 5
max_children = 30
pid_file = /opt/coreseek/coreseek.pid
max_matches = 1000
seamless_rotate = 1
preopen_indexes = 1
unlink_old = 1
ondisk_dict_default = 1 # keep all dictionaries on disk
workers = fork # RT indexes would need workers = threads
compat_sphinxql_magics = 0
}
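Note that the configuration above writes its index, log, pid, and binlog files under /opt/coreseek, which is a different directory from the /opt/coreseek-4.1 install prefix and is not created by the install steps. The sketch below is a hypothetical helper (conf_dirs is not a Coreseek tool) that lists the directories those four settings point at, so they can be created before the first run. It assumes one `key = value` setting per line, as in the file above, and ignores lines that are commented out.

```shell
#!/bin/sh
# Hypothetical helper: print the unique directories referenced by the
# path, log, pid_file and binlog_path settings of a csft/sphinx config.
conf_dirs() (
    conf=$1
    awk -F'=' '
        /^[ \t]*#/ { next }                  # skip commented-out settings
        {
            key = $1; val = $2
            gsub(/[ \t]/, "", key)
            sub(/#.*/, "", val)              # drop trailing inline comments
            gsub(/^[ \t]+|[ \t]+$/, "", val) # trim surrounding whitespace
            if (key == "binlog_path") {
                sub("/+$", "", val)          # value is already a directory
                print val
            } else if (key == "path" || key == "log" || key == "pid_file") {
                sub("/[^/]*$", "", val)      # keep only the parent directory
                print val
            }
        }
    ' "$conf" | sort -u
)
```

Usage: `conf_dirs /opt/coreseek-4.1/etc/csft.conf | xargs mkdir -p` creates whatever is missing. The stopwords files named in the index section are plain files and still have to be supplied separately.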
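(2) Start the searchd service and build the index:
/opt/coreseek-4.1/bin/searchd -c /opt/coreseek-4.1/etc/csft.conf
/opt/coreseek-4.1/bin/indexer -c /opt/coreseek-4.1/etc/csft.conf --all --rotate
If searchd refuses to start because no index files exist yet, run the indexer once without --rotate first, then start searchd.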
(3) Test:
/opt/coreseek-4.1/bin/search -c /opt/coreseek-4.1/etc/csft.conf 微信为什么删不了好友
Coreseek Fulltext 4.1 [ Sphinx 2.0.2-dev (r2922)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)
using config file '/usr/local/coreseek-4.1/etc/csft.conf'...
index 'lkeyw': query '微信为什么删不了好友 ': returned 2 matches of 2 total in 0.024 sec
displaying matches:
1. document=1, weight=4463, testid=1, title=微信为什么删不了好友,删了又有, url=enoz4pgk3w, updatetime=Sat May 6 14:45:16 2017
2. document=4, weight=2449, testid=4, title=微信被人删好友之后发消息的提示, url=23ewkejq36, updatetime=Sat May 6 15:08:15 2017
words:
1. '微': 10 documents, 10 hits
2. '信': 10 documents, 10 hits
3. '删': 2 documents, 3 hits
4. '好友': 3 documents, 3 hits
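(4) Use from PHP:
require_once('sphinxapi.php'); // PHP client from the api/ directory of the csft source tree
$keyword = '微信为什么删不了好友'; // example query, same as in the test above
$start = 0; // result offset; 0 is the first page
$page_size = 5; // results per page
$s = new SphinxClient();
$s->SetServer('127.0.0.1', 9313); // searchd host and TCP port (the port must be an integer)
$s->SetConnectTimeout(2); // connection timeout in seconds
$s->SetMatchMode(SPH_MATCH_BOOLEAN); // matching mode for the full-text query
$s->SetLimits($start, $page_size); // offset and size of the returned result set
$s->SetSortMode(SPH_SORT_EXTENDED, "updatetime DESC, @id DESC"); // sort by attributes declared in the index
$s->SetArrayResult(true); // return matches as a plain array
$res = $s->Query($keyword, '*'); // run the query against all indexes
if ($res === false) {
    die('Query failed: ' . $s->GetLastError());
}
$res_list = isset($res['matches']) ? $res['matches'] : array();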
7. Common commands:
# Start
/opt/coreseek-4.1/bin/searchd -c /opt/coreseek-4.1/etc/csft.conf
# Stop
/opt/coreseek-4.1/bin/searchd -c /opt/coreseek-4.1/etc/csft.conf --stop
# Build the index
/opt/coreseek-4.1/bin/indexer -c /opt/coreseek-4.1/etc/csft.conf --all
# Rebuild the index while searchd is running
/opt/coreseek-4.1/bin/indexer -c /opt/coreseek-4.1/etc/csft.conf --all --rotate
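Since these four commands differ only in the binary and the trailing flags, they can be wrapped in a small helper. The sketch below is a hypothetical convenience function (coreseek_cmd is an invented name, not something Coreseek ships). It assumes the /opt/coreseek-4.1 prefix used throughout this post and takes the config file path as its second argument. It only prints the command, so the result can be reviewed before running it.

```shell
#!/bin/sh
# Hypothetical helper: print the searchd/indexer command for an action.
# Usage: coreseek_cmd {start|stop|index|reindex} /path/to/config
coreseek_cmd() {
    prefix=/opt/coreseek-4.1   # install prefix assumed from this tutorial
    case "$1" in
        start)   echo "$prefix/bin/searchd -c $2" ;;
        stop)    echo "$prefix/bin/searchd -c $2 --stop" ;;
        index)   echo "$prefix/bin/indexer -c $2 --all" ;;
        reindex) echo "$prefix/bin/indexer -c $2 --all --rotate" ;;
        *)       echo "usage: coreseek_cmd {start|stop|index|reindex} CONF" >&2
                 return 1 ;;
    esac
}
```

To execute the printed command, pipe it to a shell, for example `coreseek_cmd reindex /path/to/config | sh`.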
8. Common errors and fixes: