
Chinese-language search in Misago

秦诚
2023-12-01

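I couldn't find any material on this online, so I asked the Misago developers. They told me Misago uses the search engine built into Django 1.10, which led me to this class:
misago.threads.search.SearchThreads
It takes its search engine configuration from this setting:
MISAGO_SEARCH_CONFIG = 'simple'

The settings file has this comment above that setting: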
# Misago specific settings
# https://misago.readthedocs.io/en/latest/developers/settings.html

# PostgreSQL text search configuration to use in searches
# Defaults to "simple", for list of installed configurations run "\dF" in "psql"
# Standard configs as of PostgreSQL 9.5 are: dutch, english, finnish, french,
# german, hungarian, italian, norwegian, portuguese, romanian, russian, simple,
# spanish, swedish and turkish
# Example on adding custom language can be found here: https://github.com/lemonskyjwt/plpstgrssearch


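So I followed the link to github.com. It turns out the language has to be set up as a text search configuration in the database itself.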

 

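Update: this morning I found that the script in that Git repository doesn't work. I'm not going to dig into this any further and have emailed Django-users directly.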

Along the way I also found this:

MISAGO_SEARCH_CONFIG

PostgreSQL text search configuration to use in searches. Defaults to “simple”, for list of installed configurations run “\dF” in “psql”.

Standard configs as of PostgreSQL 9.5 are: dutch, english, finnish, french, german, hungarian, italian, norwegian, portuguese, romanian, russian, simple, spanish, swedish, turkish.

Note

Example on adding custom language can be found here.

Note

Items in Misago are usually indexed in search engine on save or update. If you change search configuration, you’ll need to rebuild search for past posts to get reindexed using new configuration. Misago comes with rebuildpostssearch tool for this purpose.



 

 

 

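What is zhparser

zhparser is a PostgreSQL extension for Chinese word segmentation. With it, PostgreSQL can support full text search over Chinese text.

Why zhparser is needed

For English and similar languages, word segmentation is fairly simple: splitting a sentence on punctuation and spaces yields meaningful words. PostgreSQL's built-in parser works on this principle. Chinese is harder. Words are not separated by spaces, their length is not fixed, and the right segmentation sometimes depends on the meaning of the sentence, so PG's built-in parser cannot be used to segment Chinese. The zhparser extension gives PG Chinese word segmentation, which in turn makes Chinese full text search in PG possible.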
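The difference can be seen with a small Python sketch. It only illustrates the general idea of splitting on whitespace; it is not how PostgreSQL's built-in parser is actually implemented, and the variable names are made up for this example.

```python
# Splitting on whitespace yields words for English text,
# but leaves a Chinese phrase as one undivided token.
english = "PostgreSQL full text search"
chinese = "南京市长江大桥"

english_tokens = english.split()
chinese_tokens = chinese.split()

print(english_tokens)  # ['PostgreSQL', 'full', 'text', 'search']
print(chinese_tokens)  # ['南京市长江大桥']
```

A search for a word inside the Chinese phrase would find nothing to match against, because the whole phrase was indexed as one token.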

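How zhparser works

zhparser implements, in C, the interfaces that a PostgreSQL TEXT SEARCH PARSER requires. Those interfaces call the SCWS Chinese word segmentation engine to do the actual segmentation.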

 

 

Installing zhparser
make fails because it cannot find various files

Fix:
# Tell gcc where to find the header files
C_INCLUDE_PATH=/usr/include/libxml2:/MyLib
export C_INCLUDE_PATH

make fails because it cannot find scws
Fix:
cp /usr/local/scws/lib/libscws.so /usr/local/lib



testdb=# create extension zhparser;
Creation fails because libscws.so.1 cannot be found
Fix:
cp /usr/local/scws-1.2/lib/libscws.so.1 /usr/local/lib
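
When the extension fails to load with a missing libscws.so.1, it can help to check whether the library is visible on the system's default search path at all. Below is a minimal Python sketch, assuming a Linux host. `library_visible` is a hypothetical helper name, and `ctypes.util.find_library` only approximates what the PostgreSQL server process will see, so treat the result as a hint rather than proof.

```python
import ctypes.util


def library_visible(name):
    """Return what find_library reports for `name` (e.g. "scws"), or None."""
    return ctypes.util.find_library(name)


found = library_visible("scws")
if found is None:
    print("libscws was not found on the default library search path")
else:
    print("found:", found)
```

If the library is still not found after copying it into /usr/local/lib, the linker cache may need refreshing with ldconfig; that is a common cause, not something the steps above confirmed.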

 

 

psql

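Create the full text search configuration

create extension zhparser;
Create a test text search configuration named testzhcfg
CREATE TEXT SEARCH CONFIGURATION testzhcfg (PARSER = zhparser);
Add token mappings (n, v, a, i, e, l are token types such as noun, verb and adjective)
ALTER TEXT SEARCH CONFIGURATION testzhcfg ADD MAPPING FOR n,v,a,i,e,l WITH simple;
Test Chinese word segmentation
select to_tsvector('testzhcfg','南京市长江大桥');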
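
The post stops at the psql test, so what follows is an untested sketch of the remaining step, not something verified here: pointing Misago at the new configuration. It assumes the configuration name testzhcfg from the commands above and the MISAGO_SEARCH_CONFIG setting quoted earlier.

```python
# settings.py (fragment)
# Assumption: "testzhcfg" is the text search configuration created in psql
# and it exists in the same database that Misago connects to.
MISAGO_SEARCH_CONFIG = "testzhcfg"
```

According to the Misago documentation quoted earlier, posts saved before this change would still need to be reindexed with the rebuildpostssearch tool.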



 

 

 
