Installing the Redmine plugin redmine_xapian

锺离刚洁
2023-12-01

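Introduction

redmine_xapian is a Redmine plugin. Its plugin page and GitHub page are redmine_xapian and xelkano/redmine_xapian, respectively.

With it, we can search the contents of files (txt, docx, pdf, ...). It can also search the contents of files in a project's repository.

Note that it cannot coexist with another plugin, redmine_full_text_search.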

Installation steps

Installing xapian-core, omega, and xapian-bindings

The install command given on GitHub is:

sudo apt install xapian-omega ruby-xapian -y 

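This has one precondition, however: Ruby must not have been installed by RVM. If Ruby was installed with RVM, xapian-core, omega, and xapian-bindings cannot be installed from apt and have to be built manually from source. The plugin's README says: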

To use the full-text search engine you must install ruby-xapian and xapian-omega packages. In case of using of Bitnami stack or Ruby installed via RVM it might be necessary to install Xapian bindings from sources. See https://xapian.org for details.
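
Which route applies depends on where the active ruby comes from. The following is a minimal sketch for checking that, assuming an RVM-managed Ruby can be recognized by "rvm" appearing in its path (as in the /usr/share/rvm/... paths that show up later in this post). The function name ruby_origin is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: classify a ruby executable path.
# Assumption: RVM-managed rubies live under a directory whose path contains "rvm".
ruby_origin() {
  case "$1" in
    "")    echo "none" ;;
    *rvm*) echo "rvm" ;;
    *)     echo "system" ;;
  esac
}

# Inspect whichever ruby comes first on PATH (empty if there is none).
ruby_origin "$(command -v ruby || true)"
```

If this prints rvm, build the three packages from source as described below; if it prints system, the apt command above should be enough.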

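The latest tarballs of these three packages can be found on the Xapian downloads page.

The installation steps for the three packages below are adapted from AWS MarketplaceでインストールしたRedmineでDMSFを使う.

Installing xapian-core

Note: xapian-core has since been updated to version 1.4.22.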

cd /tmp
wget https://oligarchy.co.uk/xapian/1.4.17/xapian-core-1.4.17.tar.xz
tar xf xapian-core-1.4.17.tar.xz
cd xapian-core-1.4.17
./configure --prefix=/opt
make -j16
sudo make install -j16
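
The commands in this post pin version 1.4.17, while the notes say 1.4.22 has been released. As a sketch, the download URL can be derived from a component name and a version, assuming other releases follow the same URL layout as the 1.4.17 links used here (this has not been checked for every version). The name xapian_tarball_url and the XAPIAN_VERSION variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the download URL for a Xapian component.
# Assumption: every release follows the same URL layout as the 1.4.17 links in this post.
xapian_tarball_url() {
  component="$1"   # core, omega, or bindings
  version="$2"     # e.g. 1.4.17
  echo "https://oligarchy.co.uk/xapian/${version}/xapian-${component}-${version}.tar.xz"
}

# Print the three URLs for the chosen version (defaults to the one used in this post).
XAPIAN_VERSION="${XAPIAN_VERSION:-1.4.17}"
for c in core omega bindings; do
  xapian_tarball_url "$c" "$XAPIAN_VERSION"
done
```

Keeping all three components on the same version is the safest choice, which is why a single variable drives all three URLs.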

Installing pcre

omega depends on pcre, so install pcre first.

cd /tmp
wget https://ftp.pcre.org/pub/pcre/pcre-8.44.tar.gz
tar zvxf pcre-8.44.tar.gz
cd pcre-8.44
./configure --prefix=/opt
make -j16
sudo make install -j16

In 2023, the newer pcre2-10.42 was used instead:

wget https://github.com/PCRE2Project/pcre2/releases/download/pcre2-10.42/pcre2-10.42.tar.gz
tar zvxf pcre2-10.42.tar.gz
cd pcre2-10.42/
./configure --prefix=/opt
make -j16
sudo make install -j16

Installing omega

Note: xapian-omega has since been updated to version 1.4.22.
omega also depends on libmagic-dev:

sudo apt install -y libmagic-dev

Then build and install omega:

cd /tmp  # or cd /home/<user_name>
wget https://oligarchy.co.uk/xapian/1.4.17/xapian-omega-1.4.17.tar.xz
tar xf xapian-omega-1.4.17.tar.xz
cd xapian-omega-1.4.17
./configure XAPIAN_CONFIG=/opt/bin/xapian-config PCRE_CONFIG=/opt/bin/pcre-config
make -j16
sudo make install -j16
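
Because omega's configure step is pointed at two config scripts under /opt/bin, it may help to confirm that both exist before building. This is only a sketch; check_exec is a hypothetical helper, and the two paths are the ones passed to ./configure above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path is an executable file.
check_exec() {
  if [ -x "$1" ] && [ ! -d "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

# The two config scripts passed to omega's ./configure above.
check_exec /opt/bin/xapian-config
check_exec /opt/bin/pcre-config
```

A "missing" line for either script suggests that the corresponding package was not installed under the /opt prefix.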

Installing xapian-bindings

Note: xapian-bindings has since been updated to version 1.4.22.

cd /tmp
wget https://oligarchy.co.uk/xapian/1.4.17/xapian-bindings-1.4.17.tar.xz
tar xf xapian-bindings-1.4.17.tar.xz
cd xapian-bindings-1.4.17
./configure XAPIAN_CONFIG=/opt/bin/xapian-config
make -j16
sudo make install -j16

It is installed to the following paths:

/usr/share/rvm/rubies/ruby-2.6.5/lib/ruby/site_ruby/2.6.0/x86_64-linux/_xapian.so
/usr/local/lib/x86_64-linux-gnu/perl/5.30.0/auto/Xapian/Xapian.so

After installation, check with the following command. If it prints nothing, the installation succeeded:

ruby -e "require 'xapian'"
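
Since the check above signals success by staying silent, a small wrapper that prints an explicit result can be easier to read in scripts. A minimal sketch, where check_ruby_lib is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: report whether the current ruby can load a library.
check_ruby_lib() {
  if ruby -e "require '$1'" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

check_ruby_lib xapian
```

Note that this also reports "missing" when no ruby is on PATH at all, so run it in the same shell (and RVM environment) that Redmine uses.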

Installing the other required packages

sudo apt install libxapian-dev xpdf poppler-utils antiword unzip catdoc libwpd-tools libwps-tools gzip unrtf catdvi djview djview3 uuid uuid-dev xz-utils libemail-outlook-message-perl -y

If sudo apt install xpdf reports:

E: Package 'xpdf' has no installation candidate

it means xpdf can no longer be installed through apt and has to be installed manually. See How to install xpdf on Ubuntu 20.04:

sudo add-apt-repository ppa:linuxuprising/libpng12
sudo apt update
sudo apt install libpng12-0
wget http://archive.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler58_0.41.0-0ubuntu1_amd64.deb
sudo apt install ./libpoppler58_0.41.0-0ubuntu1_amd64.deb
wget http://archive.ubuntu.com/ubuntu/pool/universe/x/xpdf/xpdf_3.04-1ubuntu1.1_amd64.deb
sudo apt install ./xpdf_3.04-1ubuntu1.1_amd64.deb

Note that xpdf depends on libpoppler58, and libpoppler58 depends on libpng12, so install them in the order shown above. Otherwise you will see:

The following packages have unmet dependencies:
libpoppler58 : Depends: libpng12-0 (>= 1.2.13-4) but it is not installable
E: Unable to correct problems, you have held broken packages.

The following packages have unmet dependencies:
 xpdf : Depends: libpoppler58 (>= 0.41.0) but it is not installable
 Recommends: poppler-utils
 Recommends: gsfonts-x11 but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

Also note that the latest libpoppler at the time of writing is libpoppler126_22.12.0-2ubuntu1_amd64.deb, but trying to install that version produces the following error:

The following packages have unmet dependencies:
 libpoppler126 : Depends: libc6 (>= 2.35) but 2.31-0ubuntu9.7 is to be installed
                 Depends: libstdc++6 (>= 11) but 10.3.0-1ubuntu1~20.04 is to be installed
                 Depends: libtiff6 (>= 4.0.3) but it is not installable
E: Unable to correct problems, you have held broken packages.

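This is because the apt repository currently only provides libc6 up to version 2.31, not the newer 2.35. According to How can I get glibc 2.35 on Ubuntu 20.04?, libc is a system-level library and upgrading it is best avoided, so for now the older libpoppler58 is installed instead.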
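
The dependency error boils down to a version comparison: the installed libc6 (2.31) is lower than the required 2.35. A rough sketch of that comparison follows, assuming sort supports -V as GNU coreutils does. The helper name version_ge is hypothetical, and on Debian-based systems dpkg --compare-versions is the more exact tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
# Assumption: sort supports -V (version sort), as GNU coreutils on Ubuntu does.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Installed libc6 version; empty when dpkg-query is unavailable or the package is absent.
installed="$(dpkg-query -W -f='${Version}' libc6 2>/dev/null || true)"
required="2.35"

if version_ge "$installed" "$required"; then
  echo "libc6 ${installed} satisfies >= ${required}"
else
  echo "libc6 ${installed:-unknown} is older than ${required}"
fi
```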

Installing the redmine_xapian plugin

cd redmine/plugins/
git clone https://github.com/xelkano/redmine_xapian.git
# move to the rails application "redmine"'s directory!
cd ..
bundle install
RAILS_ENV=production bundle exec rake db:migrate
RAILS_ENV=production bundle exec rake redmine:plugins:migrate NAME=redmine_xapian

Reload apache2 for the change to take effect:

systemctl reload apache2

Troubleshooting

make fails when installing omega

make fails, and the error message shows that this is because it cannot find pcre.

If xapian-omega-1.4.17 was originally under /tmp, you can move it to /home/<user_name> and try again.

make  all-recursive
make[1]: Entering directory '/tmp/xapian-omega-1.4.17'
Making all in .
make[2]: Entering directory '/tmp/xapian-omega-1.4.17'
/bin/bash ./libtool  --tag=CXX   --mode=link g++ -Wall -W -Wredundant-decls -Wpointer-arith -Wcast-qual -Wcast-align -Wformat-security -fno-gnu-keywords -Wundef -Woverloaded-virtual -Wstrict-null-sentinel -Wshadow -Wstrict-overflow=1 -Wlogical-op -Wmissing-declarations -Wdouble-promotion -Winit-self -I/opt/include -g -O2    -o omega omega.o query.o cgiparam.o utils.o configfile.o date.o cdb_init.o cdb_find.o cdb_hash.o cdb_unpack.o jsonescape.o loadfile.o datevalue.o common/str.o sample.o sort.o urlencode.o weight.o expand.o csvescape.o timegm.o -L/opt/lib -lxapian libtransform.la
libtool: link: g++ -Wall -W -Wredundant-decls -Wpointer-arith -Wcast-qual -Wcast-align -Wformat-security -fno-gnu-keywords -Wundef -Woverloaded-virtual -Wstrict-null-sentinel -Wshadow -Wstrict-overflow=1 -Wlogical-op -Wmissing-declarations -Wdouble-promotion -Winit-self -I/opt/include -g -O2 -o omega omega.o query.o cgiparam.o utils.o configfile.o date.o cdb_init.o cdb_find.o cdb_hash.o cdb_unpack.o jsonescape.o loadfile.o datevalue.o common/str.o sample.o sort.o urlencode.o weight.o expand.o csvescape.o timegm.o  -L/opt/lib /opt/lib/libxapian.so ./.libs/libtransform.a -lrt -lz -luuid -Wl,-rpath -Wl,/opt/lib -Wl,-rpath -Wl,/opt/lib
/usr/bin/ld: ./.libs/libtransform.a(libtransform_la-transform.o): in function `get_re(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)':
/tmp/xapian-omega-1.4.17/transform.cc:47: undefined reference to `pcre_compile'
/usr/bin/ld: ./.libs/libtransform.a(libtransform_la-transform.o): in function `omegascript_match(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)':
/tmp/xapian-omega-1.4.17/transform.cc:87: undefined reference to `pcre_exec'
/usr/bin/ld: ./.libs/libtransform.a(libtransform_la-transform.o): in function `omegascript_transform(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)':
/tmp/xapian-omega-1.4.17/transform.cc:131: undefined reference to `pcre_exec'
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:1124: omega] Error 1
make[2]: Leaving directory '/tmp/xapian-omega-1.4.17'
make[1]: *** [Makefile:1442: all-recursive] Error 1
make[1]: Leaving directory '/tmp/xapian-omega-1.4.17'
make: *** [Makefile:903: all] Error 2

Where to run bundle install

The following errors are all caused by running bundle install in the wrong directory. It should be run in Redmine's root directory, i.e. /home/<user_name>/redmine.
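
A quick guard can catch this mistake before bundle install runs. The sketch below assumes that a Redmine root can be recognized by containing both a Gemfile and a plugins/ directory; is_redmine_root is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a directory looks like a Redmine root.
# Assumption: a Redmine root contains both a Gemfile and a plugins/ directory.
is_redmine_root() {
  [ -f "$1/Gemfile" ] && [ -d "$1/plugins" ]
}

if is_redmine_root "$PWD"; then
  echo "OK to run bundle install in $PWD"
else
  echo "$PWD does not look like the Redmine root; cd there first"
fi
```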

Your Gemfile has no gem server sources.

bundle install fails with the following error:

Your Gemfile has no gem server sources. If you need gems that are not already on your machine, add a line like this to your Gemfile:
source 'https://rubygems.org'
Could not find concurrent-ruby-1.1.5 in any of the sources

can’t find executable rake for gem rake.

Running

RAILS_ENV=production bundle exec rake db:migrate

fails with:

Traceback (most recent call last):
        4: from /usr/share/rvm/gems/ruby-2.7.0/bin/ruby_executable_hooks:24:in `<main>'
        3: from /usr/share/rvm/gems/ruby-2.7.0/bin/ruby_executable_hooks:24:in `eval'
        2: from /usr/share/rvm/gems/ruby-2.7.0/bin/rake:23:in `<main>'
        1: from /usr/share/rvm/rubies/ruby-2.7.0/lib/ruby/2.7.0/bundler/rubygems_integration.rb:402:in `block in replace_bin_path'
/usr/share/rvm/rubies/ruby-2.7.0/lib/ruby/2.7.0/bundler/rubygems_integration.rb:374:in `block in replace_bin_path': can't find executable rake for gem rake. rake is not currently included in the bundle, perhaps you meant to add it to your Gemfile? (Gem::Exception)

LoadError: Error loading the ‘xxx’ Active Record adapter. Missing a gem it depends on? yyy is not part of the bundle. Add it to your Gemfile.

Running

RAILS_ENV=production bundle exec rake db:migrate

fails with:

rake aborted!
LoadError: Error loading the 'postgresql' Active Record adapter. Missing a gem it depends on? pg is not part of the bundle. Add it to your Gemfile.

NameError: uninitialized constant

Running

RAILS_ENV=production bundle exec rake db:migrate

fails with:

rake aborted!
NameError: uninitialized constant ChupaText

This is caused by a conflict with full_text_search.

[DEPRECATED] Your Gemfile contains multiple primary sources.

bundle install fails with:

[DEPRECATED] Your Gemfile contains multiple primary sources. Using `source` more than once without a block is a security risk, and may result in installing unexpected gems. To resolve this warning, use a block to indicate which gems should come from the secondary source. To upgrade this warning to an error, run `bundle config set --local disable_multisource true`.
Fetching source index from https://rubygems.org/

Retrying fetcher due to error (2/4): Bundler::HTTPError Could not fetch specs from https://rubygems.org/ due to underlying error <no such name (https://rubygems.org/specs.4.8.gz)>

Retrying fetcher due to error (3/4): Bundler::HTTPError Could not fetch specs from https://rubygems.org/ due to underlying error <no such name (https://rubygems.org/specs.4.8.gz)>

Retrying fetcher due to error (4/4): Bundler::HTTPError Could not fetch specs from https://rubygems.org/ due to underlying error <no such name (https://rubygems.org/specs.4.8.gz)>

Could not fetch specs from https://rubygems.org/ due to underlying error <no such name (https://rubygems.org/specs.4.8.gz)>

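This is because two gem sources conflict.

After some digging, the conflict turned out to be between the Gemfile in the Redmine root directory and plugins/full_text_search/Gemfile.

Because the full_text_search and redmine_xapian plugins are incompatible, one solution is to remove the full_text_search plugin.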
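
To see at a glance whether both plugins are installed side by side, something like the following sketch could be used. It only lists directories and removes nothing. The helper name conflicting_plugins and the default location ~/redmine are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: list which of the two conflicting plugins exist under a Redmine root.
conflicting_plugins() {
  for name in full_text_search redmine_xapian; do
    if [ -d "$1/plugins/$name" ]; then
      echo "$name"
    fi
  done
}

# Assumption: Redmine lives in ~/redmine unless REDMINE_ROOT says otherwise.
conflicting_plugins "${REDMINE_ROOT:-$HOME/redmine}"
```

If both names are printed, the two plugins are installed together and one of them has to go.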

Internal server error when searching in Redmine

When running a search, log/production.log shows the following error:

Started GET "/search?utf8=%E2%9C%93&scope=&q=abc" for 172.17.0.1 at 2020-12-25 11:20:42 +0800
Processing by SearchController#index as HTML
  Parameters: {"utf8"=>"~\~S", "scope"=>"", "q"=>"abc"}
  Current user: admin (id=1)
[full-text-search][search] {"search_id":"1608866442.145025","q":"abc","scope":"","all_words":"1","titles_only":"0","attachments":"1","open_issues":"0","offset":0,"limit":10,"order_target":"score","order_type":"desc","options":"0","issues":"1","news":"1","documents":"1","changesets":"1","wiki_pages":"1","messages":"1","projects":"1","changes":"1","tags":[],"user_id":1,"project_id":null,"n_hits":0,"total_n_hits":0,"elapsed_time":0.0001387596130371094,"timestamp":"2020-12-25T03:20:42Z"}
  Rendering plugins/redmine_xapian/app/views/search/index.html.erb within layouts/base
  Rendered plugins/redmine_xapian/app/views/search/index.html.erb within layouts/base (5.8ms)
Completed 500 Internal Server Error in 37ms (ActiveRecord: 13.6ms)

ActionView::Template::Error (undefined method `each' for nil:NilClass):
    41: <fieldset class="box">
    42: <legend><%= toggle_checkboxes_link('p#search-types input') %></legend>
    43: <p id="search-types">
    44:   <% @object_types.each do |t| %>
    45:     <label>
    46:       <%= check_box_tag t, 1, @scope.include?(t) %>
    47:       <%# Plugin change do %>

Compare this with a normal log:

Started GET "/search?utf8=%E2%9C%93&scope=&q=abc" for 172.17.0.1 at 2020-12-25 11:25:34 +0800
Processing by SearchController#index as HTML
  Parameters: {"utf8"=>"~\~S", "scope"=>"", "q"=>"abc"}
  Current user: admin (id=1)
[full-text-search][search] {"search_id":"1608866734.2071953","q":"abc","scope":"","all_words":"1","titles_only":"0","attachments":"1","open_issues":"0","offset":0,"limit":10,"order_target":"score","order_type":"desc","options":"0","issues":"1","news":"1","documents":"1","changesets":"1","wiki_pages":"1","messages":"1","projects":"1","changes":"1","tags":[],"user_id":1,"project_id":null,"n_hits":0,"total_n_hits":0,"elapsed_time":0.01783418655395508,"timestamp":"2020-12-25T03:25:34Z"}
  Rendering plugins/full_text_search/app/views/search/index.html.erb within layouts/base
  Rendered plugins/full_text_search/app/views/search/index.html.erb within layouts/base (43.3ms)
Completed 200 OK in 190ms (Views: 52.5ms | ActiveRecord: 107.0ms)

The problem persisted even after going through the 1.3 Setup step. It later turned out that disabling the other plugin, full_text_search, made search work. See https://github.com/xelkano/redmine_xapian/issues/42.

Plugin redmine_xapian was not found

When installing the plugin with:

RAILS_ENV=production bundle exec rake redmine:plugins:migrate NAME=redmine_xapian

the console shows the following error:

Plugin redmine_xapian was not found.

and log/production.log shows the following error message:

No Xapian search engine interface for Ruby installed => Full-text search won't be available.
                      Install a ruby-xapian package or an alternative Xapian binding (https://xapian.org).
Creating scope :system. Overwriting existing method Enumeration.system.

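According to Error: Plugin redmine_xapian was not found., this error occurs when Ruby was installed with RVM but xapian-core, xapian-bindings, and omega were installed with apt.

The solution is to build these three packages manually from source.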

Setting up the index

Build an index of the text in the files:

cd redmine/
XAPIAN_CJK_NGRAM=true ruby plugins/redmine_xapian/extra/xapian_indexer.rb -xv

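Note 1: Update the index after uploading new files.
Note 2: After an update, apache2 does not need to be restarted; the new index takes effect immediately.

To update the index every day, see Linux 設定 crontab 例行性工作排程教學與範例.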

crontab -e

Choose an editor and paste in the following:

@daily XAPIAN_CJK_NGRAM=true ruby plugins/redmine_xapian/extra/xapian_indexer.rb -xv

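A line such as @daily www-data XAPIAN_CJK_NGRAM=true ruby plugins/redmine_xapian/extra/xapian_indexer.rb -xv means: run ruby xxx.rb once a day as the www-data user. Note that the user field is only accepted in system crontabs such as /etc/crontab; a per-user crontab opened with crontab -e has no user field.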
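
cron typically starts jobs in the owning user's home directory, so a relative path like plugins/... only resolves if the job first changes into the Redmine root. One way to sketch such an entry is shown below. The helper indexer_cron_line is hypothetical, and the Redmine path is an assumption taken from the logs in this post.

```shell
#!/bin/sh
# Hypothetical helper: print a per-user crontab entry that changes into the
# Redmine root before running the indexer, so the relative script path resolves.
indexer_cron_line() {
  printf '@daily cd %s && XAPIAN_CJK_NGRAM=true ruby plugins/redmine_xapian/extra/xapian_indexer.rb -xv\n' "$1"
}

# Assumption: Redmine lives in /home/redmine/redmine, the path seen in the logs in this post.
indexer_cron_line /home/redmine/redmine
```

The printed line can be pasted into the editor opened by crontab -e.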

Check the result with the following command:

crontab -l

The output looks like this:

# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command

@daily ruby redmine/plugins/redmine_xapian/extra/xapian_indexer.rb -v

Troubleshooting

Can’t open Xapian database /home/redmine/redmine/file_index/english

Started GET "/search?utf8=%E2%9C%93&scope=&q=cycle" for 172.17.0.1 at 2021-04-21 11:03:59 +0800
Processing by SearchController#index as HTML
  Parameters: {"utf8"=>"_", "scope"=>"", "q"=>"cycle"}
  Current user: admin (id=1)
Can't open Xapian database /home/redmine/redmine/file_index/english - #<IOError: DatabaseOpeningError: Couldn't stat '/home/redmine/redmine/file_index/english' (No such file or directory)>
Can't open Xapian database /home/redmine/redmine/file_index/repodb - #<IOError: DatabaseOpeningError: Couldn't stat '/home/redmine/redmine/file_index/repodb' (No such file or directory)>
REDMINE_XAPIAN ERROR: Xapian database is not properly set, initiated or it's corrupted.
DatabaseOpeningError: Couldn't stat '/home/redmine/redmine/dmsf_index/english' (No such file or directory)
  Rendering plugins/redmine_xapian/app/views/search/index.html.erb within layouts/base
  Rendered plugins/redmine_xapian/app/views/search/index.html.erb within layouts/base (6.7ms)
  Rendered plugins/redmine_zenedit/app/views/zenedit/_additional_assets.html.erb (0.2ms)
Completed 200 OK in 145ms (Views: 26.7ms | ActiveRecord: 67.4ms)

The solution is to set up the index again.

/usr/bin/scriptindex does not exist, exiting…

If the following error appears when running ruby xapian_indexer.rb -v (the important part is the last line):

Trying to load Redmine environment <</home/redmine/redmine/config/environment.rb>>...
[dry-types] Dry::Types.module is deprecated and will be removed in the next major version
Use Dry.Types() instead. Beware, it exports strict types by default, for old behavior use Dry.Types(default: :nominal). See more options in the changelog
/usr/share/rvm/rubies/ruby-2.6.5/lib/ruby/site_ruby/2.6.0/bundler/runtime.rb:81:in `require'
Redmine environment [RAILS_ENV=production] correctly loaded ...
/usr/local/bin/omindex -s english --db /home/redmine/redmine/file_index/english /home/redmine/redmine/files --url / --depth-limit=0 -v
[Entering directory ""]
[Entering directory "2021/"]
[Entering directory "2021/04/"]
Indexing "2021/04/210420170136_robot.txt" as text/plain ... already indexed
Indexing "2021/04/210420170136_robot.docx" as application/vnd.openxmlformats-officedocument.wordprocessingml.document ... already indexed
Indexing "2021/04/210420170321_robot.pdf" as application/pdf ... already indexed
Indexing "delete.me" as text/plain ... already indexed
Redmine files indexed
/usr/bin/scriptindex does not exist, exiting...

it means the scriptindex executable could not be found under /usr/bin. Use which to find out where it actually is:

which scriptindex

Output:

/usr/local/bin/scriptindex

Once you know the correct path of scriptindex, edit extra/xapian_indexer.rb:

# Redmine installation directory
$redmine_root = File.expand_path('../../../../', __FILE__)

# Files location
$files = 'files'

# scriptindex binary path
$scriptindex  = '/usr/bin/scriptindex'

# omindex binary path
$omindex = '/usr/bin/omindex'

Change this line:

$scriptindex  = '/usr/bin/scriptindex'

to:

$scriptindex  = '/usr/local/bin/scriptindex'

If the following error appears next:

/usr/bin/omindex does not exist, exiting...

handle it in the same way.
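
Both paths can be looked up in one go. The sketch below prints the assignment lines to copy into extra/xapian_indexer.rb for whichever tools it finds on PATH. The helper ruby_assignment is hypothetical, and the script does not edit the file itself.

```shell
#!/bin/sh
# Hypothetical helper: print the Ruby assignment to put in extra/xapian_indexer.rb.
ruby_assignment() {
  printf "\$%s = '%s'\n" "$1" "$2"
}

# Look up each tool on PATH and print the matching assignment, or a note if it is absent.
for tool in scriptindex omindex; do
  tool_path="$(command -v "$tool" || true)"
  if [ -n "$tool_path" ]; then
    ruby_assignment "$tool" "$tool_path"
  else
    echo "# $tool not found on PATH"
  fi
done
```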

Chinese search

According to Question: How to make Japanese/Chinese characters searchable?, to make Chinese text searchable, change the indexing command to:

cd redmine/
XAPIAN_CJK_NGRAM=true ruby plugins/redmine_xapian/extra/xapian_indexer.rb -xv

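Without this setting, redmine_xapian can only search English keywords. For other approaches to Chinese search, see the following links:

Xapian实现中文搜索 (uses C++)

hightman/scws (uses C++)

用xapian跟mmseg实现中文搜索 (uses Python)

Learning resources

Xapian documentation

Xapian构建索引说明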
