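Elasticsearch offers a rich set of ranking options, most of which compute the score with TF/IDF-style relevance scoring. Sometimes, though, the business needs a custom ranking: compute the score from your own rule, then sort by that score. There are currently two ways to implement custom ranking: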
- Function Score
- Script
  - Groovy scripts
  - Native Scripts
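This article focuses on implementing custom Elasticsearch ranking as a Native Scripts plugin.
Note: Elasticsearch releases move quickly, and the implementation differs between versions, so check your Elasticsearch version when following this article. Native Scripts work normally in 5.0 through 5.4, were deprecated in 5.5, and were removed entirely in 6.0. From Elasticsearch 5.5 onward, use ScriptEngine instead.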
- Sometimes groovy and expression aren’t enough. For those times you can implement a native script.
- Native Scripts were deprecated in v5.5.0 and removed in v6.0.0. Consider migrating your native scripts to the ScriptEngine.
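Implementing a simple ranking with a ScriptPlugin:
Define a "feature" field whose scoring rule we specify ourselves. The rules are:
- If the query's feature has the same length as the document's feature, the document scores 90.
- If the query's feature is shorter than the document's feature, the document scores 60.
- If the query's feature is longer than the document's feature, the document scores 30.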
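Before wiring the rule into Elasticsearch, it can help to see it in isolation. The following is a minimal standalone sketch of the three rules above; the class name `FeatureLengthScore` and the method `score` are hypothetical and are not part of the plugin or of any Elasticsearch API.

```java
// Standalone sketch of the length-comparison rule.
// FeatureLengthScore and score(...) are illustrative names, not plugin code.
final class FeatureLengthScore {

    private FeatureLengthScore() {
    }

    /**
     * Compares the length of the query's feature with the length of the
     * document's feature and returns 90 (equal), 60 (query shorter),
     * or 30 (query longer).
     */
    static double score(String queryFeature, String docFeature) {
        int queryLen = queryFeature.length();
        int docLen = docFeature.length();
        if (queryLen == docLen) {
            return 90;
        } else if (queryLen < docLen) {
            return 60;
        } else {
            return 30;
        }
    }

    public static void main(String[] args) {
        System.out.println(score("aaaaaa", "123456"));    // equal lengths
        System.out.println(score("aaaaaa", "def789kkk")); // query is shorter
        System.out.println(score("aaaaaa", "abc"));       // query is longer
    }
}
```

Only the lengths are compared, so the actual characters of either feature never influence the score.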
Following the Native (Java) Scripts documentation, the implementation looks like this:
package com.zz.localservice.es.plugin;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.xcontent.support.XContentMapValues;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.plugins.ScriptPlugin;
import org.elasticsearch.script.AbstractDoubleSearchScript;
import org.elasticsearch.script.ExecutableScript;
import org.elasticsearch.script.NativeScriptFactory;
import java.util.Collections;
import java.util.List;
import java.util.Map;
/**
 * Plugin that registers a native script named "my_script".
 * The script scores each document by comparing the length of the "feature"
 * request parameter with the length of the document's "feature" field.
 *
 * Author: smartfly2017
 * Date: 2017/10/16 15:53
 */
public class MyNativeScriptPlugin extends Plugin implements ScriptPlugin {

    private static final Logger logger = LogManager.getLogger(MyNativeScriptPlugin.class);

    // Register the native script factory with Elasticsearch.
    @Override
    public List<NativeScriptFactory> getNativeScripts() {
        return Collections.singletonList(new MyNativeScriptFactory());
    }

    public static class MyNativeScriptFactory implements NativeScriptFactory {

        // Build a script instance from the "params" object of the request.
        @Override
        public ExecutableScript newScript(@Nullable Map<String, Object> params) {
            String feature = params == null ? null : XContentMapValues.nodeStringValue(params.get("feature"), null);
            if (feature == null) {
                logger.info("feature is null!");
            }
            return new MyNativeScript(feature);
        }

        // The script does not use the query's relevance score.
        @Override
        public boolean needsScores() {
            return false;
        }

        // Name used to reference the script in search requests.
        @Override
        public String getName() {
            return "my_script";
        }
    }

    public static class MyNativeScript extends AbstractDoubleSearchScript {

        private final String feature;

        public MyNativeScript(String feature) {
            this.feature = feature;
        }

        @Override
        public double runAsDouble() {
            Object sourceFeature = source().get("feature");
            // Guard against a missing parameter or field, which would otherwise
            // throw a NullPointerException. Returning 0 is an assumed fallback.
            if (feature == null || sourceFeature == null) {
                return 0;
            }
            int len1 = feature.length();
            int len2 = sourceFeature.toString().length();
            if (len1 == len2) {
                return 90;
            } else if (len1 < len2) {
                return 60;
            } else {
                return 30;
            }
        }
    }
}
Every Elasticsearch plugin must include a plugin-descriptor.properties file inside a folder named elasticsearch. The plugin-descriptor.properties file for this plugin is:
description=${project.description}.
version=${project.version}
name=${project.artifactId}
classname=com.zz.localservice.es.plugin.MyNativeScriptPlugin
java.version=1.8
elasticsearch.version=5.3.0
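A missing or empty descriptor entry is an easy mistake to make, so a quick pre-packaging check can be useful. The sketch below only verifies that the six keys appearing in the descriptor above are present and non-blank; the class `DescriptorCheck` and its method are hypothetical helpers, and the key list is taken from this article rather than from an authoritative list of what Elasticsearch requires.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: checks a descriptor for the keys used in this article.
final class DescriptorCheck {

    // Assumption: these are simply the six keys from the descriptor shown above.
    static final List<String> EXPECTED_KEYS = Arrays.asList(
            "description", "version", "name",
            "classname", "java.version", "elasticsearch.version");

    private DescriptorCheck() {
    }

    /** Returns the expected keys that are absent or blank in the descriptor text. */
    static List<String> missingKeys(String descriptorText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(descriptorText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String descriptor = String.join("\n",
                "description=Only for test",
                "version=1.0-SNAPSHOT",
                "name=es-sort",
                "classname=com.zz.localservice.es.plugin.MyNativeScriptPlugin",
                "java.version=1.8",
                "elasticsearch.version=5.3.0");
        System.out.println(missingKeys(descriptor)); // prints []
    }
}
```

The check runs against the filtered descriptor, that is, after Maven has substituted the `${project.*}` placeholders.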
To make sure both the descriptor and the jar end up under the elasticsearch folder, use a plugin.xml assembly descriptor configured as follows:
<?xml version="1.0"?>
<assembly>
    <id>plugin</id>
    <formats>
        <format>zip</format>
    </formats>
    <includeBaseDirectory>false</includeBaseDirectory>
    <files>
        <file>
            <source>${project.basedir}/src/main/resources/plugin-descriptor.properties</source>
            <outputDirectory>elasticsearch</outputDirectory>
            <filtered>true</filtered>
        </file>
    </files>
    <dependencySets>
        <dependencySet>
            <outputDirectory>elasticsearch</outputDirectory>
            <useProjectArtifact>true</useProjectArtifact>
            <useTransitiveFiltering>true</useTransitiveFiltering>
        </dependencySet>
    </dependencySets>
</assembly>
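The layout this assembly is meant to produce (everything under elasticsearch/) can be checked programmatically. The following is a rough sketch using only `java.util.zip`; the class `PluginZipLayout` and both of its methods are hypothetical, and the demonstration builds a tiny in-memory zip instead of reading a real build artifact.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: verifies that every zip entry sits under elasticsearch/.
final class PluginZipLayout {

    private PluginZipLayout() {
    }

    /** Returns the names of entries that are NOT under the elasticsearch/ folder. */
    static List<String> entriesOutsideRoot(InputStream zipStream) {
        List<String> stray = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().startsWith("elasticsearch/")) {
                    stray.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stray;
    }

    /** Builds a small in-memory zip with the given entry names (demo only). */
    static byte[] zipOf(String... names) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (String name : names) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(new byte[] {0});
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] good = zipOf("elasticsearch/es-sort-1.0-SNAPSHOT.jar",
                "elasticsearch/plugin-descriptor.properties");
        System.out.println(entriesOutsideRoot(new ByteArrayInputStream(good))); // prints []
    }
}
```

To check a real build, pass a `FileInputStream` for the generated zip in place of the in-memory bytes.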
For the project to compile and package correctly, the pom file must be set up properly. The pom.xml is:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.zz.localservice.ai</groupId>
    <artifactId>es-sort</artifactId>
    <version>1.0-SNAPSHOT</version>
    <name>Plugin: Basic</name>
    <description>Only for test</description>
    <properties>
        <es.version>5.3.0</es.version>
        <lucene.version>6.4.1</lucene.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${es.version}</version>
            <scope>provided</scope>
        </dependency>
        <!-- Testing -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.7</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.7</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.test</groupId>
            <artifactId>framework</artifactId>
            <version>${es.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-test-framework</artifactId>
            <version>${lucene.version}</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <resources>
            <resource>
                <directory>src/main/resources</directory>
                <filtering>false</filtering>
                <excludes>
                    <exclude>*.properties</exclude>
                </excludes>
            </resource>
        </resources>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.3</version>
                <configuration>
                    <appendAssemblyId>false</appendAssemblyId>
                    <outputDirectory>${project.build.directory}/releases/</outputDirectory>
                    <descriptors>
                        <descriptor>${basedir}/src/main/assemblies/plugin.xml</descriptor>
                    </descriptors>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.7.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
Packaging works the same way as for any Maven project. Run:
mvn clean install
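When the build finishes, the plugin is the es-sort-1.0-SNAPSHOT.zip file under target/releases/.
Once the Java plugin has been built, it must be installed. Installation works the same as for any other plugin: run bin/elasticsearch-plugin install file:///path/to/your/plugin, then restart the node so that the plugin is loaded.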
Note:
To let Elasticsearch run dynamic scripts, add the following settings to elasticsearch.yml:
script.inline: true
script.stored: true
Without these settings, the test below fails with the following error:
Failed to compile inline script [my_script] using lang [native]
To test the plugin, first create an index:
PUT my_index_test
{
  "mappings": {
    "my_type": {
      "properties": {
        "feature": {
          "type": "keyword"
        },
        "tag": {
          "type": "keyword"
        },
        "testname": {
          "type": "text"
        }
      }
    }
  }
}
Index three test documents:
PUT /my_index_test/my_type/1
{
  "feature": "abc",
  "tag": "mytagabc",
  "testname": "Hello world"
}
PUT /my_index_test/my_type/2
{
  "feature": "123456",
  "tag": "2mytagabc",
  "testname": "2Hello world"
}
PUT /my_index_test/my_type/3
{
  "feature": "def789kkk",
  "tag": "3mytagabc",
  "testname": "3Hello world"
}
Then score the documents with the native script through a function_score query:
POST /my_index_test/my_type/_search?pretty
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "inline": "my_script",
              "lang": "native",
              "params": {
                "feature": "aaaaaa"
              }
            }
          }
        }
      ]
    }
  }
}
The result is:
{
  "took" : 101,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 90.0,
    "hits" : [
      {
        "_index" : "my_index_test",
        "_type" : "my_type",
        "_id" : "2",
        "_score" : 90.0,
        "_source" : {
          "feature" : "123456",
          "tag" : "2mytagabc",
          "testname" : "2Hello world"
        }
      },
      {
        "_index" : "my_index_test",
        "_type" : "my_type",
        "_id" : "3",
        "_score" : 60.0,
        "_source" : {
          "feature" : "def789kkk",
          "tag" : "3mytagabc",
          "testname" : "3Hello world"
        }
      },
      {
        "_index" : "my_index_test",
        "_type" : "my_type",
        "_id" : "1",
        "_score" : 30.0,
        "_source" : {
          "feature" : "abc",
          "tag" : "mytagabc",
          "testname" : "Hello world"
        }
      }
    ]
  }
}
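The ordering in this response can be reproduced offline as a sanity check. Because match_all gives every document a constant score of 1.0 and function_score multiplies the query score by the function value by default, the final _score here equals the value returned by the script. The sketch below applies the same length rule to the three test documents and sorts them by descending score; the class `RankingCheck` and its methods are hypothetical and do not call Elasticsearch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical offline check of the ranking produced by the length rule.
final class RankingCheck {

    private RankingCheck() {
    }

    /** Same rule as the native script: 90 if equal length, 60 if the query is shorter, else 30. */
    static double score(String queryFeature, String docFeature) {
        int queryLen = queryFeature.length();
        int docLen = docFeature.length();
        if (queryLen == docLen) {
            return 90;
        }
        return queryLen < docLen ? 60 : 30;
    }

    /** Returns document ids ordered by descending score for the given query feature. */
    static List<String> rank(Map<String, String> featureById, String queryFeature) {
        List<String> ids = new ArrayList<>(featureById.keySet());
        ids.sort((a, b) -> Double.compare(
                score(queryFeature, featureById.get(b)),
                score(queryFeature, featureById.get(a))));
        return ids;
    }

    public static void main(String[] args) {
        // The three test documents indexed above, keyed by _id.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "abc");
        docs.put("2", "123456");
        docs.put("3", "def789kkk");
        System.out.println(rank(docs, "aaaaaa")); // prints [2, 3, 1]
    }
}
```

The printed order matches the hits above: document 2 (score 90), then document 3 (score 60), then document 1 (score 30).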