Chinese Word-Segmented Full-Text Search with ElasticSearch in Spring

益承颜
2023-12-01

1. Create the index

Every field that needs full-text search must declare an analyzer property in the mapping.

PUT /industry_index
{
  "settings": {
    "refresh_interval": "5s",
    "number_of_shards": 3,
    "number_of_replicas": 2,
    "analysis": {
      "analyzer": {
        "ik": {
          "tokenizer": "ik_max_word"
        }
      }
    }
  },
  "mappings": {
    "industryCapacity": {
      "properties": {
        "degist": {
          "type": "keyword"
        },
        "source": {
          "type": "keyword"
        },
        "status": {
          "type": "long"
        },
        "uploadTime": {
          "type": "keyword"
        },
        "title": {
          "type": "text",
          "analyzer": "ik_max_word"
        }
      }
    }
  }
}
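
A text field that declares no analyzer falls back to the default standard analyzer, which is usually not what you want for Chinese. As a rough sanity check, the Python sketch below walks a mapping's properties and lists text fields without an analyzer. The helper name text_fields_without_analyzer and the extra "summary" field are assumptions made for illustration; they are not part of the original project.

```python
def text_fields_without_analyzer(properties):
    """Return the names of "text" fields that declare no "analyzer"."""
    return sorted(
        name
        for name, spec in properties.items()
        if spec.get("type") == "text" and "analyzer" not in spec
    )


# Properties copied from the mapping above, plus one hypothetical field
# ("summary") added only to show what the check reports.
properties = {
    "degist": {"type": "keyword"},
    "source": {"type": "keyword"},
    "status": {"type": "long"},
    "uploadTime": {"type": "keyword"},
    "title": {"type": "text", "analyzer": "ik_max_word"},
    "summary": {"type": "text"},  # hypothetical, not in the original mapping
}

missing = text_fields_without_analyzer(properties)
print(missing)
```

Running a check like this before sending the PUT request catches a forgotten analyzer early, since changing the analyzer of an existing field normally means reindexing.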

2. Insert data into ES with a script

I used Python here; only the main code is shown.

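    import elasticsearch
    from elasticsearch import helpers

    es = elasticsearch.Elasticsearch("localhost", port=9200)
    ACTIONS = []  # bulk actions, one per row
    for result in datas:  # datas: rows loaded elsewhere (not shown)
        path = result[6]  # not used in this excerpt
        # Note: this excerpt targets industry_research_index; to fill the index
        # from step 1, use "industry_index" and "industryCapacity" instead.
        action = {
            "_index": "industry_research_index",
            "_type": "industryResearch",
            "_source": {"title": result[0], "source": result[1], "uploadTime": result[2],
                        "degist": result[3], "status": result[4]},
        }
        ACTIONS.append(action)
    # send all actions in one bulk request
    helpers.bulk(es, actions=ACTIONS)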
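
The loop above can also be written as a generator, which avoids holding every action in memory at once. This is a minimal sketch: the function name build_actions, the sample row, and the choice of index and type names are assumptions for illustration, and the column order is taken from the script above.

```python
def build_actions(rows, index, doc_type):
    """Yield one bulk action per row; column order follows the script above."""
    fields = ("title", "source", "uploadTime", "degist", "status")
    for row in rows:
        yield {
            "_index": index,
            "_type": doc_type,
            # zip stops after the five named fields, so extra columns are ignored
            "_source": dict(zip(fields, row)),
        }


# Made-up sample row, for illustration only.
rows = [("Steel capacity report", "example source", "2023-12-01", "short digest", 1)]
actions = list(build_actions(rows, "industry_index", "industryCapacity"))
print(actions[0]["_source"]["title"])
```

Since helpers.bulk accepts any iterable of actions, the generator can be passed to it directly, for example helpers.bulk(es, build_actions(datas, "industry_index", "industryCapacity")).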

3. Full-text search with ES in a Spring application

public Flux<IndustryCapacityEs> findIndustryCapacityByName(String name) {
    // Match query analyzed with ik_max_word; the queried field must exist
    // in the mapping as an analyzed text field
    MatchQueryBuilder queryBuilder = QueryBuilders.matchQuery("name", name).analyzer("ik_max_word");
    // 1. Create and configure the SearchSourceBuilder
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Query condition ---> generates the DSL query
    searchSourceBuilder.query(queryBuilder);
    // Offset of the first hit (0 = start of the first page)
    searchSourceBuilder.from(0);
    // Number of hits per page
    searchSourceBuilder.size(1);
    SearchRequest searchRequest = new SearchRequest();
    // Index and type the request searches
    searchRequest.indices("industry_index").types("industryCapacity");
    // Attach the SearchSourceBuilder to the request
    searchRequest.source(searchSourceBuilder);
    return doSearch(searchRequest, IndustryCapacityEs.class);
}
/**
 * Parse the results
 */
private <T> Flux<T> doSearch(SearchRequest searchRequest, Class<T> responseClazz) {
    Mono<List<T>> listMono = Mono
            // Bridge the callback-based async search into a Mono
            .<SearchResponse>create(sink -> {
                client.searchAsync(searchRequest, listenerToSink(sink));
            })
            .map(SearchResponse::getHits)
            .map(e -> {
                log.info("e.totalHits:" + e.totalHits);
                SearchHit[] hits = e.getHits();
                // Convert each hit's _source map into the response class
                return Arrays.stream(hits).map(k -> {
                    Map<String, Object> source = k.getSource();
                    return objectMapper.convertValue(source, responseClazz);
                }).collect(Collectors.toList());
            });
    // Emit the list elements one by one
    Flux<T> industryResearchFlux = listMono.flatMapMany(Flux::fromIterable);
    return industryResearchFlux;
}
/**
 * Listen for the ES response
 */
private <T> ActionListener<T> listenerToSink(MonoSink<T> sink) {
    return new ActionListener<T>() {
        @Override
        public void onResponse(T response) {
            sink.success(response);
        }

        @Override
        public void onFailure(Exception e) {
            sink.error(e);
        }
    };
}
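
For reference, the builder calls in findIndustryCapacityByName correspond roughly to the request body sketched below in Python. This is a minimal equivalent, not the exact JSON the Java builder serializes (the builder also writes out default options). The function name match_query_body, its page parameter, and the sample search text are assumptions for illustration.

```python
import json


def match_query_body(field, text, analyzer="ik_max_word", page=0, size=1):
    """Build a paginated match query; "from" is an offset, hence page * size."""
    return {
        "from": page * size,
        "size": size,
        "query": {"match": {field: {"query": text, "analyzer": analyzer}}},
    }


# Same field, analyzer, offset and page size as the Java method above;
# the search text is a made-up sample.
body = match_query_body("name", "钢铁产能")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Writing the body out this way makes it easy to paste into a console and compare results with what the Spring service returns.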

4. Delete the data in an index

ES has no command that directly empties an index; use _delete_by_query with a match_all query instead.

POST /industry_index/_delete_by_query
{  
  "query": {  
    "match_all": {}  
  }
}
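
The same request can be issued from the Python script used in step 2. The sketch below assumes a client version whose delete_by_query method accepts index and body arguments; the function name clear_index and the RecordingClient stand-in are hypothetical and exist only so the example runs without a cluster.

```python
def clear_index(es, index):
    """Delete every document in `index`; the index and its mapping stay in place."""
    body = {"query": {"match_all": {}}}
    return es.delete_by_query(index=index, body=body)


class RecordingClient:
    """Stand-in for an Elasticsearch client, so the sketch runs without a cluster."""

    def __init__(self):
        self.calls = []

    def delete_by_query(self, index, body):
        self.calls.append((index, body))
        return {"deleted": 0}  # placeholder response, not real ES output


client = RecordingClient()
clear_index(client, "industry_index")
print(client.calls)
```

With a real cluster, pass the es object created in step 2 instead of RecordingClient.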