Question:

In Spring Data Elasticsearch, can aggregation queries be put into a repository implementation?

燕朝明

2023-03-14

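This is my first time using Spring Boot with Elasticsearch. I have worked out how to express my serial differencing pipeline query with the Elasticsearch Java API. As you will see below, the query is fairly large: for each object it returns several buckets along with the serial difference between those buckets. The search examples I have seen for Spring Data repositories all seem to spell out the JSON body of the query in a @Query annotation, like this: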

@Repository
public interface SonarMetricRepository extends ElasticsearchRepository<Article, String> {

    @Query("{\"bool\": {\"must\": {\"match\": {\"authors.name\": \"?0\"}}, \"filter\": {\"term\": {\"tags\": \"?1\" }}}}")
    Page<Article> findByAuthorsNameAndFilteredTagQuery(String name, String tag, Pageable pageable);
}

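This looks elegant for basic CRUD operations, but how can I put the query below into a repository object without using the raw query syntax of @Query? A similar example showing what the model object for a serial difference query result, or for any pipeline aggregation, would look like would be even more helpful. Basically, I want a search method in my repository like this: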

Page<Serial Difference Result Object> getCodeCoverageMetrics(String projectKey, Date start, Date end, String interval, int lag);

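I should mention that part of the reason I want to use this object is that I will also have other CRUD queries here, and I believe it would handle pagination for me, which is appealing.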

Here is my query, which shows the serial difference in code coverage for a Sonar project over a one-week period:

    SerialDiffPipelineAggregationBuilder serialDiffPipelineAggregationBuilder =
            PipelineAggregatorBuilders
                    .diff("Percent_Change", "avg_coverage")
                    .lag(1);

    AvgAggregationBuilder averageCoverageAggregationBuilder = AggregationBuilders
            .avg("avg_coverage")
            .field("coverage");

    AggregationBuilder coverageHistoryAggregationBuilder = AggregationBuilders
            .dateHistogram("coverage_history")
            .field("@timestamp")
            .calendarInterval(DateHistogramInterval.WEEK)
            .subAggregation(averageCoverageAggregationBuilder)
            .subAggregation(serialDiffPipelineAggregationBuilder);

    TermsAggregationBuilder sonarProjectKeyAggregationBuilder = AggregationBuilders
            .terms("project_key")
            .field("key.keyword")
            .subAggregation(coverageHistoryAggregationBuilder);

    BoolQueryBuilder searchQuery = new BoolQueryBuilder()
            .filter(matchAllQuery())
            .filter(matchPhraseQuery("name.keyword", "my-sample-sonar-project"))
            .filter(rangeQuery("@timestamp")
                    .format("strict_date_optional_time")
                    .gte("2020-07-08T19:29:12.054Z")
                    .lte("2020-07-15T19:29:12.055Z"));

    // Join query and aggregation together
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder()
            .query(searchQuery)
            .aggregation(sonarProjectKeyAggregationBuilder);

    SearchRequest searchRequest = new SearchRequest("sonarmetrics").source(searchSourceBuilder);
    SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
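For reference, a serial difference subtracts from each bucket's value the value found `lag` buckets earlier, so the first `lag` buckets carry no difference. Despite the aggregation name "Percent_Change" used above, the result is an absolute difference, not a percentage. The following is only a plain-Java sketch of that arithmetic with made-up coverage values; the class `SerialDiffSketch` is hypothetical, does not use the Elasticsearch client, and ignores gaps in the data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Elasticsearch or Spring Data.
final class SerialDiffSketch {

    // Returns values[i] - values[i - lag] for every position i >= lag.
    // The first `lag` positions have no earlier bucket to compare with and yield null.
    static List<Double> serialDiff(List<Double> values, int lag) {
        if (lag < 1) {
            throw new IllegalArgumentException("lag must be >= 1");
        }
        List<Double> diffs = new ArrayList<>(values.size());
        for (int i = 0; i < values.size(); i++) {
            if (i < lag) {
                diffs.add(null);
            } else {
                diffs.add(values.get(i) - values.get(i - lag));
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        // Made-up weekly average coverage values.
        List<Double> weeklyAvgCoverage = List.of(70.0, 72.5, 71.0, 75.0);
        System.out.println(serialDiff(weeklyAvgCoverage, 1)); // prints [null, 2.5, -1.5, 4.0]
    }
}
```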

孟意致

2023-03-14

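OK, if I understand correctly, you want to add aggregations to a repository query. That is not possible with the methods that Spring Data Elasticsearch creates automatically, but it is not too hard to implement.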

To show how this can be done, I use a simpler example in which we define a Person entity:

@Document(indexName = "person")
public class Person {

    @Id
    @Nullable
    private Long id;

    @Field(type = FieldType.Text, fielddata = true)
    @Nullable
    private String lastName;

    @Field(type = FieldType.Text, fielddata = true)
    @Nullable
    private String firstName;

    // getter/setter
}

And a corresponding repository:

public interface PersonRepository extends ElasticsearchRepository<Person, Long>{
}

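Now we want to extend this repository so that we can search for persons with a given first name and also return the top 10 last names with their counts for those persons (a terms aggregation on lastName).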

The first thing to do is to define a custom repository interface that declares the method you need:

interface PersonCustomRepository {
    SearchPage<Person> findByFirstNameWithLastNameCounts(String firstName, Pageable pageable);
}

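We pass in a Pageable so that the method returns pages of data. We return a SearchPage object (check the documentation on return types), which contains the paging information along with a SearchHits<Person>.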

Then we change PersonRepository to extend this new interface:

public interface PersonRepository extends ElasticsearchRepository<Person, Long>, PersonCustomRepository {
}

Of course, we now need to provide an implementation in a class named PersonCustomRepositoryImpl (it must be named like the interface with Impl appended):

public class PersonCustomRepositoryImpl implements PersonCustomRepository {

    private final ElasticsearchOperations operations;

    public PersonCustomRepositoryImpl(ElasticsearchOperations operations) { // let Spring inject an operations which we use to do the work
        this.operations = operations;
    }

    @Override
    public SearchPage<Person> findByFirstNameWithLastNameCounts(String firstName, Pageable pageable) {

        Query query = new NativeSearchQueryBuilder()                       // we build an Elasticsearch native query
            .addAggregation(terms("lastNames").field("lastName").size(10)) // add the aggregation (terms is AggregationBuilders.terms, statically imported)
            .withQuery(QueryBuilders.matchQuery("firstName", firstName))   // add the query part
            .withPageable(pageable)                                        // add the requested page
            .build();

        SearchHits<Person> searchHits = operations.search(query, Person.class);  // send it off and get the result

        return SearchHitSupport.searchPageFor(searchHits, pageable);  // convert the result to a SearchPage
    }
}

That is it for the implementation of the search. The repository now has this additional method, so how do we use it?

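For this demo, I assume we have a REST controller that takes a name and returns a pair of:

the persons found, as a list of SearchHit<Person> objects

a Map<String, Long> containing the last names and their counts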

This can be implemented as follows; the comments describe what is done:

@GetMapping("persons/firstNameWithLastNameCounts/{firstName}")
public Pair<List<SearchHit<Person>>, Map<String, Long>> firstNameWithLastNameCounts(@PathVariable("firstName") String firstName) {

    // helper function to get the lastName counts from an Elasticsearch Aggregations
    // Spring Data Elasticsearch does not have functions for that, so we need to know what is coming back
    Function<Aggregations, Map<String, Long>> getLastNameCounts = aggregations -> {
        if (aggregations != null) {
            Aggregation lastNames = aggregations.get("lastNames");
            if (lastNames != null) {
                List<? extends Terms.Bucket> buckets = ((Terms) lastNames).getBuckets();
                if (buckets != null) {
                    return buckets.stream().collect(Collectors.toMap(Terms.Bucket::getKeyAsString, Terms.Bucket::getDocCount));
                }
            }
        }
        return Collections.emptyMap();
    };

    // the parts of the returned object
    Map<String, Long> lastNameCounts = null;
    List<SearchHit<Person>> searchHits = new ArrayList<>();

    // request pages of size 1000
    Pageable pageable = PageRequest.of(0, 1000);
    boolean fetchMore = true;
    while (fetchMore) {
        // call the custom method implementation
        SearchPage<Person> searchPage = personRepository.findByFirstNameWithLastNameCounts(firstName, pageable);

        // get the aggregations on the first call, will be the same on the other pages
        if (lastNameCounts == null) {
            Aggregations aggregations = searchPage.getSearchHits().getAggregations();
            lastNameCounts = getLastNameCounts.apply(aggregations);
        }

        // collect the returned data
        if (searchPage.hasContent()) {
            searchHits.addAll(searchPage.getContent());
        }

        pageable = searchPage.nextPageable();
        fetchMore = searchPage.hasNext();
    }

    // return the collected stuff
    return Pair.of(searchHits, lastNameCounts);
}
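One detail about the helper function above: a terms aggregation returns its buckets ordered by document count (descending by default), but the two-argument Collectors.toMap makes no guarantee about the iteration order of the map it builds, so the "top 10" order may be lost. If the order matters, the four-argument overload with a LinkedHashMap keeps it. The sketch below shows this in plain Java; the nested `Bucket` class is a hypothetical stand-in for Terms.Bucket, reduced to the two getters used above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of Elasticsearch or Spring Data.
final class BucketCounts {

    // Hypothetical stand-in for Terms.Bucket with only the two values used here.
    static final class Bucket {
        private final String key;
        private final long docCount;

        Bucket(String key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }

        String getKeyAsString() { return key; }

        long getDocCount() { return docCount; }
    }

    // Collects the buckets into a map that iterates in the order the buckets were returned.
    static Map<String, Long> toOrderedCounts(List<Bucket> buckets) {
        return buckets.stream().collect(Collectors.toMap(
                Bucket::getKeyAsString,
                Bucket::getDocCount,
                (first, second) -> first, // bucket keys should be unique; keep the first if they are not
                LinkedHashMap::new));
    }
}
```

In the controller, the same four-argument call could replace the existing Collectors.toMap(Terms.Bucket::getKeyAsString, Terms.Bucket::getDocCount) expression.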

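I hope this gives you an idea of how to implement custom repository functions and add functionality that is not provided out of the box.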
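Coming back to the serial difference query from the question: a result model could mirror the nesting of the aggregations, with one entry per project key that holds its weekly buckets. Because the aggregations are the same on every page of hits (see the comment in the controller above), paging over the per-project results would have to be done on the collected list rather than by the Pageable of the search. The following is a minimal plain-Java sketch under those assumptions; the names `CoverageBucket`, `ProjectCoverage`, and `pageOf` are hypothetical and not part of Spring Data Elasticsearch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model for one date histogram bucket of the question's query.
final class CoverageBucket {
    final String weekStart;     // bucket key as a string
    final Double avgCoverage;   // value of the "avg_coverage" sub-aggregation, may be null
    final Double percentChange; // value of "Percent_Change", null where no difference exists

    CoverageBucket(String weekStart, Double avgCoverage, Double percentChange) {
        this.weekStart = weekStart;
        this.avgCoverage = avgCoverage;
        this.percentChange = percentChange;
    }
}

// Hypothetical model for one bucket of the "project_key" terms aggregation.
final class ProjectCoverage {
    final String projectKey;
    final List<CoverageBucket> history;

    ProjectCoverage(String projectKey, List<CoverageBucket> history) {
        this.projectKey = projectKey;
        this.history = Collections.unmodifiableList(new ArrayList<>(history));
    }
}

// Hypothetical helper that slices an already collected result list into pages.
final class AggregationPaging {

    // Returns the zero-based page `pageNumber` of size `pageSize`, or an empty list past the end.
    static <T> List<T> pageOf(List<T> all, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize > 0");
        }
        long from = (long) pageNumber * pageSize;
        if (from >= all.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(all.size(), from + pageSize);
        return new ArrayList<>(all.subList((int) from, to));
    }
}
```

A custom repository method implemented as shown above could fill such objects from the returned Aggregations and hand back the requested slice.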
