Question:

Elasticsearch: distinct documents using a function_score query

阎彬炳

2023-03-14

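I have an index in Elasticsearch. Its documents contain duplicate field values. In the query results I need to remove all duplicates and get only distinct values. For example:

PUT localhost:9200/person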

{
    "mappings" : {
        "person" : {
            "properties" : {
                "name" : { "type" : "keyword" }
            }
        }
    }
}

POST localhost:9200/person/person

{
    "name": "John"
}

{
    "name": "John"
}

{
    "name": "Marry"
}

{
    "name": "Tomas"
}

I tried to remove the duplicates with a terms aggregation on the "name" field, but it does not work.

GET localhost:9200/person/person/_search

{
  "size": 3,
  "query": {
    "function_score": {
      "functions": [
        {
          "random_score": {
            "seed": "dasdfdLBpnM0"
          }
        }
      ]
    }
  },
  "aggs": {
    "top-names": {
      "terms": {
        "field": "name",
        "size": 3
      },
      "aggs": {
        "top_names_hits": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}

Result:

{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 10,
        "max_score": 0.9506482,
        "hits": [
            {
                "_index": "person",
                "_type": "person",
                "_id": "H-5D8GoB8pRyckNSVUeN",
                "_score": 0.9506482,
                "_source": {
                    "name": "Tomas"
                }
            },
            {
                "_index": "person",
                "_type": "person",
                "_id": "He5D8GoB8pRyckNSPEfa",
                "_score": 0.7700638,
                "_source": {
                    "name": "John"
                }
            },
            {
                "_index": "person",
                "_type": "person",
                "_id": "HO5D8GoB8pRyckNSN0fo",
                "_score": 0.71723765,
                "_source": {
                    "name": "John"
                }
            }
        ]
    },
    "aggregations": {
        "top-names": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "John",
                    "doc_count": 2,
                    "top_names_hits": {
                        "hits": {
                            "total": 2,
                            "max_score": 0.7700638,
                            "hits": [
                                {
                                    "_index": "person",
                                    "_type": "person",
                                    "_id": "He5D8GoB8pRyckNSPEfa",
                                    "_score": 0.7700638,
                                    "_source": {
                                        "name": "John"
                                    }
                                }
                            ]
                        }
                    }
                },
                {
                    "key": "Marry",
                    "doc_count": 1,
                    "top_names_hits": {
                        "hits": {
                            "total": 1,
                            "max_score": 0.66815424,
                            "hits": [
                                {
                                    "_index": "person",
                                    "_type": "person",
                                    "_id": "Iu5D8GoB8pRyckNScUdv",
                                    "_score": 0.66815424,
                                    "_source": {
                                        "name": "Marry"
                                    }
                                }
                            ]
                        }
                    }
                },
                {
                    "key": "Tomas",
                    "doc_count": 1,
                    "top_names_hits": {
                        "hits": {
                            "total": 1,
                            "max_score": 0.9506482,
                            "hits": [
                                {
                                    "_index": "person",
                                    "_type": "person",
                                    "_id": "H-5D8GoB8pRyckNSVUeN",
                                    "_score": 0.9506482,
                                    "_source": {
                                        "name": "Tomas"
                                    }
                                }
                            ]
                        }
                    }
                }
            ]
        }
    }
}

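The aggregation was applied to the document with name = "Marry", even though that document is not among the query hits. I don't understand why, or how I can apply the aggregation only to the query results.

1 answer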

蓟清野

2023-03-14

Below is, more or less, the blueprint of an Elasticsearch query:

{
  "size": n, // Number of hits matched by the "query" section to return (e.g. to the frontend)

  "query": {
          //  This is where you specify which documents you want:
          //  any filter/bool/match query condition.
          //  In your case, the function_score query has no filtering condition,
          //  so it matches every document in the index; "size" only limits how many hits are returned (3 in your case).
  },

  "aggs":{
      //  The aggregation is applied to all documents matched by the "query" section,
      //  not only to the "size" hits that are returned.
      //  In your case that means every document in the index, which is why "Marry" shows up in the buckets.
   }
}
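To make the scoping rule concrete, here is a small pure-Python sketch. It does not talk to Elasticsearch: `DOCS` and `search` are hypothetical stand-ins, and the scores are rounded values borrowed from the question's output. It only mimics the behaviour described above, where "size" truncates the returned hits while the aggregation still sees every matched document.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the "person" index (not a real client call).
DOCS = [
    {"name": "John", "score": 0.77},
    {"name": "John", "score": 0.72},
    {"name": "Marry", "score": 0.67},
    {"name": "Tomas", "score": 0.95},
]


def search(docs, matches, size):
    """Mimic how a search request treats "query", "size" and "aggs"."""
    matched = [d for d in docs if matches(d)]
    # "size" only truncates the hits that are returned ...
    hits = sorted(matched, key=lambda d: d["score"], reverse=True)[:size]
    # ... whereas the aggregation counts every matched document.
    buckets = dict(Counter(d["name"] for d in matched))
    return hits, buckets


# A query without any filtering condition matches everything.
hits, buckets = search(DOCS, matches=lambda d: True, size=3)
print([d["name"] for d in hits])  # ['Tomas', 'John', 'John']
print(buckets)                    # {'John': 2, 'Marry': 1, 'Tomas': 1}
```

"Marry" is missing from the three hits but still gets a bucket, which is the effect seen in the question.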

To get what you are looking for, just use the simplified query below, which uses a Terms aggregation with Top Hits as a sub-aggregation.

POST person/_search
{
  "size": 0,                          <------- Return no "query" hits; only the aggregation results below are wanted.
  "aggs": {
    "top-names": {
      "terms": {
        "field": "name",
        "size": 10
      },
      "aggs": {
        "top_hits_documents": {       <------- Top hits would return the actual documents
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}

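Because this request has no "query" section, the aggregation runs over all documents, and specifying "size": 0 at the top level means no query hits are returned.

You get back only the aggregation results: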

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 0.0,
    "hits" : [ ]                    <------ Notice this. No query results returned
  },
  "aggregations" : {                <------ Aggregation Result starts
    "top-names" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "John",           <------- This is to say there's a value called John 
          "doc_count" : 2,          <------- John occurs in two documents.
          "top_hits_documents" : {
            "hits" : {
              "total" : 2,
              "max_score" : 1.0,
              "hits" : [
                {
                  "_index" : "person",
                  "_type" : "person",
                  "_id" : "2",
                  "_score" : 1.0,
                  "_source" : {
                    "name" : "John"
                  }
                }
              ]
            }
          }
        },
        {
          "key" : "Marry",
          "doc_count" : 1,
          "top_hits_documents" : {
            "hits" : {
              "total" : 1,
              "max_score" : 1.0,
              "hits" : [
                {
                  "_index" : "person",
                  "_type" : "person",
                  "_id" : "3",
                  "_score" : 1.0,
                  "_source" : {
                    "name" : "Marry"
                  }
                }
              ]
            }
          }
        },
        {
          "key" : "Thomas",
          "doc_count" : 1,
          "top_hits_documents" : {
            "hits" : {
              "total" : 1,
              "max_score" : 1.0,
              "hits" : [
                {
                  "_index" : "person",
                  "_type" : "person",
                  "_id" : "4",
                  "_score" : 1.0,
                  "_source" : {
                    "name" : "Thomas"
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
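On the client side, the response above can be flattened into a list of distinct documents by taking the single top hit from each bucket. The following is a minimal sketch that works on a plain dict shaped like that response; `distinct_documents` is a hypothetical helper name, and the sample dict is a trimmed copy of the output shown above rather than a live result.

```python
def distinct_documents(response, agg_name="top-names", hits_name="top_hits_documents"):
    """Return the _source of the first top hit in every terms bucket."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [
        bucket[hits_name]["hits"]["hits"][0]["_source"]
        for bucket in buckets
        if bucket[hits_name]["hits"]["hits"]  # skip buckets without hits
    ]


def _bucket(key, doc_id, doc_count):
    """Build one bucket in the same shape as the response shown above."""
    hit = {"_id": doc_id, "_source": {"name": key}}
    return {
        "key": key,
        "doc_count": doc_count,
        "top_hits_documents": {"hits": {"hits": [hit]}},
    }


# Trimmed copy of the aggregation part of the response.
response = {
    "aggregations": {
        "top-names": {
            "buckets": [
                _bucket("John", "2", 2),
                _bucket("Marry", "3", 1),
                _bucket("Thomas", "4", 1),
            ]
        }
    }
}

print(distinct_documents(response))
# [{'name': 'John'}, {'name': 'Marry'}, {'name': 'Thomas'}]
```

Each name appears exactly once in the result, even though "John" occurs in two documents.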

Hope it helps!
