Is it possible to compute a "distinct sum" and a "distinct average" in Elasticsearch?

许俊雅

2023-03-14

Question:

How can I compute a "distinct average" in Elasticsearch? I have some denormalized data like this:

{ "record_id" : "100", "cost" : 42 }
{ "record_id" : "200", "cost" : 67 }
{ "record_id" : "200", "cost" : 67 }
{ "record_id" : "200", "cost" : 67 }
{ "record_id" : "400", "cost" : 11 }
{ "record_id" : "400", "cost" : 11 }
{ "record_id" : "500", "cost" : 10 }
{ "record_id" : "600", "cost" : 99 }

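Note that the "cost" is always the same for a given "record_id".

So, given the data above:

How do I get the average of the "cost" field, distinct by "record_id"? The result would be (42 + 67 + 11 + 10 + 99) / 5 = 45.8
How do I get the SUM of the "cost" field, again DISTINCT by "record_id"? The result would be 42 + 67 + 11 + 10 + 99 = 229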
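To make the expected numbers concrete, here is a minimal Python sketch of the same arithmetic outside Elasticsearch. The function name distinct_sum_and_avg is made up for illustration; it only assumes, as stated above, that every row with the same record_id carries the same cost.

```python
# Sample rows copied from the question; duplicates share the same cost.
records = [
    {"record_id": "100", "cost": 42},
    {"record_id": "200", "cost": 67},
    {"record_id": "200", "cost": 67},
    {"record_id": "200", "cost": 67},
    {"record_id": "400", "cost": 11},
    {"record_id": "400", "cost": 11},
    {"record_id": "500", "cost": 10},
    {"record_id": "600", "cost": 99},
]


def distinct_sum_and_avg(rows):
    """Sum and average of cost, counting each record_id only once."""
    cost_by_id = {}
    for row in rows:
        # Keep the first cost seen for each record_id and ignore repeats.
        cost_by_id.setdefault(row["record_id"], row["cost"])
    total = sum(cost_by_id.values())
    avg = total / len(cost_by_id) if cost_by_id else None
    return total, avg


total, avg = distinct_sum_and_avg(records)
print(total, avg)
```

Five distinct ids remain after deduplication, which is where the divisor 5 in the question comes from.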

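Answer:

This does not work with terms aggs. Here is one possible approach using a Painless script:

Index the documents first. Your actual mapping may differ from the default one generated here (especially the .keyword part on record_id):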

POST _bulk
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"100","cost":42}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"200","cost":67}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"200","cost":67}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"200","cost":67}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"400","cost":11}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"400","cost":11}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"500","cost":10}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"600","cost":99}
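If the rows come from a program rather than being typed by hand, the bulk body above can be generated. The following is a rough Python sketch using only the standard json module; the helper name build_bulk_body and its parameters are hypothetical. Mapping types are deprecated in recent Elasticsearch versions, so the "_type" entry is optional here.

```python
import json


def build_bulk_body(rows, index_name="uniques", doc_type=None):
    """Return an NDJSON body: one action line followed by one source line per row."""
    action = {"_index": index_name}
    if doc_type is not None:
        # Only needed on older versions that still expect a mapping type.
        action["_type"] = doc_type
    action_line = json.dumps({"index": action})
    lines = []
    for row in rows:
        lines.append(action_line)
        lines.append(json.dumps(row))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


rows = [
    {"record_id": "100", "cost": 42},
    {"record_id": "200", "cost": 67},
    {"record_id": "200", "cost": 67},
    {"record_id": "200", "cost": 67},
    {"record_id": "400", "cost": 11},
    {"record_id": "400", "cost": 11},
    {"record_id": "500", "cost": 10},
    {"record_id": "600", "cost": 99},
]
body = build_bulk_body(rows, doc_type="_doc")
print(body)
```

The resulting string can be sent as the body of the POST _bulk request with any HTTP client.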

Then aggregate:

GET uniques/_search
{
  "size": 0,
  "aggs": {
    "terms": {
      "scripted_metric": {
        "init_script": "state.id_map = [:]; state.sum = 0.0; state.elem_count = 0.0;",
        "map_script": """
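          // record_id.keyword assumes the default dynamic mapping for strings
          def id = doc['record_id.keyword'].value;
          // Count each record_id only the first time it is seen on this shard
          if (!state.id_map.containsKey(id)) {
            state.id_map[id] = true;
            state.elem_count++;
            state.sum += doc['cost'].value;
          }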
        """,
        "combine_script": """
            // Runs once per shard: package this shard's sum and average
            def sum = state.sum;
            def avg = sum / state.elem_count;

            def stats = [:];
            stats.sum = sum;
            stats.avg = avg;

            return stats
        """,
        "reduce_script": "return states"
      }
    }
  }
}

This yields:

...
"aggregations" : {
    "terms" : {
      "value" : [
        {
          "avg" : 45.8,
          "sum" : 229.0
        }
      ]
    }
  }
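The "value" array has one entry because reduce_script returns the per-shard results unchanged, and here all documents sit on a single shard. With several shards, documents sharing a record_id can end up on different shards, so each shard would deduplicate only its own documents. The Python sketch below is a simulation of that behaviour, not Elasticsearch code; the two-shard split and the function names are hypothetical. It also shows one possible alternative: return the id-to-cost map from each shard and merge once at reduce time.

```python
def map_combine(shard_docs):
    """Mimic init/map/combine on one shard: dedupe ids seen on this shard only."""
    seen, total = set(), 0.0
    for doc in shard_docs:
        if doc["record_id"] not in seen:
            seen.add(doc["record_id"])
            total += doc["cost"]
    return {"sum": total, "avg": total / len(seen) if seen else None}


def shard_safe(shards):
    """Alternative: merge the id -> cost maps of all shards, then compute once."""
    merged = {}
    for shard_docs in shards:
        for doc in shard_docs:
            merged[doc["record_id"]] = doc["cost"]
    total = float(sum(merged.values()))
    return {"sum": total, "avg": total / len(merged) if merged else None}


# Hypothetical split in which record_id 200 lands on both shards.
shard_a = [
    {"record_id": "100", "cost": 42},
    {"record_id": "200", "cost": 67},
    {"record_id": "400", "cost": 11},
    {"record_id": "400", "cost": 11},
]
shard_b = [
    {"record_id": "200", "cost": 67},
    {"record_id": "200", "cost": 67},
    {"record_id": "500", "cost": 10},
    {"record_id": "600", "cost": 99},
]

per_shard = [map_combine(shard_a), map_combine(shard_b)]
merged = shard_safe([shard_a, shard_b])
print(per_shard)
print(merged)
```

In this simulation the per-shard sums add up to more than 229 because record_id 200 is counted on both shards, while the merged variant recovers the expected sum and average. Applied to the query, that would mean returning state.id_map (with costs as values) from combine_script and doing the arithmetic in reduce_script.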
