Statistics and Aggregation

小牛编辑
2023-12-01

Statistics

BuguDao provides the following common statistical functions, all of which are implemented on top of Aggregation.


/* Maximum */

public double max(String key)

public double max(String key, BuguQuery query)

/* Minimum */

public double min(String key)

public double min(String key, BuguQuery query)

/* Sum */

public double sum(String key)

public double sum(String key, BuguQuery query)

/* Average */

public double average(String key)

public double average(String key, BuguQuery query)

/* Population standard deviation */

public double stdDevPop(String key)

public double stdDevPop(String key, BuguQuery query)

/* Sample standard deviation */

public double stdDevSamp(String key)

public double stdDevSamp(String key, BuguQuery query)

Example code:

FooDao dao = new FooDao();
double maxValue = dao.max("embed.x");
double minValue = dao.min("embed.x");
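
The difference between stdDevPop and stdDevSamp is easy to miss. As a minimal sketch in plain Java (it makes no bugu-mongo or MongoDB calls, and the class StdDevSketch and its method bodies are hypothetical, written only for illustration), the population form divides the sum of squared deviations by n, while the sample form divides by n - 1. This follows the way MongoDB documents its $stdDevPop and $stdDevSamp accumulators:

```java
import java.util.Arrays;

/** Illustration only: not part of bugu-mongo. */
final class StdDevSketch {

    /** Sum of squared deviations from the mean. */
    private static double sumOfSquares(double[] values) {
        double mean = Arrays.stream(values).average().orElse(Double.NaN);
        return Arrays.stream(values).map(v -> (v - mean) * (v - mean)).sum();
    }

    /** Population standard deviation: divide by n. */
    static double stdDevPop(double[] values) {
        return Math.sqrt(sumOfSquares(values) / values.length);
    }

    /** Sample standard deviation: divide by n - 1 (needs at least two values). */
    static double stdDevSamp(double[] values) {
        return Math.sqrt(sumOfSquares(values) / (values.length - 1));
    }

    public static void main(String[] args) {
        double[] scores = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("population: " + stdDevPop(scores));
        System.out.println("sample:     " + stdDevSamp(scores));
    }
}
```

The sample form is the usual choice when the stored documents are only a sample of a larger population; for a complete data set the population form applies.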

Aggregation

The bugu-mongo framework uses the BuguAggregation class to implement MongoDB's aggregate functionality. Its usage is similar to that of BuguQuery.

To obtain a BuguAggregation instance:

BuguDao<MockEntity> dao = ...
BuguAggregation agg = dao.aggregate();

BuguAggregation provides the following methods:

/* Pipeline methods */

public BuguAggregation lookup(DBObject dbo)

public BuguAggregation lookup(String jsonString)

public BuguAggregation lookup(Lookup lookup)

public BuguAggregation project(DBObject dbo)

public BuguAggregation project(String jsonString)

public BuguAggregation projectInclude(String... fields)

public BuguAggregation project(String key, Object val)

public BuguAggregation match(String key, Object value)

public BuguAggregation match(DBObject dbo)

public BuguAggregation match(BuguQuery query)

public BuguAggregation limit(int n)

public BuguAggregation skip(int n)

public BuguAggregation unwind(String field)

public BuguAggregation geoNear(DBObject dbo)

public BuguAggregation geoNear(String jsonString)

public BuguAggregation geoNear(GeoNearOptions options)

public BuguAggregation group(DBObject dbo)

public BuguAggregation group(String jsonString)

public BuguAggregation sort(String jsonString)

public BuguAggregation sort(DBObject dbo)

public BuguAggregation out(String target)

/* Return the results */

public Iterable<DBObject> results()

A code example follows:

BuguDao<MockEntity> dao = new BuguDao<MockEntity>(MockEntity.class);
// match stage: score >= 60
DBObject match = dao.query().greaterThanEquals("score", 60).getCondition();
// group stage: a null _id puts every matched document into a single group
DBObject group = new BasicDBObject();
group.put("_id", null);
group.put("average", new BasicDBObject("$avg", "$score"));
Iterable<DBObject> results = dao.aggregate().match(match).group(group).results();
// Iterable has no index access, so take the first result through its iterator
DBObject dbo = results.iterator().next();
System.out.println(dbo.get("average"));
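
To make the pipeline above concrete, the following plain-Java sketch reproduces what the match and group stages compute: keep the documents whose score is at least 60, then average their score. It does not call bugu-mongo or MongoDB, and the class name MatchGroupSketch and the sample scores are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: not part of bugu-mongo. */
final class MatchGroupSketch {

    /**
     * In-memory equivalent of match {score: {$gte: 60}}
     * followed by group {_id: null, average: {$avg: '$score'}}.
     */
    static double averageOfPassing(List<Integer> scores) {
        return scores.stream()
                .filter(s -> s >= 60)           // the match stage
                .mapToInt(Integer::intValue)
                .average()                      // the $avg accumulator
                .orElse(Double.NaN);            // nothing matched
    }

    public static void main(String[] args) {
        System.out.println(averageOfPassing(Arrays.asList(45, 60, 80, 100)));
    }
}
```

One difference is worth keeping in mind: when nothing matches, the sketch returns NaN, whereas a MongoDB group stage over an empty input produces no result document at all, so code that reads the first result should check that one exists.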

Methods such as project and group accept not only a DBObject but also a JSON string. For example:

BuguAggregation agg = dao.aggregate();
agg.group("{_id:'$author', count:{$sum:1}}");
agg.sort("{count:-1}");
Iterable<DBObject> it = agg.results();
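
As a rough guide to what this group and sort pair returns, here is a plain-Java sketch that counts documents per author and orders the counts from highest to lowest. It is an in-memory analogy only; GroupCountSketch and countByAuthor are hypothetical names, and no bugu-mongo API is involved.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustration only: not part of bugu-mongo. */
final class GroupCountSketch {

    /** In-memory equivalent of group {_id:'$author', count:{$sum:1}} then sort {count:-1}. */
    static LinkedHashMap<String, Long> countByAuthor(List<String> authors) {
        // group: one bucket per author, counting its members
        Map<String, Long> counts = authors.stream()
                .collect(Collectors.groupingBy(a -> a, Collectors.counting()));
        // sort: highest count first; LinkedHashMap preserves that order
        LinkedHashMap<String, Long> sorted = new LinkedHashMap<>();
        counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
                .forEach(e -> sorted.put(e.getKey(), e.getValue()));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(countByAuthor(Arrays.asList("a", "b", "a", "c", "a", "b")));
    }
}
```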

Lookup

Starting with MongoDB 3.2, aggregation supports the lookup stage, which is equivalent to a left outer join in SQL. See the official MongoDB documentation for details.

bugu-mongo provides the helper class Lookup to simplify the use of this feature. Its constructor is as follows:

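/**
 * Constructor.
 * @param from the other collection to join
 * @param localField the field name in this collection
 * @param foreignField the field in the other collection to join on
 * @param as the name of the new array field added to this collection's documents
 */
public Lookup(String from, String localField, String foreignField, String as)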

Example:

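Suppose there is a book class Book and a comment class Comment, and the two are related through the book's title. (This is not a sensible design in practice; it is only an example.)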

@Entity
public class Book extends SimpleEntity {
    private String title;
    private String author;
    private float price;
    ...
}

@Entity
public class Comment extends SimpleEntity {
    private String title;
    private int star;
    ...
}

lookup can be used to join these two collections and compute each author's average rating (star):

BookDao dao = new BookDao();
BuguAggregation agg = dao.aggregate();
agg.lookup(new Lookup("comment", "title", "title", "book_comment"));
agg.unwind("$book_comment");
agg.group("{_id:'$author', averageStar:{$avg:'$book_comment.star'}}");
agg.sort("{averageStar:-1}");
Iterable<DBObject> it = agg.results();
for(DBObject dbo : it){
    System.out.println(dbo.get("_id"));
    System.out.println(dbo.get("averageStar"));
}
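
The three stages above can be followed step by step in a plain-Java sketch. It assumes the usual MongoDB semantics: lookup attaches the matching comments to each book as an array, unwind emits one row per array element (so a book with no comments produces no rows by default), and group averages star per author. The classes below are hypothetical stand-ins for the entities, the sort stage is omitted, and no bugu-mongo API is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: not part of bugu-mongo. */
final class LookupSketch {

    static final class Book {
        final String title;
        final String author;
        Book(String title, String author) { this.title = title; this.author = author; }
    }

    static final class Comment {
        final String title;
        final int star;
        Comment(String title, int star) { this.title = title; this.star = star; }
    }

    /** Average star per author; authors whose books have no comments are absent. */
    static Map<String, Double> averageStarByAuthor(List<Book> books, List<Comment> comments) {
        Map<String, List<Integer>> starsByAuthor = new LinkedHashMap<>();
        for (Book book : books) {
            // lookup: collect the comments whose title equals this book's title
            List<Comment> bookComment = new ArrayList<>();
            for (Comment c : comments) {
                if (c.title.equals(book.title)) {
                    bookComment.add(c);
                }
            }
            // unwind: one row per joined comment; an empty array yields no rows
            for (Comment c : bookComment) {
                // group: bucket the star values by author
                starsByAuthor.computeIfAbsent(book.author, k -> new ArrayList<>()).add(c.star);
            }
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> e : starsByAuthor.entrySet()) {
            double avg = e.getValue().stream().mapToInt(Integer::intValue).average().orElse(Double.NaN);
            result.put(e.getKey(), avg);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                new Book("A", "alice"), new Book("B", "alice"), new Book("C", "bob"));
        List<Comment> comments = Arrays.asList(
                new Comment("A", 5), new Comment("B", 3), new Comment("A", 4));
        System.out.println(averageStarByAuthor(books, comments));
    }
}
```

In this sample, the author whose only book has no comments does not appear in the result, because the unwind step drops that book before grouping.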

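As noted above, relating the two classes through title is purely for the sake of the example. In a real application, a more sensible design is to relate them through @Ref.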

Note in particular that a property related through @Ref must be declared with reduced=true if it is to take part in a lookup operation.

For example:

@Entity
public class Book extends SimpleEntity {
    private String title;
    private String author;
    private float price;
    ...
}

@Entity
public class CoolComment extends SimpleEntity {
    @Ref(reduced = true)
    private Book book;
    private int star;
    ...
}

Compute each author's average rating (star):

BookDao dao = new BookDao();
BuguAggregation agg = dao.aggregate();
agg.lookup(new Lookup("coolcomment", "_id", "book", "book_comment"));  // note: localField is _id
agg.unwind("$book_comment");
agg.group("{_id:'$author', averageStar:{$avg:'$book_comment.star'}}");
agg.sort("{averageStar:-1}");
Iterable<DBObject> it = agg.results();
for(DBObject dbo : it){
    System.out.println(dbo.get("_id"));
    System.out.println(dbo.get("averageStar"));
}

lookup aggregations are fairly tedious to write. For this reason, bugu-mongo provides a simpler alternative: JoinQuery.

ExpressionBuilder

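To make aggregation code easier to write, bugu-mongo provides the ExpressionBuilder utility class for creating various kinds of expressions. The two most commonly used builders are BoolBuilder and CondBuilder.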

For example, suppose we need to bucket book prices into tiers of 1-10, 10-20, and 20-30 (each tier includes its upper bound), and then count the books in each tier:

BookDao dao = new BookDao();

BuguAggregation agg = dao.aggregate();

DBObject cond1 = ExpressionBuilder.cond().ifCondition("{'$lte':['$price', 10]}").thenValue(10).elseValue("$price").build();

DBObject bool2 = ExpressionBuilder.bool().and("{'$gt':['$price', 10]}", "{'$lte':['$price', 20]}").build();
DBObject cond2 = ExpressionBuilder.cond().ifCondition(bool2).thenValue(20).elseValue("$price").build();

DBObject bool3 = ExpressionBuilder.bool().and("{'$gt':['$price', 20]}", "{'$lte':['$price', 30]}").build();
DBObject cond3 = ExpressionBuilder.cond().ifCondition(bool3).thenValue(30).elseValue("$price").build();

agg.project("price", cond1);
agg.project("price", cond2);
agg.project("price", cond3);

agg.group("{_id:'$price', count:{$sum:1}}");

Iterable<DBObject> it = agg.results();
for(DBObject dbo : it){
    System.out.print(dbo.get("_id"));
    System.out.print(" : ");
    System.out.println(dbo.get("count"));
}
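
The effect of the three chained project stages is easier to see as ordinary control flow. The plain-Java sketch below applies the same three conditions in the same order and then counts per tier; PriceTierSketch and its methods are hypothetical names, and nothing here calls bugu-mongo.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustration only: not part of bugu-mongo. */
final class PriceTierSketch {

    /** Applies cond1, cond2 and cond3 in sequence, as the three project stages do. */
    static double tier(double price) {
        double p = price;
        if (p <= 10) p = 10;             // cond1: price <= 10 becomes 10
        if (p > 10 && p <= 20) p = 20;   // cond2: 10 < price <= 20 becomes 20
        if (p > 20 && p <= 30) p = 30;   // cond3: 20 < price <= 30 becomes 30
        return p;                        // anything above 30 is left unchanged
    }

    /** In-memory equivalent of group {_id:'$price', count:{$sum:1}} after the projections. */
    static Map<Double, Integer> countByTier(double[] prices) {
        Map<Double, Integer> counts = new TreeMap<>();
        for (double price : prices) {
            counts.merge(tier(price), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countByTier(new double[] {5, 10, 15, 25.5, 30, 42}));
    }
}
```

Two consequences of these conditions follow directly from the sketch: every price at or below 10 (including 0) lands in the first tier, and a price above 30 is not bucketed, so each distinct price above 30 forms its own group.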

For more examples of using BuguAggregation, see the unit tests in the project's source code.