mongo-scala: update a document if it exists, insert it otherwise

楚举
2023-12-01

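Problem:
A Spark Streaming job processes real-time data and writes the aggregated results to MongoDB. With the mongo-java API, each write needs an extra check: look up the dimension first, update the metrics if it exists, and insert the dimension and metric fields if it does not. The extra round trip makes this approach slow.
Switching to the mongo-scala API (Casbah) and using its upsert option handles both insert and update in a single call. The fields used in the query should be indexed in MongoDB.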

/**
 * Performs an update operation.
 * @param q search query for old object to update
 * @param o object with which to update `q`
 */
def update[A, B](q: A, o: B, upsert: Boolean = false, multi: Boolean = false,
                 concern: com.mongodb.WriteConcern = this.writeConcern,
                 bypassDocumentValidation: Option[Boolean] = None)(
                 implicit queryView: A => DBObject, objView: B => DBObject,
                 encoder: DBEncoder = customEncoderFactory.map(_.create).orNull): WriteResult = {
  bypassDocumentValidation match {
    case None                   => underlying.update(queryView(q), objView(o), upsert, multi, concern, encoder)
    case Some(bypassValidation) => underlying.update(queryView(q), objView(o), upsert, multi, concern, bypassValidation, encoder)
  }
}
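
As a minimal sketch of how this `update` method might be called with `upsert = true`, the example below increments a metric for one dimension and creates the document when it is missing. The host, the database name `stats`, the collection name `page_views`, and the field names `dim` and `pv` are all assumptions made for illustration; they do not come from the original job.

```scala
import com.mongodb.casbah.Imports._

object UpsertSketch {
  // Query on the dimension field ("dim" is a hypothetical field name).
  def query(dim: String): DBObject = MongoDBObject("dim" -> dim)

  // Update document that adds `delta` to the metric "pv" (hypothetical name).
  // With upsert = true and no matching document, MongoDB creates one from
  // the equality fields of the query and then applies the $inc.
  def increment(delta: Long): DBObject =
    MongoDBObject("$inc" -> MongoDBObject("pv" -> delta))

  def main(args: Array[String]): Unit = {
    // Assumed connection details; adjust to the actual deployment.
    val client = MongoClient("localhost", 27017)
    val coll = client("stats")("page_views")

    // Index the queried field so the upsert lookup does not scan the collection.
    coll.createIndex(MongoDBObject("dim" -> 1))

    // One call replaces the find-then-insert-or-update sequence.
    coll.update(query("home"), increment(5L), upsert = true)

    client.close()
  }
}
```

Running `main` twice against an empty collection should leave a single document for `home`, since the second call matches the document that the first call created.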

Add the dependencies:


    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>casbah_2.11</artifactId>
        <version>3.1.1</version>
        <type>pom</type>
    </dependency>

    <dependency>
        <groupId>org.mongodb.spark</groupId>
        <artifactId>mongo-spark-connector_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
        

Note: remove the mongo-java-driver dependency from the pom, otherwise the two drivers conflict.
