Question:

Cosmos DB write issue from a Databricks notebook

茹建茗
2023-03-14

Following the Databricks docs (https://docs.databricks.com/data/data-sources/azure/cosmosdb-connector.html), I downloaded the latest azure-cosmosdb-spark library (azure-cosmosdb-spark_2.4.0_2.11-2.1.2-uber.jar) and placed it in the libraries location on DBFS.

When I try to write data from a DataFrame to a Cosmos DB container, I get the error below. Any help would be appreciated.

My Databricks runtime version is 7.0 (includes Apache Spark 3.0.0, Scala 2.12).

Imports from the notebook:

import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.types.{StructField, _}
import org.apache.spark.sql.functions._
import com.microsoft.azure.cosmosdb.spark.schema._
import com.microsoft.azure.cosmosdb.spark.CosmosDBSpark
import com.microsoft.azure.cosmosdb.spark.config._

// Cosmos DB connection settings (values redacted by the poster)
val dtcCANWrite = Config(Map(
  "Endpoint" -> "NOT DISPLAYED",
  "Masterkey" -> "NOT DISPLAYED",
  "Database" -> "NOT DISPLAYED",
  "Collection" -> "NOT DISPLAYED",
  "preferredRegions" -> "NOT DISPLAYED",
  "Upsert" -> "true"
))

// Append the DataFrame to the Cosmos DB collection (upsert enabled above)
distinctCANDF.write.mode(SaveMode.Append).cosmosDB(dtcCANWrite)

Error:

    at com.microsoft.azure.cosmosdb.spark.config.CosmosDBConfigBuilder.<init>(CosmosDBConfigBuilder.scala:31)
    at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:259)
    at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:240)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:7)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:69)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:71)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:73)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:75)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:77)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:79)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:81)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:83)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:85)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:87)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:89)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:91)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw.<init>(command-3649834446724317:93)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw.<init>(command-3649834446724317:95)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw.<init>(command-3649834446724317:97)
    at line6d80624d7a774601af6eb962eb59453253.$read.<init>(command-3649834446724317:99)
    at line6d80624d7a774601af6eb962eb59453253.$read$.<init>(command-3649834446724317:103)
    at line6d80624d7a774601af6eb962eb59453253.$read$.<clinit>(command-3649834446724317)
    at line6d80624d7a774601af6eb962eb59453253.$eval$.$print$lzycompute(<notebook>:7)
    at line6d80624d7a774601af6eb962eb59453253.$eval$.$print(<notebook>:6)
    at line6d80624d7a774601af6eb962eb59453253.$eval.$print(<notebook>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:745)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1021)
    at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:574)
    at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:41)
    at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:37)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
    at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:600)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:570)
    at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:215)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:202)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:714)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:667)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:202)
    at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$10(DriverLocal.scala:396)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:238)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:233)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:230)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:49)
    at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:275)
    at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:268)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:49)
    at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:373)
    at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:653)
    at scala.util.Try$.apply(Try.scala:213)
    at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:645)
    at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:486)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:598)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:391)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
    at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: scala.Product$class
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at com.microsoft.azure.cosmosdb.spark.config.CosmosDBConfigBuilder.<init>(CosmosDBConfigBuilder.scala:31)
    at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:259)
    at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:240)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:7)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:69)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:71)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:73)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:75)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:77)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:79)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:81)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:83)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:85)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:87)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:89)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:91)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw.<init>(command-3649834446724317:93)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw.<init>(command-3649834446724317:95)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw.<init>(command-3649834446724317:97)
    at line6d80624d7a774601af6eb962eb59453253.$read.<init>(command-3649834446724317:99)
    at line6d80624d7a774601af6eb962eb59453253.$read$.<init>(command-3649834446724317:103)
    at line6d80624d7a774601af6eb962eb59453253.$read$.<clinit>(command-3649834446724317)
    at line6d80624d7a774601af6eb962eb59453253.$eval$.$print$lzycompute(<notebook>:7)
    at line6d80624d7a774601af6eb962eb59453253.$eval$.$print(<notebook>:6)
    at line6d80624d7a774601af6eb962eb59453253.$eval.$print(<notebook>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:745)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1021)
    at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:574)
    at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:41)
    at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:37)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
    at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:600)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:570)
    at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:215)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:202)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:714)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:667)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:202)
    at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$10(DriverLocal.scala:396)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:238)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:233)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:230)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:49)
    at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:275)
    at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:268)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:49)
    at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:373)
    at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:653)
    at scala.util.Try$.apply(Try.scala:213)
    at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:645)
    at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:486)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:598)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:391)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
    at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
    at java.lang.Thread.run(Thread.java:748)

1 answer

林夕
2023-03-14

Thanks to Rayan Ral for the suggestion — the problem does come from a version conflict.

The solution was to downgrade versions: in this case, from Spark 3.0.0 / Scala 2.12 down to Spark 2.4.4 / Scala 2.11. The connector jar (azure-cosmosdb-spark_2.4.0_2.11-2.1.2-uber.jar) is built for Scala 2.11, and scala.Product$class no longer exists in Scala 2.12 (trait implementation classes were removed in that release), which is why loading the jar on a Scala 2.12 runtime fails with the ClassNotFoundException above.
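For reference, here is a minimal sanity check (a sketch, not from the original answer) you can run in a notebook cell before attaching a connector jar. It prints the cluster's Scala and Spark versions, which must line up with the version markers in the jar name: the _2.11 in azure-cosmosdb-spark_2.4.0_2.11-2.1.2-uber.jar is the Scala version the jar was built against, and 2.4.0 is the Spark line it targets.

// Scala version of the runtime; must match the jar's _2.xx suffix.
// On Databricks runtime 7.0 this prints something like "version 2.12.10".
println(scala.util.Properties.versionString)

// Spark version of the runtime; the jar above targets Spark 2.4.x.
println(org.apache.spark.SPARK_VERSION)

If these do not match the jar, either downgrade the cluster to a Scala 2.11 runtime (the Databricks 6.x runtimes ship Spark 2.4 with Scala 2.11) or pick a build of the connector that matches the cluster.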
