Question:

Scala Spark read json

吴德辉
2023-03-14
import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf().setAppName("Json Test").setMaster("local[*]")
val sc = new SparkContext(sparkConf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

val path = "/path/log.json"
val df = sqlContext.read.json(path)
df.show()

{"IFAM":"EQR","KTM":1430006400000,"COL":21,"DATA":[{"MLRATE":"30","NROUT":"0","UP":NULL,"CRATE":"2"},{"MLRATE":"31","NROUT":"0","UP":NULL,"CRATE":"2"},{"MLRATE":"30","NROUT":"5","UP":NULL,"CRATE":"2"},{"MLRATE":"34","NROUT":"0","UP":NULL,"CRATE":"2"},{"MLRATE":"33","NROUT":"0","UP":NULL,"CRATE":"2"},{"MLRATE":"30","NROUT":"8","UP":NULL,"CRATE":"2"}]}

I can't make sense of this error, which occurs in the Scala IDE:

INFO SharedState: Warehouse path is 'file:/C:/Users/ben53/workspace/demo/spark-warehouse/'.
Exception in thread "main" java.util.ServiceConfigurationError: org.apache.spark.sql.sources.DataSourceRegister: Provider org.apache.spark.sql.hive.orc.DefaultSource could not be instantiated
    at java.util.ServiceLoader.fail(Unknown Source)
    at java.util.ServiceLoader.access$100(Unknown Source)
    at java.util.ServiceLoader$LazyIterator.nextService(Unknown Source)
    at java.util.ServiceLoader$LazyIterator.next(Unknown Source)
    at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:575)
    at org.apache.spark.sql.execution.datasources.DataSource.providingClass$lzycompute(DataSource.scala:86)
    at org.apache.spark.sql.execution.datasources.DataSource.providingClass(DataSource.scala:86)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
    at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:298)
    at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:251)
    at com.dataflair.spark.QueryLog$.main(QueryLog.scala:27)
Caused by: java.lang.VerifyError:
  Location:
    org/apache/spark/sql/hive/orc/DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;[Ljava/lang/String;Lscala/Option;Lscala/Option;Lscala/collection/immutable/Map;)Lorg/apache/spark/sql/sources/HadoopFsRelation; @35: areturn
  Reason:
    Type 'org/apache/spark/sql/hive/orc/OrcRelation' (current frame, stack[0]) is not assignable to 'org/apache/spark/sql/sources/HadoopFsRelation' (from method signature)
  Current Frame:
    bci: @35
    flags: { }
    locals: { 'org/apache/spark/sql/hive/orc/OrcRelation' }
  Bytecode:
    0x0000000: b200 1c2b c100 1ebb 000e 592a b700 22b6
    0x0000010: 0026 bb00 2859 2c2d b200 2d19 0419 052b
    0x0000020: b700 30b0

    at java.lang.Class.getDeclaredConstructors0(Native Method)
    at java.lang.Class.privateGetDeclaredConstructors(Unknown Source)
    at java.lang.Class.newInstance(Unknown Source)
    ... 20 more

1 Answer

何涵畅
2023-03-14

The path should be correct, but the JSON you provided is invalid. Correct the sample JSON and try again. You can validate JSON at https://jsonlint.com/

It shows you which part of the JSON is invalid.
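For reference, here is a sketch of what a corrected record could look like, keeping the field names from the question as-is; the key fix is that JSON only accepts a lowercase null literal, not NULL:

{"IFAM":"EQR","KTM":1430006400000,"COL":21,"DATA":[{"MLRATE":"30","NROUT":"0","UP":null,"CRATE":"2"}]}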

When I tried it with corrected sample data, the code shown below produced this output:

+---+--------------------+----+-------------+
|COL|                DATA|IFAM|          KTM|
+---+--------------------+----+-------------+
| 21|[[2,30,0,null], [...| EQR|1430006400000|
+---+--------------------+----+-------------+
import org.apache.spark.{SparkConf, SparkContext}

object Test {

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("Json Test").setMaster("local[*]")
    val sc = new SparkContext(sparkConf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._

    val path = "/home/test/Desktop/test.json"
    val df = sqlContext.read.json(path)
    df.show()
  }
}
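As a side note, on Spark 2.x the same read is usually written against SparkSession rather than the older SQLContext. A minimal sketch, assuming Spark 2.x on the classpath and the same file path as above:

import org.apache.spark.sql.SparkSession

object JsonTest {
  def main(args: Array[String]): Unit = {
    // SparkSession is the Spark 2.x entry point that subsumes SQLContext.
    val spark = SparkSession.builder()
      .appName("Json Test")
      .master("local[*]")
      .getOrCreate()

    // By default, read.json expects one complete JSON object per line (JSON Lines).
    val df = spark.read.json("/home/test/Desktop/test.json")
    df.show()

    spark.stop()
  }
}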