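I am quite new to Spark. I have searched but could not find a proper solution. I have installed Hadoop 2.7.2 on two boxes (one master node and one worker node). I set up the cluster by following this link: http://javadev.org/docs/hadoop/centos/6/installation/multi-node-installation-on-centos-6-non-sucure-mode/ I am running the Hadoop and Spark applications as the root user to test the cluster.
I have installed Spark on the master node, and Spark starts without any errors. However, when I submit a job with spark-submit, I get a File Not Found exception, even though the file exists on the master node at the exact location given in the error. I am executing the spark-submit command below; the log output follows the command.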
./bin/spark-submit --class com.test.Engine --master yarn --deploy-mode cluster /app/spark-test.jar
16/04/21 19:16:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/21 19:16:13 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/21 19:16:14 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/04/21 19:16:14 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/04/21 19:16:14 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/04/21 19:16:14 INFO Client: Setting up container launch context for our AM
16/04/21 19:16:14 INFO Client: Setting up the launch environment for our AM container
16/04/21 19:16:14 INFO Client: Preparing resources for our AM container
16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/mi/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar
16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/app/spark-test.jar
16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-120aeddc-0f87-4411-9400-22ba01096249/__spark_conf__5619348744221830008.zip
16/04/21 19:16:14 INFO SecurityManager: Changing view acls to: root
16/04/21 19:16:14 INFO SecurityManager: Changing modify acls to: root
16/04/21 19:16:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/04/21 19:16:15 INFO Client: Submitting application 1 to ResourceManager
16/04/21 19:16:15 INFO YarnClientImpl: Submitted application application_1461246306015_0001
16/04/21 19:16:16 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:16 INFO Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461246375622
     final status: UNDEFINED
     tracking URL: http://sparkcluster01.testing.com:8088/proxy/application_1461246306015_0001/
     user: root
16/04/21 19:16:17 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:18 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:19 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:20 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:21 INFO Client: Application report for application_1461246306015_0001 (state: FAILED)
16/04/21 19:16:21 INFO Client:
     client token: N/A
     diagnostics: Application application_1461246306015_0001 failed 2 times due to AM Container for appattempt_1461246306015_0001_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://sparkcluster01.testing.com:8088/cluster/app/application_1461246306015_0001Then, click on links to logs of each attempt.
Diagnostics: java.io.FileNotFoundException: File file:/app/spark-test.jar does not exist
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461246375622
     final status: FAILED
     tracking URL: http://sparkcluster01.testing.com:8088/cluster/app/application_1461246306015_0001
     user: root
Exception in thread "main" org.apache.spark.SparkException: Application application_1461246306015_0001 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
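I even tried running Spark against the HDFS file system by placing my application on HDFS and giving the HDFS path in the spark-submit command. Even then it throws a File Not Found exception, this time on a Spark conf file. I am executing the spark-submit command below; the log output follows the command.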
./bin/spark-submit --class com.test.Engine --master yarn --deploy-mode cluster hdfs://sparkcluster01.testing.com:9000/beacon/job/spark-test.jar
16/04/21 18:11:45 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/21 18:11:46 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/04/21 18:11:46 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/04/21 18:11:46 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/04/21 18:11:46 INFO Client: Setting up container launch context for our AM
16/04/21 18:11:46 INFO Client: Setting up the launch environment for our AM container
16/04/21 18:11:46 INFO Client: Preparing resources for our AM container
16/04/21 18:11:46 INFO Client: Source and destination file systems are the same. Not copying file:/mi/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar
16/04/21 18:11:47 INFO Client: Uploading resource hdfs://sparkcluster01.testing.com:9000/beacon/job/spark-test.jar -> file:/root/.sparkStaging/application_1461234217994_0017/spark-test.jar
16/04/21 18:11:49 INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip
16/04/21 18:11:50 INFO SecurityManager: Changing view acls to: root
16/04/21 18:11:50 INFO SecurityManager: Changing modify acls to: root
16/04/21 18:11:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/04/21 18:11:50 INFO Client: Submitting application 17 to ResourceManager
16/04/21 18:11:50 INFO YarnClientImpl: Submitted application application_1461234217994_0017
16/04/21 18:11:51 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED)
16/04/21 18:11:51 INFO Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461242510849
     final status: UNDEFINED
     tracking URL: http://sparkcluster01.testing.com:8088/proxy/application_1461234217994_0017/
     user: root
16/04/21 18:11:52 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED)
16/04/21 18:11:53 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED)
16/04/21 18:11:54 INFO Client: Application report for application_1461234217994_0017 (state: FAILED)
16/04/21 18:11:54 INFO Client:
     client token: N/A
     diagnostics: Application application_1461234217994_0017 failed 2 times due to AM Container for appattempt_1461234217994_0017_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://sparkcluster01.testing.com:8088/cluster/app/application_1461234217994_0017Then, click on links to logs of each attempt.
Diagnostics: File file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip does not exist
java.io.FileNotFoundException: File file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461242510849
     final status: FAILED
     tracking URL: http://sparkcluster01.testing.com:8088/cluster/app/application_1461234217994_0017
     user: root
Exception in thread "main" org.apache.spark.SparkException: Application application_1461234217994_0017 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/04/21 18:11:55 INFO ShutdownHookManager: Shutdown hook called
16/04/21 18:11:55 INFO ShutdownHookManager: Deleting directory /tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21
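The Spark configuration was not pointing to the correct Hadoop configuration directory. The Hadoop configuration for 2.7.2 lives under hadoop2.7.2/etc/hadoop/, not /root/hadoop2.7.2/conf. When I set HADOOP_CONF_DIR=/root/hadoop2.7.2/etc/hadoop/ in spark-env.sh, spark-submit started working and the File Not Found exception disappeared. Earlier it was pointing to /root/hadoop2.7.2/conf, which does not exist. If Spark is not pointed at the correct Hadoop configuration directory, it may produce similar errors. I think this is probably a bug in Spark: it should handle this case gracefully instead of throwing ambiguous error messages.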
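A likely reading of the logs in the question is that, with no valid Hadoop configuration directory, the client fell back to Hadoop defaults: a local file: filesystem (hence "Source and destination file systems are the same") and a ResourceManager at 0.0.0.0:8032. The staged files then exist only on the submitting machine, so a container on another node cannot find them. A minimal pre-flight check along these lines might catch the misconfiguration before submitting. The function name check_hadoop_conf_dir is hypothetical and not part of Spark or Hadoop, and the check for core-site.xml and yarn-site.xml assumes a standard Hadoop 2.x configuration layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark or Hadoop): check that a directory
# looks like a Hadoop client configuration directory before calling spark-submit.
check_hadoop_conf_dir() {
    local dir="$1"
    local f

    if [ -z "$dir" ]; then
        echo "HADOOP_CONF_DIR is not set" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "HADOOP_CONF_DIR does not exist: $dir" >&2
        return 1
    fi
    # Assumption: a usable client config has at least these two files.
    for f in core-site.xml yarn-site.xml; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing $f in $dir" >&2
            return 1
        fi
    done
    return 0
}

# Example use in a wrapper script (path taken from the answer above):
#   export HADOOP_CONF_DIR=/root/hadoop2.7.2/etc/hadoop/
#   check_hadoop_conf_dir "$HADOOP_CONF_DIR" && \
#       ./bin/spark-submit --class com.test.Engine --master yarn \
#           --deploy-mode cluster /app/spark-test.jar
```

With the old value /root/hadoop2.7.2/conf, this check would fail on the "does not exist" branch instead of letting the job reach YARN and fail with exitCode -1000.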