Question:

Unable to invoke a Spark application from a Java class on the cluster

丁钟展
2023-03-14

Spark application:

scala1.scala  // I call the Java class from this class.

java.java  // This submits another Spark application to the YARN cluster.

Error: Could not find or load main class 
org.apache.spark.deploy.yarn.ApplicationMaster

 Application application_1493671618562_0072 failed 5 times due to AM Container for appattempt_1493671618562_0072_000005 exited with exitCode: 1
 For more detailed output, check the application tracking page: http://headnode.internal.cloudapp.net:8088/cluster/app/application_1493671618562_0072 Then click on links to logs of each attempt.
 Diagnostics: Exception from container-launch.
 Container id: container_e02_1493671618562_0072_05_000001
 Exit code: 1
 Exception message: /mnt/resource/hadoop/yarn/local/usercache/helixuser/appcache/application_1493671618562_0072/container_e02_1493671618562_0072_05_000001/launch_container.sh: line 26: $PWD:$PWD/spark_conf:$PWD/spark.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/:/usr/hdp/current/hadoop-hdfs-client/:/usr/hdp/current/hadoop-hdfs-client/lib/:/usr/hdp/current/hadoop-yarn-client/:/usr/hdp/current/hadoop-yarn-client/lib/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/:$PWD/mr-framework/hadoop/share/hadoop/common/:$PWD/mr-framework/hadoop/share/hadoop/common/lib/:$PWD/mr-framework/hadoop/share/hadoop/yarn/:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
 Stack trace: ExitCodeException exitCode=1: /mnt/resource/hadoop/yarn/local/usercache/helixuser/appcache/application_1493671618562_0072/container_e02_1493671618562_0072_05_000001/launch_container.sh: line 26: $PWD:$PWD/spark_conf:$PWD/spark.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/:/usr/hdp/current/hadoop-hdfs-client/:/usr/hdp/current/hadoop-hdfs-client/lib/:/usr/hdp/current/hadoop-yarn-client/:/usr/hdp/current/hadoop-yarn-client/lib/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/:$PWD/mr-framework/hadoop/share/hadoop/common/:$PWD/mr-framework/hadoop/share/hadoop/common/lib/:$PWD/mr-framework/hadoop/share/hadoop/yarn/:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
 at org.apache.hadoop.util.Shell.run(Shell.java:844)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
 at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:225)
 at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
 at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 Container exited with a non-zero exit code 1
 Failing this attempt. Failing the application.

The directory structure of my project looks like this:

lrwxrwxrwx 1 yarn hadoop   95 May  5 06:03 __app__.jar -> /mnt/resource/hadoop/yarn/local/filecache/10/sparkfiller-1.0-SNAPSHOT-jar-with-dependencies.jar
-rw-r--r-- 1 yarn hadoop   74 May  5 06:03 container_tokens
-rwx------ 1 yarn hadoop  710 May  5 06:03 default_container_executor_session.sh
-rwx------ 1 yarn hadoop  764 May  5 06:03 default_container_executor.sh
-rwx------ 1 yarn hadoop 6433 May  5 06:03 launch_container.sh
lrwxrwxrwx 1 yarn hadoop  102 May  5 06:03 __spark_conf__ -> /mnt/resource/hadoop/yarn/local/usercache/helixuser/filecache/80/__spark_conf__6125877397366945561.zip
lrwxrwxrwx 1 yarn hadoop  125 May  5 06:03 __spark__.jar -> /mnt/resource/hadoop/yarn/local/usercache/helixuser/filecache/81/spark-assembly-1.6.3.2.5.4.0-121-hadoop2.7.3.2.5.4.0-121.jar
drwx--x--- 2 yarn hadoop 4096 May  5 06:03 tmp
find -L . -maxdepth 5 -ls:
3933556      4 drwx--x---   3 yarn     hadoop       4096 May  5 06:03 .
3933558      4 drwx--x---   2 yarn     hadoop       4096 May  5 06:03 ./tmp
3933562      4 -rw-r--r--   1 yarn     hadoop         60 May  5 06:03 ./.launch_container.sh.crc
3933517 185944 -r-x------   1 yarn     hadoop   190402950 May  5 06:03 ./__spark__.jar
3933564      4 -rw-r--r--   1 yarn     hadoop          16 May  5 06:03 ./.default_container_executor_session.sh.crc
3933518      4 drwx------   2 yarn     hadoop        4096 May  5 06:03 ./__spark_conf__
3933548      4 -r-x------   1 yarn     hadoop         945 May  5 06:03 ./__spark_conf__/taskcontroller.cfg
3933543      4 -r-x------   1 yarn     hadoop         249 May  5 06:03 ./__spark_conf__/slaves
3933541      4 -r-x------   1 yarn     hadoop        2316 May  5 06:03 ./__spark_conf__/ssl-client.xml.example
3933520      4 -r-x------   1 yarn     hadoop        1734 May  5 06:03 ./__spark_conf__/log4j.properties
3933526      4 -r-x------   1 yarn     hadoop         265 May  5 06:03 ./__spark_conf__/hadoop-metrics2-azure-file-system.properties
3933536      4 -r-x------   1 yarn     hadoop        1045 May  5 06:03 ./__spark_conf__/container-executor.cfg
3933519      8 -r-x------   1 yarn     hadoop        5685 May  5 06:03 ./__spark_conf__/hadoop-env.sh
3933531      4 -r-x------   1 yarn     hadoop        2358 May  5 06:03 ./__spark_conf__/topology_script.py
3933547      8 -r-x------   1 yarn     hadoop        4113 May  5 06:03 ./__spark_conf__/mapred-queues.xml.template
3933528      4 -r-x------   1 yarn     hadoop         744 May  5 06:03 ./__spark_conf__/ssl-client.xml
3933544      4 -r-x------   1 yarn     hadoop         417 May  5 06:03 ./__spark_conf__/topology_mappings.data
3933549      4 -r-x------   1 yarn     hadoop         342 May  5 06:03 ./__spark_conf__/__spark_conf__.properties
3933523      4 -r-x------   1 yarn     hadoop         247 May  5 06:03 ./__spark_conf__/hadoop-metrics2-adl-file-system.properties
3933535      4 -r-x------   1 yarn     hadoop        1020 May  5 06:03 ./__spark_conf__/commons-logging.properties
3933525     24 -r-x------   1 yarn     hadoop       22138 May  5 06:03 ./__spark_conf__/yarn-site.xml
3933529      4 -r-x------   1 yarn     hadoop        2450 May  5 06:03 ./__spark_conf__/capacity-scheduler.xml
3933538      4 -r-x------   1 yarn     hadoop        2490 May  5 06:03 ./__spark_conf__/hadoop-metrics.properties
 3933534     12 -r-x------   1 yarn     hadoop        8754 May  5 06:03 ./__spark_conf__/hdfs-site.xml
 3933533      8 -r-x------   1 yarn     hadoop        4261 May  5 06:03 ./__spark_conf__/yarn-env.sh
 3933532      4 -r-x------   1 yarn     hadoop        1335 May  5 06:03 ./__spark_conf__/configuration.xsl
 3933530      4 -r-x------   1 yarn     hadoop         758 May  5 06:03 ./__spark_conf__/mapred-site.xml.template
 3933545      4 -r-x------   1 yarn     hadoop        1000 May  5 06:03 ./__spark_conf__/ssl-server.xml
 3933527      8 -r-x------   1 yarn     hadoop        4680 May  5 06:03 ./__spark_conf__/core-site.xml
 3933522      8 -r-x------   1 yarn     hadoop        5783 May  5 06:03 ./__spark_conf__/hadoop-metrics2.properties
 3933542      4 -r-x------   1 yarn     hadoop        1308 May  5 06:03 ./__spark_conf__/hadoop-policy.xml
 3933540      4 -r-x------   1 yarn     hadoop        1602 May  5 06:03 ./__spark_conf__/health_check
 3933537      8 -r-x------   1 yarn     hadoop        4221 May  5 06:03 ./__spark_conf__/task-log4j.properties
 3933521      8 -r-x------   1 yarn     hadoop        7596 May  5 06:03 ./__spark_conf__/mapred-site.xml
 3933546      4 -r-x------   1 yarn     hadoop        2697 May  5 06:03 ./__spark_conf__/ssl-server.xml.example
 3933539      4 -r-x------   1 yarn     hadoop         752 May  5 06:03 ./__spark_conf__/mapred-env.sh
 3932820 135852 -r-xr-xr-x   1 yarn     hadoop   139105807 May  4 22:53 ./__app__.jar
 3933566      4 -rw-r--r--   1 yarn     hadoop          16 May  5 06:03 ./.default_container_executor.sh.crc
 3933563      4 -rwx------   1 yarn     hadoop         710 May  5 06:03 ./default_container_executor_session.sh
 3933559      4 -rw-r--r--   1 yarn     hadoop          74 May  5 06:03 ./container_tokens
3933565      4 -rwx------   1 yarn     hadoop         764 May  5 06:03 ./default_container_executor.sh
3933560      4 -rw-r--r--   1 yarn     hadoop          12 May  5 06:03 ./.container_tokens.crc
3933561      8 -rwx------   1 yarn     hadoop        6433 May  5 06:03 ./launch_container.sh
broken symlinks(find -L . -maxdepth 5 -type l -ls):

Below is the Java code that submits the second Spark application:

import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.SparkException;


public class CallingSparkJob {

    public void submitJob(String latestreceivedpitrL, String newPtr) throws Exception {
        System.out.println("In submit job method");
        try {
            System.out.println("Building a spark command");

            // prepare arguments to be passed to the
            // org.apache.spark.deploy.yarn.Client object
            String[] args = new String[] {
                // the name of your application
                "--name",
                "name",
                // "--master",
                // "yarn",
                // "--deploy-mode",
                // "cluster",

                // "--conf", "spark.yarn.executor.memoryOverhead=600", "--conf",
                "spark.yarn.submit.waitAppCompletion=false",

                // memory for driver (optional)
                "--driver-memory",
                "1000M",

                "--num-executors",
                "2",
                "--executor-cores",
                "2",

                // path to your application's JAR file
                // (required in yarn-cluster mode)
                "--jar",
                "wasb://storage_account_container@storageaccount.blob.core.windows.net/user/ankushuser/sparkfiller/sparkfiller-1.0-SNAPSHOT-jar-with-dependencies.jar",

                // name of your application's main class (required)
                "--class",
                "com.test.SparkFiller",

                // comma separated list of local jars that want
                // SparkContext.addJar to work with
                // "--addJars",
      // "/Users/mparsian/zmp/github/data-algorithms-book/lib/spark-assembly-1.5.2-hadoop2.6.0.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/log4j-1.2.17.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/junit-4.12-beta-2.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/jsch-0.1.42.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/JeraAntTasks.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/jedis-2.5.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/jblas-1.2.3.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/hamcrest-all-1.3.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/guava-18.0.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-math3-3.0.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-math-2.2.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-logging-1.1.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-lang3-3.4.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-lang-2.6.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-io-2.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-httpclient-3.0.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-daemon-1.0.5.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-configuration-1.6.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-collections-3.2.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-cli-1.2.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/cloud9-1.3.2.jar",


                // argument 1 for latestreceivedpitrL
                "--arg",
                latestreceivedpitrL,

                // argument 2 for newPtr
                "--arg",
                newPtr,

                "--arg",
                "yarn-cluster"
            };

            System.out.println("create a Hadoop Configuration object");

            // create a Hadoop Configuration object
            Configuration config = new Configuration();

            // identify that you will be using Spark as YARN mode
            System.setProperty("SPARK_YARN_MODE", "true");

            // create an instance of SparkConf object
            SparkConf sparkConf = new SparkConf();
            sparkConf.setSparkHome("/usr/hdp/current/spark-client");
            // sparkConf.setMaster("yarn");
            sparkConf.setMaster("yarn-cluster");

            // sparkConf.setAppName("spark-yarn");
            // sparkConf.set("master", "yarn");
            // sparkConf.set("spark.submit.deployMode", "cluster"); // worked

            // create ClientArguments, which will be passed to Client
            // ClientArguments cArgs = new ClientArguments(args);
            ClientArguments cArgs = new ClientArguments(args, sparkConf);

            // create an instance of yarn Client
            Client client = new Client(cArgs, config, sparkConf);

            // submit Spark job to YARN
            client.run();
        } catch (Exception e) {
            System.out.println("Error submitting spark Job");
            System.out.println(e.getMessage());
        }
    }
}
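For context, here is a minimal sketch of how the first application's driver (scala1.scala) might call into this class. The object name, the argument handling, and the assumption that the call happens inside the first job's driver are illustrative only and are not taken from the question:

// scala1.scala -- hypothetical caller; names and argument values are placeholders
object Scala1 {
  def main(cmdArgs: Array[String]): Unit = {
    // the two pointers that the second job later receives via "--arg"
    val latestReceivedPitr = cmdArgs(0)
    val newPtr = cmdArgs(1)

    // delegate to the Java class that submits the second application to YARN
    val job = new CallingSparkJob()
    job.submitJob(latestReceivedPitr, newPtr)
  }
}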

The spark-submit command used to run the first Spark application locally:

1 Answer

支智志
2023-03-14

Since there is a bounty on this, I am posting it as an answer as well, but I would really like to flag it as a duplicate, because the actual exception here is one that is already covered and answered in another question:

This is caused by hdp.version not being substituted correctly. You have to set hdp.version in the java-opts file under $SPARK_HOME/conf.

Alternatively, the same property can be supplied through Spark configuration, for example via spark.driver.extraJavaOptions and spark.yarn.am.extraJavaOptions.
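As a concrete sketch of that fix: the HDP version used below (2.5.4.0-121) is inferred from the spark-assembly jar name in the container listing above and must match the version actually installed on your cluster. Create a file named java-opts under $SPARK_HOME/conf (here /usr/hdp/current/spark-client/conf) containing a single line:

-Dhdp.version=2.5.4.0-121

and/or set the equivalent properties in spark-defaults.conf (or on the SparkConf before building the ClientArguments):

spark.driver.extraJavaOptions -Dhdp.version=2.5.4.0-121
spark.yarn.am.extraJavaOptions -Dhdp.version=2.5.4.0-121

With hdp.version resolved, the ${hdp.version} placeholders in the container classpath no longer trigger the "bad substitution" error, and the ApplicationMaster class can be located.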
