I built a program that selects text data from Cassandra. Here is the code; it simply selects all the data and prints it to the console.
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

import local_settings


def get_spark_context(app_name, max_cores=120):
    # checkpointDirectory = ""
    conf = SparkConf().setMaster(local_settings.SPARK_MASTER).setAppName(app_name) \
        .set("spark.cores.max", max_cores) \
        .set("spark.jars.packages", "datastax:spark-cassandra-connector:2.0.0-s_2.11") \
        .set("spark.cassandra.connection.host", local_settings.CASSANDRA_MASTER)
    # set up the Spark context
    sc = SparkContext.getOrCreate(conf=conf)
    sc.setCheckpointDir(local_settings.CHECKPOINT_DIRECTORY)
    return sc


def get_sql_context(sc):
    return SQLContext.getOrCreate(sc)


def run():
    sc = get_spark_context("Select data")
    sql_context = get_sql_context(sc)
    sql_context.read.format("org.apache.spark.sql.cassandra") \
        .options(table="text", keyspace="data") \
        .load().show()


if __name__ == "__main__":
    run()

When I run it, the driver logs the following:
19/02/21 09:09:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/02/21 09:09:23 WARN Utils: Your hostname, osboxes resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface eth0)
19/02/21 09:09:23 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
19/02/21 09:09:44 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
19/02/21 09:09:59 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
On the worker side, the executor log shows:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/02/21 08:58:18 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 15264@mm_h01
19/02/21 08:58:18 INFO SignalUtils: Registered signal handler for TERM
19/02/21 08:58:18 INFO SignalUtils: Registered signal handler for HUP
19/02/21 08:58:18 INFO SignalUtils: Registered signal handler for INT
19/02/21 08:58:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/02/21 08:58:19 INFO SecurityManager: Changing view acls to: hadoop,osboxes
19/02/21 08:58:19 INFO SecurityManager: Changing modify acls to: hadoop,osboxes
19/02/21 08:58:19 INFO SecurityManager: Changing view acls groups to:
19/02/21 08:58:19 INFO SecurityManager: Changing modify acls groups to:
19/02/21 08:58:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop, osboxes); groups with view permissions: Set(); users with modify permissions: Set(hadoop, osboxes); groups with modify permissions: Set()
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:202)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
... 11 more
19/02/21 09:00:19 ERROR RpcOutboxMessage: Ask timeout before connecting successfully
What does this mean? Is there no communication between the master and the workers? Thanks.
It means the job was submitted to YARN, but the job could not start because YARN cannot currently provide the resources it requested.
Go to the Ambari/Cloudera UI and check whether other jobs are already running. Check YARN's container sizes. Check whether the job is configured to request more resources than the total available to YARN/Mesos.
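One concrete thing to check in your code: it asks for spark.cores.max = 120, and if the cluster has fewer free cores, the scheduler keeps the job in WAITING and prints exactly this warning. A minimal sketch of a more modest resource request (the numbers are assumptions, not recommendations; size them to what your cluster actually has free):

from pyspark import SparkConf, SparkContext

import local_settings

# Assumed, cluster-dependent values: request only what the cluster can
# actually grant, otherwise the job never accepts any resources.
conf = (SparkConf()
        .setMaster(local_settings.SPARK_MASTER)
        .setAppName("Select data")
        .set("spark.cores.max", "4")         # total cores the app may claim
        .set("spark.executor.memory", "1g")  # must fit inside each worker/container
        .set("spark.executor.cores", "2"))   # cores per executor

sc = SparkContext.getOrCreate(conf=conf)

If the job is accepted but executors still time out as in your stack trace, the driver log's own hint applies: your hostname resolves to a loopback address, so set SPARK_LOCAL_IP (or spark.driver.host) to an address the workers can actually reach.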