## Installing Scala
Download a suitable version from the Scala website and extract it to /usr/local/scala (any directory works; adjust the paths below accordingly). On Linux, add the bin directory to the PATH:

```bash
export PATH="$PATH:/usr/local/scala/bin"
```

Run `scala` to check that the installation succeeded.
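To make the setting survive new shells, a minimal sketch (file locations vary by distribution; ~/.bashrc is an assumption):

```bash
# Persist the PATH change (assumes Scala was extracted to /usr/local/scala)
echo 'export PATH="$PATH:/usr/local/scala/bin"' >> ~/.bashrc
source ~/.bashrc

# Verify: should print the installed version, e.g. 2.11.12
scala -version
```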
## Installing sbt manually
Download sbt from the official site (either the zip or the tgz archive works) and extract it to /usr/local/sbt. Then create a new script named sbt in that directory:

```bash
cd /usr/local/sbt
vi sbt
```
Give it the following content (`-XX:MaxPermSize=256M` can be dropped on Java 1.8, which removed the permanent generation):

```bash
#!/bin/bash
SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=256M"
java $SBT_OPTS -jar /usr/local/sbt/bin/sbt-launch.jar "$@"
```
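The new file is not executable yet; make it so before use:

```bash
chmod u+x /usr/local/sbt/sbt
```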
Configure the repositories:

```bash
vi ~/.sbt/repositories
```

with the following content:
```
[repositories]
local
aliyun-nexus: http://maven.aliyun.com/nexus/content/groups/public/
# or oschina: http://maven.oschina.net/content/groups/public/
jcenter: http://jcenter.bintray.com/
typesafe-ivy-releases: http://repo.typesafe.com/typesafe/ivy-releases/, [organization]/[module]/[revision]/[type]s/[artifact](-[classifier]).[ext], bootOnly
maven-central: http://repo1.maven.org/maven2/
```
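By default these repositories are only added to sbt's resolver chain; sbt also supports forcing them over any resolvers declared inside a build. A sketch of that optional tweak, appended to the SBT_OPTS line of the wrapper script created above:

```bash
# Optional: make sbt use only ~/.sbt/repositories, ignoring build-defined
# resolvers (standard sbt system property).
SBT_OPTS="$SBT_OPTS -Dsbt.override.build.repos=true"
```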
Configure the environment variables:

```bash
export SBT_HOME=/usr/local/sbt
export PATH=$PATH:$SBT_HOME
```

Run the `sbt` command to check whether the installation succeeded. On first run, sbt downloads its packages automatically; the setup is complete once the sbt console appears:

```
sbt:sbt>
```
## Deploying spark-jobserver

### Configuration
GitHub repository: https://github.com/spark-jobserver/spark-jobserver
Set the configuration environment to local: copy local.conf.template to local.conf and local.sh.template to local.sh.
```bash
cd /home/hadoop/application/spark-jobserver/conf
cp local.conf.template local.conf
cp local.sh.template local.sh
vi local.sh
```
```bash
#!/usr/bin/env bash
# Environment and deploy file
# For use with bin/server_deploy, bin/server_package etc.

# Hosts to deploy to over ssh (IP addresses also work)
DEPLOY_HOSTS="dashuju213
dashuju214"

# User and group to install as over ssh
APP_USER=hadoop
APP_GROUP=hadoop

JMX_PORT=9999

# optional SSH Key to login to deploy server
#SSH_KEY=/path/to/keyfile.pem

# Deployment installation directory
INSTALL_DIR=/home/hadoop/application/jobserver
# Log directory
LOG_DIR=/home/hadoop/application/jobserver/logs

PIDFILE=spark-jobserver.pid
JOBSERVER_MEMORY=1G

# Spark version
SPARK_VERSION=2.3.2
MAX_DIRECT_MEMORY=512M
# Spark home directory
SPARK_HOME=/home/hadoop/application/spark
SPARK_CONF_DIR=$SPARK_HOME/conf
# Scala version
SCALA_VERSION=2.11.12
```
Configure the database. Edit local.conf; only the settings that need to change are listed:
```
# also add the following line at the root level.
flyway.locations = "db/mysql/migration"

spark {
  # local[...], yarn, mesos://... or spark://...
  master = "spark://dashuju213:6066,dashuju214:6066"
  # client or cluster deployment
  submit.deployMode = "cluster"
  # Default # of CPUs for jobs to use for Spark standalone cluster
  job-number-cpus = 2

  jobserver {
    ...
    sqldao {
      # Slick database driver, full classpath
      slick-driver = slick.driver.MySQLDriver
      # JDBC driver, full classpath
      jdbc-driver = com.mysql.jdbc.Driver
      jdbc {
        url = "jdbc:mysql://db_host/spark_jobserver"
        user = "jobserver"
        password = "secret"
      }
      dbcp {
        maxactive = 20
        maxidle = 10
        initialsize = 10
      }
    }
  }
}
```
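For the jdbc settings above to work, the database and account must already exist on the MySQL host. A minimal sketch, assuming the names and password from local.conf (replace db_host, user, and password with your own):

```bash
# Create the database and user referenced by local.conf's sqldao section.
# "spark_jobserver", "jobserver", and "secret" come from the config above.
mysql -u root -p <<'SQL'
CREATE DATABASE spark_jobserver;
CREATE USER 'jobserver'@'%' IDENTIFIED BY 'secret';
GRANT ALL PRIVILEGES ON spark_jobserver.* TO 'jobserver'@'%';
FLUSH PRIVILEGES;
SQL
```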
Configure passwordless ssh login to the deploy hosts.
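A minimal sketch, assuming the hadoop account and the two deploy hosts from local.sh above:

```bash
# Generate a key pair (skip if one already exists), then copy the public key
# to each deploy host so server_deploy.sh can ssh and scp without a password.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
ssh-copy-id hadoop@dashuju213   # add "-p 2222" if sshd uses a non-default port
ssh-copy-id hadoop@dashuju214
```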
Configure the ssh port: the scripts use port 22 by default; change it as needed (here, 2222).
Edit server_deploy.sh (`vi server_deploy.sh`) and add the port options to the ssh and scp calls in the deploy loop:

```bash
for host in $DEPLOY_HOSTS; do
  # We assume that the deploy user is APP_USER and has permissions
  ssh -p 2222 -o StrictHostKeyChecking=no $ssh_key_to_use ${APP_USER}@$host mkdir -p $INSTALL_DIR
  scp -P 2222 -o StrictHostKeyChecking=no $ssh_key_to_use $FILES ${APP_USER}@$host:$INSTALL_DIR/
  scp -P 2222 -o StrictHostKeyChecking=no $ssh_key_to_use "$CONFIG_DIR/$ENV.conf" ${APP_USER}@$host:$INSTALL_DIR/
  scp -P 2222 -o StrictHostKeyChecking=no $ssh_key_to_use "$configFile" ${APP_USER}@$host:$INSTALL_DIR/settings.sh
done
```
### Deployment

Go to the bin directory and run the deploy command:

```bash
./server_deploy.sh local
```

When it finishes, go to the directory configured as INSTALL_DIR and use server_start.sh and server_stop.sh to start and stop the server.
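A quick smoke test, assuming the INSTALL_DIR above and spark-jobserver's default REST port of 8090:

```bash
cd /home/hadoop/application/jobserver
./server_start.sh
# The REST API should respond; an empty JSON list means no binaries uploaded yet.
curl http://dashuju213:8090/binaries
./server_stop.sh
```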
### Problems encountered

#### Startup problem
Because I had configured master and deployMode in spark-defaults.conf, server_start.sh had to be modified to the desired mode (only the start of the changed line is shown):

```bash
cmd='$SPARK_HOME/bin/spark-submit --master local[1] --deploy-mode
```
#### Database initialization failure

Modify spark-jobserver/spark-jobserver-master/job-server/src/main/resources/db/mysql/migration/V0_7_2/V0_7_2__convert_binaries_table_to_use_milliseconds.sql so that the column becomes a plain TIMESTAMP (presumably older MySQL versions reject the fractional-second type the migration ships with):

```sql
ALTER TABLE `BINARIES` MODIFY COLUMN `UPLOAD_TIME` TIMESTAMP;
```

You can then re-run the deploy command, or edit the file inside the packaged jar directly. If the next start fails with

```
Validate failed. Migration Checksum mismatch for migration 0.7.2
```

this is fallout from the earlier failed initialization: drop all tables in the database and re-initialize.
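One way to reset, assuming the spark_jobserver database from local.conf:

```bash
# Drop and recreate the database so the migrations run again from scratch.
mysql -u root -p -e "DROP DATABASE spark_jobserver; CREATE DATABASE spark_jobserver;"
```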
#### java.lang.ClassNotFoundException: akka.event.slf4j.Slf4jLogger

Modify project/Dependencies.scala, changing
"com.typesafe.akka" %% "akka-slf4j" % akka % "provided",
...
"io.spray" %% "spray-routing" % spray,
to
"com.typesafe.akka" %% "akka-slf4j" % akka,
...
"io.spray" %% "spray-routing-shapeless23" % "1.3.4",
Also add to project/Versions.scala (presumably referenced by a mysql-connector-java dependency for the MySQL-backed sqldao configured earlier):

```scala
lazy val mysql = "5.1.42"
```
### Usage

#### Running Spark SQL

Modify local.conf:
```
spark {
  jobserver {
    # Automatically load a set of jars at startup time. Key is the appName, value is the path/URL.
    job-binary-paths { # NOTE: you may need an absolute path below
      sql = job-server-extras/target/scala-2.10/job-server-extras_2.10-0.6.2-SNAPSHOT-tests.jar
    }
  }

  contexts {
    sql-context {
      num-cpu-cores = 1       # Number of cores to allocate. Required.
      memory-per-node = 512m  # Executor memory per node, -Xmx style eg 512m, 1G, etc.
      context-factory = spark.jobserver.context.HiveContextFactory
    }
  }
}
```
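With the extras test jar auto-loaded under appName sql and sql-context predefined above, a job can be submitted over the REST API. A hedged sketch: spark.jobserver.HiveTestJob is the Hive test job shipped in job-server-extras, and 8090 is the server's default port; adjust both to your build and deployment.

```bash
# Run a SQL statement through the predefined sql-context and wait for the result.
curl -d "sql = \"SHOW TABLES\"" \
  'http://dashuju213:8090/jobs?appName=sql&classPath=spark.jobserver.HiveTestJob&context=sql-context&sync=true'
```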