Here is my code:
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.sql.types import *
sc = SparkContext.getOrCreate()
spark = SparkSession.builder \
    .master("local") \
    .appName("Test") \
    .config('spark.jars', '/Users/zhao/Downloads/snowflake-jdbc-3.5.4.jar,/Users/zhao/Downloads/spark-snowflake_2.11-2.3.2.jar') \
    .getOrCreate()
sfOptions = {
    "sfURL": "xxx",
    "sfUser": "xxx",
    "sfPassword": "xxx",
    "sfDatabase": "xxx",
    "sfSchema": "xxx",
    "sfWarehouse": "xxx",
    "sfRole": "xxx"
}
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"
df = spark.read.format(SNOWFLAKE_SOURCE_NAME) \
    .options(**sfOptions) \
    .option("query", "select * from CustomerInfo limit 10") \
    .load()
I'd appreciate it if anyone could give me some ideas :)
How are you starting the Jupyter notebook server instance? Have you made sure the PYTHONPATH and SPARK_HOME variables are set correctly, and that no Spark instance is already running beforehand? Also, are your Snowflake Spark connector jars the variants built for the correct Spark and Scala versions?
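If you're not sure which variant you need, a quick check (a minimal sketch, assuming PySpark is importable in the same environment that backs the notebook) is to print the Spark and Scala versions your installation reports and match the connector artifact name against them:

from pyspark.sql import SparkSession

# Start a throwaway local session just to inspect versions
spark = SparkSession.builder.master("local").appName("VersionCheck").getOrCreate()
print("Spark version:", spark.version)  # e.g. 2.4.5
# The Scala version of the Spark build is reachable through the py4j gateway
print("Scala:", spark.sparkContext._jvm.scala.util.Properties.versionString())  # e.g. "version 2.11.12"
spark.stop()

A connector named spark-snowflake_2.11-2.7.2-spark_2.4.jar, for example, expects Scala 2.11 and a Spark 2.4.x build.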
For reference, here is a fully bootstrapped and tested run on a macOS machine (using Homebrew):
# Install JDK8
~> brew tap adoptopenjdk/openjdk
~> brew cask install adoptopenjdk8
# Install Apache Spark (v2.4.5 as of post date)
~> brew install apache-spark
# Install Jupyter Notebooks (incl. optional CLI notebooks)
~> pip3 install --user jupyter notebook
# Ensure we use JDK8 (using very recent JDKs will cause class version issues)
~> export JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home"
# Setup environment to allow discovery of PySpark libs and the Spark binaries
# (Uses homebrew features to set the paths dynamically)
~> export SPARK_HOME="$(brew --prefix apache-spark)/libexec"
~> export PYTHONPATH="${SPARK_HOME}/python:${SPARK_HOME}/python/build:${PYTHONPATH}"
~> export PYTHONPATH="$(brew list apache-spark | grep 'py4j-.*-src.zip$' | head -1):${PYTHONPATH}"
# Download jars for dependencies in notebook code into /tmp
# Snowflake JDBC (v3.12.8 used here):
~> curl --silent --location \
'https://search.maven.org/classic/remotecontent?filepath=net/snowflake/snowflake-jdbc/3.12.8/snowflake-jdbc-3.12.8.jar' \
> /tmp/snowflake-jdbc-3.12.8.jar
# Snowflake Spark Connector (v2.7.2 used here)
# But more importantly, a Scala 2.11 and Spark 2.4.x compliant one is fetched
~> curl --silent --location \
'https://search.maven.org/classic/remotecontent?filepath=net/snowflake/spark-snowflake_2.11/2.7.2-spark_2.4/spark-snowflake_2.11-2.7.2-spark_2.4.jar' \
> /tmp/spark-snowflake_2.11-2.7.2-spark_2.4.jar
# Run the jupyter notebook service (opens up in a web browser)
~> jupyter notebook
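Before pasting in the Snowflake code, it can help to sanity-check the environment from a notebook cell. A minimal sketch (not part of the original walkthrough) that just confirms the variables exported above are visible and that pyspark resolves from SPARK_HOME:

import os

# These should match the exports done before launching `jupyter notebook`
print("JAVA_HOME  =", os.environ.get("JAVA_HOME"))
print("SPARK_HOME =", os.environ.get("SPARK_HOME"))

# If PYTHONPATH was set correctly, this import succeeds without pip-installing pyspark
import pyspark
print("pyspark", pyspark.__version__, "from", pyspark.__file__)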
The following code is then run in a new Python 3 notebook:
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.sql.types import *
sfOptions = {
    "sfURL": "account.region.snowflakecomputing.com",
    "sfUser": "username",
    "sfPassword": "password",
    "sfDatabase": "db_name",
    "sfSchema": "schema_name",
    "sfWarehouse": "warehouse_name",
    "sfRole": "role_name",
}
spark = SparkSession.builder \
    .master("local") \
    .appName("Test") \
    .config('spark.jars', '/tmp/snowflake-jdbc-3.12.8.jar,/tmp/spark-snowflake_2.11-2.7.2-spark_2.4.jar') \
    .getOrCreate()
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"
df = spark.read.format(SNOWFLAKE_SOURCE_NAME) \
    .options(**sfOptions) \
    .option("query", "select * from CustomerInfo limit 10") \
    .load()
df.show()
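Side note (not part of the run above): instead of downloading the jars by hand, Spark can resolve the same artifacts from Maven Central through the spark.jars.packages setting. A sketch assuming the same Scala 2.11 / Spark 2.4 connector variant:

from pyspark.sql import SparkSession

# Same session setup, but with Maven coordinates instead of local jar paths
spark = SparkSession.builder \
    .master("local") \
    .appName("Test") \
    .config("spark.jars.packages",
            "net.snowflake:snowflake-jdbc:3.12.8,"
            "net.snowflake:spark-snowflake_2.11:2.7.2-spark_2.4") \
    .getOrCreate()

Whichever approach is used, the essential points are that the connector artifact matches the Scala (2.11) and Spark (2.4.x) versions of the local installation, and that the jars are supplied before the first SparkContext/SparkSession is created.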