当前位置: 首页 > 知识库问答 >
问题:

Apache Spark+Ignite集群瘦客户端

杨征
2023-03-14

我正在尝试使用apache-spark读取和写入Ignite集群,我可以使用JDBC瘦客户机,但不是本机方法,正如几个spark+Ignite示例中提到的那样。

现在,所有的spark+ignite示例都启动了一个本地ignite集群,但我希望我的代码作为客户端连接到已经存在的集群。

val df = spark.read
.format("jdbc")
.option("url", "jdbc:ignite:thin://3.88.248.113")
.option("fetchsize",100)
//.option("driver", "org.apache.ignite.IgniteJdbcDriver")
.option("dbtable", "Person").load()

df.printSchema()

df.createOrReplaceTempView("test")

spark.sql("select * from test where id=1").show(10)

spark.sql("select 4,'blah',124232").show(10)

import java.sql.DriverManager
val connection = DriverManager.getConnection("jdbc:ignite:thin://3.88.248.113")

import java.util.Properties
val connectionProperties = new Properties()

connectionProperties.put("url", "jdbc:ignite:thin://3.88.248.113")

spark.sql("select 4 as ID,'blah' as STREET,124232 as ZIP").write.mode(SaveMode.Append).jdbc("jdbc:ignite:thin://3.88.248.113",
  "Person",connectionProperties)

spark.read
  .format("jdbc")
  .option("url", "jdbc:ignite:thin://3.88.248.113")
  .option("fetchsize",100)
  .option("dbtable", "Person").load().show(10,false)
val igniteDF = spark.read
  .format(FORMAT_IGNITE) //Data source type.
  .option(OPTION_TABLE, "person") //Table to read.
  .option(OPTION_CONFIG_FILE, CONFIG) //Ignite config.
  .load()
  .filter(col("id") >= 2) //Filter clause.
  .filter(col("name") like "%J%") //Another filter clause.

完整代码:-(sparkDSLExample)函数无法使用thin连接ignite远程群集

package com.ignite.examples.spark

import com.ignite.examples.model.Address
import org.apache.ignite.{Ignite, Ignition}
import org.apache.ignite.cache.query.SqlFieldsQuery
import org.apache.ignite.client.{ClientCache, IgniteClient}
import org.apache.ignite.configuration.{CacheConfiguration, ClientConfiguration}
import java.lang.{Long => JLong, String => JString}

import org.apache.ignite.cache.query.SqlFieldsQuery
import org.apache.ignite.spark.IgniteDataFrameSettings.{FORMAT_IGNITE, OPTION_CONFIG_FILE, OPTION_TABLE}
import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions.col

object SparkClientConnectionTest {

  private val CACHE_NAME = "SparkCache"

  private val CONFIG = "/Users/kalit_000/Downloads/designing-event-driven-applications-apache-kafka-ecosystem/05/demos/kafka-streams-after/ApacheIgnitePoc/src/main/scala/com/ignite/examples/config/example-ignite.xml"

  def setupExampleData = {

    val cfg2 = new ClientConfiguration().setAddresses("3.88.248.113:10800")
    val igniteClient:IgniteClient = Ignition.startClient(cfg2)

    System.out.format(">>> Created cache [%s].\n", CACHE_NAME)

    val cache:ClientCache[Integer, Address] = igniteClient.getOrCreateCache(CACHE_NAME)

    cache.query(new SqlFieldsQuery(String.format("DROP TABLE IF EXISTS Person"))
      .setSchema("PUBLIC")).getAll

    cache.query(new SqlFieldsQuery(String.format("CREATE TABLE IF NOT EXISTS Person (id LONG,street varchar, zip VARCHAR, PRIMARY KEY (id) ) WITH \"VALUE_TYPE=%s\"", classOf[Address].getName))
      .setSchema("PUBLIC")).getAll

    cache.query(new SqlFieldsQuery("INSERT INTO Person(id,street, zip) VALUES(?,?, ?)").setArgs(1L.asInstanceOf[JLong],"Jameco", "04074").setSchema("PUBLIC")).getAll
    cache.query(new SqlFieldsQuery("INSERT INTO Person(id,street, zip) VALUES(?,?, ?)").setArgs(2L.asInstanceOf[JLong],"Bremar road", "520003").setSchema("PUBLIC")).getAll
    cache.query(new SqlFieldsQuery("INSERT INTO Person(id,street, zip) VALUES(?,?, ?)").setArgs(3L.asInstanceOf[JLong],"orange road", "1234").setSchema("PUBLIC")).getAll

    System.out.format(">>> Data Inserted into Cache [%s].\n", CACHE_NAME)

    val data=cache.query(new SqlFieldsQuery("select * from Person").setSchema("PUBLIC")).getAll

    println(data.toString)
  }

  def sparkDSLExample(implicit spark: SparkSession): Unit = {
    println("Querying using Spark DSL.")
    println


    val igniteDF = spark.read
      .format(FORMAT_IGNITE) //Data source type.
      .option(OPTION_TABLE, "person") //Table to read.
      .option(OPTION_CONFIG_FILE, CONFIG) //Ignite config.
      .load()
      .filter(col("id") >= 2) //Filter clause.
      .filter(col("name") like "%J%") //Another filter clause.

    println("Data frame schema:")

    igniteDF.printSchema() //Printing query schema to console.

    println("Data frame content:")

    igniteDF.show() //Printing query results to console.
  }


  def main(args: Array[String]): Unit = {

    setupExampleData

    //Creating spark session.
    implicit val spark = SparkSession.builder()
      .appName("Spark Ignite data sources example")
      .master("local")
      .config("spark.executor.instances", "2")
      .getOrCreate()

    // Adjust the logger to exclude the logs of no interest.
    Logger.getRootLogger.setLevel(Level.ERROR)
    Logger.getLogger("org.apache.ignite").setLevel(Level.INFO)

    //sparkDSLExample


    val df = spark.read
    .format("jdbc")
    .option("url", "jdbc:ignite:thin://3.88.248.113")
    .option("fetchsize",100)
    //.option("driver", "org.apache.ignite.IgniteJdbcDriver")
    .option("dbtable", "Person").load()

    df.printSchema()

    df.createOrReplaceTempView("test")

    spark.sql("select * from test where id=1").show(10)

    spark.sql("select 4,'blah',124232").show(10)

    import java.sql.DriverManager
    val connection = DriverManager.getConnection("jdbc:ignite:thin://3.88.248.113")

    import java.util.Properties
    val connectionProperties = new Properties()

    connectionProperties.put("url", "jdbc:ignite:thin://3.88.248.113")

    spark.sql("select 4 as ID,'blah' as STREET,124232 as ZIP").write.mode(SaveMode.Append).jdbc("jdbc:ignite:thin://3.88.248.113",
      "Person",connectionProperties)

    spark.read
      .format("jdbc")
      .option("url", "jdbc:ignite:thin://3.88.248.113")
      .option("fetchsize",100)
      .option("dbtable", "Person").load().show(10,false)

  }

}

示例-default.xml:-

<?xml version="1.0" encoding="UTF-8"?>

<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->

<!--
    Ignite configuration with all defaults and enabled p2p deployment and enabled events.
-->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/util
        http://www.springframework.org/schema/util/spring-util.xsd">
    <bean abstract="true" id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- Set to true to enable distributed class loading for examples, default is false. -->
        <property name="peerClassLoadingEnabled" value="true"/>

        <!-- Enable task execution events for examples. -->
        <property name="includeEventTypes">
            <list>
                <!--Task execution events-->
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_STARTED"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FINISHED"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FAILED"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_TIMEDOUT"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_SESSION_ATTR_SET"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_REDUCED"/>

                <!--Cache events-->
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_CACHE_OBJECT_PUT"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_CACHE_OBJECT_READ"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_CACHE_OBJECT_REMOVED"/>
            </list>
        </property>

        <!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    <!--
                        Ignite provides several options for automatic discovery that can be used
                        instead os static IP based discovery. For information on all options refer
                        to our documentation: http://apacheignite.readme.io/docs/cluster-config
                    -->
                    <!-- Uncomment static IP finder to enable static-based discovery of initial nodes. -->
                    <!--<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">-->
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
                        <property name="addresses">
                            <list>
                                <!-- In distributed environment, replace with actual host IP address. -->
                                <value>3.88.248.113:10800</value>
                            </list>
                        </property>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</beans>

共有1个答案

关翰
2023-03-14

JDBC瘦客户机是连接AWS上的Ignite+外部Ignite集群的Spark的唯一方式,这在下面的grid gain博客中得到了回答。

https://forums.gridgain.com/community-home/digestviewer/viewthread?messagekey=13f5e836-1569-486a-8475-84c70fc141e0&communitykey=3b551477-7e2d-462f-bc5f-d6d10ccbbe35&tab=digestviewer&successmsg=thank+you+for+submitting+your+message。

Spark+Ignite示例:-https://github.com/kali786516/apacheignitePoc/blob/master/src/main/scala/com/Ignite/examples/Spark/sparkigniteCleanCode.scala

 类似资料:
  • 我有一个正在运行的Ignite集群,并且我使用进行节点发现: 它工作得很好,我可以使用节点连接到这个集群。 null

  • 我使用Apache Ignite 2.7.5作为.NET核心中服务器和瘦客户机。当我做与缓存相关的操作时,put、get和load等.net核心应用程序会自动崩溃。 因此,我想处理for循环内部的异常,例如、、等,然后从catch块抛出for循环,否则如果只有异常块,则继续循环迭代。

  • 现在我使用的是瘦客户机的ignite ClientCache,我没有找到ClientCache的分布式锁,如果要使用分布式锁,必须使用ignition.start()

  • 我们正在运行一个带有3个节点的ignite集群,它从第三方数据库(使用自定义缓存存储)预加载数据。当我们尝试使用java瘦客户机连接到集群时,如果请求在数据加载完成之前到达集群,我们会得到未知对异常和一些不稳定的行为。 堆栈跟踪

  • 2)如果瘦客户机和服务器驻留在不同的主机上,将500个条目保存到两个缓存中需要大约4分钟,这看起来非常糟糕。 即使我们考虑到一些网络延迟,我也无法证明在case2(我们希望采用的实现模式)中这种显著的延迟是合理的。我想知道这是否与我的缓存配置有关,如下所示? 瘦客户端代码: }

  • 当使用.NET瘦客户端时,是否可以启动和提交/回滚Ignite事务?我没有找到开始交易的方法。谢谢你!