
Spark Components: GraphX Learning 15 -- Analyzing the Large web-Google.txt Graph

裴学
2023-12-01

More code is available at: https://github.com/xubo245/SparkLearning


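1. Explanation

Load the web-Google.txt graph, count its edges and vertices, and find the vertices with the maximum out-degree, in-degree, and total degree.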
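
To make the three quantities concrete, here is a minimal plain-Scala sketch (no Spark required) that computes out-degree, in-degree, and total degree for a tiny made-up edge list and then picks the maximum of each. The object name DegreeSketch, the helpers countBy and maxDegree, and the four edges are assumptions for illustration only; they are not part of the original program.

```scala
// Plain-Scala illustration; the edge list below is hypothetical.
object DegreeSketch {
  type VertexId = Long

  // Hypothetical directed edges, each written as (source, destination).
  val edges: Seq[(VertexId, VertexId)] = Seq((0L, 1L), (0L, 2L), (1L, 2L), (3L, 2L))

  // Count how many times each vertex id occurs in the given sequence.
  def countBy(ids: Seq[VertexId]): Map[VertexId, Int] =
    ids.groupBy(identity).map { case (id, occurrences) => (id, occurrences.size) }

  val outDegrees: Map[VertexId, Int] = countBy(edges.map(_._1)) // edges leaving a vertex
  val inDegrees: Map[VertexId, Int] = countBy(edges.map(_._2))  // edges entering a vertex
  val degrees: Map[VertexId, Int] =
    countBy(edges.flatMap { case (src, dst) => Seq(src, dst) }) // both directions

  // Same comparison as the max function in the Spark listing:
  // keep the (vertexId, degree) pair with the larger degree.
  def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) =
    if (a._2 > b._2) a else b

  def maxDegree(ds: Map[VertexId, Int]): (VertexId, Int) =
    ds.toSeq.reduce[(VertexId, Int)](max)

  def main(args: Array[String]): Unit = {
    println("max of outDegrees:" + maxDegree(outDegrees)) // (0,2)
    println("max of inDegrees:" + maxDegree(inDegrees))   // (2,3)
    println("max of Degrees:" + maxDegree(degrees))       // (2,3)
  }
}
```

In this toy graph vertex 0 has the most outgoing edges while vertex 2 has the most incoming edges and the highest total degree, so the three maxima need not belong to the same vertex.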


2. Code:

/**
 * @author xubo
 * ref http://spark.apache.org/docs/1.5.2/graphx-programming-guide.html
 * http://snap.stanford.edu/data/web-Google.html
 * time 20160503
 */

package org.apache.spark.graphx.learning

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.graphx.GraphLoader
import org.apache.spark.graphx.VertexId

object webGoogle {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("webGoogle").setMaster("local[4]")
    val sc = new SparkContext(conf)

    // Load the edge list: each non-comment line is a "srcId dstId" pair of web-page IDs
    val graph = GraphLoader.edgeListFile(sc, "file/data/graphx/input/web-Google.txt")
    println("graph.numEdges:" + graph.numEdges)
    println("graph.numVertices:" + graph.numVertices)
    println("\n edges 10:")
    graph.edges.take(10).foreach(println)
    println("\n vertices 10:")
    graph.vertices.take(10).foreach(println)

    //***************************************************************************************************
    //*******************************      Graph properties      *****************************************
    //***************************************************************************************************
    println("**********************************************************")
    println("属性演示") // prints "Property demo"
    println("**********************************************************")
    println("Graph:")

    // Degree operations
    // prints "Find the maximum out-degree, in-degree, and degree in the graph:"
    println("找出图中最大的出度、入度、度数:")
    // Reducer that keeps the (vertexId, degree) pair with the larger degree
    def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) = {
      if (a._2 > b._2) a else b
    }
    println("max of outDegrees:" + graph.outDegrees.reduce(max) +
      " max of inDegrees:" + graph.inDegrees.reduce(max) +
      " max of Degrees:" + graph.degrees.reduce(max))
    println()

    sc.stop()
  }
}
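
GraphLoader.edgeListFile expects a text file in which each line holds a source ID and a target ID, skips lines that begin with #, and assigns the attribute 1 to every edge and vertex, which is why the output prints edges as Edge(0,11342,1) and vertices as (266991,1). The following plain-Scala sketch imitates that documented behaviour on a few sample lines. It is a simplification for illustration, not GraphX's actual implementation; EdgeListSketch, parseLine, and the sample lines are assumed names and data.

```scala
// Plain-Scala illustration of the edge-list layout; no Spark required.
object EdgeListSketch {
  // Illustrative sample in the edge-list layout; not the real file contents.
  val sample: Seq[String] = Seq(
    "# FromNodeId\tToNodeId",
    "0\t11342",
    "0\t824020",
    "",
    "1\t53051"
  )

  // Parse one line into Some((src, dst)); blank and comment lines give None.
  def parseLine(line: String): Option[(Long, Long)] = {
    val trimmed = line.trim
    if (trimmed.isEmpty || trimmed.startsWith("#")) None
    else {
      val fields = trimmed.split("\\s+")
      require(fields.length >= 2, s"Invalid line: $line")
      Some((fields(0).toLong, fields(1).toLong))
    }
  }

  val edges: Seq[(Long, Long)] = sample.flatMap(line => parseLine(line))

  def main(args: Array[String]): Unit =
    // Print in the same shape as GraphX edges, with the default attribute 1.
    edges.foreach { case (src, dst) => println(s"Edge($src,$dst,1)") }
}
```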



3. Results:

graph.numEdges:5105039
graph.numVertices:875713

 edges 10:
Edge(0,11342,1)
Edge(0,824020,1)
Edge(0,867923,1)
Edge(0,891835,1)
Edge(1,53051,1)
Edge(1,203402,1)
Edge(1,223236,1)
Edge(1,276233,1)
Edge(1,552600,1)
Edge(1,569212,1)

 vertices 10:
(266991,1)
(651447,1)
(182316,1)
(846729,1)
(627804,1)
(831957,1)
(512760,1)
(307248,1)
(449586,1)
(857454,1)
**********************************************************
属性演示
**********************************************************
Graph:
找出图中最大的出度、入度、度数:
max of outDegrees:(506742,456) max of inDegrees:(537039,6326) max of Degrees:(537039,6353)
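
Two quick checks can be read off these numbers. In GraphX the total degree of a vertex is its in-degree plus its out-degree, and vertex 537039 holds both the maximum in-degree and the maximum total degree, so its out-degree follows by subtraction. Each edge also contributes exactly one out-degree, so the mean out-degree over all vertices is edges divided by vertices, about 5.83. The sketch below only redoes this arithmetic with the figures printed above; the object and value names are assumptions for illustration.

```scala
// Arithmetic on the reported results; no Spark required.
object ResultCheck {
  val numEdges = 5105039L
  val numVertices = 875713L
  val maxInDegree = 6326 // reported for vertex 537039
  val maxDegree = 6353   // reported for vertex 537039

  // degree = inDegree + outDegree, so the out-degree of vertex 537039 is the difference.
  val outDegreeOf537039: Int = maxDegree - maxInDegree

  // Every edge adds one to exactly one vertex's out-degree.
  val meanOutDegree: Double = numEdges.toDouble / numVertices

  def main(args: Array[String]): Unit = {
    println("out-degree of vertex 537039: " + outDegreeOf537039) // 27
    println("mean out-degree: " + meanOutDegree)
  }
}
```

The most linked-to page therefore has thousands of incoming links but only 27 outgoing ones, far from the maximum out-degree of 456 held by vertex 506742.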



References

[1] http://spark.apache.org/docs/1.5.2/graphx-programming-guide.html

[2] https://github.com/xubo245/SparkLearning

[3] http://snap.stanford.edu/data/web-Google.html

