
Paper Reading - Thinking Like a Vertex: A Survey of Vertex-Centric Frameworks for Large-Scale Distributed Graph Processing

赵飞雨
2023-12-01

**Thinking Like a Vertex: A Survey of Vertex-Centric Frameworks for Large-Scale Distributed Graph Processing**, by ROBERT RYAN MCCUNE, TIM WENINGER, and GREG MADEY, University of Notre Dame

  • Highlight, page
    implements user-defined programs from the perspective of a vertex rather than a graph

  • Highlight, page
    interdependencies

  • Highlight, page
    However, for sequential graph algorithms, which require random access to all graph data, poor locality and the indivisibility of the graph structure cause time- and resource-intensive pointer chasing between storage mediums in order to access each datum.

  • Highlight, page
    MapReduce does not natively support iterative algorithms

  • Highlight, page
    identifying the tradeoffs in component implementations and providing data-driven discussion

  • Highlight, page
    subgraph-centric, or hybrid, frameworks

  • Highlight, page
    Think-like-a-vertex frameworks are platforms that iteratively execute a user-defined program over vertices of a graph.

  • Highlight, page
    Many graph problems can be solved by both a sequential, shared-memory algorithm and a distributed, vertex-centric algorithm.

  • Highlight, page
    TLAV frameworks are highly scalable and inherently parallel, with manageable intermachine communication.

  • Highlight, page
    a given vertex to all other vertices in a graph

  • Highlight, page
    This algorithm, considered a distributed version of Bellman-Ford [Lynch 1996], is shown in Algorithm 1.
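
The survey gives Algorithm 1 in pseudocode; the sketch below is only a minimal, single-process Python rendering of such a vertex-centric SSSP program (a distributed Bellman-Ford), assuming a hypothetical Pregel-style API with a `compute` method, message inbox/outbox, and a synchronous superstep driver. None of these names come from the paper.

```python
# Minimal sketch of a Pregel-style vertex-centric SSSP (distributed Bellman-Ford).
# The Vertex class and superstep driver are hypothetical, for illustration only.
INF = float("inf")

class SSSPVertex:
    def __init__(self, vid, out_edges, is_source=False):
        self.vid = vid
        self.out_edges = out_edges        # dict: neighbor id -> edge weight
        self.value = 0 if is_source else INF
        self.active = is_source           # only the source starts active

    def compute(self, messages, outbox):
        """Run once per superstep for each vertex that is active or has messages."""
        candidate = min(messages, default=self.value)
        if candidate < self.value or (self.active and self.value == 0):
            self.value = min(self.value, candidate)
            # Propose a new tentative distance to every out-neighbor.
            for nbr, weight in self.out_edges.items():
                outbox.setdefault(nbr, []).append(self.value + weight)
        self.active = False               # vote to halt; a message reactivates us

def run_sssp(vertices):
    """Synchronous superstep loop: iterate until no messages are in flight."""
    inbox = {vid: [] for vid in vertices}
    while True:
        outbox = {}
        for v in vertices.values():
            if v.active or inbox[v.vid]:
                v.compute(inbox[v.vid], outbox)
        if not outbox:
            break
        inbox = {vid: outbox.get(vid, []) for vid in vertices}
    return {vid: v.value for vid, v in vertices.items()}

# Example: distances from source "a" in a small weighted digraph.
graph = {
    "a": SSSPVertex("a", {"b": 1, "c": 4}, is_source=True),
    "b": SSSPVertex("b", {"c": 2}),
    "c": SSSPVertex("c", {}),
}
print(run_sssp(graph))   # {'a': 0, 'b': 1, 'c': 3}
```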

  • Highlight, page
    respective edge

  • Highlight, page
    detailing

  • Highlight, page
    proceeds

  • Highlight, page
    Communication

  • Highlight, page
    vertex program execution

  • Highlight, page
    interdependent

  • Highlight, page
    thorough understanding

  • Highlight, page
    interrelation

  • Highlight, page
    a sequential presentation

  • Highlight, page
    discussed earlier.

  • Highlight, page
    conceptually simple

  • Highlight, page
    the overhead becomes largely amortized for large graphs.

  • Underline, page
    synchronous systems are often implemented along with message-passing communication

  • Highlight, page
    batch messaging

  • Highlight, page
    low computation-to-communication ratio

  • Underline, page
    synchronization

  • Underline, page
    accounted for over 80% of the total running time [Chen et al. 2014a],

  • Highlight, page
    underutilized

  • Highlight, page
    the curse of the last reducer

  • Highlight, page
    straggler

  • Highlight, page
    lightweight

  • Highlight, page
    straggler

  • Highlight, page
    outperform

  • Highlight, page
    Theoretical and empirical research

  • Highlight, page
    One disadvantage, however, is that asynchronous execution cannot take advantage of batch messaging optimizations

  • Highlight, page 1
    at the expense of added complexity, not only from scheduling logic, but also from maintaining data consistency. Asynchronous systems typically implement shared memory

  • Highlight, page 1
    pseudo-supersteps [Chen et al. 2014a]

  • Highlight, page 1
    dynamic scheduling within a single superstep [Wang et al. 2013].

  • Highlight, page 1
    reduces the number of supersteps by decoupling intraprocessor computation from the interprocessor communication and synchronization [Chen et al. 2014a].

  • Highlight, page 1
    P++ framework [Zhou et al. 2014]

  • Underline, page 1
    boundary nodes

  • Highlight, page 1
    local nodes

  • Highlight, page 1
    pseudo-supersteps

  • Highlight, page 1
    GraphHP and P++, is the KLA paradigm [Harshvardhan et al. 2014]

  • Highlight, page 1
    KLA

  • Highlight, page 1
    is allowed for a certain number of levels before a synchronous round

  • Highlight, page 1
    KLA has multiple traversals of asynchronous execution before coordinating a round of synchronous execution

  • Highlight, page 1
    GRACE exposes a programming interface that, from within a given superstep, allows for prioritized execution of vertices and selective receiving of messages outside of the previous superstep.

  • Highlight, page 1
    Motivated by the necessity for execution mode dynamism

  • Highlight, page 1
    PowerSwitch’s heuristics can accurately predict throughput

  • Highlight, page 1
    data for one process is directly and immediately accessible by another process

  • Underline, page 1
    In the case of the former

  • Highlight, page 1
    Otherwise

  • Highlight, page 1
    is flushed when it reaches a certain capacity, sending messages over the network in batches

  • Highlight, page 1
    Shared memory avoids the additional memory overhead constituted by messages and doesn't require intermediate processing by workers.

  • Highlight, page 1
    Chandy-Misra locking

  • Highlight, page 1
    derivative with

  • Highlight, page 1
    prioritized execution and low communication overhead

  • Highlight, page 1
    Cyclops is a synchronous shared-memory framework [Chen et al. 2014b]

  • Highlight, page 1
    a third method called active messages is implemented in the GRE framework [Yan et al. 2014b].

  • Highlight, page 1
    Within the GRE architecture, active messages combine the process of sending and receiving messages, removing the need to store an intermediate state, like message queues or edge data

  • Highlight, page 1
    The GRE framework modifies the data graph into an Agent-Graph.

  • Highlight, page 1
    The Agent-Graph adds combiner and scatter vertices to the original graph in order to reduce intermachine messaging.

  • Highlight, page 1
    costly

  • Highlight, page 1
    IBM’s X-Pregel [Bao and Suzumura 2013],

  • Highlight, page 1
    plateaus

  • Highlight, page 1
    Computation for the combiner must be commutative and associative because order cannot be guaranteed
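
To make the commutativity and associativity requirement concrete, here is a hedged sketch of a min-distance combiner for SSSP-style messages; the `MinCombiner` class and its `combine` hook are hypothetical names, not the API of Pregel or any framework surveyed.

```python
# Sketch: a min combiner for SSSP-style messages. Because workers may merge
# queued messages in any order and grouping, the operation must be commutative
# (min(a, b) == min(b, a)) and associative (min(min(a, b), c) == min(a, min(b, c))).
from functools import reduce

class MinCombiner:
    def combine(self, msg_a, msg_b):
        return min(msg_a, msg_b)

queue = [7.0, 3.5, 9.2, 3.5]
combined = reduce(MinCombiner().combine, queue)
print(combined)   # 3.5, regardless of how the framework ordered the merges
```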

  • Underline, page 1
    Conversely

  • Highlight, page 1
    For algorithms that combine vertices into a supervertex, like Boruvka’s Minimum Spanning Tree [Chung and Condon 1996]

  • Highlight, page 1
    counterintuitively

  • Highlight, page 1
    Pregel, including its open-source implementations [Avery 2011; Seo et al. 2010] and several related variants [Salihoglu and Widom 2013; Bao and Suzumura 2013; Redekopp et al. 2013].

  • Highlight, page 1
    The user provides two functions, one function that executes across each vertex in the active subset and another function that executes all outgoing edges in the subset.

  • Highlight, page 1
    Edge Centric. The X-Stream framework provides an edge-centric two-phase Scatter-Gather programming model [Roy et al. 2013],

  • Highlight, page 1
    Push Versus Pull. The flow of information for vertex programs can be characterized as data being pushed or pulled [Nguyen et al. 2013; Hant et al. 2014; Cheng et al. 2012].

  • Highlight, page 1
    Ligra is a single-machine graph-processing framework that dynamically switches between push- and pull-based operators based on a threshold.
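
Ligra itself is a parallel C++ framework; the single-threaded Python sketch below only illustrates the direction-switching idea for one BFS step: push along out-edges while the frontier is sparse, and pull over unvisited vertices' in-edges once the frontier's outgoing edge count crosses a threshold fraction of all edges. The function, argument names, and the 5% threshold are assumptions for illustration, not Ligra's code.

```python
# Sketch of a direction-optimizing BFS step in the spirit of Ligra's push/pull switch.
# The graph is given as adjacency dicts; the threshold constant is an illustrative guess.
def bfs_step(frontier, visited, out_nbrs, in_nbrs, num_edges, threshold=0.05):
    """Expand one BFS frontier, choosing a push or pull traversal direction."""
    frontier_out_edges = sum(len(out_nbrs[v]) for v in frontier)
    next_frontier = set()
    if frontier_out_edges < threshold * num_edges:
        # Sparse frontier: push along the out-edges of frontier vertices only.
        for v in frontier:
            for w in out_nbrs[v]:
                if w not in visited:
                    next_frontier.add(w)
    else:
        # Dense frontier: pull by scanning each unvisited vertex's in-edges.
        for w in in_nbrs:
            if w not in visited and any(u in frontier for u in in_nbrs[w]):
                next_frontier.add(w)
    visited |= next_frontier
    return next_frontier
```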

  • Highlight, page 1
    interpartition edges

  • Highlight, page 1
    However, for graphs of even medium size, the high computational cost and necessary random access of the entire graph render METIS and related heuristics impractical.

  • Highlight, page 1
    The densification problem is addressed in Vaquero et al. [2014], wherein

  • Highlight, page 1
    More advanced label propagation schemes for partitioning are presented in Wang et al. [2014] and Slota et al. [2014].

  • Highlight, page 1
    GraphBuilder [Jain et al. 2013] is a similar library that, in addition to partitioning, supports an extensive variety of graph-loading-related processing tasks.

  • Highlight, page 2
    the partitioner may temporarily store a vertex and decide the partitioning later [Stanton and Kliot 2012];

  • Highlight, page 2
    however, experiments demonstrate that performance remains relatively consistent for breadth-first, depth-first, and random orderings of a graph [Stanton and Kliot 2012; Tsourakakis et al. 2014].

  • Highlight, page 2
    linear deterministic greedy (LDG), a heuristic that assigns a vertex to the partition with which it shares the most edges while weighted by a penalty function linearly associated with a partition's remaining capacity.
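
As a rough illustration of that rule, here is a minimal Python sketch of LDG for an unweighted vertex stream, assuming each streamed vertex arrives together with its neighbor set; the scoring follows the description above (shared edges weighted by a linear remaining-capacity penalty, with ties broken toward the emptier partition), but the code and names are mine, not Stanton and Kliot's implementation.

```python
# Sketch of linear deterministic greedy (LDG) streaming partitioning.
# Score each partition by shared edges, weighted by a linear penalty on its
# remaining capacity; ties go to the partition with the most room left.
def ldg_assign(vertex, neighbors, partitions, capacity):
    best_pid, best_key = 0, None
    for pid, members in enumerate(partitions):
        shared = len(neighbors & members)            # neighbors already placed here
        penalty = 1.0 - len(members) / capacity      # linear remaining-capacity weight
        key = (shared * penalty, -len(members))
        if best_key is None or key > best_key:
            best_pid, best_key = pid, key
    partitions[best_pid].add(vertex)
    return best_pid

# Example: stream six vertices into k = 2 partitions of capacity n / k = 3.
parts = [set(), set()]
stream = [("a", {"b"}), ("b", {"a", "c"}), ("c", {"b", "d"}),
          ("d", {"c"}), ("e", {"f"}), ("f", {"e"})]
for v, nbrs in stream:
    ldg_assign(v, nbrs, parts, capacity=3)
print(parts)   # two balanced partitions, e.g. [{'a', 'b', 'c'}, {'d', 'e', 'f'}]
```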

  • Highlight, page 2
    restreaming graph partitioning model, where a streaming partitioner is provided access to previous stream results [Nishimura and Ugander 2013].

  • Highlight, page 2
    eightfold

  • Highlight, page 2
    thorough analysis

  • Highlight, page 2
    too computationally expensive

  • Highlight, page 2
    Good workload balance for skewed degree distributions can also be achieved with degree-based hashing [Xie et al. 2014].

  • Highlight, page 2
    vary drastically over the course of computation, which creates processing imbalances and increases runtime.

  • Highlight, page 2
    such as graph coarsening [Wang et al. 2014].

  • Highlight, page 2
    According to Salihoglu and Widom [2013], a dynamic repartitioning strategy must directly address (1) how to select vertices to reassign, (2) how and when to move the assigned vertices, and (3) how to locate the reassigned vertices.

  • Highlight, page 2
    the loading and partitioning can be performed in parallel [Salihoglu and Widom 2013].

  • Highlight, page 2
    XPregel [Bao and Suzumura 2013] supports multithreading by dividing a partition into a user-defined number of subpartitions

  • Highlight, page 2
    When a failure occurs, the system rolls back to the most recently saved point, all partitions are reloaded, and the entire system resumes processing from the checkpoint.

  • Highlight, page 2
    GraphLab [Low et al. 2012] implements asynchronous vertex checkpointing, based on Chandy-Lamport [Chandy and Lamport 1985] snapshots,

  • Highlight, page 2
    they are highly scalable while providing a simple programming interface,

  • Highlight, page 2
    dictated

  • Highlight, page 2
    GraphChi. The seminal single-machine TLAV framework is GraphChi [Kyrola et al. 2012],

  • Highlight, page 2
    PathGraph also implements a path-centric compact storage system that improves compactness and locality [Yuan et al. 2014]. Because most iterative graph algorithms involve path traversal,

  • Highlight, page 2
    retaining

  • Highlight, page 2
    in varying degrees

  • Highlight, page 2
    shared either through vertex programs on boundary nodes or, in the case of Blogel, directly between subgraphs

  • Highlight, page 2
    Collectively, subgraph-centric frameworks dramatically outperform TLAV frameworks, often by orders of magnitude in terms of computing time, number of messages, and total supersteps [Tian et al. 2013; Yan et al. 2014a].

  • Highlight, page 2
    A vertex subset interface is implemented in Ligra [Shun and Blelloch 2013].

  • Highlight, page 2
    Similarly, the Single Pivot (SP) optimization [Salihoglu and Widom 2014], first presented in Quick et al. [2012], also temporarily adopts a global view.

  • Highlight, page 2
    the Parallel Boost Graph Library [Gregor and Lumsdaine 2005]

  • Highlight, page 2
    first-class citizens

  • Highlight, page 2
    Databases offer local or online queries, such as one-hop neighbors, whereas TLAV systems iteratively process the entire graph offline in batch

  • Highlight, page 3
    Distributed algorithms are a mature field of study [Lynch 1996],

  • Highlight, page 3
    Not all graphs are large enough to necessitate distributed processing, and not all graph problems need the whole graph to be computed iteratively.

Reposted from: https://my.oschina.net/u/178169/blog/675198
