Gunrock: A High-Performance Graph Processing Library on the GPU
Yangzihao Wang, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, John D. Owens
University of California, Davis

Why is Gunrock fast? How does Gunrock express graph algorithms?

Overview

Gunrock is a stable, powerful, forward-looking, open-source substrate for GPU-based graph-centric research and development.

    ### import libraries
    from ctypes import *
    gunrock = cdll.LoadLibrary('../../build/lib/libgunrock.so')

    // Compute delta values
    while (!forward_queue_offsets.empty()) {
        // Compute delta values
        BCEnactor::gunrock::oprtr::advance<BCProblem, BackwardEnactor>();
    }

    ### output array
    scores = pointer((c_float * nodes)())
    ### call gunrock function on device
    gunrock.bc(scores, nodes, edges, row, col, 0)
    ### sample results
    print 'node bc scores:',
    for idx in range(nodes): print scores[0][idx],

    BCProblem::Extract(); // Get result

(a) Compute BC in Python. (b) Develop BC using Gunrock.
Figure: Code snapshots of using Gunrock and of developing with Gunrock.

What is Gunrock’s Data-centric Programming Model?


BFS: Advance (update label value), then Filter (remove redundant).
BC: Advance (accumulate sigma value), then Filter (remove redundant), then Advance (compute BC value).
CC: Filter (for e=(v1,v2), assign c[v1] to c[v2]; remove e when c[v1]==c[v2]), then Filter (for v, assign c[v] to c[c[v]]; remove v when c[v]==c[c[v]]).
SSSP: Advance (update label value), then Filter (remove redundant), then Near/Far Pile.
Legend: traversal, computation, functor.
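The two CC filter rules above (hooking over edges, then pointer jumping over vertices) can be sketched sequentially in plain Python. This is a minimal illustration, not Gunrock's implementation: the function name `connected_components` is hypothetical, and hooking toward the smaller label is an assumption made here so that the sequential loop is guaranteed to converge.

```python
def connected_components(num_nodes, edges):
    """Label each vertex with a component ID (hypothetical helper, not Gunrock API)."""
    c = list(range(num_nodes))          # every vertex starts in its own component
    edge_frontier = list(edges)
    while edge_frontier:
        # Edge filter (hooking): merge the labels seen at both ends of each edge.
        # Assumption: always hook the larger label onto the smaller one.
        for v1, v2 in edge_frontier:
            a, b = c[v1], c[v2]
            if a != b:
                hi, lo = max(a, b), min(a, b)
                c[hi] = min(c[hi], lo)
        # Vertex filter (pointer jumping): assign c[c[v]] to c[v] until stable.
        vertex_frontier = list(range(num_nodes))
        while vertex_frontier:
            still_moving = []
            for v in vertex_frontier:
                if c[v] != c[c[v]]:
                    c[v] = c[c[v]]
                    still_moving.append(v)
            vertex_frontier = still_moving
        # Remove every edge whose endpoints now share a label.
        edge_frontier = [(a, b) for a, b in edge_frontier if c[a] != c[b]]
    return c
```

For example, `connected_components(6, [(0, 1), (1, 2), (3, 4)])` groups vertices 0 to 2 together, 3 and 4 together, and leaves 5 alone. Both frontiers shrink as work completes, which is the point of expressing CC with filters.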


Figure (plot): speedup comparison with series Speedup-BGL, Speedup-Ligra, Speedup-MapGraph, Speedup-Cusha, and Speedup-hardwired GPU for BFS, SSSP, BC, PageRank, and CC on the roadnet, kron, bitc, and soc datasets. Axis ticks run from 0.1 to 1000; the values 7435.21 and 4155.72 are labeled on the plot.

Block cooperative Advance of large neighbor lists;
Warp cooperative Advance of medium neighbor lists;
Per-thread Advance of small neighbor lists.

Contact Information

• Gunrock Website: http://gunrock.github.io/ • Author’s Email: [email protected]

Funding Agencies

DARPA XDATA W911QX-12-C-0059, STTR D14PC00023; NSF OCI-1032859, CCF-1017399.

PR: Advance (distribute PR value to neighbors), then Filter (update PR value; remove when PR value converges).

Figure: Several graph primitives in Gunrock.

Future Work

- Scale to multiple GPUs/nodes;
- Asynchronous model;
- Out-of-core and streaming support;
- Expand core operators and new primitives;
- In-depth performance characterization.

(a) Load-balanced traversal. (b) Dynamic-grouped traversal.
Figure: Two core load-balancing strategies in Gunrock.

Primitives in Gunrock:

- traversal-based: breadth-first search, single-source shortest path;
- node-ranking: HITS, SALSA, PageRank, betweenness centrality;
- global: connected components, minimum spanning tree.

A frontier is a compact queue of nodes or edges. Gunrock’s three operators (below) manipulate frontiers.

- advance: generate a new frontier from the edges or vertices of the current frontier;
- filter: generate a new frontier from a current frontier using a user-specified predicate;
- compute: run a user-specified computation in parallel on each element in the current frontier.
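To make the advance and filter operators concrete, here is a minimal sequential sketch of BFS written as "advance, then filter" over a CSR graph. It is plain Python, not Gunrock's API: the names `advance`, `filter_frontier`, and `bfs` are hypothetical, and the functor deliberately accepts a vertex more than once per level to mimic several GPU threads discovering the same vertex concurrently.

```python
def advance(row_offsets, col_indices, frontier, functor):
    """Visit each outgoing edge of the frontier; keep destinations the functor accepts."""
    out = []
    for u in frontier:
        for e in range(row_offsets[u], row_offsets[u + 1]):
            v = col_indices[e]
            if functor(u, v):
                out.append(v)
    return out


def filter_frontier(frontier, predicate):
    """Keep only the frontier elements that satisfy the predicate."""
    return [v for v in frontier if predicate(v)]


def bfs(row_offsets, col_indices, source):
    """Return the BFS depth of every vertex (-1 if unreachable)."""
    labels = [-1] * (len(row_offsets) - 1)
    labels[source] = 0
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1

        def update_label(u, v):
            # Accept unvisited vertices and vertices labeled in this same level,
            # so duplicates appear as they would with concurrent threads.
            if labels[v] in (-1, depth):
                labels[v] = depth
                return True
            return False

        frontier = advance(row_offsets, col_indices, frontier, update_label)

        seen = set()

        def first_occurrence(v):
            if v in seen:
                return False
            seen.add(v)
            return True

        # Filter step: remove redundant entries from the new frontier.
        frontier = filter_frontier(frontier, first_occurrence)
    return labels
```

The loop ends when the frontier is empty, so the amount of work per iteration tracks the size of the frontier instead of the size of the whole graph.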

    ### read in input CSR arrays from files
    row_list = [int(x.strip()) for x in open('path/to/rowoffsets/r_file')]
    col_list = [int(x.strip()) for x in open('path/to/columnindices/c_file')]
    ### convert CSR graph inputs for gunrock input
    row = pointer((c_int * len(row_list))(*row_list))
    col = pointer((c_int * len(col_list))(*col_list))
    nodes = len(row_list) - 1
    edges = len(col_list)
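The snippet above expects a graph in CSR form: a row-offsets array with one entry per node plus one, and a column-indices array with one entry per edge, which is why `nodes = len(row_list) - 1` and `edges = len(col_list)`. The following sketch shows one way such arrays could be built from an edge list; the helper name `edges_to_csr` is hypothetical and not part of Gunrock.

```python
def edges_to_csr(num_nodes, edges):
    """Build CSR row offsets and column indices from (src, dst) pairs."""
    # Count the outgoing edges of each source vertex.
    counts = [0] * num_nodes
    for src, _ in edges:
        counts[src] += 1

    # An exclusive prefix sum of the counts gives the row offsets.
    row_offsets = [0] * (num_nodes + 1)
    for v in range(num_nodes):
        row_offsets[v + 1] = row_offsets[v] + counts[v]

    # Scatter each destination into its source's slice of col_indices.
    col_indices = [0] * len(edges)
    cursor = row_offsets[:-1]  # slicing copies, so row_offsets is not modified
    for src, dst in edges:
        col_indices[cursor[src]] = dst
        cursor[src] += 1
    return row_offsets, col_indices
```

The neighbors of vertex `v` are then `col_indices[row_offsets[v]:row_offsets[v + 1]]`, so a neighbor list's length is a single subtraction.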

Gunrock offers:

- the best performance on GPU graph analytics;
- a high-level abstraction for graph algorithms on the GPU; and
- the widest range of primitives.


    BCProblem::Init(); // Initialization
    // Accumulate sigma values
    while (frontier_queue_length > 0) {
        forward_queue_offsets.push(new_offsets);
        // Get neighbors and update scores
        BCEnactor::gunrock::oprtr::advance<BCProblem, ForwardEnactor>();
        // Generate new vertex frontier
        BCEnactor::gunrock::oprtr::filter<BCProblem, ForwardEnactor>();
    }
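The BC listing has a forward phase that accumulates sigma values and a backward phase that computes delta values. As a rough sequential reference for what those two phases compute from a single source, here is a plain-Python sketch in the style of Brandes's algorithm. It is an assumption-laden illustration: `bc_from_source` is a hypothetical name, the graph is taken to be unweighted CSR, and none of this is Gunrock code.

```python
def bc_from_source(row_offsets, col_indices, source):
    """Dependency (delta) of every vertex for shortest paths starting at source."""
    n = len(row_offsets) - 1
    sigma = [0] * n      # number of shortest paths from source
    depth = [-1] * n     # BFS depth, -1 means not reached
    delta = [0.0] * n
    sigma[source] = 1
    depth[source] = 0

    # Forward phase: level-by-level traversal, accumulating sigma values.
    levels = []
    frontier = [source]
    while frontier:
        levels.append(frontier)
        next_frontier = []
        for u in frontier:
            for e in range(row_offsets[u], row_offsets[u + 1]):
                v = col_indices[e]
                if depth[v] == -1:
                    depth[v] = depth[u] + 1
                    next_frontier.append(v)
                if depth[v] == depth[u] + 1:
                    sigma[v] += sigma[u]
        frontier = next_frontier

    # Backward phase: replay the stored frontiers deepest first, computing delta.
    while levels:
        for u in levels.pop():
            for e in range(row_offsets[u], row_offsets[u + 1]):
                v = col_indices[e]
                if depth[v] == depth[u] + 1:
                    delta[u] += sigma[u] / sigma[v] * (1.0 + delta[v])

    delta[source] = 0.0  # the source does not lie strictly between itself and others
    return delta
```

Storing one frontier per level in the forward phase plays the same role as pushing queue offsets in the listing: the backward phase needs to revisit the levels in reverse order.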

Gunrock's powerful load-balancing capabilities effectively address the inherent irregularity in graphs.
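The dynamic-grouped strategy assigns large neighbor lists to a whole block, medium lists to a warp, and small lists to individual threads. A minimal sketch of that classification step follows. The thresholds (a warp of 32 threads, a block of 256 threads) and the function name `group_by_neighbor_list_size` are assumptions for illustration; the poster does not state the cutoffs Gunrock uses.

```python
WARP_SIZE = 32    # assumed warp width
BLOCK_SIZE = 256  # assumed threads per block


def group_by_neighbor_list_size(row_offsets, frontier,
                                warp_size=WARP_SIZE, block_size=BLOCK_SIZE):
    """Bin frontier vertices by the length of their CSR neighbor lists."""
    groups = {"block": [], "warp": [], "thread": []}
    for v in frontier:
        degree = row_offsets[v + 1] - row_offsets[v]
        if degree > block_size:
            groups["block"].append(v)   # large list: whole block cooperates
        elif degree > warp_size:
            groups["warp"].append(v)    # medium list: one warp cooperates
        else:
            groups["thread"].append(v)  # small list: a single thread handles it
    return groups
```

Grouping by size keeps threads that work together on lists of similar length, so a thread with a very long neighbor list does not stall the rest of its warp.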
