
Faster motif counting via succinct color coding and adaptive sampling
We address the problem of computing the distribution of induced connected subgraphs, a.k.a. graphlets or motifs, in large graphs. The current state-of-the-art algorithms estimate the motif counts via uniform sampling, by leveraging the color coding technique of Alon, Yuster and Zwick. In this work we extend the applicability of this approach by introducing a set of algorithmic optimizations and techniques that reduce the running time and space usage of color coding and improve the accuracy of the counts. To this end, we first show how to optimize color coding to efficiently build a compact table of a representative subsample of all graphlets in the input graph. For 8-node motifs, we can build such a table in one hour for a graph with 65M nodes and 1.8B edges, which is 2000 times larger than the state of the art. We then introduce a novel adaptive sampling scheme that breaks the “additive error barrier” of uniform sampling, guaranteeing multiplicative approximations instead of just additive ones. This allows us to count not only the most frequent motifs, but also extremely rare ones. For instance, on one graph we accurately count nearly 10,000 distinct 8-node motifs whose relative frequency is so small that uniform sampling would literally take centuries to find them. Our results show that color coding is still the most promising approach to scalable motif counting.
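To make the color coding idea concrete, here is a minimal Python sketch. It is not the paper's algorithm: it is deliberately simplified to count simple k-vertex paths rather than induced graphlets, and it has none of the succinct-table or adaptive-sampling optimizations the abstract describes. The function names `count_colorful_paths` and `estimate_paths` and the adjacency-dict input format are assumptions made for illustration. Each vertex gets one of k random colors, a dynamic program over color sets counts the "colorful" paths (all colors distinct), and the count is rescaled by the probability k!/k^k that a fixed k-vertex path is colorful.

```python
import math
import random
from collections import defaultdict


def count_colorful_paths(adj, colors, k):
    """Count simple paths on k >= 2 vertices whose colors are all distinct.

    adj    : dict mapping each vertex to an iterable of neighbors (undirected)
    colors : dict mapping each vertex to a color in range(k)
    """
    # table[v][mask] = number of colorful paths ending at v that use
    # exactly the colors in the bitmask `mask`.
    table = {v: defaultdict(int) for v in adj}
    for v in adj:
        table[v][1 << colors[v]] = 1

    for _ in range(k - 1):  # grow paths by one vertex per round
        grown = {v: defaultdict(int) for v in adj}
        for v in adj:
            bit = 1 << colors[v]
            for u in adj[v]:
                for mask, cnt in table[u].items():
                    if not mask & bit:  # color of v is still unused
                        grown[v][mask | bit] += cnt
        table = grown

    total = sum(sum(t.values()) for t in table.values())
    return total // 2  # each undirected path was counted from both ends


def estimate_paths(adj, k, trials, seed=0):
    """Unbiased estimate of the number of k-vertex paths (illustrative)."""
    rng = random.Random(seed)
    p_colorful = math.factorial(k) / k**k
    total = 0
    for _ in range(trials):
        colors = {v: rng.randrange(k) for v in adj}
        total += count_colorful_paths(adj, colors, k)
    return total / trials / p_colorful
```

Because distinct colors force distinct vertices, the dynamic program never has to remember which vertices a path visited, only which colors it used. That is what keeps the table size dependent on 2^k rather than on the number of vertex subsets.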