This page is for quickSpan, the fast and reliable gSpan implementation by the TAMIS team at Inria Rennes. The source code, documentation, and data sets are available via at the here.
quickSpan is fast:
- Adopting task parallel programming.
- Incorporates C++11 hastable and hashset.
- Uses contiguous memory storage.
- Uses partial pruning.
quickSpan is memory efficient:
- Incorporates C++11 emplace_back method.
- Reconstructs graphs with frequent edges and nodes before mining.
- Reuse memory to store temporal data.
- Does not store or build unnecessary intermediate structures.
Comparison with other algorithms
The comparison runs over the accompanying data sets sourced from various academic works. The implementations below are:
gBolt – original gBolt implementation
GLP – Graph Learning Package/mathlab library
quickSpan – the implementation here
quiskSpan1 – implementation here with “-threads 1” for single threading forced
SFS – the implementation used by the following two papers:
B. Bringmann, A. Zimmermann, L. D. Raedt, and S. Nijssen. Don’t be afraid of simpler patterns. In PKDD, pages 55-66, 2006.
B. Bringmann and S. Nijssen. What is frequent in a single graph? In PAKDD, pages 858-863, 2008.
The results of comparing the above algorithms over several publicly available (and included with the source code) data sets are shown below.
Data set name Support gBolt GLP quickSpan quickSpan1 SFS BrainNet ADHD 0.10 0.27 1.77 0.11 0.13 T 0.15 0.27 0.14 0.10 0.03 T 0.20 0.28 0.07 0.08 0.02 T 0.50 0.13 0.07 0.07 0.02 T Hyper/Impulsive 0.10 14.12 T 11.48 117.54 T 0.15 0.67 54.06 0.67 1.44 T 0.20 0.39 2.47 0.19 0.44 T 0.50 0.19 0.11 0.07 0.06 T Gender 0.10 0.34 11.16 0.16 0.52 T 0.15 0.30 1.35 0.10 0.12 T 0.20 0.30 0.10 0.11 0.05 T 0.50 0.16 0.05 0.08 0.03 T Chemical 340 0.10 0.37 1.06 0.18 0.21 0.23 0.15 0.34 0.68 0.14 0.12 0.14 0.20 0.39 0.37 0.13 0.09 0.13 0.50 0.28 0.18 0.10 0.06 0.06 Mutagen 0.10 0.40 0.70 0.23 0.28 T 0.15 0.40 0.40 0.16 0.24 T 0.20 0.37 0.32 0.15 0.19 T 0.50 0.22 0.13 0.12 0.15 T NCI AIDS 0.10 5.73 90.96 5.75 15.15 T 0.15 5.21 48.11 5.36 11.39 T 0.20 4.84 34.45 4.78 9.75 T 0.50 4.19 4.33 4.19 5.16 T AIDS Active 0.10 0.59 18.62 0.36 1.20 2.53 0.15 0.45 4.70 0.28 0.58 0.96 0.20 0.43 1.52 0.25 0.26 0.53 0.50 0.32 0.29 0.18 0.11 0.12 AIDS + Cancer 0.10 3.34 57.28 3.32 8.37 T 0.15 3.10 29.51 3.05 6.32 T 0.20 2.80 20.66 2.86 5.26 T 0.50 2.55 6.78 2.47 3.13 T Cancer 0.10 19.05 363.91 20.44 89.04 T 0.15 15.11 170.93 15.50 51.63 T 0.20 13.60 108.72 14.04 38.19 T 0.50 10.42 29.90 10.77 14.99 T Social DBLP 0.10 0.84 1.51 0.80 0.77 T 0.15 0.84 1.48 0.85 0.78 T 0.20 0.85 1.51 0.82 0.75 T 0.50 0.82 1.50 0.89 0.75 T Twitter 0.10 1.61 3.01 1.61 1.62 T 0.15 1.66 3.06 1.61 1.52 T 0.20 1.63 3.06 1.60 1.49 T 0.50 1.70 3.04 1.78 1.48 T Pathological 2 1.00 X X 84.54(24.16) 81.21(23.38) T Pathological 20 0.10 X X 86.51(291.51) 86.11(289.36) T 0.15 X X 88.98(291.98) 84.91(288.22) T 0.20 X X 88.08(293.15) 85.86(288.75) T 0.50 X X 87.22(292.51) 86.58(289.31) T 1.00 X T 88.11(292.68) 84.24(295.06) T
For further questions, feedback, support, or collaboration please send an email to thomas.given-wilson@inria.fr .