#### Second International Workshop on
## High Performance Big Graph Data
## Management, Analysis, and Mining

**November 1, 2015**

To be held in conjunction with the [2015 IEEE International Conference on Big Data **(IEEE BigData 2015)**](http://cci.drexel.edu/bigdata/bigdata2015/)

[Hyatt Regency Santa Clara](http://santaclara.hyatt.com/en/hotel/home.html), Santa Clara, CA, USA

### Workshop Description

Modern Big Data increasingly appears in the form of complex graphs and networks. Examples include the physical Internet, the World Wide Web, online social networks, phone networks, and biological networks. In addition to their massive sizes, these graphs are dynamic, noisy, and sometimes transient. They also conform to all five Vs (Volume, Velocity, Variety, Value, and Veracity) that define Big Data. However, many graph-related problems are computationally difficult, and thus big graph data brings unique challenges, as well as numerous opportunities for researchers, to solve various problems that are significant to our communities.

Big graph problems are currently solved using several complementary paradigms. Perhaps the most popular approach is to exploit parallelism, through specialized algorithms for supercomputers, shared-memory multicore and manycore systems, and heterogeneous CPU-GPU systems. However, since real-world graphs are sparse and highly irregular, there are very few parallel implementations that can actually deliver high performance. The major challenges to scaling and efficiency include irregular data dependencies, poor locality, and the high synchronization costs of current approaches.

In addition to parallelism, researchers are developing approximation algorithms that use sampling for compressing and summarizing graph data. Streaming algorithms are also being considered for scenarios where the rate of updates is too fast to process the entire graph in a single pass.
Further, out-of-core algorithms are necessary for massive graphs that do not fit in the main memory of a typical system.

Researchers can use graph-based solutions for solving problems from many diverse disciplines, including routing and transportation, social networks, bioinformatics, computational science, health care, security and intelligence analysis. This workshop aims to bring together researchers from different paradigms solving big graph problems under a unified platform for sharing their work and exchanging ideas. We are soliciting novel and original research contributions related to big graph data management, analysis, and mining (algorithms, software systems, applications, best practices, performance). Significant work-in-progress papers are also encouraged. Papers can be from any of the following areas, including but not limited to:

* Parallel algorithms for big graph analysis on HPC systems
* Heterogeneous CPU-GPU solutions to solve big graph problems
* Extreme-scale computing for large graph, tensor, and network problems
* Sampling and summarization of large graphs
* Graph algorithms for large-scale scientific computing problems
* Graph clustering, partitioning, and classification methods
* Scalable graph topology measurement: diameter approximation, eigenvalues, triangle and graphlet counting
* Parallel algorithms for computing graph kernels
* Inference on large graph data
* Graph evolution and dynamic graph models
* Graph databases, novel querying and indexing strategies for RDF data
* Novel applications of big graph problems in bioinformatics, health care, security, and social networks
* New software systems and runtime systems for big graph data mining

Submissions must be at most 8 pages long, including all figures, tables, and references. They must be formatted according to the style files used by the IEEE BigData 2015 conference proceedings.
Papers must be submitted online through the workshop submission page by 11:59 pm PDT (Pacific Daylight Time) on September 5, 2015.

### Past Workshops

[BigGraphs 2014](workshop2014.html)

### Important Dates

* Sep 5, 2015: Submission deadline
* Sep 25, 2015: Notification of paper acceptance to authors
* Oct 5, 2015: Camera-ready submissions due
* Nov 1, 2015: Workshop date

### Keynote

**[Mohammed J. Zaki](http://www.cs.rpi.edu/~zaki/)**

![Mohammed J. Zaki](zaki.jpg "Mohammed J. Zaki")

Professor, Dept. of Computer Science and Engineering, Rensselaer Polytechnic Institute

**Distributed Graph Mining in Massive Networks**

**Abstract:** Distributed data processing platforms such as MapReduce and Pregel have substantially simplified the design and deployment of certain classes of distributed graph analytics algorithms. However, these platforms are not a good match for distributed graph mining problems like mining frequent subgraphs. Given an input graph, these problems require exploring a very large number of subgraphs and finding patterns that match some "interestingness" criteria desired by the user. These algorithms are very important for areas such as social networks, semantic web, and bioinformatics. In this talk I will present Arabesque, a distributed data processing platform for implementing graph mining algorithms. Arabesque defines a high-level API that explores subgraphs and passes them to the application, which must simply compute outputs and decide whether the subgraph should be further extended. We use Arabesque's API to produce distributed solutions to three fundamental graph mining problems: frequent subgraph mining, counting motifs, and finding cliques. Our implementations require a handful of lines of code, scale to trillions of subgraphs, and represent in some cases the first available distributed solutions. I will also talk about techniques to scale graph mining methods to web-scale networks with billions of nodes and edges.

[Keynote talk slides](Arabesque-BigGraphW.pdf)

### Papers

* Guoyao Feng, Xiao Meng, and Khaled Ammar, [DISTINGER: A Distributed Graph Data Structure for Massive Dynamic Graph Processing](http://dx.doi.org/10.1109/BigData.2015.7363954)
* Shaikh Arifuzzaman, Maleq Khan, and Madhav Marathe, [A Fast Parallel Algorithm for Counting Triangles in Graphs using Dynamic Load Balancing](http://dx.doi.org/10.1109/BigData.2015.7363957)
* Janani Balaji and Rajshekhar Sunderraman, [Scalable Storage Structure for Pattern Matching on Big Graph Data](http://dx.doi.org/10.1109/BigData.2015.7363958)
* Olivier Cure, Hubert Naacke, Tenfry Randriamalala, and Bernd Amann, [LiteMat: a scalable, cost-efficient inference encoding scheme for large RDF graphs](http://dx.doi.org/10.1109/BigData.2015.7363955)
* Harris Lin, Ngot Bui, and Vasant Honavar, [Learning Classifiers from Remote RDF Data Stores Augmented with RDFS Subclass Hierarchies](http://dx.doi.org/10.1109/BigData.2015.7363953)
* Alireza Rezaei Mahdiraji and Peter Baumann, [MQuery: A Query Language for Scientific Meshes](http://dx.doi.org/10.1109/BigData.2015.7363956)

### Workshop Organizers

[Mohammad Al Hasan](http://cs.iupui.edu/~alhasan/)

![Mohammad Al Hasan](hasan.jpg "Mohammad Al Hasan")

Department of Computer and Information Science, Indiana University - Purdue University Indianapolis, IN 46202

[Kamesh Madduri](http://www.cse.psu.edu/~madduri)

![Kamesh Madduri](madduri.jpg "Kamesh Madduri")

Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802

[Fengguang Song](http://cs.iupui.edu/~fgsong/)

![Fengguang Song](song.jpg "Fengguang Song")

Department of Computer and Information Science, Indiana University - Purdue University Indianapolis, IN 46202

### Program Committee

* Leman Akoglu (Stony Brook University)
* Medha Atre (University of Pennsylvania)
* Juan Colmenares (Samsung Research America)
* Oded Green (Georgia Institute of Technology)
* Lian Liu (University of Kentucky)
* Mahantesh Halappanavar (Pacific Northwest National Laboratory)
* Mohammad Al Hasan (Indiana University Purdue University)
* Kamesh Madduri (The Pennsylvania State University)
* Erik Saule (University of North Carolina at Charlotte)
* Fengguang Song (Indiana University Purdue University)
* Chen Tian (Huawei Technologies USA)
* Stanimire Tomov (University of Tennessee Knoxville)
* Yinglong Xia (IBM Research)
* Mohammed J. Zaki (Rensselaer Polytechnic Institute)

### Contact

Please send email to one of the workshop organizers.