2014 BigGraphs Workshop at IEEE BigData'14

#### First International Workshop on
## High Performance Big Graph Data
## Management, Analysis, and Mining

**October 27, 2014**

Full-day Workshop at [2014 IEEE International Conference on Big Data **(IEEE BigData 2014)**](http://cci.drexel.edu/bigdata/bigdata2014/)

**Hyatt Regency Bethesda**, Bethesda, MD, USA

### Workshop Description

Modern Big Data increasingly appears in the form of complex graphs and networks. Examples include the physical Internet, the world wide web, online social networks, phone networks, and biological networks. In addition to their massive sizes, these graphs are dynamic, noisy, and sometimes transient. They also conform to all five Vs (Volume, Velocity, Variety, Value, and Veracity) that define Big Data. However, many graph-related problems are computationally difficult, and thus big graph data brings unique challenges, as well as numerous opportunities for researchers, to solve various problems that are significant to our communities.

Big graph problems are currently solved using several complementary paradigms. The most popular approach is perhaps to exploit parallelism, through specialized algorithms for supercomputers, shared-memory multicore and manycore systems, and heterogeneous CPU-GPU systems. However, since real-world graphs are sparse and highly irregular, there are very few parallel implementations that can actually deliver high performance. The major challenges to scaling and efficiency include irregular data dependencies, poor locality, and high synchronization costs of current approaches.

In addition to parallelism, researchers are developing approximation algorithms that use sampling for compressing and summarizing graph data. Streaming algorithms are also being considered for scenarios where the rate of updates is too fast to process the entire graph in a single pass. Further, out-of-core algorithms are necessary for massive graphs that do not fit in the main memory of a typical system.

Researchers can use graph-based solutions for solving problems from many diverse disciplines, including routing and transportation, social networks, bioinformatics, computational science, health care, security and intelligence analysis. This workshop aims to bring together researchers from different paradigms solving big graph problems under a unified platform for sharing their work and exchanging ideas.

We are soliciting novel and original research contributions related to big graph data management, analysis, and mining (algorithms, software systems, applications, best practices, performance). Significant work-in-progress papers are also encouraged. Papers can be from any of the following areas, including but not limited to:

* Parallel algorithms for big graph analysis on HPC systems
* Heterogeneous CPU-GPU solutions to solve big graph problems
* Extreme-scale computing for large graph, tensor, and network problems
* Sampling and summarization of large graphs
* Graph algorithms for large-scale scientific computing problems
* Graph clustering, partitioning, and classification methods
* Scalable graph topology measurement: diameter approximation, eigenvalues, triangle and graphlet counting
* Parallel algorithms for computing graph kernels
* Inference on large graph data
* Graph evolution and dynamic graph models
* Graph databases, novel querying and indexing strategies for RDF data
* Novel applications of big graph problems in bioinformatics, health care, security, and social networks
* New software systems and runtime systems for big graph data mining

Submissions must be at most 8 pages long, including all figures, tables, and references. They must be formatted according to the [IEEE Computer Society Proceedings manuscript](ftp://pubftp.computer.org/Press/Outgoing/proceedings/instruct8.5x11x2.pdf) preparation guidelines.

### Important Dates

* Sep 6, 2014: Submission deadline
* Sep 27, 2014: Notification of paper acceptance to authors
* Oct 5, 2014: Camera-ready submissions due
* Oct 5, 2014: Conference early registration deadline
* Oct 27, 2014: Workshop date

### Keynote

**[Srinivasan Parthasarathy](http://web.cse.ohio-state.edu/~srini/)**

Professor, Dept. of Computer Science and Engineering and Dept. of Biomedical Informatics, The Ohio State University

**Large Scale Data Analytics: Challenges, and the role of Stratified Data Placement**

**Abstract:** With the increasing popularity of XML data stores, social networks, and Web 2.0 and 3.0 applications, complex data formats, such as trees and graphs, are becoming ubiquitous. Managing and processing such large and complex data stores on modern computational ecosystems, to realize actionable information efficiently, is daunting. In this talk I will begin by discussing some of these challenges. Subsequently I will discuss a critical element at the heart of this challenge, which relates to the placement, storage, and access of such tera- and peta-scale data. In this work we develop a novel distributed framework to ease the burden on the programmer and propose an agile and intelligent placement service layer as a flexible yet unified means to address this challenge. Central to our framework is the notion of stratification, which seeks to initially group structurally (or semantically) similar entities into strata. Subsequently, strata are partitioned within this ecosystem according to the needs of the application to maximize locality, balance load, minimize data skew, or even take into account energy consumption. Results on several real-world applications validate the efficacy and efficiency of our approach.

### Papers

* Amlan Chatterjee, Sridhar Radhakrishnan, and Chandra N. Sekharan, [Connecting the dots: Triangle completion and related problems on large data sets using GPUs](http://dx.doi.org/10.1109/BigData.2014.7004365)
* Naga Shailaja Dasari, Ranjan Desh, and Zubair M, [ParK: An Efficient Algorithm of k-core Decomposition on Multicore Processors](http://dx.doi.org/10.1109/BigData.2014.7004366)
* William Eberle and Lawrence Holder, [A Partitioning Approach to Scaling Anomaly Detection in Graph Streams](http://dx.doi.org/10.1109/BigData.2014.7004367)
* Ghizlane Echbarthi and Hamamache Kheddochi, [Fractional Greedy and Partial Restreaming Partitioning: New Methods For Massive Graph Partitioning](http://dx.doi.org/10.1109/BigData.2014.7004368)
* S. M. Faisal, Srinivasan Parthasarathy, and P Sadayappan, [Global graphs: A middleware for large scale graph processing](http://dx.doi.org/10.1109/BigData.2014.7004369)
* Ronald Hagan, Charles Phillips, Kai Wang, Gary Rogers, and Michael Langston, [Toward an Efficient, Highly Scalable Maximum Clique Solver for Massive Graphs](http://dx.doi.org/10.1109/BigData.2014.7004370)
* David Mizell, Kristyn Maschhoff, and Steve Reinhardt, [Extending SPARQL with graph functions](http://dx.doi.org/10.1109/BigData.2014.7004371)
* Josephine Namayanja and Vandana Janeja, [Change Detection in Temporally Evolving Computer Networks: A Big Data Framework](http://dx.doi.org/10.1109/BigData.2014.7004372)
* Christian L. Staudt, Yassine Marrakchi, and Henning Meyerhenke, [Detecting Communities Around Seed Nodes in Complex Networks](http://dx.doi.org/10.1109/BigData.2014.7004373)
* Ichitaro Yamazaki, Theo Mary, Jakub Kurzak, Stanimire Tomov, and Jack Dongarra, [Access-averse Framework for Computing Low-rank Matrix Approximations](http://dx.doi.org/10.1109/BigData.2014.7004374)
* Angen Zheng, Alexandros Labrinidis, and Panos Chrysanthis, [Architecture-Aware Graph Repartitioning for Data-Intensive Scientific Computing](http://dx.doi.org/10.1109/BigData.2014.7004375)

### Workshop Organizers

* [Mohammad Al Hasan](http://cs.iupui.edu/~alhasan/), Department of Computer and Information Science, Indiana University-Purdue University, Indianapolis, IN 46202
* [Kamesh Madduri](http://www.cse.psu.edu/~madduri), Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802
* [Fengguang Song](http://cs.iupui.edu/~fgsong/), Department of Computer and Information Science, Indiana University-Purdue University, Indianapolis, IN 46202

### Program Committee

* Nesreen Ahmed, Purdue University
* Medha Atre, University of Pennsylvania
* Mohammad Al Hasan, Indiana University-Purdue University
* Aydin Buluc, Lawrence Berkeley National Laboratory
* Kamesh Madduri, The Pennsylvania State University
* David Mizell, YarcData/Cray Inc.
* Xia Ning, NEC Laboratories
* Siva Rajamanickam, Sandia National Laboratories
* Saeed Salem, North Dakota State University
* Manu Shantharam, San Diego Supercomputer Center
* Fengguang Song, Indiana University-Purdue University
* Guangming Tan, Chinese Academy of Sciences
* Chen Tian, Huawei Technologies USA
* Stanimire Tomov, University of Tennessee Knoxville
* Jeff Vetter, Oak Ridge National Laboratory and Georgia Tech
* Daniel Waddington, Samsung Research America
* Mohammed J. Zaki, Rensselaer Polytechnic Institute

### Contact

Please send email to one of the workshop organizers.