#### Eighth International Workshop on
## High Performance Big Graph Data
## Management, Analysis, and Mining
**December 16, 2021**
To be held in conjunction with the
[2021 IEEE International Conference on Big Data **(IEEE BigData 2021)**](http://bigdataieee.org/BigData2021/)
### Workshop Description
Modern Big Data increasingly appears in the form of complex graphs and networks. Examples include the physical Internet, the world wide web, online social networks, phone networks, and biological networks. In addition to their massive sizes, these graphs are dynamic, noisy, and sometimes transient. They also conform to all five Vs (Volume, Velocity, Variety, Value and Veracity) that define Big Data. However, many graph-related problems are computationally difficult, and thus big graph data brings unique challenges, as well as numerous opportunities for researchers, to solve various problems that are significant to our communities.
Big graph problems are currently solved using several complementary paradigms. Perhaps the most popular approach is to exploit parallelism, through specialized algorithms for supercomputers, shared-memory multicore and manycore systems, and heterogeneous CPU-GPU systems. However, since real-world graphs are sparse and highly irregular, very few parallel implementations actually deliver high performance. The major challenges to scaling and efficiency include irregular data dependencies, poor locality, and the high synchronization costs of current approaches. In addition to parallelism, researchers are developing approximation algorithms that use sampling for compressing and summarizing graph data. Streaming algorithms are also being considered for scenarios where the rate of updates is too high to process the entire graph in a single pass. Further, out-of-core algorithms are necessary for massive graphs that do not fit in the main memory of a typical system. Researchers can use graph-based solutions for solving problems from many diverse disciplines, including routing and transportation, social networks, bioinformatics, computational science, health care, security and intelligence analysis.
This workshop aims to bring together researchers from different paradigms solving big graph problems under a unified platform for sharing their work and exchanging ideas. We are soliciting novel and original research contributions related to big graph data management, analysis, and mining (algorithms, software systems, applications, best practices, performance). Significant work-in-progress papers are also encouraged. Papers can be from any of the following areas, including but not limited to:
* Graph embeddings and representation learning for graph data
* Graph neural networks
* Deep Learning-based models for learning on graph data
* Extreme-scale computing for large tensor, network, and graph problems
* Parallel algorithms for big graph analysis on HPC systems
* Heterogeneous CPU-GPU solutions to solve big graph problems
* Sampling and summarization of large graphs
* Graph algorithms for large-scale scientific computing problems
* Graph clustering, partitioning, and classification methods
* Scalable graph topology measurement: diameter approximation, eigenvalues, triangle and graphlet counting
* Parallel algorithms for computing graph kernels
* Inference on large graph data
* Graph evolution and dynamic graph models
* Graph streams
* Computational methods for visualization of large-scale graphs
* Graph databases, novel querying and indexing strategies for RDF data
* Novel applications of big graph problems in bioinformatics, health care, security, and social networks
* New software systems and runtime systems for big graph data mining
**Regular paper** submissions must be at most 10 pages long, including all figures, tables, and references. They must be formatted according to the paper submission formatting guidelines provided in the [IEEE BigData 2021 Call for Papers](http://bigdataieee.org/BigData2021/CallPapers.html). Additionally, we encourage **short paper** submissions (at most 6 pages) describing new work in progress.
### Past Workshops
[BigGraphs 2014](workshop2014.html)
[BigGraphs 2015](workshop2015.html)
[BigGraphs 2016](workshop2016.html)
[BigGraphs 2017](workshop2017.html)
[BigGraphs 2018](workshop2018.html)
[BigGraphs 2019](workshop2019.html)
[BigGraphs 2020](workshop2020.html)
### Important Dates
* Oct 1, 2021 (11:59 pm Anywhere on Earth time): [Submission](https://wi-lab.com/cyberchair/2021/bigdata21/index.php) deadline
* Nov 3, 2021: Notification of paper acceptance to authors
* Nov 15, 2021: Camera-ready submissions due
* Dec 16, 2021: Workshop to be held virtually
### Keynote
**[David A. Bader](https://davidbader.net)**

Distinguished Professor and Director of the Institute for Data Science
New Jersey Institute of Technology
**Solving Global Grand Challenges with High Performance Data Analytics**
*Abstract*: Data science aims to solve grand global challenges such as detecting and preventing
disease in human populations; revealing community structure in large social networks; protecting
our elections from cyber-threats; and improving the resilience of the electric power grid. Unlike
traditional applications in computational science and engineering, solving these social problems
at scale often raises new challenges because of the sparsity and lack of locality in the data,
the need for research on scalable algorithms and architectures, and development of frameworks for
solving these real-world problems on high performance computers, and for improved models that capture
the noise and bias inherent in the torrential data streams. In this talk, Bader will discuss the
opportunities and challenges in massive data science for applications in social sciences, physical
sciences, and engineering.
*Bio*: David A. Bader is a Distinguished Professor and founder of the Department of Data Science
and inaugural Director of the Institute for Data Science at New Jersey Institute of Technology.
Prior to this, he served as founding Professor and Chair of the School of Computational Science and
Engineering, College of Computing, at Georgia Institute of Technology. Dr. Bader is a Fellow of the
IEEE, AAAS, and SIAM, a recipient of the IEEE Sidney Fernbach Award, and advises the White House,
most recently on the National Strategic Computing Initiative (NSCI) and Future Advanced Computing
Ecosystem (FACE). Bader is a leading expert in solving global grand challenges in science, engineering,
computing, and data science. His interests are at the intersection of high-performance computing and
real-world applications, including cybersecurity, massive-scale analytics, and computational genomics,
and he has co-authored over 300 scholarly papers and has best paper awards from ISC, IEEE HPEC, and
IEEE/ACM SC. Dr. Bader has served as a lead scientist in several DARPA programs including High
Productivity Computing Systems (HPCS) with IBM, Ubiquitous High Performance Computing (UHPC) with
NVIDIA, Anomaly Detection at Multiple Scales (ADAMS), Power Efficiency Revolution For Embedded Computing
Technologies (PERFECT), Hierarchical Identify Verify Exploit (HIVE), and Software-Defined Hardware (SDH).
Recently, Bader received an NVIDIA AI Lab (NVAIL) award, and a Facebook Research AI Hardware/Software
Co-Design award. Dr. Bader is Editor-in-Chief of the ACM Transactions on Parallel Computing, and General
Co-Chair of IPDPS 2021, and previously served as Editor-in-Chief of the IEEE Transactions on Parallel and
Distributed Systems. He serves on the leadership team of Northeast Big Data Innovation Hub as the inaugural
chair of the Seed Fund Steering Committee. In 2021, ROI-NJ recognized Bader on its inaugural list of technology
influencers, and in 2012, Bader was the inaugural recipient of the University of Maryland’s Electrical and Computer
Engineering Distinguished Alumni Award. In 2014, Bader received the Outstanding Senior Faculty Research Award
from Georgia Tech. Bader has also served as Director of the Sony-Toshiba-IBM Center of Competence for the Cell
Broadband Engine Processor and Director of an NVIDIA GPU Center of Excellence. In 1998, Bader built the first
Linux supercomputer that led to a high-performance computing (HPC) revolution. He is a cofounder of the Graph500
List for benchmarking “Big Data” computing platforms. He is recognized as a “RockStar” of High Performance Computing
by InsideHPC and as HPCwire’s People to Watch in 2012 and 2014.
### Workshop Program
December 16, 2021
9:15 am -- 12:25 pm U.S. Eastern Standard Time
Location: IEEE BigData 2021 virtual platform.
20-minute presentations (a 15-minute talk video will be broadcast, followed by 5 minutes of live Q&A)
9:15 am Opening remarks
Workshop organizers
9:20 am Building Graphs at a Large Scale: Union Find Shuffle
Sai Gopal Thota, Mridul Jain, Sai Kiran Reddy Malikireddy, Pruthvi Raj Eranti, Albin Kuruvilla, and Nishad Kamat
9:45 am Event-based Product Carousel Recommendation with Query-Click Graph
Luyi Ma, Nimesh Sinha, Parth Vajge, Jason H.D. Cho, Sushant Kumar, and Kannan Achan
10:00 am Coffee break
10:15 am Solving Global Grand Challenges with High Performance Data Analytics (keynote)
David A. Bader
11:00 am A Pre-training Oracle for Predicting Distances in Social Networks
Gunjan Mahindre, Rasika Karkare, Randy Paffenroth, and Anura Jayasumana
11:20 am HPCGCN: A Predictive Framework on High Performance Computing Cluster Log Data Using Graph Convolutional Networks
Avishek Bose, Huichen Yang, William Hsu, and Daniel Andresen
11:40 am Correlation and pattern detection in event networks
Valerio Bellandi, Paolo Ceravolo, Samira Maghool, Margherita Pindaro, and Stefano Siccardi
12:00 pm Drug Abuse Detection in Twitter-sphere: Graph-Based Approach
Khaled Mohammed Saifuddin, Muhammad Ifte khairul Islam, and Esra Akbas
12:25 pm Closing remarks
Workshop organizers
### Workshop Organizers
[Nesreen Ahmed](http://nesreenahmed.com/)

Intel Labs
Santa Clara, CA 95054
[Mohammad Al Hasan](http://cs.iupui.edu/~alhasan/)

Department of Computer and Information Science
Indiana University - Purdue University
Indianapolis, IN 46202
[Shaikh Arifuzzaman](https://www.cs.uno.edu/~arif/)

Department of Computer Science
The University of New Orleans
New Orleans, LA 70148
[Kamesh Madduri](https://madduri.org/)

Department of Computer Science and Engineering
The Pennsylvania State University
University Park, PA 16802
### Contact
Please send email to one of the workshop organizers.