Publications – Data Lab

@inproceedings{shengmin2022interpretable,

title = {Interpretable Network Representations},

author = {Shengmin Jin and Danai Koutra and Reza Zafarani},

url = {https://shengminjin.github.io/tutorials/www2022},

year  = {2022},

date = {2022-01-01},

booktitle = {Companion Proceedings of The Web Conference 2022 (WWW)},

abstract = {Networks (or, interchangeably, graphs) are ubiquitous across the globe and within science and engineering: social networks, collaboration networks, protein-protein interaction networks, infrastructure networks, among many others. Machine learning on graphs, especially network representation learning, has shown remarkable performance in tasks related to graphs, such as node/graph classification, graph clustering, and link prediction. These tasks are closely related to Web applications, especially social network analysis and recommendation systems. For example, node classification and graph clustering are widely used for studies on community detection, and link prediction plays a vital role in friend or item recommendation. Beyond performance, it is equally crucial for individuals to understand the behavior of machine learning models and to be able to explain how these models arrive at a certain decision. Such needs have motivated many studies on interpretability in machine learning. Specifically, for social network analysis, we may need to know the reasons why certain users (or groups) are classified or clustered together by the machine learning models, or why a friend recommendation system considers some users similar so that they are recommended to connect with each other. Under such circumstances, an interpretable network representation is necessary, and it should carry the graph information to a level understandable by humans. In this tutorial, we will (1) define interpretability and go over its definitions within different contexts in studies of networks; (2) review and summarize various interpretable network representations; (3) discuss connections to network embedding, graph summarization, and network visualization methods; (4) discuss explainability in Graph Neural Networks, as such techniques are often perceived to have limited interpretability; and (5) highlight the open research problems and future research directions. 
The tutorial is designed for researchers, graduate students, and practitioners in areas such as graph mining, machine learning on graphs, and machine learning interpretability. Few prerequisites are required for The Web Conference participants to attend.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Tian, Hao; Jin, Shengmin; Zafarani, Reza

Exploiting Cross-Order Patterns and Link Prediction in Higher-Order Networks Proceedings Article

In: Proceedings of the IEEE International Conference on Data Mining Workshops (ICDMW), 2022.

Abdolazimi, Reyhaneh; Zafarani, Reza

Noise Enhancement: Techniques and Applications Proceedings Article

In: Proceedings of the 2022 SIAM International Conference on Data Mining (SDM), 2022.

Abdolazimi, Reyhaneh; Zafarani, Reza

Noise-Enhanced Unsupervised Link Prediction Proceedings Article

In: Proceedings of the 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2021.

Li, Jiayu; Zhang, Tianyun; Tian, Hao; Jin, Shengmin; Fardad, Makan; Zafarani, Reza

Graph Sparsification with Graph Convolutional Networks Journal Article

In: International Journal of Data Science and Analytics, 2021.

Yang, Chen; Zhou, Xinyi; Zafarani, Reza

CHECKED: Chinese COVID-19 Fake News Dataset Journal Article

In: Social Network Analysis and Mining, 2021.

Zhou, Xinyi; Zafarani, Reza

A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities Journal Article

In: ACM Computing Surveys, vol. 53, no. 5, 2020.

Jin, Shengmin; Zafarani, Reza

The Spectral Zoo of Networks: Embedding and Visualizing Networks with Spectral Moments Proceedings Article

In: Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020.

Tian, Hao; Zafarani, Reza

Exploiting Common Neighbor Graph for Link Prediction Proceedings Article

In: Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM), 2020.

Zhou, Xinyi; Mulay, Apurva; Ferrara, Emilio; Zafarani, Reza

ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research Proceedings Article

In: Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM), 2020.

Abdolazimi, Reyhaneh; Jin, Shengmin; Zafarani, Reza

Noise-Enhanced Community Detection Proceedings Article

In: Proceedings of the 31st ACM Conference on Hypertext and Social Media (HT), 2020.

Ma, Rui; Jin, Shengmin; Eftekharnejad, Sara; Zafarani, Reza; Philippe, Wolf Peter Jean

A Probabilistic Cascading Failure Model for Dynamic Operating Conditions Journal Article

In: IEEE Access, 2020.

Li, Jiayu; Zhang, Tianyun; Tian, Hao; Jin, Shengmin; Fardad, Makan; Zafarani, Reza

SGCN: A Graph Sparsifier based on Graph Convolutional Networks Proceedings Article

In: Proceedings of the 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2020.

Zhou, Xinyi; Wu, Jindi; Zafarani, Reza

SAFE: Similarity-Aware Multi-Modal Fake News Detection Proceedings Article

In: Proceedings of the 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2020.

Zhou, Xinyi; Jain, Atishay; Phoha, Vir V.; Zafarani, Reza

Fake News Early Detection: A Theory-driven Model Journal Article

In: ACM Transactions on Digital Threats: Research and Practice, 2020.

Sitaula, Niraj; Mohan, Chilukuri K.; Grygiel, Jennifer; Zhou, Xinyi; Zafarani, Reza

Credibility-based Fake News Detection Book Section

In: Shu, Kai; Wang, Suhang; Lee, Dongwon; Liu, Huan (Ed.): Disinformation, Misinformation, and Fake News in Social Media, Springer, 2020.

Jin, Shengmin; Wituszynski, Richard; Caiello-Gingold, Max; Zafarani, Reza

WebShapes: Network Visualization with 3D Shapes Proceedings Article

In: Proceedings of the 13th ACM International Conference on Web Search and Data Mining (WSDM), 2020.

Zhou, Xinyi; Jin, Shengmin; Zafarani, Reza

Sentiment Paradoxes in Social Networks: Why Your Friends are More Positive Than You? Proceedings Article

In: Proceedings of the 14th International AAAI Conference on Web and Social Media (ICWSM), 2020.

Zhou, Xinyi; Zafarani, Reza

Network-based Fake News Detection: A Pattern-driven Approach Journal Article

In: ACM SIGKDD Explorations Newsletter, 2019.

Jin, Shengmin; Phoha, Vir V.; Zafarani, Reza

Network Identification and Authentication Proceedings Article

In: Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM), 2019.

@inproceedings{shengmin2019network,

title = {Network Identification and Authentication},

author = {Shengmin Jin and Vir V. Phoha and Reza Zafarani},

year  = {2019},

date = {2019-01-01},

booktitle = {Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM)},

abstract = {Research on networks is commonly performed using anonymized network data for various reasons such as protecting data privacy. Under such circumstances, it is difficult to verify the source of network data, which leads to questions such as: Given an anonymized graph, can we identify the network from which it is collected? Or if one claims the graph is sampled from a certain network, can we verify it? The intuitive approach is to check for subgraph isomorphism. However, subgraph isomorphism is NP-complete; hence, infeasible for most large networks. Inspired by biometrics studies, we address these challenges by formulating two new problems: network identification and network authentication. To tackle these problems, similar to research on human fingerprints, we introduce two versions of a network identity: (1) embedding-based identity and (2) distribution-based identity. We demonstrate the effectiveness of these network identities on various real-world networks. Using these identities, we propose two approaches for network identification. One method uses supervised learning and can achieve an identification accuracy rate of 94.7%, and the other, which is easier to implement, relies on distances between identities and achieves an accuracy rate of 85.5%. For network authentication, we propose two methods to build a network authentication system. The first is a supervised learner and provides a low false accept rate and the other method allows one to control the false reject rate with a reasonable false accept rate across networks. Our study can help identify or verify the source of network data, validate network-based research, and be used for network-based biometrics.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Hozhabrierdi, Pegah; Zafarani, Reza

The Impact of Graph Structure on Small-World Shortest Paths Proceedings Article

In: Proceedings of the 2019 International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS), 2019.

Zafarani, Reza; Zhou, Xinyi; Shu, Kai; Liu, Huan

Fake News Research: Theories, Detection Strategies, and Open Problems Proceedings Article

In: Proceedings of the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019.

Shu, Kai; Zhou, Xinyi; Wang, Suhang; Zafarani, Reza; Liu, Huan

The Role of User Profiles for Fake News Detection Proceedings Article

In: Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2019.

Zhou, Xinyi; Zafarani, Reza

Fake News Detection: An Interdisciplinary Research Proceedings Article

In: Companion Proceedings of The Web Conference 2019 (WWW), 2019.

Zhou, Xinyi; Zafarani, Reza; Shu, Kai; Liu, Huan

Fake News: Fundamental Theories, Detection Strategies and Challenges Proceedings Article

In: Proceedings of the 12th ACM International Conference on Web Search and Data Mining (WSDM), 2019.

Jin, Shengmin; Zafarani, Reza

Representing Networks with 3D Shapes Proceedings Article

In: Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM), 2018, (Code: https://github.com/shengminjin/KroneckerHull).

@inproceedings{shengmin2018representing,

title = {Representing Networks with 3D Shapes},

author = {Shengmin Jin and Reza Zafarani},

url = {https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8594842},

year  = {2018},

date = {2018-01-01},

booktitle = {Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM)},

abstract = {There has been a surge of interest in machine learning in graphs, as graphs and networks are ubiquitous across the globe and within science and engineering: road networks, power grids, protein-protein interaction networks, scientific collaboration networks, social networks, to name a few. Recent machine learning research has focused on efficient and effective ways to represent graph structure. Existing graph representation methods such as network embedding techniques learn to map a node (or a graph) to a vector in a low-dimensional vector space. However, the mapped values are often difficult to interpret, lacking information on the structure of the network or its subgraphs. Instead of using a low-dimensional vector to represent a graph, we propose to represent a network with a 3-dimensional shape: the network shape. We introduce the first network shape, a Kronecker hull, which represents a network as a 3D convex polyhedron using stochastic Kronecker graphs. We present a linear time algorithm to build Kronecker hulls. Network shapes provide a compact representation of networks that is easy to visualize and interpret. They capture various properties of not only the network, but also its subgraphs. For instance, they can provide the distribution of subgraphs within a network, e.g., what proportion of subgraphs are structurally similar to the whole network? Using experiments on real-world networks, we show how network shapes can be used in various applications, from computing similarity between two graphs (using the overlap between network shapes of two networks) to graph compression, where a graph with millions of nodes can be represented with a convex hull with fewer than 40 boundary points.},

note = {Code: https://github.com/shengminjin/KroneckerHull},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Jin, Shengmin; Zafarani, Reza

Sentiment Prediction in Social Networks Proceedings Article

In: Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM), 2018, (Dataset: http://data.syr.edu/get/EmotionPatterns/).

@inproceedings{shengmin2018sentiment,

title = {Sentiment Prediction in Social Networks},

author = {Shengmin Jin and Reza Zafarani},

url = {https://dl.acm.org/citation.cfm?id=3132932},

year  = {2018},

date = {2018-01-01},

booktitle = {Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM)},

abstract = {Sentiment analysis research has focused on using text for predicting sentiments without considering the unavoidable peer influence on user emotions and opinions. The lack of large-scale ground-truth data on sentiments of users in social networks has limited research on how predictable sentiments are from social ties. In this paper, using a large-scale dataset on human sentiments, we study sentiment prediction within social networks. We demonstrate that sentiments are predictable using structural properties of social networks alone. With social science and psychology literature, we provide evidence on sentiments being connected to social relationships at four different network levels, starting from the ego-network level and moving up to the whole-network level. We discuss emotional signals that can be captured at each level of social relationships and investigate the importance of structural features at each network level. We demonstrate that sentiment prediction that solely relies on social network structure can be as accurate as (or more accurate than) text-based techniques. For the situations where complete posts and friendship information are difficult to get, we analyze the trade-off between the sentiment prediction performance and the available information. When computational resources are limited, we show that using only four network properties, one can predict sentiments with competitive accuracy. Our findings can be used to (1) validate the peer influence on user sentiments, (2) improve classical text-based sentiment prediction methods, (3) enhance friend recommendation by utilizing sentiments, and (4) help identify personality traits.},

note = {Dataset: http://data.syr.edu/get/EmotionPatterns/},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Shu, Kai; Wang, Suhang; Tang, Jiliang; Zafarani, Reza; Liu, Huan

User Identity Linkage across Online Social Networks: A Review Journal Article

In: ACM SIGKDD Explorations Newsletter, 2017.

Jin, Shengmin; Zafarani, Reza

Emotions in Social Networks: Distributions, Patterns, and Models Proceedings Article

In: Proceedings of the 2017 ACM Conference on Information and Knowledge Management (CIKM), 2017, (Dataset: http://data.syr.edu/get/EmotionPatterns/).

@inproceedings{shengmin2017emotions,

title = {Emotions in Social Networks: Distributions, Patterns, and Models},

author = {Shengmin Jin and Reza Zafarani},

url = {https://ieeexplore.ieee.org/document/8637419},

year = {2017},

date = {2017-01-01},

booktitle = {Proceedings of the 2017 ACM Conference on Information and Knowledge Management (CIKM)},

abstract = {Understanding the role emotions play in social interactions has been a central research question in the social sciences. However, the challenge of obtaining large-scale data on human emotions has left the most fundamental questions on emotions less explored: How do emotions vary across individuals, evolve over time, and connect to social ties? We address these questions using a large-scale dataset of users that contains both their emotions and social ties. Using this dataset, we identify patterns of human emotions on five different network levels, starting from the user-level and moving up to the whole-network level. At the user-level, we identify how human emotions are distributed and vary over time. At the ego-network level, we find that assortativity is only observed with respect to positive moods. This observation allows us to introduce emotional balance, the ``dual'' of structural balance theory. We show that emotional balance has a natural connection to structural balance theory. At the community-level, we find that community members are emotionally-similar and that this similarity is stronger in smaller communities. Structural properties of communities, such as their sparseness or isolatedness, are also connected to the emotions of their members. At the whole-network level, we show that there is a tight connection between the global structure of a network and the emotions of its members. As a result, we demonstrate how one can accurately predict the proportion of positive/negative users within a network by only looking at the network structure. Based on our observations, we propose the Emotional-Tie model – a network model that can simulate the formation of friendships based on emotions. This model generates graphs that exhibit both patterns of human emotions identified in this work and those observed in real-world social networks, such as having a high clustering coefficient. 
Our findings can help better understand the interplay between emotions and social ties.},

note = {Dataset: http://data.syr.edu/get/EmotionPatterns/},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Zafarani, Reza; Liu, Huan

Users Joining Multiple Sites: Friendship and Popularity Variations across Sites Journal Article

In: Information Fusion, 2016.

Liu, Huan; Morstatter, Fred; Tang, Jiliang; Zafarani, Reza

The good, the bad, and the ugly: uncovering novel research opportunities in social media mining Journal Article

In: International Journal of Data Science and Analytics, vol. 1, no. 3-4, pp. 137–143, 2016.
