My research interests span theoretical machine learning, statistical inference, optimization, and game theory. In my spare time, I like to read, run, and watch Cricket 🏏.
Submitted to IEEE Transactions on Signal and Information Processing over Networks
Due to the limited resources and the scale of the graphs in modern datasets, we often get to observe a sampled subgraph of a larger original graph of interest, whether it is the worldwide web that has been crawled or social connections that have been surveyed. Inferring a global property of the original graph from such a sampled subgraph is of a fundamental interest. In this work, we focus on estimating the number of connected components. It is a challenging problem and, for general graphs, little is known about the connection between the observed subgraph and the number of connected components of the original graph. In order to make this connection, we propose a highly redundant and large-dimensional representation of the subgraph, which at first glance seems counter-intuitive: subgraphs are represented by the counts of network motif patterns. This representation is crucial in introducing a novel estimator for the number of connected components for general graphs, under the knowledge of the spectral gap of the original graph. The connection is made precise via the Schatten \(k\)-norms of the graph Laplacian and the spectral representation of the number of connected components. We provide a guarantee on the resulting mean squared error that characterizes the bias-variance tradeoff. Experiments on synthetic and real-world graphs suggest that we improve upon competing algorithms for graphs with spectral gaps bounded away from zero.
Proceedings of the World Wide Web Conference (WWW), 2019
This paper proposes an attributed network growth model. Despite the knowledge that individuals use limited resources to form connections to similar others, we lack an understanding of how local and resource-constrained mechanisms explain the emergence of rich structural properties found in real-world networks. We make three contributions. First, we propose a parsimonious and accurate model of attributed network growth that jointly explains the emergence of in-degree distributions, local clustering, clustering-degree relationship and attribute mixing patterns. Second, our model is based on biased random walks and uses local processes to form edges without recourse to global network information. Third, we account for multiple sociological phenomena: bounded rationality, structural constraints, triadic closure, attribute homophily, and preferential attachment. Our experiments indicate that the proposed Attributed Random Walk (ARW) model accurately preserves network structure and attribute mixing patterns of six real-world networks; it improves upon the performance of eight state-of-the-art models by a statistically significant margin of 2.5-10x.