Link prediction plays an important role in multiple areas of artificial intelligence, including social network analysis and bioinformatics; however, it is often negatively affected by the data sparsity problem. In this paper, we present and validate the hypothesis that, for sparse networks, incidence matrix factorization (IMF) can perform better than adjacency matrix factorization (AMF), which has been used in many previous studies. A key observation supporting this hypothesis is that IMF models a partially observed graph more accurately than AMF. A technical challenge in validating the hypothesis is that, unlike with AMF, there is no obvious method for predicting links from a factorized incidence matrix. To address this, we developed an optimization-based link prediction method. We then conducted thorough experiments using both synthetic and real-world datasets to investigate the relationship between the sparsity of a network and the predictive performance of the two factorization approaches. Our experimental results show that IMF performed better than AMF as networks became sparser, which validates our hypothesis.
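To make the distinction between the two representations concrete, the following minimal sketch builds an adjacency matrix and an unoriented incidence matrix for a small toy graph and factorizes the adjacency matrix with a truncated SVD. The toy graph, the helper names (`adjacency_matrix`, `incidence_matrix`, `low_rank_factors`), and the use of SVD as the factorization are illustrative assumptions, not the method of the paper. In particular, the optimization-based link prediction step for IMF is not specified in this abstract and is therefore not reproduced here.

```python
import numpy as np

def adjacency_matrix(n_nodes, edges):
    """Symmetric node-by-node matrix: entry (u, v) is 1 if an edge u-v is observed."""
    A = np.zeros((n_nodes, n_nodes))
    for u, v in edges:
        A[u, v] = 1.0
        A[v, u] = 1.0
    return A

def incidence_matrix(n_nodes, edges):
    """Node-by-edge matrix: column j has a 1 in the rows of edge j's two endpoints."""
    B = np.zeros((n_nodes, len(edges)))
    for j, (u, v) in enumerate(edges):
        B[u, j] = 1.0
        B[v, j] = 1.0
    return B

def low_rank_factors(M, rank):
    """Return W, H with W @ H equal to the best rank-`rank` approximation of M.

    Truncated SVD is used purely for illustration; the paper may use a
    different factorization objective.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    root = np.sqrt(s[:rank])
    W = U[:, :rank] * root            # scale columns of U
    H = root[:, None] * Vt[:rank, :]  # scale rows of V^T
    return W, H

# Hypothetical toy graph with 5 nodes and 6 observed edges.
n_nodes = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2), (1, 3)]

A = adjacency_matrix(n_nodes, edges)  # shape (5, 5)
B = incidence_matrix(n_nodes, edges)  # shape (5, 6)

# AMF-style scoring: a reconstructed entry serves directly as the score
# of a candidate link, e.g. the unobserved pair (0, 4).
W_a, H_a = low_rank_factors(A, rank=2)
amf_scores = W_a @ H_a
print("AMF score for candidate link (0, 4):", amf_scores[0, 4])
```

With AMF, the reconstructed matrix has one entry per node pair, so a candidate link can be scored by reading off its entry. A factorized incidence matrix has one column per observed edge and no column for an unobserved pair, which is why a separate prediction procedure is needed in the IMF setting.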
ASJC Scopus subject areas
- Computer Science(all)