TY - JOUR
T1 - Link prediction in sparse networks by incidence matrix factorization
AU - Yokoi, Sho
AU - Kajino, Hiroshi
AU - Kashima, Hisashi
N1 - Funding Information:
This work was supported by JSPS KAKENHI Grant Number 15H01704.
Publisher Copyright:
© 2017 Information Processing Society of Japan.
PY - 2017/7
Y1 - 2017/7
AB - Link prediction plays an important role in multiple areas of artificial intelligence, including social network analysis and bioinformatics; however, it is often negatively affected by the data sparsity problem. In this paper, we present and validate our hypothesis that, for sparse networks, incidence matrix factorization (IMF) could perform better than adjacency matrix factorization (AMF), which has been used in many previous studies. A key observation supporting our hypothesis is that IMF models a partially observed graph more accurately than AMF. A technical challenge in validating our hypothesis is that, unlike with the AMF approach, there is no obvious method for making link predictions from a factorized incidence matrix. To address this, we developed an optimization-based link prediction method. We then conducted thorough experiments using both synthetic and real-world datasets to investigate the relationship between the sparsity of a network and the predictive performance of the two factorization approaches. Our experimental results show that IMF performed better than AMF as networks became sparser, which validates our hypothesis.
KW - Data sparsity problem
KW - Incidence matrix
KW - Link prediction
KW - Matrix factorization
UR - http://www.scopus.com/inward/record.url?scp=85040938528&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85040938528&partnerID=8YFLogxK
U2 - 10.2197/ipsjjip.25.477
DO - 10.2197/ipsjjip.25.477
M3 - Article
AN - SCOPUS:85040938528
SN - 0387-5806
VL - 25
SP - 477
EP - 485
JO - Journal of Information Processing
JF - Journal of Information Processing
ER -