Motivation: Link prediction or recommendation is very popular in today's online services, such as recommending potential friends in Facebook, suggesting qualified employers in LinkedIn and recommending research colleagues in ResearchGate. Link recommendation is actually a personalized ranking problem - producing a list of ranked people to target user. Signed networks, e.g. Epinions or Slashdot, allows people to indicate whether they trust (positive) or distrust (negative) each other, so that the plausible recommendation result would be a list with positive relationships on the top and negative links at the bottom. Traditional evaluation measures - AUC - is no longer suitable for the case where there exist three types of relationships: positive, negative and unknown. A recently proposed generalized AUC (GAUC), although cares about both the head and tail of rank list, does not require that the head links are all positive or the tail links are all negative. Therefore, this paper starts from GAUC, deriving two lower bounds for GAUC which can be computed in linear time, and further develop two probabilistic models to infer personalized list of latent links.
Approach: In AUC, both negative and latent links are considered as negative. AUC compares each pair of positive and negative links, and calculates the fraction of pairs within which the positive one has higher rank than the negative one. Therefore, the perfect rank list (positive on top, negative at bottom) has AUC = 1, where the worst rank list (negative on top, positive at bottom) has AUC = 0. There are two major disadvantages in AUC: (i) it is designed for binary case, while not suitable for cases of three classes; (ii) the computation is slow as it requires to compare each pair of links. This paper presents two lower bounds - bound I and bound II. Bound I compares each positive link with that link from latent and negative classes holding the maximum ranking score, and meantime compare each negative link with the link from positive and latent classes holding the minimum ranking score. Bound II is a stricter bound. It compares the minimum ranking score in positive class with the maximum score from negative and latent classes, meanwhile compares the maximum ranking score in negative class with the minimum one from positive and latent classes. Figure 1 shows the values of AUC, GAUC, GAUC-bound-I and GAUC-bound-II for 4 ranking list.
With the goal of optimizing the above two bounds, this paper further proposes two probabilistic models for link recommendation - efficient latent link recommendation-I (ELLR-I) and efficient latent link recommendation-II (ELLR-II). Given a partially observed matrix X, they want to learn two low-rank matrices U and V such that the ranking lists can be optimized in the sense of lower bounds for GAUC. They employ Bayesian model aiming to maximize the following posterior distribution:
where > f denotes the orderings obtained by X.
Experiments: They compare their algorithms ELLR-I and ELLR-II with the state-of-the-art baselines, including network topology-based method common neighbor (CN), point-wise approach matrix factorization (MF), pairwise methods maximum margin matrix factorization (MMMF), Bayesian personalized ranking based matrix factorization (BPR+MF), list-wise algorithms List-MF, as well as GAUC-optimization-based method OPT-GAUC. Four signed graphs are used, including Wikipedia, Slashdot, Epinions and MovieLens10M. Results reveal that ELLR-I and ELLR-II outperforms those top-performed competing methods in signed networks.
No comments:
Post a Comment