Motivation: Collaborative websites, such as Q&A platforms, are characterized by loose edit control: users can freely edit their questions and answers. To help distinguish or rank answers, many websites offer features such as letting the asker select the best answer and letting users comment on or vote for good answers. However, such manual assessment may not scale with the increasing volume of data, and it tends to be subject to personal bias. The paper therefore proposes an automated (or semi-automated) assessment mechanism that ranks answers based on their quality features.
Methods: They apply a point-wise learning-to-rank (L2R) approach, using random forests, to rank answers in Stack Overflow. The model draws on a large set of features from eight groups: user features, user graph features, review features, structure features, length features, style features, readability features and relevance features. User and user graph features describe users' activity, reputation and influence in the asking-answering graphs. Review features are included on the intuition that content which has received many edits tends to have been improved towards higher quality. The structure, length, style and readability features capture an answer's properties from different perspectives. The relevance features describe how relevant the answer is to a specific question.
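To make the "point-wise" part concrete, here is a minimal sketch of the ranking step only: each answer is scored independently of the others, and the answers to a question are then sorted by score. Everything in it is an assumption for illustration, not the paper's implementation: the dictionary layout, the feature names `user_reputation` and `edit_count`, and the weighted-sum `toy_score`, which merely stands in for the prediction of a trained random forest.

```python
from collections import defaultdict


def rank_answers(answers, score):
    """Point-wise ranking: score every answer on its own, then sort per question."""
    by_question = defaultdict(list)
    for ans in answers:
        by_question[ans["question_id"]].append(ans)
    # Each answer's score depends only on that answer's features,
    # never on the other answers to the same question.
    return {
        qid: sorted(group, key=score, reverse=True)
        for qid, group in by_question.items()
    }


def toy_score(ans):
    """Hypothetical stand-in for a trained model's predicted quality score."""
    f = ans["features"]
    return 0.6 * f["user_reputation"] + 0.4 * f["edit_count"]


# Hypothetical, already-normalized feature values.
answers = [
    {"question_id": 1, "answer_id": "a",
     "features": {"user_reputation": 0.2, "edit_count": 0.1}},
    {"question_id": 1, "answer_id": "b",
     "features": {"user_reputation": 0.9, "edit_count": 0.5}},
    {"question_id": 2, "answer_id": "c",
     "features": {"user_reputation": 0.4, "edit_count": 0.0}},
]

ranked = rank_answers(answers, toy_score)
```

In a real setup, `score` would wrap a regressor trained on labeled (question, answer) feature vectors; the sorting step would stay the same.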
Figure 1. NDCG@k for RF, RF-BaseFeatures and other methods. RF is the method proposed in this paper; RF-BaseFeatures uses only features that have already been proposed in prior work.
Experiments: They apply the random-forest L2R model to a Stack Overflow dataset and find that (1) the proposed method outperforms all competing baselines on NDCG@k, and (2) the user and review features are the most significant feature groups.
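For readers unfamiliar with the metric, NDCG@k can be sketched as follows. This assumes the common linear-gain form, DCG@k = sum over ranks i = 1..k of rel_i / log2(i + 1), normalized by the DCG of the ideal ordering; the summary does not say which gain variant or relevance grading the paper uses, so treat both as assumptions.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items (linear gain)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k):
    """NDCG@k for one question.

    relevances: graded relevance of the answers, listed in the order
    the model ranked them (best-ranked first).
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant answers at all: define the score as 0
    return dcg_at_k(relevances, k) / ideal
```

A perfectly ordered list scores 1.0, and the score drops as highly relevant answers are pushed down the ranking. Reported figures are typically the mean of this value over all test questions.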