Relation-Aware Graph Learning with Mixture-of-Experts Prediction for Cognitive Diagnosis

Jingwei Qu1     Mingze Zhang1     Pingshun Zhang1     Li Tao1     Ying Wang1     Zhaofang Yang1     Haibin Ling2
1Southwest University     2Westlake University
[PDF]     [Code]


Abstract

Cognitive diagnosis aims to infer students’ concept-level mastery from their exercise response logs and exercise-concept associations. Fully leveraging holistic heterogeneous relations and modeling the substantial variations in student mastery and exercise difficulty remain challenging, especially when prediction relies on a single predictor. To address these challenges, we propose RMCD, a unified cognitive diagnosis model that integrates relation-aware graph learning with Mixture-of-Experts (MoE) prediction. RMCD constructs a heterogeneous relational graph over students, exercises, and concepts with multiple relation types, and employs a relation-aware graph encoder that learns node and edge representations simultaneously. The encoder further derives relation-strength vectors from student-concept and exercise-concept edges to differentiate relation effects and refine node representations, enabling effective relation learning. On top of the learned representations, RMCD introduces an MoE-based prediction head that adaptively combines multiple expert predictors conditioned on the three-entity representations, thereby capturing diverse mastery-difficulty discrepancies and alleviating the limitation of a unified predictor. Extensive experiments on benchmark datasets demonstrate that RMCD consistently outperforms state-of-the-art cognitive diagnosis methods.
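To make the MoE prediction idea concrete, the following is a minimal sketch of a gated mixture of expert predictors conditioned on student, exercise, and concept representations. It is not the RMCD implementation: the class name `MoEHead`, the linear experts, the softmax gate over the concatenated representations, and the random initialization are all assumptions chosen for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


class MoEHead:
    """Hypothetical MoE head: linear experts mixed by a softmax gate.

    Assumption: both the gate and the experts read the concatenation
    of the student, exercise, and concept vectors.
    """

    def __init__(self, dim, n_experts, seed=0):
        rng = random.Random(seed)
        in_dim = 3 * dim
        # One weight vector per expert for the gate and for the expert itself.
        self.gate_w = [[rng.gauss(0, 0.1) for _ in range(in_dim)]
                       for _ in range(n_experts)]
        self.expert_w = [[rng.gauss(0, 0.1) for _ in range(in_dim)]
                         for _ in range(n_experts)]

    def predict(self, student, exercise, concept):
        x = list(student) + list(exercise) + list(concept)
        # Gate weights are non-negative and sum to 1.
        gate = softmax([dot(w, x) for w in self.gate_w])
        # Each expert outputs its own probability of a correct response.
        expert_probs = [sigmoid(dot(w, x)) for w in self.expert_w]
        # Final prediction is the gate-weighted average of expert outputs.
        prob = sum(g * p for g, p in zip(gate, expert_probs))
        return prob, gate


head = MoEHead(dim=4, n_experts=3)
prob, gate = head.predict([0.1] * 4, [0.2] * 4, [0.3] * 4)
```

Because the gate weights form a convex combination, the mixed output stays in (0, 1) whenever each expert's output does, so it can be read directly as a predicted probability of a correct response.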


Conceptual illustration of RMCD


Architecture of RMCD


Quantitative Results

Comparison of cognitive diagnosis performance on the ASSIST17, ASSIST09, and Junyi datasets. All metrics are reported in %; lower RMSE and higher ACC/AUC are better. Numbers in bold indicate the best performance.
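For readers unfamiliar with the three metrics, the sketch below shows one common way to compute them from binary response labels and predicted probabilities. The 0.5 decision threshold for ACC and the pairwise formulation of AUC are assumptions for illustration; the exact evaluation protocol behind the reported numbers is not stated here.

```python
import math


def rmse(y_true, y_prob):
    """Root mean squared error between labels and predicted probabilities."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)
    )


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of responses classified correctly at the given threshold."""
    hits = sum((p >= threshold) == bool(t) for t, p in zip(y_true, y_prob))
    return hits / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties count as half. Quadratic in the number of samples, so it is
    only suitable for small illustrative inputs.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, not taken from any of the benchmark datasets.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]
scores = (rmse(y_true, y_prob), accuracy(y_true, y_prob), auc(y_true, y_prob))
```

Multiplying each value by 100 gives the percentage form used in the table.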


Ablation Study

Ablation study of the MoE head and the regularizer \(\mathcal{L}_r\).


Ablation study of the relation-aware graph encoder depth on ASSIST17 (w/o MoE).


Ablation study of the sub-layer roles on ASSIST17.


Ablation study of the sub-layer order on ASSIST17.


Ablation study of the gating input on ASSIST17.


Ablation study of the key hyperparameters \(n_l\), \(\lambda\), and \(n_e\).


Efficiency comparison between baselines and RMCD with different numbers of experts on ASSIST09.


Reference

@inproceedings{qu2026relation,
        title={Relation-Aware Graph Learning with Mixture-of-Experts Prediction for Cognitive Diagnosis},
        author={Qu, Jingwei and Zhang, Mingze and Zhang, Pingshun and Tao, Li and Wang, Ying and Yang, Zhaofang and Ling, Haibin},
        booktitle={Proceedings of the International Joint Conference on Artificial Intelligence},
        year={2026}
}

Contact: If you have any questions, please contact Jingwei Qu by email at qujingwei@swu.edu.cn.