Relation-Aware Graph Learning with Mixture-of-Experts Prediction for Cognitive Diagnosis

Jingwei Qu¹     Mingze Zhang¹     Pingshun Zhang¹     Li Tao¹     Ying Wang¹     Zhaofang Yang¹     Haibin Ling²
¹Southwest University     ²Westlake University
[PDF]     [Code]


Abstract

Cognitive diagnosis aims to infer students’ concept-level mastery from their exercise response logs and exercise-concept associations. Fully leveraging the heterogeneous relations among students, exercises, and concepts, and modeling the substantial variations in student mastery and exercise difficulty, remain challenging, especially when prediction relies on a single predictor. To address these challenges, we propose RMCD, a unified cognitive diagnosis model that integrates relation-aware graph learning with Mixture-of-Experts (MoE) prediction. RMCD constructs a heterogeneous relational graph over students, exercises, and concepts with multiple relation types, and employs a relation-aware graph encoder that learns node and edge representations simultaneously. The encoder further derives relation-strength vectors from student-concept and exercise-concept edges to differentiate relation effects and refine node representations, enabling effective relation learning. On top of the learned representations, RMCD introduces an MoE-based prediction head that adaptively combines multiple expert predictors conditioned on the representations of all three entity types, thereby capturing diverse mastery-difficulty discrepancies and alleviating the limitations of a single, shared predictor. Extensive experiments on benchmark datasets demonstrate that RMCD consistently outperforms state-of-the-art cognitive diagnosis methods.


Conceptual illustration of RMCD


Architecture of RMCD


Quantitative Results

Comparison of cognitive diagnosis performance on the ASSIST17, ASSIST09, and Junyi datasets. All metrics are reported in %; lower RMSE and higher ACC/AUC are better. Numbers in bold indicate the best performance.


Ablation Study

Ablation study of the MoE head and the regularizer \(\mathcal{L}_r\).


Ablation study of the relation-aware graph encoder depth on ASSIST17 (w/o MoE).


Ablation study of the sub-layer roles on ASSIST17.


Ablation study of the sub-layer order on ASSIST17.


Ablation study of the gating input on ASSIST17.


Ablation study of the key hyperparameters \(n_l\), \(\lambda\), and \(n_e\).


Efficiency comparison between baselines and RMCD with different numbers of experts on ASSIST09.


Reference

@inproceedings{qu2026relation,
  title={Relation-Aware Graph Learning with Mixture-of-Experts Prediction for Cognitive Diagnosis},
  author={Qu, Jingwei and Zhang, Mingze and Zhang, Pingshun and Tao, Li and Wang, Ying and Yang, Zhaofang and Ling, Haibin},
  booktitle={Proceedings of the International Joint Conference on Artificial Intelligence},
  year={2026}
}

Contact: If you have any questions, please contact Jingwei Qu at qujingwei@swu.edu.cn.