Submitted:
24 October 2024
Posted:
25 October 2024
Abstract
Keywords:
1. Introduction
- This paper presents an end-to-end knowledge-aware hybrid model for course recommendation. In each hop, the model multiplies head entities with their relation paths, applies a multi-head attention mechanism, and, after the softmax function, multiplies the resulting weights with the tail entities of the current hop. Finally, it concatenates the representations from all hops, which strengthens the learned representations of both learners and courses;
- The model combines a graph neural network with a bi-directional long short-term memory (BiLSTM) network, fusing graph-based and temporal modeling. This combination better captures long-term dependencies in graph-structured data and yields a model better suited to course recommendation;
- We validate the proposed model on the MOOCCube dataset from XuetangX, one of China's largest online course platforms. Experimental results show that our model reaches an AUC of 0.8707, exceeding the strongest state-of-the-art baseline (KGAN, 0.8595).
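The attention step described in the first contribution can be illustrated with a minimal sketch. It is not the authors' implementation: the way the embedding is sliced into heads, the use of the summed head-relation product as the attention score, and the names `hop_representation` and `concat_hops` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hop_representation(heads, relations, tails, num_heads):
    """Attention-weighted sum of tail embeddings for the triples of one hop.

    heads, relations, tails: lists of equal-length embedding vectors,
    one entry per triple. Each attention head works on its own slice of
    the embedding (an assumption of this sketch).
    """
    dim = len(heads[0])
    assert dim % num_heads == 0, "embedding size must divide evenly into heads"
    width = dim // num_heads
    output = []
    for k in range(num_heads):
        lo, hi = k * width, (k + 1) * width
        # Score each triple by the head-relation product on this slice.
        scores = [
            sum(h[j] * r[j] for j in range(lo, hi))
            for h, r in zip(heads, relations)
        ]
        weights = softmax(scores)
        # Multiply the softmax weights with the tail entities of this hop.
        for j in range(lo, hi):
            output.append(sum(w * t[j] for w, t in zip(weights, tails)))
    return output


def concat_hops(hops, num_heads):
    """Concatenate the per-hop representations into one vector."""
    rep = []
    for heads, relations, tails in hops:
        rep.extend(hop_representation(heads, relations, tails, num_heads))
    return rep
```

When all triples in a hop receive the same score, the softmax weights are uniform and the hop representation reduces to the mean of the tail embeddings, which is a convenient sanity check for the sketch.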
2. Related Work
2.1. Course Recommendations
2.2. Recommendation Method Based on Knowledge Graph
2.2.1. Embedded-Based Methods
2.2.2. Path-Based Methods
2.2.3. Graph Neural Network-Based Methods
3. Problem Definition
- Learner-Course Interaction Graph: In a typical course recommendation scenario, we use $U = \{u_1, u_2, \ldots, u_M\}$ to represent the set of $M$ learners and $V = \{v_1, v_2, \ldots, v_N\}$ to represent the set of $N$ courses. The interaction between learners and courses is denoted by the matrix $Y \in \mathbb{R}^{M \times N}$, where learner-course interactions, based on implicit feedback, can be expressed as follows:
  $$y_{uv} = \begin{cases} 1, & \text{if learner } u \text{ has interacted with course } v \\ 0, & \text{otherwise} \end{cases}$$
  When a learner has interacted with a course (e.g., by clicking or browsing), we set $y_{uv} = 1$ to indicate that the learner has a behavioral preference for that course. Conversely, if no interaction has occurred, $y_{uv} = 0$. Note that $y_{uv} = 0$ does not necessarily imply that the learner dislikes the course: given the vast number of available courses, the learner may simply not have encountered it yet.
- Course Knowledge Graph: Knowledge graphs store rich factual information in a graph structure, forming a semantic network that interweaves the relationships among learners, courses, and other entities. These relationships can be formalized as triples (head entity, relation, tail entity). A heterogeneous knowledge graph can thus be represented as $\mathcal{G} = \{(h, r, t) \mid h, t \in \mathcal{E}, r \in \mathcal{R}\}$, where each triple $(h, r, t)$ denotes the existence of a relation $r$ between the head entity $h$ and the tail entity $t$, and $\mathcal{E}$ and $\mathcal{R}$ represent the sets of entities and relations, respectively. For example, the triple (Data Structures, School, Tsinghua University) indicates that Tsinghua University offers a course on data structures. However, different universities may offer courses with the same name, such as (Data Structures, School, South China University of Technology), which makes the entity associated with a course ambiguous. Therefore, to align each course with an entity in the knowledge graph, we adopt the set $\mathcal{A} = \{(v, e) \mid v \in V, e \in \mathcal{E}\}$, where $(v, e)$ indicates that course $v$ corresponds to entity $e$. This keeps same-named courses from different schools distinct.
- Recommendation Task: Given the learner-course interaction matrix $Y$ and the course knowledge graph $\mathcal{G}$, the goal of the recommendation task is to predict the probability that a learner will next click on a course that they have not yet interacted with. Specifically, we aim to learn a prediction function $\hat{y}_{uv} = \mathcal{F}(u, v \mid \Theta, Y, \mathcal{G})$, where $\hat{y}_{uv}$ denotes the predicted probability and $\Theta$ represents the model parameters of the function $\mathcal{F}$.
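The three definitions above map onto simple data structures. The following sketch is only illustrative: the helper names, the integer learner and course indices, and the entity identifiers such as `data_structures@tsinghua` are hypothetical and do not come from MOOCCube.

```python
def build_interaction_matrix(num_learners, num_courses, interactions):
    """Implicit-feedback matrix: 1 where a learner interacted with a course."""
    matrix = [[0] * num_courses for _ in range(num_learners)]
    for learner, course in interactions:
        matrix[learner][course] = 1
    return matrix


def unseen_courses(matrix, learner):
    """Candidate courses for a learner: those with no recorded interaction.

    A zero entry means "not observed", not "disliked".
    """
    return [course for course, y in enumerate(matrix[learner]) if y == 0]


# Knowledge graph as a set of (head, relation, tail) triples. Two courses
# share the name "Data Structures" but are distinct entities (hypothetical IDs).
triples = {
    ("data_structures@tsinghua", "school", "Tsinghua University"),
    ("data_structures@scut", "school", "South China University of Technology"),
}

# Alignment between course indices and knowledge-graph entities.
alignment = {
    0: "data_structures@tsinghua",
    1: "data_structures@scut",
}


def school_of(course, alignment, triples):
    """Look up the school linked to a course through its aligned entity."""
    entity = alignment[course]
    for head, relation, tail in triples:
        if head == entity and relation == "school":
            return tail
    return None
```

Keeping the alignment separate from the triples is what lets two courses with the same title resolve to different schools.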
4. Method
4.1. Symbol Summary
4.2. Model

4.2.1. Knowledge Graph Diffusion Module
4.2.2. Relation-Aware Multi-Head Attention Network
4.2.3. Temporal Preference Modeling
4.2.4. Prediction Module
4.2.5. Loss Function
5. Experiments
5.1. Dataset
5.2. Baselines
- LR [33]: Linear Regression has been widely employed in classification tasks, playing a significant role in industrial Click-Through Rate (CTR) prediction. This approach utilizes a weighted sum of relevant features as the input to the model.
- BPR [34]: Bayesian Personalized Ranking (BPR) is a traditional collaborative filtering technique that leverages Bayesian methods to optimize the pairwise ranking loss function in recommendation tasks.
- FM [35]: Factorization Machines (FM) are principled models that can account for interactions between features and conveniently integrate any heuristic features. In our experiments, we utilized all available information except for the secondary subjects.
- DKN [23]: The Deep Knowledge-Aware Network introduces knowledge graph representations into recommendations to predict click-through rates. The core of DKN is a multi-channel knowledge-aware convolutional neural network that integrates semantic and knowledge representations while maintaining the alignment between words and entities. In this study, course titles are treated as the textual input for DKN.
- RippleNet [27]: This is an end-to-end framework that inherently integrates knowledge graphs into recommendation systems. It simulates the propagation of ripples on water surfaces to automatically expand users' possible interests along the links of the knowledge graph, thereby facilitating the diffusion of user preferences. Multiple "ripples" activated by historical clicks accumulate to form a preference distribution for candidate items, which is then utilized to predict click probabilities.
- KGNN-LS [36]: Knowledge-aware Graph Neural Networks with Label Smoothness Regularization compute user-specific item embeddings through a trainable function that transforms the knowledge graph into a weighted graph, and then apply a graph neural network for personalized computation.
- CKAN [37]: The Collaborative Knowledge-aware Attentive Network (CKAN) explicitly encodes collaborative signals through a heterogeneous propagation strategy and distinguishes the contributions of different knowledge neighbors with a knowledge-aware attention mechanism.
- KGAN [20]: The Knowledge Grouping Aggregation Network is a course recommendation model that iterates over a heterogeneous graph describing the relationships between courses and facts to estimate learners' interests, projecting learner behaviors and the course graph into a unified space.
- KFGAN [38]: This model is based on a knowledge-aware fine-grained attention network. It aligns collaborative filtering information with knowledge graph information and draws on graph contrastive learning to further uncover latent semantic information within the knowledge graph.
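The ripple-style propagation that RippleNet uses, and that the propagation-based baselines build on, can be sketched as follows. This is a simplified illustration rather than any baseline's actual code: the function name `ripple_sets` and the toy entity names are hypothetical, and real implementations typically sample a fixed number of triples per hop.

```python
from collections import defaultdict


def ripple_sets(triples, seeds, max_hops):
    """Collect, hop by hop, the triples reachable from the seed entities.

    Hop 1 contains the triples whose head is a seed (for example a course
    the learner clicked); hop k contains the triples whose head is a tail
    entity from hop k-1.
    """
    by_head = defaultdict(list)
    for head, relation, tail in triples:
        by_head[head].append((head, relation, tail))

    hops = []
    frontier = set(seeds)
    for _ in range(max_hops):
        # Sorting the frontier keeps the output order deterministic.
        hop = [tr for head in sorted(frontier) for tr in by_head.get(head, [])]
        hops.append(hop)
        frontier = {tail for _, _, tail in hop}
    return hops
```

For instance, starting from one clicked course linked to a teacher, and that teacher linked to a school, the first hop holds the course-teacher triple and the second hop holds the teacher-school triple.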
5.3. Implementation Details
5.4. Performance Comparison
- In prediction tasks, knowledge-aware recommendation models generally outperform classical recommendation models, with the exception of DKN. This may be attributed to the fact that knowledge-aware models effectively utilize knowledge graphs as auxiliary information, alleviating the high sparsity present in the course dataset.
- The DKN model underperformed compared to classical models such as BPR and FM in course recommendations. This may be due to the knowledge graph embedding (KGE) method employed by DKN, which is better suited for intra-graph applications rather than recommendation tasks, resulting in suboptimal entity embeddings for item recommendations.
- Among classical recommendation models, BPR performed best (AUC 0.7602), narrowly ahead of FM (0.7593). BPR uses Bayesian methods to optimize a pairwise ranking loss, which directly rewards ranking the courses a learner has interacted with above those the learner has not.
- Within knowledge-aware recommendation methods, DKN exhibited the poorest performance, indicating that propagation-based approaches are superior to embedding-based methods.
- The KGAN and RippleNet models significantly outperformed the CKAN and KGNN-LS models in course recommendations. A possible explanation for this is that the introduction of collaborative information may carry more noise, particularly in the face of the highly sparse nature of course recommendation data.
- Compared with these state-of-the-art baselines, PGDB achieves the highest AUC (0.8707), ahead of the strongest recent models KGAN (0.8595) and KFGAN (0.8564). This suggests that PGDB is more effective at uncovering the relationships between courses while emphasizing the transmission of important knowledge, thereby improving the accuracy of course recommendations.
- PGDB and its variants PGDB-s and PGDB-g all exceed every baseline model, demonstrating the competitive advantage of the approach in course recommendation. The advantage of PGDB (0.8707) over PGDB-s (0.8683) highlights the benefit of the multi-head attention mechanism, which attends to several important information sources simultaneously. PGDB-g (0.8678) also falls below PGDB: although a GRU is simpler than a BiLSTM, this simplification appears to cost some performance.
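The size of these gaps is easy to check from the reported AUC values. The snippet below only restates the numbers from the results table; the helper names `absolute_gain` and `relative_gain` are introduced here for illustration.

```python
# AUC values as reported in the results table of this paper.
auc = {
    "LR": 0.6283, "BPR": 0.7602, "FM": 0.7593, "DKN": 0.7281,
    "RippleNet": 0.8516, "KGNN-LS": 0.8077, "CKAN": 0.7809,
    "KGAN": 0.8595, "KFGAN": 0.8564,
    "PGDB": 0.8707, "PGDB-s": 0.8683, "PGDB-g": 0.8678,
}

# Everything that is not PGDB or one of its variants counts as a baseline.
baselines = [name for name in auc if not name.startswith("PGDB")]


def absolute_gain(model, baseline):
    """Difference in AUC between a model and a baseline."""
    return auc[model] - auc[baseline]


def relative_gain(model, baseline):
    """AUC improvement of a model over a baseline, in percent."""
    return absolute_gain(model, baseline) / auc[baseline] * 100


best_baseline = max(baselines, key=lambda name: auc[name])

for name in ("KGAN", "KFGAN", "RippleNet"):
    print(f"PGDB vs {name}: +{absolute_gain('PGDB', name):.4f} AUC "
          f"({relative_gain('PGDB', name):.2f}% relative)")
```

On these figures the strongest baseline is KGAN, and PGDB improves on it by 0.0112 AUC, roughly 1.3% in relative terms.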
5.5. Hyperparameter Influence

5.5.1. Number of Embedding Layers

5.5.2. Rejoin Propagation Triplet Sizes

6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Herlocker J L, Konstan J A, Borchers A, et al. An Algorithmic Framework for Performing Collaborative Filtering[J]. SIGIR Forum, 2017, 51, 227–234.
- Resnick P, Iacovou N, Suchak M, et al. GroupLens: An Open Architecture for Collaborative Filtering of Netnews[C]. Acm Conference on Computer Supported Cooperative Work, 1994.
- He X, Liao L, Zhang H, et al. Neural Collaborative Filtering[C]. Proceedings of the 26th International Conference on World Wide Web, 2017: 173–182.
- He X, Chua T-S. Neural Factorization Machines for Sparse Predictive Analytics[C]. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017: 355–364.
- Hidasi B, Karatzoglou A, Baltrunas L, et al. Session-based Recommendations with Recurrent Neural Networks[J]. CoRR, 2015, abs/1511.06939.
- Zhou G, Mou N, Fan Y, et al. Deep interest evolution network for click-through rate prediction[C]. Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, 2019: Article 729.
- Duan C, Sun J, Li K, et al. A dual-attention autoencoder network for efficient recommendation system[J]. Electronics, 2021, 10, 1581.
- Zheng G, Zhang F, Zheng Z, et al. DRN: A Deep Reinforcement Learning Framework for News Recommendation[C]. Proceedings of the 2018 World Wide Web Conference, 2018: 167–176.
- Zhang F, Yuan N J, Lian D, et al. Collaborative Knowledge Base Embedding for Recommender Systems[C]. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016: 353–362.
- Zhao N, Long Z, Wang J, et al. AGRE: A knowledge graph recommendation algorithm based on multiple paths embeddings RNN encoder[J]. Knowl. Based Syst., 2022, 259, 110078.
- Zhao Y, Ma W, Jiang Y, et al. A MOOCs Recommender System Based on User's Knowledge Background[C]. Knowledge Science, Engineering and Management, 2021.
- Gao C, Zheng Y, Li N, et al. A Survey of Graph Neural Networks for Recommender Systems: Challenges, Methods, and Directions[J]. ACM Trans. Recomm. Syst., 2023, 1, Article 3.
- Apaza R G, Cervantes E V V, Quispe L V C, et al. Online Courses Recommendation based on LDA[C]. Symposium on Information Management and Big Data, 2014.
- Li X, Li X, Tang J, et al. Improving Deep Item-Based Collaborative Filtering with Bayesian Personalized Ranking for MOOC Course Recommendation[C]. Knowledge Science, Engineering and Management: 13th International Conference, KSEM 2020, Hangzhou, China, August 28–30, 2020, Proceedings, Part I, 2020: 247–258.
- Xu W, Zhou Y. Course video recommendation with multimodal information in online learning platforms: A deep learning framework[J]. Br. J. Educ. Technol., 2020, 51, 1734–1747.
- Tian X, Liu F. Capacity Tracing-Enhanced Course Recommendation in MOOCs[J]. IEEE Transactions on Learning Technologies, 2021, 14, 313–321.
- Zhu Y, Lu H, Qiu P, et al. Heterogeneous teaching evaluation network based offline course recommendation with graph learning and tensor factorization[J]. Neurocomputing, 2020, 415, 84–95.
- Yang S, Cai X. Bilateral knowledge graph enhanced online course recommendation[J]. Information Systems, 2022, 107, 102000.
- Wang X, Ma W, Guo L, et al. HGNN: Hyperedge-based graph neural network for MOOC Course Recommendation[J]. Inf. Process. Manage., 2022, 59, 18.
- Zhang H, Shen X, Yi B, et al. KGAN: Knowledge Grouping Aggregation Network for course recommendation in MOOCs[J]. Expert Syst. Appl., 2023, 211, 9.
- Deng W, Zhu P, Chen H, et al. Knowledge-aware sequence modelling with deep learning for online course recommendation[J]. Inf. Process. Manage., 2023, 60, 15.
- Lin Y, Liu Z, Sun M, et al. Learning entity and relation embeddings for knowledge graph completion[C]. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015: 2181–2187.
- Wang H, Zhang F, Xie X, et al. DKN: Deep Knowledge-Aware Network for News Recommendation[C]. Proceedings of the 2018 World Wide Web Conference, 2018.
- Ji G, He S, Xu L, et al. Knowledge Graph Embedding via Dynamic Mapping Matrix[C]. Annual Meeting of the Association for Computational Linguistics, 2015.
- Sun Z, Yang J, Zhang J, et al. Recurrent knowledge graph embedding for effective recommendation[C]. Proceedings of the 12th ACM Conference on Recommender Systems, 2018: 297–305.
- Wang X, Wang D, Xu C, et al. Explainable reasoning over knowledge graphs for recommendation[C]. Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, 2019: Article 653.
- Wang H, Zhang F, Wang J, et al. RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems[C]. Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018: 417–426.
- Wang H, Zhao M, Xie X, et al. Knowledge Graph Convolutional Networks for Recommender Systems[C]. The World Wide Web Conference, 2019: 3307–3313.
- Wang X, He X, Cao Y, et al. KGAT: Knowledge Graph Attention Network for Recommendation[C]. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019: 950–958.
- Zou D, Wei W, Wang Z, et al. Improving Knowledge-aware Recommendation with Multi-level Interactive Contrastive Learning[C]. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022: 2817–2826.
- Zhang Z, Zhang L, Yang D, et al. KRAN: Knowledge Refining Attention Network for Recommendation[J]. ACM Trans. Knowl. Discov. Data, 2021, 16, Article 39.
- Yu J, Luo G, Xiao T, et al. MOOCCube: A Large-scale Data Repository for NLP Applications in MOOCs[C]. Annual Meeting of the Association for Computational Linguistics, 2020.
- Seber G A, Lee A J. Linear regression analysis[M]. John Wiley & Sons, 2012.
- Rendle S, Freudenthaler C, Gantner Z, et al. BPR: Bayesian personalized ranking from implicit feedback[C]. Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, 2009: 452–461.
- Rendle S. Factorization Machines with libFM[J]. ACM Trans. Intell. Syst. Technol., 2012, 3, Article 57.
- Wang H, Zhang F, Zhang M, et al. Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommender Systems[C]. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019: 968–977.
- Wang Z, Lin G, Tan H, et al. CKAN: Collaborative Knowledge-aware Attentive Network for Recommender Systems[C]. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020: 219–228.
- Wang W, Shen X, Yi B, et al. Knowledge-aware fine-grained attention networks with refined knowledge graph embedding for personalized recommendation[J]. Expert Systems with Applications, 2024, 249 (Part B), 12810.
| Notation | Description |
|---|---|
| $U$, $V$ | Sets of learners and courses, respectively |
| $Y$ | Learner-course interaction matrix |
| $\mathcal{G}$, $h$, $r$, $t$ | Knowledge graph, head entity, relation, tail entity |
| | Maximum number of hops |
| $\mathcal{F}$ | Prediction function |
| $\sigma$ | Sigmoid activation function |
| $\mathcal{L}$ | Target loss function |
| $W$ | Weight matrix |
| $b$ | Bias |

| Dataset | Learners | Courses | Interactions | Entities | Relations | Triples |
|---|---|---|---|---|---|---|
| MOOCCube | 7156 | 219 | 32091 | 2029 | 7 | 20893 |

| Model | LR | BPR | FM | DKN | RippleNet | KGNN-LS | CKAN | KGAN | KFGAN | PGDB | PGDB-s | PGDB-g |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AUC | 0.6283 | 0.7602 | 0.7593 | 0.7281 | 0.8516 | 0.8077 | 0.7809 | 0.8595 | 0.8564 | 0.8707 | 0.8683 | 0.8678 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).