§ Browse Thesis Bibliographic Record
  
System ID U0002-2307202515501500
DOI 10.6846/tku202500609
Title (Chinese) 基於知識圖譜嵌入的大型語言模型之文本指令式LoRA微調方法
Title (English) A LoRA-Based Instruction Fine-Tuning Framework for Large Language Models with Knowledge Graph Embeddings
Title (Third Language)
University Tamkang University
Department (Chinese) 機械與機電工程學系碩士班
Department (English) Department of Mechanical and Electro-Mechanical Engineering
Foreign Degree University Name
Foreign Degree College Name
Foreign Degree Institute Name
Academic Year 113
Semester 2
Publication Year 114
Graduate Student (Chinese) 陳威宇
Graduate Student (English) Wei-Yu Chen
Student ID 612370303
Degree Master's
Language Traditional Chinese
Second Language
Date of Oral Defense 2025-07-03
Number of Pages 66
Oral Defense Committee Advisor - 王銀添 (ytwang@mail.tku.edu.tw)
Committee Member - 許閔傑
Committee Member - 邱銘杰
Keywords (Chinese) LoRA fine-tuning
Knowledge graph
Large language model
Knowledge graph embedding training
Keywords (English) LoRA Fine-Tuning
Knowledge Graph
Large Language Model
Knowledge Graph Embedding Training
Keywords (Third Language)
Subject Classification
Abstract (Chinese)
This study proposes a fine-tuning method for large language models that incorporates knowledge graph embeddings, called Knowledge Graph Enhanced LoRA (KGELoRA), to strengthen the model's ability to understand and answer questions over structured knowledge. Whereas conventional LoRA (Low-Rank Adaptation) fine-tunes on text alone, this study additionally introduces a knowledge graph constructed for an enterprise scenario and designs a training pipeline that combines semantic and structural information. For data, an instruction-style dataset is built in-house, covering company profiles, technical modules, and application scenarios, annotated with primary concepts, sub-concepts, and triple relations. In preprocessing, semantic anchoring is first applied to obtain a semantic vector for each concept, and a dual-path mechanism is designed to assist LLM (Large Language Model) fine-tuning. The structural path builds the graph structure: PyKEEN's RotatE model is trained to embed concept nodes and semantic relations, and a TripleGatedController then generates control vectors that are injected into the attention projection layers of the LoRA modules to modulate the model's semantic direction. The semantic path fuses the anchoring vectors with the LLM's hidden states through Cross-Attention and replaces the original hidden states to influence the final output token generation. Experimental comparisons show that the KGELoRA architecture outperforms standard LoRA on both ROUGE-L and BERTScore, and is especially more stable on multi-concept reasoning and knowledge-intensive query tasks. Overall, this study demonstrates the potential of knowledge graph embeddings to enhance fine-tuned language models and provides a knowledge-oriented fine-tuning strategy suitable for enterprise settings.
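To make the structural path described above concrete, the following is a minimal Python sketch of training RotatE embeddings with PyKEEN from (head, relation, tail) triples. The example triples, embedding dimension, and training settings are illustrative assumptions, not the thesis's actual data or hyperparameters.

# Minimal sketch of the structural-path embedding step (assumed toy settings).
import numpy as np
from pykeen.pipeline import pipeline
from pykeen.triples import TriplesFactory

# Hypothetical enterprise triples; the real graph comes from the annotated dataset.
triples = np.array([
    ["CompanyA", "develops", "VisionModule"],
    ["VisionModule", "applied_in", "DefectInspection"],
    ["CompanyA", "provides", "AutomationService"],
], dtype=str)

tf = TriplesFactory.from_labeled_triples(triples)

# Toy setup: the same factory is reused for training and testing;
# a real run would hold out a separate evaluation split.
result = pipeline(
    training=tf,
    testing=tf,
    model="RotatE",                      # relational rotation in complex space
    model_kwargs=dict(embedding_dim=256),
    training_kwargs=dict(num_epochs=200, batch_size=128),
    random_seed=42,
)

# Concept-node and relation embeddings, later consumed by the controller module.
entity_emb = result.model.entity_representations[0]().detach()
relation_emb = result.model.relation_representations[0]().detach()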
Abstract (English)
This study proposes a novel fine-tuning method for large language models enhanced with knowledge graph embeddings, termed Knowledge Graph Enhanced LoRA (KGELoRA), aiming to improve the model’s ability to comprehend and respond using structured knowledge. In contrast to traditional LoRA approaches that rely solely on textual data, KGELoRA incorporates knowledge graphs constructed from enterprise domains and introduces a training framework that integrates both semantic and structural information.
In terms of data, we construct a custom instruction-style dataset covering company profiles, technical modules, and application scenarios, with annotations for primary concepts, sub-concepts, and (head, relation, tail) triples. For preprocessing, semantic anchoring is first employed to obtain semantic vectors for each concept. A dual-path mechanism is then designed to assist LLM fine-tuning. The structural path constructs graph topology using RotatE embeddings trained with the PyKEEN framework, and generates control vectors via a TripleGatedController, which are injected into the LoRA attention projection layers to guide semantic modulation. Meanwhile, the semantic path fuses the anchored semantic vectors with the LLM's hidden states through a Cross-Attention mechanism, replacing the original hidden states to influence the final output token generation.
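As an illustration of the semantic path described above, the PyTorch sketch below fuses anchored concept vectors with LLM hidden states through cross-attention and returns fused states that stand in for the originals. The module name, dimensions, and projection layer are illustrative assumptions rather than the thesis implementation.

# Illustrative sketch of the semantic-path fusion (assumed shapes and names).
import torch
import torch.nn as nn

class SemanticFusion(nn.Module):
    """Cross-attention between LLM hidden states (queries) and anchored
    concept vectors (keys/values); the result replaces the hidden states."""

    def __init__(self, hidden_dim: int, anchor_dim: int, num_heads: int = 8):
        super().__init__()
        self.proj = nn.Linear(anchor_dim, hidden_dim)   # map anchors to LLM width
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, hidden: torch.Tensor, anchors: torch.Tensor) -> torch.Tensor:
        # hidden:  (batch, seq_len, hidden_dim) from the language model
        # anchors: (batch, n_concepts, anchor_dim) semantic anchoring vectors
        kv = self.proj(anchors)
        fused, _ = self.attn(query=hidden, key=kv, value=kv)
        return self.norm(fused)  # replaces the original hidden states downstream

# Toy usage with assumed sizes (e.g., a 3B-scale LLM width and 1024-d anchors).
fusion = SemanticFusion(hidden_dim=3072, anchor_dim=1024)
h = torch.randn(2, 16, 3072)
a = torch.randn(2, 5, 1024)
out = fusion(h, a)   # shape (2, 16, 3072)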
Experimental comparisons show that the KGELoRA framework outperforms standard LoRA on both Rouge-L and BERTScore metrics, especially demonstrating superior stability in multi-concept reasoning and knowledge-intensive query tasks. Overall, this study showcases the potential of knowledge graph embeddings to enhance LoRA-based fine-tuning and presents a practical, knowledge-guided adaptation strategy tailored for enterprise-level applications.
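For reference, a minimal evaluation sketch for the two reported metrics is shown below, using the rouge_score and bert_score packages. The example strings are placeholders; Chinese outputs would additionally need word- or character-level segmentation before ROUGE-L scoring, and the BERTScore language setting would be "zh".

# Minimal sketch of computing the two reported metrics on placeholder outputs.
from rouge_score import rouge_scorer
from bert_score import score as bert_score

predictions = ["KGELoRA injects graph-derived control vectors into the LoRA layers."]
references = ["KGELoRA uses knowledge graph embeddings to steer LoRA fine-tuning."]

# ROUGE-L: longest-common-subsequence F-measure per reference/prediction pair.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)
rouge_l = [
    scorer.score(ref, pred)["rougeL"].fmeasure
    for ref, pred in zip(references, predictions)
]

# BERTScore F1 (placeholder English texts, so lang="en" here).
_, _, f1 = bert_score(predictions, references, lang="en")

print(f"ROUGE-L: {sum(rouge_l) / len(rouge_l):.4f}")
print(f"BERTScore F1: {f1.mean().item():.4f}")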
Abstract (Third Language)
Thesis Contents
Table of Contents
Acknowledgements	III
List of Figures	VI
List of Tables	VII
Chapter 1 Introduction	1
1.1 Research Motivation	1
1.2 Research Objectives	2
1.3 Literature Review	4
1.3.1 Related Work on Transformer-Based Large Language Models	4
1.3.2 Related Work on LoRA Fine-Tuning	5
1.3.3 Related Work on Instruction Fine-Tuning	6
1.3.4 Related Work on Knowledge Graphs	7
1.3.5 Related Work on Knowledge Graph Embeddings	8
1.4 Research Scope	10
1.5 Contributions	10
1.6 Thesis Organization	10
Chapter 2 LoRA Dataset and Processing	11
2.1 Data Design	11
2.2 Data Sources and Construction Process	13
2.3 Data Format and Content Structure	15
2.4 Concept Annotation and Graph Structure Integration	18
2.5 Graph Embedding and Fine-Tuning Application	20
2.6 Data Statistics and Version Notes	22
Chapter 3 KGELoRA Model Architecture and Method Design	24
3.1 Overview of the Overall Architecture	24
3.2 Dual-Path Mechanism Design	27
3.2.1 Construction of Semantic Anchoring Vectors	29
3.2.2 Construction of Structural Embedding Vectors	31
3.2.3 Semantic Injection Path	35
3.2.4 Generation of Triple Control Vectors	37
3.3 Concept Vector Pre-Training and Loss Design	39
3.3.2 Semantic Alignment Loss	40
3.3.3 Total Loss Design	41
3.4 Overview of the Model Training Procedure	42
Chapter 4 Experimental Design and Result Analysis	45
4.1 Experimental Environment and Hardware Configuration	45
4.2 Evaluation Settings and Task Format	47
4.3 Comparison Models and Ablation Design	49
4.4 Analysis of Experimental Results	51
4.5 Summary	56
Chapter 5 Conclusions and Future Work	57
5.1 Research Summary	57
5.2 Contributions and Significance	59
5.3 Future Research Directions	61
References	63

List of Figures
Figure 2.1 Conceptual diagram of the knowledge graph	12
Figure 2.2 Schematic of the instruction-style data format (with Instruction, Input, and Output fields)	17
Figure 2.3 Structure of the concept annotation fields	19
Figure 2.4 Visualization of triples in the graph_links field	19
Figure 3.1 Overall architecture of the KGELoRA model	26
Figure 3.2 RotatE trained from random initialization (PCA)	33
Figure 3.3 RotatE trained with semantic anchoring vectors as initialization (PCA)	33
Figure 3.4 Triples trained by the RotatE model fed into the TripleGatedController	34
Figure 3.5 Cross-Attention between semantic anchoring vectors and LLM hidden states	36
Figure 3.6 Control vector generation by the TripleGatedController module	38
Figure 3.7 Detailed workflow diagram	44
Figure 4.1 Average evaluation scores over 50 test rounds for Dataset 1 (bar chart)	53
Figure 4.2 Average evaluation scores over 50 test rounds for Dataset 2 (bar chart)	54
Figure 4.3 Average evaluation scores over 50 test rounds for Dataset 3 (bar chart)	55

List of Tables
Table 2.1 Comparison of input formats and knowledge structures between the baseline and KGELoRA models	23
Table 4.1 Training hyperparameters	46
Table 4.2 Average evaluation scores over 50 test rounds for Dataset 1	53
Table 4.3 Average evaluation scores over 50 test rounds for Dataset 2	54
Table 4.4 Average evaluation scores over 50 test rounds for Dataset 3	55

References
[1]	E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, “LoRA: Low-Rank Adaptation of Large Language Models,” In Proc. Int. Conf. Learn. Representations (ICLR), 2022.
[2]	Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin, "Attention is all you need," In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6000–6010, 2017.
[3]	Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 4171–4186, 2019.
[4]	Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei, "Language models are few-shot learners," In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS), pp. 1877–1901, 2020.
[5]	Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu, "Exploring the limits of transfer learning with a unified text-to-text transformer," Journal of Machine Learning Research, vol. 21, no. 140, pp. 1–67, 2020.
[6]	Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, and Quoc V. Le, "Finetuned language models are zero-shot learners," In Proceedings of the 10th International Conference on Learning Representations (ICLR), 2022.
[7]	Aidan Hogan, Eva Blomqvist, Michael Cochez, Claudia d'Amato, Gerard de Melo, Claudio Gutierrez, José Emilio Labra Gayo, Sabrina Kirrane, Sebastian Neumaier, Axel Polleres, Roberto Navigli, Axel-Cyrille Ngonga Ngomo, Sabbir M. Rashid, Anisa Rula, Lukas Schmelzeisen, Juan Sequeda, Steffen Staab, and Antoine Zimmermann, "Knowledge graphs," ACM Computing Surveys, vol. 54, no. 4, article 71, pp. 1–37, 2021.
[8]	Shaoxiong Ji, Shirui Pan, Erik Cambria, Pekka Marttinen, and Philip S. Yu, "A survey on knowledge graphs: Representation, acquisition and applications," IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 2, pp. 494–514, 2021. DOI: 10.1109/TNNLS.2021.3070843
[9]	Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, and Jian Tang, "RotatE: Knowledge graph embedding by relational rotation in complex space," In Proceedings of the 7th International Conference on Learning Representations (ICLR), 2019.
[10]	Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko, "Translating embeddings for modeling multi-relational data," In Advances in Neural Information Processing Systems, vol. 26, pp. 2787–2795, 2013.
[11]	Théo Trouillon, Johannes Welbl, Sebastian Riedel, Eric Gaussier, and Guillaume Bouchard, "Complex embeddings for simple link prediction," In Proceedings of the 33rd International Conference on Machine Learning (ICML), PMLR 48:2071–2080, 2016.
[12]	P. Jiang, L. Cao, C. Xiao, P. Bhatia, J. Sun, and J. Han, "KG-FIT: Knowledge graph fine-tuning upon open-world knowledge," arXiv preprint arXiv:2405.16412, 2024.
[13]	Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, Amy Yang, Angela Fan, and the LLaMA Team, "The LLaMA 3 herd of models," CoRR, abs/2407.21783, 2024.
[14]	S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[15]	Meta AI, "LLaMA 3.2 3B: Multilingual large language model," Hugging Face, 2024. [Online]. Available: https://huggingface.co/meta-llama/Llama-3.2-3B.
[16]	J. Chen, S. Xiao, P. Zhang, K. Luo, D. Lian, and Z. Liu, "BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation," arXiv preprint arXiv:2402.03216, 2024.
[17]	Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd, "The PageRank citation ranking: Bringing order to the web," Stanford InfoLab, Technical Report, 1998. [Online]. Available: http://ilpubs.stanford.edu:8090/422/
[18]	Jonathon Shlens, "A tutorial on principal component analysis," CoRR, abs/1404.1100, 2014.
[19]	S. Borgeaud, A. Mensch, J. Hoffmann, T. Cai, E. Rutherford, K. Millican, G. van den Driessche, J.-B. Lespiau, B. Damoc, A. Clark, D. de Las Casas, A. Guy, J. Menick, R. Ring, T. Hennigan, S. Huang, L. Maggiore, C. Jones, A. Cassirer, A. Brock, M. Paganini, G. Irving, O. Vinyals, S. Osindero, K. Simonyan, J. Rae, E. Elsen, and L. Sifre, "Improving language models by retrieving from trillions of tokens," In Proceedings of the 39th International Conference on Machine Learning (ICML), pp. 2206–2240, 2022.
[20]	D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533–536, Oct. 1986.
[21]	A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, "PyTorch: An imperative style, high-performance deep learning library," in Advances in Neural Information Processing Systems (NeurIPS), vol. 32, pp. 8024–8035, 2019.
[22]	E. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, and W. Chen, "Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment," arXiv preprint arXiv:2303.05350, 2023.
[23]	M. Ali, M. Berrendorf, C. T. Hoyt, L. Vermue, M. Galkin, S. Sharifzadeh, A. Fischer, T. Chwala, K. M. Asmat, N. Atanasova, and J. Lehmann, "PyKEEN 1.0: A Python library for training and evaluating knowledge graph embeddings," Journal of Machine Learning Research, vol. 22, no. 82, pp. 1–6, 2021.
[24]	C.-Y. Lin, "ROUGE: A package for automatic evaluation of summaries," in Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, Barcelona, Spain, 2004, pp. 74–81.
[25]	T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, and Y. Artzi, "BERTScore: Evaluating text generation with BERT," arXiv preprint arXiv:1904.09675, 2019.
[26]	D. M. W. Powers, "Evaluation: From precision, recall and F-measure to ROC, informedness, markedness & correlation," Journal of Machine Learning Technologies, vol. 2, no. 1, pp. 37–63, 2011.
Thesis Full-Text Access Authorization
National Central Library
Agrees to grant the National Central Library a royalty-free license; the bibliographic record and the electronic full text are made publicly available on the Internet immediately after the authorization form is submitted.
On Campus
The printed thesis is made publicly available on campus immediately.
Agrees to authorize worldwide public access to the electronic full text.
The electronic thesis is made publicly available on campus immediately.
Off Campus
Agrees to grant authorization to database vendors.
The electronic thesis is made publicly available off campus immediately.
