§ Browse Thesis Bibliographic Record
  
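System ID U0002-2906202516005800
DOI 10.6846/tku202500431
Title (Chinese) 以知識圖譜強化RAG技術之交通法規問答系統
Title (English) Enhancing Traffic Regulation Question-Answering Systems with Knowledge Graphs for Robust Retrieval-Augmented Generation
Title (third language)
University Tamkang University (淡江大學)
Department (Chinese) 資訊工程學系碩士在職專班 (In-service Master's Program, Department of Computer Science and Information Engineering)
Department (English) Department of Computer Science and Information Engineering
Foreign degree university
Foreign degree college
Foreign degree graduate institute
Academic year 113 (ROC calendar)
Semester 2
Year of publication 114 (ROC calendar)
Author (Chinese) 連挺安
Author (English) Ting-An Lien
Student ID 712410058
Degree Master's
Language Traditional Chinese
Second language
Oral defense date 2025-06-14
Pages 40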
Oral defense committee Committee member - 張志勇 (cychang@mail.tku.edu.tw)
Advisor - 武士戎 (wushihjung@mail.tku.edu.tw)
Committee member - 廖文華
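Keywords (Chinese) 知識圖譜 (Knowledge Graph)
對比學習 (Contrastive Learning)
向量RAG (Vector RAG)
Keywords (English) Knowledge Graph
Contrastive Learning
RAG (Retrieval-Augmented Generation)
Keywords (third language)
Subject classification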
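Chinese Abstract (translated)
Traffic regulations are highly structured and legally rigorous; their provisions are complex and dense with specialized terms, so they are often hard for the general public to understand. Traffic violations and disputes over interpretation occur frequently in daily life, and timely, accurate, legally grounded answers would be of real benefit to drivers and enforcement agencies. With the rapid development of large language models (LLMs), their natural language understanding and generation capabilities can now be applied to regulatory question answering, but applying them to traffic regulations still faces several challenges: the gap between the public's phrasing and statutory wording, semantic ambiguity and overlap among provisions, and divergence in which provisions apply to similar situations.
This study combines knowledge graphs and contrastive learning to strengthen retrieval-augmented generation (RAG), improving the accuracy and semantic understanding of a traffic regulation question-answering system. The work proceeds in four phases. First, a structured knowledge graph linking traffic regulation articles to their penalties is built, with an SVM model for preliminary article filtering. Second, a scenario-based knowledge graph is built from real traffic Q&A situations to bridge the semantic gap between everyday wording and the regulations. Third, RoBERTa and GCN are combined for contrastive learning to sharpen the discriminative power of the semantic vectors and the accuracy of article matching. Fourth, Self-Instruct is used for data augmentation and language model fine-tuning so that the LLM better fits traffic regulation domain knowledge.
This study is the first in Taiwan to introduce a RAG-enhancement architecture combining knowledge graphs and contrastive learning for the traffic regulation domain. It builds a localized traffic regulation knowledge graph and Q&A dataset and proposes a feasible mechanism for integrating semantic retrieval with generation. Experimental results show that the model combining vector RAG and contrastive learning improves significantly on Precision@3, BLEU, and ROUGE-L, with overall question-answering performance up to 26.85% higher than traditional text retrieval methods.

Keywords: Knowledge Graph, Contrastive Learning, Vector RAG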
English Abstract
Traffic regulations are characterized by high structural complexity and legal rigor, with intricate provisions and a wealth of specialized terminology. For the general public, such content is often difficult to comprehend. In daily life, violations of traffic rules and disputes over their interpretation are frequent. Providing accurate, legally grounded responses in real time would offer substantial benefits to both drivers and law enforcement agencies. With the rapid advancement of Large Language Models (LLMs), their capabilities in natural language understanding and generation have made them applicable to legal question-answering scenarios. However, applying LLMs to the domain of traffic regulations still faces several challenges. These include discrepancies between colloquial user queries and formal legal language, semantic ambiguity and overlap among provisions, and divergent interpretations of applicable regulations in similar situations.
This study proposes an enhanced Retrieval-Augmented Generation (RAG) framework that integrates Knowledge Graphs and Contrastive Learning to improve the accuracy and semantic understanding of traffic law question-answering systems. The research is conducted in four phases:
1. Construction of a structured knowledge graph mapping traffic regulation articles to corresponding penalties, with initial article filtering supported by an SVM model.
2. Development of scenario-based knowledge graphs based on real-world traffic inquiries, aimed at bridging the semantic gap between user language and legal provisions.
3. Implementation of contrastive learning combining RoBERTa and Graph Convolutional Networks (GCN) to enhance the discriminative power of semantic vectors and improve article matching accuracy.
4. Application of Self-Instruct techniques for data augmentation and fine-tuning of LLMs to better align with domain-specific knowledge in traffic regulations.
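The contrastive-learning phase can be illustrated with a small, hedged sketch. The abstract does not state which loss or distance the thesis uses; the version below assumes a hinge-style triplet loss on cosine distance, with a hypothetical margin of 0.2 and toy 3-dimensional vectors standing in for the RoBERTa/GCN embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine distance (1 - similarity).

    The margin value is an assumption for illustration only.
    """
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (hypothetical values, not taken from the thesis).
question = [1.0, 0.0, 0.0]
matching_article = [1.0, 0.0, 0.0]
unrelated_article = [0.0, 1.0, 0.0]

# Matching article is already closer than the unrelated one: no penalty.
well_separated = triplet_loss(question, matching_article, unrelated_article)

# Roles swapped: the "positive" is far away, so the loss is positive.
badly_separated = triplet_loss(question, unrelated_article, matching_article)
```

A loss of zero means the matching article is already closer to the question than the unrelated one by at least the margin; a positive loss is the quantity that training would push down.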
This study is the first in Taiwan to introduce a RAG-enhanced architecture combining knowledge graphs and contrastive learning in the traffic regulation domain. It constructs a localized traffic regulation knowledge graph and Q&A dataset, and proposes a practical mechanism for integrating semantic retrieval with generation. Experimental results show that the model integrating vector-based RAG and contrastive learning significantly improves metrics such as Precision@3, BLEU, and ROUGE-L, with overall performance surpassing traditional keyword-based retrieval methods by up to 26.85%.
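The abstract reports Precision@3 without defining it. As a minimal sketch, assuming the standard definition (the fraction of the top k retrieved items that are relevant), the metric can be computed as follows; the article labels are hypothetical placeholders, not identifiers from the thesis dataset.

```python
def precision_at_k(retrieved, relevant, k=3):
    """Fraction of the top-k retrieved items that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


# Hypothetical ranked retrieval result and gold set for one question.
ranked_articles = ["article_A", "article_B", "article_C", "article_D"]
gold_articles = {"article_A", "article_C", "article_D"}

# Two of the first three retrieved articles are in the gold set.
p_at_3 = precision_at_k(ranked_articles, gold_articles, k=3)
```

Averaging this value over all test questions gives a corpus-level score; whether the thesis averages this way is not stated in the abstract.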

Keywords: Knowledge Graph, Contrastive Learning, Vector-based RAG
Third-Language Abstract
Thesis Table of Contents
Contents
 
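Acknowledgments
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Applications of Machine Learning and Deep Learning in Traffic Regulation Question-Answering Systems
2.2 Large Language Models and RAG
2.3 Combining Traffic Regulation Knowledge Graphs with Retrieval
2.4 Comparison of This Study with Existing Methods
Chapter 3 System Architecture and Background
3.1 Large Language Models and Pre-training Fundamentals
3.2 Text-based RAG
3.3 Vector-based RAG
3.4 Contrastive Learning
3.5 Self-Instruct Task Augmentation
Chapter 4 System Design
4.1 Overall Architecture
4.2 Traffic Regulation Knowledge Graph Construction
4.3 Contrastive Learning and Semantic Retrieval Design
4.4 LLM Fine-tuning Workflow
Chapter 5 Experimental Analysis
5.1 Datasets
5.2 Experimental Environment and Parameter Settings
5.3 Experimental Results
5.4 Analysis and Discussion
Chapter 6 Conclusion
References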
 
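List of Figures
Figure 1. Gaps in existing traffic-law customer service
Figure 2. RAG architecture
Figure 3. Transformer architecture
Figure 4. Llama 3 architecture
Figure 5. BM25 formula
Figure 6. SVM formula
Figure 7. Cosine similarity formula
Figure 8. Basic contrastive learning architecture
Figure 9. Architecture of the graph convolutional network multi-level text learning model
Figure 10. Node feature update formula
Figure 11. System architecture
Figure 12. Triplet loss function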
 
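List of Tables
Table 1. Comparison of related work
Table 2. Traffic regulation article dataset
Table 3. Question-answer corpus
Table 4. Experimental environment and package versions
Table 5. RALM fine-tuning parameter settings
Table 6. Comparison of question-answering accuracy
Table 7. Retrieval module Top-5 success rate and precision
Table 8. Mean scores of system answers
Table 9. Comparison of Top-K recall
Table 10. Answer consistency and stability test results
Table 11. Performance comparison in the few-shot setting
Table 12. End-to-end query performance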
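Full-Text Access Permissions
National Central Library
Agrees to grant the National Central Library a royalty-free license; the bibliographic record and full-text electronic file are made publicly available on the Internet immediately after the authorization form is submitted
On campus
Print thesis available on campus immediately
Agrees to authorize on-campus access to the full-text electronic thesis
Electronic thesis available on campus immediately
Off campus
Agrees to authorize database vendors
Electronic thesis available off campus immediately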
