§ 瀏覽學位論文書目資料
  
系統識別號 U0002-0209202410002100
DOI 10.6846/tku202400736
論文名稱(中文) 使用上線學習並結合雙樹堆和乘法權重更新演算法實現動態搜尋最佳化
論文名稱(英文) Dynamic Search Optimization: Double Treaps and MWU Algorithm for Efficient Online Learning
第三語言論文名稱
校院名稱 淡江大學
系所名稱(中文) 資訊工程學系碩士班
系所名稱(英文) Department of Computer Science and Information Engineering
外國學位學校名稱
外國學位學院名稱
外國學位研究所名稱
學年度 112
學期 2
出版年 113
研究生(中文) 盧禹丞
研究生(英文) Yu-Cheng Lu
學號 612410125
學位類別 碩士
語言別 英文
第二語言別
口試日期 2024-07-18
論文頁數 25頁
口試委員 指導教授 - 林莊傑(josephcclin@mail.ntou.edu.tw)
口試委員 - 吳孟倫
口試委員 - 陳柏安
關鍵字(中) 演算法
資料結構
樹堆
乘法權重更新演算法
上線學習
關鍵字(英) Algorithm
Data Structure
Treap
MWU Algorithm
Online Learning
第三語言關鍵字
學科別分類
中文摘要
隨著數位資料的快速增長,搜尋演算法的效率與可擴展性變得愈發重要。傳統方法經常難以在速度與資源效率之間取得平衡,尤其是在包含不同頻率元素的大型且複雜的數據集上。本文提出了一種新的方法,通過結合隨機化與上線學習樹堆結構以及乘法權重更新(MWU)演算法來優化文字搜尋效能。其主要目標是在不論單詞熱門程度的情況下,提升搜尋速度並減少資源消耗。我們的方法利用乘法權重更新演算法在學習樹堆和隨機樹堆之間做動態選擇,確保搜尋效率。實驗結果顯示,與其他方法相比,我們的方法顯著縮短了搜尋時間。此方法能有效適應單詞頻率的變化,對於大規模數據集具有較強的適應性。總之,結合隨機化與學習樹堆以及乘法權重更新演算法,為各種應用中的搜尋效能優化提供了一個具有潛力的解決方案。
英文摘要
The rapid growth of digital data necessitates efficient and scalable search algorithms. Traditional methods often struggle with balancing speed and resource efficiency, particularly in large and complex datasets with varying element frequencies. This paper proposes a new approach to optimize word search performance by integrating randomized and learned treaps with the Multiplicative Weights Update (MWU) algorithm. The primary objective is to enhance the speed of word searches while minimizing resource usage, regardless of the words' popularity or obscurity. Our method leverages the MWU algorithm to dynamically select between the learned treap and the randomized treap, ensuring efficient searches. Experimental results demonstrate that our approach significantly reduces search times compared to other methods. The proposed method effectively adapts to the frequency of words, making it robust for diverse and large-scale datasets. In conclusion, integrating randomized and learned treaps with the MWU algorithm offers a promising solution for optimizing search performance in various applications, paving the way for future enhancements in search algorithms.
第三語言摘要
論文目次
Contents
1 Introduction 1
1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 Preliminaries 4
2.1 Treap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Multiplicative Weights Update (MWU) Algorithm . . . . . . . . . . . 4
2.3 Red-Black Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.4 AVL Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.5 B-Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.6 Splay Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.7 Learning Augmented . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.7.1 Learning-Augmented Binary Search Tree . . . . . . . . . . . . 8
2.7.2 Learning-Augmented k-means Clustering . . . . . . . . . . . . 8
2.7.3 Learning-Augmented Data Stream Algorithms . . . . . . . . . 8
3 Methodology 9
3.1 Na¨ıve Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1.1 First Attempt . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1.2 Second Attempt . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2.1 The Entire Process . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2.2 Search, Insert and Update Process . . . . . . . . . . . . . . . 11
V
3.3 Treaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3.1 Random Treap . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3.2 Learned Treap . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.4 Multiplicative Weights Update (MWU) Algorithm . . . . . . . . . . . 12
4 Experiment 17
4.1 Dataset and Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . 17
4.1.1 Data Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . 17
4.2 Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.3 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5 Conclusion 23
References 24
VI
List of Figures
3.1 The flowchart of the entire algorithm . . . . . . . . . . . . . . . . . . 15
3.2 The flowchart of the search, insert, and update process . . . . . . . . 16
4.1 Data preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.2 Mix: average comparison times with randomly chosen words . . . . . 20
4.3 High: average comparison times with high-frequency words . . . . . . 20
4.4 Low: average comparison times with low-frequency words . . . . . . . 21
4.5 Adversarial: average comparison times with adversarial case . . . . . 21
參考文獻
References
[1] Treap (a randomized binary search tree). https://www.geeksforgeeks.org/treapa-randomized-binary-search-tree/, 2022.
[2] Implementation of searching, inserting and deleting in treap.
https://www.geeksforgeeks.org/implementation-of-search-insert-and-delete-intreap/, 2023.
[3] Insertion in an avl tree. https://www.geeksforgeeks.org/insertion-in-an-avltree/, 2023.
[4] Introduction of b-tree. https://www.geeksforgeeks.org/introduction-of-b-tree2/, 2023.
[5] Insertion in splay tree. https://www.geeksforgeeks.org/insertion-in-splay-tree/,
2024.
[6] Introduction to red-black tree. https://www.geeksforgeeks.org/introduction-tored-black-tree/, 2024.
[7] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update
method: a meta-algorithm and applications. Theory of computing, 8(1):121–
164, 2012.
[8] Erik D Demaine, John Iacono, Grigorios Koumoutsos, and Stefan Langerman.
Belga b-trees. Theory of Computing Systems, 65:541–558, 2021.
[9] Jon C Ergun, Zhili Feng, Sandeep Silwal, David P Woodruff, and Samson Zhou.
Learning-augmented k-means clustering. arXiv preprint arXiv:2110.14094,
2021.
24
[10] Leo J Guibas and Robert Sedgewick. A dichromatic framework for balanced
trees. In 19th Annual Symposium on Foundations of Computer Science (sfcs
1978), pages 8–21. IEEE, 1978.
[11] Elad Hazan et al. Introduction to online convex optimization. Foundations and
Trends® in Optimization, 2(3-4):157–325, 2016.
[12] David P Helmbold, Robert E Schapire, Yoram Singer, and Manfred K Warmuth. On-line portfolio selection using multiplicative updates. Mathematical
Finance, 8(4):325–347, 1998.
[13] Chen-Yu Hsu, Piotr Indyk, Dina Katabi, and Ali Vakilian. Learning-based frequency estimation algorithms. In International Conference on Learning Representations, 2019.
[14] Honghao Lin, Tian Luo, and David Woodruff. Learning augmented binary
search trees. In International Conference on Machine Learning, pages 13431–
13440. PMLR, 2022.
[15] Francesco Orabona. A modern introduction to online learning. arXiv preprint
arXiv:1912.13213, 2019.
[16] Rahul Shah, Cheng Sheng, Sharma Thankachan, and Jeffrey Vitter. Ranked
document retrieval in external memory. ACM Trans. Algorithms, 19(1), mar
2023.
論文全文使用權限
國家圖書館
同意無償授權國家圖書館,書目與全文電子檔於繳交授權書後, 於網際網路立即公開
校內
校內紙本論文立即公開
同意電子論文全文授權於全球公開
校內電子論文立即公開
校外
同意授權
校外電子論文立即公開

如有問題,歡迎洽詢!
圖書館數位資訊組 (02)2621-5656 轉 2487 或 來信