系統識別號 | U0002-1807201201535300 |
---|---|
DOI | 10.6846/TKU.2012.00749 |
論文名稱(中文) | 基於MapReduce的影像處理系統加入DSRF優先權排程機制 |
論文名稱(英文) | MapReduce-based Image Processing System with Priority-based DSRF Algorithm |
第三語言論文名稱 | |
校院名稱 | 淡江大學 |
系所名稱(中文) | 電機工程學系碩士班 |
系所名稱(英文) | Department of Electrical and Computer Engineering |
外國學位學校名稱 | |
外國學位學院名稱 | |
外國學位研究所名稱 | |
學年度 | 100 |
學期 | 2 |
出版年 | 101 |
研究生(中文) | 郭玲裳 |
研究生(英文) | Ling Shang Kuo |
學號 | 699470042 |
學位類別 | 碩士 |
語言別 | 繁體中文 |
第二語言別 | |
口試日期 | 2012-06-14 |
論文頁數 | 63頁 |
口試委員 |
指導教授
-
李維聰(wtlee@mail.tku.edu.tw)
委員 - 朱國志(kcchu@mail.lhu.edu.tw) 委員 - 衛信文(141131@mail.tku.edu.tw) 委員 - 劉豐豪(lfh123@gmail.com) 委員 - 吳庭育(tyw@mail.tku.edu.tw) |
關鍵字(中) |
3D影像 雲端系統 排程機制 |
關鍵字(英) |
3D image Cloud System Schedule Algorithm |
第三語言關鍵字 | |
學科別分類 | |
中文摘要 |
MapReduce是由Google所提出用於處理大量密集式資料的分散式平行運算架構。使用MapReduce的使用者只要撰寫Map以及Reduce這兩個Function,並輸入待處理的資料,即可以自動的完成需要大量運算資源的工作,如網頁搜尋、數值統計等。但是較少使用者將2D to 3D這種高複雜度且運算時間較長的影像處理應用放置於MapReduce處理,然而這種高複雜度以及高運算量的應用若是透過使用MapReduce來進行運算則可以用來增加其處理效能。 此外,在較早的文獻中,有許多的研究以及應用都被拿來建置在MapReduce上,但是這些研究以及應用所存在的環境需要擁有完整的運算資訊才能被正確地運行。而本論文的運作環境為一即時性的影像處理環境,所以如果沒有一個較佳的排程機制負責處理影像即時地被儲存在系統裡,將無法有效地將即時性的應用程序移植到MapReduce系統上。 本論文實作一個運作在MapReduce上的多使用者(Multi-user)的2D to 3D系統,並讓Map負責處理高複雜度以及高運算量的影像處理程序,而Reduce負責收集Map處理後的Intermediate Data並輸出。當多個使用者同時競爭MapReduce的資料來處理2D to 3D的應用時,就會造成等待Map處理的資料被正在處理的使用者所延誤,而Reduce必需等待所有Map處理完成後才能產生輸出結果的情形,。因此,本論文提出了一個新的排程機制,讓MapReduce能夠自動依照任務的處理進度自動切換下一個任務執行,並減少Reduce的閒置時間,我們稱之為Dynamic Switch of Reduce Function (DSRF) Algorithm。 |
英文摘要 |
MapReduce, a programming model proposed by Google, is designed for distributed parallel computing to process vast amounts of data. MapReduce users write the Map and Reduce functions, input the data to be processed and the task will be finished automatically. Hadoop, a distributed file system designed for implementing Google MapReduce, is adopted by many enterprises for daily data-intensive applications. Most users process short tasks using MapReduce; in other words, most tasks handled by the Map and Reduce functions require low response time. Currently, quite few users use MapReduce for 2D to 3D image processing, which is highly complicated and requires long execution time. However, in our opinion, MapReduce is exactly suitable for processing applications of high complexity and high computation. The other researches use MapReduce to build their applications. In the above researches, they will store the complete data into their file system. In our paper, our system is a real-time image processing system and the file system will get the real-time image continually. By the way, the system doesn’t have a schedule algorithm to solve the real-time application problem. This paper implements MapReduce on an integrated 2D to 3D multi-user system, in which Map is responsible for image processing procedures of high complexity and high computation, and Reduce is responsible for integrating the intermediate data processed by Map for the final output. Different from short tasks, when several users compete simultaneously to acquire data from MapReduce for 2D to 3D applications, data that waits to be processed by Map will be delayed by the current user and Reduce has to wait until the completion of all Map tasks to generate the final result. Therefore, a novel scheduling scheme, Dynamic Switch of Reduce Function (DSRF) Algorithm. |
第三語言摘要 | |
論文目次 |
目錄 第一章 緒論 1 1.1 前言 1 1.2 動機與目的 1 1.3 論文章節架構 6 第二章 背景知識與相關研究 8 2.1 Google File System分散式檔案系統 8 2.1.1 BigTable 11 2.2 MapReduce 11 2.3 Hadoop 15 2.3.1 Hadoop MapReduce 17 2.3.1.1 JobNode 18 2.3.1.2 TaskNode 19 2.3.2 HDFS 20 2.3.2.1 NameNode 22 2.3.2.2 DataNode 22 2.3.3 HBase 23 2.3.4 ZooKeeper 25 2.4 影像處理程序 26 2.4.1 Harris Corner Detector 26 2.4.1.1 Moravec Corner Detection Algorithm 27 2.4.1.2 Harris Corner Detection Algorithm 28 2.4.2 Sum of Absolute Difference(SAD) 28 2.4.3 Model Reconstruction 29 2.5 相關研究 29 第三章 新的排程演算法DSRF Algorithm 33 3.1 DSRF(Dynamic Switch of Reduce Function) Algorithm 33 3.1.1 MapReduce System with 2D to 3D Application 33 3.1.2 DSRF(Dynamic Switch of Reduce Function) Algorithm Overview 36 3.2 DSRF Algorithm with MapReduce 39 3.3 Algorithm Working principle 41 第四章 模擬結果與討論 44 4.1 Implementation 44 4.1.1 Implementation Environment 44 4.1.2 Performance Analysis and Evaluation 46 第五章 結論與未來展望 60 參考文獻 62 圖目錄 圖2.1 MapReduce架構圖 14 圖2.2 Hadoop MapReduce 架構 18 圖2.3 JobNode & DataNode 關係圖 20 圖2.4 NameNode & DataNode 關係圖 23 圖2.5 影像骨架重建流程圖 26 圖2.6 Moravec Algorithm 示意圖 28 圖3.1 MapReduce Runtime System with 2D to 3D Application 34 圖3.2 DSRF Algorithm Overview 38 圖3.3 DSRF Algorithm Working 流程圖 39 圖4.1 DSRF in Hadoop 45 圖4.2 輸入圖片以及模擬圖片 46 圖4.3 User數量對平均運算時間的影響 48 圖4.4 User數量對平均運算時間的影響(放大) 48 圖4.5 User數量增加對DSRF Algorithm切換次數的影響 49 圖4.6 User數量增加對合併任務的延遲以及切換次數的關係 50 圖4.7 User數量增加對系統的輸出量影響 52 圖4.8 User數量增加對使用者平均等待時間的影響 53 圖4.9 User數量對系統最大排隊長度的影響(User Number = 10) 55 圖4.10 User數量對系統最大排隊長度的影響(User Number = 20) 55 圖4.11 User數量對系統最大排隊長度的影響(User Number = 30) 56 圖4.12 User數量對系統最大排隊長度的影響(User Number = 40) 56 圖4.13 User數量對系統最大排隊長度的影響(User Number = 50) 57 圖4.14 User數量對系統最大排隊長度的影響(User Number = 60) 58 圖4.15 User數量對系統最大排隊長度的影響(User Number = 70) 58 圖4.16 User數量對系統最大排隊長度的影響(User Number = 80) 59 |
參考文獻 |
郭玲裳,李維聰, “基於雲端系統Map-Reduce 模組加入資料緩衝區及優先權機制”, National Symposium on Telecommunication (NST), 18-19 NovQinlu He,Zhanhuai Li’Xiao Zhang, “Study on Cloud Storage System based on Distributed Storage Systems”, Computational and Information Sciences( ICCIS), 17-19 Dec 2010, pp. 1332 - 1335 Kevin D.Bowers,Ari Juels, and Alina Oprea., “HAIL: A HighAvailability and Integrity Layer for Cloud Storage”, Computer and communications security (CCS), Nov 2009, pp. 187-198 Mingyue Luo,Gang Liu, “Distributed log information processing with Map-Reduce: A case study from raw data to final models”, Information Theory and Information Security(ICITIS), 17-19 Dec. 2010, pp.1143-1146 Afrati, Foto N. Ullman, Jeffrey D., “Optimizing Multiway Joins in a Map-Reduce Environment”, IEEE Transactions on Knowledge and Data Engineering, vol. 23, Issue. 9, Sept. 2011, pp. 1282-1298 Chen Zhang ,De Sterck, H.,”CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase”, Cloud Computing Technology and Science (CloudCom), Nov. 30 2010-Dec. 3 2010, pp. 368-375 Jorda Polo, David Carrera, Yolanda Becerra, Malgorzata Steinder, and Ian Whalley., “Performance-driven task co-scheduling for mapreduce environments”, Network Operations and Management Symposium(NOMS), 19-23 Apr 2010, pp.373 –380 Thomas Sandholm and Kevin Lai., “Dynamic proportional share scheduling in hadoop”, Job Scheduling Stragies For Parallel Processing (JSSPP), vol. 6253, 2010, pp. 110-131 Jia Yu and Rajkumar Buyya. “A budget constrained scheduling of workflow applications on utility grids using genetic algorithms”, High Performance Distributed Computing (HPDC), 19-23 June 2006, pp.1-10 Matei Zaharia, Dhruba Borthakur, Joydeep Sen Sarma, Khaled Elmeleegy,Scott Shenker, and Ion Stoica, “Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling”, In EuroSys ’10:Proceedings of the 5th European conference on Computer systems, 2010, pp. 265–278. J. Ekanayake, S. Pallickara, and G. Fox, "MapReduce for Data Intensive Scientific Analyses." (ESCIENCE) ,2008, pp. 277-284. Qiu X., Ekanayake J., Gunarathne T. et al., "Using MapReduce Technologies in Bioinformatics and Medical Informatics." Lee, Kuan-Rong ; Fu, Meng-Hsuan ; Kuo, Yau-Hwang, “A hierarchical scheduling strategy for the composition services architecture based on cloud computing”, Next Generation Information Technology (ICNIT), 21-23 June 2011, pp. 163-169 Thilina Gunarathne, Tak-Lon Wu, Judy Qiu et al., “Cloud Computing Paradigms for Pleasingly Parallel Biomedical Applications”, in Proceedings of the Emerging Computational Methods for the Life Sciences Workshop of ACM HPDC 2010 conference, vol.23, issue. 17, 27 JUN 2011, pp. 2338-2354 J. Ekanayake, T. Gunarathne, J. Qiu et al., “Cloud Technologies for Bioinformatics Applications”, IEEE Transactions on Parallel and Distributed Systems, vol. 22, issue. 6, June 2011, pp. 998-1011 Chris Harris, Mike Stephens, “A Combined Corner and Edge Detector” Proceedings of 4^thAlvery Vision Conference, pp. 147-151 Moravec, H, “Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover”, Tech Report CMU-RI-TR-3, Sep 1980 Sadasivam, G.S.; Selvaraj, D. “A novel parallel hybrid PSO-GA using MapReduce to schedule jobs in Hadoop data grids”, Nature and Biologically Inspired Computing (NaBIC) , 15-17 Dec. 2010 , pp. 377 - 382 P. R. Jelenkovic, X. Kang, J. Tan, “Adaptive and scalable comparison scheduling”, in: SIGMETRICS ’07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 2007, pp. 215–226. A. Wierman, M. Nuyens, “Scheduling despite inexact job-size information”, in: SIGMETRICS ’08: Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 2008, pp. 25–36. Apache Software Foundation, Hadoop on demand. URL http://hadoop.apache.org/core/docs/r0.20.0/hod_user_guide.html M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, I. Stoica, “Job scheduling for multi-user MapReduce clusters”, Tech. Rep. UCB/EECS-2009-55, Apr 2009 |
論文全文使用權限 |
如有問題,歡迎洽詢!
圖書館數位資訊組 (02)2621-5656 轉 2487 或 來信