電子學位論文服務

§ 瀏覽學位論文書目資料

本論文電子全文於2012-07-23起於校外公開使用
本論文紙本於2012-07-23起公開使用

系統識別號	U0002-1807201201535300
DOI	10.6846/TKU.2012.00749
論文名稱(中文)	基於MapReduce的影像處理系統加入DSRF優先權排程機制
論文名稱(英文)	MapReduce-based Image Processing System with Priority-based DSRF Algorithm
第三語言論文名稱
校院名稱	淡江大學
系所名稱(中文)	電機工程學系碩士班
系所名稱(英文)	Department of Electrical and Computer Engineering
外國學位學校名稱
外國學位學院名稱
外國學位研究所名稱
學年度	100
學期	2
出版年	101
研究生(中文)	郭玲裳
研究生(英文)	Ling Shang Kuo
學號	699470042
學位類別	碩士
語言別	繁體中文
第二語言別
口試日期	2012-06-14
論文頁數	63頁
口試委員	指導教授 - 李維聰(wtlee@mail.tku.edu.tw) 委員 - 朱國志(kcchu@mail.lhu.edu.tw) 委員 - 衛信文(141131@mail.tku.edu.tw) 委員 - 劉豐豪(lfh123@gmail.com) 委員 - 吳庭育(tyw@mail.tku.edu.tw)
關鍵字(中)	3D影像雲端系統排程機制
關鍵字(英)	3D image Cloud System Schedule Algorithm
第三語言關鍵字
學科別分類
中文摘要	MapReduce是由Google所提出用於處理大量密集式資料的分散式平行運算架構。使用MapReduce的使用者只要撰寫Map以及Reduce這兩個Function，並輸入待處理的資料，即可以自動的完成需要大量運算資源的工作，如網頁搜尋、數值統計等。但是較少使用者將2D to 3D這種高複雜度且運算時間較長的影像處理應用放置於MapReduce處理，然而這種高複雜度以及高運算量的應用若是透過使用MapReduce來進行運算則可以用來增加其處理效能。此外，在較早的文獻中，有許多的研究以及應用都被拿來建置在MapReduce上，但是這些研究以及應用所存在的環境需要擁有完整的運算資訊才能被正確地運行。而本論文的運作環境為一即時性的影像處理環境，所以如果沒有一個較佳的排程機制負責處理影像即時地被儲存在系統裡，將無法有效地將即時性的應用程序移植到MapReduce系統上。本論文實作一個運作在MapReduce上的多使用者(Multi-user)的2D to 3D系統，並讓Map負責處理高複雜度以及高運算量的影像處理程序，而Reduce負責收集Map處理後的Intermediate Data並輸出。當多個使用者同時競爭MapReduce的資料來處理2D to 3D的應用時，就會造成等待Map處理的資料被正在處理的使用者所延誤，而Reduce必需等待所有Map處理完成後才能產生輸出結果的情形，。因此，本論文提出了一個新的排程機制，讓MapReduce能夠自動依照任務的處理進度自動切換下一個任務執行，並減少Reduce的閒置時間，我們稱之為Dynamic Switch of Reduce Function (DSRF) Algorithm。
英文摘要	MapReduce, a programming model proposed by Google, is designed for distributed parallel computing to process vast amounts of data. MapReduce users write the Map and Reduce functions, input the data to be processed and the task will be finished automatically. Hadoop, a distributed file system designed for implementing Google MapReduce, is adopted by many enterprises for daily data-intensive applications. Most users process short tasks using MapReduce; in other words, most tasks handled by the Map and Reduce functions require low response time. Currently, quite few users use MapReduce for 2D to 3D image processing, which is highly complicated and requires long execution time. However, in our opinion, MapReduce is exactly suitable for processing applications of high complexity and high computation. The other researches use MapReduce to build their applications. In the above researches, they will store the complete data into their file system. In our paper, our system is a real-time image processing system and the file system will get the real-time image continually. By the way, the system doesn’t have a schedule algorithm to solve the real-time application problem. This paper implements MapReduce on an integrated 2D to 3D multi-user system, in which Map is responsible for image processing procedures of high complexity and high computation, and Reduce is responsible for integrating the intermediate data processed by Map for the final output. Different from short tasks, when several users compete simultaneously to acquire data from MapReduce for 2D to 3D applications, data that waits to be processed by Map will be delayed by the current user and Reduce has to wait until the completion of all Map tasks to generate the final result. Therefore, a novel scheduling scheme, Dynamic Switch of Reduce Function (DSRF) Algorithm.
第三語言摘要
論文目次	目錄第一章緒論 1 1.1 前言 1 1.2 動機與目的 1 1.3 論文章節架構 6 第二章背景知識與相關研究 8 2.1 Google File System分散式檔案系統 8 2.1.1 BigTable 11 2.2 MapReduce 11 2.3 Hadoop 15 2.3.1 Hadoop MapReduce 17 2.3.1.1 JobNode 18 2.3.1.2 TaskNode 19 2.3.2 HDFS 20 2.3.2.1 NameNode 22 2.3.2.2 DataNode 22 2.3.3 HBase 23 2.3.4 ZooKeeper 25 2.4 影像處理程序 26 2.4.1 Harris Corner Detector 26 2.4.1.1 Moravec Corner Detection Algorithm 27 2.4.1.2 Harris Corner Detection Algorithm 28 2.4.2 Sum of Absolute Difference(SAD) 28 2.4.3 Model Reconstruction 29 2.5 相關研究 29 第三章新的排程演算法DSRF Algorithm 33 3.1 DSRF(Dynamic Switch of Reduce Function) Algorithm 33 3.1.1 MapReduce System with 2D to 3D Application 33 3.1.2 DSRF(Dynamic Switch of Reduce Function) Algorithm Overview 36 3.2 DSRF Algorithm with MapReduce 39 3.3 Algorithm Working principle 41 第四章模擬結果與討論 44 4.1 Implementation 44 4.1.1 Implementation Environment 44 4.1.2 Performance Analysis and Evaluation 46 第五章結論與未來展望 60 參考文獻 62 圖目錄圖2.1 MapReduce架構圖 14 圖2.2 Hadoop MapReduce 架構 18 圖2.3 JobNode & DataNode 關係圖 20 圖2.4 NameNode & DataNode 關係圖 23 圖2.5 影像骨架重建流程圖 26 圖2.6 Moravec Algorithm 示意圖 28 圖3.1 MapReduce Runtime System with 2D to 3D Application 34 圖3.2 DSRF Algorithm Overview 38 圖3.3 DSRF Algorithm Working 流程圖 39 圖4.1 DSRF in Hadoop 45 圖4.2 輸入圖片以及模擬圖片 46 圖4.3 User數量對平均運算時間的影響 48 圖4.4 User數量對平均運算時間的影響(放大) 48 圖4.5 User數量增加對DSRF Algorithm切換次數的影響 49 圖4.6 User數量增加對合併任務的延遲以及切換次數的關係 50 圖4.7 User數量增加對系統的輸出量影響 52 圖4.8 User數量增加對使用者平均等待時間的影響 53 圖4.9 User數量對系統最大排隊長度的影響(User Number = 10) 55 圖4.10 User數量對系統最大排隊長度的影響(User Number = 20) 55 圖4.11 User數量對系統最大排隊長度的影響(User Number = 30) 56 圖4.12 User數量對系統最大排隊長度的影響(User Number = 40) 56 圖4.13 User數量對系統最大排隊長度的影響(User Number = 50) 57 圖4.14 User數量對系統最大排隊長度的影響(User Number = 60) 58 圖4.15 User數量對系統最大排隊長度的影響(User Number = 70) 58 圖4.16 User數量對系統最大排隊長度的影響(User Number = 80) 59
參考文獻	郭玲裳,李維聰, “基於雲端系統Map-Reduce 模組加入資料緩衝區及優先權機制”, National Symposium on Telecommunication (NST), 18-19 NovQinlu He,Zhanhuai Li’Xiao Zhang, “Study on Cloud Storage System based on Distributed Storage Systems”, Computational and Information Sciences( ICCIS), 17-19 Dec 2010, pp. 1332 - 1335 Kevin D.Bowers,Ari Juels, and Alina Oprea., “HAIL: A HighAvailability and Integrity Layer for Cloud Storage”, Computer and communications security (CCS), Nov 2009, pp. 187-198 Mingyue Luo,Gang Liu, “Distributed log information processing with Map-Reduce: A case study from raw data to final models”, Information Theory and Information Security(ICITIS), 17-19 Dec. 2010, pp.1143-1146 Afrati, Foto N. Ullman, Jeffrey D., “Optimizing Multiway Joins in a Map-Reduce Environment”, IEEE Transactions on Knowledge and Data Engineering, vol. 23, Issue. 9, Sept. 2011, pp. 1282-1298 Chen Zhang ,De Sterck, H.,”CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase”, Cloud Computing Technology and Science (CloudCom), Nov. 30 2010-Dec. 3 2010, pp. 368-375 Jorda Polo, David Carrera, Yolanda Becerra, Malgorzata Steinder, and Ian Whalley., “Performance-driven task co-scheduling for mapreduce environments”, Network Operations and Management Symposium(NOMS), 19-23 Apr 2010, pp.373 –380 Thomas Sandholm and Kevin Lai., “Dynamic proportional share scheduling in hadoop”, Job Scheduling Stragies For Parallel Processing (JSSPP), vol. 6253, 2010, pp. 110-131 Jia Yu and Rajkumar Buyya. “A budget constrained scheduling of workflow applications on utility grids using genetic algorithms”, High Performance Distributed Computing (HPDC), 19-23 June 2006, pp.1-10 Matei Zaharia, Dhruba Borthakur, Joydeep Sen Sarma, Khaled Elmeleegy,Scott Shenker, and Ion Stoica, “Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling”, In EuroSys ’10:Proceedings of the 5th European conference on Computer systems, 2010, pp. 265–278. J. Ekanayake, S. Pallickara, and G. Fox, "MapReduce for Data Intensive Scientific Analyses." (ESCIENCE) ,2008, pp. 277-284. Qiu X., Ekanayake J., Gunarathne T. et al., "Using MapReduce Technologies in Bioinformatics and Medical Informatics." Lee, Kuan-Rong ; Fu, Meng-Hsuan ; Kuo, Yau-Hwang, “A hierarchical scheduling strategy for the composition services architecture based on cloud computing”, Next Generation Information Technology (ICNIT), 21-23 June 2011, pp. 163-169 Thilina Gunarathne, Tak-Lon Wu, Judy Qiu et al., “Cloud Computing Paradigms for Pleasingly Parallel Biomedical Applications”, in Proceedings of the Emerging Computational Methods for the Life Sciences Workshop of ACM HPDC 2010 conference, vol.23, issue. 17, 27 JUN 2011, pp. 2338-2354 J. Ekanayake, T. Gunarathne, J. Qiu et al., “Cloud Technologies for Bioinformatics Applications”, IEEE Transactions on Parallel and Distributed Systems, vol. 22, issue. 6, June 2011, pp. 998-1011 Chris Harris, Mike Stephens, “A Combined Corner and Edge Detector” Proceedings of 4^thAlvery Vision Conference, pp. 147-151 Moravec, H, “Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover”, Tech Report CMU-RI-TR-3, Sep 1980 Sadasivam, G.S.; Selvaraj, D. “A novel parallel hybrid PSO-GA using MapReduce to schedule jobs in Hadoop data grids”, Nature and Biologically Inspired Computing (NaBIC) , 15-17 Dec. 2010 , pp. 377 - 382 P. R. Jelenkovic, X. Kang, J. Tan, “Adaptive and scalable comparison scheduling”, in: SIGMETRICS ’07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 2007, pp. 215–226. A. Wierman, M. Nuyens, “Scheduling despite inexact job-size information”, in: SIGMETRICS ’08: Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 2008, pp. 25–36. Apache Software Foundation, Hadoop on demand. URL http://hadoop.apache.org/core/docs/r0.20.0/hod_user_guide.html M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, I. Stoica, “Job scheduling for multi-user MapReduce clusters”, Tech. Rep. UCB/EECS-2009-55, Apr 2009
論文全文使用權限	校內：校內紙本論文立即公開同意電子論文全文授權校園內公開校內電子論文立即公開校外：同意授權予資料庫廠商校外電子論文立即公開

返回頁首

如有問題，歡迎洽詢！
圖書館數位資訊組　(02)2621-5656 轉 2487 或來信