淡江大學覺生紀念圖書館 (TKU Library)
進階搜尋


下載電子全文限經由淡江IP使用) 
系統識別號 U0002-1807201201535300
中文論文名稱 基於MapReduce的影像處理系統加入DSRF優先權排程機制
英文論文名稱 MapReduce-based Image Processing System with Priority-based DSRF Algorithm
校院名稱 淡江大學
系所名稱(中) 電機工程學系碩士班
系所名稱(英) Department of Electrical Engineering
學年度 100
學期 2
出版年 101
研究生中文姓名 郭玲裳
研究生英文姓名 Ling Shang Kuo
學號 699470042
學位類別 碩士
語文別 中文
口試日期 2012-06-14
論文頁數 63頁
口試委員 指導教授-李維聰
委員-朱國志
委員-衛信文
委員-劉豐豪
委員-吳庭育
中文關鍵字 3D影像  雲端系統  排程機制 
英文關鍵字 3D image  Cloud System  Schedule Algorithm 
學科別分類 學科別應用科學電機及電子
中文摘要 MapReduce是由Google所提出用於處理大量密集式資料的分散式平行運算架構。使用MapReduce的使用者只要撰寫Map以及Reduce這兩個Function,並輸入待處理的資料,即可以自動的完成需要大量運算資源的工作,如網頁搜尋、數值統計等。但是較少使用者將2D to 3D這種高複雜度且運算時間較長的影像處理應用放置於MapReduce處理,然而這種高複雜度以及高運算量的應用若是透過使用MapReduce來進行運算則可以用來增加其處理效能。
此外,在較早的文獻中,有許多的研究以及應用都被拿來建置在MapReduce上,但是這些研究以及應用所存在的環境需要擁有完整的運算資訊才能被正確地運行。而本論文的運作環境為一即時性的影像處理環境,所以如果沒有一個較佳的排程機制負責處理影像即時地被儲存在系統裡,將無法有效地將即時性的應用程序移植到MapReduce系統上。
本論文實作一個運作在MapReduce上的多使用者(Multi-user)的2D to 3D系統,並讓Map負責處理高複雜度以及高運算量的影像處理程序,而Reduce負責收集Map處理後的Intermediate Data並輸出。當多個使用者同時競爭MapReduce的資料來處理2D to 3D的應用時,就會造成等待Map處理的資料被正在處理的使用者所延誤,而Reduce必需等待所有Map處理完成後才能產生輸出結果的情形,。因此,本論文提出了一個新的排程機制,讓MapReduce能夠自動依照任務的處理進度自動切換下一個任務執行,並減少Reduce的閒置時間,我們稱之為Dynamic Switch of Reduce Function (DSRF) Algorithm。
英文摘要 MapReduce, a programming model proposed by Google, is designed for distributed parallel computing to process vast amounts of data. MapReduce users write the Map and Reduce functions, input the data to be processed and the task will be finished automatically. Hadoop, a distributed file system designed for implementing Google MapReduce, is adopted by many enterprises for daily data-intensive applications. Most users process short tasks using MapReduce; in other words, most tasks handled by the Map and Reduce functions require low response time. Currently, quite few users use MapReduce for 2D to 3D image processing, which is highly complicated and requires long execution time. However, in our opinion, MapReduce is exactly suitable for processing applications of high complexity and high computation.
The other researches use MapReduce to build their applications. In the above researches, they will store the complete data into their file system. In our paper, our system is a real-time image processing system and the file system will get the real-time image continually. By the way, the system doesn’t have a schedule algorithm to solve the real-time application problem.
This paper implements MapReduce on an integrated 2D to 3D multi-user system, in which Map is responsible for image processing procedures of high complexity and high computation, and Reduce is responsible for integrating the intermediate data processed by Map for the final output. Different from short tasks, when several users compete simultaneously to acquire data from MapReduce for 2D to 3D applications, data that waits to be processed by Map will be delayed by the current user and Reduce has to wait until the completion of all Map tasks to generate the final result. Therefore, a novel scheduling scheme, Dynamic Switch of Reduce Function (DSRF) Algorithm.
論文目次 目錄
第一章 緒論 1
1.1 前言 1
1.2 動機與目的 1
1.3 論文章節架構 6
第二章 背景知識與相關研究 8
2.1 Google File System分散式檔案系統 8
2.1.1 BigTable 11
2.2 MapReduce 11
2.3 Hadoop 15
2.3.1 Hadoop MapReduce 17
2.3.1.1 JobNode 18
2.3.1.2 TaskNode 19
2.3.2 HDFS 20
2.3.2.1 NameNode 22
2.3.2.2 DataNode 22
2.3.3 HBase 23
2.3.4 ZooKeeper 25
2.4 影像處理程序 26
2.4.1 Harris Corner Detector 26
2.4.1.1 Moravec Corner Detection Algorithm 27
2.4.1.2 Harris Corner Detection Algorithm 28
2.4.2 Sum of Absolute Difference(SAD) 28
2.4.3 Model Reconstruction 29
2.5 相關研究 29
第三章 新的排程演算法DSRF Algorithm 33
3.1 DSRF(Dynamic Switch of Reduce Function) Algorithm 33
3.1.1 MapReduce System with 2D to 3D Application 33
3.1.2 DSRF(Dynamic Switch of Reduce Function) Algorithm Overview 36
3.2 DSRF Algorithm with MapReduce 39
3.3 Algorithm Working principle 41
第四章 模擬結果與討論 44
4.1 Implementation 44
4.1.1 Implementation Environment 44
4.1.2 Performance Analysis and Evaluation 46
第五章 結論與未來展望 60
參考文獻 62

圖目錄
圖2.1 MapReduce架構圖 14
圖2.2 Hadoop MapReduce 架構 18
圖2.3 JobNode & DataNode 關係圖 20
圖2.4 NameNode & DataNode 關係圖 23
圖2.5 影像骨架重建流程圖 26
圖2.6 Moravec Algorithm 示意圖 28
圖3.1 MapReduce Runtime System with 2D to 3D Application 34
圖3.2 DSRF Algorithm Overview 38
圖3.3 DSRF Algorithm Working 流程圖 39
圖4.1 DSRF in Hadoop 45
圖4.2 輸入圖片以及模擬圖片 46
圖4.3 User數量對平均運算時間的影響 48
圖4.4 User數量對平均運算時間的影響(放大) 48
圖4.5 User數量增加對DSRF Algorithm切換次數的影響 49
圖4.6 User數量增加對合併任務的延遲以及切換次數的關係 50
圖4.7 User數量增加對系統的輸出量影響 52
圖4.8 User數量增加對使用者平均等待時間的影響 53
圖4.9 User數量對系統最大排隊長度的影響(User Number = 10) 55
圖4.10 User數量對系統最大排隊長度的影響(User Number = 20) 55
圖4.11 User數量對系統最大排隊長度的影響(User Number = 30) 56
圖4.12 User數量對系統最大排隊長度的影響(User Number = 40) 56
圖4.13 User數量對系統最大排隊長度的影響(User Number = 50) 57
圖4.14 User數量對系統最大排隊長度的影響(User Number = 60) 58
圖4.15 User數量對系統最大排隊長度的影響(User Number = 70) 58
圖4.16 User數量對系統最大排隊長度的影響(User Number = 80) 59
參考文獻 郭玲裳,李維聰, “基於雲端系統Map-Reduce 模組加入資料緩衝區及優先權機制”, National Symposium on Telecommunication (NST), 18-19
NovQinlu He,Zhanhuai Li’Xiao Zhang, “Study on Cloud Storage System based on Distributed Storage Systems”, Computational and Information Sciences( ICCIS), 17-19 Dec 2010, pp. 1332 - 1335
Kevin D.Bowers,Ari Juels, and Alina Oprea., “HAIL: A HighAvailability and Integrity Layer for Cloud Storage”, Computer and communications security (CCS), Nov 2009, pp. 187-198
Mingyue Luo,Gang Liu, “Distributed log information processing with Map-Reduce: A case study from raw data to final models”, Information Theory and Information Security(ICITIS), 17-19 Dec. 2010, pp.1143-1146
Afrati, Foto N. Ullman, Jeffrey D., “Optimizing Multiway Joins in a Map-Reduce Environment”, IEEE Transactions on Knowledge and Data Engineering, vol. 23, Issue. 9, Sept. 2011, pp. 1282-1298
Chen Zhang ,De Sterck, H.,”CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase”, Cloud Computing Technology and Science (CloudCom), Nov. 30 2010-Dec. 3 2010, pp. 368-375
Jorda Polo, David Carrera, Yolanda Becerra, Malgorzata Steinder, and Ian Whalley., “Performance-driven task co-scheduling for mapreduce environments”, Network Operations and Management Symposium(NOMS), 19-23 Apr 2010, pp.373 –380
Thomas Sandholm and Kevin Lai., “Dynamic proportional share scheduling in hadoop”, Job Scheduling Stragies For Parallel Processing (JSSPP), vol. 6253, 2010, pp. 110-131
Jia Yu and Rajkumar Buyya. “A budget constrained scheduling of workflow applications on utility grids using genetic algorithms”, High Performance Distributed Computing (HPDC), 19-23 June 2006, pp.1-10
Matei Zaharia, Dhruba Borthakur, Joydeep Sen Sarma, Khaled Elmeleegy,Scott Shenker, and Ion Stoica, “Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling”, In EuroSys ’10:Proceedings of the 5th European conference on Computer systems, 2010, pp. 265–278.
J. Ekanayake, S. Pallickara, and G. Fox, "MapReduce for Data Intensive Scientific Analyses." (ESCIENCE) ,2008, pp. 277-284.
Qiu X., Ekanayake J., Gunarathne T. et al., "Using MapReduce Technologies in Bioinformatics and Medical Informatics."
Lee, Kuan-Rong ; Fu, Meng-Hsuan ; Kuo, Yau-Hwang, “A hierarchical scheduling strategy for the composition services architecture based on cloud computing”, Next Generation Information Technology (ICNIT), 21-23 June 2011, pp. 163-169
Thilina Gunarathne, Tak-Lon Wu, Judy Qiu et al., “Cloud Computing Paradigms for Pleasingly Parallel Biomedical Applications”, in Proceedings of the Emerging Computational Methods for the Life Sciences Workshop of ACM HPDC 2010 conference, vol.23, issue. 17, 27 JUN 2011, pp. 2338-2354
J. Ekanayake, T. Gunarathne, J. Qiu et al., “Cloud Technologies for Bioinformatics Applications”, IEEE Transactions on Parallel and Distributed Systems, vol. 22, issue. 6, June 2011, pp. 998-1011
Chris Harris, Mike Stephens, “A Combined Corner and Edge Detector” Proceedings of 4^thAlvery Vision Conference, pp. 147-151
Moravec, H, “Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover”, Tech Report CMU-RI-TR-3, Sep 1980
Sadasivam, G.S.; Selvaraj, D. “A novel parallel hybrid PSO-GA using MapReduce to schedule jobs in Hadoop data grids”, Nature and Biologically Inspired Computing (NaBIC) , 15-17 Dec. 2010 , pp. 377 - 382
P. R. Jelenkovic, X. Kang, J. Tan, “Adaptive and scalable comparison scheduling”, in: SIGMETRICS ’07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 2007, pp. 215–226.
A. Wierman, M. Nuyens, “Scheduling despite inexact job-size information”, in: SIGMETRICS ’08: Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 2008, pp. 25–36.
Apache Software Foundation, Hadoop on demand. URL http://hadoop.apache.org/core/docs/r0.20.0/hod_user_guide.html
M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, I. Stoica, “Job scheduling for multi-user MapReduce clusters”, Tech. Rep. UCB/EECS-2009-55, Apr 2009
論文使用權限
  • 同意紙本無償授權給館內讀者為學術之目的重製使用,於2012-07-23公開。
  • 同意授權瀏覽/列印電子全文服務,於2012-07-23起公開。


  • 若您有任何疑問,請與我們聯絡!
    圖書館: 請來電 (02)2621-5656 轉 2281 或 來信