Tamkang University Chueh Sheng Memorial Library (TKU Library)
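System ID U0002-2402200909052800
Title (Chinese) 雲端運算於企業應用之研究
Title (English) A study of cloud computing in enterprise application
Institution Tamkang University
Department (Chinese) 資訊管理學系碩士班
Department (English) Department of Information Management
Academic year 97 (ROC calendar)
Semester 1
Year of publication 98 (ROC calendar)
Author (Chinese name) 黃獻輝
Author (English name) Hsien-Hui Huang
Student ID 695630193
Degree Master's
Language Chinese
Oral defense date 2009-01-09
Number of pages 45
Oral defense committee Advisor: 蕭瑞祥
Member: 翁頌舜
Member: 邱光輝
Keywords (Chinese) 雲端運算  資訊基礎架構
Keywords (English) Cloud Computing  IT Infrastructure  Enterprise Application
Subject classification Social Sciences / Management
Social Sciences / Information Science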
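Abstract (translated from Chinese) In a broad sense, cloud computing describes a way in which today's network users make use of the network; it mainly applies a distributed system architecture to meet enterprises' needs for processing massive amounts of data. The core architectural concept of cloud technology is to achieve a highly fault-tolerant, highly scalable computing environment through a distributed file system architecture, the MapReduce programming method, and low-end hardware.

When enterprises build their information infrastructure, they often have to resize it as the size of the enterprise changes. This study proposes a solution in which enterprises use cloud computing to address this kind of problem. The solution is based on the open-source Hadoop. The distributed file system in the Hadoop architecture, the Hadoop Distributed File System, provides data backup, performance and computation monitoring, and recovery mechanisms, which can effectively lower hardware costs while still maintaining good performance. The experiments compare a single high-end server of the kind commonly used in enterprises with a cloud computing architecture built from low-end hardware. The results show that, for processing massive files, the cloud computing architecture outperforms the single high-end server. The experiments also found that enterprises building a cloud computing environment need to pay attention to the ratio of Map to Reduce threads; otherwise performance may still degrade.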
Abstract (English) Broadly speaking, cloud computing describes how current Internet users make use of the Internet. A Gartner study further classifies cloud computing into cloud computing services and cloud computing technologies. Cloud computing technologies mainly apply a distributed system architecture as one solution for enterprises that face massive data-processing demands. Their main architectural concept is to combine a distributed file system architecture, the MapReduce programming method, and low-end hardware to achieve a highly fault-tolerant, highly scalable computing environment.
When enterprises build their information infrastructure, they often need to resize it as the size of the enterprise changes, so this study proposes a cloud technology solution to this problem. We base our solution on the open-source Hadoop and use the distributed file system in the Hadoop architecture, the Hadoop Distributed File System, to provide data backup, performance and computation monitoring, and a recovery mechanism, which can lower hardware costs while still maintaining good performance. We tested a single high-end server of the kind common in enterprises against a cloud technology architecture built from low-end hardware, and found that for processing massive data files the cloud architecture does outperform the single high-end server. In addition, the experiments found that enterprises building a cloud technology environment need to pay attention to the ratio of Map to Reduce threads; otherwise performance may still degrade.
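Illustrative note: the abstract refers to the MapReduce programming method and to the ratio of Map to Reduce work. The following is a minimal single-process sketch in Python of that programming model, using word counting as the example. It is not the thesis's code and does not use Hadoop; the names `map_words`, `reduce_counts`, `run_mapreduce` and the `num_reducers` parameter are hypothetical and chosen only for illustration.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce step: sum all counts that were emitted for one word."""
    return word, sum(counts)


def run_mapreduce(lines: Iterable[str], num_reducers: int = 2) -> Dict[str, int]:
    """Run map, shuffle, and reduce sequentially in one process.

    num_reducers is a hypothetical knob standing in for the number of
    reduce partitions; a real cluster would run partitions in parallel.
    """
    # Shuffle: group intermediate values by key, one dict per reducer partition.
    partitions: List[Dict[str, List[int]]] = [
        defaultdict(list) for _ in range(num_reducers)
    ]
    for line in lines:
        for key, value in map_words(line):
            # A stable hash decides which reducer partition owns this key.
            index = zlib.crc32(key.encode("utf-8")) % num_reducers
            partitions[index][key].append(value)

    # Reduce: each partition processes only the keys routed to it.
    result: Dict[str, int] = {}
    for partition in partitions:
        for key, values in partition.items():
            word, total = reduce_counts(key, values)
            result[word] = total
    return result


if __name__ == "__main__":
    print(run_mapreduce(["the cloud", "the cluster the cloud"], num_reducers=3))
```

In this sketch the final counts do not depend on `num_reducers`; only the way keys are spread across partitions changes. On a real cluster that spread affects how evenly the reduce work is balanced, which is the kind of tuning concern the abstract raises.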
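Table of contents Contents I
List of figures II
List of tables III
Chapter 1 Introduction 1
Section 1 Research background and motivation 1
Section 2 Research objectives 3
Chapter 2 Literature review 4
Section 1 Clustered multiprocessing systems 4
Section 2 Parallel and distributed computing architectures 6
Section 3 Parallel programming 8
Section 4 Google File System (GFS) 11
Section 5 Performance comparison methods 13
Chapter 3 System architecture for enterprise use of cloud computing 17
Section 1 Proposed architecture for enterprise use of cloud computing 17
Section 2 Introduction to Hadoop 18
Section 3 Hadoop Distributed File System 19
Section 4 MapReduce 25
Chapter 4 Implementation and comparison of the experimental system 31
Section 1 Experimental design 31
Section 2 Hardware environment and research limitations 34
Section 3 Experimental results 36
Chapter 5 Conclusions 40
Section 1 Experimental conclusions 40
Section 2 Future research directions 41
References 43
Chinese 43
English 43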

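List of figures
Figure 1 Cloud computing architecture diagram 1
Figure 2 SMP (shared-memory architecture) 7
Figure 3 DMP (distributed-memory architecture) 8
Figure 4 Perfect Parallelism data structure diagram 9
Figure 5 Pipeline Parallelism data structure diagram 10
Figure 6 Fully Synchronous Parallelism data structure diagram 10
Figure 7 Google File System 12
Figure 8 CPU Load History Queue 16
Figure 9 Proposed architecture for enterprise use of cloud computing 18
Figure 10 HDFS architecture 21
Figure 11 MapReduce architecture 26
Figure 12 Overview of the MapReduce architecture 27
Figure 13 Experimental system architecture 31
Figure 14 Sample test data 32
Figure 15 Program execution results 33
Figure 16 Execution time of the low-end cloud environment versus the traditional high-end environment 36
Figure 17 Memory usage 37
Figure 18 Comparison of CPU utilization 38
Figure 19 Results of adjusting the number of Map functions 39
Figure 20 Results of adjusting the number of Reduce functions 39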

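List of tables
Table 1 Linux system load index evaluation criteria 15
Table 2 Test hardware specifications 34
Table 3 Hardware computing performance 35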

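Thesis usage rights
  • The author grants a royalty-free license for in-library readers to reproduce the print copy for academic purposes; publicly available from 2012-02-26.
  • The author authorizes the browsing and printing service for the electronic full text; publicly available from 2012-02-26.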

