Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

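The electronic full text of this thesis has been publicly available off campus since 2019-08-15.
The print copy of this thesis has been publicly available since 2019-08-15.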

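System ID	U0002-3007201914252000
DOI	10.6846/TKU.2019.01018
Thesis title (Chinese)	應用迴歸式卷積神經網路於場景辨識之研究
Thesis title (English)	The Study of Scene Recognition by using Regression Convolutional Neural Networks
Thesis title (third language)
University	Tamkang University (淡江大學)
Department (Chinese)	電機工程學系碩士班 (master's program)
Department (English)	Department of Electrical and Computer Engineering
Foreign degree university
Foreign degree college
Foreign degree graduate institute
Academic year	107 (ROC calendar)
Semester	2
Publication year	108 (ROC calendar; 2019)
Student (Chinese)	林宜弘
Student (English)	Yi-Hung Lin
Student ID	606450145
Degree	Master's
Language	Traditional Chinese
Second language	English
Oral defense date	2019-06-19
Number of pages	69
Committee	Advisor - 施鴻源 (hyshih.tw@gmail.com); Committee member - 夏至賢 (chhsia625@gmail.com); Committee member - 江正雄 (chiang@ee.tku.edu.tw); Committee member - 施鴻源 (136359@mail.tku.edu.tw)
Keywords (Chinese)	場景識別 (scene recognition); 深度學習 (deep learning); 迴歸式卷積神經網路 (regression convolutional neural networks)
Keywords (English)	Scene Recognition; Deep Learning; Regression Convolutional Neural Networks
Keywords (third language)
Subject classification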
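Abstract (Chinese)	Scene recognition systems commonly rely on Google Cloud Vision. Although its detection ability is very powerful, two problems must be considered in practical applications. (1) The service only reports the likelihood that an object appears; it does not provide the position of the desired object in the image. (2) Although Google Cloud Vision can clearly detect the keyword for rain, it does not know whether the characters are indoors or outdoors. This thesis trains an SSD model to analyze “hot” and “windy” scenes in movies, so that the position and strength of the heat and wind can be determined. Based on the detected objects, a decision tree is then used to analyze rainy scenes in more detail, namely whether a character's position would be rained on, and the haptic feedback device gives the corresponding feedback. In this way the haptic feedback device can deliver feedback of different strengths and directions to simulate a more immersive virtual movie environment. Compared with the earlier 4D VR cinema research, in which Google Cloud Vision analyzed the scene in each image before the haptic feedback device responded, the feedback is more complete and closer to actual conditions.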
Abstract (English)	Google Cloud Vision is commonly used as the tool for scene recognition systems. Although its recognition ability is powerful, two practical issues remain. (1) The service only reports the likelihood that an object appears; it does not provide the object's position in the image. (2) Although Google Cloud Vision can clearly recognize the keyword “raining”, it cannot tell whether the characters are indoors or outdoors. In this thesis, we train an SSD model to analyze “hot” and “windy” scenes in movies, which lets us determine the position and strength of the heat and wind. Based on the detected objects, we analyze rainy scenes in more detail with a decision tree to determine whether a character would be rained on, and the haptic feedback device then triggers the corresponding feedback. In this way, feedback of different strengths and directions can be delivered to users to simulate a more immersive virtual movie environment. Compared with the previous 4D VR cinema research, in which images were analyzed by Google Cloud Vision before the haptic device responded, the resulting feedback is more complete and more realistic.
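As an illustration only, the mapping from detections to haptic cues described in the abstract could be sketched as follows. This is not the thesis's implementation. The 20% and 50% area cut-offs are taken from the captions of Figures 4.3 to 4.5 in the table of contents; the `Box` type, the level names, the split of the frame into thirds, and the label strings `rain`, `indoor`, and `car_interior` are all hypothetical assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned detection box in pixel coordinates (hypothetical type)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def area(self) -> float:
        # Clamp to zero so a degenerate box never yields a negative area.
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def fire_strength(box: Box, img_w: int, img_h: int) -> str:
    """Map the box-to-image area ratio to a strength level.

    The 20% / 50% cut-offs follow the figure captions; the level names
    and the handling of the exact boundary values are assumptions.
    """
    ratio = box.area / float(img_w * img_h)
    if ratio > 0.5:
        return "strong"
    if ratio >= 0.2:
        return "medium"
    return "weak"


def horizontal_position(box: Box, img_w: int) -> str:
    """Assumed rule: classify the box centre into thirds of the frame."""
    cx = (box.x_min + box.x_max) / 2.0
    if cx < img_w / 3.0:
        return "left"
    if cx < 2.0 * img_w / 3.0:
        return "center"
    return "right"


def rain_feedback(labels: set) -> bool:
    """Assumed decision rule: trigger rain feedback only when rain is
    detected and no indoor or in-car cue is present in the frame."""
    if "rain" not in labels:
        return False
    return not ({"indoor", "car_interior"} & labels)
```

For a 1000 x 1000 frame, a fire box spanning (0, 0) to (800, 600) covers 48% of the image, so this sketch would report a medium-strength cue at the center; a frame labeled with both `rain` and `indoor` would produce no rain feedback.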
Abstract (third language)
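Table of contents
Acknowledgements
Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Preface
  1.2 Research Background
  1.3 Research Objectives
Chapter 2 Fundamental Theory and Background Knowledge
  2.1 Image Recognition
  2.2 Scene Recognition
  2.3 Google Cloud Vision
  2.4 Deep Learning
  2.5 Image Recognition Models
Chapter 3 Applying Deep Learning to Movie Scene Recognition
  3.1 Neural Network Model
  3.2 Dataset
    3.2.1 Annotation Tool
    3.2.2 Data Augmentation in Deep Learning
  3.3 Dataset Construction Method
  3.4 Object Detection Evaluation Metrics
  3.5 Test Results
Chapter 4 Scene Analysis System
  4.1 Fire System
  4.2 Water System
  4.4 Wind System
Chapter 5 Real-Time Object Detection Algorithm
Chapter 6 Combining the Scene Analysis System with Haptic Feedback
Chapter 7 Conclusion
References

List of figures
Figure 2.1 Outdoor and indoor rainy scenes
Figure 2.2 Illustration of flames
Figure 2.3 Convolutional neural network (CNN) architecture [13]
Figure 2.4 RCNN architecture [15]
Figure 2.5 Fast RCNN architecture [17]
Figure 2.6 Faster RCNN architecture [19]
Figure 2.7 SSD architecture [20]
Figure 3.1 Scene analysis architecture
Figure 3.2 Model training diagram
Figure 3.3 Model training diagram
Figure 3.6 Part of the dataset
Figure 3.7 Part of the dataset
Figure 3.8 LabelImg user interface
Figure 3.10 Expanding the dataset with translation (shift) transforms
Figure 3.11 Confusion Matrix [26]
Figure 3.12 IoU (Intersection over Union) [29]
Figure 3.13 Pixabay image search
Figure 3.14 Google image search
Figure 3.17 TotalLoss on the training data
Figure 3.18 Loss on the validation data
Figure 3.19 mAP on the validation data
Figure 3.20 mAP on the test data
Figure 4.1 Fire system flowchart
Figure 4.2 Model IoU test flowchart
Figure 4.3 Flame area greater than 50% of the image size
Figure 4.4 20% < flame area < 50%
Figure 4.5 Flame area < 20%
Figure 4.6 Flame position and strength output
Figure 4.7 Flame position and strength output
Figure 4.8 Water system flowchart
Figure 4.9 Google Cloud Vision [14]
Figure 4.10 Result of analyzing a scene frame with Google Cloud Vision
Figure 4.11 Comparison of rainy indoor and rainy outdoor scenes
Figure 4.12 Labels for a rainy in-car scene
Figure 4.13 Labels for a rainy in-car scene
Figure 4.14 Labels for a rainy in-car scene
Figure 4.15 Labels for a rainy in-car scene
Figure 4.16 Indoor rain scene
Figure 4.17 Indoor rain scene
Figure 4.18 Indoor rain scene
Figure 4.19 Indoor rain scene
Figure 4.20 Indoor rain scene
Figure 4.21 Indoor rain scene
Figure 4.22 Indoor rain scene
Figure 4.23 Indoor rain scene
Figure 4.24 Outdoor scene
Figure 4.25 Rain output
Figure 4.26 Rain output
Figure 4.27 Wind system flowchart
Figure 4.28 Wind training diagram
Figure 4.29 Wind strength output
Figure 4.30 Fallen-tree test image
Figure 4.31 Fallen-tree test image
Figure 4.32 Fallen-tree test image
Figure 4.33 Fallen-tree test image
Figure 5.1 Frame detection flowchart
Figure 5.2 Video as seen by the user
Figure 5.3 Output when one frame per second is captured and sent to the model for detection
Figure 5.4 Video as seen by the user
Figure 5.5 Output when one frame per second is captured and sent to the model for detection
Figure 6.7 Fire system user view
Figure 6.8 Fire system user view
Figure 6.9 Fire system user view
Figure 6.10 Fire system user view
Figure 6.11 Wind system user view
Figure 6.12 Wind system user view
Figure 6.13 Rain system user view
Figure 6.14 Rain system user view

List of tables
Table 3.4 Dataset contents
Table 3.5 Class ID chart
Table 3.15 Computing environment specifications
Table 3.16 Computing environment specifications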
References
[1] D. Ciresan, U. Meier, J. Schmidhuber, "Multi-column Deep Neural Networks for Image Classification", CVPR 2012, pp. 3642-3649, 2012.
[2] Z.-J. Zha, T. Mei, J. Wang, Z. Wang, X.-S. Hua, "Graph-based semi-supervised learning with multiple labels", Journal of Visual Communication and Image Representation, Volume 20, Issue 2, Pages 97-103, February 2009.
[3] B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, A. Oliva, "Learning Deep Features for Scene Recognition using Places Database", Advances in Neural Information Processing Systems 27 (NIPS 2014).
[4] L. Giglio, J. Descloitres, C. O. Justice, Y. J. Kaufman, "An Enhanced Contextual Fire Detection Algorithm for MODIS", Remote Sensing of Environment, Volume 87, Issues 2–3, Pages 273-282, 15 October 2003.
[5] F. Sebastiani, "Machine learning in automated text categorization", ACM Computing Surveys (CSUR), Volume 34, Issue 1, Pages 1-47, March 2002.
[6] Google Cloud Vision. URL: https://cloud.google.com/vision/docs/, 2018.
[7] Y. LeCun, Y. Bengio, G. Hinton, "Deep learning", Nature, Volume 521, Pages 436–444, 28 May 2015.
[8] J. McCarthy, P. J. Hayes, "Some Philosophical Problems from the Standpoint of Artificial Intelligence", Readings in Artificial Intelligence, Pages 431-450, 1981.
[9] Pixabay. URL: https://pixabay.com/zh/
[10] Outdoor and indoor rainy scene image [Online]. Available: https://www.pinterest.com/pin/501025527280930227/
[11] G. E. Hinton, R. R. Salakhutdinov, "Reducing the Dimensionality of Data with Neural Networks", Science, Volume 313, Issue 5786, Pages 504-507, 28 July 2006.
[12] A. Krizhevsky, I. Sutskever, G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", Advances in Neural Information Processing Systems 25 (NIPS 2012).
[13] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, L. Fei-Fei, "Large-scale Video Classification with Convolutional Neural Networks", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1725-1732, 2014.
[14] S. Gidaris, N. Komodakis, "Object Detection via a Multi-Region and Semantic Segmentation-Aware CNN Model", The IEEE International Conference on Computer Vision (ICCV), pp. 1134-1142, 2015.
[15] R. Rouhi, M. Jafari, S. Kasaei, P. Keshavarzian, "Benign and malignant breast tumors classification based on region growing and CNN segmentation", Expert Systems with Applications, Volume 42, Issue 3, Pages 990-1002, 15 February 2015.
[16] R. Girshick, "Fast R-CNN", The IEEE International Conference on Computer Vision (ICCV), pp. 1440-1448, 2015.
[17] X. Wang, A. Shrivastava, A. Gupta, "A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2606-2615, 2017.
[18] S. Ren, K. He, R. Girshick, J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", Advances in Neural Information Processing Systems 28 (NIPS 2015).
[19] A. Salvador, X. Giro-i-Nieto, F. Marques, S. Satoh, "Faster R-CNN Features for Instance Search", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pp. 9-16, 2016.
[20] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, A. C. Berg, "SSD: Single Shot MultiBox Detector", Lecture Notes in Computer Science book series (LNCS, volume 9905).
[21] P. Mustamo, "Object detection in sports: TensorFlow Object Detection API case study", Bachelor's Thesis, Degree Programme in Mathematical Sciences, University of Oulu, p. 43, 2018.
[22] Data processing flowchart [Online]. Available: https://chtseng.wordpress.com/2019/02/16/%E5%A6%82%E4%BD%95%E4%BD%BF%E7%94%A8google-object-detection-api%E8%A8%93%E7%B7%B4%E8%87%AA%E5%B7%B1%E7%9A%84%E6%A8%A1%E5%9E%8B/
[23] D. A. van Dyk, X.-L. Meng, "The Art of Data Augmentation", Journal of Computational and Graphical Statistics, Volume 10, Issue 1, 2001.
[24] Pixabay image search [Online]. Available: https://pixabay.com/zh/images/search/fire/
[25] Google image search [Online]. Available: https://www.google.com.tw/search/fire/
[26] J. Davis, M. Goadrich, "The relationship between Precision-Recall and ROC curves", ICML '06: Proceedings of the 23rd International Conference on Machine Learning, Pages 233-240.
[27] P. Henderson, V. Ferrari, "End-to-End Training of Object Class Detectors for Mean Average Precision", Lecture Notes in Computer Science book series (LNCS, volume 10115).
[28] M. A. Rahman, Y. Wang, "Optimizing Intersection-Over-Union in Deep Neural Networks for Image Segmentation", Lecture Notes in Computer Science book series (LNCS, volume 10072).
[29] IoU (Intersection over Union) [Online]. Available: https://blog.csdn.net/Gentleman_Qin/article/details/84519388
[30] G. Barequet, S. Har-Peled, "Efficiently Approximating the Minimum-Volume Bounding Box of a Point Set in Three Dimensions", Journal of Algorithms, Volume 38, Issue 1, Pages 91-109, January 2001.
[31] Y.-S. Chen, P.-H. Han, J.-C. Hsiao, K.-C. Lee, C.-E. Hsieh, K.-Y. Lu, C.-H. Chou, Y.-P. Hung, "SoEs: Attachable Augmented Haptic on Gaming Controller for Immersive Interaction", ACM UIST 2016, pp. 71-72.
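Full-text access rights	On campus: print thesis available immediately; full-text electronic thesis authorized for on-campus access; electronic thesis available on campus immediately. Off campus: electronic thesis authorized for off-campus access, available immediately.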
