§ Browse Thesis Bibliographic Record
  
System ID U0002-2909202000342900
DOI 10.6846/TKU.2020.00877
Title (Chinese) 基於深度學習的物件偵測演算法之探討
Title (English) Investigation of Deep Learning Based Object Detection Algorithms
Title (third language)
Institution Tamkang University
Department (Chinese) 資訊工程學系資訊網路與多媒體碩士班
Department (English) Master's Program in Networking and Multimedia, Department of Computer Science and Information Engineering
Foreign degree school name
Foreign degree college name
Foreign degree institute name
Academic year 108 (2019-2020)
Semester 2
Year of publication 109 (2020)
Author (Chinese) 楊鎔謙
Author (English) RONG-QIAN YANG
Student ID 607420170
Degree Master's
Language Traditional Chinese
Second language English
Date of oral defense 2020-07-14
Number of pages 74
Committee Advisor - 洪文斌
Member - 彭建文
Member - 范俊海
Keywords (Chinese) Deep learning
Convolutional neural network
YOLO
mAP
IOU
Keywords (English) Deep Learning
CNN
YOLO
mAP
IOU
Keywords (third language)
Subject classification
Abstract (Chinese)
This thesis addresses the problem of abnormal discharge on transmission-line insulators, focusing on two questions: how to locate insulators accurately, and how to determine whether an insulator is discharging. Video footage was first captured with a discharge detector carried by a drone and with fixed cameras. Two experiments were conducted. Experiment 1 used a Convolutional Neural Network (CNN) to locate the insulators; Experiment 2 used YOLOv3. The CNN classifier reached an accuracy of about 0.95 to 0.98, but localizing insulators with a CNN made it difficult to decide whether a detected discharge region actually lay on an insulator. To obtain more precise results, YOLOv3 was then used to localize the insulators directly, after which the number of red pixels inside each detection was compared against a threshold value. This approach localizes well and could be implemented as a working system. The YOLOv3 localization results were evaluated with the standard detection metric, mean average precision (mAP): the model's confidence scores averaged roughly 0.98 to 1.00, and the mAP was 98.68%.
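The discharge decision described in the abstract (crop the detected bounding box, count red pixels, compare the count against a threshold) can be illustrated with a short sketch. This is a minimal reconstruction, not the thesis's actual code: the function name is_discharging and the constants RED_MARGIN and PIXEL_THRESHOLD are hypothetical, since the record does not publish the threshold values used.

```python
import numpy as np

# Hypothetical constants; the thesis's actual values are not given here.
RED_MARGIN = 60        # how much R must exceed G and B for a "red" pixel (assumed)
PIXEL_THRESHOLD = 200  # minimum red-pixel count to flag a discharge (assumed)

def is_discharging(frame_bgr: np.ndarray, box: tuple) -> bool:
    """Return True if the insulator inside `box` appears to be discharging.

    frame_bgr: full video frame in OpenCV's BGR channel order.
    box: (x, y, w, h) bounding box from the insulator detector.
    """
    x, y, w, h = box
    # Crop the detection and widen the dtype so channel subtraction
    # cannot underflow uint8.
    roi = frame_bgr[y:y + h, x:x + w].astype(np.int16)
    b, g, r = roi[..., 0], roi[..., 1], roi[..., 2]
    # A pixel counts as "red" when the R channel clearly dominates G and B.
    red_mask = (r - g > RED_MARGIN) & (r - b > RED_MARGIN)
    return int(red_mask.sum()) > PIXEL_THRESHOLD
```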
Abstract (English)
This thesis discusses how to solve the problem of abnormal discharge on insulators, focusing on how to accurately locate insulators and determine whether an insulator is discharging. First, an infrared discharge detector, paired with a drone-mounted camera and fixed cameras, was used to capture video. Two experiments were conducted. The first used a Convolutional Neural Network (CNN) to locate the insulators, and the second used YOLOv3. The accuracy of the trained CNN was about 0.95 to 0.98, but because the CNN was used for localization, it was ultimately difficult to determine whether a discharge region actually fell on an insulator. To obtain more accurate results, YOLOv3 was then used to localize the insulators directly, and the number of red pixels inside each detection was checked against a threshold value. This method localizes insulators well and was implemented as a working system. The YOLOv3 results were evaluated with the standard detection metric, mean average precision (mAP): the model's confidence scores averaged about 0.98 to 1.00, and the mAP was 98.68%.
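The mAP figure above rests on intersection over union (IOU, Figure 9 in the thesis): a detection counts as a true positive only when its IOU with a ground-truth box exceeds a chosen threshold. A minimal sketch of the IOU computation, assuming boxes in (x1, y1, x2, y2) corner format:

```python
def iou(box_a: tuple, box_b: tuple) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: a detection overlapping most of the ground truth.
print(iou((0, 0, 100, 100), (10, 10, 110, 110)))  # ~0.68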
Abstract (third language)
Table of contents
Chinese abstract
English abstract
List of figures
List of tables
Chapter 1: Introduction
1.1 Preface
1.2 Research motivation and objectives
1.3 Thesis organization
Chapter 2: Background and Related Work
2.1 Transmission-tower insulators
2.2 Object detection
2.2.1 CNN
2.2.2 Sliding Window Detection
2.2.3 R-CNN
2.2.4 Selective Search
2.2.5 Fast R-CNN
2.2.6 Faster R-CNN
2.3 Related methods and research
2.3.1 Detection metrics
2.3.2 YOLO
2.3.3 YOLOv2
2.3.4 YOLOv3
2.3.5 YOLOv4
2.3.6 OpenCV
2.3.7 Image annotation tools
Chapter 3: Methodology
3.1 Experimental equipment and environment
3.2 Experiment 1
3.2.1 Procedure
3.2.2 CNN training architecture
3.2.3 Analysis of Experiment 1 results
3.3 Experiment 2
3.3.1 System implementation
3.3.2 Red-region detection
3.3.3 Slanted insulators
3.3.4 Small-box experiment results
Chapter 4: Analysis of Experimental Results
4.1 mAP analysis results
Chapter 5: Conclusion
References
Appendix: English version of the thesis
List of figures
Figure 1: Insulators on a transmission tower
Figure 2: CNN architecture
Figure 3: Scanning an image with boxes of different sizes
Figure 4: R-CNN
Figure 5: Graph-based image segmentation
Figure 6: Fast R-CNN architecture
Figure 7: Faster R-CNN architecture
Figure 8: RPN architecture
Figure 9: IOU concept diagram
Figure 10: The YOLO concept
Figure 11: YOLO architecture
Figure 12: YOLOv2's improvement strategies over YOLO
Figure 13: Darknet-19
Figure 14: YOLOv3 detection speed comparison
Figure 15: Darknet-53 architecture
Figure 16: YOLOv3 accuracy comparison
Figure 17: Object detectors
Figure 18: YOLOv4 compared with other state-of-the-art object detectors
Figure 19: (a)-(s) frames captured for Case 1 through Case 19
Figure 20: Flowchart of Experiment 1
Figure 21: (a)-(d) insulators extracted from Case 1 through Case 4
Figure 22: (a)-(d) backgrounds from Case 1 through Case 4
Figure 23: CNN training architecture
Figure 24: CNN training accuracy
Figure 25: Correctly classified images
Figure 26: Misclassified images
Figure 27: Annotating insulators with LabelImg
Figure 28: Flowchart of Experiment 2
Figure 29: Insulator_model localization results
Figure 30: System architecture
Figure 31: Cropping the image inside the bounding box
Figure 32: RGB channels
Figure 33: Red-region decision flow
Figure 34: Slanted insulators
Figure 35: Small-box annotation method
Figure 36: Insulator localization results
Figure 37: Case3_86
Figure 38: IOU values for Case3_86
Figure 39: Case9_255
Figure 40: Case17_264
Figure 41: TP and FP in this experiment
Figure 42: Ground truth in this experiment
Figure 43: Case5_63
Figure 44: Case5_289
Figure 45: Precision-recall curve
Figure 46: mAP and AP
List of tables
Table 1: Number of labeled samples per case
Table 2: Small-box statistics
Table 3: TP, FP, FN, and TN in this experiment
Full-text availability
On campus
Print copy: available on campus immediately
Electronic full text: authorized for on-campus public access
On-campus electronic copy: available immediately
Off campus
Authorization granted
Off-campus electronic copy: available immediately