§ Browse Thesis Bibliographic Record
  
System ID U0002-2909202000342900
DOI 10.6846/TKU.2020.00877
Title (Chinese) 基於深度學習的物件偵測演算法之探討
Title (English) Investigation of Deep Learning Based Object Detection Algorithms
Title (third language)
Institution Tamkang University
Department (Chinese) 資訊工程學系資訊網路與多媒體碩士班
Department (English) Master's Program in Networking and Multimedia, Department of Computer Science and Information Engineering
Foreign degree school name
Foreign degree college name
Foreign degree institute name
Academic year 108 (2019-2020)
Semester 2
Year of publication 109 (2020)
Author (Chinese) 楊鎔謙
Author (English) RONG-QIAN YANG
Student ID 607420170
Degree Master's
Language Traditional Chinese
Second language English
Date of oral defense 2020-07-14
Number of pages 74
Committee Advisor - 洪文斌
Member - 彭建文
Member - 范俊海
Keywords (Chinese) Deep learning
Convolutional neural network
YOLO
mAP
IOU
Keywords (English) Deep Learning
CNN
YOLO
mAP
IOU
Keywords (third language)
Subject classification
Abstract (Chinese)
This thesis addresses the problem of abnormal discharge on transmission-line insulators, focusing on two questions: how to locate insulators accurately, and how to determine whether an insulator is discharging. Video footage was first captured with a discharge detector carried by a drone and with fixed cameras. Two experiments were conducted. Experiment 1 used a Convolutional Neural Network (CNN) to locate the insulators; Experiment 2 used YOLOv3. The CNN classifier reached an accuracy of about 0.95 to 0.98, but localizing insulators with a CNN made it difficult to decide whether a detected discharge region actually lay on an insulator. To obtain more precise results, YOLOv3 was then used to localize the insulators directly, after which the number of red pixels inside each detection was compared against a threshold value. This approach localizes well and could be implemented as a working system. The YOLOv3 localization results were evaluated with the standard detection metric, mean average precision (mAP): the model's confidence scores averaged roughly 0.98 to 1.00, and the mAP was 98.68%.
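The discharge decision described in the abstract (crop the detected bounding box, count red pixels, compare the count against a threshold) can be illustrated with a short sketch. This is a minimal reconstruction, not the thesis's actual code: the function name is_discharging and the constants RED_MARGIN and PIXEL_THRESHOLD are hypothetical, since the record does not publish the threshold values used.

```python
import numpy as np

# Hypothetical constants; the thesis's actual values are not given here.
RED_MARGIN = 60        # how much R must exceed G and B for a "red" pixel (assumed)
PIXEL_THRESHOLD = 200  # minimum red-pixel count to flag a discharge (assumed)

def is_discharging(frame_bgr: np.ndarray, box: tuple) -> bool:
    """Return True if the insulator inside `box` appears to be discharging.

    frame_bgr: full video frame in OpenCV's BGR channel order.
    box: (x, y, w, h) bounding box from the insulator detector.
    """
    x, y, w, h = box
    # Crop the detection and widen the dtype so channel subtraction
    # cannot underflow uint8.
    roi = frame_bgr[y:y + h, x:x + w].astype(np.int16)
    b, g, r = roi[..., 0], roi[..., 1], roi[..., 2]
    # A pixel counts as "red" when the R channel clearly dominates G and B.
    red_mask = (r - g > RED_MARGIN) & (r - b > RED_MARGIN)
    return int(red_mask.sum()) > PIXEL_THRESHOLD
```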
Abstract (English)
This thesis discusses how to solve the problem of abnormal discharge on insulators, focusing on how to accurately locate insulators and determine whether an insulator is discharging. First, an infrared discharge detector, paired with a drone-mounted camera and fixed cameras, was used to capture video. Two experiments were conducted. The first used a Convolutional Neural Network (CNN) to locate the insulators, and the second used YOLOv3. The accuracy of the trained CNN was about 0.95 to 0.98, but because the CNN was used for localization, it was ultimately difficult to determine whether a discharge region actually fell on an insulator. To obtain more accurate results, YOLOv3 was then used to localize the insulators directly, and the number of red pixels inside each detection was checked against a threshold value. This method localizes insulators well and was implemented as a working system. The YOLOv3 results were evaluated with the standard detection metric, mean average precision (mAP): the model's confidence scores averaged about 0.98 to 1.00, and the mAP was 98.68%.
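The mAP figure above rests on intersection over union (IOU, Figure 9 in the thesis): a detection counts as a true positive only when its IOU with a ground-truth box exceeds a chosen threshold. A minimal sketch of the IOU computation, assuming boxes in (x1, y1, x2, y2) corner format:

```python
def iou(box_a: tuple, box_b: tuple) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: a detection overlapping most of the ground truth.
print(iou((0, 0, 100, 100), (10, 10, 110, 110)))  # ~0.68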
Abstract (third language)
Table of contents
Chinese abstract
English abstract
List of figures
List of tables
Chapter 1: Introduction
1.1 Preface
1.2 Research motivation and objectives
1.3 Thesis organization
Chapter 2: Background and Related Work
2.1 Transmission-tower insulators
2.2 Object detection
2.2.1 CNN
2.2.2 Sliding Window Detection
2.2.3 R-CNN
2.2.4 Selective Search
2.2.5 Fast R-CNN
2.2.6 Faster R-CNN
2.3 Related methods and research
2.3.1 Detection metrics
2.3.2 YOLO
2.3.3 YOLOv2
2.3.4 YOLOv3
2.3.5 YOLOv4
2.3.6 OpenCV
2.3.7 Image annotation tools
Chapter 3: Methodology
3.1 Experimental equipment and environment
3.2 Experiment 1
3.2.1 Procedure
3.2.2 CNN training architecture
3.2.3 Analysis of Experiment 1 results
3.3 Experiment 2
3.3.1 System implementation
3.3.2 Red-region detection
3.3.3 Slanted insulators
3.3.4 Small-box experiment results
Chapter 4: Analysis of Experimental Results
4.1 mAP analysis results
Chapter 5: Conclusion
References
Appendix: English version of the thesis
List of figures
Figure 1: Insulators on a transmission tower
Figure 2: CNN architecture
Figure 3: Scanning an image with boxes of different sizes
Figure 4: R-CNN
Figure 5: Graph-based image segmentation
Figure 6: Fast R-CNN architecture
Figure 7: Faster R-CNN architecture
Figure 8: RPN architecture
Figure 9: IOU concept diagram
Figure 10: The YOLO concept
Figure 11: YOLO architecture
Figure 12: YOLOv2's improvement strategies over YOLO
Figure 13: Darknet-19
Figure 14: YOLOv3 detection speed comparison
Figure 15: Darknet-53 architecture
Figure 16: YOLOv3 accuracy comparison
Figure 17: Object detectors
Figure 18: YOLOv4 compared with other state-of-the-art object detectors
Figure 19: (a)-(s) frames captured for Case 1 through Case 19
Figure 20: Flowchart of Experiment 1
Figure 21: (a)-(d) insulators extracted from Case 1 through Case 4
Figure 22: (a)-(d) backgrounds from Case 1 through Case 4
Figure 23: CNN training architecture
Figure 24: CNN training accuracy
Figure 25: Correctly classified images
Figure 26: Misclassified images
Figure 27: Annotating insulators with LabelImg
Figure 28: Flowchart of Experiment 2
Figure 29: Insulator_model localization results
Figure 30: System architecture
Figure 31: Cropping the image inside the bounding box
Figure 32: RGB channels
Figure 33: Red-region decision flow
Figure 34: Slanted insulators
Figure 35: Small-box annotation method
Figure 36: Insulator localization results
Figure 37: Case3_86
Figure 38: IOU values for Case3_86
Figure 39: Case9_255
Figure 40: Case17_264
Figure 41: TP and FP in this experiment
Figure 42: Ground truth in this experiment
Figure 43: Case5_63
Figure 44: Case5_289
Figure 45: Precision-recall curve
Figure 46: mAP and AP
List of tables
Table 1: Number of labeled samples per case
Table 2: Small-box statistics
Table 3: TP, FP, FN, and TN in this experiment
Full-text availability
On campus
Print copy: available on campus immediately
Electronic full text: authorized for on-campus public access
On-campus electronic copy: available immediately
Off campus
Authorization granted
Off-campus electronic copy: available immediately