§ Thesis Bibliographic Record
  
System ID	U0002-1506202116300300
DOI 10.6846/TKU.2021.00322
Title (Chinese)	卷積神經網絡的架構與資料擴充技術對工地人員姿勢識別之影響
Title (English)	The influence of the architecture of convolutional neural networks and data augmentation on the posture recognition of workers on construction site
Title (third language)
University	淡江大學 (Tamkang University)
Department (Chinese)	土木工程學系碩士班
Department (English)	Department of Civil Engineering
Name of foreign-degree school
Name of foreign-degree college
Name of foreign-degree graduate institute
Academic year	109
Semester	2
Year of publication	110
Author (Chinese)	許宇軒
Author (English)	U-Hin Hoi
Student ID	609386015
Degree	Master's
Language	Traditional Chinese
Second language
Oral defense date	2021-06-03
Number of pages	119
Committee	Advisor - 葉怡成 (140910@mail.tku.edu.tw)
Member - 葉怡成 (140910@mail.tku.edu.tw)
Member - 蔡明修 (mht@mail.tku.edu.tw)
Member - 連立川 (lclien@cycu.edu.tw)
Keywords (Chinese)	深度學習
YOLO
遷移學習
資料擴充技術
工地行人檢測
行人姿勢檢測
Keywords (English)	Deep learning
YOLO
Transfer learning
Data augmentation techniques
Construction workers detection
Pedestrian posture detection
Keywords (third language)
Subject classification
Abstract (Chinese)
Whether for public infrastructure or large building projects, observing the behavior of workers on a construction site through pedestrian detection yields much important information that can effectively improve construction efficiency and safety. However, because the construction site environment is complex, with frequent occlusions, changes in lighting, and changes in worker posture, traditional machine learning cannot detect workers effectively. In addition, little previous literature has examined recognizing categories of pedestrian posture. This study therefore used the YOLOv4 deep learning algorithm to recognize construction site workers in three postures (standing, bending, and crouching), and optimized the parameters and architecture of the convolutional neural network to improve detection accuracy. The optimized parameters and architecture comprise nine factors in total: (1) five data augmentation techniques, (2) the activation function, (3) the transfer learning cut-off point, (4) the learning rate, and (5) the maximum number of weight updates. A two-level fractional factorial experimental design was used to arrange 16 experimental runs efficiently and systematically, and an effect analysis then identified the best combination of factor levels. The effect analysis showed that all factors were insignificant except the following: (1) Mosaic, (2) FMix (Fourier-based mixing), (3) the transfer learning cut-off point, and (4) the number of weight updates. On 80 construction site images (325 workers), the best combination of factor levels achieved a precision of 67.0%, a recall of 85.0%, and an mAP of 83.7%, with an average processing time of 0.038 s per image. The results show that optimizing the parameters and architecture of the convolutional neural network can improve the accuracy of recognizing construction site workers in various postures.
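To make the effect analysis concrete, the sketch below shows how main effects can be computed for a two-level fractional factorial design with nine factors in 16 runs. The factor names mirror the nine factors described above (angle, saturation, exposure, FMix, and Mosaic augmentation, plus the activation function, transfer learning cut-off, learning rate, and maximum number of weight updates), but the design generators and the mAP values are placeholders assumed for illustration; this is not the analysis code used in the thesis.

```python
import itertools
import numpy as np

# Illustrative sketch of a 2^(9-5) two-level fractional factorial design
# (9 factors, 16 runs) and its main-effect analysis.  The generators and
# the mAP responses are placeholders, not the thesis's actual design or data.

# Full 2^4 factorial in four base factors A, B, C, D (16 runs).
base = np.array(list(itertools.product([-1, 1], repeat=4)))
A, B, C, D = base.T

# Five additional factors defined by hypothetical generators.
E, F, G, H, J = A * B, A * C, A * D, B * C, B * D
design = np.column_stack([A, B, C, D, E, F, G, H, J])

factor_names = ["angle", "saturation", "exposure", "fmix", "mosaic",
                "activation", "transfer_cutoff", "learning_rate", "max_updates"]

# Placeholder mAP measured for each of the 16 training runs.
map_scores = np.random.default_rng(0).uniform(0.60, 0.85, size=16)

# Main effect of a factor = mean mAP at its high level minus mean mAP at its low level.
for name, col in zip(factor_names, design.T):
    effect = map_scores[col == 1].mean() - map_scores[col == -1].mean()
    print(f"{name:>15s}: effect on mAP = {effect:+.3f}")
```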
Abstract (English)
Whether for public infrastructure or a large building project, observing construction workers on site through pedestrian detection can provide much important information and thereby effectively improve construction efficiency or safety. However, because the construction site environment is complex, occlusions, changes in lighting, and changes in pedestrian posture occur frequently and cannot be detected effectively by traditional machine learning. Moreover, there is little literature on identifying pedestrian posture types. Therefore, this study used the YOLOv4 deep learning algorithm to identify construction workers in three postures (standing, bending, and crouching) and optimized the parameters and architecture of the convolutional neural network to improve the accuracy of posture detection. The optimized parameters and architecture comprise nine factors in total: (1) five data augmentation techniques, (2) the activation function, (3) the transfer learning cut-off point, (4) the learning rate, and (5) the maximum number of weight updates. A two-level fractional factorial experimental design was used to arrange 16 experimental runs efficiently and systematically, and the optimal combination of factor levels was identified through an effect analysis. The analysis revealed that all factors were insignificant except the following four: two data augmentation techniques (Mosaic and FMix, a Fourier-based mixing method), the learning rate, and the maximum number of weight updates. On 80 construction site images (325 workers), the optimal combination of factor levels achieved a precision of 67.0%, a recall of 85.0%, and an mAP (mean average precision) of 83.7%. The average processing time per image was 0.038 s. The results show that optimizing the parameters and architecture of the convolutional neural network can improve the accuracy of identifying worker postures on construction sites.
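The precision, recall, and mAP reported above follow standard object detection practice: a detection counts as a true positive when its IoU with an unmatched ground truth box of the same class reaches a threshold (0.5 is typical). The minimal sketch below, assuming hypothetical bounding boxes in (x1, y1, x2, y2) form and a single class, shows how IoU-based precision and recall can be computed; it is illustrative only and is not the evaluation code used in the thesis.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def precision_recall(detections, ground_truths, iou_threshold=0.5):
    """Greedily match detections (ideally sorted by confidence) to ground truths."""
    matched = set()
    tp = 0
    for det in detections:
        best, best_iou = None, iou_threshold
        for i, gt in enumerate(ground_truths):
            if i not in matched and iou(det, gt) >= best_iou:
                best, best_iou = i, iou(det, gt)
        if best is not None:
            matched.add(best)
            tp += 1
    fp = len(detections) - tp
    fn = len(ground_truths) - tp
    precision = tp / (tp + fp) if detections else 0.0
    recall = tp / (tp + fn) if ground_truths else 0.0
    return precision, recall

# Hypothetical example with one image's boxes.
dets = [(10, 10, 50, 90), (60, 20, 100, 95)]
gts  = [(12, 12, 48, 88), (200, 30, 240, 100)]
print(precision_recall(dets, gts))   # -> (0.5, 0.5)
```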
Abstract (third language)
Table of contents
Acknowledgements	I
Abstract	IV
List of figures	IX
List of tables	XIII
Chapter 1	Introduction	1
1.1	Research background	1
1.2	Research motivation and objectives	2
1.3	Research scope	4
Chapter 2	Literature review	6
2.1	Traditional machine learning object detection algorithms	6
2.2	Deep learning object detection algorithms	8
2.3	The YOLO algorithm	13
2.3.1	YOLOv1	13
2.3.2	YOLOv2	17
2.3.3	YOLOv3	25
2.3.4	YOLOv4	29
2.3.5	Summary	37
2.4	Pedestrian detection	38
2.5	Pedestrian detection on construction sites	39
2.6	Pedestrian posture detection	41
2.7	Concluding remarks	43
Chapter 3	Research method	45
3.1	Stage 1: Preparing the system and tools	45
3.2	Stage 2: Preparing the dataset (collection and labeling)	48
3.3	Stage 3: Preparing the training files (parameter settings and tuning)	52
3.3.1	Data augmentation	54
3.3.2	Activation function	57
3.3.3	Transfer learning	59
3.3.4	Learning cycle	59
3.3.5	Learning rate	61
3.4	Stage 4: Training the model	61
3.5	Stage 5: Testing the model	62
Chapter 4	Results	63
4.1	Performance metrics	63
4.2	Design of parameter combinations	65
4.3	Validation data: quantitative evaluation	67
4.4	Test data: quantitative evaluation	69
4.5	Validation data: qualitative evaluation	71
4.6	Test data: qualitative evaluation	75
4.7	Design of experiments	81
Chapter 5	Conclusions and recommendations	102
5.1	Conclusions	102
5.2	Recommendations	105
References	109


List of figures
Figure 2-1 Traditional machine learning approach [28]	6
Figure 2-2 Deep learning approach [28]	6
Figure 2-3 Convolutional neural network (CNN)	9
Figure 2-4 Convolution operation	9
Figure 2-5 Various filters	10
Figure 2-6 ReLU	10
Figure 2-7 ReLU example	11
Figure 2-8 Max pooling operation	11
Figure 2-9 Convolutional neural network (CNN)	12
Figure 2-10 YOLO detection system [14,41,42]	14
Figure 2-11 Non-maximum suppression	15
Figure 2-12 YOLOv1 architecture [14]	16
Figure 2-13 Clustering results on COCO and VOC2007 [15]	20
Figure 2-14 Dimension clusters and direct location prediction	22
Figure 2-15 Multi-scale training at various input sizes	24
Figure 2-16 YOLOv4 architecture	29
Figure 2-17 Simplified structure of CSPNet (source: Academia Sinica 研之有物, 王建堯)	30
Figure 2-18 Spatial pyramid pooling	31
Figure 2-19 FPN and PANet	32
Figure 2-20 PAN and the modified PANet in YOLOv4	32
Figure 2-21 CIoU loss	33
Figure 2-22 CIoU loss	34
Figure 2-23 CIoU loss	34
Figure 2-24 DIoU loss	35
Figure 2-25 DIoU loss	35
Figure 2-26 Detection of construction workers in various postures [7]	43

Figure 3-1 Manual labeling with labelImg	47
Figure 3-2 Comparison of training, validation, and test data	49
Figure 3-3 Photographs of people in natural settings	49
Figure 3-4 Photographs of staged postures	50
Figure 3-5 Photographs of various construction sites (downloaded from the web)	51
Figure 3-6 generate_train.py, generate_test.py	53
Figure 3-7 Angle data augmentation	54
Figure 3-8 Saturation data augmentation	55
Figure 3-9 Exposure data augmentation	55
Figure 3-10 FMix data augmentation	56
Figure 3-11 Mosaic data augmentation	57
Figure 3-12 Swish activation function	58
Figure 3-13 Mish activation function	58
Figure 3-14 Overfitting	60
Figure 3-15 Average loss curves for the training and validation sets	60
Figure 3-16 Gradient descent with different learning rates	61

Figure 4-1 Intersection over union (IoU)	63
Figure 4-2 Precision-recall curve	64
Figure 4-3 Validation data, benchmark parameters (highest mAP)	72
Figure 4-4 Validation data, parameter set 9 (lowest mAP)	73
Figure 4-5 Validation data, benchmark parameters (highest mAP)	74
Figure 4-6 Validation data, parameter set 9 (lowest mAP)	74
Figure 4-7 Test data, benchmark parameters	76
Figure 4-8 Test data, parameter set 9 (highest mAP)	76
Figure 4-9 Test data, parameter set 7 (lowest mAP)	77
Figure 4-10 Test data, benchmark parameters	78
Figure 4-11 Test data, parameter set 9 (highest mAP)	78
Figure 4-12 Test data, parameter set 7 (lowest mAP)	79
Figure 4-13 Test data, benchmark parameters	80
Figure 4-14 Test data, parameter set 9 (highest mAP)	80
Figure 4-15 Test data, parameter set 7 (lowest mAP)	81
Figure 4-16 Factor effects	95
Figure 4-17 Normal probability plot of factor effects	96
Figure 4-18 Scatter plot of regression-predicted versus experimental mAP	96
Figure 4-19 Test data (confirmation experiment)	98
Figure 4-20 Test data (best of the original experiments, parameter set 5)	99
Figure 4-21 Test data (confirmation experiment)	99
Figure 4-22 Test data (best of the original experiments, parameter set 5)	100
Figure 4-23 Test data (confirmation experiment)	100
Figure 4-24 Test data (best of the original experiments, parameter set 5)	101

Figure 5-1 Response surface methodology experimental design	106
Figure 5-2 Detection results and label data	107
Figure 5-3 Detection results converted to label images	108


List of tables
Table 2-1: Darknet-19	18
Table 2-2: Comparison of design choices in YOLO and YOLOv2 [15]	25
Table 2-3: Darknet-53 backbone [16]	28
Table 2-4: Performance metrics of OVGG-16, VGG-16, and HMPD [22]	41

Table 3-1: Statistics of the training and validation datasets	52
Table 3-2: Statistics of the test dataset (construction site photos downloaded from the web)	52

Table 4-1: Confusion matrix	64
Table 4-2: Parameters of the YOLOv4 pedestrian posture recognition model	66
Table 4-3: YOLOv4 pedestrian posture recognition results (validation dataset, 173 images)	68
Table 4-4: YOLOv4 pedestrian posture recognition results (construction site test dataset, 80 images)	70
Table 4-5: Two-level fractional factorial design with nine factors	82
Table 4-6: Parameters represented by the two levels	83
Table 4-7: Parameters of the YOLOv4 pedestrian posture recognition model	84
Table 4-8: YOLOv4 pedestrian posture recognition results (validation dataset, 173 images)	86
Table 4-9: YOLOv4 pedestrian posture recognition results (construction site test dataset, 80 images)	87
Table 4-10: Effect calculation table	89
Table 4-11: Test data (confirmation experiment)	98

Table 5-1: Response surface methodology experimental design	106
References
[1]	Son, H., Sung, H., Choi, H., Lee, S., & Kim, C. (2017). Detection of nearby obstacles with monocular vision for earthmoving operations. In ISARC. Proceedings of the International Symposium on Automation and Robotics in Construction (Vol. 34). IAARC Publications.
[2]	Yang, J., Park, M. W., Vela, P. A., & Golparvar-Fard, M. (2015). Construction performance monitoring via still images, time-lapse photos, and video streams: Now, tomorrow, and the future. Advanced Engineering Informatics, 29(2), 211-224.
[3]	Fang, Y., Chen, J., Cho, Y. K., & Zhang, P. (2016, January). A point cloud-vision hybrid approach for 3D location tracking of mobile construction assets. In 33rd International Symposium on Automation and Robotics in Construction (ISARC 2016). Proceedings of the International Symposium on Automation and Robotics in Construction (Vol. 33, pp. 1-7).
[4]	Alwasel, A., Sabet, A., Nahangi, M., Haas, C. T., & Abdel-Rahman, E. (2017). Identifying poses of safe and productive masons using machine learning. Automation in Construction, 84, 345-355.
[5]	Chen, J., Fang, Y., & Cho, Y. K. (2018). Performance evaluation of 3D descriptors for object recognition in construction applications. Automation in Construction, 86, 44-52. 
[6]	Fang, W., Ding, L., Zhong, B., Love, P. E., & Luo, H. (2018). Automated detection of workers and heavy equipment on construction sites: A convolutional neural network approach. Advanced Engineering Informatics, 37, 139-149.
[7]	Son, H., Choi, H., Seong, H., & Kim, C. (2019). Detection of construction workers under varying poses and changing background in image sequences via very deep residual networks. Automation in Construction, 99, 27-38. DOI: 10.1016/j.autcon.2018.11.033.
[8]	Szpak, Z. L., & Tapamo, J. R. (2011). Maritime surveillance: Tracking ships inside a dynamic background using a fast level-set. Expert systems with applications, 38(6), 6669-6680.
[9]	Milanés, V., Llorca, D. F., Villagrá, J., Pérez, J., Parra, I., González, C., & Sotelo, M. A. (2012). Vision-based active safety system for automatic stopping. Expert Systems with Applications, 39(12), 11234-11242.
[10]	Khare, V., Shivakumara, P., & Raveendran, P. (2015). A new histogram oriented moments descriptor for multi-oriented moving text detection in video. Expert Systems with Applications, 42(21), 7627-7640.
[11]	Serratosa, F., Alquézar, R., & Amézquita, N. (2012). A probabilistic integrated object recognition and tracking framework. Expert Systems With Applications, 39(8), 7302-7318. 
[12]	Guo, L., Ge, P. S., Zhang, M. H., Li, L. H., & Zhao, Y. B. (2012). Pedestrian detection for intelligent transportation systems combining AdaBoost algorithm and support vector machine. Expert Systems with Applications, 39(4), 4274-4286.
[13]	Seo, J., Han, S., Lee, S., & Kim, H. (2015). Computer vision techniques for construction safety and health monitoring. Advanced Engineering Informatics, 29(2), 239-251.
[14]	Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779-788).
[15]	Redmon, J., & Farhadi, A. (2017). YOLO9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7263-7271). 
[16]	Redmon, J., & Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767.
[17]	Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016, October). SSD: Single shot multibox detector. In European conference on computer vision (pp. 21-37). Springer, Cham.
[18]	Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 580-587).
[19]	Girshick, R. (2015). Fast r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 1440-1448).
[20]	Ren, S., He, K., Girshick, R., & Sun, J. (2016). Faster R-CNN: towards real-time object detection with region proposal networks. IEEE transactions on pattern analysis and machine intelligence, 39(6), 1137-1149.
[21]	Nath, N. D., & Behzadan, A. H. (2020). Deep Convolutional Networks for Construction Object Detection Under Different Visual Conditions. Frontiers in Built Environment, 6, 97.
[22]	Kim, B., Yuvaraj, N., Sri Preethaa, K. R., Santhosh, R., & Sabari, A. (2020). Enhanced pedestrian detection using optimized deep convolution neural network for smart building surveillance. Soft Computing, 24, 17081-17092.
[23]	Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016, October). SSD: Single shot multibox detector. In European conference on computer vision (pp. 21-37). Springer, Cham.
[24]	Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017). Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (pp. 2980-2988).
[25]	Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818-833). Springer, Cham.
[26]	Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks. arXiv preprint arXiv:1506.01497.
[27]	Zhang, H., Kyaw, Z., Yu, J., & Chang, S. F. (2017). Ppr-fcn: Weakly supervised visual relation detection via parallel pairwise r-fcn. In Proceedings of the IEEE International Conference on Computer Vision (pp. 4233-4241).
[28]	Information on: https://www.xenonstack.com/blog/log-analytics-deep-machine-learning/
[29]	Dalal, N., & Triggs, B. (2005, June). Histograms of oriented gradients for human detection. In 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05) (Vol. 1, pp. 886-893). IEEE.
[30]	Ahonen, T., Hadid, A., & Pietikainen, M. (2006). Face description with local binary patterns: Application to face recognition. IEEE transactions on pattern analysis and machine intelligence, 28(12), 2037-2041.
[31]	Lienhart, R., & Maydt, J. (2002, September). An extended set of haar-like features for rapid object detection. In Proceedings. international conference on image processing (Vol. 1, pp. I-I). IEEE.
[32]	Noble, W. S. (2006). What is a support vector machine?. Nature biotechnology, 24(12), 1565-1567.
[33]	Safavian, S. R., & Landgrebe, D. (1991). A survey of decision tree classifier methodology. IEEE transactions on systems, man, and cybernetics, 21(3), 660-674.
[34]	Rish, I. (2001, August). An empirical study of the naive Bayes classifier. In IJCAI 2001 workshop on empirical methods in artificial intelligence (Vol. 3, No. 22, pp. 41-46).
[35]	Goodfellow, I., Bengio, Y., Courville, A., & Bengio, Y. (2016). Deep learning (Vol. 1, No. 2). Cambridge: MIT press.
[36]	Wang, Y., Wang, L., Jiang, Y., & Li, T. (2020, September). Detection of Self-Build Data Set Based on YOLOv4 Network. In 2020 IEEE 3rd International Conference on Information Systems and Computer Aided Education (ICISCAE) (pp. 640-642). IEEE.
[37]	Hongpeng, Y., Bo, C., & Yi, C. (2016). Overview of target detection and tracking based on vision. Journal of Automation, 42(10), 1466-1489.
[38]	Zhou, X., Wang, K., & Li, L. (2017). Survey of target detection algorithms based on deep learning. Electronic Measurement Technology, (11), 89-93.
[39]	Kim, J. A., Sung, J. Y., & Park, S. H. (2020, November). Comparison of Faster-RCNN, YOLO, and SSD for Real-Time Vehicle Type Recognition. In 2020 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia) (pp. 1-4). IEEE.
[40]	Alexey(2019), AlexeyAB / darknet, github, from https://github.com/AlexeyAB/darknet.
[41]	Chenchou LO(2017), YOLO — You Only Look Once 介紹, medium, from  https://medium.com/@chenchoulo/yolo-%E4%BB%8B%E7%B4%B9-4307e79524fe
[42]	李謦伊(2020), YOLO演進-1, medium, from https://medium.com/ching-i/yolo%E6%BC%94%E9%80%B2-1-33220ebc1d09
[43]	Tyan(2017), 非极大值抑制(Non-Maximum Suppression), SnaiTyan, from http://noahsnail.com/2017/12/13/2017-12-13-%E9%9D%9E%E6%9E%81%E5%A4%A7%E5%80%BC%E6%8A%91%E5%88%B6(Non-Maximum%20Suppression)/
[44]	Leyan Bin Veon(2019), YOLO v1 物件偵測~論文整理, medium, from https://medium.com/%E7%A8%8B%E5%BC%8F%E5%B7%A5%E4%BD%9C%E7%B4%A1/yolo-v1-%E7%89%A9%E4%BB%B6%E5%81%B5%E6%B8%AC-%E8%AB%96%E6%96%87%E6%95%B4%E7%90%86-935bfd51d5e0
[45]	BigCowPeking(2018), Face Paper: YOLOv2論文詳解, CSDN, from https://blog.csdn.net/wfei101/article/details/78944891
[46]	Leyan Bin Veon(2019), YOLO v2 物件偵測~論文整理, medium, from https://medium.com/%E7%A8%8B%E5%BC%8F%E5%B7%A5%E4%BD%9C%E7%B4%A1/yolo-v2-%E7%89%A9%E4%BB%B6%E5%81%B5%E6%B8%AC-%E8%AB%96%E6%96%87%E6%95%B4%E7%90%86-a8e11d8b4409
[47]	Algernon(2020), 【論文解讀】Yolo三部曲解讀——Yolov2, 知乎, from https://zhuanlan.zhihu.com/p/74540100
[48]	冷鋒(2018), 史上最通俗易懂的YOLOv2講解, itread01, from https://www.itread01.com/content/1544838379.html
[49]	李謦伊(2020), YOLO演進-2, medium, from https://medium.com/ching-i/yolo%E6%BC%94%E9%80%B2-2-85ee99d114a1
[50]	Allen Tzeng(2020), [論文] YOLOv3 : An Incremental Improvement, Math.py, from https://allen108108.github.io/blog/2020/02/15/[%E8%AB%96%E6%96%87]%20YOLOv3%20_%20An%20Incremental%20Improvement/
[51]	YOLOv3 目標檢測演算法詳細總結分析(one-stage)(深度學習)(CVPR 2018) (2018), itread01, from https://www.itread01.com/content/1543030155.html
[52]	Gueiyajhang(2020), Day03 YOLOv3 (即時物件偵測), coderbridge, from https://zh-tw.coderbridge.com/series/d4b5a1a1565e4e7a9cd14618ffe6146f/posts/7ac8de3dbb1b441ab1b2788386a3c349
[53]	Ivan(2019), [物件辨識] S10: YOLOv3 簡介, medium, from https://ivan-eng-murmur.medium.com/%E7%89%A9%E4%BB%B6%E8%BE%A8%E8%AD%98-s10-yolov3-%E7%B0%A1%E4%BB%8B-6d2409d4e5ee
[54]	Kevin陶民澤(2019), 目標檢測 YOLOv3 論文翻譯(高質量版), 每日頭條, from https://kknews.cc/zh-tw/code/2noxm2y.html
[55]	He, K., Zhang, X., Ren, S., & Sun, J. (2015). Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE transactions on pattern analysis and machine intelligence, 37(9), 1904-1916.
[56]	Liu, S., Qi, L., Qin, H., Shi, J., & Jia, J. (2018). Path aggregation network for instance segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 8759-8768).
[57]	Lin, T. Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2017). Feature pyramid networks for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2117-2125).
[58]	Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., ... & Murphy, K. (2017). Speed/accuracy trade-offs for modern convolutional object detectors. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7310-7311).
[59]	Szegedy, C., Ioffe, S., Vanhoucke, V., & Alemi, A. (2017, February). Inception-v4, inception-resnet and the impact of residual connections on learning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 31, No. 1).
[60]	Shrivastava, A., Sukthankar, R., Malik, J., & Gupta, A. (2016). Beyond skip connections: Top-down modulation for object detection. arXiv preprint arXiv:1612.06851.
[61]	Fu, C. Y., Liu, W., Ranga, A., Tyagi, A., & Berg, A. C. (2017). Dssd: Deconvolutional single shot detector. arXiv preprint arXiv:1701.06659.
[62]	Girshick, R. B. (2015). Fast R-CNN. CoRR, abs/1504.08083.
[63]	He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
[64]	Wang, C. Y., Liao, H. Y. M., Wu, Y. H., Chen, P. Y., Hsieh, J. W., & Yeh, I. H. (2020). CSPNet: A new backbone that can enhance learning capability of CNN. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 390-391).
[65]	Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., & Savarese, S. (2019). Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 658-666).
[66]	Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., & Ren, D. (2020, April). Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 07, pp. 12993-13000).
[67]	Molchanov, V. V., Vishnyakov, B. V., Vizilter, Y. V., Vishnyakova, O. V., & Knyaz, V. A. (2017, June). Pedestrian detection in video surveillance using fully convolutional yolo neural network. In Automated visual inspection and machine vision II (Vol. 10334, p. 103340Q). International Society for Optics and Photonics.
[68]	Liu, X., Yan, Y., & Gan, H. (2021). Research on pedestrian detection algorithm in driverless urban traffic environment. In MATEC Web of Conferences (Vol. 336, p. 06002). EDP Sciences.
[69]	Li, J., & Wu, Z. (2021, April). The Application of Yolov4 And A New Pedestrian Clustering Algorithm to Implement Social Distance Monitoring During The COVID-19 Pandemic. In Journal of Physics: Conference Series (Vol. 1865, No. 4, p. 042019). IOP Publishing.
[70]	Wu, J. (2018). Complexity and accuracy analysis of common artificial neural networks on pedestrian detection. In MATEC Web of Conferences (Vol. 232, p. 01003). EDP Sciences.
[71]	Son, H., Sung, H., Choi, H., Lee, S., & Kim, C. (2017). Detection of nearby obstacles with monocular vision for earthmoving operations. In ISARC. Proceedings of the International Symposium on Automation and Robotics in Construction (Vol. 34). IAARC Publications.
[72]	Park, M. W., Elsafty, N., & Zhu, Z. (2015). Hardhat-wearing detection for enhancing on-site safety of construction workers. Journal of Construction Engineering and Management, 141(9), 04015024.
[73]	Fang, Q., Li, H., Luo, X., Ding, L., Luo, H., Rose, T. M., & An, W. (2018). Detecting non-hardhat-use by a deep learning method from far-field surveillance videos. Automation in Construction, 85, 1-9.
[74]	Fang, W., Ding, L., Luo, H., & Love, P. E. (2018). Falls from heights: A computer vision-based approach for safety harness detection. Automation in Construction, 91, 53-61.
[75]	Han, S., Lee, S., & Peña-Mora, F. (2013). Vision-based detection of unsafe actions of a construction worker: Case study of ladder climbing. Journal of Computing in Civil Engineering, 27(6), 635-644.
[76]	Han, S., & Lee, S. (2013). A vision-based motion capture and recognition framework for behavior-based safety management. Automation in Construction, 35, 131-141.
[77]	Han, S., Lee, S., & Peña-Mora, F. (2014). Comparative study of motion features for similarity-based modeling and classification of unsafe actions in construction. Journal of Computing in Civil Engineering, 28(5), A4014005.
[78]	Seo, J., Starbuck, R., Han, S., Lee, S., & Armstrong, T. J. (2015). Motion data-driven biomechanical analysis during construction tasks on sites. Journal of Computing in Civil Engineering, 29(4), B4014005.
[79]	Gong, J., Caldas, C. H., & Gordon, C. (2011). Learning and classifying actions of construction workers and equipment using Bag-of-Video-Feature-Words and Bayesian network models. Advanced Engineering Informatics, 25(4), 771-782.
[80]	Memarzadeh, M., Golparvar-Fard, M., & Niebles, J. C. (2013). Automated 2D detection of construction equipment and workers from site video streams using histograms of oriented gradients and colors. Automation in Construction, 32, 24-37.
[81]	Gong, J., & Caldas, C. H. (2011). An object recognition, tracking, and contextual reasoning-based video interpretation method for rapid productivity analysis of construction operations. Automation in Construction, 20(8), 1211-1226.
[82]	Yang, J., Cheng, T., Teizer, J., Vela, P. A., & Shi, Z. K. (2011). A performance evaluation of vision and radio frequency tracking methods for interacting workforce. Advanced Engineering Informatics, 25(4), 736-747.
[83]	Yang, J., Park, M. W., Vela, P. A., & Golparvar-Fard, M. (2015). Construction performance monitoring via still images, time-lapse photos, and video streams: Now, tomorrow, and the future. Advanced Engineering Informatics, 29(2), 211-224.
[84]	Fang, Y., Chen, J., Cho, Y. K., & Zhang, P. (2016, January). A point cloud-vision hybrid approach for 3D location tracking of mobile construction assets. In 33rd International Symposium on Automation and Robotics in Construction (ISARC 2016). Proceedings of the International Symposium on Automation and Robotics in Construction (Vol. 33, pp. 1-7).
[85]	Alwasel, A., Sabet, A., Nahangi, M., Haas, C. T., & Abdel-Rahman, E. (2017). Identifying poses of safe and productive masons using machine learning. Automation in Construction, 84, 345-355.
[86]	Chen, J., Fang, Y., & Cho, Y. K. (2018). Performance evaluation of 3D descriptors for object recognition in construction applications. Automation in Construction, 86, 44-52.
[87]	Kim, H., Kim, K., & Kim, H. (2016). Vision-based object-centric safety assessment using fuzzy inference: Monitoring struck-by accidents with moving objects. Journal of Computing in Civil Engineering, 30(4), 04015075.
[88]	Chi, S., & Caldas, C. H. (2011). Automated object identification using optical video cameras on construction sites. Computer‐Aided Civil and Infrastructure Engineering, 26(5), 368-380.
[89]	Park, M. W., & Brilakis, I. (2012). Construction worker detection in video frames for initializing vision trackers. Automation in Construction, 28, 15-25.
[90]	Lee, S., & Hong, M. (2014). Implementation of Man-Hours Measurement System for Construction Work Crews by Image Processing Technology. Applied Mathematics & Information Sciences, 8(3), 1287.
[91]	Kim, H., Kim, H., Hong, Y. W., & Byun, H. (2018). Detecting construction equipment using a region-based fully convolutional network and transfer learning. Journal of computing in Civil Engineering, 32(2), 04017082.
[92]	Rubaiyat, A. H., Toma, T. T., Kalantari-Khandani, M., Rahman, S. A., Chen, L., Ye, Y., & Pan, C. S. (2016, October). Automatic detection of helmet uses for construction safety. In 2016 IEEE/WIC/ACM International Conference on Web Intelligence Workshops (WIW) (pp. 135-142). IEEE.
[93]	Yang, J., Arif, O., Vela, P. A., Teizer, J., & Shi, Z. (2010). Tracking multiple workers on construction sites using video cameras. Advanced Engineering Informatics, 24(4), 428-434.
[94]	Kim, K., Kim, H., & Kim, H. (2017). Image-based construction hazard avoidance system using augmented reality in wearable device. Automation in construction, 83, 390-403.
[95]	NVIDIA Developer website: https://developer.nvidia.com/
[96]	OpenCV website: https://opencv.org/
[97]	Shorten, C., & Khoshgoftaar, T. M. (2019). A survey on image data augmentation for deep learning. Journal of Big Data, 6(1), 1-48.
[98]	Harris, E., Marcu, A., Painter, M., Niranjan, M., Prügel-Bennett, A., & Hare, J. (2020). FMix: Enhancing mixed sample data augmentation. arXiv preprint arXiv:2002.12047.
[99]	Bochkovskiy, A., Wang, C. Y., & Liao, H. Y. M. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934.
[100]	Ramachandran, P., Zoph, B., & Le, Q. V. (2017). Searching for activation functions. arXiv preprint arXiv:1710.05941.
[101]	Misra, D. (2019). Mish: A self regularized non-monotonic neural activation function. arXiv preprint arXiv:1908.08681, 4.
[102]	Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2017). Visualizing the loss landscape of neural nets. arXiv preprint arXiv:1712.09913.
[103]	Smith, L. N. (2017, March). Cyclical learning rates for training neural networks. In 2017 IEEE winter conference on applications of computer vision (WACV) (pp. 464-472). IEEE.
[104]	葉怡成 (2001). 實驗計劃法:製程與產品最佳化 [Design of experiments: Process and product optimization]. 五南 (Wu-Nan Book Co.).
Full-text access rights
On campus
Print thesis available on campus immediately
Full-text electronic thesis authorized for release on campus
Electronic thesis available on campus immediately
Off campus
Authorization granted
Electronic thesis available off campus immediately
