§ Thesis Bibliographic Record
  
System ID U0002-1609201913491500
DOI 10.6846/TKU.2019.00463
Title (Chinese) 孿生網路應用梯度傳導模板於物件追蹤
Title (English) Siamese Network with Gradient-Based Template for Object Tracking
Title (Third Language)
Institution Tamkang University
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign Degree School
Foreign Degree College
Foreign Degree Institute
Academic Year 107
Semester 2
Year of Publication 108 (2019)
Author (Chinese) 李中睿
Author (English) Chung-Jui Lee
Student ID 606410354
Degree Master's
Language Traditional Chinese
Second Language English
Oral Defense Date 2019-07-16
Number of Pages 49
Committee Advisor - 顏淑惠 (105390@mail.tku.edu.tw)
Member - 廖弘源 (liao@iis.sinica.edu.tw)
Member - 凃瀞珽 (cttu@nchu.edu.tw)
Keywords (Chinese) 視覺物件追蹤
孿生類神經網路
Keywords (English) Object tracking,
Siamese network
Keywords (Third Language)
Subject Classification
Abstract (Chinese)
This thesis uses a Siamese neural network as the basic tracking model. After extensive training, the network can generate template features for any selected target; matching these template features against the search region for the most similar features accomplishes tracking. The target's template features are generated only once and never updated. Although this greatly reduces computation, the template cannot reflect occlusion, deformation, or other appearance changes of the target, which can lead to tracking failures. Inspired by neural-network visualization methods, we backpropagate from the maximum of the response map to obtain the features the network model has mainly learned, and add this information to the original template features to reinforce the important ones, so that the Siamese network's template features focus more closely on the target. Because the backpropagation required by this enhancement is performed only on part of the tracking results, rather than training the whole network's parameters, the added computational burden is small; on the benchmark, the overall average speed is 82.2 FPS. Experiments show that this method extracts features that are effective for the target, and in some sequences it recovers the original target after tracking had been lost.
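As a concrete illustration of the matching step described above, the following is a minimal PyTorch sketch of SiamFC-style cross-correlation tracking [9]; the backbone layers, input sizes, and the class name SiameseTracker are illustrative assumptions, not the exact network of this thesis.

```python
# Minimal sketch of Siamese cross-correlation tracking (SiamFC-style [9]).
# The backbone below is an illustrative placeholder, not the thesis's network.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseTracker(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared embedding applied to both the exemplar and the search region.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=2), nn.ReLU(),
            nn.Conv2d(96, 256, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3),
        )

    def forward(self, exemplar, search):
        z = self.backbone(exemplar)   # exemplar features, computed once
        x = self.backbone(search)     # search-region features, every frame
        # Cross-correlate the exemplar features over the search features;
        # the peak of the response map is the predicted target location.
        return F.conv2d(x, z)

tracker = SiameseTracker().eval()
with torch.no_grad():
    response = tracker(torch.randn(1, 3, 127, 127), torch.randn(1, 3, 255, 255))
print(response.shape)  # torch.Size([1, 1, 33, 33])
```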
Abstract (English)
In this paper, we use a Siamese network as the basic model for the tracking task. After training, exemplar features can be generated for any selected target; matching these exemplar features against the features of the search region accomplishes tracking. Once generated, the exemplar features are never changed. Although this greatly reduces the computational cost, the features cannot adapt to appearance changes of the target, such as occlusion and deformation, and this may degrade performance. Inspired by neural-network visualization methods, we use the gradient magnitudes at the position of the predicted target to obtain the features learned by the model, then add this information to the exemplar features to enhance them and make them more focused on the target. Since backpropagation is performed on only part of the model and the overall network parameters are not trained, the computational burden is small; on the benchmark, the overall average speed is 82.2 FPS. Experiments show that this method extracts features that are effective for the target; in some sequences, frames where tracking had failed are traced back to the original target.
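The enhancement step could look like the following hedged sketch: backpropagate from the peak of the response map to the exemplar features and fold the gradient back in. The additive fusion weight alpha and the positive-gradient clamp are assumptions for illustration; the thesis's exact combination rule may differ.

```python
import torch
import torch.nn.functional as F

def enhance_exemplar(z, x_feat, alpha=0.1):
    """Strengthen exemplar features z (1, C, h, w) using gradients of the
    response-map peak w.r.t. z; x_feat is the search features (1, C, H, W)."""
    z = z.detach().requires_grad_(True)
    response = F.conv2d(x_feat, z)
    # Backpropagate from the maximum response only; no network parameters
    # are updated, so the extra cost per frame stays small.
    response.max().backward()
    with torch.no_grad():
        # alpha and the positive-gradient clamp are illustrative assumptions.
        z_new = z + alpha * z.grad.clamp(min=0)
    return z_new.detach()
```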
Abstract (Third Language)
Table of Contents
Chapter 1  Introduction  1
Chapter 2  Related Work  4
Chapter 3  Methodology  6
3.1 Siamese Neural Network  6
3.2 Proposed Architecture  8
3.3 Label Weighting  10
3.4 Template Enhancement  11
3.5 Training Procedure  13
3.6 Tracking Procedure  14
Chapter 4  Experiments  17
4.1 Enhanced Template and Update Mechanism  18
4.2 Performance of Different System Components  24
4.3 Comparison with Other Methods  26
Chapter 5  Conclusion  35
References  36
Appendix: English Thesis  38

List of Figures
Figure 1. Siamese neural network  7
Figure 2. Proposed architecture  8
Figure 3. Illustration of template enhancement  12
Figure 4. Illustration of testing  14
Figure 5. (b-e) show target regions generated from the first frame of four videos in the OTB-100 [21] dataset. Taking (b) as an example, the upper row shows the selected channels and the lower row uses all channels to compute Grad-CAM [15]; the result is the left image, and the right image overlays it on the original frame. Redder regions indicate higher values.  19
Figure 6. Effect of the enhanced template. The green box is the ground-truth box and the red box is the predicted box. (a) The first row shows tracking results of the basic architecture of this thesis without template enhancement; the second row shows results with it. (b) The arrows point to the locations of the highest response; redder regions of the response map indicate larger values.  20
Figure 7. PSR curve for the video Coupon. The blue line has no update condition, the orange line uses the update condition, and the red line is the PSR threshold (see the PSR sketch after this list). The two insets show frames 50 and 90.  22
Figure 8. Frames 127-130 of the video Coupon: response maps and tracking results without/with PSR as the update condition; the ground-truth box is green and the predicted box is red.  23
Figure 9. Ablation of the proposed architecture: precision and success plots on OTB-100 [21].  26
Figure 10. Precision and success plots of OPE on OTB-2013, OTB-50, and OTB-100  28
Figure 11. SRE and TRE results on OTB-100  29
Figure 12. Per-attribute success plots of OPE on OTB-100  32
Figure 13. Successful results on the videos BlurBody, Bolt2, CarDark, Couple, and Shaking; the green box is the ground-truth box and the red box is the predicted box  33
Figure 14. Failure cases; the green box is the ground-truth box and the red box is the predicted box  34
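Figures 7 and 8 gate template updates on the Peak-to-Sidelobe Ratio (PSR) of the response map, a confidence measure introduced for correlation-filter trackers [5]. A minimal sketch follows; the size of the window excluded around the peak is an assumed parameter, not the thesis's setting.

```python
import torch

def psr(response, exclude=5):
    """Peak-to-Sidelobe Ratio of a 2-D response map."""
    r = response.squeeze()
    peak = r.max()
    idx = int(r.argmax())
    py, px = idx // r.shape[1], idx % r.shape[1]
    # Mask out a window around the peak; the rest is the sidelobe region.
    mask = torch.ones_like(r, dtype=torch.bool)
    mask[max(0, py - exclude): py + exclude + 1,
         max(0, px - exclude): px + exclude + 1] = False
    side = r[mask]
    return float((peak - side.mean()) / (side.std() + 1e-8))
```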

List of Tables
Table 1. Neural network parameter settings  9
Table 2. Conv4 SE block parameter settings  9
Table 3. Conv5 SE block parameter settings (an SE block sketch follows this list)  9
Table 4. The 11 video attributes  30
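Tables 2 and 3 configure Squeeze-and-Excitation (SE) blocks [18] on the Conv4 and Conv5 features. A generic SE block sketch is shown below; the reduction ratio r=16 is the SE paper's default, not necessarily the setting recorded in those tables.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Generic Squeeze-and-Excitation block [18]; r is the reduction ratio."""
    def __init__(self, channels, r=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // r), nn.ReLU(),
            nn.Linear(channels // r, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))            # squeeze: global average pooling
        w = self.fc(w).view(b, c, 1, 1)   # excitation: per-channel weights
        return x * w                      # reweight the feature channels
```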
References
[1] X. Mei and H. Ling, "Robust visual tracking using ℓ1 minimization," in ICCV, 2009.
[2] X. Jia, H. Lu and M.-H. Yang, "Visual Tracking via Adaptive Structural Local Sparse Appearance Model," in CVPR, 2012.
[3] D. Ross, J. Lim, R. Lin and M.-H. Yang, "Incremental learning for robust visual tracking," IJCV, 2008.
[4] S. Hare, A. Saffari and P. H. S. Torr, "Struck: Structured Output Tracking with Kernels," in ICCV, 2011.
[5] D. S. Bolme, J. R. Beveridge, B. A. Draper and Y. M. Lui, "Visual object tracking using adaptive correlation filters," in CVPR, 2010.
[6] J. F. Henriques, R. Caseiro, P. Martins and J. Batista, "High-Speed Tracking with Kernelized Correlation Filters," TPAMI, 2015.
[7] M. Danelljan, G. Hager, F. S. Khan and M. Felsberg, "Learning Spatially Regularized Correlation Filters for Visual Tracking," in ICCV, 2015.
[8] H. Nam and B. Han, "Learning multi-domain convolutional neural networks for visual tracking," in CVPR, 2016.
[9] L. Bertinetto, J. Valmadre, J. F. Henriques, A. Vedaldi and P. H. S. Torr, "Fully-convolutional siamese networks for object tracking," in ECCV, 2016.
[10] S. Zagoruyko and N. Komodakis, "Learning to compare image patches via convolutional neural networks," in CVPR, 2015.
[11] G. Koch, R. Zemel and R. Salakhutdinov, "Siamese neural networks for one-shot image recognition," in ICML Deep Learning Workshop, 2015.
[12] R. Tao, E. Gavves and A. W. M. Smeulders, "Siamese instance search for tracking," in CVPR, 2016.
[13] A. He, C. Luo, X. Tian and W. Zeng, "A twofold siamese network for real-time object tracking," in CVPR, 2018.
[14] Q. Wang, Z. Teng, J. Xing, J. Gao, W. Hu and S. Maybank, "Learning attentions: Residual attentional siamese network for high performance online visual tracking," in CVPR, 2018.
[15] R. R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," in ICCV, 2017.
[16] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva and A. Torralba, "Learning Deep Features for Discriminative Localization," in CVPR, 2016.
[17] T.-Y. Lin, P. Goyal, R. Girshick, K. He and P. Dollár, "Focal Loss for Dense Object Detection," in ICCV, 2017.
[18] J. Hu, L. Shen and G. Sun, "Squeeze-and-excitation networks," in CVPR, 2018.
[19] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg and L. Fei-Fei, "ImageNet Large Scale Visual Recognition Challenge," IJCV, 2015.
[20] Y. Wu, J. Lim and M.-H. Yang, "Online Object Tracking: A Benchmark," in CVPR, pp. 2411-2418, 2013.
[21] Y. Wu, J. Lim and M.-H. Yang, "Object Tracking Benchmark," TPAMI, vol. 37, no. 9, pp. 1834-1848, 2015.
[22] Z. Hong, Z. Chen, C. Wang, X. Mei, D. Prokhorov and D. Tao, "MUlti-Store Tracker (MUSTer): a cognitive psychology inspired approach to object tracking," in CVPR, 2015.
[23] J. Zhang, S. Ma and S. Sclaroff, "MEEM: Robust tracking via multiple experts using entropy minimization," in ECCV, 2014.
[24] W. Zhong, H. Lu and M.-H. Yang, "Robust Object Tracking via Sparsity-based Collaborative Model," in CVPR, 2012.
[25] T. B. Dinh, N. Vo and G. Medioni, "Context Tracker: Exploring supporters and distracters in unconstrained environments," in CVPR, 2011.
[26] Z. Kalal, J. Matas and K. Mikolajczyk, "P-N Learning: Bootstrapping Binary Classifiers by Structural Constraints," in CVPR, 2010.
[27] J. F. Henriques, R. Caseiro, P. Martins and J. Batista, "Exploiting the Circulant Structure of Tracking-by-Detection with Kernels," in ECCV, 2012.
Full-Text Access Rights
On campus
The print thesis will be made publicly available 2 years after submission of the authorization form.
Full-text electronic access on campus has been authorized.
The on-campus electronic thesis will be made publicly available 2 years after submission of the authorization form.
Off campus
Authorization granted.
The off-campus electronic thesis will be made publicly available 2 years after submission of the authorization form.
