Tamkang University Chueh Sheng Memorial Library (TKU Library)
System ID U0002-2002201212512500
Title (Chinese) 階層式多物件影像分割應用於家用物品之研究
Title (English) Research of Hierarchical Methods for Multi-Objects Segmentation in Home Environment
University Tamkang University
Department (Chinese name) Master's Program, Department of Electrical Engineering
Department (English name) Department of Electrical Engineering
Academic Year 100
Semester 1
Year of Publication 101
Author (Chinese name) 黎均毅
Author (English name) Chun-I Li
Student ID 698450417
Degree Master's
Language Chinese
Oral Defense Date 2011-12-30
Pages 94
Committee Advisor - 江正雄
Member - 郭景明
Member - 許明華
Member - 賴永康
Member - 周建興
Keywords (Chinese) machine vision; stereo vision; image segmentation; object recognition
Keywords (English) Robotic Vision; Stereo Vision; Image Segmentation; Object Recognition
Subject Classification Applied Sciences - Electrical and Electronic Engineering
Abstract (Chinese) With the recent development of technology and industry, robotics has matured and has been successfully applied in many fields. In home service and care, robots are gradually taking over many tasks, and the demand for home-service robots is growing. A home-service robot must provide functions such as navigation and localization, interaction with family members, and assistance in caring for them. Before any of these problems can be solved, the most important input is the robot's eyes: machine vision, which analyzes and interprets the images fed into the system, is essential. In a home-service environment, recognizing household objects is a crucial step. Many object-recognition methods exist, such as histogram comparison, template matching, feature-point extraction, AdaBoost, SVM, and neural-network learning algorithms. Nevertheless, most current methods rely on a pre-built database or on training with large numbers of samples to recognize objects effectively. Once a household object that is not in the system's database appears, a model must be built manually off-line, which makes the recognition system, or the robot serving in the environment, less convenient.
To address these problems, this thesis proposes a segmentation and model-building method that combines depth images with GrabCut. The system segments objects hierarchically: a coarse layer uses the depth image to find the approximate position and size of multiple objects, and a fine layer then uses GrabCut to cut out precise object contours and build models. First, the stereo input is matched to produce a disparity map; based on this map, the system perceives the 3-D structure of the environment, filters out the distant background, and separates individual objects by segmenting the disparity histogram. GrabCut then iterates on the masked image to converge on complete object contours, and finally SIFT/SURF feature points are extracted for recognition and the database is updated, completing fully automatic multi-object segmentation and model building. Unlike other object-recognition approaches, this method can automatically build models of static household objects against non-fixed backgrounds; the completeness of the object information approaches that of manually built models, and recognition accuracy is maintained on the updated database.
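The coarse layer described above (background filtering followed by disparity-histogram segmentation) can be sketched roughly as follows. This is an illustrative reconstruction, not the thesis's actual code; the function name `coarse_segment` and the thresholds `min_disp` and `min_pixels` are assumptions made for the example.

```python
import numpy as np

def coarse_segment(disparity, min_disp=30, min_pixels=50):
    """Coarse-layer sketch: drop the far background, then split the
    remaining pixels into per-object masks via the disparity histogram."""
    # 1. Background removal: distant points have small disparity values.
    fg = np.where(disparity >= min_disp, disparity, 0)

    # 2. Histogram of the remaining disparities; each sufficiently
    #    populated band of adjacent bins is treated as one candidate object.
    hist = np.bincount(fg[fg > 0].ravel())
    masks = []
    d = min_disp
    while d < len(hist):
        if hist[d] >= min_pixels:
            # Grow a band of adjacent well-populated disparity bins.
            hi = d
            while hi + 1 < len(hist) and hist[hi + 1] >= min_pixels:
                hi += 1
            masks.append((fg >= d) & (fg <= hi))  # one mask per object
            d = hi + 1
        else:
            d += 1
    return masks
```

Each returned boolean mask covers one depth band, ordered far to near, and would then be handed to the fine layer for contour refinement.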
Abstract (English) With the growth of technology and industry in the past few years, robotic techniques have matured and have been applied in many different fields. In home care and home service, many functions can be taken over by robots. A home-care robot must provide many functions, such as navigation and positioning, interaction, and serving family members. Before these problems can be solved, robot vision, which analyzes the images fed into the system, is essential. There are many object-recognition methods for robot vision, such as histogram comparison, AdaBoost, SVM, and template matching. However, most of them depend on a database to classify models and recognize objects. If an object that does not exist in the database appears in an image, a model must be constructed for the database manually, which is inconvenient and increases the cost of home-service robots.
This thesis proposes a new method that combines depth-image processing with GrabCut to segment objects and save their models from images, solving the problems above. We present a hierarchical scheme with coarse-layer and fine-layer segmentation. The coarse layer finds the approximate location and extent of multiple objects by processing the depth image; the fine layer then segments each object accurately and finds its contour using GrabCut. Finally, the proposed method extracts SIFT/SURF feature points to recognize objects. The method automatically segments static objects in different home environments, the extracted object data are close to those obtained by manual segmentation, and the automatically constructed image database maintains a good recognition accuracy rate.
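The recognition step matches SIFT/SURF descriptors extracted from a segmented object against the stored database. The record does not spell out the matching rule; a common choice, assumed here rather than taken from the thesis, is Lowe's nearest-neighbor distance-ratio test:

```python
import numpy as np

def ratio_test_matches(query_desc, db_desc, ratio=0.8):
    """For each query descriptor, find its two nearest database descriptors
    (Euclidean distance) and keep the match only when the nearest one is
    clearly closer than the runner-up (Lowe's ratio test).
    db_desc must contain at least two descriptors."""
    matches = []
    for qi, q in enumerate(query_desc):
        dists = np.linalg.norm(db_desc - q, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((qi, int(best)))
    return matches
```

A query descriptor is accepted only when its nearest database descriptor is markedly closer than the second nearest, which rejects ambiguous matches between similar-looking objects and helps keep the recognition rate stable as the database grows.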
Contents Chinese Abstract I
English Abstract II
Table of Contents III
List of Figures and Tables VI

Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Topic and Goals 3
1.3 Thesis Organization 4

Chapter 2 Related Techniques 5
2.1 Overview of Stereo Vision 5
2.1.1 Generating Stereo Images from Two Cameras 6
2.1.2 The STOC Stereo Module 10
2.1.3 Stereo Matching Problems 13
2.2 Overview of Image Segmentation 18
2.2.1 Template-Based Segmentation 19
2.2.2 Color-Based Segmentation 21
2.2.3 Segmentation Using GrabCut 22
2.3 Scale-Invariant Feature Transform (SIFT) 29
2.3.1 Scale-Space Extrema Detection 30
2.3.2 Keypoint Localization 32
2.3.3 Orientation Assignment 37
2.3.4 Keypoint Description 38
Chapter 3 Hierarchical Multi-Object Segmentation Architecture 41
3.1 Architecture of the Hierarchical Segmentation Algorithm 41
3.2 Coarse-Layer Segmentation 43
3.2.1 Background Removal 43
3.2.2 Preliminary Segmentation by Position and Depth 47
3.2.3 Morphological Compensation and Labeling 49
3.3 Fine-Layer Segmentation Architecture 63
3.4 Application to Model Building from Images 68
Chapter 4 Experimental Results 71
4.1 Correct Coverage Ratio 74
4.2 Coverage Accuracy 76
4.3 Recognition Analysis 84
4.4 Analysis and Comparison 86
Chapter 5 Conclusion 88
5.1 Conclusions and Future Work 88

References 90

List of Figures

Fig. 2.1 Original images before matching 5
Fig. 2.2 Ground-truth disparity map 6
Fig. 2.3 Flow of depth-image generation 7
Fig. 2.4 Illustration of image rectification 8
Fig. 2.5 Disparity image produced by the SAD algorithm 9
Fig. 2.6 Mapping 2-D correspondences to 3-D space 10
Fig. 2.7 Appearance of the STOC module 12
Fig. 2.8 Original left-eye image 12
Fig. 2.9 Disparity image output by the STOC 13
Fig. 2.10 Problems encountered in stereo matching 17
Fig. 2.11 Template classification 19
Fig. 2.12 Multi-template detection results 20
Fig. 2.13 Face template construction and tracking 21
Fig. 2.14 HSV color model 22
Fig. 2.15 Original image and segmented image 23
Fig. 2.16 Illustration of segmentation 26
Fig. 2.17 Difference of Gaussians (DoG) 32
Fig. 2.18 Finding maxima and minima among 26 neighbors 33
Fig. 2.19 Detecting extrema 37
Fig. 2.20 Taking the histogram maximum as the dominant orientation 38
Fig. 2.21 Generating feature vectors from gradients in the keypoint neighborhood 39
Fig. 3.1 System flowchart 42
Fig. 3.2 Relation between pixel count and distance 44
Fig. 3.3 Adjusting Xoff to change the nearest and farthest visible distance 45
Fig. 3.4 Disparity image with background removed 46
Fig. 3.5 Histograms of objects and noise 48
Fig. 3.6 Thresholded image 49
Fig. 3.7 Illustration of pixel neighborhoods 51
Fig. 3.8 Image after dilation 52
Fig. 3.9 Image after erosion 53
Fig. 3.10 Binary image after opening 54
Fig. 3.11 Binary image after closing 55
Fig. 3.12 Foreground from depth segmentation 57
Fig. 3.13 Illustration of labeling collisions 59
Fig. 3.14 Labeling algorithm 61
Fig. 3.15 Illustration of assigned foreground and background 66
Fig. 3.16 Segmentation results 67
Fig. 3.17 Model-building flowchart 69
Fig. 3.18 Illustration of matching 70
Fig. 4.1 Verification flowchart 72
Fig. 4.2 Object samples 73
Fig. 4.3 Illustration of coverage 74
Fig. 4.4 Illustration of matching and mismatching pixels 78
Fig. 4.5 Scene 1 and its segmentation 79
Fig. 4.6 Scene 2 and its segmentation 80
Fig. 4.7 Scene 3 and its segmentation 81
Fig. 4.8 Scene 4 and its segmentation 82
Fig. 4.9 Scene 5 and its segmentation 83
Fig. 4.10 Relation between SIFT/SURF feature points and object distance 85

List of Tables

Table 2.1 STOC specifications 10
Table 3.1 Boundary weights 64
Table 4.1 TP/FN coverage ratio 76
Table 4.2 Definitions 76
Table 4.3 Precision per scene and overall average 78
Table 4.4 Comparison of recognition accuracy 84
Table 4.5 Comparison of methods 86
References
[1] S. Mattoccia, Stereo Vision: Algorithms and Applications, VIALAB, Bologna, November 2011.
[2] Available: http://www.videredesign.com/
[3] G. Wu, W. Liu, X. Xie, and Q. Wei, “A shape detection method based on the radial symmetry nature and direction-discriminated voting,” IEEE International Conference on Image Processing, vol. 6, pp. VI-169–VI-172, September 2007.
[4] N. Krahnstoever and R. Sharma, “Appearance management and cue fusion for 3D model-based tracking,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-249–II-254, June 2003.
[5] 鄧宏志, Real-Time Image Processing for a Middle-Size Robot Soccer System, Master's thesis, Tamkang University, June 2006.
[6] S. M. Yoon and H. Kim, “Real-time multiple people detection using skin color, motion and appearance information,” IEEE International Workshop on Robot and Human Interactive Communication, pp. 331–334, September 2004.
[7] J. Yang, M.-J. Zhang, and J.-A. Xu, “A visual tracking method for mobile robot,” World Congress on Intelligent Control and Automation, vol. 2, pp. 9017–9021, June 2006.
[8] J. Pers and S. Kovacic, “Computer vision system for tracking players in sports games,” Proceedings of the International Workshop on Image and Signal Processing and Analysis, pp. 177–182, June 2000.
[9] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Addison-Wesley, 1992.
[10] 連國珍, Digital Image Processing, 儒林圖書有限公司, 1992.
[11] Available: http://en.wikipedia.org/wiki/HSV
[12] C. Rother, V. Kolmogorov, and A. Blake, “‘GrabCut’: Interactive foreground extraction using iterated graph cuts,” ACM Transactions on Graphics, vol. 23, no. 3, pp. 309–314, 2004.
[13] M. A. Ruzon and C. Tomasi, “Alpha estimation in natural images,” IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 18–25, 2000.
[14] Y. Boykov and M.-P. Jolly, “Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images,” Proceedings of the International Conference on Computer Vision, pp. 105–112, 2001.
[15] Y. Boykov, O. Veksler, and R. Zabih, “Markov random fields with efficient approximations,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 648–655, June 1998.
[16] D. Greig, B. Porteous, and A. Seheult, “Exact maximum a posteriori estimation for binary images,” Journal of the Royal Statistical Society, Series B, vol. 51, no. 2, pp. 271–279, 1989.
[17] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, 2004.
[18] A. P. Witkin, “Scale-space filtering,” International Joint Conference on Artificial Intelligence, pp. 1019–1022, 1983.
[19] D. G. Lowe, “Object recognition from local scale-invariant features,” Proceedings of the 7th International Conference on Computer Vision, pp. 1150–1157, 1999.
[20] M. Brown and D. G. Lowe, “Invariant features from interest point groups,” British Machine Vision Conference, pp. 656–665, 2002.
[21] K. Suzuki, I. Horiba, and N. Sugie, “Linear-time connected-component labeling based on sequential local operations,” Computer Vision and Image Understanding, vol. 89, pp. 1–23, January 2003.
[22] H. Hedberg, F. Kristensen, and V. Owall, “Implementation of a labeling algorithm based on contour tracing with feature extraction,” IEEE International Symposium on Circuits and Systems, pp. 1101–1104, May 2007.
[23] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 3rd ed., Cambridge, MA: MIT Press, 2009.
[24] C. Zhan, X. Duan, S. Xu, Z. Song, and M. Luo, “An improved moving object detection algorithm based on frame difference and edge detection,” Proceedings of the IEEE International Conference on Image and Graphics, pp. 519–523, August 2007.
[25] Y. J. Li, J. F. Yang, R. B. Wu, and F. X. Gong, “Efficient object tracking based on local invariant features,” IEEE Conference on SCIT, pp. 697–700, 2006.
[26] C. Stauffer and W. E. L. Grimson, “Adaptive background mixture models for real-time tracking,” Proceedings of Computer Vision and Pattern Recognition, pp. 246–252, June 1999.
[27] P. Chang and J. Krumm, “Object recognition with color cooccurrence histograms,” IEEE Conference on Computer Vision and Pattern Recognition, pp. 1063–1069, June 1999.
[28] P. Azad, T. Asfour, and R. Dillmann, “Combining Harris interest points and the SIFT descriptor for fast scale-invariant object recognition,” International Conference on Intelligent Robots and Systems, pp. 4275–4280, October 2009.
[29] P. F. Felzenszwalb and D. P. Huttenlocher, “Efficient graph-based image segmentation,” International Journal of Computer Vision, vol. 59, no. 2, pp. 167–181, 2004.
[30] W. Tao, H. Jin, and Y. Zhang, “Color image segmentation based on mean shift and normalized cuts,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 37, no. 5, pp. 1382–1389, October 2007.
[31] K. Mikolajczyk and C. Schmid, “A performance evaluation of local descriptors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615–1630, October 2005.
[32] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, 2002.
[33] H. Hirschmuller, “Accurate and efficient stereo processing by semi-global matching and mutual information,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 328–341, 2008.
[34] M. Bleyer, C. Rother, and P. Kohli, “Surface stereo with soft segmentation,” IEEE Conference on Computer Vision and Pattern Recognition, pp. 1570–1577, June 2010.
[35] D. G. Lowe, “Object recognition from local scale-invariant features,” International Conference on Computer Vision, pp. 1150–1157, 1999.
[36] Y. Boykov and V. Kolmogorov, “An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 9, pp. 1124–1137, September 2004.
[37] H. Bay, T. Tuytelaars, and L. J. V. Gool, “SURF: Speeded up robust features,” European Conference on Computer Vision, vol. 1, pp. 404–417, 2006.
[38] Available: http://amos.ee.tku.edu.tw/pattern/tracking/
Usage Permissions
  • The author consents to royalty-free reproduction of the print copy for academic use by in-library readers, available from 2017-02-21.
  • The author consents to licensed browsing/printing of the electronic full text, available from 2017-02-21.

