§ Browse Thesis Bibliographic Record
  
System ID	U0002-2002201212512500
DOI	10.6846/TKU.2012.00816
Title (Chinese)	階層式多物件影像分割應用於家用物品之研究
Title (English)	Research of Hierarchical Methods for Multi-Objects Segmentation in Home Environment
Title (third language)
School	Tamkang University (淡江大學)
Department (Chinese)	電機工程學系碩士班 (Master's Program, Department of Electrical Engineering)
Department (English)	Department of Electrical and Computer Engineering
Foreign degree school
Foreign degree college
Foreign degree institute
Academic year	100 (2011-2012)
Semester	1
Year of publication	101 (2012)
Author (Chinese)	黎均毅
Author (English)	Chun-I Li
Student ID	698450417
Degree	Master's
Language	Traditional Chinese
Second language
Oral defense date	2011-12-30
Pages	94
Committee	Advisor - 江正雄 (chiang@ee.tku.edu.tw)
Member - 郭景明 (jmguo@seed.net.tw)
Member - 許明華 (sheumh@yuntech.edu.tw)
Member - 賴永康 (yklai@dragon.nchu.edu.tw)
Member - 周建興 (chchou@mail.tku.edu.tw)
Keywords (Chinese)	機器視覺 (robotic vision)
立體視覺 (stereo vision)
影像分割 (image segmentation)
物件辨識 (object recognition)
Keywords (English)	Robotic Vision
Stereo Vision
Image Segmentation
Object Recognition
Third-language keywords
Subject classification
Chinese Abstract (translated)
In recent years, with advances in technology and industry, robotics has matured considerably and has been applied successfully in many fields. In home service and care, more and more tasks can be taken over by robots, and the demand for home-service robots keeps growing. A home-service robot must be able to navigate and localize itself, interact with family members, and assist in caring for them. Before any of these problems can be solved, the most important input is the robot's eyes: image analysis by the robot-vision front end is crucial. In a home-service environment, recognizing household objects is an essential step. Many object-recognition methods exist, such as histogram comparison, template matching, feature-point extraction, AdaBoost, SVM, and neural-network learning algorithms. Nevertheless, most current methods depend on a pre-built database or on training with large numbers of samples. Once a household object that is not in the system's database appears, its model must be built manually off-line, which makes the recognition system, or a robot serving in the environment, inconvenient to use.
    To address these problems, this thesis proposes a segmentation and model-building method that combines depth images with GrabCut. Segmentation is organized hierarchically: a coarse layer uses the depth image to locate the approximate position and size of multiple objects, and a fine layer then applies GrabCut to cut out precise object contours and build models. Stereo matching of the binocular input first produces a disparity map, which provides the 3-D information of the scene; the distant background is filtered out, and individual objects are separated by segmenting the disparity histogram. GrabCut is then run on each masked image until it converges on a complete object contour, and finally SIFT/SURF feature points are extracted for recognition and the database is updated, yielding fully automatic, multi-object segmentation and model building. Unlike other object-recognition approaches, this method builds models of static household objects fully automatically against non-fixed backgrounds, with object information nearly as complete as manually built models, while maintaining a stable recognition rate on the updated database.
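The coarse layer described in the abstract (remove the far background by its small disparity, then split objects by the bands they occupy in the disparity histogram) can be illustrated with a short Python sketch. This is a reconstruction under assumptions, not the thesis code: the function name `coarse_layer`, the thresholds `bg_thresh` and `band_gap`, and the band-splitting rule are all illustrative.

```python
import numpy as np

def coarse_layer(disparity, bg_thresh=8, band_gap=5):
    """Coarse-layer split of a disparity map: drop the far background,
    then separate objects whose disparities fall in distinct histogram
    bands (larger disparity = closer object)."""
    # 1. Background removal: pixels with small disparity are far away.
    fg = np.where(disparity >= bg_thresh, disparity, 0)

    # 2. Scan the disparity histogram for occupied bands; a run of
    #    `band_gap` empty bins ends the current band.
    hist = np.bincount(fg[fg > 0].ravel(), minlength=256)
    bands, start, gap = [], None, 0
    for d, count in enumerate(hist):
        if count > 0:
            start = d if start is None else start
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= band_gap:
                bands.append((start, d - gap))
                start, gap = None, 0
    if start is not None:
        bands.append((start, len(hist) - 1))

    # 3. One binary mask per band: a rough per-object foreground mask
    #    that the fine layer would refine with GrabCut.
    return [(fg >= lo) & (fg <= hi) for lo, hi in bands]
```

On a synthetic disparity map containing two objects at different depths, this returns one mask per object; the thesis instead obtains its disparity maps from the STOC stereo module.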
English Abstract
With the growth of technology and industry over the past few years, robotic techniques have matured and been applied in many different fields. In home care and home service, many tasks can be taken over by robots. A home-care robot must provide many functions, such as navigation and positioning, interaction, and serving family members. Before these problems can be solved, robot vision is essential, since the system's input images must be analyzed. Many object-recognition methods exist for robot vision, such as histogram comparison, AdaBoost, SVM, and template matching, but most of them depend on a database of models for classification and recognition. If an object that does not exist in the database appears in an image, its model must be constructed manually, which makes object recognition inconvenient and raises the cost of home-service robots.
This thesis proposes a new method that combines depth-image processing with GrabCut to segment objects and save their models, solving the problems above. We present a hierarchical scheme with coarse-layer and fine-layer segmentation. The coarse layer finds the approximate location and extent of multiple objects from the depth image; the fine layer then segments each object and extracts its contour using GrabCut. Finally, the proposed method uses SIFT/SURF to extract feature points and recognize objects. The method automatically segments static objects in varying home environments, producing object data close to that of manual segmentation, and the automatically constructed image database maintains a good recognition accuracy rate.
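Recognition against the updated database relies on matching SIFT/SURF feature points. A standard acceptance rule for such matches is Lowe's ratio test, sketched below on plain NumPy descriptor arrays; the thesis does not spell out its matching rule, so the function name, the 0.75 ratio, and the brute-force nearest-neighbor search are assumptions.

```python
import numpy as np

def ratio_test_match(query_desc, model_desc, ratio=0.75):
    """Match each query descriptor to its nearest model descriptor,
    accepting it only when the nearest neighbor is clearly closer
    than the second nearest (Lowe's ratio test)."""
    matches = []
    for i, d in enumerate(query_desc):
        dists = np.linalg.norm(model_desc - d, axis=1)  # brute-force NN
        first, second = np.argsort(dists)[:2]
        if dists[first] < ratio * dists[second]:
            matches.append((i, int(first)))
    return matches
```

Ambiguous query descriptors, i.e. those nearly equidistant from two model descriptors, are rejected, which is what keeps the recognition rate stable as the database grows.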
Third-Language Abstract
Table of Contents
Chinese Abstract	I
English Abstract	II
Table of Contents	III
List of Figures and Tables	VI

Chapter 1  Introduction	1
1.1 Background and Motivation	1
1.2 Research Topic and Goals	3
1.3 Thesis Organization	4

Chapter 2  Related Techniques	5
2.1 Overview of Stereo Vision	5
2.1.1 Generating Stereo Images from Two Cameras	6
2.1.2 The STOC Stereo Module	10
2.1.3 Problems in Stereo Matching	13
2.2 Overview of Image Segmentation	18
2.2.1 Template-Based Segmentation	19
2.2.2 Color-Based Segmentation	21
2.2.3 Segmentation with GrabCut	22
2.3 Scale-Invariant Feature Transform (SIFT)	29
2.3.1 Scale-Space Extrema Detection	30
2.3.2 Keypoint Localization	32
2.3.3 Orientation Assignment	37
2.3.4 Keypoint Description	38
Chapter 3  Hierarchical Multi-Object Segmentation Architecture	41
3.1 Architecture of the Hierarchical Segmentation Algorithm	41
3.2 Coarse-Layer Segmentation	43
3.2.1 Background Removal	43
3.2.2 Initial Segmentation by Position and Depth	47
3.2.3 Morphological Compensation and Labeling	49
3.3 Fine-Layer Segmentation Architecture	63
3.4 Application to Model Building from Images	68
Chapter 4  Experimental Results	71
4.1 Correct Coverage Ratio	74
4.2 Coverage Accuracy	76
4.3 Recognition Analysis	84
4.4 Analysis and Comparison	86
Chapter 5  Conclusion	88
5.1 Conclusions and Future Work	88

References	90
List of Figures

Fig. 2.1 Original images before matching	5
Fig. 2.2 Ground-truth disparity map	6
Fig. 2.3 Flow of depth-image generation	7
Fig. 2.4 Illustration of image rectification	8
Fig. 2.5 Disparity image produced by the SAD algorithm	9
Fig. 2.6 Mapping 2-D correspondences to 3-D space	10
Fig. 2.7 Appearance of the STOC module	12
Fig. 2.8 Original left-eye image	12
Fig. 2.9 Disparity image output by the STOC	13
Fig. 2.10 Problems encountered in stereo matching	17
Fig. 2.11 Template classification	19
Fig. 2.12 Multi-template detection results	20
Fig. 2.13 Face template construction and tracking	21
Fig. 2.14 HSV color model	22
Fig. 2.15 Original and segmented images	23
Fig. 2.16 Illustration of segmentation	26
Fig. 2.17 Difference of Gaussians (DoG)	32
Fig. 2.18 Finding maxima and minima among 26 neighbors	33
Fig. 2.19 Detected extrema	37
Fig. 2.20 Dominant orientation from the histogram maximum	38
Fig. 2.21 Feature vector generated from gradients around the keypoint	39
Fig. 3.1 System flowchart	42
Fig. 3.2 Relation between pixel count and distance	44
Fig. 3.3 Adjusting Xoff to change the nearest and farthest visible distances	45
Fig. 3.4 Disparity image with background removed	46
Fig. 3.5 Histogram of objects and noise	48
Fig. 3.6 Thresholded image	49
Fig. 3.7 Pixel-neighborhood illustration	51
Fig. 3.8 Image after dilation	52
Fig. 3.9 Image after erosion	53
Fig. 3.10 Binary image after opening	54
Fig. 3.11 Binary image after closing	55
Fig. 3.12 Depth-segmented foreground	57
Fig. 3.13 Illustration of labeling collisions	59
Fig. 3.14 Labeling algorithm	61
Fig. 3.15 Illustration of the given foreground and background	66
Fig. 3.16 Segmentation results	67
Fig. 3.17 Model-building flowchart	69
Fig. 3.18 Matching illustration	70
Fig. 4.1 Verification flowchart	72
Fig. 4.2 Object samples	73
Fig. 4.3 Coverage illustration	74
Fig. 4.4 Illustration of matching and mismatching pixels	78
Fig. 4.5 Scene 1 and its segmentation	79
Fig. 4.6 Scene 2 and its segmentation	80
Fig. 4.7 Scene 3 and its segmentation	81
Fig. 4.8 Scene 4 and its segmentation	82
Fig. 4.9 Scene 5 and its segmentation	83
Fig. 4.10 Relation between SIFT/SURF feature points and object distance	85

List of Tables

Table 2.1 STOC specifications	10
Table 3.1 Boundary weights	64
Table 4.1 TP/FN coverage count ratio	76
Table 4.2 Definitions	76
Table 4.3 Precision for each scene and overall average	78
Table 4.4 Comparison of recognition accuracy	84
Table 4.5 Comparison of methods	86
References
[1] S. Mattoccia, Stereo Vision: Algorithms and Applications, VIALAB, Bologna, November 2011.
[2] Available: http://www.videredesign.com/
[3] G. Wu, W. Liu, X. Xie, and Q. Wei, "A shape detection method based on the radial symmetry nature and direction-discriminated voting," IEEE International Conference on Image Processing, vol. 6, pp. VI-169-VI-172, September 2007.
[4] N. Krahnstoever and R. Sharma, "Appearance management and cue fusion for 3D model-based tracking," IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-249-254, June 2003.
[5] 鄧宏志, Real-Time Image Processing for a Middle-Size Robot Soccer System, Master's thesis, Tamkang University, June 2006.
[6] S. M. Yoon and H. Kim, "Real-time multiple people detection using skin color, motion and appearance information," IEEE International Workshop on Robot and Human Interactive Communication, pp. 331-334, September 2004.
[7] J. Yang, M.-J. Zhang, and J.-A. Xu, "A visual tracking method for mobile robot," World Congress on Intelligent Control and Automation, vol. 2, pp. 9017-9021, June 2006.
[8] J. Pers and S. Kovacic, "Computer vision system for tracking players in sports games," Proceedings of the International Workshop on Image and Signal Processing and Analysis, pp. 177-182, June 2000.
[9] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Addison-Wesley, 1992.
[10] 連國珍, Digital Image Processing (數位影像處理), 儒林圖書有限公司, 1992.
[11] Available: http://en.wikipedia.org/wiki/HSV
[12] C. Rother, V. Kolmogorov, and A. Blake, "'GrabCut': interactive foreground extraction using iterated graph cuts," ACM Transactions on Graphics, vol. 23, no. 3, pp. 309-314, 2004.
[13] M. A. Ruzon and C. Tomasi, "Alpha estimation in natural images," IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 18-25, 2000.
[14] Y. Boykov and M.-P. Jolly, "Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images," Proceedings of the International Conference on Computer Vision, pp. 105-112, 2001.
[15] Y. Boykov, O. Veksler, and R. Zabih, "Markov random fields with efficient approximations," IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 648-655, June 1998.
[16] D. Greig, B. Porteous, and A. Seheult, "Exact maximum a posteriori estimation for binary images," Journal of the Royal Statistical Society, Series B, vol. 51, no. 2, pp. 271-279, 1989.
[17] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[18] A. P. Witkin, "Scale-space filtering," International Joint Conference on Artificial Intelligence, pp. 1019-1022, 1983.
[19] D. G. Lowe, "Object recognition from local scale-invariant features," Proceedings of the 7th International Conference on Computer Vision, pp. 1150-1157, 1999.
[20] M. Brown and D. G. Lowe, "Invariant features from interest point groups," British Machine Vision Conference, pp. 656-665, 2002.
[21] K. Suzuki, I. Horiba, and N. Sugie, "Linear-time connected-component labeling based on sequential local operations," Computer Vision and Image Understanding, vol. 89, pp. 1-23, January 2003.
[22] H. Hedberg, F. Kristensen, and V. Owall, "Implementation of a labeling algorithm based on contour tracing with feature extraction," IEEE International Symposium on Circuits and Systems, pp. 1101-1104, May 2007.
[23] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 3rd ed., Cambridge, MA: MIT Press, 2009.
[24] C. Zhan, X. Duan, S. Xu, Z. Song, and M. Luo, "An improved moving object detection algorithm based on frame difference and edge detection," Proceedings of the IEEE International Conference on Image and Graphics, pp. 519-523, August 2007.
[25] Y. J. Li, J. F. Yang, R. B. Wu, and F. X. Gong, "Efficient object tracking based on local invariant features," IEEE Conference on SCIT, pp. 697-700, 2006.
[26] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," Proceedings of Computer Vision and Pattern Recognition, pp. 246-252, June 1999.
[27] P. Chang and J. Krumm, "Object recognition with color cooccurrence histograms," IEEE Conference on Computer Vision and Pattern Recognition, pp. 1063-1069, June 1999.
[28] P. Azad, T. Asfour, and R. Dillmann, "Combining Harris interest points and the SIFT descriptor for fast scale-invariant object recognition," International Conference on Intelligent Robots and Systems, pp. 4275-4280, October 2009.
[29] P. F. Felzenszwalb and D. P. Huttenlocher, "Efficient graph-based image segmentation," International Journal of Computer Vision, vol. 59, no. 2, pp. 167-181, 2004.
[30] W. Tao, H. Jin, and Y. Zhang, "Color image segmentation based on mean shift and normalized cuts," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 37, no. 5, pp. 1382-1389, October 2007.
[31] K. Mikolajczyk and C. Schmid, "A performance evaluation of local descriptors," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, October 2005.
[32] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms," International Journal of Computer Vision, 2002.
[33] H. Hirschmuller, "Stereo processing by semiglobal matching and mutual information," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 328-341, 2008.
[34] M. Bleyer, C. Rother, and P. Kohli, "Surface stereo with soft segmentation," IEEE Conference on Computer Vision and Pattern Recognition, pp. 1570-1577, June 2010.
[35] D. G. Lowe, "Object recognition from local scale-invariant features," International Conference on Computer Vision, pp. 1150-1157, 1999.
[36] Y. Boykov and V. Kolmogorov, "An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 9, pp. 1124-1137, September 2004.
[37] H. Bay, T. Tuytelaars, and L. J. Van Gool, "SURF: speeded up robust features," European Conference on Computer Vision, vol. 1, pp. 404-417, 2006.
[38] Available: http://amos.ee.tku.edu.tw/pattern/tracking/
Full-Text Use Authorization
On campus
Print copy to be made public 5 years after submission of the authorization form
Electronic full text authorized for on-campus public access
On-campus electronic copy to be made public 5 years after submission of the authorization form
Off campus
Authorization granted
Off-campus electronic copy to be made public 5 years after submission of the authorization form
