Tamkang University Chueh Sheng Memorial Library (TKU Library)

(Electronic full text may be downloaded only from a Tamkang University IP address.)
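System ID: U0002-2807201315372800
Chinese title: 基於隱藏式馬可夫模型與深度資訊之手語辨識系統
English title: Sign Language Recognition based on HMMs and Depth Information
Institution: Tamkang University
Department (Chinese): 電機工程學系碩士班 (Master's Program, Department of Electrical Engineering)
Department (English): Department of Electrical Engineering
Academic year: 101 (ROC calendar, 2012-2013)
Semester: 2
Publication year: 102 (ROC calendar, 2013)
Author (Chinese name): 林芃翔
Author (English name): Peng-Hsiang Lin
Student ID: 600440092
Degree: Master's
Language: Chinese
Oral defense date: 2013-07-03
Number of pages: 88
Oral defense committee: Advisor - 陳巽璋
Co-advisor - 謝景棠
Committee member - 蘇木春
Committee member - 洪國銘
Chinese keywords: 隱藏式馬可夫模型; 手語辨識; 深度資訊
English keywords: Hidden Markov Models; Sign Language Recognition; Depth Information
Subject classification: Applied Sciences, Electrical and Electronic Engineering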
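Abstract (translated from the Chinese): The recognition rate of a complete sign language recognition system is affected by both system parameters and feature parameters, and the latter affect it far more than the former. Within a sign language recognition architecture, a good feature parameter can therefore effectively raise the overall recognition rate. In this thesis, we add depth information to the sign language recognition system so that the positions of both hands can be located more effectively. This information varies from signer to signer, however, which can make recognition fail, so we use the change in the three-dimensional coordinates per unit time as the recognition feature to solve this problem. We first record how the three-dimensional coordinates change over time, then exploit the ability of hidden Markov models to learn time-axis information to recognize the various signs formed by changes in motion over time. Adding depth information makes it possible to recognize a wider variety of signs and to handle differences between signers, while also achieving a higher recognition rate.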
English abstract: The recognition rate of a complete sign language recognition system is influenced by both system parameters and feature parameters, and the latter matter more. A good feature parameter can therefore effectively improve the recognition rate of the system. In this thesis, we add depth information to the sign language recognition system to locate the hands more effectively. However, this information varies between signers, which can cause recognition to fail. To solve this problem, we use the incremental change of the three-dimensional coordinates per unit time as the feature parameter. We first record how the three-dimensional coordinates change over time, then use hidden Markov models, which can learn time-axis information, to recognize the signs formed by these changes in movement over time. With depth information added, the system can recognize a wider variety of signs, handle differences between signers, and achieve a higher recognition rate.
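The abstract describes two steps: taking the per-frame change of the hands' three-dimensional coordinates as the feature, and scoring the resulting sequence with hidden Markov models. The record gives no implementation details, so the following Python sketch is only an illustration under stated assumptions: the function names, the two-state left-to-right topology, the diagonal Gaussian emissions, the millimetre units, and every numeric parameter are hypothetical and are not taken from the thesis.

```python
import math


def delta_features(frames):
    """Return the frame-to-frame change of every coordinate.

    frames: list of equal-length tuples, one per captured frame
            (here one hand's x, y, z position, assumed to be in mm).
    """
    return [
        tuple(cur - prev for prev, cur in zip(a, b))
        for a, b in zip(frames, frames[1:])
    ]


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_likelihood(obs, model):
    """Scaled forward algorithm: log P(obs | model)."""
    start, trans = model["start"], model["trans"]
    means, var = model["means"], model["var"]
    n = len(start)
    alpha, total = [], 0.0
    for t, x in enumerate(obs):
        emit = [math.exp(log_gauss(x, means[j], var[j])) for j in range(n)]
        if t == 0:
            alpha = [start[j] * emit[j] for j in range(n)]
        else:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence has zero probability here
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        total += math.log(scale)
    return total


def recognize(obs, models):
    """Pick the model name with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, models[name]))


def left_to_right(means):
    """Hypothetical two-state left-to-right model with fixed variance."""
    return {
        "start": [1.0, 0.0],
        "trans": [[0.5, 0.5], [0.0, 1.0]],
        "means": means,
        "var": [(25.0, 25.0, 25.0)] * 2,
    }


# Toy models, hand-set rather than trained.
models = {
    "move_right": left_to_right([(10, 0, 0), (10, 0, -10)]),
    "move_left": left_to_right([(-10, 0, 0), (-10, 0, 10)]),
}

# The same motion performed by two signers standing in different places.
signer_a = [(0, 0, 1000), (10, 0, 1000), (20, 0, 990)]
signer_b = [(x + 50, y - 30, z + 200) for x, y, z in signer_a]

features_a = delta_features(signer_a)
features_b = delta_features(signer_b)
print(features_a == features_b)       # True
print(recognize(features_a, models))  # move_right
```

In this toy case the two signers' absolute coordinates differ by a constant offset, yet their difference features are identical, which is the kind of signer variation the abstract says the change-based feature is meant to absorb. A real system would estimate the model parameters from training data rather than set them by hand.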
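Table of contents
Chapter 1 Introduction 1
1.1 Research Motivation 1
1.2 Research Method 2
1.3 Thesis Organization 3
Chapter 2 Related Work and Background 4
2.1 Related Work 4
2.2 Related Technologies 7
2.2.1 Kinect 8
2.2.2 Depth Image Construction 8
2.2.3 Basic OpenNI Architecture 10
2.2.4 OpenNI Human Skeleton Tracking System 11
2.2.5 Hidden Markov Models (HMMs) 14
2.2.5.1 Markov Chains 14
2.2.5.2 Extension of Markov Chains (Hidden Markov Models) 17
2.2.5.3 The Three Basic Problems of Hidden Markov Models 19
2.2.5.4 Types of Hidden Markov Models 28
2.2.5.5 Continuous Hidden Markov Model 29
2.2.5.6 Scaling 31
Chapter 3 Sign Language Recognition Based on Hidden Markov Models 35
3.1 System Architecture 35
3.2 Coordinate System Normalization 36
3.2.1 OpenNI Coordinate System 36
3.2.2 Homogeneous Coordinates 37
3.3 HMM-Based Sign Language Recognition System 40
3.3.1 Feature Extraction 41
3.3.2 HMM Training and Recognition 45
3.3.2.1 Model Training 45
3.3.2.2 Sign Recognition 47
3.3.3 Experimental Data 50
Chapter 4 Experimental Results 58
4.1 Experimental Environment 58
4.2 Recognition without Depth 58
4.3 Signer-Dependent Recognition 62
4.4 Signer-Independent Recognition 70
Chapter 5 Conclusions and Future Work 83
5.1 Conclusions 83
5.2 Future Work 84
References 85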

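Figure 1.1 XBOX 360 motion-sensing game console 2
Figure 2.1 Gaussian distribution of human skin color [7] 6
Figure 2.2 Kinect [25] 8
Figure 2.3 Illustration of Light Coding technology [33] 9
Figure 2.4 OpenNI architecture [29] 10
Figure 2.5 The 15 joints defined by OpenNI, obtained through NITE [31] 12
Figure 2.6 Human skeleton analysis workflow [31] 12
Figure 2.7 PSI [31] 13
Figure 2.8 Weather model example [3] 15
Figure 2.9 Urn-and-ball model [2] 17
Figure 2.10 Forward algorithm [28] 21
Figure 2.11 Backward algorithm [28] 22
Figure 2.12 Viterbi algorithm [28] 23
Figure 2.13 Parameter illustration [18] 25
Figure 2.14 Parameters of the Baum-Welch algorithm [28] 26
Figure 2.15 Ergodic model and left-to-right model [20] 28
Figure 3.1 System flowchart 35
Figure 3.2 Illustration of the OpenNI coordinate system [31] 37
Figure 3.3 Illustration of translation as a geometric coordinate transformation [25] 38
Figure 3.4 Torso-centered coordinate system [25] 39
Figure 3.5 Original Kinect skeleton coordinates [25] 39
Figure 3.6 Skeleton coordinates after normalization [25] 40
Figure 3.7 Illustration of movement epenthesis 41
Figure 3.8 Comparison of 3D coordinate features of both hands 42
Figure 3.9 Comparison of distance features 42
Figure 3.10 Illustration of angle features 43
Figure 3.11 Skeleton tracking failure 44
Figure 3.13 HMM training flowchart 45
Figure 3.14 Recognition flowchart 48
Figure 3.15 Sign illustration (A) 50
Figure 3.16 Sign illustration (B) 51
Figure 3.17 Sign illustration (C) 51
Figure 3.18 Sign illustration (D) 52
Figure 3.19 Sign illustration (E) 52
Figure 3.20 Sign illustration (F) 53
Figure 3.21 Illustration of similar sign motions (A) 54
Figure 3.22 Illustration of similar sign motions (B) 54
Figure 3.23 Illustration of similar sign motions (C) 55
Figure 3.24 Illustration of similar sign motions (D) 55
Figure 3.25 Illustration of similar sign motions (E) 56
Figure 3.26 Illustration of similar sign motions (F) 56
Figure 3.27 Illustration of similar sign motions (G) 56
Figure 3.28 Illustration of similar sign motions (H) 57
Figure 3.29 Illustration of similar sign motions (I) 57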

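Table 2.1 Recognition rate versus vocabulary size [7] 7
Table 4.1-1 Confusion matrix for signer A without depth 59
Table 4.1-2 Confusion matrix for signer A without depth 60
Table 4.2 Average recognition rate of 5 signers without depth 61
Table 4.3-1 Signer-dependent confusion matrix, 6-dimensional features 63
Table 4.3-2 Signer-dependent confusion matrix, 6-dimensional features 64
Table 4.3-3 Signer-dependent confusion matrix, 6-dimensional features 65
Table 4.4-1 Signer-dependent confusion matrix, 10-dimensional features 66
Table 4.4-2 Signer-dependent confusion matrix, 10-dimensional features 67
Table 4.4-3 Signer-dependent confusion matrix, 10-dimensional features 68
Table 4.5 Signer-dependent comparison of 10-dimensional and 6-dimensional features 69
Table 4.6-1 Signer-independent confusion matrix, 6-dimensional features, 20 words 70
Table 4.6-2 Signer-independent confusion matrix, 6-dimensional features, 20 words 71
Table 4.7-1 Signer-independent confusion matrix, 10-dimensional features, 20 words 72
Table 4.7-2 Signer-independent confusion matrix, 10-dimensional features, 20 words 73
Table 4.8 Signer-independent comparison of 10-dimensional and 6-dimensional features, 20 words 74
Table 4.9-1 Signer-independent confusion matrix, 6-dimensional features, 30 words 75
Table 4.9-2 Signer-independent confusion matrix, 6-dimensional features, 30 words 76
Table 4.9-3 Signer-independent confusion matrix, 6-dimensional features, 30 words 77
Table 4.10-1 Signer-independent confusion matrix, 10-dimensional features, 30 words 78
Table 4.10-2 Signer-independent confusion matrix, 10-dimensional features, 30 words 79
Table 4.10-3 Signer-independent confusion matrix, 10-dimensional features, 30 words 80
Table 4.11 Signer-independent comparison of 10-dimensional and 6-dimensional features, 30 words 81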

References
[1] Lang S., Block-Berlitz M., Rojas R.: "Sign Language Recognition using Kinect," Artificial Intelligence and Soft Computing (ICAISC), 2012.

[2] Lang S., Block-Berlitz M., Rojas R.: "Sign Language Recognition with Kinect," Bachelor thesis, Freie Universität Berlin, Institut für Informatik, 2011.

[3] Rabiner L.R.: "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proceedings of the IEEE, Vol. 77, No. 2, 1989, pp. 257-286.

[4] Rahimi A.: "An Erratum for 'A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition'," website of Ali Rahimi at MIT Media Laboratory, http://xenia.media.mit.edu/~rahimi/rabiner/rabinererrata/rabiner-errata.html, 2000.

[5] Kelly D., McDonald J., Markham C.: "Recognizing Spatiotemporal Gestures and Movement Epenthesis in Sign Language," IMVIP '09, ISBN: 978-0-76953-769-2, IEEE Computer Society, Washington, DC, USA, 2009.

[6] Kelly D., Delannoy J. R., McDonald J., Markham C.: "A Framework for Continuous Multimodal Sign Language Recognition," ISBN: 978-1-60558-772-1, ACM, 2009, pp. 351-358.

[7] M. Mohandes, M. Deriche, U. Johar, S. Ilyas: "A signer-independent Arabic Sign Language recognition system using face detection, geometric features, and a Hidden Markov Model," Computers and Electrical Engineering, 2012, pp. 422–433.

[8] Hyeon-Kyu Lee, Jin H. Kim: "An HMM-Based Threshold Model Approach for Gesture Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 10, 1999.

[9] Kelly D., McDonald J., Markham C.: "Continuous Recognition of Motion Based Gestures in Sign Language," IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 2009.

[10] Koki Ariga, Shinji Sako, Tadashi Kitamura: "HMM-based Sign Recognition in Consideration of Motion Diversity," IUCS2010, 2010.

[11] William C. Stokoe, Jr.: "Sign Language Structure: An Outline of the Visual Communication Systems of the American Deaf," Journal of Deaf Studies and Deaf Education, Vol. 10, No. 1, pp. 3-37, Winter 2005.

[12] Hee-Deok Yang, Stan Sclaroff, Seong-Whan Lee: "Sign Language Spotting with a Threshold Model Based on Conditional Random Fields," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, No. 7, 2009.

[13] Son Lam Phung, Douglas Chai, Abdesselam Bouzerdoum: "Skin Color Based Face Detection," Seventh Australian and New Zealand Intelligent Information Systems Conference, 18-21 November 2001.

[14] Jiyong Ma, Wen Gao, Jiangqin Wu, Chunli Wang: "A Continuous Chinese Sign Language Recognition System," Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 428-433, 2000.

[15] Gaolin Fang, Wen Gao: "A SRN/HMM System for Signer-independent Continuous Sign Language Recognition," Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition (FGR02), 2002.

[16] Chunli Wang, Wen Gao, Shiguang Shan: "An Approach Based on Phonemes to Large Vocabulary Chinese Sign Language Recognition," Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition (FGR02), 2002.

[17] Gaolin Fang, Wen Gao, Jiyong Ma: "Signer-Independent Sign Language Recognition Based on SOFM/HMM," Proceedings of the IEEE ICCV Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, p. 90, 2001.

[18] Xiaolin Li: "Training Hidden Markov Models with Multiple Observations – A Combinatorial Method," IEEE Transactions on PAMI, Vol. PAMI-22, No. 4, pp. 371-377, April 2000.

[19] Jörg Zieren, Karl-Friedrich Kraiss: "Robust Person-Independent Visual Sign Language Recognition," IbPRIA 2005, LNCS 3522, pp. 520–528, 2005.

[20] Jean-Paul van Oosten: "Can Markov properties be learned by hidden Markov modelling algorithms?," Master's thesis, Artificial Intelligence, Department of Artificial Intelligence, University of Groningen, The Netherlands, 2010.

[21] Nianjun Liu, Richard I.A. Davis, Brian C. Lovell, Peter J. Kootsookos: "Effect of Initial HMM Choices in Multiple Sequence Training for Gesture Recognition," Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04), 2004.

[22] Douglas A. Reynolds, Richard C. Rose: "Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models," IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, January 1995.

[23] Wen-Hsian L, Chang-Tsun L.: "Skin color-based face detection in color images," AVSS '06, IEEE International Conference on Video and Signal Based Surveillance, 2006, pp. 56–60.

[24] Albitar I.C., Graebling P., Doignon C.: "Robust Structured Light Coding for 3D Reconstruction," IEEE 11th International Conference on Computer Vision, pp. 1-6, 2007.

[25] 鍾瑞琪: "A Kinect-Based Evaluation System for Active Joint Motion Rehabilitation" (基於Kinect之主動關節運動復健評估系統), Master's thesis, Department of Electrical Engineering, Tamkang University, 2012.

[26] 張志瑜: "A Lip-Reading Recognition System Based on Hidden Markov Models" (基於隱藏式馬可夫模型之唇語辨識系統), Master's thesis, Department of Electrical Engineering, Tamkang University, 2009.

[27] 陳萬成: "New Methods for Robust Speaker Identification in Noisy Environments" (雜訊環境下強健性語者辨認的新方法), Doctoral dissertation, Department of Electrical Engineering, Tamkang University, 2009.

[28] http://www.csie.ntnu.edu.tw/~u91029/HiddenMarkovModel.html

[29] http://www.openni.org/

[30] http://www.opencv.org.cn/index.php/%E9%A6%96%E9%A1%B5

[31] https://kheresy.wordpress.com/index_of_openni_and_kinect/

[32] http://www.xbox.com/zh-TW/Kinect

[33] http://www.bb.ustc.edu.cn/jpkc/guojia/dxwlsy/kj/part2/grade3/LaserSpeckle.html

[34] http://www.primesense.com/solutions/nite-middleware/
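Thesis access permissions
  • The print copy is licensed free of charge for reproduction by in-library readers for academic purposes; available from 2015-07-30.
  • The electronic full text is licensed for browsing and printing; available from 2015-07-30.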