Tamkang University Chueh Sheng Memorial Library (TKU Library)
System ID: U0002-1608200607581800
Title (Chinese): 聯結k最近鄰域與支向機之模式識別
Title (English): Associating kNN and SVM for Pattern Recognition
University: Tamkang University
Department (Chinese): 機械與機電工程學系碩士班 (Master's Program, Department of Mechanical and Electro-Mechanical Engineering)
Department (English): Department of Mechanical and Electro-Mechanical Engineering
Academic Year: 94 (ROC calendar; 2005–2006)
Semester: 2
Year of Publication: 95 (ROC calendar; 2006)
Author (Chinese): 許哲彰
Author (English): Che-Chang Hsu
E-mail: worthwhilekimo@yahoo.com.tw
Student ID: 692342792
Degree: Master's
Language: English
Date of Oral Defense: 2006-06-05
Number of Pages: 84
Defense Committee:
  Advisor: 楊智旭
  Co-advisor: 楊棧雲
  Committee Member: 田豐
  Committee Member: 楊智旭
  Committee Member: 湯敬民
  Committee Member: 林智仁
  Committee Member: 楊棧雲
Keywords (Chinese): 支持向量機, k最近鄰域法, 損失函數, 模式識別
Keywords (English): Support Vector Machine (SVM), k-Nearest-Neighbor (kNN), Loss Function, Pattern Recognition
Subject Classification: Applied Science – Mechanical Engineering
Abstract (Chinese, translated): This thesis develops an improved pattern recognition method that incorporates the non-parametric k-nearest-neighbor (kNN) rule into a support vector machine (SVM) in order to identify unusually situated yet meaningful training examples, in particular the hard-to-classify examples located in the overlapping region between the two classes. The method carries out pattern recognition in two stages: a kNN emphasizer first filters and screens out examples of this kind, and the SVM then performs classification based on the filtered information. Over the two-stage process, heavier penalty values are assigned to the difficult examples that require extra care to separate, yielding a stricter loss function, and this treatment indeed works well in the optimization of the system.
In the first stage, the kNN emphasizer collects information from the training examples and produces a weight for each example. In the second stage, the SVM uses parameterized real-valued class labels to carry the training-example information, at its various weight levels, into the classification. The decision boundary obtained by this improved classifier therefore leads to a more accurate classification. The rationale for adopting the kNN emphasizer lies in the advantage it shows in estimating local density: examples that are obscured by the rest of the training set can be assessed simply from the local region in which they fall. The model thus provides a means of emphasizing overlooked, locally isolated examples and examples that are hard to separate, that is, precisely the examples that need additional guidance to be distinguished. Numerical simulation results show that the method indeed improves the handling of examples obscured by the training set and retains good computational behavior when the system is subject to parameter perturbations.
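As a rough sketch of the first stage only, the snippet below assigns each training example a weight that grows with the fraction of opposite-class points among its k nearest neighbors, so that examples in the class-overlap region are emphasized. The specific weighting rule, the function name knn_emphasizer, and the way the acceleration factor η enters are illustrative assumptions; the thesis's actual emphasizer is defined in Section 2.2.

# Minimal sketch of a kNN "emphasizer" (illustrative only): weight each training
# example by how mixed its k-nearest neighborhood is. The exact rule and the role
# of the acceleration factor eta are assumptions, not the thesis's definition.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_emphasizer(X, y, k=19, eta=4.0):
    """Return one emphasis weight per example; y holds labels in {-1, +1}."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)               # column 0 is each point itself
    neighbor_labels = y[idx[:, 1:]]         # labels of the k true neighbors
    disagreement = np.mean(neighbor_labels != y[:, None], axis=1)
    return 1.0 + eta * disagreement         # difficult examples get larger weights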
Abstract (English): This thesis proposes a new model that merges a non-parametric k-nearest-neighbor (kNN) method into an underlying support vector machine (SVM) to emphasize meaningful training examples, especially the difficult examples located in the overlapping region of the training set. The model consists of a kNN-emphasizer filtering stage followed by a classical classification stage. The motivating idea is to impose heavier penalties on the difficult examples and thereby obtain a stricter loss function for optimization, which effectively emphasizes those examples. In the first stage, the kNN emphasizer collects information from the training examples and produces an emphasized weight for each of them. In the second stage, an underlying SVM with parameterized real-valued class labels carries this information, at various confidence levels, into the classification. The idea of real-valued class labels for conveying the emphasized weights provides an effective way to pursue the optimal solution informed by the additional information. The convex programming of the SVM therefore yields a slightly altered decision boundary, leading to more accurate classification in training. The adoption of the kNN emphasizer is justified by local density estimation, which can screen out the difficult examples in the training set by considering only the local situation of the examples themselves. The proposed model thus provides a way to emphasize the substantial yet subtle examples in the learning process, especially the difficult ones. Detailed experimental evidence is presented to address the corresponding properties, all of which are consistent between the theoretical derivations and the related experimental results.
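For the second stage, the thesis carries the kNN-derived weights into the SVM through parameterized real-valued class labels ỹi. As a loose analogue only (not the thesis's formulation), a per-example penalty can be expressed with scikit-learn's sample_weight, which rescales each example's slack term in the soft-margin objective rather than altering its label:

# Hedged analogue of the second stage: feed the emphasizer weights to a standard
# soft-margin SVM as per-example penalties. The thesis instead encodes the weights
# in real-valued class labels; sample_weight only rescales each C*xi_i term, so
# this is an approximation for illustration.
from sklearn.svm import SVC

def fit_emphasized_svm(X, y, weights, C=10.0):
    clf = SVC(kernel="rbf", C=C)
    clf.fit(X, y, sample_weight=weights)    # heavier penalties on difficult examples
    return clf

# Example usage (X, y as NumPy arrays with labels in {-1, +1}):
#   w = knn_emphasizer(X, y, k=19, eta=4.0)
#   model = fit_emphasized_svm(X, y, w, C=10.0)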
Table of Contents
Chinese Abstract Ⅰ
Abstract Ⅲ
List of Figures Ⅶ
List of Tables Ⅸ
List of Symbols Ⅹ
Chapter 1 Introduction 1
Chapter 2 Associating kNN with SVM 4
2.1 Loss Functions in SVMs 4
2.2 A Preprocessor Based on kNN 6
2.3 SVMs with Weighted Class Labels 10
2.4 Connecting with kNN 16
2.5 Change of Loss Due to the Association of kNN and SVM 17
Chapter 3 Experimental Results and Discussions 19
3.1 Changes in Training Phase 20
3.2 Classification Improvement in Neighborhood of Heterogeneous Examples 27
3.3 Generalization Performance via K-fold Cross Validation 37
3.4 Robustness to the Changes of C 42
3.5 Effects from either Adopting the kNN Emphasizer or Large Value of C 46
3.6 Effects on Support Vectors 52
Chapter 4 Conclusions 80
References 82
List of Figures
Fig. 1. Surrogates of loss function in statistical learning 6
Fig. 2. An example of voting kNN rule 8
Fig. 3. Model of kNN-SVM 17
Fig. 4. A comparison of decision boundary from both classical SVM and kNN-SVM with equivalent parameter settings 22
Fig. 5. The reductions of both margin and classification error with different settings of acceleration factor η (for k = 19) 25
Fig. 6. The reduction of both margin and classification error with varying k = 3~31 (η = 4) 27
Fig. 7. An illustrated instance to generate a subset of validation examples around the difficult example 29
Fig. 8. The validation set is randomly generated from the difficult examples 31
Fig. 9. A comparison instance of the classification between the classical SVM and kNN-SVM 33
Fig. 10. The comparison of the classification of the full set of validation examples 35
Fig. 11. Assessments and comparisons of generalization performance varying exponential grids of C or C˜ by the K-fold validation 40
Fig. 12. Error curves to describe the relations between classification error and model complexity 41
Fig. 13. Plot for margin sensitivity (Table 4) 45
Fig. 14. Generalization errors for the competing models 49
Fig. 15. Empirical risk in the competing models 49
Fig. 16. Margin in the competing models 50
Fig. 17. The changes in SVs and corresponding decision boundary on the dataset Twonorm using classical SVM varying values of C exponentially from 0.1 to 10^5 58
Fig. 18. The SVs counts of the results of Fig. 17 58
Fig. 19. The changes in SVs and corresponding decision boundary on the dataset Twonorm using kNN-SVM varying values of C˜ exponentially from 0.1 to 10^5. Parameters k = 25 and η = 1 are chosen for these examples 66
Fig. 20. The SVs counts of the results of Fig. 19 66
Fig. 21. Stacked bar charts showing the variations of SVs counts, varying odd k from 3 to 101 with different values of C˜ (C˜ = 10, 1, 0.1) 72
Fig. 22. Instance decision boundaries with acquired SVs 75
Fig. 23. 3-D view of Fig. 22(a) for inspecting the locations of SVs 77
Fig. 24. 3-D view of Fig. 22(b) for inspecting the locations of SVs 78
Fig. 25. 3-D view of Fig. 22(c) for inspecting the locations of SVs 79
List of Tables
Table 1. Details of Datasets 20
Table 2. Real-valued class label ỹi provides a stricter penalty in the optimization 23
Table 3. Tests of validation sets with standard deviation around the difficult examples 36
Table 4. Comparison of margin sensitivity in different settings of k. k = 3, 9, 15, 21 and η = 1 44
Table 5. Margin SVs and non-margin SVs, along with reference values of margin and training error, for observing the support vector variations in the classical SVM 61
Table 6. Variations of margin SVs, non-margin SVs, and total SVs in the model of kNN-SVM. Various values of C˜ with fixed k = 25 and η = 1 were adopted in the experiment. 68
Table 7. Variations of margin SVs, non-margin SVs, and total SVs in the model of kNN-SVM. Various values of C˜ with fixed k = 101 and η = 1 were adopted in the experiment. 69

List of Symbols
Remp Empirical Risk
L Loss Function
S Training Examples
xi Input Patterns
d Dimensional Feature Space
yi Class Labels by Classical SVM
ỹi Real-Valued Class Label by kNN-SVM
f(xi) Decision Function by Classical SVM
f˜(xi) Decision Function by kNN-SVM
P(ωi|xi) Posterior Probability
η Acceleration Factor
ξi = ξi˜ Slack Variables
ρ Margin by Classical SVM
ρ˜ Margin by kNN-SVM
C = C˜ Penalty Factor
w = w̃ A Vector Orthogonal to the Hyperplane
b˜ = b Bias Term
K Number of Folds in K-fold Cross-Validation
k Number of Nearest Neighbors (kNN)
H = H˜ Structure of Structural Risk Minimization
sρ Margin Sensitivity by Classical SVM
s˜ρ Margin Sensitivity by kNN-SVM
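Taken together, the symbols above correspond to the standard soft-margin SVM and, for the kNN-SVM, a variant in which the binary labels yi are replaced by real-valued labels ỹi carrying the emphasizer weights. The following LaTeX reconstruction of that pairing is a sketch under that assumption; the thesis's exact formulation is developed in Sections 2.1–2.3.

% Classical soft-margin SVM
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_{i}
\quad\text{s.t.}\quad y_{i}\bigl(w\cdot x_{i}+b\bigr)\ \ge\ 1-\xi_{i},\qquad \xi_{i}\ge 0.

% kNN-SVM (assumed form): real-valued labels \tilde{y}_i carry the emphasizer weights
\min_{\tilde{w},\,\tilde{b},\,\tilde{\xi}}\ \tfrac{1}{2}\lVert\tilde{w}\rVert^{2} + \tilde{C}\sum_{i=1}^{n}\tilde{\xi}_{i}
\quad\text{s.t.}\quad \tilde{y}_{i}\bigl(\tilde{w}\cdot x_{i}+\tilde{b}\bigr)\ \ge\ 1-\tilde{\xi}_{i},\qquad \tilde{\xi}_{i}\ge 0.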
References
1. Vapnik, V. N.: The Nature of Statistical Learning Theory. Springer-Verlag, Berlin Heidelberg New York (1995)
2. Vapnik, V. N.: An Overview of Statistical Learning Theory. IEEE Transactions on Neural Networks, Vol. 10 (1999) 988–999
3. Schölkopf, B., Smola, A. J.: Learning with Kernels. MIT Press, Cambridge, MA (2002) 61–67
4. Cover, T. M., Hart, P. E.: Nearest Neighbor Pattern Classification. IEEE Transactions on Information Theory, Vol. 13 (1967) 21–27
5. Duda, R. O., Hart, P. E.: Pattern Classification and Scene Analysis. Wiley, New York (1973)
6. Fukunaga, K.: Statistical Pattern Recognition. 2nd edn. Academic Press, San Diego, CA (1990)
7. Cortes, C., Vapnik, V. N.: Support Vector Networks. Machine Learning, Vol. 20 (1995) 273–297
8. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer-Verlag, Berlin Heidelberg New York (2001) 308–312, 371–380
9. Bartlett, P. L., Jordan, M. I., McAuliffe, J. D.: Convexity, Classification, and Risk Bounds. Technical Report 638, Department of Statistics, UC Berkeley, CA (2003)
10. Shawe-Taylor, J., Bartlett, P. L., Williamson, R. C., Anthony, M.: Structural Risk Minimization over Data-Dependent Hierarchies. IEEE Transactions on Information Theory, Vol. 44, No. 5 (1998) 1926–1940
11. Vapnik, V. N.: Statistical Learning Theory. Wiley, New York (1998)
12. Yang, C. -Y., Chou, J. -J., Yang, J. -S.: A Method to Improve Classification Performance of Ethnic Minority Classes in the k-Nearest-Neighbor Rule. Proceedings of the IEEE International Conference on Computational Cybernetics, Siófok, Hungary (2003)
13. Yang, C. -Y.: Support Vector Classifier with a Fuzzy-Value Class Label. Lecture Notes in Computer Science, Vol. 3173, Springer-Verlag, Berlin New York (2004) 506–511
14. Breiman, L.: Bias, Variance and Arcing Classifiers. Technical Report 460, Statistics Department, University of California, CA (1996)
15. Murphy, P. M.: UCI-Benchmark Repository of Artificial and Real Data Sets, http://www.ics.uci.edu/~mlearn, University of California Irvine, CA (1995)
16. Vlachos, P., Meyer, M.: StatLib, http://lib.stat.cmu.edu/, Department of Statistics, Carnegie Mellon University (1989)
17. Schölkopf, B., Smola, A. J., Müller, K. R.: Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation, Vol. 10 (1998) 1299–1319
18. Smola, A. J., Bartlett, P. L., Schölkopf, B., Schuurmans, D.: Advances in Large Margin Classifiers. MIT Press, Cambridge, MA (2000)
19. Schölkopf, B., Smola, A. J.: Learning with Kernels. MIT Press, Cambridge, MA (2002) 189–211
20. Pontil, M., Verri, A.: Properties of Support Vector Machines. Neural Computation, Vol. 10 (1998) 977–996
21. Vapnik, V. N., Chapelle, O.: Bounds on Error Expectation for SVM. In Smola, A. J., Bartlett, P. L., Schölkopf, B., Schuurmans, D., Eds., Advances in Large Margin Classifiers. (2000) 261–280
22. Duda, R., Hart, P., Stork, D.: Pattern Classification, 2nd edn. John Wiley and Sons, Inc., New York, NY (2000) 161–184
23. Webb, A.: Statistical Pattern Recognition. 2nd edn. John Wiley & Sons. (2002)
24. Dudani, S.: The Distance-Weighted k-Nearest-Neighbor rule. IEEE Transactions on Systems, Man and Cybernetics, Vol. 6. (1976) 325–327
25. Keller, J. M., Gray, M. R., Givens, J. A.: A Fuzzy k-Nearest Neighbors Algorithm, IEEE Transactions on Systems, Man, and Cybernetics, Vol. 15. (1985) 580–585
26. Schölkopf, B., Smola, A. J.: Learning with Kernels. MIT Press, Cambridge, MA (2002) 25–60
27. Cao, R., Cuevas, A., González-Manteiga, W.: A Comparative Study of Several Smoothing Methods in Density Estimation. Computational Statistics and Data Analysis, Vol. 17 (1994) 153–176
28. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer-Verlag, Berlin Heidelberg New York (2001) 196–198
29. Vapnik, V. N., Bottou, L.: Local Learning Algorithms. Neural Computation, 4(6). (1992) 888–900
30. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer-Verlag, Berlin Heidelberg New York (2001) 14–27
31. Vapnik, V. N., Golowich, S., Smola, A. J.: Support vector method for function approximation, regression estimation, and signal processing, in Mozer, M. C., Jordan, M. I., and Petsche, T., Eds., Advances in Neural Information Processing Systems, 9:281–287, MIT Press, Cambridge, MA (1997)
32. Smola, A. J., Schölkopf, B.: A tutorial on support vector regression, NeuroCOLT2 Technical Report NC2-TR-1998–030 (1998)
33. Smola, A. J., Schölkopf, B.: Sparse greedy matrix approximation for machine learning, in Langley, P., Ed., Proceedings of ICML'00, Morgan Kaufmann, San Francisco, (2000) 911–918
34. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and other kernel-based learning methods. Cambridge, UK: Cambridge University Press. (2000) 100–101
35. Yang, C. -Y.: Highlight the importance of outliers to boost support vector machines. (2005) (Submitted)
Thesis Access Permissions
  • The author agrees to grant a royalty-free license for in-library readers to reproduce the print copy for academic purposes, with public release on 2008-08-16.
  • The author agrees to authorize browsing/printing of the electronic full text, publicly available from 2008-08-16.

