System ID | U0002-1608200607581800 |
---|---|
DOI | 10.6846/TKU.2006.00448 |
Title (Chinese) | 聯結k最近鄰域與支向機之模式識別 |
Title (English) | Associating kNN and SVM for Pattern Recognition |
Title (third language) | |
University | 淡江大學 (Tamkang University) |
Department (Chinese) | 機械與機電工程學系碩士班 |
Department (English) | Department of Mechanical and Electro-Mechanical Engineering |
Foreign degree: university | |
Foreign degree: college | |
Foreign degree: institute | |
Academic year | 94 |
Semester | 2 |
Year of publication | 95 (2006) |
Author (Chinese) | 許哲彰 |
Author (English) | Che-Chang Hsu |
Student ID | 692342792 |
Degree | Master's |
Language | English |
Second language | |
Oral defense date | 2006-06-05 |
Number of pages | 84 |
Examination committee | Advisor: 楊智旭; Co-advisor: 楊棧雲; Members: 田豐, 楊智旭, 湯敬民, 林智仁, 楊棧雲 |
Keywords (Chinese) | 支持向量機; k最近鄰域法; 損失函數; 模式識別 |
Keywords (English) | Support Vector Machine (SVM); k-Nearest-Neighbor (kNN); Loss Function; Pattern Recognition |
Keywords (third language) | |
Subject classification | |
Abstract (Chinese) |
本論文建立一個改良型的模式識別方法,它結合無母數k最近鄰域法(kNN)於支持向量機(SVM),用來辨識情勢特殊而又深具意義的訓練樣本,尤其是座落於兩類別重疊區域間難以辨識的樣本。本方法區分為兩階段來進行模式識別:首先使用k最近鄰域加強器來過濾篩選上述類型樣本,然後根據這些過濾後的資料來進行支持向量機的分類。因此在兩階段分類的過程中,將對這些需要技術性指導才能加以區分的困難樣本形成較大的懲罰值,以獲取較嚴峻的損失函數,而這樣的處理確實在系統的最佳化過程中產生不錯的效果。在第一階段使用k最近鄰域加強器來收集訓練樣本的資訊,並且給予每一筆樣本一個權重值;然後在第二階段,由支持向量機使用參數化實數值類別標籤來攜帶各種不同層次權重的訓練樣本資訊以做分類。因此依照這種改良型分類器的演繹程序所得到的決策邊界,可引導出較為精準的分類結果。採用k最近鄰域加強器的緣由,可由其在估算區域密度時所展現的優勢得知:對於那些被訓練集所遮蔽的樣本點,能夠僅只考量它們本身所散落的區域位置即可。這樣的模型提供一種方法,用來強調被忽視的小區域孤立樣本以及難以區隔的樣本,也就是主要用來處理這些需要較多技術性指導才能加以區隔的樣本。經由數值模擬結果可知,本方法確實可以改善被訓練集所遮蔽的樣本點之辨識,並且當系統受到參數擾動時仍保持良好的運算特質。 |
Abstract (English) |
This thesis develops a model that merges a non-parametric k-nearest-neighbor (kNN) method into an underlying support vector machine (SVM) to emphasize meaningful training examples, especially the difficult examples located in the overlapping region of the training set. The model consists of a kNN-emphasizer filtering stage followed by a classical classification stage. By adding heavier penalties to the difficult examples, the model attains a stricter loss function for optimization and thereby effectively emphasizes those examples. First, the kNN emphasizer collects information from the training examples and produces an emphasized weight for each example. Then, an underlying SVM with parameterized real-valued class labels carries this information, at various confidence levels, into the classification. The novel idea of real-valued class labels conveying the emphasized weights permits an effective pursuit of the optimal solution informed by the additional information. The convex programming of the SVM therefore produces a slight alteration of the resulting decision boundary, which leads to more accurate classification in training. The adoption of the kNN emphasizer is justified by local density estimation, which can screen out the difficult examples in the training set by considering merely the local situation of the examples themselves. The proposed model thus provides a way to emphasize substantial yet subtle examples in the learning process, especially the difficult ones. Detailed experimental evidence is presented to address the corresponding properties, and the theoretical derivations are consistent with the related experimental results. |
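The two-stage procedure described in the abstract can be sketched in code. This is a minimal illustration, not the thesis's actual formulation: here the kNN emphasizer weights each example by the fraction of opposite-class points in its k-neighborhood, and the emphasis enters the SVM through scikit-learn's per-sample `sample_weight` (which scales each example's penalty) rather than through parameterized real-valued class labels ỹi. The function name `knn_emphasizer` and the weighting formula are assumptions; the settings k = 19 and η = 4 follow the parameter values reported in the thesis's figures.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

def knn_emphasizer(X, y, k=19, eta=4.0):
    """Weight each example by the fraction of opposite-class
    neighbors in its k-neighborhood (1 = easy, up to 1 + eta)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)          # idx[:, 0] is the point itself
    disagree = (y[idx[:, 1:]] != y[:, None]).mean(axis=1)
    return 1.0 + eta * disagree        # difficult points weigh more

# Two overlapping classes, so some examples fall in the mixed region.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

# Stage 1: the kNN emphasizer produces a weight per training example.
weights = knn_emphasizer(X, y, k=19, eta=4.0)

# Stage 2: the SVM trains with heavier penalties on the difficult examples.
clf = SVC(kernel="rbf", C=1.0).fit(X, y, sample_weight=weights)
print(clf.score(X, y))
```

Multiplying the penalty of a boundary-region example effectively raises its share of the loss, which is the same intent as the stricter loss function described above, even though the mechanism differs from the real-valued-label formulation.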
Abstract (third language) | |
Table of contents |
Contents
中文摘要 I
Abstract III
List of Figures VII
List of Tables IX
List of Symbols X
Chapter 1 Introduction 1
Chapter 2 Associating kNN with SVM 4
2.1 Loss Functions in SVMs 4
2.2 A Preprocessor Based on kNN 6
2.3 SVMs with Weighted Class Labels 10
2.4 Connecting with kNN 16
2.5 Change of Loss Due to the Association of kNN and SVM 17
Chapter 3 Experimental Results and Discussions 19
3.1 Changes in Training Phase 20
3.2 Classification Improvement in Neighborhood of Heterogeneous Examples 27
3.3 Generalization Performance via K-fold Cross Validation 37
3.4 Robustness to the Changes of C 42
3.5 Effects from either Adopting the kNN Emphasizer or Large Value of C 46
3.6 Effects on Support Vectors 52
Chapter 4 Conclusions 80
References 82

List of Figures
Fig. 1. Surrogates of loss function in statistical learning 6
Fig. 2. An example of the voting kNN rule 8
Fig. 3. Model of kNN-SVM 17
Fig. 4. A comparison of decision boundaries from the classical SVM and kNN-SVM with equivalent parameter settings 22
Fig. 5. The reduction of both margin and classification error with different settings of the acceleration factor η (for k = 19) 25
Fig. 6. The reduction of both margin and classification error with varying k = 3–31 (η = 4) 27
Fig. 7. An illustrated instance of generating a subset of validation examples around a difficult example 29
Fig. 8. The validation set randomly generated from the difficult examples 31
Fig. 9. A comparison instance of the classification between the classical SVM and kNN-SVM 33
Fig. 10. The comparison of the classification of the full set of validation examples 35
Fig. 11. Assessments and comparisons of generalization performance over exponential grids of C or C̃ by K-fold validation 40
Fig. 12. Error curves describing the relation between classification error and model complexity 41
Fig. 13. Plot of margin sensitivity (Table 4) 45
Fig. 14. Generalization errors of the competing models 49
Fig. 15. Empirical risk in the competing models 49
Fig. 16. Margin in the competing models 50
Fig. 17. The changes in SVs and the corresponding decision boundary on the dataset Twonorm using the classical SVM, varying C exponentially from 0.1 to 10^5 58
Fig. 18. The SV counts for the results of Fig. 17 58
Fig. 19. The changes in SVs and the corresponding decision boundary on the dataset Twonorm using kNN-SVM, varying C̃ exponentially from 0.1 to 10^5; k = 25 and η = 1 are chosen for these examples 66
Fig. 20. The SV counts for the results of Fig. 19 66
Fig. 21. Stacked bar charts showing the variations of SV counts, varying odd k from 3 to 101 with different C̃ values (C̃ = 10, 1, 0.1) 72
Fig. 22. Instance decision boundaries with acquired SVs 75
Fig. 23. 3-D view of Fig. 22(a) for inspecting the locations of SVs 77
Fig. 24. 3-D view of Fig. 22(b) for inspecting the locations of SVs 78
Fig. 25. 3-D view of Fig. 22(c) for inspecting the locations of SVs 79

List of Tables
Table 1. Details of the datasets 20
Table 2. The real-valued class label ỹi provides a stricter penalty in the optimization 23
Table 3. Tests of validation sets with standard deviation around the difficult examples 36
Table 4. Comparison of margin sensitivity for different settings of k (k = 3, 9, 15, 21; η = 1) 44
Table 5. Margin SVs and non-margin SVs, along with the references of margin and training error, for observing the support vector variations in the classical SVM 61
Table 6. Variations of margin SVs, non-margin SVs, and total SVs in the kNN-SVM model; various C̃ with fixed k = 25 and η = 1 68
Table 7. Variations of margin SVs, non-margin SVs, and total SVs in the kNN-SVM model; various C̃ with fixed k = 101 and η = 1 69

List of Symbols
Remp Empirical risk
L Loss function
S Set of training examples
xi Input patterns
d Dimension of the feature space
yi Class labels of the classical SVM
ỹi Real-valued class labels of kNN-SVM
f(xi) Decision function of the classical SVM
f̃(xi) Decision function of kNN-SVM
P(ωi|xi) Posterior probability
η Acceleration factor
ξi, ξ̃i Slack variables
ρ, ρ̃ Margins of the classical SVM and kNN-SVM
C, C̃ Penalty factors
w, w̃ Vectors orthogonal to the hyperplane
b, b̃ Bias terms
K Number of folds in K-fold cross-validation
k Number of nearest neighbors in kNN
H, H̃ Structures of structural risk minimization
sρ, s̃ρ Margin sensitivities of the classical SVM and kNN-SVM |
References |
1. Vapnik, V. N.: The Nature of Statistical Learning Theory. Springer-Verlag, Berlin Heidelberg New York (1995)
2. Vapnik, V. N.: An Overview of Statistical Learning Theory. IEEE Transactions on Neural Networks, Vol. 10 (1999) 988–999
3. Schölkopf, B., Smola, A. J.: Learning with Kernels. MIT Press, Cambridge, MA (2002) 61–67
4. Cover, T. M., Hart, P. E.: Nearest Neighbor Pattern Classification. IEEE Transactions on Information Theory, Vol. 13 (1967) 21–27
5. Duda, R. O., Hart, P. E.: Pattern Classification and Scene Analysis. Wiley, New York (1973)
6. Fukunaga, K.: Statistical Pattern Recognition. 2nd edn. Academic Press, San Diego, CA (1990)
7. Cortes, C., Vapnik, V. N.: Support Vector Networks. Machine Learning, Vol. 20 (1995) 273–297
8. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer-Verlag, Berlin Heidelberg New York (2001) 308–312, 371–380
9. Bartlett, P. L., Jordan, M. I., McAuliffe, J. D.: Convexity, Classification, and Risk Bounds. Technical Report 638, Department of Statistics, UC Berkeley, CA (2003)
10. Shawe-Taylor, J., Bartlett, P. L., Williamson, R. C., Anthony, M.: Structural Risk Minimization over Data-Dependent Hierarchies. IEEE Transactions on Information Theory, Vol. 44, No. 5 (1998) 1926–1940
11. Vapnik, V. N.: Statistical Learning Theory. Wiley, New York (1998)
12. Yang, C.-Y., Chou, J.-J., Yang, J.-S.: A Method to Improve Classification Performance of Ethnic Minority Classes in the k-Nearest-Neighbor Rule. Proceedings of the IEEE International Conference on Computational Cybernetics, Siófok, Hungary (2003)
13. Yang, C.-Y.: Support Vector Classifier with a Fuzzy-Value Class Label. Lecture Notes in Computer Science, Vol. 3173. Springer-Verlag, Berlin New York (2004) 506–511
14. Breiman, L.: Bias, Variance and Arcing Classifiers. Technical Report 460, Statistics Department, University of California, CA (1996)
15. Murphy, P. M.: UCI Benchmark Repository of Artificial and Real Data Sets, http://www.ics.uci.edu/~mlearn, University of California, Irvine, CA (1995)
16. Vlachos, P., Meyer, M.: StatLib, http://lib.stat.cmu.edu/, Department of Statistics, Carnegie Mellon University (1989)
17. Schölkopf, B., Smola, A. J., Müller, K. R.: Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation, Vol. 10 (1998) 1299–1319
18. Smola, A. J., Bartlett, P. L., Schölkopf, B., Schuurmans, D. (eds.): Advances in Large Margin Classifiers. MIT Press, Cambridge, MA (2000)
19. Schölkopf, B., Smola, A. J.: Learning with Kernels. MIT Press, Cambridge, MA (2002) 189–211
20. Pontil, M., Verri, A.: Properties of Support Vector Machines. Neural Computation, Vol. 10 (1998) 977–996
21. Vapnik, V. N., Chapelle, O.: Bounds on Error Expectation for SVM. In: Smola, A. J., Bartlett, P. L., Schölkopf, B., Schuurmans, D. (eds.): Advances in Large Margin Classifiers (2000) 261–280
22. Duda, R. O., Hart, P. E., Stork, D. G.: Pattern Classification. 2nd edn. John Wiley and Sons, New York, NY (2000) 161–184
23. Webb, A.: Statistical Pattern Recognition. 2nd edn. John Wiley & Sons (2002)
24. Dudani, S.: The Distance-Weighted k-Nearest-Neighbor Rule. IEEE Transactions on Systems, Man and Cybernetics, Vol. 6 (1976) 325–327
25. Keller, J. M., Gray, M. R., Givens, J. A.: A Fuzzy k-Nearest Neighbor Algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Vol. 15 (1985) 580–585
26. Schölkopf, B., Smola, A. J.: Learning with Kernels. MIT Press, Cambridge, MA (2002) 25–60
27. Cao, R., Cuevas, A., González-Manteiga, W.: A Comparative Study of Several Smoothing Methods in Density Estimation. Computational Statistics and Data Analysis, Vol. 17 (1994) 153–176
28. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer-Verlag, Berlin Heidelberg New York (2001) 196–198
29. Bottou, L., Vapnik, V. N.: Local Learning Algorithms. Neural Computation, Vol. 4, No. 6 (1992) 888–900
30. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer-Verlag, Berlin Heidelberg New York (2001) 14–27
31. Vapnik, V. N., Golowich, S., Smola, A. J.: Support Vector Method for Function Approximation, Regression Estimation, and Signal Processing. In: Mozer, M. C., Jordan, M. I., Petsche, T. (eds.): Advances in Neural Information Processing Systems, Vol. 9. MIT Press, Cambridge, MA (1997) 281–287
32. Smola, A. J., Schölkopf, B.: A Tutorial on Support Vector Regression. NeuroCOLT2 Technical Report NC2-TR-1998-030 (1998)
33. Smola, A. J., Schölkopf, B.: Sparse Greedy Matrix Approximation for Machine Learning. In: Langley, P. (ed.): Proceedings of ICML'00. Morgan Kaufmann, San Francisco (2000) 911–918
34. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge, UK (2000) 100–101
35. Yang, C.-Y.: Highlight the Importance of Outliers to Boost Support Vector Machines. (2005) (submitted) |
Full-text usage authorization | |