Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

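The electronic full text of this thesis is publicly available off campus from 2025-07-23.
The print copy of this thesis is publicly available from 2025-07-23.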

System ID	U0002-0307202522181900
DOI	10.6846/tku202500495
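Title (Chinese)	不同距離測度與分群方法於空間資料之比較
Title (English)	A Comparative Study of Different Distance Measures and Clustering Methods in Spatial Data Analysis
Title (third language)
Institution	Tamkang University (淡江大學)
Department (Chinese)	統計學系應用統計學碩士班
Department (English)	Department of Statistics
Foreign degree: university name
Foreign degree: college name
Foreign degree: graduate institute name
Academic year	113 (ROC calendar; 2024-2025)
Semester	2
Year of publication	114 (ROC calendar; 2025)
Graduate student (Chinese)	李琍絹
Graduate student (English)	LI-CHUAN LEE
Student ID	613650026
Degree	Master's
Language	Traditional Chinese
Second language
Oral defense date	2025-06-25
Number of pages	41
Committee	Advisor - 張雅梅 (yameic@ccu.edu.tw); Committee member - 張春桃 (chuntao@mail.tku.edu.tw); Committee member - 張育瑋 (ychang@nccu.edu.tw); Co-advisor - 吳碩傑 (shuo@mail.tku.edu.tw)
Keywords (Chinese)	核密度估計; 主成分分析; 分群分析; 距離度量
Keywords (English)	Kernel density estimation; Principal component analysis; Clustering analysis; Distance metrics
Keywords (third language)
Subject classification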
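Abstract (Chinese)	The spatial distribution of species is a central topic in ecology and biogeography, as it reflects species' demands on environmental resources and their interactions. As ecological spatial data accumulate and analytical methods develop, effectively revealing the clustering patterns in multi-species spatial distributions has become an important challenge in ecological data mining. This study proposes a systematic analysis workflow that combines kernel density estimation, principal component analysis, distance measures, and several clustering methods (k-means, k-medoids, and several hierarchical clustering methods). Kernel density estimation first converts discrete species occurrence points into continuous spatial intensity functions; principal component analysis then reduces the dimensionality of the data while retaining the main structure of variation. Next, different clustering algorithms and distance measures, including Euclidean distance and Canberra distance, are applied to compare how well each method identifies species clustering structure. To verify robustness and applicability, the methods are compared on simulated data and applied to real ecological spatial data. The results show that the proposed workflow captures species distribution features effectively and offers good interpretability and practical value. The study aims to provide a concrete, feasible, and flexible framework for multi-species analysis of ecological spatial data, and to support ecological data mining and conservation decision-making.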
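For reference, the two distance measures named in the abstract are commonly defined as follows for two vectors x and y of length p (for example, two species' intensity vectors or principal component scores). These are the standard textbook forms. The symbols x, y, and p are assumptions of this note, and the thesis may use different notation or a different convention for Canberra terms whose denominator is zero.

```latex
d_{\mathrm{Euclidean}}(x, y) = \sqrt{\sum_{i=1}^{p} (x_i - y_i)^2},
\qquad
d_{\mathrm{Canberra}}(x, y) = \sum_{i=1}^{p} \frac{|x_i - y_i|}{|x_i| + |y_i|}
```

Because each Canberra term is a relative difference, coordinates where both values are small can contribute as much as coordinates where both are large. That is one reason the two measures can lead to different groupings.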
Abstract (English)	The spatial distribution of species is a key issue in ecology and biogeography, reflecting species' environmental needs and interactions. With increasing ecological spatial data, uncovering clustering patterns among multiple species poses a major challenge. This study proposes a systematic framework combining kernel density estimation (KDE), principal component analysis (PCA), distance measures, and multiple clustering methods, including k-means, k-medoids, and hierarchical clustering. KDE transforms species occurrence points into spatial intensity functions, and PCA reduces dimensionality while preserving major variation. Various distance metrics and clustering algorithms are compared to assess their ability to identify spatial clusters. Results on simulated and real ecological data show that the framework effectively captures species distribution patterns, offering strong interpretability and practical value. This work provides a flexible tool for multi-species spatial analysis and supports ecological research and conservation planning.
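The workflow described in the abstract can be sketched in a few lines. This is a minimal, dependency-free illustration under stated assumptions, not the thesis's implementation. It assumes a Gaussian kernel with a fixed bandwidth, uses a simple alternating k-medoids rather than full PAM, and omits the PCA step to stay within the standard library. The function names (`kde_intensity`, `k_medoids`, `rand_index`), the toy occurrence points, the 5 x 5 grid, and the bandwidth are all hypothetical. The maximum distance is included because it appears in the thesis's figure list.

```python
import math
from itertools import combinations

def kde_intensity(points, grid, bandwidth):
    """Gaussian-kernel intensity of a point pattern, evaluated at each grid node."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    two_h2 = 2.0 * bandwidth ** 2
    return [sum(norm * math.exp(-((gx - x) ** 2 + (gy - y) ** 2) / two_h2)
                for x, y in points)
            for gx, gy in grid]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def canberra(u, v):
    # Terms with a zero denominator are skipped (one common convention).
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(u, v) if abs(a) + abs(b) > 0)

def maximum(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

def k_medoids(vectors, k, dist, max_iter=100):
    """Alternating assign/update k-medoids (a simplification, not full PAM)."""
    n = len(vectors)
    d = [[dist(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    medoids = [0]  # deterministic farthest-point initialisation
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(d[i][m] for m in medoids)))
    labels = [0] * n
    for _ in range(max_iter):
        # Assign every item to its nearest medoid.
        labels = [min(range(k), key=lambda c: d[i][medoids[c]]) for i in range(n)]
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster becomes empty
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda m: sum(d[m][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return labels

def rand_index(a, b):
    """Share of item pairs on which two partitions agree (same or different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

# Hypothetical toy data: two species near the lower left, two near the upper right.
species = {
    "A": [(0.10, 0.10), (0.20, 0.30), (0.30, 0.20)],
    "B": [(0.15, 0.25), (0.25, 0.15), (0.30, 0.30)],
    "C": [(0.70, 0.80), (0.80, 0.70), (0.90, 0.90)],
    "D": [(0.75, 0.75), (0.85, 0.90), (0.90, 0.80)],
}
truth = [0, 0, 1, 1]

ticks = [0.1, 0.3, 0.5, 0.7, 0.9]
grid = [(gx, gy) for gx in ticks for gy in ticks]
vectors = [kde_intensity(pts, grid, bandwidth=0.15) for pts in species.values()]

results = {}
for name, dist in [("euclidean", euclidean), ("canberra", canberra), ("maximum", maximum)]:
    results[name] = rand_index(truth, k_medoids(vectors, 2, dist))
    print(name, results[name])
```

Each species becomes one vector of intensity values over the grid, so clustering groups species with similar spatial patterns. In a fuller version, a PCA step would sit between `kde_intensity` and `k_medoids` and replace each intensity vector with its leading component scores.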
Abstract (third language)
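Table of contents
	Contents
	List of Figures, p. II
	List of Tables, p. III
	Chapter 1 Introduction, p. 1
	Chapter 2 Methods, p. 4
		Section 1 Principal Component Analysis (PCA), p. 5
		Section 2 Distance Measures, p. 5
		Section 3 Cluster Analysis, p. 7
			Subsection 1 Hierarchical Clustering, p. 7
			Subsection 2 Non-hierarchical Clustering, p. 8
		Section 4 Rand Index, p. 9
	Chapter 3 Simulation Study, p. 10
	Chapter 4 Real Data Analysis, p. 14
	Chapter 5 Conclusion, p. 24
	References, p. 26
	Appendix, p. 31
	List of Figures
		3.1 Simulated spatial data with 2000 points: (a) Case 1: growth toward the upper right; (b) Case 2: growth toward the upper left; (c) Case 3: growth upward; (d) Case 4: growth toward low elevation; (e) Case 5: growth toward high elevation, p. 11
		4.1 Contour map of Fushan Botanical Garden, Yilan (blue lines represent streams), p. 15
		4.2 Point process plot of Fushan Botanical Garden, Yilan, p. 15
		4.3 Intensity map of Fushan Botanical Garden, Yilan, p. 16
		4.4 Intensity map of Fushan Botanical Garden, Yilan (with values), p. 16
		4.5 kmeans, 3 clusters, p. 20
		4.6 kmedoids-maximum, 3 clusters, p. 21
		4.7 kmedoids-canberra, 3 clusters, p. 22
	List of Tables
		3.1 Rand index for different distance measures and clustering methods under different numbers of points, p. 13
		4.1 Comparison of indices across clustering methods and numbers of clusters, p. 19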
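Full-text access rights
	National Central Library: A royalty-free license is granted to the National Central Library; the bibliographic record and the full-text electronic file are made public on the Internet immediately after the authorization form is submitted.
	On campus: The print thesis is made public on campus immediately. Worldwide release of the electronic full text is authorized. The electronic thesis is made public on campus immediately.
	Off campus: A license to database vendors is granted. The electronic thesis is made public off campus immediately.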

