| System ID | U0002-0312202514445700 |
|---|---|
| DOI | 10.6846/tku202500786 |
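| Thesis title (Chinese) | 應用極限梯度提升演算法探討抽水行為對地下水水位之影響 |
| Thesis title (English) | Applying the Extreme Gradient Boosting Algorithm to Evaluate the Impact of Pumping on Groundwater Levels |
| Thesis title (third language) | |
| University | Tamkang University (淡江大學) |
| Department (Chinese) | 水資源及環境工程學系碩士班 |
| Department (English) | Department of Water Resources and Environmental Engineering |
| Foreign degree: university | |
| Foreign degree: college | |
| Foreign degree: graduate institute | |
| Academic year | 114 (ROC calendar) |
| Semester | 1 |
| Publication year | 115 (ROC calendar) |
| Graduate student (Chinese) | 陳致慧 |
| Graduate student (English) | Chih-Hui Chen |
| Student ID | 613480036 |
| Degree | Master's |
| Language | Traditional Chinese |
| Second language | |
| Oral defense date | 2026-01-14 |
| Number of pages | 90 |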
| Oral defense committee | Advisor: 張麗秋 (changlc@mail.tku.edu.tw); Committee member: 吳瑞賢 (raywu@ncu.edu.tw); Committee member: 胡明哲 (mchu@ntu.edu.tw); Co-advisor: 王聖瑋 (wangsw@ncu.edu.tw) |
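| Keywords (Chinese) | Groundwater level prediction; Bayesian optimization; Time-series cross-validation; K-means; Extreme Gradient Boosting (XGBoost); Safe groundwater level |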
| Keywords (English) | Groundwater level prediction; Bayesian Hyperparameter Optimization; Time Series Cross-Validation; K-means; Extreme Gradient Boosting (XGBoost); Groundwater level threshold |
| Keywords (third language) | |
| Subject classification | |
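| Abstract (Chinese) | Taiwan is small and densely populated, and every sector places heavy demands on water resources. The Zhuoshui River alluvial fan is the most intensive agricultural water-use region in Taiwan. Because surface water supply there has long been unstable, the region has depended heavily on groundwater for decades, which has caused severe land subsidence and environmental hazards. Moreover, traditional physically based groundwater models often lack accurate pumping data and therefore struggle to capture the specific effect of human pumping on groundwater level changes. This study therefore collected groundwater level and meteorological data for the Zhuoshui River alluvial fan and used the electricity consumption of pumping wells as a proxy variable for pumping behavior. K-means spatial clustering was applied to delineate areas of different hydrogeological types and to select the representative wells used for subsequent groundwater modeling. A groundwater level prediction model was then built with the Extreme Gradient Boosting algorithm (XGBoost), and nested time-series cross-validation (Nested TSCV) and Bayesian optimization were used to improve the model's generalization across different time-series splits, in order to examine how groundwater levels in different hydrogeological areas vary under meteorological factors and human pumping. The results show that the normalized root mean square error (NRMSE) of the predictions at the representative wells ranges from 9% to 13%, indicating robust predictive performance. SHAP interpretability analysis shows that groundwater levels have a pronounced time-lag characteristic: the groundwater level of the previous month (GWTt-1) is the most critical factor, while the pumping behaviors that affect groundwater levels differ spatially between areas. For example, Kezhu (客厝) is mainly affected by pumping for first-crop rice, whereas Honglun (宏崙) is significantly affected by industrial water use, and both patterns are directly related to the dominant pumping type in each area. Future scenarios were then simulated to evaluate the benefit to groundwater levels of reducing the electricity consumption of key pumping behaviors by 10% to 30%. The results show that pumping reduction is markedly effective when groundwater level conditions are already relatively good, and can raise levels above the safe groundwater level. During severe dry periods (levels below the severe lower-limit level), however, the low initial level limits the recovery even under substantial pumping reductions. In summary, this study verifies the feasibility of estimating pumping behavior from pumping electricity consumption and finds that background groundwater level conditions are decisive for management effectiveness. It therefore recommends that flexible and differentiated pumping reduction strategies be developed according to regional characteristics, to achieve sustainable groundwater use and land subsidence prevention. |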
| Abstract (English) | Taiwan is a densely populated island with intensive water demands from multiple sectors. Among its regions, the Zhuoshui River alluvial fan is the most water-intensive agricultural area. Because surface water supply has long been unstable, groundwater resources have been heavily exploited for decades, leading to severe land subsidence and associated environmental hazards. Traditional physically based groundwater models often fail to quantify the impacts of anthropogenic pumping on groundwater level variations accurately, primarily because reliable pumping rate data are lacking. To address this limitation, this study collected long-term groundwater level observations and meteorological data across the Zhuoshui River alluvial fan and used electricity consumption data from pumping wells as a proxy variable for groundwater abstraction. K-means spatial clustering was first applied to delineate regions with distinct hydrogeological characteristics, from which representative monitoring wells were selected for subsequent groundwater level modeling. An Extreme Gradient Boosting (XGBoost) model was then developed to simulate groundwater level variations. To enhance model generalization under different temporal data partitions, nested time-series cross-validation (nested TSCV) was combined with Bayesian hyperparameter optimization. The results indicate that the proposed model exhibits robust predictive performance across all representative wells, with normalized root mean square error (NRMSE) ranging from 9% to 13%. SHapley Additive exPlanations (SHAP) analysis further reveals a pronounced temporal dependency in groundwater levels, with the groundwater level of the previous month (GWLt-1) identified as the most influential predictor. In addition, pumping-related features show clear spatial heterogeneity in their impacts on groundwater levels. For instance, groundwater levels at the Kezhu site are predominantly influenced by irrigation pumping for first-crop rice, whereas industrial water use exerts a dominant influence at the Honglun site, consistent with the prevailing pumping patterns in each region. Future scenario simulations were conducted by reducing electricity consumption associated with key pumping activities by 10% to 30% to evaluate potential groundwater level responses. The results show that pumping reduction strategies are most effective under relatively favorable groundwater level conditions, where groundwater levels can be restored above safety thresholds. In contrast, during severe drought periods, when groundwater levels fall below critical thresholds, the magnitude of groundwater level recovery remains limited even under substantial pumping reductions, because of the constraint imposed by the low initial groundwater level. Overall, this study confirms the feasibility of using electricity consumption data to represent groundwater abstraction within a data-driven modeling framework and highlights the decisive role of background groundwater level conditions in determining management effectiveness. These findings suggest that flexible and region-specific pumping reduction strategies are essential for achieving sustainable groundwater resource management and mitigating land subsidence in highly exploited agricultural regions. |
| Abstract (third language) | |
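| Table of contents | Front matter: Acknowledgments; Abstract (Chinese); ABSTRACT; Table of contents; List of figures; List of tables. Chapter 1 Introduction: 1.1 Research background; 1.2 Research objectives; 1.3 Research framework. Chapter 2 Literature review: 2.1 Groundwater level prediction using machine learning methods; 2.2 Applications of XGBoost; 2.3 Theory and applications of Bayesian hyperparameter optimization. Chapter 3 Methodology: 3.1 Study area; 3.2 Data collection (3.2.1 Data screening; 3.2.2 Data preprocessing); 3.3 Spatial clustering; 3.4 Data analysis (3.4.1 Time series decomposition: STL; 3.4.2 Autocorrelation and partial autocorrelation function tests; 3.4.3 Correlation coefficient analysis); 3.5 XGBoost model construction (3.5.1 Time Series Cross-Validation, TSCV; 3.5.2 Bayesian optimization); 3.6 Model interpretability: SHAP (SHapley Additive exPlanations); 3.7 Model performance evaluation metrics; 3.8 Future scenarios and safe groundwater level calculation method. Chapter 4 Results and discussion: 4.1 Spatial clustering results; 4.2 Data analysis results (4.2.1 Time series analysis results; 4.2.2 Autocorrelation and partial autocorrelation function test results; 4.2.3 Correlation coefficient analysis results); 4.3 Groundwater level prediction results; 4.4 Model interpretability analysis: SHAP; 4.5 Future scenario results. Chapter 5 Conclusions and recommendations: 5.1 Conclusions; 5.2 Recommendations. References. Appendix 1: Analysis of rainfall and pumping electricity consumption. Appendix 2: SHAP analysis results after electricity feature selection. List of figures: Figure 1-1 Research workflow of this study; Figure 3-1 Hydrogeological cross-section of the Zhuoshui River alluvial fan (石榴-海園); Figure 3-2 Distribution of groundwater observation wells in the Zhuoshui River alluvial fan; Figure 3-3 Spatial distribution of the key observation wells selected in this study; Figure 3-4 Key observation wells and their corresponding rainfall stations as delineated by Thiessen polygons; Figure 3-5 Schematic of the expanding-window concept; Figure 4-1 K-means aquifer clustering results; Figure 4-2 Box plots of the annual distributions for each representative well; Figure 4-3 Box plots of the monthly distributions for each representative well; Figure 4-4 STL decomposition results for each representative well; Figure 4-5 Comparison of the trend components across representative wells; Figure 4-6 Comparison of the seasonal components across representative wells; Figure 4-7 ACF and PACF results for each representative well; Figure 4-8 Correlation coefficient heat maps for each representative well; Figure 4-9 Validation-set RMSE across folds for each representative well; Figure 4-10 SHAP summary plots for each representative observation well; Figure 4-11 SHAP feature importance for each representative well; Figure 4-12 Future scenario simulations and safe groundwater level hydrographs for the Kezhu (客厝) observation well under different electricity consumption adjustments; Figure 4-13 The same for the Honglun (宏崙) observation well; Figure 4-14 The same for the 箔子 observation well; Figure 4-15 The same for the 香田 observation well; Figure 4-16 The same for the 豐榮 observation well. List of tables: Table 2-1 Comparison of machine learning methods; Table 2-2 Comparison of parameter search methods; Table 3-1 Coordinates of the key observation wells; Table 3-2 Summary of hydrogeological parameters of the groundwater observation wells in the study area; Table 3-3 Electricity-use categories and codes corresponding to different water-use purposes; Table 4-1 Representative wells from K-means aquifer clustering; Table 4-2 ADF test results for each representative well; Table 4-3 Validation-set time ranges and test-set evaluation metrics for each representative well in five-fold cross-validation; Table 4-4 Average metrics of the representative wells after five-fold time-series cross-validation; Table 4-5 Model test-set RMSE before and after feature selection. |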
| Full-text access rights | |
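
The abstract describes evaluating the model with time-series cross-validation and reporting NRMSE, and the table of contents mentions an expanding-window scheme with five folds. The following is a minimal sketch of those two ideas only, not the thesis implementation. The function names, the fold sizing, and the choice to normalize RMSE by the observed range are assumptions made for illustration; the record does not state how NRMSE was normalized.

```python
from math import sqrt


def expanding_window_splits(n_samples, n_folds, val_size):
    """Yield (train_idx, val_idx) pairs for an expanding training window.

    Each fold trains on every sample before its validation block, so the
    training set grows from fold to fold and never sees future data.
    The sizing rule (equal validation blocks at the end of the series)
    is an assumption for this sketch.
    """
    first_val_start = n_samples - n_folds * val_size
    if first_val_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        val_start = first_val_start + k * val_size
        train_idx = list(range(0, val_start))
        val_idx = list(range(val_start, val_start + val_size))
        yield train_idx, val_idx


def nrmse(observed, predicted):
    """RMSE divided by the observed range (assumed normalization)."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("observed values have zero range")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return sqrt(mse) / span
```

For a nested scheme, the same split function could be applied again to each outer training index set to obtain the inner folds used for hyperparameter tuning, for example with `expanding_window_splits(len(train_idx), 3, 6)`.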
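
The abstract also describes scenario simulations in which the electricity consumption of key pumping activities is reduced by 10% to 30%, with the previous month's groundwater level as the dominant predictor. A rough sketch of how such a scenario could be rolled forward is given below. It assumes a recursive one-step forecast in which each predicted level becomes the next month's lag feature; the record does not confirm that the thesis works this way. `toy_model` is a made-up linear stand-in for a trained regressor, and its coefficients carry no physical meaning.

```python
def simulate_reduction(model, initial_level, rainfall, electricity, reduction):
    """Roll a one-step model forward with pumping electricity scaled down.

    model:      callable (prev_level, rain, elec) -> next level; a stand-in
                for a trained regressor such as an XGBoost model.
    reduction:  fraction in [0, 1), e.g. 0.10 for a 10% cut.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    if len(rainfall) != len(electricity):
        raise ValueError("rainfall and electricity must have equal length")
    level = initial_level
    levels = []
    for rain, elec in zip(rainfall, electricity):
        # Feed the previous prediction back in as the lag feature.
        level = model(level, rain, elec * (1.0 - reduction))
        levels.append(level)
    return levels


def toy_model(prev_level, rain, elec):
    """Hypothetical linear stand-in; NOT the thesis model."""
    return 0.9 * prev_level + 0.01 * rain - 0.002 * elec


def months_at_or_above(levels, threshold):
    """Count simulated months at or above a given safe-level threshold."""
    return sum(1 for value in levels if value >= threshold)
```

Running `simulate_reduction` once with `reduction=0.0` and again with `0.1`, `0.2`, and `0.3` gives a baseline and three scenario hydrographs that can each be compared against a safe-level threshold with `months_at_or_above`.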