| System ID | U0002-2207202413010000 |
|---|---|
| DOI | 10.6846/tku202400543 |
| Thesis Title (Chinese) | 梯度下降法在零膨脹二分類模型之效能研究 |
| Thesis Title (English) | Performance Study of Gradient Descent Methods on Zero-Inflated Binary Classification Models |
| Thesis Title (Third Language) | |
| University | 淡江大學 (Tamkang University) |
| Department (Chinese) | 統計學系應用統計學碩士班 |
| Department (English) | Department of Statistics |
| Foreign Degree School | |
| Foreign Degree College | |
| Foreign Degree Institute | |
| Academic Year | 112 |
| Semester | 2 |
| Publication Year | 113 |
| Author (Chinese) | 陳咸慶 |
| Author (English) | HSIEN-CHING CHEN |
| Student ID | 612650191 |
| Degree | Master's |
| Language | Traditional Chinese |
| Second Language | |
| Oral Defense Date | 2024-07-03 |
| Number of Pages | 46 pages |
| Oral Defense Committee | Advisor: 蔡宗儒 (tzongru@gmail.com); Committee member: 楊文; Committee member: 李名鏞 |
| Keywords (Chinese) | 零膨脹；梯度下降法；零膨脹伯努利迴歸模型；lasso迴歸；脊迴歸 |
| Keywords (English) | Zero-inflated data; gradient descent methods; zero-inflated Bernoulli regression model; lasso regression; ridge regression |
| Keywords (Third Language) | |
| Subject Classification | |
| Abstract (Chinese) | 本論文研究零膨脹二元資料的分類問題，提出以零膨脹伯努利迴歸模型(Zero-Inflated Bernoulli Regression Model; ZIBer)為核心的解決方案，並引入lasso(least absolute shrinkage and selection operator)與脊迴歸(ridge regression)技術進行正則化，以提升模型的分類效能。本研究探討不同梯度下降法在ZIBer模型中的應用效果，包括固定學習速率、Momentum及Adaptive Moment Estimation(Adam)三種調整策略，並利用蒙地卡羅模擬方法評估這些策略在參數估計和模型性能上的表現。模擬結果顯示，Momentum方法在學習速率調整上具有較短的運算時間，而固定學習速率和Adam方法在特定情況下也顯示出一定的優勢。為了驗證ZIBer模型在實際應用中的效能，本研究以台灣信用卡違約客戶資料集進行實證分析。結果顯示，ZIBer模型在敏感度(Sensitivity)和整體分類準確度(Accuracy)方面表現卓越，尤其在檢測違約風險方面，能夠有效區分少數類別(違約行為)；與此同時，XGBoost和人工神經網絡(ANN)在特異度(Specificity)上具有優勢，能夠準確識別多數類別(未違約行為)，但在捕捉少數類別時效果略遜於ZIBer模型。 |
| Abstract (English) | This thesis focuses on the classification problem for zero-inflated binary data and proposes the Zero-Inflated Bernoulli Regression Model (ZIBer) as a core solution. Lasso and ridge regression techniques are incorporated for regularization to enhance the model's classification performance. The study investigates the effectiveness of different gradient descent methods within the ZIBer model, including fixed learning rate, Momentum, and Adaptive Moment Estimation (Adam) strategies. Monte Carlo simulations are employed to evaluate the performance of these strategies in parameter estimation and model efficiency. Simulation results indicate that the Momentum method offers shorter computation times for learning rate adjustment, while the fixed learning rate and Adam methods also demonstrate advantages under specific conditions. To validate the practical applicability of the ZIBer model, we conducted an empirical analysis using a dataset of credit card default clients in Taiwan. The results show that the ZIBer model excels in sensitivity and overall classification accuracy, particularly in detecting default risks, effectively distinguishing the minority class (default behavior). Meanwhile, XGBoost and artificial neural networks (ANN) exhibit advantages in specificity, accurately identifying the majority class (non-default behavior), but their performance in capturing the minority class is slightly inferior to that of the ZIBer model. |
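As a rough illustration of the method summarized in the abstracts, the Python sketch below fits a zero-inflated Bernoulli regression by gradient descent and supports the three learning-rate strategies named there (fixed rate, Momentum, Adam) together with an optional lasso or ridge penalty. It assumes the common ZIBer parameterization in which an observation is a structural zero with probability sigmoid(Zγ) and otherwise Bernoulli with success probability sigmoid(Xβ); the function names, the numerical gradient, and the default hyperparameters are illustrative choices, not taken from the thesis.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def ziber_nll(y, X, Z, beta, gamma, penalty=None, lam=0.0):
    """Negative log-likelihood of a zero-inflated Bernoulli (ZIBer) model.

    Assumed parameterization: P(Y=1 | x, z) = (1 - p) * pi, where
    p = sigmoid(Z @ gamma) is the structural-zero probability and
    pi = sigmoid(X @ beta) is the Bernoulli success probability.
    """
    p, pi = sigmoid(Z @ gamma), sigmoid(X @ beta)
    prob1 = np.clip((1.0 - p) * pi, 1e-12, 1.0 - 1e-12)
    nll = -np.mean(y * np.log(prob1) + (1.0 - y) * np.log(1.0 - prob1))
    theta = np.concatenate([beta, gamma])
    if penalty == "lasso":
        nll += lam * np.sum(np.abs(theta))   # L1 shrinkage
    elif penalty == "ridge":
        nll += lam * np.sum(theta ** 2)      # L2 shrinkage
    return nll

def numeric_grad(f, theta, eps=1e-6):
    """Central-difference gradient; used only to keep the sketch short."""
    g = np.zeros_like(theta)
    for j in range(theta.size):
        e = np.zeros_like(theta)
        e[j] = eps
        g[j] = (f(theta + e) - f(theta - e)) / (2.0 * eps)
    return g

def fit_ziber(y, X, Z, method="momentum", lr=0.1, n_iter=500,
              penalty=None, lam=0.0, mom=0.9, b1=0.9, b2=0.999, eps=1e-8):
    """Gradient descent with a fixed rate, Momentum, or Adam updates."""
    k, m = X.shape[1], Z.shape[1]
    theta = np.zeros(k + m)
    v = np.zeros_like(theta)   # momentum / first-moment buffer
    s = np.zeros_like(theta)   # second-moment buffer (Adam only)
    obj = lambda th: ziber_nll(y, X, Z, th[:k], th[k:], penalty, lam)
    for t in range(1, n_iter + 1):
        g = numeric_grad(obj, theta)
        if method == "fixed":
            theta -= lr * g
        elif method == "momentum":
            v = mom * v + lr * g
            theta -= v
        elif method == "adam":
            v = b1 * v + (1 - b1) * g
            s = b2 * s + (1 - b2) * g ** 2
            v_hat, s_hat = v / (1 - b1 ** t), s / (1 - b2 ** t)
            theta -= lr * v_hat / (np.sqrt(s_hat) + eps)
    return theta[:k], theta[k:]
```

A call such as `fit_ziber(y, X, Z, method="adam", penalty="lasso", lam=0.01)` mirrors the kind of comparison run in the simulation study, though the thesis's exact data-generating settings and tuning values are not reproduced here.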
| Abstract (Third Language) | |
| Table of Contents |
Table of Contents I; List of Figures II; List of Tables III
Chapter 1 Introduction 1
  1.1 Research Background and Motivation 1
  1.2 Research Objectives 1
  1.3 Thesis Organization 2
Chapter 2 Literature Review 3
  2.1 Literature Survey 3
Chapter 3 Binary Classification Models 12
  3.1 Logistic Regression Model 12
  3.2 Zero-Inflated Bernoulli Regression Model 12
  3.3 LASSO Regression and Ridge Regression Models 17
Chapter 4 Monte Carlo Simulation 23
Chapter 5 Empirical Analysis 31
  5.1 Taiwan Credit Card Default Clients Dataset 31
  5.2 Model Analysis 33
Chapter 6 Conclusions and Future Directions 40
  6.1 Conclusions 40
  6.2 Future Research Directions 41
References 43
List of Figures:
  Figure 4.1 Density plots of accuracy for the four models 29
  Figure 4.2 Density plots of specificity for the four models 29
  Figure 4.3 Density plots of sensitivity for the four models 30
List of Tables:
  Table 4.1 Confusion matrix 24
  Table 4.2 Performance of the three learning-rate adjustment methods for gradient descent over 1,000 simulation runs 27
  Table 4.3 CPU time of the three learning-rate adjustment methods for gradient descent over 1,000 simulation runs 27
  Table 4.4 Performance of the ZIBer regression, XGBoost, and ANN models using the Momentum learning-rate adjustment over 1,000 simulation runs 28
  Table 5.1 Taiwan credit card clients default dataset, April to September 2005 32
  Table 5.2 Variables selected and estimated coefficients of the ZIBer lasso regression model 36
  Table 5.3 Variables selected and estimated coefficients of the ZIBer ridge regression model 36
  Table 5.4 Performance of the ZIBer lasso regression, ZIBer ridge regression, XGBoost, and ANN models on the credit card default dataset after 100 iterations, using the Momentum learning-rate adjustment 39
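Because the simulation and empirical chapters evaluate classifiers through the confusion matrix (Table 4.1) and report the accuracy, sensitivity, and specificity discussed in the abstracts, a minimal sketch of those metrics follows; the 0.5 classification threshold and the function names are assumptions for illustration, not the thesis's code.

```python
import numpy as np

def confusion_counts(y_true, y_prob, threshold=0.5):
    """Confusion-matrix counts for a binary classifier.

    The 0.5 cut-off is an assumed default; the thesis may use another rule.
    """
    y_true = np.asarray(y_true).astype(int)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tp, tn, fp, fn

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity (recall on the positive/default class),
    and specificity (recall on the negative/non-default class)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }
```

With defaults coded as 1, high sensitivity corresponds to the reported strength of the ZIBer model on the minority (default) class, while specificity measures performance on the majority (non-default) class, where XGBoost and the ANN are reported to do well.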
| References |
[1] Basheer, I. A., & Hajmeer, M. (2000). Artificial neural networks: fundamentals, computing, design, and application. Journal of Microbiological Methods, 43(1), 3-31.
[2] Beikbabaei, M., & Mehrizi-Sani, A. (2024). Detection and Mitigation of Cyberattacks on Volt-Var Control. arXiv:2404.02374. Retrieved April 01, 2024.
[3] Bentéjac, C., Csörgő, A., & Martínez-Muñoz, G. (2019). A Comparative Analysis of XGBoost. arXiv:1911.01914. Retrieved November 01, 2019.
[4] Bentéjac, C., Csörgő, A., & Martínez-Muñoz, G. (2021). A comparative analysis of gradient boosting algorithms. Artificial Intelligence Review, 54(3), 1937-1967.
[5] Chauvin, Y. (1988). A Back-Propagation Algorithm with Optimal Use of Hidden Units.
[6] Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. arXiv:1603.02754. Retrieved March 01, 2016.
[7] Chiang, J.-Y., Lio, Y., Hsu, C.-Y., Ho, C.-L., & Tsai, T.-R. (2024). Binary Classification with Imbalanced Data. Entropy, 26(1), 15.
[8] Diallo, A. O., Diop, A., & Dupuy, J. F. (2019). Estimation in zero-inflated binomial regression with missing covariates. Statistics, 53(4), 839-865.
[9] Diop, A., Diop, A., & Dupuy, J.-F. (2016). Simulation-based Inference in a Zero-inflated Bernoulli Regression Model. Communications in Statistics - Simulation and Computation, 45(10), 3597-3614.
[10] Gumus, M., & Kiran, M. S. (2017). Crude oil price forecasting using XGBoost. 2017 International Conference on Computer Science and Engineering (UBMK).
[11] Hagan, M. T., & Menhaj, M. B. (1994). Training feedforward networks with the Marquardt algorithm. IEEE Transactions on Neural Networks, 5(6), 989-993.
[12] Hall, D. B. (2000). Zero-Inflated Poisson and Binomial Regression with Random Effects: A Case Study. Biometrics, 56(4), 1030-1039.
[13] Hecht-Nielsen, R. (1992). Theory of the Backpropagation Neural Network. In H. Wechsler (Ed.), Neural Networks for Perception (pp. 65-93). Academic Press.
[14] Hintze, A., Kirkpatrick, D., & Adami, C. (2018). The structure of evolved representations across different substrates for artificial intelligence. arXiv:1804.01660. Retrieved April 01, 2018.
[15] Joharestani, M. Z., Cao, C., Ni, X., Bashir, B., & Talebiesfandarani, S. (2019). PM2.5 prediction based on random forest, XGBoost, and deep learning using multisource remote sensing data. Atmosphere, 10(7), Article 373.
[16] Kasongo, S. M., & Sun, Y. (2020). Performance Analysis of Intrusion Detection Systems Using a Feature Selection Method on the UNSW-NB15 Dataset. Journal of Big Data, 7(1), Article 105.
[17] Lambert, D. (1992). Zero-Inflated Poisson Regression, With an Application to Defects in Manufacturing. Technometrics, 34(1), 1-14.
[18] Liang, W., Luo, S., Zhao, G., & Wu, H. (2020). Predicting hard rock pillar stability using GBDT, XGBoost, and LightGBM algorithms. Mathematics, 8(5), Article 765.
[19] Louzada, F., de Oliveira, M. R., Jr., & Moreira, F. F. (2015). The zero-inflated cure rate regression model: Applications to fraud detection in bank loan portfolios. arXiv:1509.05244. Retrieved September 01, 2015.
[20] Oganisian, A., Mitra, N., & Roy, J. (2018). A Bayesian Nonparametric Model for Zero-Inflated Outcomes: Prediction, Clustering, and Causal Estimation. arXiv:1810.09494. Retrieved October 01, 2018.
[21] Ogunleye, A., & Wang, Q. G. (2020). XGBoost Model for Chronic Kidney Disease Diagnosis. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 17(6), 2131-2140.
[22] Park, D. C., El-Sharkawi, M. A., Marks, R. J., Atlas, L. E., & Damborg, M. J. (1991). Electric Load Forecasting Using An Artificial Neural Network. IEEE Transactions on Power Systems, 6(2), 442-449.
[23] Rodriguez, D., Dolado, J., Tuya, J., & Pfahl, D. (2019). Software defect prediction with zero-inflated Poisson models. arXiv:1910.13717. Retrieved October 01, 2019.
[24] Saffari, S. E., & Adnan, R. (2012). Parameter estimation on zero-inflated negative binomial regression with right truncated data. Sains Malaysiana, 41(11), 1483-1487.
[25] Shekar, P. R., Mathew, A., Yeswanth, P. V., & Deivalakshmi, S. (2024). A combined deep CNN-RNN network for rainfall-runoff modelling in Bardha Watershed, India. Artificial Intelligence in Geosciences, 5, Article 100073.
[26] Sheridan, R. P., Wang, W. M., Liaw, A., Ma, J., & Gifford, E. M. (2016). Extreme Gradient Boosting as a Method for Quantitative Structure–Activity Relationships. Journal of Chemical Information and Modeling, 56(12), 2353-2360.
[27] Sietsma, J., & Dow, R. J. F. (1991). Creating artificial neural networks that generalize. Neural Networks, 4(1), 67-79.
[28] Su, B., Xu, P., & Sheng, M. (2020). Generalized Zero-inflated Binomial Distribution Model Aimed at Air Quality Data Analysis. Xitong Fangzhen Xuebao / Journal of System Simulation, 32(11), 2226-2234.
[29] Tang, B., Frye, H. A., Gelfand, A. E., & Silander, J. A., Jr. (2021). Zero-inflated Beta distribution regression modeling. arXiv:2112.07249. Retrieved December 01, 2021.
[30] Wieczorek, T. J., Tchumatchenko, T., Wert Carvajal, C., & Eggl, M. F. (2023). A framework for the emergence and analysis of language in social learning agents. arXiv:2305.02632. Retrieved May 01, 2023.
[31] Zeeshan, M., Khan, A., Amanullah, M., Bakr, M. E., Alshangiti, A. M., Balogun, O. S., & Yusuf, M. (2024). A new modified biased estimator for Zero inflated Poisson regression model. Heliyon, 10(3), Article e24225.
[32] Zhang, G., Eddy Patuwo, B., & Hu, M. Y. (1998). Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting, 14(1), 35-62.
[33] Zhang, W., Wu, C., Zhong, H., Li, Y., & Wang, L. (2021). Prediction of undrained shear strength using extreme gradient boosting and random forest based on Bayesian optimization. Geoscience Frontiers, 12(1), 469-477.
[34] Zhu, M., Zhang, Y., Gong, Y., Xing, K., Yan, X., & Song, J. (2024). Ensemble Methodology: Innovations in Credit Default Prediction Using LightGBM, XGBoost, and LocalEnsemble. arXiv:2402.17979. Retrieved February 01, 2024.
| Full-Text Access Rights | |