§ Thesis Bibliographic Record
  
System ID U0002-2207202222242700
DOI 10.6846/TKU.2022.00612
Title (Chinese) 現狀設限白內障資料之機器提升學習
Title (English) Machine Boost Learning on Current Status Censored Cataract Data
Title (third language)
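Institution Tamkang University (淡江大學)
Department (Chinese) 數學學系數學與數據科學碩士班
Department (English) Master's Program, Department of Mathematics
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute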
Academic year 110 (ROC calendar; 2021-2022)
Semester 2
Year of publication 111 (ROC calendar; 2022)
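Graduate student (Chinese) 盧杰愷
Graduate student (English) Chieh-Kai Lu
Student ID 610190026
Degree Master's
Language Traditional Chinese
Second language
Oral defense date 2022-06-27
Number of pages 24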
Oral defense committee Advisor - 温啟仲 (chichung.wen@gmail.com)
Committee member - 蔡志群 (141400@mail.tku.edu.tw)
Committee member - 吳裕振 (yuhjenn@cycu.edu.tw)
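Keywords (Chinese) Component-wise functional gradient descent (逐一分量函數梯度下降)
Boosting (提升法)
Survival analysis (存活分析)
Keywords (English) Component-wise functional gradient descent
Mboost
Survival analysis
Keywords (third language)
Subject classification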
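Abstract (Chinese)
In this thesis, for high-dimensional current status censored questionnaire data, we take the negative log-likelihood under the proportional odds model as the loss function and use a machine boosting learning method to build the model and select variables. The computation is developed on the basis of the R package "mboost". We propose three methods for determining the importance of the selected variables and one index for evaluating the predictive performance of the final model. We conduct simulation experiments to assess the numerical performance of the proposed method, and illustrate it by analyzing questionnaire data on whether Taiwanese residents aged 65 and over have cataracts.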
Abstract (English)
For current status censored data with high-dimensional covariates, this thesis uses the negative log-likelihood under the proportional odds model as the loss function and proposes a machine boosting learning method for model building and variable selection. The computation is based on the R package 'mboost'. We propose three methods to determine the importance of the selected variables and one index to evaluate the predictive power of the final model. We conduct simulations to evaluate the proposed procedure and analyze a cataract dataset of Taiwan residents aged over 65 to illustrate our method.
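The approach named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation, which is built on the R package mboost. Several things here are assumptions made only for illustration: the baseline is taken to be log-logistic, so that the proportional odds model reduces to logit P(delta = 1 | C, x) = a + b log C + x'beta; the base learners are simple least-squares fits on centered columns; and the data are simulated. The names `cw_boost`, `neg_loglik`, `nu`, and `n_iter` are hypothetical.

```python
import numpy as np

def neg_loglik(delta, eta):
    # Mean negative Bernoulli log-likelihood with p = expit(eta),
    # written in the numerically stable form log(1 + e^eta) - delta * eta.
    return np.mean(np.logaddexp(0.0, eta) - delta * eta)

def cw_boost(Z, delta, n_iter=200, nu=0.1):
    # Component-wise functional gradient descent: at each step, fit every
    # column to the negative gradient by least squares, keep the column
    # that reduces the residual sum of squares most, and take a small step.
    n, p = Z.shape
    coef = np.zeros(p)
    eta = np.zeros(n)
    col_ss = (Z ** 2).sum(axis=0)
    path = []
    for _ in range(n_iter):
        u = delta - 1.0 / (1.0 + np.exp(-eta))   # negative gradient of the loss
        inner = Z.T @ u
        j = int(np.argmax(inner ** 2 / col_ss))  # best-fitting component
        step = nu * inner[j] / col_ss[j]         # shrunken least-squares coefficient
        coef[j] += step
        eta += step * Z[:, j]
        path.append(j)
    return coef, path

# Simulated current status data (assumed setup, not the cataract data):
# only the monitoring time C and the indicator delta = 1(T <= C) are observed.
rng = np.random.default_rng(0)
n, p = 500, 20
X = rng.normal(size=(n, p))
C = rng.uniform(0.5, 3.0, size=n)
beta_true = np.zeros(p)
beta_true[:2] = [1.0, -1.0]
eta_true = 1.5 * np.log(C) + X @ beta_true
delta = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

# Candidate components: intercept, centered log monitoring time, centered covariates.
logC = np.log(C)
Z = np.column_stack([np.ones(n), logC - logC.mean(), X - X.mean(axis=0)])

coef, path = cw_boost(Z, delta)
loss_start = neg_loglik(delta, np.zeros(n))
loss_end = neg_loglik(delta, Z @ coef)
```

Columns never chosen keep a zero coefficient, which is how this kind of boosting performs variable selection. `np.bincount(path, minlength=Z.shape[1])` gives each component's selection frequency, a common summary; the three importance measures proposed in the thesis are not specified in this record.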
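Abstract (third language)
Table of contents
1. Introduction     1
2. Data and Model     4
3. Gradient Boosting     5
4. Simulation     10
5. Real Data Analysis     15
6. Conclusion     22
7. References     23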
References
1. Breiman L (1998) Arcing classifiers (with discussion). Ann Stat 26:801–849
2. Breiman L (2001) Random forests. Mach Learn 45:5–32
3. Bühlmann P, Yu B (2003) Boosting with the L2 loss: regression and classification. J Am Stat Assoc 98:324–338
4. Bühlmann P (2006) Boosting for high-dimensional linear models. Ann Stat 34:559–583
5. Bühlmann P, Hothorn T (2007) Model-based boosting in R: a hands-on tutorial using the R package mboost. Springer-Verlag Berlin Heidelberg 2012
6. Fan J, Lv J (2010) A selective overview of variable selection in high dimensional feature space. Statistica Sinica 20:101–148
7. Friedman JH, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting (with discussion). Ann Stat 28:337–407
8. Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232
9. Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction, 2nd edn. Springer, New York
10. Hothorn T, Bühlmann P, Kneib T, Schmid M, Hofner B (2012) mboost: model-based boosting. http://CRAN.R-project.org/package=mboost, R package version 2.1-3
11. Huang J (1996). Efficient estimation for the Cox model with interval censoring. Annals of Statistics, 24, 540-568.
12. Kneib T, Hothorn T, Tutz G (2009) Variable selection and model choice in geoadditive regression models. Biometrics 65:626–634. Web appendix accessed at http://www.biometrics.tibs.org/datasets/071127P.htm on 16 Apr 2012
13. Lin DY, Oakes D, Ying Z (1998). Additive hazards regression with current status data. Biometrika, 85, 289-298.
14. Mayr A, Hofner B, Schmid M (2012) The importance of knowing when to stop. A sequential stopping rule for component-wise gradient boosting. Methods of Information in Medicine 51: 178–186.
15. Rossini AJ, Tsiatis AA (1996). A semiparametric proportional odds regression model for the analysis of current status data. Journal of the American Statistical Association, 91, 713-721.
16. Schmid M, Hothorn T (2008a) Boosting additive models using component-wise P-splines. Comput Stat Data Anal 53:298–311
17. Sun J, Sun L (2005). Semiparametric linear transformation models for current status data. The Canadian Journal of Statistics, 33, 85-96
18. Tian L, Cai T (2006). On the accelerated failure time model for current status and interval censored data. Biometrika, 93, 329-342.
19. Turnbull B (1976). The empirical distribution with arbitrarily grouped and censored data. Journal of the Royal Statistical Society B, 38, 290-295.
20. van der Laan MJ, Dudoit S (2003). Unified cross-validation methodology for selection among estimators: finite sample results, asymptotic optimality, and applications. Technical Report 130, Division of Biostatistics, University of California, Berkeley, California.
21. van der Laan MJ, Robins JM (2003). Unified Methods for Censored Longitudinal Data and Causality. New York: Springer
22. van der Laan MJ, Dudoit S, van der Vaart AW (2004). The cross-validated adaptive epsilon-net estimator. Technical Report 142, Division of Biostatistics, University of California, Berkeley, California.
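Full-text access rights
National Central Library
Agrees to grant the National Central Library a royalty-free license; the bibliographic record and full-text electronic file are made public on the Internet immediately after the authorization form is submitted
On campus
Print copy of the thesis made public on campus immediately
Agrees to authorize worldwide public access to the full text of the electronic thesis
Electronic thesis made public on campus immediately
Off campus
Agrees to authorize database vendors
Electronic thesis made public off campus immediately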
