System ID | U0002-1908201912412800 |
---|---|
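DOI | 10.6846/TKU.2019.00577 |
Title (Chinese) | Beta迴歸模型在比例資料的應用 |
Title (English) | Beta Regression for Modelling Ratio Data |
Title (third language) | |
Institution | Tamkang University (淡江大學) |
Department (Chinese) | 統計學系應用統計學碩士班 (Department of Statistics, Master's Program in Applied Statistics) |
Department (English) | Department of Statistics |
Foreign degree: university | |
Foreign degree: college | |
Foreign degree: graduate institute | |
Academic year (ROC calendar) | 107 |
Semester | 2 |
Publication year (ROC calendar) | 108 |
Author (Chinese) | 蕭美淇 |
Author (English) | MEI-CHI HSIAO |
Student ID | 606650033 |
Degree | Master's |
Language | Traditional Chinese |
Second language | |
Oral defense date | 2019-06-28 |
Number of pages | 55 |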
Oral defense committee | Advisor - 陳蔓樺; Co-advisor - 鄭縉宜; Committee member - 鄭浩民; Committee member - 陳麗菁 |
Keywords (Chinese) |
beta迴歸模型 (beta regression model) |
Keywords (English) |
beta regression |
Keywords (third language) | |
Subject classification | |
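Abstract (Chinese, translated) |
Cancer is one of the leading causes of death worldwide. If the relationship between the survival proportion and patient characteristics can be identified, patients can be given appropriate treatment and medical care. This thesis focuses on esophageal cancer and examines how patients' sex, race, age, tumor size, tumor spread, and life status relate to the survival proportion. Ordinary regression models are not suitable for survival proportion data, so we describe the origin and development of the beta regression model and how and when to use it, and we introduce the R packages "betareg" and "zoib". We fit the data with both packages, compare their estimates, and compare model structures with and without zero-and-one inflation. |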
Abstract (English) |
Cancer is one of the leading causes of death worldwide. If the association between survival ratios and patient characteristics can be found, patients can be given better treatment and medical care. This thesis examines how gender, race, age, tumor size, tumor extent, and life status relate to the survival rate for esophageal cancer. The general linear regression model cannot handle ratio data, so we introduce the beta regression model, covering its background, its development, and how to use it. We also introduce two R packages, "betareg" and "zoib". Finally, we analyse the esophageal cancer data with both packages to compare their estimation methods and to compare models with and without zero-and-one inflation. |
Abstract (third language) | |
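Table of contents |
Contents
Chapter 1 Introduction
1.1 Research background and motivation
1.2 Literature review
1.3 Thesis structure
Chapter 2 Data
2.1 Overview of the SEER database
2.2 Related research on esophageal cancer
2.3 Descriptive statistics
2.3.1 Sex
2.3.2 Race
2.3.3 Life status
2.3.4 Tumor spread
2.3.5 Tumor size
2.3.6 Age
2.3.7 Proportion
Chapter 3 Models and R packages
3.1 The beta distribution
3.2 The "betareg" package
3.2.1 Model assumptions
3.2.2 Use in R
3.2.3 Model diagnostics
3.3 The "zoib" package
3.3.1 Model assumptions
3.3.2 Bayesian inference
3.3.3 Use in R
3.3.4 Model diagnostics
Chapter 4 Data analysis
4.1 Estimates without zero-and-one inflation
4.1.1 Estimation with lm
4.1.2 Comparing the betareg and zoib packages under beta regression
4.2 zoib estimates with zero-and-one inflation
4.3 Model selection
Chapter 5 Conclusion
List of figures
Figure 2.1 Esophageal cancer staging and the internal structure of the esophagus
Figure 3.1 Probability density functions of the beta distribution for different values of α and β
Figure 3.2 Probability density functions of the beta distribution for different values of μ or ϕ, with the other parameter held fixed
Figure 3.3 Plots 1 to 6 produced by plot()
Figure 3.4 Differences in parameter estimates between the "betareg" and "zoib" packages, drawn with paraplot()
Figure 3.5 Trace plots drawn with traceplot()
Figure 3.6 Autocorrelation plots drawn with autocorr.plot()
Figure 3.7 PSRF values computed with gelman.diag()
List of tables
Table 2.1 Esophageal cancer staging
Table 2.2 Frequency distribution of sex
Table 2.3 Frequency distribution of race
Table 2.4 Frequency distribution of life status
Table 2.5 Frequency distribution of tumor spread
Table 2.6 Frequency distribution of tumor size
Table 2.7 Frequency distribution of age
Table 2.8 Frequency distribution of patient proportions
Table 4.1 Point and interval estimates of the variables under lm
Table 4.2 Residual tests under lm
Table 4.3 Point and interval estimates with constant ϕi and different link functions for μi
Table 4.4 Point and interval estimates with ϕi modelled by a regression and different link functions for μi
Table 4.5 Point and interval estimates with zero-and-one inflation, constant ϕi, and different link functions for μi
Table 4.6 Point and interval estimates with zero-and-one inflation and constant ϕi, after removing non-significant variables
Table 4.7 Point and interval estimates with ϕi modelled by a regression and different link functions for μi
Table 4.8 Point and interval estimates with ϕi modelled by a regression, after removing non-significant variables |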
References |
J. Verkuilen and M. Smithson, "Fuzzy set theory: Applications in the social sciences," vol. 147, 2006.
R. Ospina and S. L. Ferrari, "A general class of zero-or-one inflated beta regression models," Computational Statistics & Data Analysis, vol. 56, no. 6, pp. 1609-1623, 2012.
L. A. Hatfield, M. E. Boye, M. D. Hackshaw, and B. P. Carlin, "Multilevel Bayesian models for survival times and longitudinal patient-reported outcomes with many zeros," Journal of the American Statistical Association, vol. 107, no. 499, pp. 875-885, 2012.
D. A. Williams, "Extra-binomial variation in logistic linear models," Journal of the Royal Statistical Society: Series C (Applied Statistics), vol. 31, no. 2, pp. 144-148, 1982.
R. Prentice, "Binary regression using an extended beta-binomial distribution, with discussion of correlation induced by covariate measurement errors," Journal of the American Statistical Association, vol. 81, no. 394, pp. 321-327, 1986.
J. A. Nelder and R. W. Wedderburn, "Generalized linear models," Journal of the Royal Statistical Society: Series A (General), vol. 135, no. 3, pp. 370-384, 1972.
D. Williams, "394: The analysis of binary responses from toxicological experiments involving reproduction and teratogenicity," Biometrics, pp. 949-952, 1975.
M. J. Crowder, "Beta-binomial ANOVA for proportions," Applied Statistics, pp. 34-37, 1978.
S. Ferrari and F. Cribari-Neto, "Beta regression for modelling rates and proportions," Journal of Applied Statistics, vol. 31, no. 7, pp. 799-815, 2004.
B.-C. Wei, Y.-Q. Hu, and W.-K. Fung, "Generalized leverage and its applications," Scandinavian Journal of Statistics, vol. 25, no. 1, pp. 25-37, 1998.
R. Kieschnick and B. D. McCullough, "Regression analysis of variates observed on (0, 1): percentages, proportions and fractions," Statistical Modelling, vol. 3, no. 3, pp. 193-213, 2003.
P. Paolino, "Maximum likelihood estimation of models with beta-distributed dependent variables," Political Analysis, no. 4, pp. 325-346, 2001.
K. L. Vasconcellos and F. Cribari-Neto, "Improved maximum likelihood estimation in a new class of beta regression models," Brazilian Journal of Probability and Statistics, pp. 13-31, 2005.
R. Ospina, F. Cribari-Neto, and K. L. Vasconcellos, "Improved point and interval estimation for a beta regression model," Computational Statistics & Data Analysis, vol. 51, no. 2, pp. 960-981, 2006.
P. L. Espinheira, S. L. Ferrari, and F. Cribari-Neto, "Influence diagnostics in beta regression," Computational Statistics & Data Analysis, vol. 52, no. 9, pp. 4417-4431, 2008.
A. B. Simas, W. Barreto-Souza, and A. V. Rocha, "Improved estimators for a general class of beta regression models," Computational Statistics & Data Analysis, vol. 54, no. 2, pp. 348-366, 2010.
D. M. Stasinopoulos, R. A. Rigby et al., "Generalized additive models for location scale and shape (GAMLSS) in R," Journal of Statistical Software, vol. 23, no. 7, pp. 1-46, 2007.
A. Simas and A. Rocha, "betareg: Beta regression," 2006.
F. Cribari-Neto and A. Zeileis, "Beta regression in R," 2009.
A. Zeileis, C. Kleiber, and S. Jackman, "Regression models for count data in R," Journal of Statistical Software, vol. 27, no. 8, pp. 1-25, 2008.
F. Liu and Y. Kong, "zoib: An R package for Bayesian inference for beta regression and zero/one inflated beta regression," The R Journal, vol. 7, no. 2, pp. 34-51, 2015.
A. Zeileis, "Object-oriented computation of sandwich estimators," 2006.
P. L. Espinheira, S. L. Ferrari, and F. Cribari-Neto, "On beta regression residuals," Journal of Applied Statistics, vol. 35, no. 4, pp. 407-419, 2008.
I. Kosmidis, D. Firth et al., "A generic algorithm for reducing bias in parametric estimation," Electronic Journal of Statistics, vol. 4, pp. 1097-1112, 2010.
D. J. Spiegelhalter, N. G. Best, B. P. Carlin, and A. Van Der Linde, "Bayesian measures of model complexity and fit," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 64, no. 4, pp. 583-639, 2002. |
Full-text access rights |