System ID | U0002-2907201912405200 |
---|---|
DOI | 10.6846/TKU.2019.00981 |
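Thesis title (Chinese) | 空間零膨脹對數常態模型於俄亥俄州肺癌資料之應用 |
Thesis title (English) | A spatial zero-inflated log-normal model for lung cancer data in Ohio |
Thesis title (third language) | |
Institution | Tamkang University |
Department (Chinese) | 統計學系應用統計學碩士班 (Master's Program in Applied Statistics, Department of Statistics) |
Department (English) | Department of Statistics |
Foreign degree: school name | |
Foreign degree: college name | |
Foreign degree: graduate institute name | |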
Academic year | 107 (ROC calendar; 2018-19) |
Semester | 2 |
Publication year | 108 (ROC calendar; 2019) |
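Graduate student (Chinese) | 劉昌輔 |
Graduate student (English) | Chang-Fu Liu |
Student ID | 606650223 |
Degree | Master's |
Language | Traditional Chinese |
Second language | |
Oral defense date | 2019-07-16 |
Number of pages | 28 |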
Oral defense committee |
Advisor - 張雅梅
Committee member - 吳碩傑
Committee member - 張育瑋 |
Keywords (Chinese) |
對數常態; 零膨脹; 非平穩空間模型; 最小絕對壓縮與篩選運算法; 最小角迴歸法; 空間分佈 |
Keywords (English) |
zero-inflated; lognormal; non-stationary spatial model; lasso; lars; spatial distribution |
Keywords (third language) | |
Subject classification | |
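Abstract (Chinese, translated) |
The data studied are lung cancer data for the Black population of Ohio in 1968, 1978, and 1988. The data contain a large number of zeros and are therefore zero-inflated, and the number of people with lung cancer rises year after year. We fit the data with a generalized additive model and carry out model selection with the least absolute shrinkage and selection operator (lasso). In this thesis, lasso is used to select the kernel functions and the stationary process, least angle regression (lars) is used to compute the lasso estimates, and cross-validation is used to choose the best model. Because Ohio lies in the Great Lakes industrial region of the United States, we are interested in the effect of factories on lung cancer, so we overlay a map of factory locations on the distribution map estimated from the kernel functions and examine the result. |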
Abstract (English) |
This research analyzes lung cancer data for the Black population of Ohio from 1968, 1978, and 1988. The data are zero-inflated because the responses contain a large number of zeros. We fit the data with a generalized additive model and use the least absolute shrinkage and selection operator (lasso) for model selection. Least angle regression (lars) is used to compute the lasso estimates, and cross-validation is used to select the best model. Ohio is located in the Great Lakes industrial area of the United States, so we are interested in the impact of factories on lung cancer, which we examine by overlaying the map of industry locations on the estimated distribution of lung cancer. |
Abstract (third language) | |
Table of contents |
Contents
Chapter 1 Introduction
Chapter 2 Model
Chapter 3 Estimation Methods
Section 1 Parameter estimators
Section 2 Least absolute shrinkage and selection operator (lasso)
Section 3 Cross-validation (cv)
Chapter 4 Application to Ohio Black Lung Cancer Data
Chapter 5 Conclusion
List of Figures
3.1 Illustration of lars
4.1 Mean number of deaths among Black males
4.2 Distribution of factories
4.3 Rough and model estimates of zero inflation for Black males
4.4 CV for zero inflation, Black males
4.5 CV values and estimates for the non-zero-inflated part, Black males
4.6 Rough and model estimates of the standard deviation for Black males
4.7 Correlation coefficients for Mahoning County and Cuyahoga County
4.8 Correlation coefficients for Lucas County and Guernsey County
4.9 Correlation coefficients for Meigs County and Hamilton County
List of Tables
4.1 Basis-function parameter estimates for zero inflation ( 3)
4.2 Basis-function parameter estimates for zero inflation ( 3)
4.3 Basis-function parameter estimates for the non-zero-inflated part ( 1)
4.4 Parameter variance estimates ( 2) |
Thesis full-text access rights |
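The abstract describes responses with many zeros that are modeled as zero-inflated log-normal. As a rough illustration only, the sketch below fits a plain, non-spatial zero-inflated log-normal model by maximum likelihood. It is not the thesis's spatial model: it has no kernel functions, no lasso or lars step, and no cross-validation. The function names and the toy numbers are hypothetical and are not taken from the Ohio data. The assumed model is that Y = 0 with probability p_zero, and otherwise log(Y) is normal with mean mu and standard deviation sigma.

```python
import math


def fit_zero_inflated_lognormal(values):
    """Return maximum-likelihood estimates (p_zero, mu, sigma).

    Assumed (hypothetical, non-spatial) model: Y = 0 with probability
    p_zero; otherwise log(Y) ~ Normal(mu, sigma^2).
    """
    if not values:
        raise ValueError("values must be non-empty")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    logs = [math.log(v) for v in values if v > 0]
    # The MLE of the zero probability is the observed share of zeros.
    p_zero = 1.0 - len(logs) / len(values)
    if not logs:
        return p_zero, float("nan"), float("nan")
    mu = sum(logs) / len(logs)
    # The MLE of the variance divides by n, not n - 1.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return p_zero, mu, sigma


def log_likelihood(values, p_zero, mu, sigma):
    """Log-likelihood of the model above; needs 0 < p_zero < 1, sigma > 0."""
    total = 0.0
    for v in values:
        if v == 0:
            total += math.log(p_zero)
        else:
            z = (math.log(v) - mu) / sigma
            total += (math.log(1.0 - p_zero) - math.log(v) - math.log(sigma)
                      - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z)
    return total


# Toy values, made up for illustration (not the Ohio lung cancer data).
toy = [0, 0, 0, 1.0, math.e, math.e ** 2]
p_hat, mu_hat, sigma_hat = fit_zero_inflated_lognormal(toy)
print(p_hat, mu_hat, sigma_hat)
print(log_likelihood(toy, p_hat, mu_hat, sigma_hat))
```

The zero part and the positive part separate in the likelihood, which is why the three estimates have closed forms here. A spatial version, as the thesis title suggests, would instead let these parameters vary by location.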