§ Browse thesis bibliographic record
System ID U0002-2907201912405200
DOI 10.6846/TKU.2019.00981
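Title (Chinese) 空間零膨脹對數常態模型於俄亥俄州肺癌資料之應用
Title (English) A spatial zero-inflated log-normal model for lung cancer data in Ohio
Title (third language)
Institution Tamkang University
Department (Chinese) 統計學系應用統計學碩士班
Department (English) Department of Statistics
Foreign degree: university name
Foreign degree: college name
Foreign degree: graduate institute name
Academic year 107 (ROC calendar; 2018-2019)
Semester 2
Publication year 108 (ROC calendar; 2019)
Student (Chinese) 劉昌輔
Student (English) Chang-Fu Liu
Student ID 606650223
Degree Master's
Language Traditional Chinese
Second language
Oral defense date 2019-07-16
Pages 28
Oral defense committee Advisor - 張雅梅
Member - 吳碩傑
Member - 張育瑋
Keywords (Chinese) 對數常態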
零膨脹
非平穩空間模型
最小絕對壓縮與篩選運算法
最小角迴歸法
空間分佈
Keywords (English) zero-inflated
lognormal
non-stationary spatial model
lasso
lars
spatial distribution
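Keywords (third language)
Subject classification
Abstract (translated from Chinese)
The data in this study are lung cancer data for the Black population of Ohio in 1968, 1978, and 1988. The data contain a large number of zeros and are therefore zero-inflated, and the number of lung cancer cases increases from year to year. We fit the data with a generalized additive model and use the least absolute shrinkage and selection operator (lasso) for model selection. Lasso is used to select the kernel functions and the stationary process, least angle regression (lars) is used to compute the lasso estimates, and cross-validation is used to choose the best model. Because Ohio lies in the Great Lakes industrial region of the United States, we are interested in the effect of factories on lung cancer, so we overlay a map of factory locations on the distribution map estimated from the kernel functions.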
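As a reading aid, the two ingredients named in the abstract can be written in a generic textbook form. This is a sketch under stated assumptions, not the thesis's own formulation: the symbols $p$, $\mu$, $\sigma$, $z$, $X$, $\beta$, and $\lambda$ are introduced here for illustration only, and the spatial structure (kernel functions and stationary process) that the thesis builds into these quantities is not reproduced in this record.

```latex
% Zero-inflated log-normal response: a point mass at zero plus a
% log-normal law for the positive values (p is the zero probability).
\[
  \Pr(Y = 0) = p, \qquad
  f(y \mid Y > 0) = \frac{1}{y\,\sigma\sqrt{2\pi}}
  \exp\!\left\{-\frac{(\log y - \mu)^{2}}{2\sigma^{2}}\right\},
  \quad y > 0 .
\]
% Lasso estimate for a linear predictor X\beta of a working response z,
% with tuning parameter \lambda \ge 0.
\[
  \hat{\beta}(\lambda) = \arg\min_{\beta}\;
  \tfrac{1}{2}\,\lVert z - X\beta \rVert_{2}^{2}
  + \lambda\,\lVert \beta \rVert_{1} .
\]
```

In this generic form, least angle regression traces $\hat{\beta}(\lambda)$ over the whole range of $\lambda$, and cross-validation picks the value of $\lambda$ with the smallest estimated prediction error.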
Abstract (English)
This research analyzes lung cancer data for the Black population of Ohio from 1968, 1978, and 1988. The data are zero-inflated because the responses contain a large number of zeros. We fit the data with a generalized additive model and use the least absolute shrinkage and selection operator (lasso) for model selection. Least angle regression (lars) is used to compute the lasso estimates, and cross-validation is used to select the best model. Because Ohio is located in the Great Lakes industrial area of the United States, we are interested in the impact of factories on lung cancer, which we examine by overlaying a map of industry locations on the estimated spatial distribution of lung cancer.
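To make the term "zero-inflated log-normal" concrete, the following minimal sketch fits a non-spatial two-part model to simulated data: the zero probability is estimated by the share of zeros, and the log-normal parameters by the mean and standard deviation of the logged positive values. It is an illustration only, not the thesis's spatial model. The function names, the simulated data, and the parameter values are all hypothetical, and the thesis's kernel functions, lasso selection, and cross-validation are not included.

```python
import math
import random


def fit_zero_inflated_lognormal(y):
    """Fit a two-part model: P(Y = 0) = p, log(Y) | Y > 0 ~ Normal(mu, sigma^2).

    Returns (p_hat, mu_hat, sigma_hat). Values in y are assumed non-negative.
    """
    if any(v < 0 for v in y):
        raise ValueError("observations must be non-negative")
    positives = [v for v in y if v > 0]
    if len(positives) < 2:
        raise ValueError("need at least two positive observations")
    # Zero probability: share of exact zeros in the sample.
    p_hat = 1.0 - len(positives) / len(y)
    logs = [math.log(v) for v in positives]
    mu_hat = sum(logs) / len(logs)
    # Maximum-likelihood variance (divides by n, not n - 1).
    var_hat = sum((x - mu_hat) ** 2 for x in logs) / len(logs)
    return p_hat, mu_hat, math.sqrt(var_hat)


def simulate(n, p, mu, sigma, seed=0):
    """Draw n values that are 0 with probability p, else log-normal(mu, sigma)."""
    rng = random.Random(seed)
    return [0.0 if rng.random() < p else rng.lognormvariate(mu, sigma)
            for _ in range(n)]


# Hypothetical parameter values chosen only for the demonstration.
y = simulate(20000, p=0.6, mu=1.0, sigma=0.5)
p_hat, mu_hat, sigma_hat = fit_zero_inflated_lognormal(y)
print(f"p = {p_hat:.3f}, mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}")
```

Because the zero part and the positive part have separate parameters in this simple setting, the two pieces can be estimated independently. A spatial version would instead let the zero probability and the log-scale mean vary by location.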
Abstract (third language)
Table of contents
Contents
Chapter 1 Introduction
Chapter 2 Model
Chapter 3 Estimation methods
Section 1 Parameter estimators
Section 2 Least absolute shrinkage and selection operator (lasso)
Section 3 Cross-validation (cv)
Chapter 4 Application to lung cancer data for the Black population of Ohio
Chapter 5 Conclusion

List of figures
3.1 Schematic of lars
4.1 Mean number of deaths among Black males
4.2 Map of factory locations
4.3 Crude and model-based estimates of zero inflation for Black males
4.4 CV for zero inflation, Black males
4.5 CV values and estimates for the non-zero-inflated part, Black males
4.6 Crude and model-based estimates of the standard deviation for Black males
4.7 Correlation coefficients for Mahoning County and Cuyahoga County
4.8 Correlation coefficients for Lucas County and Guernsey County
4.9 Correlation coefficients for Meigs County and Hamilton County

List of tables
4.1 Basis-function parameter estimates for the zero-inflated part ( 3)
4.2 Basis-function parameter estimates for the zero-inflated part ( 3)
4.3 Basis-function parameter estimates for the non-zero-inflated part ( 1)
4.4 Estimates of the variance parameters ( 2)
References
[1] H. Xia, B. P. Carlin, and L. A. Waller, “Hierarchical models for mapping Ohio lung cancer rates,” Environmetrics, vol. 8, no. 2, pp. 107–120, 1997.
[2] H. Xia and B. P. Carlin, “Spatio-temporal models with errors in covariates: mapping Ohio lung cancer mortality,” Statistics in Medicine, vol. 17, no. 18, pp. 2025–2043, 1998.
[3] H. K. Lim, W. K. Li, and P. L. H. Yu, “Zero-inflated Poisson regression mixture model,” Computational Statistics & Data Analysis, vol. 71, pp. 151–158, 2014.
[4] W. H. Greene, “Accounting for excess zeros and sample selection in Poisson and negative binomial regression models,” 1994.
[5] K.-Y. Wong and K. Lam, “Modeling zero-inflated count data using a covariate-dependent random effect model,” Statistics in Medicine, vol. 32, no. 8, pp. 1283–1293, 2013.
[6] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society: Series B (Methodological), pp. 267–288, 1996.
[7] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, “Least angle regression,” The Annals of Statistics, vol. 32, no. 2, pp. 407–499, 2004.
[8] H. Zou, “The adaptive lasso and its oracle properties,” Journal of the American Statistical Association, vol. 101, no. 476, pp. 1418–1429, 2006.
[9] P. Zeng, Y. Wei, Y. Zhao, J. Liu, L. Liu, R. Zhang, J. Gou, S. Huang, and F. Chen, “Variable selection approach for zero-inflated count data via adaptive lasso,” Journal of Applied Statistics, vol. 41, no. 4, pp. 879–894, 2014.
[10] A. E. Hoerl and R. W. Kennard, “Ridge regression: Biased estimation for nonorthogonal problems,” Technometrics, vol. 12, no. 1, pp. 55–67, 1970.
[11] H. Zou and T. Hastie, “Regularization and variable selection via the elastic net,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 67, no. 2, pp. 301–320, 2005.
[12] M. Yuan and Y. Lin, “Model selection and estimation in regression with grouped variables,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 68, no. 1, pp. 49–67, 2006.
[13] H. Liu and K.-S. Chan, “Introducing COZIGAM: an R package for unconstrained and constrained zero-inflated generalized additive model analysis,” Journal of Statistical Software, vol. 35, no. 11, pp. 1–26, 2010.
[14] W. J. Fu, “Penalized regressions: the bridge versus the lasso,” Journal of Computational and Graphical Statistics, vol. 7, no. 3, pp. 397–416, 1998.
[15] D. L. Donoho and I. M. Johnstone, “Ideal spatial adaptation by wavelet shrinkage,” Biometrika, vol. 81, no. 3, pp. 425–455, 1994.
[16] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society: Series B (Methodological), pp. 1–38, 1977.
[17] Y. Fan and C. Y. Tang, “Tuning parameter selection in high dimensional penalized likelihood,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 75, no. 3, pp. 531–552, 2013.
[18] G. H. Golub, M. Heath, and G. Wahba, “Generalized cross-validation as a method for choosing a good ridge parameter,” Technometrics, vol. 21, no. 2, pp. 215–223, 1979.
[19] M. Y. Park and T. Hastie, “L1-regularization path algorithm for generalized linear models,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 69, no. 4, pp. 659–677, 2007.
[20] R. Tibshirani, “The lasso method for variable selection in the Cox model,” Statistics in Medicine, vol. 16, no. 4, pp. 385–395, 1997.
[21] M. Alfò and G. Trovato, “Semiparametric mixture models for multivariate count data, with application,” The Econometrics Journal, vol. 7, no. 2, pp. 426–454, 2004.
[22] P. Wang, M. L. Puterman, I. Cockburn, and N. Le, “Mixed Poisson regression models with covariate dependent rates,” Biometrics, pp. 381–400, 1996.
[23] P. Wilson, “The misuse of the Vuong test for non-nested models to test for zero-inflation,” Economics Letters, vol. 127, pp. 51–53, 2015.
[24] J. Van den Broek, “A score test for zero inflation in a Poisson distribution,” Biometrics, pp. 738–743, 1995.
[25] Q. H. Vuong, “Likelihood ratio tests for model selection and non-nested hypotheses,” Econometrica, pp. 307–333, 1989.
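Full-text access rights
On campus
Print thesis available 5 years after submission of the authorization form
Author consents to on-campus release of the electronic full text
On-campus electronic thesis available 5 years after submission of the authorization form
Off campus
Authorization granted
Off-campus electronic thesis available 5 years after submission of the authorization form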
