Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

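The electronic full text of this thesis has been publicly available off campus since 2011-07-19.
The print copy of this thesis has been publicly available since 2011-07-19.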

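System ID	U0002-0607201121284600
DOI	10.6846/TKU.2011.00202
Title (Chinese)	不完整長期追蹤二元資料之插補策略
Title (English)	Imputation Strategies for Incomplete Longitudinal Binary Data
Title (third language)
University	Tamkang University (淡江大學)
Department (Chinese)	統計學系碩士班 (Master's Program, Department of Statistics)
Department (English)	Department of Statistics
Foreign degree university
Foreign degree college
Foreign degree graduate institute
Academic year	99 (ROC calendar; 2010-2011)
Semester	2
Year of publication	100 (ROC calendar; 2011)
Author (Chinese)	李紫熒
Author (English)	Tzu-Ying Li
Student ID	698650180
Degree	Master's
Language	English
Second language
Oral defense date	2011-06-17
Pages	38
Oral defense committee	Advisor: 陳怡如; Committee member: 林國欽; Committee member: 鄧文舜
Keywords (Chinese)	長期追蹤資料 (longitudinal data); 遺失值 (missing values); 多重插補法 (multiple imputation)
Keywords (English)	Longitudinal data; Missing data; Multiple imputation
Keywords (third language)
Subject classification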
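Abstract (translated from Chinese)	Missing values commonly arise during longitudinal studies. Among the many ways of handling them, imputation is one effective approach. Demirtas and Hedeker (2007) combined multiple imputation, which has a well-developed framework under multivariate normality, with an algorithm for randomly generating binary responses in order to transform the binary data, and thereby proposed an imputation strategy for incomplete longitudinal binary data. However, the Demirtas and Hedeker (2007) method cannot guarantee that the correlation matrix is positive definite, and range restrictions must be satisfied for the correlations to have a unique solution. To ease the difficulties that may arise when using the Demirtas-Hedeker method, we propose modified imputation procedures and compare them with the Demirtas-Hedeker method under different missingness mechanisms and missing rates, using standardized bias, coverage percentage, and root-mean-squared error as benchmark measures. In addition, a real example is used in the simulation study to illustrate how the proposed methods are applied.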
Abstract (English)	Missing data are very common in longitudinal studies, and imputation is one of the effective procedures for handling them. Building on the well-developed multiple imputation for normal responses and a random number generation algorithm for binary outcomes, Demirtas and Hedeker (2007) introduced a quasi-imputation strategy for incomplete longitudinal binary data. The shortcomings of the Demirtas-Hedeker approach are that positive-definiteness of the correlation matrix cannot be guaranteed and that the correlations must satisfy a range constraint for a unique solution to exist. To address these shortcomings, we propose methods that can be regarded as modifications of the Demirtas-Hedeker method with simpler procedures. The performance of the Demirtas-Hedeker method and the proposed procedures is compared in terms of standardized bias, coverage percentage, and root-mean-squared error under various configurations of missing rates and missingness mechanisms. A real data set is used to illustrate the application of the proposed methods.
Abstract (third language)
Table of contents
Contents
1 Introduction (p. 1)
2 Description of Methodology (p. 7)
  2.1 Imputation Method (p. 7)
  2.2 Demirtas and Hedeker Approach (p. 12)
  2.3 Proposed Imputation Strategies (p. 13)
3 Simulation Study (p. 18)
  3.1 Missingness Mechanisms (p. 19)
  3.2 GEE Model (p. 20)
  3.3 Evaluation Criteria (p. 24)
  3.4 Results (p. 25)
4 Conclusion and Discussion (p. 34)
List of Tables
1 The first ten patients for each center in a trial of respiratory disease. (p. 19)
2 The parameter estimates of GEE model with independent working correlation and their standard errors, confidence intervals, test statistics as well as p-values for respiratory disease data. (p. 22)
3 The parameter estimates of GEE model with exchangeable working correlation and their standard errors, confidence intervals, test statistics as well as p-values for respiratory disease data. (p. 23)
4 The parameter estimates of GEE model with unstructured working correlation and their standard errors, confidence intervals, test statistics as well as p-values for respiratory disease data. (p. 24)
5 The performance measures of DH, M1 and M2 approaches using GEE model with independent working correlation under various missing rates of MCAR. The targeted value is -0.0656. (p. 28)
6 The performance measures of DH, M1 and M2 approaches using GEE model with independent working correlation under various missing rates of MAR. The targeted value is -0.0656. (p. 29)
7 The performance measures of DH, M1 and M2 approaches using GEE model with exchangeable working correlation under various missing rates of MCAR. The targeted value is -0.0685. (p. 30)
8 The performance measures of DH, M1 and M2 approaches using GEE model with exchangeable working correlation under various missing rates of MAR. The targeted value is -0.0685. (p. 31)
9 The performance measures of DH, M1 and M2 approaches using GEE model with unstructured working correlation under various missing rates of MCAR. The targeted value is -0.0954. (p. 32)
10 The performance measures of DH, M1 and M2 approaches using GEE model with unstructured working correlation under various missing rates of MAR. The targeted value is -0.0954. (p. 33)
References
Agresti, A. (2002). Categorical Data Analysis, 2nd edition, Wiley: New York.
Demirtas, H. and Hedeker, D. (2007). Gaussianization-based quasi-imputation and expansion strategies for incomplete correlated binary responses, Statistics in Medicine, 26, 782-799.
Diggle, P.J., Heagerty, P.J., Liang, K.Y. and Zeger, S.L. (1994). Analysis of Longitudinal Data, 2nd edition, Oxford University Press.
Emrich, L.J. and Piedmonte, R.P. (1991). A method for generating high-dimensional multivariate binary outcomes, American Statistician, 45, 302-304.
Fitzmaurice, G.M. and Lipsitz, S.R. (1995). A model for binary time series data with serial odds ratio patterns, Applied Statistics, 44, 51-61.
Hedeker, D. (2007). On imputing continuous data when the eventual interest pertains to ordinalized outcomes via threshold concept, Computational Statistics & Data Analysis, 52, 2261-2271.
Hedeker, D. and Gibbons, R.D. (1997). Application of random-effects pattern-mixture models for missing data in longitudinal studies, Psychological Methods, 2, 64-78.
Koch, G.G., Carr, G.J., Amara, I.A., Stokes, M.E. and Uryniak, T.J. (1990). Categorical data analysis. In Statistical Methodology in the Pharmaceutical Sciences, Berry, D.A. (ed.), Marcel Dekker: New York, 389-473.
Lavori, P.W., Dawson, R. and Shera, D. (1995). A multiple imputation strategy for clinical trials with truncation of patient data, Statistics in Medicine, 14, 1913-1925.
Lee, A.J. (1993). Generating random binary deviates having fixed marginal distributions and specified degrees of association, Statistical Computing, 47, 209-215.
Liang, K.Y. and Zeger, S.L. (1986). Longitudinal data analysis using generalized linear models, Biometrika, 73, 13-22.
Little, R.J.A. and Rubin, D.B. (2002). Statistical Analysis with Missing Data, 2nd edition, Wiley: New York.
Kenward, M.G. and Carpenter, J. (2007). Multiple imputation: current perspectives, Statistical Methods in Medical Research, 16, 199-218.
Rubin, D.B. (1976). Inference and missing data (with discussion), Biometrika, 63, 581-592.
Rubin, D.B. (1978). Multiple imputation in sample surveys, Proc. Survey Res. Meth. Sec., Am. Statist. Assoc., 20-34.
Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys, Wiley: New York.
Schafer, J.L. (1997). Analysis of Incomplete Multivariate Data, Chapman & Hall: London.
Schafer, J.L. (1999). Multiple imputation: a primer, Statistical Methods in Medical Research, 8, 3-15.
Stiratelli, R., Laird, N. and Ware, J.H. (1984). Random-effects models for serial observations with binary, Biometrics, 40, 961-971.
Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data, Springer: New York.
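Full-text access rights	On campus: print thesis available immediately; electronic full text authorized for on-campus access; on-campus electronic thesis available immediately. Off campus: off-campus electronic thesis authorized for immediate release.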
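
The abstract compares imputation procedures by standardized bias, coverage percentage, and root-mean-squared error, but this record does not give the formulas. The sketch below shows one common way such measures are computed from simulation replicates. The definitions used here (bias scaled by the empirical standard deviation of the estimates, nominal 95% Wald intervals) are assumptions and may differ from those in the thesis; the function name `performance_measures` and its parameters are hypothetical.

```python
import math


def performance_measures(estimates, std_errors, true_value, z=1.96):
    """Summarize simulation replicates of a single parameter estimate.

    Assumed definitions (not taken from the thesis):
      standardized bias   = 100 * |mean(est) - target| / sd(est)
      coverage percentage = % of intervals est +/- z * se containing target
      RMSE                = sqrt(mean((est - target)^2))
    """
    n = len(estimates)
    mean_est = sum(estimates) / n
    # Empirical standard deviation of the estimates across replicates.
    sd_est = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / (n - 1))
    # Absolute bias expressed as a percentage of that spread.
    standardized_bias = 100.0 * abs(mean_est - true_value) / sd_est
    # Count the Wald intervals that contain the targeted value.
    covered = sum(
        1
        for e, s in zip(estimates, std_errors)
        if e - z * s <= true_value <= e + z * s
    )
    coverage_percentage = 100.0 * covered / n
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return standardized_bias, coverage_percentage, rmse


# Toy inputs chosen only to exercise the function; they are not thesis results.
sb, cov, rmse = performance_measures([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], 2.0)
print(sb, cov, rmse)
```

In a simulation like the one the abstract describes, `estimates` and `std_errors` would hold one value per simulated data set for a given method, and `true_value` would be the targeted value for the chosen working correlation.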

If you have any questions, please feel free to contact us!
Digital Information Section, Library: (02)2621-5656 ext. 2487, or by e-mail