System ID | U0002-2701202112495000 |
---|---|
DOI | 10.6846/TKU.2021.00732 |
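Thesis title (Chinese) | 透過數據分析建立疾病風險預測模式 |
Thesis title (English) | Establishment of Disease Risk Prediction Models Based on Data Analysis |
Thesis title (third language) | |
University | Tamkang University (淡江大學) |
Department (Chinese) | 資訊工程學系博士班 |
Department (English) | Department of Computer Science and Information Engineering |
Foreign degree: university | |
Foreign degree: college | |
Foreign degree: graduate institute | |
Academic year | 109 (ROC calendar; 2020-2021) |
Semester | 1 |
Year of publication | 110 (ROC calendar; 2021) |
Graduate student (Chinese) | 李修安 |
Graduate student (English) | Hsiu-An Lee |
Student ID | 805410031 |
Degree | Doctoral |
Language | English |
Second language | |
Oral defense date | 2021-01-13 |
Number of pages | 87 |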
Oral defense committee |
Advisor - 趙榮耀 (chaory@gmail.com)
Committee member - 徐建業 (cyhsu@ntunhs.edu.tw)
Committee member - 郭博昭
Committee member - 趙榮耀 (chaory@gmail.com)
Committee member - 陳建彰 (ccchen34@mail.tku.edu.tw)
Committee member - 郭經華 (chkuo@mail.tku.edu.tw) |
Keywords (Chinese) |
預測模型 數據分析 疾病預防 精準健康管理 |
Keywords (English) |
Prediction Model Data Analysis Disease Prevention Precision Health Management |
Keywords (third language) | |
Subject classification | |
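Abstract (Chinese, translated into English) |
As medical technology continues to advance, average human life expectancy has increased year by year, yet disease remains the leading cause of death. In Taiwan, malignant tumors have ranked first among the ten leading causes of death throughout the past decade, and lung cancer ranks first among malignant tumors. Early diagnosis of cancer is very important: cancer diagnosed at an early stage can usually be cured by surgery and adjuvant therapy. Many explanations have been proposed for what causes cancer, including comorbidities, genetics, and dietary and lifestyle habits. Over recent decades physicians and scientists have continued to search for the causes of disease, but there is still no conclusive evidence that the onset of a latent disease can be identified precisely. Disease prevention and early cancer diagnosis are therefore becoming increasingly important. With the support of scientific evidence, data analysis can now be used to identify relationships between different diseases, so that when certain symptoms appear, cancer can be detected before it progresses and treated immediately for a better prognosis. This study aims to establish a precise framework for applying and analyzing medical data and, on that basis, to develop disease prediction models. Using Taiwan's National Health Insurance database, we apply big-data analysis methods to find potential factors linking different diseases to lung cancer, and compare them with evidence-based medical research to confirm the associations between factors. We then use the least absolute shrinkage and selection operator (LASSO) and a deep neural network (DNN) to design a new data-science-based process for building prediction models. Finally, the study uses this process to build models for two different cases. The first is a ten-year lung cancer risk prediction model: a deep neural network calculates the likelihood of developing lung cancer from 13 different diseases and can help potential patients detect lung cancer earlier. The model achieves an accuracy of 85.4%, a sensitivity of 72.4%, a specificity of 85%, and an area under the ROC curve of 87.4% (95% CI, 0.8604-0.8885). The second is a prediction model for three-year survival after lung cancer treatment under different treatment methods. Logistic regression and an artificial neural network were each used to build a survival prediction model based on five factors; the best model in our study was the artificial neural network, with an accuracy of 82.7%, a sensitivity of 77.6%, a specificity of 76.8%, and an AUROC of 81%. Both models proposed in this study are more accurate than earlier models. The first is a highly accurate disease prediction model grounded in scientific data analysis. The second can serve as a decision-making reference when choosing among therapies; the study also found that regular use of antihypertensive medication may be a protective factor in lung cancer treatment. |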
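The abstract names LASSO as the factor-selection step that precedes the neural network. As a rough illustration of why LASSO can select factors at all, the sketch below minimises a least-squares loss with an L1 penalty by cyclic coordinate descent; the soft-threshold step sets weak coefficients to exactly zero. This is not the thesis's code: the function names, the toy data, the penalty value `lam=0.1`, and the use of a linear rather than logistic loss are all assumptions made for brevity.

```python
from typing import List


def soft_threshold(rho: float, lam: float) -> float:
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(
    X: List[List[float]], y: List[float], lam: float, n_sweeps: int = 100
) -> List[float]:
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # sum of feature j times the residual that ignores feature j
            z = 0.0    # sum of squared values of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w


# Toy data (made up, not from the thesis): the outcome depends strongly on
# factor 0, very weakly on factor 2, and not at all on factor 1.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.05, 1.95, -2.05, -1.95]

weights = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("weights:", weights)
print("selected factor indices:", selected)
```

In a pipeline like the one the abstract describes, only the factors with nonzero weights would be passed on to the downstream model; a larger `lam` keeps fewer factors.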
Abstract (English) |
With the continuous development of medical technology, average human life expectancy has been increasing year by year. However, disease is still the main cause of death, and cancer has been the leading cause of death in Taiwan in recent decades. Cancer is usually curable by surgery and adjuvant therapy when diagnosed at an early stage. Early-stage cancer can usually be operated on, but elderly patients may recover slowly from treatment: being bedridden for a few weeks affects their general condition and can prevent a full recovery. To reconcile the pros and cons of treatment for the elderly, it is necessary to balance over-treatment and under-treatment. Therefore, early diagnosis and disease prevention are becoming more and more important. The relationships between different diseases can be identified through medical data analysis. When certain symptoms appear, cancer can be found before it is advanced, and immediate treatment then leads to a better prognosis. This study aims to establish an architecture for medical data analysis and to design a disease prediction model. Based on the National Health Insurance Research Database (NHIRD), we look for potential correlates of disease and compare them with evidence-based medical research to confirm the associations between factors. Finally, by employing the Least Absolute Shrinkage and Selection Operator (LASSO) and deep neural network (DNN) methods, we design a new approach to building prediction models. Two models are established in this study using different methods. The first is a prediction model for lung cancer: a deep neural network calculates the probability of lung cancer from a patient's previously diagnosed diseases, so that lung cancer can be detected earlier in potential patients. Based on only 13 factors, the model achieves an accuracy of 85.4%, a sensitivity of 72.4%, a specificity of 85%, and an area under the ROC curve (AUROC) of 87.4% (95% CI, 0.8604-0.8885). The second is a prediction model for the survival rate of lung cancer patients under different treatments. Based on only 5 factors, the model achieves an accuracy of 82.7%, a sensitivity of 77.6%, a specificity of 76.8%, and an AUROC of 81%. Both models perform better than those in previous studies. The first model applies scientific data analysis to develop a highly accurate lung cancer prediction model. The second model can serve as a reference for decisions among treatment options. In addition, this study found that lung cancer patients with hypertension tend to have a lower death rate. |
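The abstract reports each model through accuracy, sensitivity, specificity, and AUROC. The sketch below shows how these four figures are conventionally derived from a binary classifier's predictions. The class and function names are illustrative, and the counts and scores are invented for easy arithmetic; they are not the thesis's data and do not reproduce its results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive = disease present)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    def accuracy(self) -> float:
        # share of all cases classified correctly
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    def sensitivity(self) -> float:
        # share of actual positives that the model flags
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # share of actual negatives that the model clears
        return self.tn / (self.tn + self.fp)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, chosen only to keep the arithmetic simple.
cm = ConfusionMatrix(tp=70, fp=30, tn=170, fn=30)
print("accuracy:", cm.accuracy())
print("sensitivity:", cm.sensitivity())
print("specificity:", cm.specificity())
print("AUROC:", auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

Accuracy, sensitivity, and specificity depend on the decision threshold used to turn scores into predictions, whereas AUROC summarises ranking quality across all thresholds, which is presumably why the abstract quotes both kinds of figure.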
Abstract (third language) | |
Table of contents |
Chinese Abstract
Abstract
List of Figures
List of Tables
Chapter I. Introduction
1.1. Background
1.2. Motivation
1.3. Research Purpose
1.4. The Frame of the Dissertation
Chapter II. Literature Review
2.1. Availability and Application of Health Data
2.2. Analysis of Health Data
2.3. Comorbidity and Clinical Evidence of Lung Cancer
2.4. Development and Results of Prediction Models
Chapter III. Methodology and Proposed Approach
3.1. Data Processing and Integration
A. Data Source
B. Data Pre-Processing
C. Heterogeneous Data Integration
D. NHIRD Data Description
3.2. Evidence of Factors Association
A. Propensity Score Match
B. Literature Review of Evidence-Based Medicine
C. Chi-Square Test
3.3. Establishment and Adjustment of the Prediction Model
A. Artificial Neural Network
B. Binary Logistic Regression
3.4. New Approach for Prediction Model Establishment
A. Least Absolute Shrinkage and Selection Operator (LASSO)
B. Deep Neural Network
3.5. Validation and Evaluation
A. Confusion Metrics
B. Receiver Operating Characteristic (ROC) Curve
Chapter IV. Case Studies and Results
4.1. Ten-year prediction model for lung cancer
A. Data Resources, Processing, and Demography
B. Factor Selection
C. Model Establishment and Evaluation
4.2. Three-year survival prediction model for lung cancer treatment
A. Data Resources, Processing, and Demography
B. Factor Selection
C. Model Establishment and Evaluation
D. Web-based system of lung cancer treatment survival prediction
Chapter V. Conclusion and Discussion
Reference
List of Figures
Figure 1 Per Capita Expenditures in US $ by Disease Category, 2000 - 2013 [7]
Figure 2 The Frame of the Dissertation
Figure 3 The Working Flowchart of Data Conversion Procedure
Figure 4 R Procedure Code
Figure 5 The Workflow Stages of Data Integration
Figure 6 Relational Data Diagram
Figure 7 Evidence of Factors Finding Workflow
Figure 8 Common Prediction Model Establishment Process
Figure 9 ANN Workflow Diagram
Figure 10 A Simply Binary Logistic Regression Curve
Figure 11 Research Process
Figure 12 Residual Squared Sum Geometry in Quadratic Function
Figure 13 Residual Squared Sum Geometry with L1 Normalization in Quadratic Function
Figure 14 DNN Model Diagram
Figure 15 Neuron Training Flow Chart
Figure 16 The Process Flow of Building a DNN Model
Figure 17 Different Performance ROC Comparison Chart
Figure 18 Data Processing Example
Figure 19 DNN model structure
Figure 20 ROC Plot of the Model
Figure 21 The Procedure of Extracting Data from NHIRD
Figure 22 The Procedures of Dividing Samples into Two Groups
Figure 23 Comparison of the ROC Curves of the ANN and LR Models
Figure 24 Display of Survival Probability with Different Factors of Lung Cancer Therapy
List of Tables
Table 1 NHIRD Basic Data Sheet
Table 2 Confusion Matrix Table
Table 3 Subject Demographics
Table 4 Clinical characteristics
Table 5 The Coefficient and P-value of Each Factor
Table 6 Testing Result of Different DNN Structure
Table 7 Taiwan’s NHIRD Code Mapping TNM Table
Table 8 Subject Demographics
Table 9 Different Therapy Distribution Table
Table 10 χ2 (Chi-square) Analysis for Correlation Between Death and Different Factors |
References |
1. World Health Organization, The top 10 causes of death. 2018; Available from: https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death.
2. Taiwan Ministry of Health and Welfare, 2017 cause of death statistics analysis. 2017.
3. Karihtala, Peeter and Ulla Puistola, Syöpä iäkkäällä naisella. Duodecim, 2015. 131: p. 1507-1512.
4. Ng, O., E. Watts, C. A. Bull, R. Morris, A. Acheson and A. Banerjea, Colorectal cancer outcomes in patients aged over 85 years. The Annals of The Royal College of Surgeons of England, 2016. 98(03): p. 216-221.
5. Hennequin, C., S. Guillerm and L. Quero, Radiotherapy in elderly patients, recommendations for the main localizations: Breast, prostate and gynaecological cancers. Cancer radiotherapie: journal de la Societe francaise de radiotherapie oncologique, 2015. 19(6-7): p. 397-403.
6. Naeim, Arash, Matti Aapro, Rashmi Subbarao and Lodovico Balducci, Supportive care considerations for older adults with cancer. Journal of Clinical Oncology, 2014. 32(24): p. 2627-2634.
7. Tracker, Peterson-Kaiser Health System, Kaiser Family Foundation analysis of the Bureau of Economic. 2016.
8. Team, R Core, R: A language and environment for statistical computing. 2013.
9. Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss and Vincent Dubourg, Scikit-learn: Machine learning in Python. Journal of machine learning research, 2011. 12(Oct): p. 2825-2830.
10. Van Rossum, Guido and Fred L. Drake Jr, Python reference manual. 1995: Centrum voor Wiskunde en Informatica Amsterdam.
11. Grose, Derek and Robert Milroy, Chronic obstructive pulmonary disease a complex comorbidity. Journal of Comorbidity, 2011. 1: p. 45-50.
12. Young, Robert P., Raewyn J. Hopkins, Timothy Christmas, Peter N. Black, P. Metcalf and G. D. Gamble, COPD prevalence is increased in lung cancer, independent of age, sex and smoking history. European Respiratory Journal, 2009. 34(2): p. 380-386.
13. Yu, Yang-Hao, Chien-Chang Liao, Wu-Huei Hsu, Hung-Jen Chen, Wei-Chih Liao, Chih-Hsin Muo, Fung-Chang Sung and Chih-Yi Chen, Increased lung cancer risk among patients with pulmonary tuberculosis: a population cohort study. Journal of Thoracic Oncology, 2011. 6(1): p. 32-37.
14. Wang, Sunny, Melisa L. Wong, Nathan Hamilton, J. Ben Davoren, Thierry M. Jahan and Louise C. Walter, Impact of age and comorbidity on non-small-cell lung cancer treatment in older veterans. Journal of clinical oncology, 2012. 30(13): p. 1447-1455.
15. Iqbal, Usman, Phung-Anh Nguyen, Shabbir Syed-Abdul, Hsuan-Chia Yang, Chih-Wei Huang, Wen-Shan Jian, Min-Huei Hsu, Yun Yen and Yu-Chuan Jack Li, Is long-term use of benzodiazepine a risk for cancer? Medicine, 2015. 94(6).
16. Chen, Yu-Chun, Hsiao-Yun Yeh, Jau-Ching Wu, Ingo Haschler, Tzeng-Ji Chen and Thomas Wetter, Taiwan’s National Health Insurance Research Database: administrative health care database as study object in bibliometrics. Scientometrics, 2011. 86(2): p. 365-380.
17. D'Amelio Jr, AM, A Cassidy, K Asomaning, OY Raji, SW Duffy, JK Field, MR Spitz, D Christiani and Carol J Etzel, Comparison of discriminatory power and accuracy of three lung cancer risk models. British journal of cancer, 2010. 103(3): p. 423.
18. Etzel, Carol J, Sumesh Kachroo, Mei Liu, Anthony D'Amelio, Qiong Dong, Michele L Cote, Angela S Wenzlaff, Waun Ki Hong, Anthony J Greisinger and Ann G Schwartz, Development and validation of a lung cancer risk prediction model for African-Americans. Cancer Prevention Research, 2008. 1(4): p. 255-265.
19. Field, John K, Olaide Y Raji and Stephen W Duffy, Predictive Accuracy of the Liverpool Lung Project Risk Model. Annals of internal medicine, 2013. 158(7): p. 568-569.
20. Bach, Peter B., Michael W. Kattan, Mark D. Thornquist, Mark G. Kris, Ramsey C. Tate, Matt J. Barnett, Lillian J. Hsieh and Colin B. Begg, Variations in lung cancer risk among smokers. Journal of the National Cancer Institute, 2003. 95(6): p. 470-478.
21. Cassidy, Adrian, Jonathan P. Myles, Martie van Tongeren, R. D. Page, T. Liloglou, S. W. Duffy and J. K. Field, The LLP risk model: an individual risk prediction model for lung cancer. British journal of cancer, 2008. 98(2): p. 270.
22. Tammemägi, Martin C., Hormuzd A. Katki, William G. Hocking, Timothy R. Church, Neil Caporaso, Paul A. Kvale, Anil K. Chaturvedi, Gerard A. Silvestri, Tom L. Riley and John Commins, Selection criteria for lung-cancer screening. New England Journal of Medicine, 2013. 368(8): p. 728-736.
23. Spitz, Margaret R., Waun Ki Hong, Christopher I. Amos, Xifeng Wu, Matthew B. Schabath, Qiong Dong, Sanjay Shete and Carol J. Etzel, A risk model for prediction of lung cancer. Journal of the National Cancer Institute, 2007. 99(9): p. 715-726.
24. World Health Organization, WHO eHealth Resolution. 2005; Available from: https://www.who.int/healthacademy/news/en/.
25. Gunter, Tracy D and Nicolas P Terry, The emergence of national electronic health record architectures in the United States and Australia: models, costs, and questions. Journal of medical Internet research, 2005. 7(1): p. e3.
26. Dinov, Ivo D, Methodological challenges and analytic opportunities for modeling and interpreting Big Healthcare Data. Gigascience, 2016. 5(1): p. 12.
27. Rumsfeld, John S, Karen E Joynt and Thomas M Maddox, Big data analytics to improve cardiovascular care: promise and challenges. Nature Reviews Cardiology, 2016. 13(6): p. 350.
28. Slobogean, Gerard P, Peter V Giannoudis, Frede Frihagen, Mary L Forte, Saam Morshed and Mohit Bhandari, Bigger data, bigger problems. Journal of orthopaedic trauma, 2015. 29: p. S43-S46.
29. Scruggs, Sarah B, Karol Watson, Andrew I Su, Henning Hermjakob, John R Yates III, Merry L Lindsey and Peipei Ping, Harnessing the heart of big data. Circulation research, 2015. 116(7): p. 1115-1119.
30. Wang, Weiqi and Eswar Krishnan, Big data and clinicians: a review on the state of the science. JMIR medical informatics, 2014. 2(1): p. e1.
31. Bellazzi, Riccardo and Blaz Zupan, Predictive data mining in clinical medicine: current issues and guidelines. International journal of medical informatics, 2008. 77(2): p. 81-97.
32. Binder, Harald and Maria Blettner, Big Data in Medical Science—a Biostatistical View: Part 21 of a Series on Evaluation of Scientific Publications. Deutsches Ärzteblatt International, 2015. 112(9): p. 137.
33. Berwick, Donald M, Connected for health: using electronic health records to transform care delivery. 2010: John Wiley & Sons.
34. Collen, Morris F and Corinne Linden, Screening in a group practice prepaid medical care plan: As applied to periodic health examinations. Journal of Clinical Epidemiology, 1955. 2(4): p. 400-408.
35. Abacha, Asma Ben and Pierre Zweigenbaum, MEANS: A medical question-answering system combining NLP techniques and semantic Web technologies. Information Processing & Management, 2015. 51(5): p. 570-594.
36. Ceusters, Werner, Filip Buekens, Georges De Moor and Andra Waagmeester, The distinction between linguistic and conceptual semantics in medical terminology and its implication for NLP-based knowledge acquisition. Methods of information in medicine, 1998. 37(04/05): p. 327-333.
37. Dobrokhotov, Pavel B, Cyril Goutte, Anne-Lise Veuthey and Eric Gaussier, Combining NLP and probabilistic categorisation for document and term selection for Swiss-Prot medical annotation. Bioinformatics, 2003. 19(suppl_1): p. i91-i94.
38. Shen, Dinggang, Guorong Wu and Heung-Il Suk, Deep learning in medical image analysis. Annual review of biomedical engineering, 2017. 19: p. 221-248.
39. Litjens, Geert, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen Awm Van Der Laak, Bram Van Ginneken and Clara I. Sánchez, A survey on deep learning in medical image analysis. Medical image analysis, 2017. 42: p. 60-88.
40. Milletari, Fausto, Nassir Navab and Seyed-Ahmad Ahmadi. V-net: Fully convolutional neural networks for volumetric medical image segmentation. in 2016 Fourth International Conference on 3D Vision (3DV). 2016. IEEE.
41. Connolly, Natalia, Julia Anixt, Patty Manning, Daniel Ping‐I Lin, Keith A Marsolo and Katherine Bowers, Maternal metabolic risk factors for autism spectrum disorder—an analysis of electronic medical records and linked birth data. Autism Research, 2016. 9(8): p. 829-837.
42. Eisenberg, Michael L, Shufeng Li, Mark R Cullen and Laurence C Baker, Increased risk of incident chronic medical conditions in infertile men: analysis of United States claims data. Fertility and Sterility, 2016. 105(3): p. 629-636.
43. Shah, Tejal, Fethi Rabhi and Pradeep Ray, Investigating an ontology-based approach for Big Data analysis of inter-dependent medical and oral health conditions. Cluster Computing, 2015. 18(1): p. 351-367.
44. Ettehad, Dena, Connor A. Emdin, Amit Kiran, Simon G. Anderson, Thomas Callender, Jonathan Emberson, John Chalmers, Anthony Rodgers and Kazem Rahimi, Blood pressure lowering for prevention of cardiovascular disease and death: a systematic review and meta-analysis. The Lancet, 2016. 387(10022): p. 957-967.
45. Walling, Anne M., Jane C. Weeks, Katherine L. Kahn, Diana Tisnado, Nancy L. Keating, Sydney M. Dy, Neeraj K. Arora, Jennifer W. Mack, Philip M. Pantoja and Jennifer L. Malin, Symptom Prevalence in Lung and Colorectal Cancer Patients. Journal of Pain and Symptom Management, 2015. 49(2): p. 192-202.
46. Yu, Yuan-Bin, Jyh-Pyng Gau, Chun-Yu Liu, Muh-Hwa Yang, Shu-Chiung Chiang, Hui-Chi Hsu, Ying-Chung Hong, Liang-Tsai Hsiao, Jin-Hwang Liu, Tzeon-Jye Chiou, Po-Min Chen, Tzong-Shyuan Lee, Li-Fang Chou, Cheng-Hwai Tzeng and Tzeng-Ji Chen, A nation-wide analysis of venous thromboembolism in 497,180 cancer patients with the development and validation of a risk-stratification scoring system. Thromb Haemost, 2012. 108(1).
47. Wu, Mei-Yi, Yung-Ho Hsu, Chien-Ling Su, Yuh-Feng Lin and Hui-Wen Lin, Risk of Herpes Zoster in CKD: A Matched-Cohort Study Based on Administrative Data. American journal of kidney diseases, 2012.
48. Wei, Po-Li, Joseph J. Keller, Hung-Hua Liang and Herng-Ching Lin, Acute appendicitis and adverse pregnancy outcomes: a nationwide population-based study. Journal of Gastrointestinal Surgery, 2012. 16(6): p. 1204-1211.
49. Valderas, Jose M., Barbara Starfield, Bonnie Sibbald, Chris Salisbury and Martin Roland, Defining comorbidity: implications for understanding health and health services. The Annals of Family Medicine, 2009. 7(4): p. 357-363.
50. Cassidy, Adrian, Stephen W Duffy, Jonathan P Myles, Triantafillos Liloglou and John K Field, Lung cancer risk prediction: a tool for early detection. International journal of cancer, 2007. 120(1): p. 1-6.
51. Tammemagi, C Martin, Christine Neslund-Dudas, Michael Simoff and Paul Kvale, Smoking and lung cancer survival: the role of comorbidity and treatment. Chest, 2004. 125(1): p. 27-37.
52. Tammemagi, C. Martin, Christine Neslund‐Dudas, Michael Simoff and Paul Kvale, Impact of comorbidity on lung cancer survival. International journal of cancer, 2003. 103(6): p. 792-802.
53. Janssen-Heijnen, Maryska L. G., S. Smulders, Vepp Lemmens, Frank W. J. M. Smeenk, Hjaa Van Geffen and J. W. W. Coebergh, Effect of comorbidity on the treatment and prognosis of elderly patients with non-small cell lung cancer. Thorax, 2004. 59(7): p. 602-607.
54. Jeremic, Branislav, Impact of comorbidity on survival after surgical resection in patients with stage I non-small cell lung cancer. The Journal of thoracic and cardiovascular surgery, 2003. 125(2): p. 444-445.
55. Jian, Zhi-Hong, Jing-Yang Huang, Pei-Chieh Ko, Shiou-Rung Jan, Oswald Ndi Nfor, Chia-Chi Lung, Wen-Yuan Ku, Chien-Chang Ho, Hui-Hsien Pan and Yung-Po Liaw, Impact of coexisting pulmonary diseases on survival of patients with lung adenocarcinoma: a STROBE-compliant article. Medicine, 2015. 94(4).
56. Crabtree, Traves D., Chadrick E. Denlinger, Bryan F. Meyers, Issam El Naqa, Jennifer Zoole, A. Sasha Krupnick, Daniel Kreisel, G. Alexander Patterson and Jeffrey D. Bradley, Stereotactic body radiation therapy versus surgical resection for stage I non–small cell lung cancer. The Journal of thoracic and cardiovascular surgery, 2010. 140(2): p. 377-386.
57. Donzé, Jacques, Drahomir Aujesky, Deborah Williams and Jeffrey L. Schnipper, Potentially avoidable 30-day hospital readmissions in medical patients: derivation and validation of a prediction model. JAMA internal medicine, 2013. 173(8): p. 632-638.
58. López-Martínez, Fernando, Aron Schwarcz, Edward Rolando Núñez-Valdez and Vicente García-Díaz, Machine learning classification analysis for a hypertensive population as a function of several risk factors. Expert Systems with Applications, 2018. 110: p. 206-215.
59. Honda, Takanori, Daigo Yoshida, Jun Hata, Yoichiro Hirakawa, Yuki Ishida, Mao Shibata, Satoko Sakata, Takanari Kitazono and Toshiharu Ninomiya, Development and validation of modified risk prediction models for cardiovascular disease and its subtypes: The Hisayama Study. Atherosclerosis, 2018. 279: p. 38-44.
60. Hoggart, Clive, Paul Brennan, Anne Tjonneland, Ulla Vogel, Kim Overvad, Jane Nautrup Østergaard, Rudolf Kaaks, Federico Canzian, Heiner Boeing and Annika Steffen, A risk model for lung cancer incidence. Cancer Prevention Research, 2012. 5(6): p. 834-846.
61. Chen, Yu-Chun, Jau-Ching Wu, Tzeng-Ji Chen and Thomas Wetter, A publicly available database accelerates academic production. BMJ, 2011. 342: p. d637.
62. Hsing, Ann W. and John PA Ioannidis, Nationwide Population Science: Lessons From the Taiwan National Health Insurance Research Database. JAMA internal medicine, 2015. 175(9): p. 1527-1529.
63. Yang, Shun-Fa, Yu-Hsun Wang, Ni-Yu Su, Hui-Chieh Yu, Chia-Yi Wei, Chuan-Hang Yu and Yu-Chao Chang, Changes in prevalence of precancerous oral submucous fibrosis from 1996 to 2013 in Taiwan: A nationwide population-based retrospective study. Journal of the Formosan Medical Association, 2018. 117(2): p. 147-152.
64. Wang, Tung-Yuan, Yu-Wei Chiu, Yi-Tzu Chen, Yu-Hsun Wang, Hui-Chieh Yu, Chuan-Hang Yu and Yu-Chao Chang, Malignant transformation of Taiwanese patients with oral leukoplakia: A nationwide population-based retrospective cohort study. Journal of the Formosan Medical Association, 2018. 117(5): p. 374-380.
65. World Health Organization, ICD-O: International classification of diseases for oncology, in ICD-O: International classification of diseases for oncology. 1976.
66. Quan, Hude, Vijaya Sundararajan, Patricia Halfon, Andrew Fong, Bernard Burnand, Jean-Christophe Luthi, L Duncan Saunders, Cynthia A Beck, Thomas E Feasby and William A Ghali, Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Medical care, 2005: p. 1130-1139.
67. Dolin, Robert H, Liora Alschuler, Calvin Beebe, Paul V Biron, Sandra Lee Boyer, Daniel Essin, Elliot Kimber, Tom Lincoln and John E Mattison, The HL7 clinical document architecture. Journal of the American Medical Informatics Association, 2001. 8(6): p. 552-569.
68. Wacholder, Sholom, Joseph K. McLaughlin, Debra T. Silverman and Jack S. Mandel, Selection of controls in case-control studies: I. Principles. American journal of epidemiology, 1992. 135(9): p. 1019-1028.
69. Rosenbaum, Paul R. and Donald B. Rubin, The central role of the propensity score in observational studies for causal effects. Biometrika, 1983. 70(1): p. 41-55.
70. Eddy, David M., Practice policies: where do they come from? JAMA, 1990. 263(9): p. 1265-1275.
71. Wilson, Peter W. F., Ralph B. D’Agostino, Daniel Levy, Albert M. Belanger, Halit Silbershatz and William B. Kannel, Prediction of coronary heart disease using risk factor categories. Circulation, 1998. 97(18): p. 1837-1847.
72. Pearson, Karl, X. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 1900. 50(302): p. 157-175.
73. Walker, Strother H. and David B. Duncan, Estimation of the probability of an event as a function of several independent variables. Biometrika, 1967. 54(1-2): p. 167-179.
74. Tibshirani, Robert, Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 1996. 58(1): p. 267-288.
75. Breiman, Leo, Better subset regression using the nonnegative garrote. Technometrics, 1995. 37(4): p. 373-384.
76. Friedman, Jerome, Trevor Hastie and Rob Tibshirani, Regularization paths for generalized linear models via coordinate descent. Journal of statistical software, 2010. 33(1): p. 1.
77. Tetko, Igor V, David J Livingstone and Alexander I Luik, Neural network studies. 1. Comparison of overfitting and overtraining. Journal of chemical information and computer sciences, 1995. 35(5): p. 826-833.
78. McElreath, R., AIC provides a surprisingly simple estimate of the average out-of-sample deviance. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. multimodal actions of huperzine A. Proceedings of the National Academy of Sciences of the United States of America, 2016. 110: p. E746-755.
79. Hinton, Geoffrey E., Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever and Ruslan R. Salakhutdinov, Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.
80. Warde-Farley, David, Ian J. Goodfellow, Aaron Courville and Yoshua Bengio, An empirical analysis of dropout in piecewise linear networks. arXiv preprint arXiv:1312.6197, 2013.
81. Condie, Tyson, Paul Mineiro, Neoklis Polyzotis and Markus Weimer. Machine learning on big data. IEEE, 2013.
82. Murdoch, Travis B. and Allan S. Detsky, The inevitable application of big data to health care. JAMA, 2013. 309(13): p. 1351-1352.
83. Tammemagi, C. Martin, Christine Neslund-Dudas, Michael Simoff and Paul Kvale, In lung cancer patients, age, race-ethnicity, gender and smoking predict adverse comorbidity, which in turn predicts treatment and survival. Journal of clinical epidemiology, 2004. 57(6): p. 597-609.
84. Czejdo, Bogdan Denny and Mikolaj Baszun, Remote patient monitoring system and a medical social network. International Journal of Social and Humanistic Computing, 2010. 1(3): p. 273-281.
85. Jacquemet, Guillaume, Habib Baghirov, Maria Georgiadou, Harri Sihto, Emilia Peuhu, Pierre Cettour-Janet, Tao He, Merja Perälä, Pauliina Kronqvist and Heikki Joensuu, L-type calcium channels regulate filopodia stability and cancer cell invasion downstream of integrin signalling. Nature communications, 2016. 7: p. 13297. |
Full-text access rights |