§ Browse Thesis Bibliographic Record
  
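System ID U0002-2701202112495000
DOI 10.6846/TKU.2021.00732
Title (Chinese) 透過數據分析建立疾病風險預測模式
Title (English) Establishment of Disease Risk Prediction Models Based on Data Analysis
Title (Third Language)
Institution Tamkang University (淡江大學)
Department (Chinese) 資訊工程學系博士班 (Doctoral Program, Department of Computer Science and Information Engineering)
Department (English) Department of Computer Science and Information Engineering
Foreign Degree University
Foreign Degree College
Foreign Degree Graduate Institute
Academic Year 109 (ROC calendar)
Semester 1
Publication Year 110 (ROC calendar; 2021)
Author (Chinese) 李修安
Author (English) Hsiu-An Lee
Student ID 805410031
Degree Doctoral
Language English
Second Language
Oral Defense Date 2021-01-13
Pages 87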
Oral Defense Committee Advisor - 趙榮耀 (chaory@gmail.com)
Member - 徐建業 (cyhsu@ntunhs.edu.tw)
Member - 郭博昭
Member - 趙榮耀 (chaory@gmail.com)
Member - 陳建彰 (ccchen34@mail.tku.edu.tw)
Member - 郭經華 (chkuo@mail.tku.edu.tw)
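Keywords (Chinese) 預測模型 (Prediction Model)
數據分析 (Data Analysis)
疾病預防 (Disease Prevention)
精準健康管理 (Precision Health Management)
Keywords (English) Prediction Model
Data Analysis
Disease Prevention
Precision Health Management
Keywords (Third Language)
Subject Classification
Abstract (Chinese)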
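With the continuous development of medical technology, average human life expectancy has increased year by year, yet disease remains the leading cause of human death. Malignant tumors have consistently ranked first among the ten leading causes of death in Taiwan over the past decade, and lung cancer ranks first among malignant tumors. Early diagnosis of cancer is very important: once cancer is diagnosed early, it can usually be cured through surgery and adjuvant therapy. Many different explanations have been proposed for the causes of cancer, including comorbidities, genes, and dietary and lifestyle habits. Over recent decades, physicians and scientists have continued to search for the causes of disease, but there is not yet conclusive evidence that the onset of latent disease can be identified precisely. Disease prevention and early cancer diagnosis are therefore becoming increasingly important. Supported by current scientific evidence, data analysis can be used to identify relationships between different diseases, so that when certain symptoms appear, cancer can be detected before it progresses and treated immediately to achieve a better prognosis.
This study aims to establish a precise framework for the application and analysis of medical data and, from it, to develop disease prediction models. Based on Taiwan's National Health Insurance database, scientific big-data analysis methods are used to look for potential factors relating different diseases to lung cancer, and these are compared with evidence-based medical research to confirm the correlations between factors. Then, by adopting the least absolute shrinkage and selection operator (LASSO) and deep neural network (DNN) methods, a new data-science-based process for building prediction models is designed.
Finally, this study uses this scientific process to build models for two different cases. The first is a ten-year lung cancer prediction model that uses a deep neural network to calculate the likelihood of developing lung cancer from 13 different diseases and can help potential patients detect lung cancer earlier. The resulting model has an accuracy of 85.4%, a sensitivity of 72.4%, a specificity of 85%, and an area under the ROC curve of 87.4% (95% CI, 0.8604-0.8885). The second is a three-year survival prediction model for lung cancer treatment under different treatment methods; logistic regression and artificial neural networks were used to build a treatment survival prediction model based on five factors. The best model in our study was the artificial neural network model, with an accuracy of 82.7%, a sensitivity of 77.6%, a specificity of 76.8%, and an AUROC of 81%. Both proposed models are more accurate than previous models. The first model develops a high-accuracy disease prediction model on the basis of scientific data analysis. The second model can serve as a decision reference for choosing among different therapies, and the study also found that regular use of medication for hypertension may be a protective factor in lung cancer treatment.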
Abstract (English)
With the continuous development of medical technology, the average human lifespan has been increasing year by year. However, disease is still the main cause of human death, and cancer has been the leading cause of death in Taiwan in recent decades. Cancer is usually curable by surgery and adjunctive therapy when diagnosed at an early stage. Early-stage cancer can usually be operated on, but elderly patients may recover slowly from treatment: being in bed for a few weeks affects their general condition and can prevent them from recovering fully. Treating the elderly therefore requires weighing the pros and cons of treatment and striking a balance between over-treatment and under-treatment. For these reasons, early diagnosis and disease prevention are becoming more and more important. The relationships between different diseases can be identified through medical data analysis, so that when certain symptoms appear, cancer can be found before it is advanced and treated immediately, which leads to a better prognosis.
This study aims to establish an architecture for medical data analysis and to design a disease prediction model. Based on the National Health Insurance Research Database (NHIRD), we attempt to find potential correlates of disease and compare them with evidence-based medical research in order to confirm the correlations between factors. Finally, by employing the Least Absolute Shrinkage and Selection Operator (LASSO) and deep neural network (DNN) methods, we design a new approach to building prediction models.
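As a rough illustration of the factor-selection idea behind LASSO, the following sketch fits an L1-penalised linear model by cyclic coordinate descent on synthetic data. It is not the thesis's code: the toy data, the function names (`soft_threshold`, `lasso_coordinate_descent`, `make_toy_data`), the penalty value `lam=0.3`, and the use of a squared-error loss instead of a classification loss are all assumptions made for illustration.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator: shrink rho toward zero by lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta


def make_toy_data(n=200, seed=0):
    """Hypothetical data: five candidate factors, only factors 0 and 2 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(5)]
        X.append(row)
        y.append(2.0 * row[0] - 1.5 * row[2] + rng.gauss(0, 0.1))
    return X, y


X, y = make_toy_data()
beta = lasso_coordinate_descent(X, y, lam=0.3)
# Factors whose coefficient was shrunk exactly to zero are dropped.
selected = [j for j, b in enumerate(beta) if abs(b) > 1e-8]
print("coefficients:", [round(b, 3) for b in beta])
print("selected factors:", selected)
```

The L1 penalty sets weak coefficients exactly to zero, which is why LASSO can serve as a factor-selection step before a downstream model such as a neural network is trained on the surviving factors.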
Two models are established in this study using different methods. The first is a prediction model for lung cancer. A deep neural network was created to calculate the probability of lung cancer from a patient's previously diagnosed diseases, enabling earlier detection of lung cancer in potential patients. Based on only 13 factors, the model achieves an accuracy of 85.4%, a sensitivity of 72.4%, a specificity of 85%, and an area under the ROC curve (AUROC) of 87.4% (95% CI, 0.8604-0.8885). The second is a prediction model for the survival rate of lung cancer patients under different treatments. Based on only 5 factors, the model achieves an accuracy of 82.7%, a sensitivity of 77.6%, a specificity of 76.8%, and an AUROC of 81%.
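The reported measures can be related to a confusion matrix and to pairwise score ranking. The sketch below is an illustrative calculation on made-up labels and scores, not the thesis's data or code; the function names (`confusion_counts`, `auroc`) and the 0.5 decision threshold are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for binary labels where 1 means disease present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def auroc(y_true, scores):
    """AUROC as the chance that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities for ten subjects.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]
threshold = 0.5  # assumed cut-off for calling a subject positive
y_pred = [1 if s >= threshold else 0 for s in scores]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fn + fp + tn)   # share of all subjects classified correctly
sensitivity = tp / (tp + fn)                 # share of true cases detected
specificity = tn / (tn + fp)                 # share of non-cases correctly cleared
auc = auroc(y_true, scores)                  # threshold-free ranking quality

print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} "
      f"specificity={specificity:.3f} AUROC={auc:.3f}")
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUROC summarises ranking quality across all thresholds, which is one reason the four figures are usually reported together.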
Both models perform better than those of previous studies. The first model applies scientific data analysis to develop a highly accurate lung cancer prediction model. The second model can be used as a reference when deciding among treatment options. In addition, this study found that lung cancer patients with hypertension tend to have a lower death rate.
Abstract (Third Language)
Table of Contents
Chinese Abstract
Abstract
List of Figures
List of Tables
Chapter I. Introduction
1.1. Background
1.2. Motivation
1.3. Research Purpose
1.4. The Frame of the Dissertation
Chapter II. Literature Review
2.1. Availability and Application of Health Data
2.2. Analysis of Health Data
2.3. Comorbidity and Clinical Evidence of Lung Cancer
2.4. Development and Results of Prediction Models
Chapter III. Methodology and Proposed Approach
3.1. Data Processing and Integration
A. Data Source
B. Data Pre-Processing
C. Heterogeneous Data Integration
D. NHIRD Data Description
3.2. Evidence of Factors Association
A. Propensity Score Match
B. Literature Review of Evidence-Based Medicine
C. Chi-Square Test
3.3. Establishment and Adjustment of the Prediction Model
A. Artificial Neural Network
B. Binary Logistic Regression
3.4. New Approach for Prediction Model Establishment
A. Least Absolute Shrinkage and Selection Operator (LASSO)
B. Deep Neural Network
3.5. Validation and Evaluation
A. Confusion Metrics
B. Receiver Operating Characteristic (ROC) Curve
Chapter IV. Case Studies and Results
4.1. Ten-year prediction model for lung cancer
A. Data Resources, Processing, and Demography
B. Factor Selection
C. Model Establishment and Evaluation
4.2. Three-year survival prediction model for lung cancer treatment
A. Data Resources, Processing, and Demography
B. Factor Selection
C. Model Establishment and Evaluation
D. Web-based system of lung cancer treatment survival prediction
Chapter V. Conclusion and Discussion
Reference

 
List of Figures
Figure 1 Per Capita Expenditures in US $ by Disease Category, 2000 - 2013 [7]
Figure 2 The Frame of the Dissertation
Figure 3 The Working Flowchart of Data Conversion Procedure
Figure 4 R Procedure Code
Figure 5 The Workflow Stages of Data Integration
Figure 6 Relational Data Diagram
Figure 7 Evidence of Factors Finding Workflow
Figure 8 Common Prediction Model Establishment Process
Figure 9 ANN Workflow Diagram
Figure 10 A Simply Binary Logistic Regression Curve
Figure 11 Research Process
Figure 12 Residual Squared Sum Geometry in Quadratic Function
Figure 13 Residual Squared Sum Geometry with L1 Normalization in Quadratic Function
Figure 14 DNN Model Diagram
Figure 15 Neuron Training Flow Chart
Figure 16 The Process Flow of Building a DNN Model
Figure 17 Different Performance ROC Comparison Chart
Figure 18 Data Processing Example
Figure 19 DNN model structure
Figure 20 ROC Plot of the Model
Figure 21 The Procedure of Extracting Data from NHIRD
Figure 22 The Procedures of Dividing Samples into Two Groups
Figure 23 Comparison of the ROC Curves of the ANN and LR Models
Figure 24 Display of Survival Probability with Different Factors of Lung Cancer Therapy

 
List of Tables
Table 1 NHIRD Basic Data Sheet
Table 2 Confusion Matrix Table
Table 3 Subject Demographics
Table 4 Clinical characteristics
Table 5 The Coefficient and P-value of Each Factor
Table 6 Testing Result of Different DNN Structure
Table 7 Taiwan’s NHIRD Code Mapping TNM Table
Table 8 Subject Demographics
Table 9 Different Therapy Distribution Table
Table 10 χ² (Chi-square) Analysis for Correlation Between Death and Different Factors
References
1.	World Health Organization, The top 10 causes of death. 2018; Available from: https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death.
2.	Taiwan Ministry of Health and Welfare, 2017 cause of death statistics analysis. 2017.
3.	Karihtala, Peeter and Ulla Puistola, Syöpä iäkkäällä naisella. Duodecim, 2015. 131: p. 1507-1512.
4.	Ng, O., E. Watts, C. A. Bull, R. Morris, A. Acheson and A. Banerjea, Colorectal cancer outcomes in patients aged over 85 years. The Annals of The Royal College of Surgeons of England, 2016. 98(03): p. 216-221.
5.	Hennequin, C., S. Guillerm and L. Quero, Radiotherapy in elderly patients, recommendations for the main localizations: Breast, prostate and gynaecological cancers. Cancer radiotherapie: journal de la Societe francaise de radiotherapie oncologique, 2015. 19(6-7): p. 397-403.
6.	Naeim, Arash, Matti Aapro, Rashmi Subbarao and Lodovico Balducci, Supportive care considerations for older adults with cancer. Journal of Clinical Oncology, 2014. 32(24): p. 2627-2634.
7.	Peterson-Kaiser Health System Tracker, Kaiser Family Foundation analysis of the Bureau of Economic. 2016.
8.	R Core Team, R: A language and environment for statistical computing. 2013.
9.	Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss and Vincent Dubourg, Scikit-learn: Machine learning in Python. Journal of machine learning research, 2011. 12(Oct): p. 2825-2830.
10.	Van Rossum, Guido and Fred L. Drake Jr, Python reference manual. 1995: Centrum voor Wiskunde en Informatica Amsterdam.
11.	Grose, Derek and Robert Milroy, Chronic obstructive pulmonary disease a complex comorbidity. Journal of Comorbidity, 2011. 1: p. 45-50.
12.	Young, Robert P., Raewyn J. Hopkins, Timothy Christmas, Peter N. Black, P. Metcalf and G. D. Gamble, COPD prevalence is increased in lung cancer, independent of age, sex and smoking history. European Respiratory Journal, 2009. 34(2): p. 380-386.
13.	Yu, Yang-Hao, Chien-Chang Liao, Wu-Huei Hsu, Hung-Jen Chen, Wei-Chih Liao, Chih-Hsin Muo, Fung-Chang Sung and Chih-Yi Chen, Increased lung cancer risk among patients with pulmonary tuberculosis: a population cohort study. Journal of Thoracic Oncology, 2011. 6(1): p. 32-37.
14.	Wang, Sunny, Melisa L. Wong, Nathan Hamilton, J. Ben Davoren, Thierry M. Jahan and Louise C. Walter, Impact of age and comorbidity on non-small-cell lung cancer treatment in older veterans. Journal of clinical oncology, 2012. 30(13): p. 1447-1455.
15.	Iqbal, Usman, Phung-Anh Nguyen, Shabbir Syed-Abdul, Hsuan-Chia Yang, Chih-Wei Huang, Wen-Shan Jian, Min-Huei Hsu, Yun Yen and Yu-Chuan Jack Li, Is long-term use of benzodiazepine a risk for cancer? Medicine, 2015. 94(6).
16.	Chen, Yu-Chun, Hsiao-Yun Yeh, Jau-Ching Wu, Ingo Haschler, Tzeng-Ji Chen and Thomas Wetter, Taiwan’s National Health Insurance Research Database: administrative health care database as study object in bibliometrics. Scientometrics, 2011. 86(2): p. 365-380.
17.	D'Amelio Jr, AM, A Cassidy, K Asomaning, OY Raji, SW Duffy, JK Field, MR Spitz, D Christiani and Carol J Etzel, Comparison of discriminatory power and accuracy of three lung cancer risk models. British journal of cancer, 2010. 103(3): p. 423.
18.	Etzel, Carol J, Sumesh Kachroo, Mei Liu, Anthony D'Amelio, Qiong Dong, Michele L Cote, Angela S Wenzlaff, Waun Ki Hong, Anthony J Greisinger and Ann G Schwartz, Development and validation of a lung cancer risk prediction model for African-Americans. Cancer Prevention Research, 2008. 1(4): p. 255-265.
19.	Field, John K, Olaide Y Raji and Stephen W Duffy, Predictive Accuracy of the Liverpool Lung Project Risk Model. Annals of internal medicine, 2013. 158(7): p. 568-569.
20.	Bach, Peter B., Michael W. Kattan, Mark D. Thornquist, Mark G. Kris, Ramsey C. Tate, Matt J. Barnett, Lillian J. Hsieh and Colin B. Begg, Variations in lung cancer risk among smokers. Journal of the National Cancer Institute, 2003. 95(6): p. 470-478.
21.	Cassidy, Adrian, Jonathan P. Myles, Martie van Tongeren, R. D. Page, T. Liloglou, S. W. Duffy and J. K. Field, The LLP risk model: an individual risk prediction model for lung cancer. British journal of cancer, 2008. 98(2): p. 270.
22.	Tammemägi, Martin C., Hormuzd A. Katki, William G. Hocking, Timothy R. Church, Neil Caporaso, Paul A. Kvale, Anil K. Chaturvedi, Gerard A. Silvestri, Tom L. Riley and John Commins, Selection criteria for lung-cancer screening. New England Journal of Medicine, 2013. 368(8): p. 728-736.
23.	Spitz, Margaret R., Waun Ki Hong, Christopher I. Amos, Xifeng Wu, Matthew B. Schabath, Qiong Dong, Sanjay Shete and Carol J. Etzel, A risk model for prediction of lung cancer. Journal of the National Cancer Institute, 2007. 99(9): p. 715-726.
24.	World Health Organization, WHO eHealth Resolution. 2005; Available from: https://www.who.int/healthacademy/news/en/.
25.	Gunter, Tracy D and Nicolas P Terry, The emergence of national electronic health record architectures in the United States and Australia: models, costs, and questions. Journal of medical Internet research, 2005. 7(1): p. e3.
26.	Dinov, Ivo D, Methodological challenges and analytic opportunities for modeling and interpreting Big Healthcare Data. Gigascience, 2016. 5(1): p. 12.
27.	Rumsfeld, John S, Karen E Joynt and Thomas M Maddox, Big data analytics to improve cardiovascular care: promise and challenges. Nature Reviews Cardiology, 2016. 13(6): p. 350.
28.	Slobogean, Gerard P, Peter V Giannoudis, Frede Frihagen, Mary L Forte, Saam Morshed and Mohit Bhandari, Bigger data, bigger problems. Journal of orthopaedic trauma, 2015. 29: p. S43-S46.
29.	Scruggs, Sarah B, Karol Watson, Andrew I Su, Henning Hermjakob, John R Yates III, Merry L Lindsey and Peipei Ping, Harnessing the heart of big data. Circulation research, 2015. 116(7): p. 1115-1119.
30.	Wang, Weiqi and Eswar Krishnan, Big data and clinicians: a review on the state of the science. JMIR medical informatics, 2014. 2(1): p. e1.
31.	Bellazzi, Riccardo and Blaz Zupan, Predictive data mining in clinical medicine: current issues and guidelines. International journal of medical informatics, 2008. 77(2): p. 81-97.
32.	Binder, Harald and Maria Blettner, Big Data in Medical Science—a Biostatistical View: Part 21 of a Series on Evaluation of Scientific Publications. Deutsches Ärzteblatt International, 2015. 112(9): p. 137.
33.	Berwick, Donald M, Connected for health: using electronic health records to transform care delivery. 2010: John Wiley & Sons.
34.	Collen, Morris F and Corinne Linden, Screening in a group practice prepaid medical care plan: As applied to periodic health examinations. Journal of Clinical Epidemiology, 1955. 2(4): p. 400-408.
35.	Abacha, Asma Ben and Pierre Zweigenbaum, MEANS: A medical question-answering system combining NLP techniques and semantic Web technologies. Information Processing & Management, 2015. 51(5): p. 570-594.
36.	Ceusters, Werner, Filip Buekens, Georges De Moor and Andra Waagmeester, The distinction between linguistic and conceptual semantics in medical terminology and its implication for NLP-based knowledge acquisition. Methods of information in medicine, 1998. 37(04/05): p. 327-333.
37.	Dobrokhotov, Pavel B, Cyril Goutte, Anne-Lise Veuthey and Eric Gaussier, Combining NLP and probabilistic categorisation for document and term selection for Swiss-Prot medical annotation. Bioinformatics, 2003. 19(suppl_1): p. i91-i94.
38.	Shen, Dinggang, Guorong Wu and Heung-Il Suk, Deep learning in medical image analysis. Annual review of biomedical engineering, 2017. 19: p. 221-248.
39.	Litjens, Geert, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen A. W. M. van der Laak, Bram van Ginneken and Clara I. Sánchez, A survey on deep learning in medical image analysis. Medical image analysis, 2017. 42: p. 60-88.
40.	Milletari, Fausto, Nassir Navab and Seyed-Ahmad Ahmadi. V-net: Fully convolutional neural networks for volumetric medical image segmentation. in 2016 Fourth International Conference on 3D Vision (3DV). 2016. IEEE.
41.	Connolly, Natalia, Julia Anixt, Patty Manning, Daniel Ping‐I Lin, Keith A Marsolo and Katherine Bowers, Maternal metabolic risk factors for autism spectrum disorder—an analysis of electronic medical records and linked birth data. Autism Research, 2016. 9(8): p. 829-837.
42.	Eisenberg, Michael L, Shufeng Li, Mark R Cullen and Laurence C Baker, Increased risk of incident chronic medical conditions in infertile men: analysis of United States claims data. Fertility and Sterility, 2016. 105(3): p. 629-636.
43.	Shah, Tejal, Fethi Rabhi and Pradeep Ray, Investigating an ontology-based approach for Big Data analysis of inter-dependent medical and oral health conditions. Cluster Computing, 2015. 18(1): p. 351-367.
44.	Ettehad, Dena, Connor A. Emdin, Amit Kiran, Simon G. Anderson, Thomas Callender, Jonathan Emberson, John Chalmers, Anthony Rodgers and Kazem Rahimi, Blood pressure lowering for prevention of cardiovascular disease and death: a systematic review and meta-analysis. The Lancet, 2016. 387(10022): p. 957-967.
45.	Walling, Anne M., Jane C. Weeks, Katherine L. Kahn, Diana Tisnado, Nancy L. Keating, Sydney M. Dy, Neeraj K. Arora, Jennifer W. Mack, Philip M. Pantoja and Jennifer L. Malin, Symptom Prevalence in Lung and Colorectal Cancer Patients. Journal of Pain and Symptom Management, 2015. 49(2): p. 192-202.
46.	Yu, Yuan-Bin, Jyh-Pyng Gau, Chun-Yu Liu, Muh-Hwa Yang, Shu-Chiung Chiang, Hui-Chi Hsu, Ying-Chung Hong, Liang-Tsai Hsiao, Jin-Hwang Liu, Tzeon-Jye Chiou, Po-Min Chen, Tzong-Shyuan Lee, Li-Fang Chou, Cheng-Hwai Tzeng and Tzeng-Ji Chen, A nation-wide analysis of venous thromboembolism in 497,180 cancer patients with the development and validation of a risk-stratification scoring system. Thromb Haemost, 2012. 108(1).
47.	Wu, Mei-Yi, Yung-Ho Hsu, Chien-Ling Su, Yuh-Feng Lin and Hui-Wen Lin, Risk of Herpes Zoster in CKD: A Matched-Cohort Study Based on Administrative Data. American journal of kidney diseases, 2012.
48.	Wei, Po-Li, Joseph J. Keller, Hung-Hua Liang and Herng-Ching Lin, Acute appendicitis and adverse pregnancy outcomes: a nationwide population-based study. Journal of Gastrointestinal Surgery, 2012. 16(6): p. 1204-1211.
49.	Valderas, Jose M., Barbara Starfield, Bonnie Sibbald, Chris Salisbury and Martin Roland, Defining comorbidity: implications for understanding health and health services. The Annals of Family Medicine, 2009. 7(4): p. 357-363.
50.	Cassidy, Adrian, Stephen W Duffy, Jonathan P Myles, Triantafillos Liloglou and John K Field, Lung cancer risk prediction: a tool for early detection. International journal of cancer, 2007. 120(1): p. 1-6.
51.	Tammemagi, C Martin, Christine Neslund-Dudas, Michael Simoff and Paul Kvale, Smoking and lung cancer survival: the role of comorbidity and treatment. Chest, 2004. 125(1): p. 27-37.
52.	Tammemagi, C. Martin, Christine Neslund‐Dudas, Michael Simoff and Paul Kvale, Impact of comorbidity on lung cancer survival. International journal of cancer, 2003. 103(6): p. 792-802.
53.	Janssen-Heijnen, Maryska L. G., S. Smulders, V. E. P. P. Lemmens, Frank W. J. M. Smeenk, H. J. A. A. van Geffen and J. W. W. Coebergh, Effect of comorbidity on the treatment and prognosis of elderly patients with non-small cell lung cancer. Thorax, 2004. 59(7): p. 602-607.
54.	Jeremic, Branislav, Impact of comorbidity on survival after surgical resection in patients with stage I non-small cell lung cancer. The Journal of thoracic and cardiovascular surgery, 2003. 125(2): p. 444-445.
55.	Jian, Zhi-Hong, Jing-Yang Huang, Pei-Chieh Ko, Shiou-Rung Jan, Oswald Ndi Nfor, Chia-Chi Lung, Wen-Yuan Ku, Chien-Chang Ho, Hui-Hsien Pan and Yung-Po Liaw, Impact of coexisting pulmonary diseases on survival of patients with lung adenocarcinoma: a STROBE-compliant article. Medicine, 2015. 94(4).
56.	Crabtree, Traves D., Chadrick E. Denlinger, Bryan F. Meyers, Issam El Naqa, Jennifer Zoole, A. Sasha Krupnick, Daniel Kreisel, G. Alexander Patterson and Jeffrey D. Bradley, Stereotactic body radiation therapy versus surgical resection for stage I non–small cell lung cancer. The Journal of thoracic and cardiovascular surgery, 2010. 140(2): p. 377-386.
57.	Donzé, Jacques, Drahomir Aujesky, Deborah Williams and Jeffrey L. Schnipper, Potentially avoidable 30-day hospital readmissions in medical patients: derivation and validation of a prediction model. JAMA internal medicine, 2013. 173(8): p. 632-638.
58.	López-Martínez, Fernando, Aron Schwarcz, Edward Rolando Núñez-Valdez and Vicente García-Díaz, Machine learning classification analysis for a hypertensive population as a function of several risk factors. Expert Systems with Applications, 2018. 110: p. 206-215.
59.	Honda, Takanori, Daigo Yoshida, Jun Hata, Yoichiro Hirakawa, Yuki Ishida, Mao Shibata, Satoko Sakata, Takanari Kitazono and Toshiharu Ninomiya, Development and validation of modified risk prediction models for cardiovascular disease and its subtypes: The Hisayama Study. Atherosclerosis, 2018. 279: p. 38-44.
60.	Hoggart, Clive, Paul Brennan, Anne Tjonneland, Ulla Vogel, Kim Overvad, Jane Nautrup Østergaard, Rudolf Kaaks, Federico Canzian, Heiner Boeing and Annika Steffen, A risk model for lung cancer incidence. Cancer Prevention Research, 2012. 5(6): p. 834-846.
61.	Chen, Yu-Chun, Jau-Ching Wu, Tzeng-Ji Chen and Thomas Wetter, A publicly available database accelerates academic production. Bmj, 2011. 342: p. d637.
62.	Hsing, Ann W. and John PA Ioannidis, Nationwide Population Science: Lessons From the Taiwan National Health Insurance Research Database. JAMA internal medicine, 2015. 175(9): p. 1527-1529.
63.	Yang, Shun-Fa, Yu-Hsun Wang, Ni-Yu Su, Hui-Chieh Yu, Chia-Yi Wei, Chuan-Hang Yu and Yu-Chao Chang, Changes in prevalence of precancerous oral submucous fibrosis from 1996 to 2013 in Taiwan: A nationwide population-based retrospective study. Journal of the Formosan Medical Association, 2018. 117(2): p. 147-152.
64.	Wang, Tung-Yuan, Yu-Wei Chiu, Yi-Tzu Chen, Yu-Hsun Wang, Hui-Chieh Yu, Chuan-Hang Yu and Yu-Chao Chang, Malignant transformation of Taiwanese patients with oral leukoplakia: A nationwide population-based retrospective cohort study. Journal of the Formosan Medical Association, 2018. 117(5): p. 374-380.
65.	World Health Organization, ICD-O: International classification of diseases for oncology, in ICD-O: International classification of diseases for oncology. 1976.
66.	Quan, Hude, Vijaya Sundararajan, Patricia Halfon, Andrew Fong, Bernard Burnand, Jean-Christophe Luthi, L Duncan Saunders, Cynthia A Beck, Thomas E Feasby and William A Ghali, Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Medical care, 2005: p. 1130-1139.
67.	Dolin, Robert H, Liora Alschuler, Calvin Beebe, Paul V Biron, Sandra Lee Boyer, Daniel Essin, Elliot Kimber, Tom Lincoln and John E Mattison, The HL7 clinical document architecture. Journal of the American Medical Informatics Association, 2001. 8(6): p. 552-569.
68.	Wacholder, Sholom, Joseph K. McLaughlin, Debra T. Silverman and Jack S. Mandel, Selection of controls in case-control studies: I. Principles. American journal of epidemiology, 1992. 135(9): p. 1019-1028.
69.	Rosenbaum, Paul R. and Donald B. Rubin, The central role of the propensity score in observational studies for causal effects. Biometrika, 1983. 70(1): p. 41-55.
70.	Eddy, David M., Practice policies: where do they come from? Jama, 1990. 263(9): p. 1265-1275.
71.	Wilson, Peter W. F., Ralph B. D’Agostino, Daniel Levy, Albert M. Belanger, Halit Silbershatz and William B. Kannel, Prediction of coronary heart disease using risk factor categories. Circulation, 1998. 97(18): p. 1837-1847.
72.	Pearson, Karl, X. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 1900. 50(302): p. 157-175.
73.	Walker, Strother H. and David B. Duncan, Estimation of the probability of an event as a function of several independent variables. Biometrika, 1967. 54(1-2): p. 167-179.
74.	Tibshirani, Robert, Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 1996. 58(1): p. 267-288.
75.	Breiman, Leo, Better subset regression using the nonnegative garrote. Technometrics, 1995. 37(4): p. 373-384.
76.	Friedman, Jerome, Trevor Hastie and Rob Tibshirani, Regularization paths for generalized linear models via coordinate descent. Journal of statistical software, 2010. 33(1): p. 1.
77.	Tetko, Igor V, David J Livingstone and Alexander I Luik, Neural network studies. 1. Comparison of overfitting and overtraining. Journal of chemical information and computer sciences, 1995. 35(5): p. 826-833.
78.	McElreath, R., AIC provides a surprisingly simple estimate of the average out-of-sample deviance, in Statistical Rethinking: A Bayesian Course with Examples in R and Stan. 2016.
79.	Hinton, Geoffrey E., Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever and Ruslan R. Salakhutdinov, Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.
80.	Warde-Farley, David, Ian J. Goodfellow, Aaron Courville and Yoshua Bengio, An empirical analysis of dropout in piecewise linear networks. arXiv preprint arXiv:1312.6197, 2013.
81.	Condie, Tyson, Paul Mineiro, Neoklis Polyzotis and Markus Weimer. Machine learning on big data. IEEE, 2013.
82.	Murdoch, Travis B. and Allan S. Detsky, The inevitable application of big data to health care. Jama, 2013. 309(13): p. 1351-1352.
83.	Tammemagi, C. Martin, Christine Neslund-Dudas, Michael Simoff and Paul Kvale, In lung cancer patients, age, race-ethnicity, gender and smoking predict adverse comorbidity, which in turn predicts treatment and survival. Journal of clinical epidemiology, 2004. 57(6): p. 597-609.
84.	Czejdo, Bogdan Denny and Mikolaj Baszun, Remote patient monitoring system and a medical social network. International Journal of Social and Humanistic Computing, 2010. 1(3): p. 273-281.
85.	Jacquemet, Guillaume, Habib Baghirov, Maria Georgiadou, Harri Sihto, Emilia Peuhu, Pierre Cettour-Janet, Tao He, Merja Perälä, Pauliina Kronqvist and Heikki Joensuu, L-type calcium channels regulate filopodia stability and cancer cell invasion downstream of integrin signalling. Nature communications, 2016. 7: p. 13297.
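Full-Text Access Rights
On campus
Print thesis available on campus immediately
Author consents to release of the electronic full text on campus
Electronic thesis available on campus immediately
Off campus
Authorization granted
Electronic thesis available off campus immediately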
