Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

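The electronic full text of this thesis is open to off-campus access from 2022-09-17.
The print copy of this thesis is open to the public from 2022-09-17.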

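System ID	U0002-0409201911374000
DOI	10.6846/TKU.2019.00114
Title (Chinese)	以條件式循環生成對抗網路為架構之多種風格圖像轉換系統
Title (English)	Multiple Style Image Transfer System Based on Conditional CycleGAN
Title (third language)
Institution	Tamkang University
Department (Chinese)	資訊工程學系碩士班
Department (English)	Department of Computer Science and Information Engineering
Foreign degree-granting university
Foreign degree-granting college
Foreign degree-granting graduate institute
Academic year	107 (ROC calendar; 2018-2019)
Semester	2
Publication year	108 (ROC calendar; 2019)
Author (Chinese)	呂品慧
Author (English)	Ping-Hui Lu
Student ID	606410461
Degree	Master's
Language	Traditional Chinese
Second language
Oral defense date	2019-07-16
Number of pages	76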
Oral defense committee	Advisor: 林慧珍 (086204@mail.tku.edu.tw); Members: 廖弘源, 蔡憶佳
Keywords (Chinese)	卷積類神經網路; 深度學習; 生成對抗網路; 條件式生成對抗網路; 循環式生成對抗網路; PatchGAN; 影像風格轉換
Keywords (English)	convolutional networks; deep learning; Generative Adversarial Nets (GAN); Conditional GAN; CycleGAN; PatchGAN; image style transfer
Keywords (third language)
Subject classification
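Abstract (translated from the Chinese)	This paper proposes a multi-style image transfer system. The main idea is to combine Conditional GAN [1] and CycleGAN [2] into a Conditional CycleGAN that converts an image from one style into a realistic image in another, specified style. Besides the image itself, the system takes a label specifying the target style as a condition, so it covers more image domains than CycleGAN, which translates between only two image domains. The goal is a more flexible image transfer system. The concept behind this architecture can be extended to other transfer applications, such as the synthesis of facial expressions or of various other features.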
Abstract (English)	This paper proposes a multi-style image transfer system. The main idea is to combine Conditional GAN [1] and CycleGAN [2] into a Conditional CycleGAN that transfers an image from one style to another. In addition to the image, the system takes a label specifying the target style as a condition, and thus offers more styles to choose from than CycleGAN. We aim to develop a more flexible image style transfer system in which the target style can be selected from several styles. The concept of this architecture can be extended to other transfer applications, such as facial expression transfer, face aging, and the synthesis of various features.
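The conditioning idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the style label is a one-hot vector tiled into extra image channels (the table of contents lists one-hot vectors as one of two label representations studied), assumes the L1 cycle-consistency loss of the original CycleGAN paper, and uses a hypothetical `toy_generator` in place of a trained network. The names `STYLES`, `attach_style_label`, and `cycle_consistency_loss` are illustrative only.

```python
import numpy as np

# The four trained styles named in the table of contents (identifiers are assumed).
STYLES = ["photo", "vangogh", "monet", "ukiyoe"]


def attach_style_label(image, target_style):
    """Append one constant channel per style; the target style's channel is 1, the rest 0."""
    height, width, _ = image.shape
    one_hot = np.zeros(len(STYLES), dtype=image.dtype)
    one_hot[STYLES.index(target_style)] = 1
    # Tile the one-hot vector over every pixel position.
    label_planes = np.broadcast_to(one_hot, (height, width, len(STYLES)))
    return np.concatenate([image, label_planes], axis=-1)


def cycle_consistency_loss(original, reconstructed):
    """Mean absolute (L1) error between an image and its round-trip reconstruction."""
    return float(np.mean(np.abs(original - reconstructed)))


def toy_generator(conditioned):
    """Hypothetical stand-in for a trained generator: returns the RGB channels unchanged."""
    return conditioned[..., :3]


# Round trip: photo -> Monet style -> back to photo style.
photo = np.random.default_rng(0).random((8, 8, 3)).astype(np.float32)
to_monet = toy_generator(attach_style_label(photo, "monet"))
back_to_photo = toy_generator(attach_style_label(to_monet, "photo"))
loss = cycle_consistency_loss(photo, back_to_photo)
```

Because one generator receives the target style as part of its input, a single network can in principle serve every pair of styles, whereas an unconditional CycleGAN needs a separate generator pair for each pair of domains.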
Abstract (third language)
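Table of contents
Contents
List of Figures
Chapter 1. Research Background and Purpose
Chapter 2. Related Work
  2.1 Generative Adversarial Nets (GAN)
  2.2 Conditional GAN (CGAN)
  2.3 CycleGAN
  2.4 PatchGAN
Chapter 3. Method and Procedure
  3.1 Basic Network Architecture
  3.2 Style Feature Maps
  3.3 System Architecture
  3.4 Loss Functions
  3.5 Training Details
  3.6 System Training Procedure
Chapter 4. Experimental Results and Analysis
  4.1 Input Label Representation
  4.2 Initial Weight Settings
  4.3 Ratio of Training Images for the Discriminator and the Generator
  4.4 Style Transfer Results
    4.4.1 Photos to the three painting styles
    4.4.2 Painting styles to the photo style
    4.4.3 Transfer among the three painting styles
    4.4.4 Untrained styles to the four trained styles
Chapter 5. Conclusions and Future Work
References
Appendix: English Paper

List of figures
Figure 1. GAN architecture
Figure 2. CGAN discriminator training architecture
Figure 3. CGAN generator training architecture
Figure 4. CycleGAN generator training architecture
Figure 5. CycleGAN discriminator training architecture: (a) discriminator DS; (b) discriminator DT
Figure 6. PatchGAN receptive field
Figure 7. CCGAN system architecture
Figure 8. Components of the Conditional CycleGAN architecture: (a) simplified diagram of Conditional CycleGAN; (b) forward-process architecture; (c) simplified forward process; (d) backward-process architecture; (e) simplified backward process; (f) forward cycle consistency; (g) backward cycle consistency
Figure 9. AE system architecture
Figure 10. Procedure for obtaining style feature maps
Figure 11. Input to the CCGAN generator
Figure 12. CCGAN generator architecture
Figure 13. CCGAN discriminator architecture
Figure 14. Forward procedure of CCGAN network training
Figure 15. Sample examples: (a) photos; (b) Van Gogh paintings; (c) Monet paintings; (d) ukiyo-e paintings
Figure 16. "Photo → Van Gogh" results with different input label representations: (a) original; (b) one-hot vector; (c) style feature map
Figure 17. "Photo → Van Gogh" results after training with different initial weights: (a) original; (b) random weights; (c) average of the weights obtained by training each pair of classes separately
Figure 18. "Photo → Monet" results after training with different initial weights: (a) original; (b) random weights; (c) average of the weights obtained by training each pair of classes separately
Figure 19. "Photo → Van Gogh" results with different ratios of discriminator to generator training images per round: (a) original; (b) ratio 1:1; (c) ratio 2:1
Figure 20. "Photo → Monet" results with different ratios of discriminator to generator training images per round: (a) original; (b) ratio 1:1; (c) ratio 2:1
Figure 21. "Photo → Ukiyo-e" results with different ratios of discriminator to generator training images per round: (a) original; (b) ratio 1:1; (c) ratio 2:1
Figure 22. "Photo → Van Gogh" results: (a), (c) originals; (b), (d) results
Figure 23. "Photo → Monet" results: (a), (c) originals; (b), (d) results
Figure 24. "Photo → Ukiyo-e" results: (a), (c) originals; (b), (d) results
Figure 25. Photos converted to other styles: (a) original; (b) Van Gogh style; (c) Monet style; (d) ukiyo-e style
Figure 26. "Van Gogh → Photo" results: (a), (c) originals; (b), (d) results
Figure 27. "Monet → Photo" results: (a), (c) originals; (b), (d) results
Figure 28. "Ukiyo-e → Photo" results: (a), (c) originals; (b), (d) results
Figure 29. "Van Gogh → Monet" results: (a), (c) originals; (b), (d) results
Figure 30. "Van Gogh → Ukiyo-e" results: (a), (c) originals; (b), (d) results
Figure 31. "Monet → Van Gogh" results: (a), (c) originals; (b), (d) results
Figure 32. "Monet → Ukiyo-e" results: (a), (c) originals; (b), (d) results
Figure 33. "Ukiyo-e → Van Gogh" results: (a), (c) originals; (b), (d) results
Figure 34. "Ukiyo-e → Monet" results: (a), (c) originals; (b), (d) results
Figure 35. Style transfer results for a Cézanne painting: (a) original; (b) Van Gogh style; (c) Monet style; (d) ukiyo-e style; (e) photo style
Figure 36. Style transfer results for an ink wash painting: (a) original; (b) Van Gogh style; (c) Monet style; (d) ukiyo-e style; (e) photo style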
References
[1] M. Mirza and S. Osindero, “Conditional Generative Adversarial Nets,” [cs.LG], 2014.
[2] J. Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks,” arXiv:1703.10593 [cs.CV], 2018.
[3] F.-W. Yang, H. J. Lin, S.-H. Yen, and C.-H. Wang, “A study on the convolutional neural algorithm of image style transfer,” International Journal of Pattern Recognition and Artificial Intelligence, Vol. 33, No. 05, pp. 1954020-1 - 1954020-19, 2018. DOI: 10.1142/S021800141954020X.
[4] J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, and F. F. Li, “ImageNet: A Large-Scale Hierarchical Image Database,” 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248-255, 2009. DOI: 10.1109/CVPR.2009.5206848.
[5] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv:1409.1556 [cs.CV], 2015.
[6] G. E. Hinton and R. R. Salakhutdinov, “Reducing the Dimensionality of Data with Neural Networks,” Science, Vol. 313, Issue 5786, 28 July 2006.
[7] D. P. Kingma and M. Welling, “Auto-Encoding Variational Bayes,” arXiv:1312.6114v10 [stat.ML], 2014.
[8] K. Sohn, X. Yan, and H. Lee, “Learning Structured Output Representation using Deep Conditional Generative Models,” Advances in Neural Information Processing Systems (NIPS 2015), pp. 1-9, 2015.
[9] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative Adversarial Networks,” arXiv:1406.2661v1 [stat.ML], 2014.
[10] C. Ledig, L. Theis, F. Huszar, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi, “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network,” arXiv:1609.04802v5 [cs.CV], 2017.
[11] C. Vondrick, H. Pirsiavash, and A. Torralba, “Generating Videos with Scene Dynamics,” arXiv:1609.02612v3 [cs.CV], 2016.
[12] Y. Zhang, Z. Gan, and L. Carin, “Generating Text via Adversarial Training,” Advances in Neural Information Processing Systems (NIPS 2015), pp. 1-6, 2016.
[13] L. Yu, W. Zhang, J. Wang, and Y. Yu, “SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient,” arXiv:1609.05473v6 [cs.LG], 2017.
[14] X. Yi, E. Walia, and P. Babyn, “Generative Adversarial Network in Medical Imaging: A Review,” arXiv:1809.07294v3 [cs.CV], 2019.
[15] S. Kazeminia, C. Baur, A. Kuijper, B. V. Ginneken, N. Navab, S. Albarqouni, and A. Mukhopadhyay, “GANs for Medical Image Analysis,” arXiv:1809.06222v2 [cs.CV], 2018.
[16] M. Arjovsky and L. Bottou, “Towards Principled Methods for Training Generative Adversarial Networks,” arXiv:1701.04862v1 [stat.ML], 2017.
[17] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN,” arXiv:1701.07875v3 [stat.ML], 2017.
[18] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, “Improved Training of Wasserstein GANs,” arXiv:1704.00028v3 [cs.LG], 2017.
[19] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, “Spectral Normalization for Generative Adversarial Networks,” arXiv:1802.05957v1 [cs.LG], 2018.
[20] J. Zhao, M. Mathieu, and Y. LeCun, “Energy-based Generative Adversarial Network,” arXiv:1609.03126v4 [cs.LG], 2017.
[21] H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena, “Self-Attention Generative Adversarial Networks,” arXiv:1805.08318 [stat.ML], 2018.
[22] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium,” arXiv:1706.08500v6 [cs.LG], 2018.
[23] N. Kodali, J. Abernethy, J. Hays, and Z. Kira, “On Convergence and Stability of GANs,” arXiv:1705.07215 [cs.AI], 2017.
[24] X. Wei, B. Gong, Z. Liu, W. Lu, and L. Wang, “Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect,” arXiv:1803.01541v1 [cs.CV], 2018.
[25] A. Odena, C. Olah, and J. Shlens, “Conditional Image Synthesis with Auxiliary Classifier GANs,” arXiv:1610.09585v4 [stat.ML], 2017.
[26] H. Zhang, T. Xu, H. Li, S. Zhang, X. Wang, X. Huang, and D. Metaxas, “StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks,” arXiv:1612.03242 [cs.CV], 2017.
[27] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-Image Translation with Conditional Adversarial Nets,” arXiv:1611.07004 [cs.CV], 2017.
[28] X. Mao, Q. Li, H. Xie, R. Y. K. Lau, Z. Wang, and S. P. Smolley, “Least Squares Generative Adversarial Networks,” arXiv:1611.04076 [cs.CV], 2017.
[29] Y. Taigman, A. Polyak, and L. Wolf, “Unsupervised Cross-Domain Image Generation,” arXiv:1611.02200v1 [cs.CV], 2017.
[30] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” arXiv:1512.03385v1 [cs.CV], 2015.
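Full-text access rights	On campus: the print thesis is made public 3 years after submission of the authorization form; the electronic full text is authorized for on-campus access; the on-campus electronic thesis is made public 3 years after submission of the authorization form. Off campus: the electronic thesis is authorized for off-campus access and is made public 3 years after submission of the authorization form.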

