§ Browse Thesis Bibliographic Record
  
System ID U0002-2006202514352600
DOI 10.6846/tku202500308
Title (Chinese) 具有元活化正規化的可逆網路-用於內容保留影像風格轉換
Title (English) Reversible Network with Meta ActNorm for Content-Preserved Image Style Transfer
Title (Third Language)
University Tamkang University
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign Degree School Name
Foreign Degree College Name
Foreign Degree Institute Name
Academic Year 113
Semester 2
Publication Year 114
Author (Chinese) 林楷鈞
Author (English) Kai-Jun Lin
Student ID 612410141
Degree Master's
Language Traditional Chinese
Second Language
Oral Defense Date 2025-06-19
Number of Pages 64
Committee Advisor - 林慧珍 (086204@mail.tku.edu.tw)
Committee Member - 顏淑惠
Committee Member - 凃瀞珽
Keywords (Chinese) Arbitrary style transfer
Flow model
Content leaking restoration
Meta activation normalization
Hierarchical loss function
Keywords (English) Arbitrary style transfer
Flow model
Content loss restoration
Meta-activation normalization
Hierarchical loss function
Keywords (Third Language)
Subject Classification
Abstract (Chinese)
Thesis abstract:
This research focuses on image style transfer and improves the ArtFlow framework proposed by Jie An et al. ArtFlow adopts a reversible mechanism: in the forward process, an image is mapped from pixel space to feature space and represented as feature vectors, which a style transfer module then converts into stylized feature vectors; in the reverse process, the stylized feature vectors are mapped back to pixel space to obtain the stylized image, effectively preserving the original content details. This design guarantees the reversibility of the style transfer process and avoids the content leaking problem common in traditional methods. However, there is still room for improvement in adaptability. To enhance the model's adaptability while maintaining structural fidelity, this thesis takes the ArtFlow architecture as its core and introduces, into its activation normalization module, a meta network that dynamically generates the normalization parameters; the improved module is called Meta Activation Normalization (Meta ActNorm). This design not only strengthens the adaptability and flexibility of the overall style transfer, but also improves transfer quality and visual results. Combining these improvements, we name the proposed architecture Image Style Transfer Based on Meta ArtFlow (ISTMAF).
In the ISTMAF architecture, Meta ActNorm adjusts dynamically: during the forward process it generates adaptive normalization parameters from the input image, and during the reverse process it effectively fuses these parameters, further strengthening the model's adaptability and stability. A series of experiments and quantitative evaluations shows that the proposed method not only effectively preserves the key structures and details of the content image, but also achieves visual consistency after style transfer and avoids artifacts such as structural distortion.
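
To make the idea of input-dependent normalization parameters concrete, the following is a minimal PyTorch sketch of an activation-normalization layer whose per-channel scale and bias are produced by a small meta network from statistics of the input, rather than being fixed learned parameters. The layer sizes, the use of per-channel mean and standard deviation as the meta network's input, and the way the parameters are carried into the inverse are illustrative assumptions, not the Netsb design described in the thesis.

import torch
import torch.nn as nn

class MetaActNorm(nn.Module):
    """Sketch: an ActNorm-like layer whose scale/bias come from a meta network (assumed design)."""

    def __init__(self, channels: int, hidden: int = 64):
        super().__init__()
        # Meta network: per-channel statistics of the input -> (log-scale, bias).
        self.meta = nn.Sequential(
            nn.Linear(2 * channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 2 * channels),
        )

    def _params(self, x: torch.Tensor):
        # Summarize the input feature map by its per-channel mean and std.
        stats = torch.cat([x.mean(dim=(2, 3)), x.std(dim=(2, 3))], dim=1)
        log_scale, bias = self.meta(stats).chunk(2, dim=1)
        scale = log_scale.exp()                      # keep the scale positive
        return scale[:, :, None, None], bias[:, :, None, None]

    def forward(self, x: torch.Tensor):
        # Forward process: normalize with parameters generated from the input itself.
        scale, bias = self._params(x)
        return scale * x + bias, (scale, bias)       # return the parameters for the inverse

    def inverse(self, y: torch.Tensor, params):
        # Reverse process: reuse the parameters produced during the forward pass.
        scale, bias = params
        return (y - bias) / scale

Carrying the forward-pass parameters into inverse() mirrors, in simplified form, the fusion of normalization parameters during the reverse process described above.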

Abstract (English)
This research primarily focuses on image style transfer and enhances the ArtFlow framework proposed by Jie An et al. ArtFlow adopts a reversible mechanism that enables an image to be mapped from pixel space to feature space during the forward process, transforming it into a feature vector. This feature vector is then converted into a stylized feature vector by a style transfer module. In the reverse process, the stylized feature vector can be mapped back to pixel space to generate the stylized image, effectively preserving the original content details. This design ensures the reversibility of the style transfer process and effectively avoids the content leaking issue common in traditional methods. However, there is still room for improvement in terms of adaptability. To enhance the model's adaptability while maintaining structural fidelity, this study improves the ArtFlow framework by incorporating into its ActNorm module a meta network capable of dynamically generating normalization parameters. The improved module is referred to as Meta Activation Normalization (Meta ActNorm). This design not only strengthens the overall adaptability and flexibility of the image style transfer system but also effectively improves transfer quality and visual performance. In light of these improvements, the proposed architecture is referred to as ISTMAF (Image Style Transfer Based on Meta ArtFlow).
In the ISTMAF architecture, Meta ActNorm has dynamic adjustment capabilities, allowing it to generate adaptive normalization parameters from the input image during the forward process and to effectively integrate these parameters during the reverse process, further enhancing the model's adaptability and stability. A series of experimental results and quantitative evaluations demonstrates that this method not only effectively preserves the key structures and details of the content image but also achieves visual consistency after style transfer, avoiding distortions such as structural deformation.
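
As a rough illustration of the forward/reverse loop described above, the following Python sketch stylizes a content image through an invertible mapping: both images are encoded into feature space, the content features are transformed by a feature-space style module (e.g. a WCT- or AdaIN-like transform), and the result is decoded by running the flow in reverse. The flow and transfer objects and their forward()/inverse() interface are hypothetical placeholders, not the thesis implementation.

import torch

def stylize(content, style, flow, transfer):
    # Forward process: map both images from pixel space to feature space.
    with torch.no_grad():
        z_content = flow.forward(content)
        z_style = flow.forward(style)
        # Feature-space style transfer (placeholder for a WCT/AdaIN-style module).
        z_stylized = transfer(z_content, z_style)
        # Reverse process: the same invertible network maps features back to pixels,
        # so no separately trained decoder can leak or discard content.
        return flow.inverse(z_stylized)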
Abstract (Third Language)
Table of Contents
Contents
Contents	v
List of Figures	vii
List of Tables	viii
Chapter 1  Introduction	1
Chapter 2  Related Work	5
2.1 Generative Flow Models	5
2.2 ArtFlow	5
2.3 Whitening and Coloring Transform (WCT)	6
Chapter 3  Methodology	8
3.1 Overall Architecture	8
3.2 Model Procedures	10
3.2.1 Forward Procedure	10
3.2.2 Reverse Procedure	13
3.2.3 ISTMAF Operation Flow	15
3.3 Loss Functions	16
3.4 Post-Processing	18
Chapter 4  Experimental Results	19
4.1 Experimental Setup	19
4.2 Transfer Results of the Proposed Method	20
4.2.1 Parameter Fusion in Meta ActNorm and Style Loss Design	20
4.2.2 Parameter Settings for the Aligned Style Loss	22
4.2.3 Comparison of Loss Function Designs	23
4.2.4 Results of Repeated Transfers	24
4.2.5 Post-Processing	25
4.2.6 More Test Examples	27
4.3 Comparison with Other Methods	29
Chapter 5  Conclusion and Discussion	31
References	32
Appendix: English Thesis	36


List of Figures
Figure 1. ISTMAF architecture	9
Figure 2. Architecture of the meta network Netsb	11
Figure 3. Results under different settings	21
Figure 4. Results under the aligned style loss settings	22
Figure 5. Differences among loss function designs	23
Figure 6. Results of repeated transfers	24
Figure 7. Comparison of post-processing results	26
Figure 8. Style transfer examples of the proposed method	27
Figure 9. Face image transfer examples	28
Figure 10. Comparison of style transfer results	30

List of Tables
Table 1. Metric comparison with and without post-processing	26
Table 2. Comparison of SSIM and Gram difference evaluation values	30
References
[1]	Yanxi Wei, “Artistic image style transfer based on cyclegan network model,” International Journal of Image and Graphics, Vol. 24, No. 04, 2450049, 2024.
https://doi.org/10.1142/S0219467824500499
[2]	Jongmin Gim, Jihun Park, Kyoungmin Lee and Sunghoon Im, “Content-adaptive style transfer: A training-free approach with vq autoencoders,” in Proceedings of the Asian Conference on Computer Vision, 2024 (ACCV 2024), pp. 187–204.
[3]	Aaron Hertzmann, Charles E. Jacobs, Nuria Oliver, Brian Curless, and David H. Salesin, “Image analogies,” in Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, 2001 (SIGGRAPH '01), pp. 327–340.
https://doi.org/10.1145/383259.383295
[4]	Alexei A. Efros and William T. Freeman, “Image quilting for texture synthesis and transfer,” in Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '01), pp. 341–346. https://doi.org/10.1145/383259.383296
[5]	Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge, “Image style transfer using convolutional neural networks,” 2016 Conference on Computer Vision and Pattern Recognition (CVPR 2016), pp. 2414–2423, doi: 10.1109/CVPR.2016.265.
[6]	Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” 2017 IEEE International Conference on Computer Vision (ICCV 2017), pp. 2242–2251, doi: 10.1109/ICCV.2017.244.
[7]	Xun Huang and Serge Belongie, “Arbitrary style transfer in real-time with adaptive instance normalization,” 2017 IEEE International Conference on Computer Vision (ICCV 2017), pp. 1510–1519, doi: 10.1109/ICCV.2017.167.
[8]	Jie An, Siyu Huang, Yibing Song, Dejing Dou, Wei Liu, and Jiebo Luo, “ArtFlow: unbiased image style transfer via reversible neural flows,” 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021). doi: 10.1109/CVPR46437.2021.
[9]	Weichen Fan, Jinghuan Chen, and Ziwei Liu, “Hierarchy flow for high-fidelity image-to-image translation,” arXiv:2308.06909v1 [cs.CV], 2023. 
https://doi.org/10.48550/arXiv.2308.06909
[10]	Karen Simonyan and Andrew Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv:1409.1556v6 [cs.CV], 2015.
https://doi.org/10.48550/arXiv.1409.1556
[11]	Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, “Generative adversarial nets,” The Twenty-Eighth Annual Conference on Neural Information Processing Systems, 2014 (NeurIPS 2014).
[12]	Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky, “Instance normalization: the missing ingredient for fast stylization,” arXiv:1607.08022v3 [cs.CV], 2017. https://doi.org/10.48550/arXiv.1607.08022
[13]	Diederik P. Kingma and Prafulla Dhariwal, “Glow: generative flow with invertible 1x1 convolutions,” The Thirty-Second Annual Conference on Neural Information Processing Systems, 2018 (NeurIPS 2018).
[14]	Fang Yaom, “A learning theory of meta learning,” National Science Review, Vol. 11, Issue 8, nwae133, August 2024. https://doi.org/10.1093/nsr/nwae133
[15]	Mikołaj Bińkowski, Danica J. Sutherland, Michael Arbel, and Arthur Gretton, “Demystifying mmd gans,” arXiv:1801.01401v5 [stat.ML], 2018. 
https://doi.org/10.48550/arXiv.1801.01401
[16]	Diederik P. Kingma and Max Welling, “An introduction to variational autoencoders,” arXiv:1906.02691v3 [cs.LG], 2019. https://doi.org/10.1561/2200000056
[17]	Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, and Ming-Hsuan Yang, “Universal style transfer via feature transforms,” in Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), pp. 385–395.
[18]	Tian Qi Chen and Mark Schmidt, “Fast patch-based style transfer of arbitrary style,” arXiv:1612.04337v1 [cs.CV], 2016. https://doi.org/10.48550/arXiv.1612.04337
[19]	Xueting Li, Sifei Liu, Jan Kautz, and Ming-Hsuan Yang, “Learning linear transformations for fast arbitrary style transfer,” 2019 Conference on Computer Vision and Pattern Recognition (CVPR 2019).
[20]	Ming Lu, Hao Zhao, Anbang Yao, Yurong Chen, Feng Xu, and Li Zhang, “A closed-form solution to universal style transfer,” 2019 IEEE International Conference on Computer Vision (ICCV 2019).
[21]	Lu Sheng, Ziyi Lin, Jing Shao, and Xiaogang Wang, “Avatar-net: multi-scale zero-shot style transfer by feature decoration,” 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018).
[22]	Hung-Yu Chen, I-sheng Fang, Chia-Ming Cheng, and Wei-Chen Chiu, “Self-contained stylization via steganography for reverse and serial style transfer,” arXiv:1812.03910v3 [cs.CV], 2020. https://doi.org/10.48550/arXiv.1812.03910
[23]	Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick, “Microsoft coco: common objects in context,” The European Conference on Computer Vision, 2014 (ECCV 2014), pp. 740–755.
[24]	Small yellow duck and Wendy Kan, “Painter by numbers,” Kaggle competition, 2016. https://kaggle.com/competitions/painter-by-numbers
[25]	Tero Karras, Samuli Laine, and Timo Aila, “A style-based generator architecture for generative adversarial networks,” 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019).
[26]	Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, and Timo Aila, “Training generative adversarial networks with limited data,” in Proceedings of the 34th International Conference on Neural Information Processing Systems, 2020 (NeurIPS 2020), Article No. 1015, pp. 1210–1211.
Thesis Full-Text Usage Authorization
National Central Library
Agrees to grant the National Central Library a royalty-free license; the bibliographic record and electronic full text are made publicly available on the Internet immediately after the authorization form is submitted.
On campus
The printed thesis is made publicly available on campus immediately.
Agrees to authorize worldwide public access to the electronic full text.
The electronic thesis is made publicly available on campus immediately.
Off campus
Agrees to grant authorization to database vendors.
The electronic thesis is made publicly available off campus immediately.
