§ Thesis Bibliographic Record
  
System ID U0002-1707202318051500
DOI 10.6846/tku202300402
Title (Chinese) 書法風格的手寫字轉換:一種基於生成對抗網路的方法
Title (English) Handwriting to Calligraphy Style Transformation: A Generative Adversarial Network Approach
Institution Tamkang University (淡江大學)
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Academic Year 111 (2022-2023)
Semester 2
Publication Year 112 (2023)
Graduate Student (Chinese) 江岱樺
Graduate Student (English) Dai-Hua Jiang
Student ID 610410598
Degree Master's
Language English
Date of Oral Defense 2023-07-06
Number of Pages 41
Committee Member - 高昶易 (edenkao@scu.edu.tw)
Committee Member - 林其誼 (chiyilin@mail.tku.edu.tw)
Committee Member - 林莊傑 (158778@mail.tku.edu.tw)
Advisor - 吳孟倫 (mlwutp@gmail.com)
Keywords (Chinese) GAN
字體生成
書法字風格轉換
Keywords (English) GAN
font generation
calligraphy style transfer
Chinese Abstract
Chinese calligraphy is the art of writing characters with a brush. Characters in different styles have different shapes and details and convey different aesthetics. To help promote the art of calligraphy, this study proposes a character style transfer method. Unlike existing methods, which almost exclusively convert specific printed fonts into calligraphy characters, our work also takes handwriting by non-experts as input and converts it into calligraphic styles. The method is based on the Generative Adversarial Network (GAN): the computer reads a large number of handwritten characters and their corresponding calligraphy characters and builds a model that transfers the style of handwriting to the style of calligraphy. We use a U-Net-based generator, embed style labels to control the style of the generated characters, and incorporate character skeleton information into the model; adding skeleton information lets the model reconstruct characters more stably. Finally, we compare our method with other state-of-the-art font generation methods and show that it outperforms previous calligraphy style transfer methods.
English Abstract
Chinese calligraphy is the art of writing characters with a brush. Characters in different styles have different shapes and details, each conveying a distinct aesthetic. To promote calligraphy skills, we propose in this study a method that transfers handwritten characters into calligraphy fonts. Unlike existing methods, which almost exclusively convert specific printed fonts into calligraphy characters, our research also takes handwritten characters written by non-experts as input and converts them into calligraphic styles. Our method is based on the Generative Adversarial Network (GAN): it lets the computer read a large number of handwritten characters and their corresponding calligraphy characters, so that the model learns to transfer handwritten characters into calligraphic styles. The generator in our method is based on U-Net. We use an embedding of style labels to control the style generated by the model, and we apply the skeleton information of characters in our model; adding skeleton information lets the model generate characters more stably. Finally, we compare our method with other state-of-the-art font generation methods and demonstrate that it outperforms previous calligraphic character style transfer methods.
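The architecture summarized above (a U-Net generator, a style-label embedding, and skeleton guidance) can be illustrated with a short PyTorch sketch. This is a minimal, hypothetical reconstruction, not the thesis implementation: the class name StyleUNetGenerator, the layer widths, the five-style setting, and the 64x64 glyph size are all assumptions made for the example.

# Minimal sketch (assumed details): a U-Net-style generator that takes the
# glyph and its skeleton map as two input channels and conditions the
# decoder on a learned style-label embedding at the bottleneck.
import torch
import torch.nn as nn

class StyleUNetGenerator(nn.Module):
    def __init__(self, num_styles: int = 5, style_dim: int = 128):
        super().__init__()
        # Encoder: each block halves the spatial resolution.
        self.enc1 = nn.Sequential(nn.Conv2d(2, 64, 4, 2, 1), nn.LeakyReLU(0.2))
        self.enc2 = nn.Sequential(nn.Conv2d(64, 128, 4, 2, 1),
                                  nn.InstanceNorm2d(128), nn.LeakyReLU(0.2))
        self.enc3 = nn.Sequential(nn.Conv2d(128, 256, 4, 2, 1),
                                  nn.InstanceNorm2d(256), nn.LeakyReLU(0.2))
        # Learned embedding of the target style label.
        self.style_emb = nn.Embedding(num_styles, style_dim)
        # Decoder: mirrors the encoder, with U-Net skip connections;
        # the first block also receives the broadcast style vector.
        self.dec3 = nn.Sequential(
            nn.ConvTranspose2d(256 + style_dim, 128, 4, 2, 1),
            nn.InstanceNorm2d(128), nn.ReLU())
        self.dec2 = nn.Sequential(
            nn.ConvTranspose2d(128 + 128, 64, 4, 2, 1),
            nn.InstanceNorm2d(64), nn.ReLU())
        self.dec1 = nn.Sequential(
            nn.ConvTranspose2d(64 + 64, 1, 4, 2, 1), nn.Tanh())

    def forward(self, glyph, skeleton, style_id):
        # Concatenate the handwriting glyph with its skeleton map.
        x = torch.cat([glyph, skeleton], dim=1)          # (B, 2, H, W)
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        # Broadcast the style vector over the bottleneck's spatial grid.
        s = self.style_emb(style_id)                     # (B, style_dim)
        s = s[:, :, None, None].expand(-1, -1, e3.size(2), e3.size(3))
        d3 = self.dec3(torch.cat([e3, s], dim=1))
        d2 = self.dec2(torch.cat([d3, e2], dim=1))
        return self.dec1(torch.cat([d2, e1], dim=1))

# Example: a batch of four 64x64 glyphs rendered in target style 2.
g = StyleUNetGenerator()
out = g(torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64),
        torch.tensor([2, 2, 2, 2]))
print(out.shape)  # torch.Size([4, 1, 64, 64])

The skip connections preserve the stroke layout of the input handwriting, while the style embedding injected at the bottleneck steers the rendering toward the target calligraphy style; conditioning only at the bottleneck is one common design choice, not necessarily the one used in the thesis.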
Table of Contents
CONTENTS

Chinese Abstract	I
Abstract	II
CONTENTS 	III
LIST OF FIGURES	IV
LIST OF TABLES	V
CHAPTER 1 INTRODUCTION	1
1.1 Background	1
1.2 Motivation	2
1.3 Research Objective	3
1.4 Problem Statement	4
1.5 Thesis Organization	4
CHAPTER 2 RELATED WORKS	5
2.1 Font Synthesis Based on Interpolation	5
2.2 GAN Based Methods	8
2.3 Multiple Stage GAN Based Methods	12
2.4 Attention Mechanism	15
CHAPTER 3 PROPOSED METHOD	17
3.1 System Architecture	17
3.2 Normalization	21
3.2.1 Batch Normalization	21
3.2.2 Instance Normalization	22
3.3 Cross Attention Block	23
3.4 Loss Function	24
CHAPTER 4 EXPERIMENTAL RESULTS	26
4.1 Experimental Setup	28
4.2 Seen Style Seen Font	29
4.3 Seen Style Unseen Font	31
4.4 Unseen Handwriting Style Transfer	34
CHAPTER 5 CONCLUSIONS	38
REFERENCES	39

LIST OF FIGURES

Figure 1 Examples of handwritten characters transferred into calligraphy style.	3
Figure 2 The skeleton, edge, and contour of a character.	6
Figure 3 Flow of synthesizing calligraphy characters using interpolation.	7
Figure 4 Illustration of synthesized calligraphy characters using interpolation.	8
Figure 5 Architecture of GAN.	9
Figure 6 Architecture of zi2zi.	10
Figure 7 Architecture of SA-VAE.	11
Figure 8 Flow of calligraphy font style transfer based on GAN.	12
Figure 9 Architecture of SCFont.	13
Figure 10 Architecture proposed by Gao et al.	13
Figure 11 Architecture proposed by Wen et al.	14
Figure 12 Flow of calligraphy font style transfer based on multi-stage GAN.	15
Figure 13 Architecture of the proposed GAN.	18
Figure 14 Cross-attention block architecture.	24
Figure 15 Example character in 5 styles.	29
Figure 16 Generated results: seen style, seen font, transferred to regular script.	30
Figure 17 Generated results: seen style, seen font, transferred to official script.	31
Figure 18 Generated results: seen style, unseen font, transferred to regular script.	32
Figure 19 Generated results: seen style, unseen font, transferred to official script.	33
Figure 20 Failure cases in the generated results.	34
Figure 21 Generated results: handwriting transferred to official script.	35
Figure 22 Comparison of generated results: handwriting transferred to regular script.	36
Figure 23 Comparison of generated results: handwriting transferred to official script.	37

LIST OF TABLES

Table 1 Architecture of the Encoder.	19
Table 2 Architecture of the Decoder.	20
Table 3 Architecture of the Discriminator D and Style Classifier Ds.	21
Table 4 Seen Style Seen Font Similarities.	30
Table 5 Seen Style Unseen Font Similarities.	32
Table 6 Handwriting Transfer Results Similarities.	34
Table 7 Unseen Style Seen Font Similarities.	36
References
[1]	I. Goodfellow et al., “Generative adversarial networks,” in Proceedings of the International Conference on Neural Information Processing Systems, Montreal, Canada, pp. 2672-2680, 2014.
[2]	L. A. Gatys, A. S. Ecker, and M. Bethge, “Image style transfer using convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Nevada, pp. 2414-2423, 2016.
[3]	Y. Tian, “zi2zi: Master Chinese calligraphy with conditional adversarial networks,” GitHub, https://github.com/kaonashi-tyc/zi2zi, April 2017 (accessed January 2023).
[4]	D. Sun, Q. Zhang, and J. Yang, “Pyramid embedded generative adversarial network for automated font generation,” in Proceedings of the 24th International Conference on Pattern Recognition, Beijing, China, pp. 976-981, 2018.
[5]	S. J. Wu, C. Y. Yang, and J. Y. J. Hsu, “CalliGAN: Style and structure-aware Chinese calligraphy character generator,” in Proceedings of the AI for Content Creation Workshop at CVPR 2020, 2020.
[6]	Y. Zhuang et al., “Retrieval of Chinese calligraphic character image,” in Proceedings of the 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, pp. 17-24, 2004.
[7]	Y. Zheng and D. Doermann, “Handwriting matching and its application to handwriting synthesis,” in Proceedings of the Eighth International Conference on Document Analysis and Recognition, Seoul, South Korea, pp. 861-865, 2005.
[8]	X. Zhang and G. Liu, “Chinese calligraphy character image synthesis based on retrieval,” in Proceedings of the 10th Pacific Rim Conference on Multimedia, Bangkok, Thailand, pp. 167-178, 2009. 
[9]	K. F. Lai and R. T. Chin, “Deformable contours: Modeling and extraction,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 11, pp. 1084-1090, November 1995. doi: 10.1109/34.473235
[10]	P. Isola et al., “Image-to-image translation with conditional adversarial networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Hawaii, pp. 1125-1134, 2017.
[11]	A. Odena, C. Olah, and J. Shlens, “Conditional image synthesis with auxiliary classifier GANs,” in Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 2642-2651, 2017.
[12]	Y. Taigman, A. Polyak, and L. Wolf, “Unsupervised cross-domain image generation,” in Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
[13]	D. Sun et al., “Learning to write stylized Chinese characters by reading a handful of examples,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 920-927, 2018.
[14]	Y. Zhang, Y. Zhang, and W. Cai, “Separating style and content for generalized style transfer,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah, pp. 8447-8455, 2018.
[15]	Y. Xie et al., “DG-Font: Deformable generative networks for unsupervised font generation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, Tennessee, pp. 5130-5140, 2021.
[16]	Y. Jiang et al., “SCFont: Structure-guided Chinese font generation via deep stacked networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, Hawaii, pp. 4015-4022, 2019.
[17]	Y. Gao and J. Wu, “GAN-based unpaired Chinese character image translation via skeleton transformation and stroke rendering,” in Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, pp. 646-653, 2020.
[18]	C. Wen et al., “Handwritten Chinese font generation with collaborative stroke refinement,” in Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Waikoloa, Hawaii, pp. 3882-3891, 2021.
[19]	A. Vaswani et al., “Attention is all you need,” in Proceedings of the Neural Information Processing Systems, Long Beach, California, 2017.
[20]	H. Zhang et al., “Self-attention generative adversarial networks,” in Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, pp. 7354-7363, 2019.
[21]	C. Ren et al., “SAFont: Automatic font synthesis using self-attention mechanisms,” Australian Journal of Intelligent Information Processing Systems, vol. 16, no. 2, pp.19-25, December 2019.
[22]	L. Tang et al., “Few-shot font generation by learning fine-grained local styles,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New Orleans, Louisiana, pp. 7895-7904, 2022.
[23]	Y. Kong et al., “Look closer to supervise better: One-shot font generation via component-based discriminator,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New Orleans, Louisiana, pp. 13482-13491, 2022.
[24]	O. Ronneberger et al., “U-Net: Convolutional networks for biomedical image segmentation,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, pp. 234-241, 2015.
[25]	D. Ulyanov et al., “Texture networks: Feed-forward synthesis of textures and stylized images,” in Proceedings of the 33rd International Conference on Machine Learning, New York, NY, pp. 1349-1357, 2016.
[26]	S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proceedings of the 32nd International Conference on Machine Learning, Lille, France, pp. 448-456, 2015.
[27]	S. Yuan et al., “SE-GAN: Skeleton enhanced GAN-based model for brush handwriting font generation,” in Proceedings of the 2022 IEEE International Conference on Multimedia and Expo, Taipei, Taiwan, pp. 1-6, 2022.
[28]	N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, January 1979. doi: 10.1109/TSMC.1979.4310076
[29]	T. C. Lee et al., “Building skeleton models via 3-D medial surface/axis thinning algorithms,” CVGIP: Graphical Models and Image Processing, vol. 56, no. 6, pp. 462-478, November 1994. doi: 10.1006/cgip.1994.1042
[30]	D. M. Allen, “Mean square error of prediction as a criterion for selecting variables,” Technometrics, vol. 13, no. 3, pp. 469-475, April 1971. doi: 10.1080/00401706.1971.10488811
[31]	Q. Huynh-Thu and M. Ghanbari, “Scope of validity of PSNR in image/video quality assessment,” Electronics Letters, vol. 44, no. 13, pp. 800-801, June 2008. doi: 10.1049/el:20080522
[32]	Z. Wang et al., “Image quality assessment: From error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, April 2004. doi: 10.1109/TIP.2003.819861
[33]	R. Zhang et al., “The unreasonable effectiveness of deep features as a perceptual metric,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah, pp. 586-595, 2018.
[34]	Ministry of Education, “國字標準字體字形檔-楷書,” 語文成果網, https://language.moe.gov.tw/result.aspx?classify_sn=23&subclassify_sn=442&content_sn=30, October 2019 (accessed January 2023).
[35]	Ministry of Education, “隸書字形檔,” 語文成果網, https://language.moe.gov.tw/result.aspx?classify_sn=23&subclassify_sn=442&content_sn=51, October 2019 (accessed January 2023).
[36]	S. Wu, “悠哉字体 / Yozai Font,” Github, https://github.com/lxgw/yozai-font, June 2020 (accessed January 2023).
Full-Text Usage Authorization
National Central Library
The author agrees to grant the National Central Library a royalty-free license: the bibliographic record and the electronic full text are made publicly available on the Internet immediately after the authorization form is submitted.
On campus
The printed thesis is publicly available on campus immediately.
The author agrees to authorize worldwide public access to the electronic full text.
The electronic thesis is publicly available on campus immediately.
Off campus
The author agrees to grant authorization to database vendors.
The electronic thesis is publicly available off campus immediately.
