
系統識別號 
U00023107201313180800 
中文論文名稱

快速模指數運算之研究 
英文論文名稱

A Study of Fast Modular Exponentiation 
校院名稱 
淡江大學 
系所名稱(中) 
資訊工程學系博士班 
系所名稱(英) 
Department of Computer Science and Information Engineering 
學年度 
101 
學期 
2 
出版年 
102 
研究生中文姓名 
黃龍信 
研究生英文姓名 
LoangShing Huang 
學號 
896410114 
學位類別 
博士 
語文別 
英文 
口試日期 
20130613 
論文頁數 
89頁 
口試委員 
指導教授黃仁俊 委員楊中皇 委員王旭正 委員李南逸 委員黃心嘉 委員賴義鵬 委員黃仁俊

中文關鍵字 
模指數運算
蒙哥瑪利模數運算
KarasubaOfman方法
Comba方法

英文關鍵字 
Modular Exponentiation
Montgomery Reduction
KarasubaOfman Method
Comba Method

學科別分類 
學科別＞應用科學＞資訊工程

中文摘要 
本篇論文探討如何加速模指數運算之效能。針對行動通訊裝置如智慧型手機、平板電腦與智慧卡連結網際網路、3G/4G行動通訊、車載通訊與無線通訊系統的需求日益增加，安全通訊上的需求益日顯重要。這些行動通訊裝置需要安全的通訊環境以確保互相交換之訊息可維持機密性並只有經認證的相關人等可存取該些裝置。這樣的安全通訊環境可透過現代密碼函數來建立，而這些函數皆植基於所謂模指數運算上。模指數運算是由模乘法所構成，模乘法又是建立於模數運算與大整數乘法上。本篇論文提出一些策略可用來加速模指數運算效能，包含加速蒙哥瑪利模運算(Montgomery Reduction)、決定最佳切割方法與臨界值可用來加速大整數乘法與模運算。同時，本篇論文亦致力於減少記憶體存取次數可讓大整數乘法更有效率。本篇論文所提之策略均旨於加速模指數運算效能，使現代密碼通訊協定可運用於行動通訊裝置上。 
英文摘要 
This thesis investigates how to accelerate the performance of modular exponentiation. As the requirement for secure communication over the Internet, 3G/4G cellular systems, vehicular communication networks and wireless sensor networks is growing, connecting small, mobile devices such as smart phones, tablet PCs, and smart cards to these networks is also a growing trend. Many of these devices require secure connections to ensure that the exchanged information remains confidential and that only authorized persons can access the devices. Such secure communication is established on many modern cryptographic functions, based on the framework of modular exponentiation. Modular exponentiation is composed of modular multiplication, which is made of modular reduction, large integer multiplication. In order to speed up modular exponentiation, this thesis proposes some strategies, including speeding up Montgomery reduction, determining the optimal split method and the threshold value so as to speed up the large integer multiplication, and modular reduction. Furthermore, this thesis ponders on reducing the times of memory accesses so as to make the large integer multiplication even more efficient. The goal of all strategies proposed is intended to accelerate the performance of modular exponentiation as far as possible, and make it possible to implement modern cryptographic protocols on mobile devices. 
論文目次 
Tables of Contents
CHAPTER 1. Introduction 1
CHAPTER 2. Related Works 14
2.1. Reducing the Numbers of Bits of the Exponent with the Value of “1” 15
2.2. Accelerating Modular Reduction 16
2.3. Accelerating a Large Integer Multiplication 20
2.3.1. Classical Multiplication 20
2.3.2. KaratsubaOfman Method 22
CHAPTER 3. Efficient Montgomery Reduction for Large Integers 26
3.1. Discussions of Montgomery Reduction Using the Product Scanning Method 26
3.2. The Efficient Method to Perform LargeInteger Montgomery Reduction 35
CHAPTER 4. The Optimal Split Method of Large Integer Multiplication 44
4.1. The Proposed Method 45
4.2. Analysis of the Proposed Method 48
4.2.1. Two Basic Methods 48
4.2.2. The Optimal Method 55
CHAPTER 5. Experiment Results 61
5.1. Architecture of TI DSP C55x Family[39] 61
5.2. Experiment Results for Efficient Montgomery Reduction 64
5.3. Experiment Results for the Optimal Split Method 70
5.4. Experiment Results for Integrated Proposed Methods 74
CHAPTER 6. Conclusions and Future Researches 79
6.1. Summary of Contributions 79
6.2. Future Researches 82
Reference 84
List of Figures
Figure 1. The diagram of the classical multiplication 21
Figure 2. The diagram of product scanning method 27
Figure 3. Balancedtway split (at split depth 1) 49
Figure 4. Unbalancedtway split (at split depth 1) 52
Figure 5. Plot of the number of base instructions of recursivebalancedtway split (at split depth n) 57
Figure 6. Graphical representation of TMS320C55x 63
List of Tables
Table 1. RSAbased key agreement and key transport key length transitions (n is the size of the modulus) 12
Table 2. Recommended algorithms and minimum key sizes(L is the size of the public key, N is the size of the private key; k and f is the key size) 12
Table 3. Comparable strengths(L is the size of the public key, N is the size of the private key; k and f is the key size) 13
Table 4. The number of base instructions for two sword classical multiplication 23
Table 5. The number of base instructions for a large integer addition/subtraction 25
Table 6. The number of classical mul., large integer add./sub. for the KaratsubaOfman method 25
Table 7. The number of base instructions for the KaratsubaOfman method 25
Table 8. Complexity of the three reduction algorithms in reducing a 2sword number T modulo a sword modulus m. 66
Table 9. Numbers of CPU cycles for implementing three Montgomery reductions 67
Table 10. Speedup of our proposed method vs. other Montgomery reductions (the other modular reductions/our proposed method) 67
Table 11. Numbers of CPU cycles for modular exponentiation with different modular multiplications where the bit lengths of base, modulus, and exponent are the same length 68
Table 12. Speedup of modular exponentiation with our proposed method vs. other modular multiplications where the bit lengths of base, modulus, and exponent are the same length (the time cost of other modular reductions/ the time cost of our proposed method) 69
Table 13. The computational time for one modular exponentiation with our new Montgomery reduction, where the bit lengths of base and modulus are 1024 on 100MHz CPU frequency 69
Table 14. The computational time for one modular exponentiation with our new Montgomery reduction, where the bit lengths of base and modulus are 2048 on 100MHz CPU frequency 70
Table 15. The number of CPU cycles for the large integer multiplication 72
Table 16. The number of CPU cycles for the multiplication with unbalanced methods 73
Table 17. The number of CPU cycles for integrated Montgomery modular multiplication and the proposed recursivebalanced2way method 73
Table 18. The number of CPU cycles for modular exponentiation with different modular multiplications where the bit lengths of base, modulus, and exponent are the same length 74
Table 19. Speedup of modular exponentiation with the proposed method vs. other modular multiplications where the bit lengths of base, modulus, and exponent are the same length (the CPU Cycles of other modular reductions/ the CPU Cycles of the proposed method) 74
Table 20. The number of CPU cycles for our integrated Montgomery modular multiplication and the other Montgomery multiplications 76
Table 21. The number of CPU cycles for modular exponentiation with different modular multiplications where the bit lengths of base, modulus, and exponent are the same length 77
Table 22. Speedup of modular exponentiation with the proposed integrated method vs. other modular multiplications where the bit lengths of base, modulus, and exponent are the same length (the CPU Cycles of other modular reductions/ the CPU Cycles of the proposed method) 78 
參考文獻 
[1] J. Grobschadl, R. M. Avanzi, E. Savas, and S. Tillich, "Energyefficient software implementation of large integer modular arithmetic," CHES 2005, LNCS 3659, pp. 7590, 2005.
[2] T. Simuni, "Energy Efficient System Design and Utilization," Ph.D. Thesis, Stanford University, Stanford, CA, USA, 2001.
[3] S. Ravi and A. Raghunathan, "Security in embedded systems: Design challenges," ACM Transaction on Embedded Computing Systems, vol. 3, pp. 461491, 2004.
[4] R. Rivest, A. Shamir, and L. Adleman, "A method for obtaining digital signature and publickey cryptosystems," Communications of ACM, vol. 21, pp. 120126, 1978.
[5] K. Koyama, UM. Maurer, T. Okamoto, and SA. Vanstone, "New publickey schemes based on elliptic curves over the ring Zn," presented at the CRYPTO'91, Santa Barbara, 1991.
[6] NIST, "FIPS 1863: Digital signature standard(DSS)," June 2009, Available: http://csrc.nist.gov/publications/fips/fips1862/fips1862change1.pdf.
[7] W. Diffie and M.E. Hellman, "New directions in cryptography," IEEE Transaction on Information Theory, vol. IT22, pp. 638654, 1976.
[8] H. Y. Lin, "Security and authentication in PCs," Computers and Electrical Engineering, vol. 25, pp. 225248, 1999.
[9] S. Suzuki and K. Nakada, "An authentication technique based on distributed security management for the global mobility network," IEEE Journal on Selected Areas Communications, vol. 15, pp. 16081617, 1997.
[10] W. Geiselmann and R. Steinwandt, "Specialpurpose hardware in cryptanalysis: The case of 1024Bit RSA," IEEE Security and Privacy, vol. 5, pp. 6366, 2007.
[11] T. Guneysu, C. Paar, and J. Pelel, "Specialpurpose hardware for solving the elliptic curve discrete logarithm problem," ACM Transactions on Reconfigurable Technology and Systems, vol. 1, article no. 8, 2008.
[12] NIST, "Special Publications SP 80057 Part 1 : Recommendation for key management: Part 1: General (Revision 3)," July 2012, Available: http://csrc.nist.gov/publications/nistpubs/80057/sp80057_part1_rev3_general.pdf.
[13] NIST, "SP 80057 Part 3 : Recommendation for key managemen: Part 3: Applicationspecific key management guidance," December 2009, Available: http://csrc.nist.gov/publications/nistpubs/80057/sp80057_PART3_keymanagement_Dec2009.pdf.
[14] D. Boneh and H. Shacham, "Fast variants of RSA," Cryptobytes, vol. 5, pp. 19, 2002.
[15] D. E. Knuth, The Art of computer programming, AddisonWesley, 1998.
[16] M. Joye and S. M. Yen, "Optimal lefttoright binary signeddigit recoding," IEEE Transactions on Computers, vol. 49, pp. 740748, 2000.
[17] W. C. Yang, D. J. Guan, and C. S. Laih, "Algorithm of asynchronous binary signeddigit recoding on fast multiexponentiation," Applied Mathematics and Computation, vol. 167, pp. 108117, 2005.
[18] B. Qin, M. Li, F. Kong, and D. Li, "New lefttoright minimal weight signeddigit radixr representation," Computers and Electrical Engineering, vol. 35, pp. 150158, 2009.
[19] S. M. Yen, "Improved commonmultiplicandmultiplication and fast exponentiation by exponent decomposition," IEICE Transaction on Fundamentals, vol. E80A, pp. 11601163, 1997.
[20] C. L. Wu, "An efficient commonmultiplicandmultiplication method to the Montgomery algorithm for speeding up exponentiation," Information Sciences, vol. 179, pp. 410421, 2009.
[21] M. E. Kaihara and N. Takagi, "Bipartite modular multiplication method," IEEE Transactions on Computers, vol. 57, pp. 157164, 2008.
[22] M. Knezevic, F. Vercauteren, and I. Verbauwhede, "Speeding up bipartite modular multiplication," Arithmetic of Finite Fields, LNCS 6087, pp. 166179, 2010.
[23] J. H. Yang and C. C. Chang, "Efficient residue number system iterative modular multiplication algorithm for fast modular exponentiation," IET Computers and Digital Techniques, vol. 2, pp. 15, 2008.
[24] B. J. Phillips, Y. Kong, and Z. Lim, "Highly parallel modular multiplication in the residue number system using sum of residues reduction," Applicable Algebra in Engineering, Communication and Computing, vol. 21, pp. 249255, 2010.
[25] P. L. Montgomery, "Modular multiplication without trial division," Mathematics of Computation, vol. 44, pp. 519521, 1985.
[26] P. Barrett, "Implementing the Rivest, Shamir and Adleman public key encryption algorithm on a standard digital signal processor," CRYPTO ’86, LNCS 263, pp. 311323, 1987.
[27] A. Bosselaers, R. Govaerts, and J. Vandewalle, "Comparison of three modular reduction functions," CRYPTO ’93, LNCS 773, pp. 175186, 1993.
[28] H. Eberle, S. Shantz, V. Gupta, N. Gura, L. Rarick, and L. Spracklen, "Accelerating nextgeneration publickey cryptosystems on generalpurpose CPUs," IEEE Macro, vol. 25, pp. 5259, 2005.
[29] A. Liu and P. Ning, "TinyECC: A configurable library for elliptic curve cryptography in wireless sensor networks," Proceedings of the 7th international conference on Information processing in sensor networks (IPSN'08), pp. 245256, 2008.
[30] C. Lederer, R. Mader, M. Koschuch, J. Grobschadl, A. Szekely, and S. Tillich, "Energyefficient implementation of ECDH key exchange for wireless sensor networks," WISTP 2009, LNCS 5746, pp. 112127, 2009.
[31] J. Grosschadl, "TinySA: A security architecture for wireless sensor networks," Proceedings of the 2006 ACM CoNEXT conference (CoNEXT '06 ), article no. 55, 2006.
[32] Y. Zhang and J. Grossch‥adl, "Efficient primefield arithmetic for elliptic curve cryptography on wireless sensor nodes," Proceedings of the 1st International Conference on Computer Science and Network Technology (ICCSNT 2011), vol. 1, pp. 459466, 2011.
[33] T. ElGamal, "A public key cryptosystem and a signature scheme based on discrete logarithm," CRYPTO ’84, LNCS 196, pp. 1018, 1985.
[34] A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone, Handbook of applied cryptography, CRC Press, 1996.
[35] A. Karatsuba and Y. Ofman, "Multiplication of multidigit numbers on automata," Soviet Physics Doklady, vol. 7, pp. 595596, 1963.
[36] P. Comba, "Exponentiation cryptosystems on the IBM PC," IBM Systems Journal, vol. 29, pp. 526538, 1990.
[37] Texas Instruments, "TMS320C5510 Ultra Low Power DSP," Available: http://processors.wiki.ti.com/index.php/C5510.
[38] Texas Instruments, "OMAP (Open Multimedia Applications Platform)," Available: http://en.wikipedia.org/wiki/Texas_Instruments_OMAP/.
[39] Texas Instruments, "C5000 DSPs: architecture & peripheral features," Available: http://www.ti.com.
[40] C. K. Koc, T. Acar, and B. Kaliski, "Analyzing and comparing Montgomery multiplication algorithms," IEEE Micro, vol. 16, pp. 2633, 1996. 
論文使用權限 
同意紙本無償授權給館內讀者為學術之目的重製使用，於20130806公開。同意授權瀏覽/列印電子全文服務，於20180806起公開。 


