淡江大學覺生紀念圖書館 (TKU Library)
進階搜尋


  查詢圖書館館藏目錄
系統識別號 U0002-1406200616041300
中文論文名稱 基數八無進位式快速除法器架構之設計
英文論文名稱 Carry-Free Radix-8 Division Architecture and Implementation of the Divider
校院名稱 淡江大學
系所名稱(中) 電機工程學系碩士在職專班
系所名稱(英) Department of Electrical Engineering
學年度 94
學期 2
出版年 95
研究生中文姓名 桂志顥
研究生英文姓名 Chih-Hao Kuei
學號 789350153
學位類別 碩士
語文別 中文
口試日期 2006-06-07
論文頁數 108頁
口試委員 指導教授-余繁
指導教授-江正雄
委員-江正雄
委員-余繁
委員-李揚漢
委員-鄭國興
委員-呂學坤
中文關鍵字 New Svoboda-Tung  基底八  倍率單元  除法器  帶號數字 
英文關鍵字 New Svoboda-Tung  Radix-8  Prescale  Division  Signed-Digit Number 
學科別分類 學科別應用科學電機及電子
中文摘要 因為至今的高速運算以及多媒體應用盛行,在微處理器中的基本運算單元之設計就顯得相當重要。雖然除法器並不是最常被運用到的運算單元,但是它在算數運算單元中卻是扮演著相當重要的角色。位元遞迴的除法演算是一個簡易且常被運用的方法,為了加快運算的速度,我們可以增加基底數目來減少疊代運算的次數,若是採用基底β來運算,其中β=2m,則每次疊代運算可得到m位元的商數,所需要的疊代次數為n/m。我們提出了一個基底為八且相容於IEEE 754-1985的浮點運算除法器,整個除法器的運算是根據New Svoboda-Tung(NST)演算法以及基底為八的多餘數字系統。
除法器使用了無進位延遲的加法單元來完成簡單的遞迴運算,在運算元部份,是透過倍率單元來完成。我們整理了一個有系統的方法來完成倍率單元的設計,也提出了除法架構的硬體設計,在運算時間的複雜度方面是一個常數,不會因為位元長度而改變。在我們所設計的除法器中,因為運用了無進位延遲加法器,所以可以快速的處理任何位元的除法。在模擬結果中可知道,硬體電路的複雜度以及運算速度上的效能都比傳統SRT除法器來的好。
英文摘要 Due to the progress of high speed computation and multimedia application, the hardware implementation of all basic arithmetic operations becomes important in the design of microprocessor. Although division is not the most frequent arithmetic operation, Implementation of division is always one key point in arithmetic units. A simple and widely implemented class of division algorithm is digit recurrence. To speedup the division process, one may reduce the number of iteration steps by increasing the radix β of the process. Selecting β=2m allows the generation of m quotient bits at each step and the number of steps can be reduced to n/m. We propose a radix-8 floating point division that complies with the IEEE 754-1985. This method is based on New Svoboda-Tung(NST) algorithm and the radix-8 redundant number system.
The divider involves a simple recurrence with carry-free addition and employs prescaling of the operands. We summarize a general systematic method to accomplish the prescaling, and we also propose a hardware scheme such that the timing complexity is constant regardless of the bit length of the divisor. By taking the advantage of the carry-free addition, we can perform any bit length divider in a fast easy manner. The simulation results show that the hardware complexity and performance of this divider is competitive with conventional SRT dividers.
論文目次 中文摘要----------I
ABSTRACT---------- II
目錄----------III
圖目----------VI
表目----------IX
第一章: 緒論
1.1 研究動機與目標----------1
1.2 論文架構----------3
第二章: 除法器架構、原理簡介
2.1 IEEE 754浮點標準----------5
2.2 浮點除法運算的組織架構----------9
2.3 帶號數字系統與帶號數字加法器簡介----------12
2.4 各式除法架構、原理簡介----------20
2.4.1 除法架構分類----------20
2.4.2 Newton-Raphson除法簡介----------20
2.4.3 Goldschmidt除法簡介----------25
2.4.4 Division by convergence除法簡介----------29
2.4.5 Restore除法簡介----------31
2.4.6 Nonrestore除法簡介----------35
2.4.7 SRT除法簡介----------38
2.4.8 Svoboda-Tung除法簡介----------47
第三章: 高基底除法
3.1 高基底除法簡介----------58
3.2 高基底SRT除法----------59
3.3 New Svoboda-Tung除法----------63
3.3.1 Svoboda-Tung除法簡介----------63
3.3.2 New Svoboda-Tung除法簡介----------67
3.3.3 New Svoboda-Tung除數範圍討論----------68
3.3.4 重置條件----------69
3.3.5 New Svoboda-Tung除法分析----------70
3.3.6 倍率單元(Prescale Unit)分析----------73
3.3.7 New Svoboda-Tung演算流程----------74
第四章: NST基底八除法架構
4.1 除法器架構----------76
4.2 倍率單元----------78
4.3 疊代單元----------89
第五章: 模擬與驗證結果
5.1 驗證流程----------94
5.2 模擬結果----------96
5.3 效能比較----------98
第六章: 結論
6.1 結論----------101
6.2 檢討與建議----------101
參考文獻----------103圖 目
圖2.1 IEEE 754 單精準度格式與倍精準度格式·········································6
圖2.2 +0.75(10)轉換成IEEE 754 單精準度格式········································8
圖2.3 -0.75(10)轉換成IEEE 754 倍精準度格式·········································9
圖2.4 浮點除法運算組織架構··································································10
圖2.5 帶號數字加法器架構······································································13
圖2.6 數值置換架構··················································································16
圖2.7 牛頓-羅富森·····················································································22
圖2.8 Newton-Raphson 除法架構······························································24
圖2.9 三階Newton-Raphson 除法架構····················································25
圖2.10 Goldschmidt 除法架構···································································27
圖2.11 Peter Soderquist 提出的除法架構··················································28
圖2.12 存回式除法(Restoring Division)··············································33
圖2.13 存回式除法架構············································································34
圖2.14 非存回式除法(Nonrestoring Division)·····································37
圖2.15 非存回式除法架構········································································37
圖2.16 加入商數0 的非存回式除法························································40
圖2.17 SRT 除法························································································41
圖2.18 SRT 基本架構·················································································42
圖2.19 Radix-4 SRT 架構···········································································43
圖2.20 Radix-4 SRT 的P-D 圖··································································44
圖2.21 部份餘數的範圍表········································································44
圖2.22 部份餘數的重疊區域數值····························································44
圖2.23 Pentium 晶片的浮點運算錯誤散佈圖···········································46
圖2.24 Pentium 晶片的浮點運算錯誤曲面圖···········································46
圖2.25 Radix-4 NST MROR 的基本架構··················································50
圖2.26 Radix-4 Prescale 架構·····································································53
圖2.27 Radix-4 MROR 疊代單元架構·······················································56
圖3.1 部份餘數範圍圖··············································································60
圖3.2 SRT 除法選商圖···············································································61
圖3.3 SRT 除法的P-D 圖··········································································63
圖3.4 NST 除法理論樹狀圖······································································71
圖4.1 Radix-8 MROR NST 除法組織架構················································77
圖4.2 倍率常數範圍相關圖······································································83
圖4.3 倍率選取單元的架構圖··································································84
圖4.4 疊代單元架構··················································································89
圖4.5 rj
1a與rj
2a重置邏輯電路········································································91
圖5.1 ModelSim 模擬結果1 ······································································95
圖5.2 ModelSim 模擬結果2 ······································································96
圖5.3 邏輯閘層次模擬結果1 ···································································97
圖5.4 邏輯閘層次模擬結果2 ···································································98
表 目
表2.1 IEEE 754 標準浮點格式····································································6
表2.2 數值置換真值表··············································································16
表2.3 最後和產生電路真值表··································································17
表2.4 最後和產生電路真值表··································································18
表2.5 存回式除法範例··············································································34
表2.6 非存回式除法架構··········································································38
表2.7 Svoboda 演算法················································································48
表2.8 Svoboda 除法範例············································································48
表2.9 Radix-4 倍率常數表········································································52
表2.10 輸入數值與多工器輸出的真值表················································53
表2.11 Radix-4 MROR 變負單元的真值表···············································54
表3.1 各種基底查表電路所需要的儲存容量··········································62
表3.2 NST 除法理論δ值列表···································································72
表4.1 Radix-8 MROR 帶號數字二進制編碼表·········································77
表4.2 Radix-8 MROR di 數值表································································81
表4.3 Radix-8 MROR 倍率常數表····························································82
表4.4 倍率選取輸入與多工器的關係表··················································85
X
表4.5 變負單元真值表··············································································86
表4.6 rj
1a重置條件························································································90
表4.7 rj
2a重置條件························································································91
表4.8 補償單元選擇表··············································································92
表5.1 測試參數··························································································97
表5.2 單精確度的速度性能比較表··························································99
表5.3 倍精確度的速度性能比較表··························································99
表5.4 Gate Count 比較表··········································································100
表5.5 預估消耗功率比較表····································································100
參考文獻 參考文獻
[1] S.F. Oberman and M.J. Flynn, “Division Algorithms and Implementations,” IEEE Trans. on Computers, vol. 46, no. 8, Aug. 1997, pp. 833-854.
[2] S.F. Oberman and M.J. Flynn, “Minimizing the Complexity of SRT Tables,” IEEE Trans. on VLSI, vol. 6, no. 1, Mar. 1998, pp. 141-149.
[3] J.E. Robertson, “A New Class of Digital Division Methods,” IRE Trans. on Electronic Computers, vol. 7, Sep. 1958, pp. 218-222.
[4] T.D. Tocher, “Techniques of Multiplication and Division for Automatic Binary Computers,” Quarterly J. Mech. App. Math., vol. 2, pt. 3, 1958, pp. 364-384.
[5] A. Svoboda, “An Algorithm for Division,” Information Processing Machines, vol. 9, Mar. 1963, pp. 183-190.
[6] C. Tung, “A Division Algorithm for Signed-Digit Arithmetic,” IEEE Trans. on Computers, vol. 17, Sep. 1968, pp. 887-889.
[7] D.E. Atkins, “Higher-Radix Division Using Estimates of The Divisor and Partial Remainders,” IEEE Trans. on Computers, vol. 17, no. 10, Oct. 1968, pp. 933-937.
[8] M.D. Ercegovac, T. Lang, and P. Montuschi, “Very High Radix Division with Selection by Rounding and Prescaling,” Proc. 11th IEEE Symp. Computer Arithmetic, July 1993, pp. 112-119.
[9] A. Avizienis, “Signed Digit Number Representation for Fast Parallel Arithmetic,” IRE Trans. on Electronic Computers, vol. 10, Sep. 1961, pp. 389-400.
[10] S.F. Oberman and M.J. Flynn, “Design Issues in Division and Other Floating-Point Operations,” IEEE Trans. on Computers, vol. 46, no. 2, Feb. 1997, pp.154-161.
[11] L.A. Montalvo, K.K. Parhi, and A. Guyot, “New Svoboda-Tung Division,” IEEE Trans. on Computers, vol. 47, no. 9, Sep. 1998, pp. 1014-1020.
[12] H.R. Srinivas and K.K. Parhi, “A Fast Radix-4 Division Algorithm and Its Architecture,” IEEE Trans. on Computers, vol. 44, no. 6, June 1995, pp. 826-831.
[13] H.R. Srinivas, K.K. Parhi, and L.A, Montalvo, “Radix 2 Division with Over-Redundant Quotient Selection,” IEEE Trans. on Computers, vol. 46, no. 1, Jan. 1997, pp. 85-92.
[14] N. Burgess, “Prescaled Maximally-Redundant Radix-4 SRT Divider,” Electronics Letters, vol. 30, no. 23, Nov. 1994, pp. 1926-1928.
[15] M.D. Ercegovac and T. Lang, “Simple Radix-4 Division with Operands Scaling,” IEEE Trans. on Computers, vol. 39, no. 9, Sep. 1990, pp. 1204-1208.
[16] M.D. Ercegovac, T. Lang, and P. Montuschi, “Very-High Radix Division with Prescaling and Selection by Rounding,” IEEE Trans. on Computers, vol. 43, no. 8, Aug. 1994, pp. 909-918.
[17] “IEEE Standard for Binary Floating Point Arithmetic,” ANSI/IEEE Standard 754-1985, IEEE Computer Society, 1987.
[18] P. Montuschi and L. Ciminiera, “Over-Redundant Digit Sets and the Design of Digit-by-Digit Units,” IEEE Trans. on Computers, vol. 43, no. 3, Mar. 1994, pp. 269-279.
[19] P. Montuschi and L. Ciminiera, “Design of a Radix-4 Division Unit with Simple Selection Table,” IEEE Trans. on Computers, vol. 41, no. 12, Dec. 1992, pp. 1606-1611.
[20] S. Kuninobu, H.E.T. Nishiyama, T. Tanaguchi, and N. Takagi, “Design of High Speed MOS Multiplier and Divider Using Redundant Binary Representation,” Proc. 8th Symp. Computer Arithmetic, Como, Italy, 1987, pp. 80-86.
[21] M.D. Ercegovax and T. Lang, Division and Square Root. Norwell, Mass.: Kluwer Academic, 1994.
[22] N. Burgess, “A Fast Division Algorithm for VLSI,” in Proc. IEEE Int’l Conf. Computer Design: VLSI in Computers and Processors, Boston, Oct. 1991, pp. 560-563.
[23] J.-S. Chiang and M.-S. Tsai, “A Radix-4 New Svobota-Tung Divider with Constant Timing Complexity for Prescaling,” Kluwer Academic, vol. 33, no. 1-2, Jan. 2003, pp.117-124.
[24] M.C. Mekhallalati and M.K. Ibrahim, “New High Radix Maximally-Redundant Signed Digit Adder,” IEEE International Symp. on Circuit and System, vol. 1, June 1999, pp. 459-462.
[25] A. Nannarelli and T. Lang, “Low-Power Division: Comparison among Implementation of Radix 4, 8 and 16,” 14th IEEE Symp. on Computer Arithmetic, 1999, pp. 60.
[26] A. Nannarelli and T. Lang, “Low-Power Radix-8 Divider,” Proc. of International Conference on Computer Design, Oct. 1998, pp. 420-426.
[27] M.-S. Tsai, “The Design and Implementation of a High Speed Radix-4 Carry Free Division Architecture,” Master Thesis, Tamkang University, June, 2000.
[28] S.-C. Tsai, “Design of a Fast Radix-8 Divider with Operands Scaling,” Master Thesis, National Chung Hsing University, June, 1998.
[29] J.-R. Huag, “High Performance Radix-8 Divider,” Master Thesis, National Tsing Hua University, May, 1997.
[30] C.-H. Liu, “A Comparative Study of Short Word-Length Fixed-Point, Floating-Point, and LNS Arithmetic Units,” Master Thesis, Feng Chia University, June, 2005.
[31] M. Flynn, “On Division by Functional Iteration,” IEEE Transactions on Computers, Aug. 1970, pp. 702-706.
[32] D.L. Flower and J.E. Smith, “An Accurate High Speed Implementation of Division by Reciprocal Approximation,” Proc. 9th Symp. on Computer Arithmetic, 1989, pp. 60-67.
[33] Y.-L. Chen, “Design of a Fast Signed-Digit Divider for Floating-Point Arithmetic,” Master Thesis, Feng Chia University, June, 2000.
[34] C.L. Wey and C.P. Wang, “Design of A Fast Radix-4 SRT Divider and Its VLSI Implementation,” IEEE Proc. Comput. Digit. Tech., vol. 146, no. 4, July 1999, pp. 205-210.
[35] T. Coe, T. Mathisen, C. Moler and V. Pratt, “Computational Aspects of the Pentium Affair,” IEEE Computational Science and Engineering, Mar. 1995, pp. 18-31.
[36] T.-H. Pan, H.-S. Kay, Y. Chun and C.-L. Wey, “High-Radix SRT Division with Speculation of Quotient Digits,” IEEE International Conference on Computer Design (ICCD'95)., Oct. 1995, pp. 479-484.
[37] M.A. Thornton, “Signed Binary Addition Circuitry with Inherent Even Parity Outputs,” IEEE Trans. on Computers, vol. 46, no. 7, July 1997, pp. 811-816.
[38] B. Parhami, “Generalized Signed-Digit Number Systems: A Unifying Framework for Redundant Number Representations,” IEEE Trans. on Computers, vol. 39, no. 1, Jan. 1990, pp. 89-98.
[39] T. Carter and J. Robertson, “Radix-16 Signed-Digit Division,” IEEE Trans. on Computers, vol. 39, no. 12, Dec. 1990, pp. 1424-1433.
[40] S.F. Anderson, J.G. Earle, R.E. Goldschmidt, and D.M. Powers, “The IBM System/360 Model 91:Floating-Point Execution Unit,” IBM J. Research and Development, vol. 11, Jan. 1967, pp. 34-52.
[41] R.E Goldschmidt, “Applications of Division by Convergence,” Master Thesis, Dept. of Electrical Eng., Massachusetts Inst. Of Technology, June 1964.
[42] P. Soderquist and M. Leeser, “Area and Performance Tradeoffs in Floating-Point Divide and Square-Root Implementation,” ACM Computing Surveys, vol. 28, no. 3, 1996, pp. 518-564.
[43] T.-J. Kwon, J.-S. Moon, Jeff Sondeen and Jeff Draper, “A 0.18mm Implementation of a Floating-Point Unit for a Processing-In-Memory System,” IEEE International Symp. on Circuit and System, vol. 2, May 2004, pp. 453-456.
[44] X. Wang and B.E. Nelson, “Tradeoffs of Designing Floating-Point Division and Square Root on Virtex FPGAs,” Proc. 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, vol. 2, Apr. 2003, pp. 195-203.
[45] P. Montuschi and T. Lang, “Boosting Very-High Radix Division with Prescaling and Selection by Rounding,” IEEE Trans. on Computers, vol. 50, no. 1, Jan. 2001, pp. 13-27.
[46] J.-S. Moon, T.-J. Kwon, J. Sondeen and J. Draper, “An Area-Efficient Standard-Cell Floating-Point Unit Design for a Processing-In-Memory System,” Proc. of the Conference on European Solid-State Circuits, Sep. 2003, pp. 57-60.
[47] E. Rice and R. Hughey, “A New Iterative Structure for Hardware Division: The Parallel Paths Algorithm,” 16th IEEE Symposium on Computer Arithmetic (ARITH’03), June 2003, pp. 54-62.
[48] A.A. Liddicoat and M.J. Flynn, “High-Performance Floating Point Divide,” Euromicro Symposium on Digital Systems Design (DSD'01), Sept. 2001, pp. 354-361.
[49] E. Antelo, T. Lang, P. Montuschi and A. Nannarelli, “Digit-Recurrence Dividers with Reduced Logical Depth,” IEEE Transactions on Computers, vol. 54, no. 7, July 2005, pp. 837-851.
論文使用權限
  • 同意紙本無償授權給館內讀者為學術之目的重製使用,於2008-06-21公開。
  • 不同意授權瀏覽/列印電子全文服務。


  • 若您有任何疑問,請與我們聯絡!
    圖書館: 請來電 (02)2621-5656 轉 2281 或 來信