IJRIT International Journal of Research in Information Technology, Volume 2, Issue 4, April 2014, Pg: 468- 473

International Journal of Research in Information Technology (IJRIT) www.ijrit.com

ISSN 2001-5569

Design of Double Precision IEEE Floating Point Adder

K. Varun, M.Tech VLSI Design; E. Siva Kumar, M.Tech
Faculty of Engineering and Technology, Dept. of ECE, SRM University
Email: [email protected]; [email protected]

Abstract
Floating-point addition imposes a great challenge in the implementation of complex algorithms under hard real-time constraints because of the computational burden of repeated high-precision calculations. Moreover, at the hardware level, any basic addition or subtraction circuit has to incorporate alignment of the significands. This work presents a technique to implement a double precision IEEE floating-point adder that completes the operation within two clock cycles. The proposed technique reduces latency and improves chip-area utilization.

Keywords: Floating-Point Addition, IEEE 754, Delay Optimization.

I. Introduction

In recent years, computer applications have increased in computational complexity. The industry-wide use of performance benchmarks such as SPECmarks forces processor designers to pay particular attention to the implementation of the floating-point unit (FPU). Special-purpose applications, such as high-performance graphics rendering systems, have placed further demands on processors. High-speed floating-point hardware is a requirement to meet these increasing demands. This work examines the state of the art in FPU design and proposes techniques for improving the performance and the performance/area ratio of future FPUs.

A floating-point representation can simultaneously provide a large range of numbers and a high degree of precision. As a result, a portion of a modern microprocessor is often dedicated to floating-point hardware. Previously, silicon area constraints limited the complexity of the FPU. Advances in integrated-circuit fabrication technology have produced both smaller feature sizes and larger die areas, giving the processor designer a larger transistor budget. It has therefore become possible to implement more sophisticated arithmetic algorithms to achieve higher FPU performance.

Efficient use of the chip area and resources of an embedded system poses a great challenge when developing algorithms for hard real-time applications such as control systems, digital signal processing, and vision-based sensing. Although different algorithms demand different degrees of precision, floating-point operations are almost always employed in engineering and scientific applications for accurate and reliable computation.
However, addressing the problem of floating-point representation and the computational resources it requires at the software level may not yield an optimal and dependable solution. A hardware solution at the chip level is therefore most suitable: a dedicated digital circuit is made responsible for representing the floating-point numbers and for performing the arithmetic and logical operations demanded by the algorithms. Developing such a circuit is difficult, however, because of the high level of complexity involved. This paper presents a novel technique to implement a double precision IEEE floating-point adder that can complete the operation within two clock cycles. A number of works have been reported in the literature that aim at reduced-latency realizations of floating-point operations.

II. REPRESENTATION OF IEEE 754 DOUBLE PRECISION FLOATING-POINT NUMBERS

In accordance with IEEE 754-2008 there are half, single, double, and quadruple precision binary formats, with total widths of 16, 32, 64, and 128 bits respectively. Of these, the double precision format is the most widely used in binary applications. This representation is advantageous because a large spectrum of numbers can be expressed with a limited number of bits.

S - sign bit (1 bit)
E - exponent bits (11 bits)
F - significand bits (52 bits)

Example: decimal number = 18.75

Converting the digits before and after the decimal point to binary gives 10010.11, which normalizes to 1.001011 × 2^4. The exponent field holds a biased value; for double precision the bias is 1023 and 0 < E < 2047.

Actual exponent field: E = shifted places + 1023 = 4 + 1023 = 1027
1027 in binary = 10000000011
Remaining mantissa = 0010110000000000000000000000000000000000000000000000

III. General Algorithm

The standard algorithm for floating-point addition [7] is described as follows. Let the two input numbers, defined by their bit vectors, be (sa, ea, fa) and (sb, eb, fb), and let SOP denote the requested arithmetic operation (0 for addition, 1 for subtraction). The requested result is

rnd(sum) = rnd((-1)^sa · 2^ea · fa + (-1)^(SOP xor sb) · 2^eb · fb)

Let the effective sign of the operation be

S.EFF = sa xor sb xor SOP

For S.EFF = 0 the circuit performs an effective addition, and for S.EFF = 1 the arithmetic operation is effectively a subtraction. From the two operands and the exponent difference δ = ea - eb, the small operand is defined as (ss, es, fs) and the large operand as (sl, el, fl). The resulting sum is computed as

sum = (-1)^sl · 2^el · (fl + (-1)^S.EFF · fs · 2^(-|δ|))
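The field values derived in the worked example above can be checked against the actual binary64 encoding. A short sketch using Python's standard `struct` module:

```python
import struct

# Decompose the IEEE 754 double precision encoding of 18.75 and check
# the sign, exponent, and mantissa fields derived in the example above.
bits = struct.unpack('>Q', struct.pack('>d', 18.75))[0]

sign = bits >> 63                  # S: 1 bit
exponent = (bits >> 52) & 0x7FF    # E: 11 bits, biased by 1023
fraction = bits & ((1 << 52) - 1)  # F: 52 bits, implicit leading 1 dropped

assert sign == 0
assert exponent == 1027                # 4 + 1023
assert fraction == 0b001011 << 46      # 001011 followed by 46 zeros
print(hex(bits))                       # 0x4032c00000000000
```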


IV. Proposed Algorithm

The floating-point adder is pipelined in two stages and divided into two paths, the "R-path" and the "N-path"; the path is selected on the basis of the exponent difference. The new algorithm is arrived at by a few implementation changes to the standard algorithm and is broken into two pipeline stages, executed in two different clock cycles. The advantage of the pipelining mechanism is that, despite a higher input-to-output sequential latency, it offers high throughput by virtue of its assembly-line structure.
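The text does not spell out the R/N selection rule. Assuming it follows the classical two-path scheme of Seidel and Even [1], the near path handles effective subtraction with a small exponent difference, where the significands can cancel and a long left normalization shift is needed, while the far path handles every other case. A minimal sketch under that assumption:

```python
def select_path(s_eff, delta):
    """Choose the datapath for one operation (assumption: the R/N split
    follows the two-path scheme of [1]).

    s_eff: effective sign of operation (1 = effective subtraction)
    delta: exponent difference ea - eb
    """
    # Near path: effective subtraction with |delta| <= 1, where massive
    # cancellation may require a long left normalization shift.
    if s_eff == 1 and abs(delta) <= 1:
        return "N"
    # Far path: at most a one-position normalization shift is required.
    return "R"

print(select_path(1, 0), select_path(0, 5))   # N R
```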

Fig 2: Higher-level representation of the algorithm

First cycle:
• Normalization of the inputs.
• Determination of the effective sign of operation (S.EFF).
• Determination of the alignment shift amount, δ, or the MAG_MED signal.

Second cycle:
• Addition of the significands.
• Rounding of the result.
• Normalization of the result.

Detailed representation of the first cycle:


Fig 3: First clock cycle

Detailed representation of the second cycle:

Fig 4: Second clock cycle
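The two-cycle flow above (order and align the operands, add or subtract the significands, then round and normalize) can be sketched as a behavioral model. This is an illustrative model, not the paper's Verilog datapath: alignment is done exactly with Python big integers instead of guard/round/sticky bits, and only normal (non-zero, finite, non-subnormal) inputs and results are handled.

```python
import struct

def decode(x):
    """Split a normal double into (sign, biased exponent, 53-bit
    significand with the implicit leading 1 made explicit)."""
    bits = struct.unpack('>Q', struct.pack('>d', x))[0]
    s = bits >> 63
    e = (bits >> 52) & 0x7FF
    f = bits & ((1 << 52) - 1)
    assert 0 < e < 2047, "normal (non-zero, finite) inputs only in this sketch"
    return s, e, f | (1 << 52)

def encode(s, e, f):
    """Re-pack (sign, biased exponent, 53-bit significand) into a double."""
    assert 0 < e < 2047, "result must stay in the normal range"
    bits = (s << 63) | (e << 52) | (f & ((1 << 52) - 1))
    return struct.unpack('>d', struct.pack('>Q', bits))[0]

def fp_add(x, y):
    sa, ea, fa = decode(x)
    sb, eb, fb = decode(y)
    # First cycle: order the operands so (ea, fa) has the larger magnitude,
    # determine the effective sign, and align by the exponent difference.
    if (ea, fa) < (eb, fb):
        (sa, ea, fa), (sb, eb, fb) = (sb, eb, fb), (sa, ea, fa)
    d = ea - eb
    # Alignment is modeled exactly by scaling the large significand up;
    # the hardware instead shifts the small one right, keeping sticky bits.
    sig = (fa << d) + fb if sa == sb else (fa << d) - fb
    if sig == 0:
        return 0.0
    # Second cycle: normalize to 53 significant bits, round to nearest even.
    shift = sig.bit_length() - 53
    e = eb + shift
    if shift > 0:
        main = sig >> shift
        round_bit = (sig >> (shift - 1)) & 1
        sticky = sig & ((1 << (shift - 1)) - 1)
        if round_bit and (sticky or (main & 1)):
            main += 1
            if main == 1 << 53:   # rounding carried out of the top bit
                main >>= 1
                e += 1
    else:
        main = sig << -shift      # cancellation: left normalization shift
    return encode(sa, e, main)

print(fp_add(18.75, 5.25))   # 24.0
```

Because the model computes the exact sum before a single round-to-nearest-even step, it reproduces IEEE 754 results for normal operands, which makes it convenient as a reference when checking the Verilog design in simulation.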


V. SIMULATION RESULTS

For simulation of this circuit we have used ModelSim-Altera 6.6d. The top-level source is of Hardware Description Language type, written in Verilog, and synthesis of the design is carried out in Altera Quartus II. Figure 5.a shows the simulation result from ModelSim, and Figure 5.b shows the synthesis report from Altera Quartus II.

Fig 5.a: Simulation result in ModelSim

Fig 5.b: Synthesis report in Altera Quartus II

VI. CONCLUSIONS

This paper has demonstrated an implementation of a high-speed IEEE 754 double precision floating-point adder with a significant decrease in latency, achieving a latency of 4.6 ns. This shows that FPGA-based embedded systems can benefit from the reduced computational overhead of such a design. An implementation of this algorithm on the more recent Xilinx Virtex-6 FPGA is expected to yield further improvement.

VII. REFERENCES

[1] Peter-Michael Seidel, Guy Even, "Delay-Optimized Implementation of IEEE Floating-Point Addition", IEEE Trans. on Computers, vol. 53, no. 2, pp. 97-113, Feb. 2004.

[2] Karan Gumber, Sharmelee Thangjam, "Performance Analysis of Floating Point Adder using VHDL on Reconfigurable Hardware", International Journal of Computer Applications, vol. 46, no. 9, pp. 1-5, May 2012.

[3] "An FPGA Implementation of a Fully Verified Double Precision IEEE Floating-Point Adder", Proc. of IEEE International Conference on Application-specific Systems, Architectures and Processors, pp. 83-88, 9-11 July 2007.

[4] A. Tyagi, "A Reduced-Area Scheme for Carry-Select Adders", IEEE Trans. on Computers, vol. 42, no. 10, pp. 1163-1170, Oct. 1993.

[5] S. Oberman, H. Al-Twaijry, and M. Flynn, "The SNAP Project: Design of Floating Point Arithmetic Units", Proc. of 13th IEEE Symposium on Computer Arithmetic, pp. 156-165, 1997.

[6] P. Farmwald, "On the Design of High Performance Digital Arithmetic Units", PhD thesis, Stanford Univ., Aug. 1981.

[7] Manish Kumar Jaiswal and Ray C.C. Cheung, "High Performance FPGA Implementation of Double Precision Floating Point Adder/Subtractor", International Journal of Hybrid Information Technology, vol. 4, no. 4, October 2011.

[8] A. Nielsen, D. Matula, C.N. Lyu, G. Even, "IEEE Compliant Floating-Point Adder that Conforms with the Pipelined Packet-Forwarding Paradigm", IEEE Trans. on Computers, vol. 49, no. 1, pp. 33-47, Jan. 2000.

[9] N. Quach, N. Takagi, and M. Flynn, "On Fast IEEE Rounding", Technical Report CSL-TR-91-459, Stanford Univ., Jan. 1991.

[10] P.-M. Seidel, "On the Design of IEEE Compliant Floating-Point Units and their Quantitative Analysis", PhD thesis, Univ. of Saarland, Germany, Dec. 1999.

[11] Somsubhra Ghosh, Prarthana Bhattacharya and Arka Dutta, "FPGA Based Implementation of a Double Precision IEEE Floating-Point Adder", 7th International Conference on Intelligent Systems and Control (ISCO), 2013.

