Emerging Description Language Capabilities Matching Arithmetic Hardware Trends
Alex Zamfirescu
Alternative System Concepts, Palo Alto, California, USA
[email protected]

Abstract— Freeing designers from the constraint that languages meant to be used for software development and those used to describe and design hardware should be similar with respect to their arithmetic capabilities, this paper attempts to motivate mixing dynamically changeable formats in high-level description and design languages. A perspective on widely used computer numeric formats precedes the presentation of a few important arithmetic design trends that benefit from intermixing of the formats and mandate innovative dynamic format change capabilities. A map from desirable capabilities into HDL requirements is constructed. Two encouraging steps in modernizing HDLs are then exemplified.

I. INTRODUCTION

One of the first options faced by the designer of a digital system including arithmetic computations is whether that system should be implemented in hardware or software. For many years the software implementation was more attractive, mainly due to the rapid progress of compiler technology, the mass production of microprocessor devices, and the availability of a good number of software engineers. Demand for increased DSP computational throughput and low-power considerations led to a gradual migration from general-purpose processors to specialized programmable DSP devices, with the alternative to a microprocessor implementation being custom digital hardware. The hardware approach improves speed and reduces the power consumed, but still suffers from the lack of mature high-level design tools. Development of expensive tools becomes more attractive when there is a better understanding of the problems they address, but the promise of fostering innovation and avoiding building commodities is preserved.
This motivated the writing of this paper, which initially attempted the following:
• Identify major trends in the design of custom hardware for arithmetic algorithm implementation;
• Map those trends into desired capabilities for the methodologies based on popular hardware description languages (HDLs);
• Extract requirements for the HDL extensions;
• Discuss the first known steps in the right direction, made by standards organizations or private efforts.
Additionally, an attempt was made to clarify why the arithmetic support mandated for classic programming languages, and that required for hardware description

languages (HDLs) supporting system design and computer arithmetic optimization, is and should remain different. The first kind of support is driven by the need to hide the hardware (platform) details from the language user, ensure portability, etc., while an HDL is best when it enables convenient control and refinement of the hardware details. On the other hand, poor HDL design equates the arithmetic requirements of software development languages to those of languages describing hardware that performs arithmetic computations. While this was less visible in the past, the massive recent progress in hardware capabilities, the trend to start the design above RTL, and the pressure to optimise designs early for performance (including power [9]) make a clear understanding of the difference, and of its manifestation in modern HDLs, mandatory. Adding to that is the need to conveniently describe, and many times refine, re-configurations of arithmetic computations, graceful degradation, and redundancy in computations, all enabled by the dramatic recent progress in re-configurable devices [10]-[11]. Early concerns were only about integer numeric types for synthesis [4]. Those were followed by the need to address floating- and fixed-point types [1]. The possibility of run-time dynamic HDL fixed- and floating-point types, based on unified (across HDL languages) type descriptors, was flagged in a report in early 2006 [5]. A study of HDL arithmetic capabilities as they appeared in VHDL, Verilog, and ELLA was published in [2]. Today there are good ISO/IEC standards that describe the software-driven language requirements for arithmetic support [6]-[8]. An implementation of variable-precision hardware building blocks described in HDL appears in [12]. The structure of this paper is as follows.
A perspective on the most often used computer numeric values, and some widely impacting basic properties (Section II), is followed by the presentation of custom hardware arithmetic design trends (Section III) and their map into HDL requirements (Section IV). Section V covers, using examples, a few encouraging steps already taken. Conclusions and some ramifications are summarized in Section VI. Mathematical notations are only used to avoid lengthy repetitions.

II. OVERVIEW OF COMPUTER NUMERIC VALUES

Digital computer arithmetic deals with manipulating numeric values encoded into strings of bits and belonging to subsets of Z or R, i.e. integer or real values. Of particular

interest are integer unsigned values, integer signed values, fixed-point values, and floating-point values. All numeric values are in fact real¹ values, and the boundaries of the subsets are determined by the length of the string of bits used for encoding, and by the representation scheme. Using a bit string of length n, at most 2^n different values can be encoded, but the values may be either uniformly (equidistantly) or non-uniformly distributed. Throughout the paper, notations close to those used in the ISO/IEC standards will be used. Five families of sets are first introduced and summarized in Table I. Then a few interesting properties are enumerated.

A. Families of Sets Containing Numeric Computer Values

Given a positive integer n, the sets N_n and Z_n will contain all unsigned and signed values, respectively, which can be encoded with n bits using positional binary and two's complement encoding, respectively. Given p ∈ Z, Q_{n,p} will denote the set of all fixed-point numbers which can be represented using a bit string of size n, where parameter p represents the displacement of the binary point from the LSB. When sign-magnitude encoding is used, one bit encodes the sign, while with two's complement the conventions are the same as for Z_n, and the value is multiplied by 2^(-p). When two's complement representation is used for the fixed-point values we will denote their set by Q^{2's}_{n,p}. Finally, the set F_{m,e} contains all floating-point number values which can be encoded using the binary representation mandated by IEEE Std. 754r, 2008 [15] conventions, where m is the number of mantissa bits (significand field size), and e is the number of exponent bits.

N_n, Z_n, Q_{n,p}, Q^{2's}_{n,p}, and F^b_{m,e} form families indexed by positive integers (n, m, e), integer p, and the set {2, 10} for b. The union of each such family will be denoted without indexes as N, Z, Q, Q^{2's}, F^b, or F^[b]. Let the union of all these sets be V. When the size of the bit string available (or used) to encode the V elements is limited to n, the set will be referred to as V_n. Note that it could be shown that V includes Q. However, V_n is just a subset of the rational numbers.

¹ These values are restricted to be only rational numbers {p/q : p, q ∈ Z} for which q is a power of the base, but to simplify notation we will refer to those rational numbers as just real numbers, especially when the fraction form is not relevant.

TABLE I
COMPUTER ARITHMETIC SETS OF REAL NUMBERS

Set Name: Unsigned
  Set family: N_n
  Bit encoding: b_{n-1}, b_{n-2}, ... b_1, b_0
  Value: sum_{i=0}^{n-1} b_i 2^i

Set Name: Signed (two's complement)
  Set family: Z_n
  Bit encoding: b_{n-1}, b_{n-2}, ... b_1, b_0
  Value: -b_{n-1} 2^{n-1} + sum_{i=0}^{n-2} b_i 2^i

Set Name: Sign-magnitude fixed-point
  Set family: Q_{n,p}
  Bit encoding: b_{n-1}, b_{n-2}, ... b_1, b_0
  Value: (-1)^{b_{n-1}} (sum_{i=0}^{n-2} b_i 2^i) / 2^p

Set Name: Two's complement fixed-point
  Set family: Q^{2's}_{n,p}
  Bit encoding: b_{n-1}, b_{n-2}, ... b_1, b_0
  Value: (-b_{n-1} 2^{n-1} + sum_{i=0}^{n-2} b_i 2^i) / 2^p

Set Name: Base b floating-point, including special conditions
  Set family: F^b_{m,e}, b ∈ {2, 10}
  Bit encoding: sign s, exponent field exp (e bits), significand field f (m bits)
  Value: (-1)^s {1,0}.f b^exp
  Details: F_{m,e} ∪ {-∞, -0, +∞, NaN}

B. Computer Numeric Value Set Properties

There are a few obvious properties involving the sets just defined. The most important ones are categorised and listed below.

1) Same Family Set Inclusions:

N_n ⊂ N_{n+1}
Z_n ⊂ Z_{n+1}
Q_{n,p} ⊂ Q_{n+1,p+1}
F^b_{m,e} ⊂ F^b_{m+1,e}
F^b_{m,e} ⊂ F^b_{m,e+1}

2) Mixed Family Relations:

N_n ⊂ Z_{n+1}
Z_n = Q_{n,0}
0 belongs to every one of the families N, Z, Q, Q^{2's}, and F^[b]
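The value formulas in Table I can be checked numerically with a short Python sketch (helper names are illustrative; bit strings are listed MSB first):

```python
from fractions import Fraction

def unsigned_value(bits):
    # N_n: value = sum_{i=0}^{n-1} b_i 2^i  (bits[0] is b_{n-1})
    return sum(b << i for i, b in enumerate(reversed(bits)))

def signed_value(bits):
    # Z_n: two's complement, value = -b_{n-1} 2^{n-1} + sum_{i=0}^{n-2} b_i 2^i
    return unsigned_value(bits) - (bits[0] << len(bits))

def fixed_value(bits, p):
    # Q^{2's}_{n,p}: the two's-complement value scaled by 2^{-p}
    return Fraction(signed_value(bits), 1 << p)

print(unsigned_value([1, 0, 1]))   # 5
print(signed_value([1, 0, 1]))     # -3
print(fixed_value([1, 0, 1], 2))   # -3/4
```

Note that `fixed_value(bits, 0) == signed_value(bits)`, which is exactly the mixed-family relation Z_n = Q_{n,0} listed above.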

3) Fixed-Point Sets Including Floating-Point Sets: The floating-point formats provide a much wider dynamic range than fixed-point formats. Therefore, it is not expected that for comparable sizes any set from the F family would be included in one from the Q family. However, it is interesting to know what are the smallest numbers n and p for which F^2_{m,e} ⊂ Q_{n,p}, and whether they always exist. The following proposition gives the answer.

Proposition 1: For any pair (m, e) of positive integers, there is a pair (n, p) of integers such that F^2_{m,e} ⊂ Q_{n,p}.

One simple proof is to find the smallest values satisfying the inclusion relation. Those are:

n_min = 1 + e_max + abs(e_min) + m
p_min = abs(e_min) + m

where e_max and e_min are the maximum and the minimum values of the exponent of F^2_{m,e}.

Example 1: Given F^2_{48,9}, we find that e_max = 255 and e_min = -254; then n_min = 1 + 255 + 254 + 48 = 558 and p_min = 255 + 48 = 303. Therefore, F^2_{48,9} ⊂ Q_{558,303}.
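The existence claim of Proposition 1 can also be checked by brute force on a toy format. The sketch below assumes IEEE-style conventions (hidden bit, subnormals, top exponent code reserved), which affect the exact minimal sizes; it enumerates every finite non-negative value of a small binary format and derives one (n, p) pair whose Q_{n,p} contains them all:

```python
# Brute-force illustration of Proposition 1 on a toy binary format
# (3 significand bits, 3 exponent bits; conventions assumed, see lead-in).
from fractions import Fraction

M, E = 3, 3                                    # significand / exponent widths
bias = (1 << (E - 1)) - 1
values = []
for biased in range(0, (1 << E) - 1):          # all-ones exponent = inf/NaN
    for f in range(1 << M):
        if biased == 0:                        # subnormals: 0.f * 2^(1-bias)
            v = Fraction(f, 1 << M) * Fraction(2) ** (1 - bias)
        else:                                  # normals: 1.f * 2^(biased-bias)
            v = Fraction((1 << M) + f, 1 << M) * Fraction(2) ** (biased - bias)
        values.append(v)

# p: enough fractional bits that every value * 2^p is an integer
p = max((v.denominator - 1).bit_length() for v in values)
# n: enough total bits (sign included) to hold every scaled magnitude
n = 1 + max(int(v * 2 ** p).bit_length() for v in values)
assert all((v * 2 ** p).denominator == 1 for v in values)
print(n, p)    # (10, 5) for this toy format
```

Every finite value of the format is thus exactly representable in Q_{10,5}, demonstrating the containment that the proposition guarantees.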

4) Arithmetic Operations Closure: Defining arithmetic operations º ∈ {+, -, *, /} for sets S with S ∈ {N_n, Z_n, Q_{n,p}, Q^{2's}_{n,p}, F^b_{m,e}, F^[b]_{m,e}} requires either that the result be considered in another set from the same family, or the reduction of the domain to less than S^2, or the use of special extra artificial elements. For example, the + operator can be evaluated for all pairs from N_n^2 only if the result is considered in N_{n+1}. This makes the correct definition for the addition in N_n be something like

+ : N_n^2 → N_{n+1}.

The set N_{n+1} - (+(N_n^2))² is in general mapped back into N_n using either a convenient correspondence, or the mapping is left to be a consequence of the hardware implementation. What was exemplified here for N_n similarly occurs for all arithmetic operations and all sets S. There are three cases to be considered for º : S^2 → S:

1. The result belongs to S.
2. The result is between two elements of S. In this case a rounding scheme has to be specified³, which designates the choice of one element to be the result of that specific application of the operation.
3. The result is not between two elements of S. Special rules are specified about what the result is. This case deals with pairs for which the operation is not defined, or for which the result is outside the range of S because S does not contain ±∞.

² The '-' sign represents here the set difference, while '+' denotes the name of the addition operator (as a function).
³ Like the classic and well-known ones: to nearest, towards infinity, truncate, etc. The particular rounding scheme is not relevant for this discussion.

The above three rules represent the semantics when the families (as categorized in Table I) are not mixed, and the best effort is made to provide a result in the set both operands belong to. While this could be a desideratum for some fixed-type programming languages, simple examples show that with hardware performing computer arithmetic this is not always the case, the result frequently falling outside of the initial operand set. Indeed, starting with the simple extra carry bit for addition, one can continue to imagine how more complicated operations on the significand field of floating-point numbers place a partial result into a fixed-point register (before normalization). Section III below identifies more cases where inter-mixing the sets is highly desirable.

Note: Such cross-family operators become much simpler when executed directly by the hardware, compared to emulating/modelling them from within current software-driven programming languages. The emulation task is not impossible, but it is tedious, and that is why modern hardware description languages are moving towards direct specification of cross-family operators. This is something needed for the progress of arithmetic computation, and maybe its evolution from the state where "software is slow-dancing in the rhythm of hardware" to the more desirable state where "hardware generation is driven by the requirements enabling progress in computer arithmetic."
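The closure discussion for + : N_n^2 → N_{n+1} can be sketched in a few lines; wraparound and saturation stand in for the "convenient correspondence" that maps the wide result back into N_n (function names are illustrative):

```python
# n-bit unsigned addition always fits in n+1 bits; mapping back into N_n
# requires choosing a correspondence (wraparound or saturation are the two
# common hardware choices).
def add_exact(a, b, n):
    assert 0 <= a < 2 ** n and 0 <= b < 2 ** n
    return a + b                       # exact result, an element of N_{n+1}

def wraparound(x, n):
    return x % (2 ** n)                # drop the carry bit

def saturate(x, n):
    return min(x, 2 ** n - 1)          # clamp to the largest N_n element

s = add_exact(200, 100, 8)             # 300: in N_9 but not in N_8
print(s, wraparound(s, 8), saturate(s, 8))   # 300 44 255
```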

5) Generalized Rounding: One key new ingredient for this is the capability to work with operations º ∈ {+, -, *, /} defined as S × T → U, where the three sets S, T, U could be different, either from the same family but having other precision settings (sizes of the fields), or from different families. An obvious requirement, for commutative operators, is that for any a and b belonging to both S and T, a º b = b º a. When the exact result is not in U, its rounding should occur based on the same three rules specified in the previous paragraph, as they apply to set U. This generalized rounding should also be used as a basis for the (implicit or explicit) conversion between elements from the different sets. That way any particular real number could be "landed" in any particular set, meaning that any number from V can be stored (exactly, rounded, or "re-routed" as in case 3 above) into any format. Note that the association of a particular rounding scheme could be done at the lowest granularity, that is, the operation level.

III. MAJOR TRENDS
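The "landing" of an arbitrary real value into a target fixed-point set, with the rounding scheme selectable per operation, can be sketched as follows (the mode names and the saturating out-of-range rule are illustrative choices, not taken from any HDL standard):

```python
from fractions import Fraction
import math

def land(x, n, p, mode="nearest"):
    # Land a rational x into Q^{2's}_{n,p} under a per-operation rounding mode.
    scaled = Fraction(x) * 2 ** p
    if mode == "nearest":
        q = math.floor(scaled + Fraction(1, 2))
    elif mode == "toward_zero":
        q = math.trunc(scaled)
    elif mode == "toward_minus_inf":
        q = math.floor(scaled)
    else:                                  # "toward_plus_inf"
        q = math.ceil(scaled)
    lo, hi = -(2 ** (n - 1)), 2 ** (n - 1) - 1
    q = max(lo, min(hi, q))                # case 3: saturate out-of-range
    return Fraction(q, 2 ** p)

x = Fraction(1, 3)
print(land(x, 8, 4))                       # 5/16
print(land(x, 8, 4, "toward_plus_inf"))    # 3/8
```

The same helper doubles as the implicit/explicit conversion between sets: landing with p' ≠ p or a different n re-formats the value under the chosen rounding rule.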

The general overview of unsigned values, integer signed values, fixed-point values and floating-point values presented in the previous section suggested that mixing their sets in hardware operations is mathematically sound, feasible, and natural for hardware. This section attempts to present compelling examples, based on computing trends, of why a language describing hardware should enable that mixing too. A second major direction appears to be the capability to change formats dynamically. First, there are the changing (morphing) hardware modelling requirements; second, there is the capability to evolve designs using automatic algorithms in order to explore the design space during simulation; and third, there are improvements towards efficiency in current DSP design flows (like those involving float to fixed-point migration). Therefore, the need for the capability to dynamically change formats is well motivated.

A. Exact Dot Product Hardware Solutions

The computation of dot products like

x · y = sum_{i=0}^{n} x_i · y_i

plays an essential role in DSP, scientific computations, and in verification numerics.

Fig. 1 Accurately computing the dot product.
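The long-accumulator scheme of Fig. 1 can be emulated in software. The Python sketch below (widths and helper names are illustrative) uses an arbitrary-precision integer as the wide fixed-point register, so products are accumulated exactly and only the final read-out rounds:

```python
from fractions import Fraction

def exact_dot(xs, ys, p=600):
    acc = 0                               # the long fixed-point accumulator
    for x, y in zip(xs, ys):
        prod = Fraction(x) * Fraction(y)  # exact double-width product
        scaled = prod * 2 ** p
        assert scaled.denominator == 1    # p chosen wide enough for the data
        acc += int(scaled)                # exact fixed-point addition
    return float(Fraction(acc, 2 ** p))  # one rounding at the very end

xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
print(sum(x * y for x, y in zip(xs, ys)))   # naive float sum loses the 1.0
print(exact_dot(xs, ys))                    # 1.0, the exact result
```

The naive float sum absorbs the small product into the large one before the cancellation, while the wide accumulator preserves every contribution until the single final rounding.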

Most DSP processors come with a special multiply-accumulate instruction to help accelerate the frequent computation of the dot product. Many scientific computations involving Hilbert spaces, norms, or matrix-based optimizations also spend most of their time computing dot products. It is surprising that current general-purpose computers do not yet have a built-in mode to help avoid the errors that often occur when large (x_i·y_i) product pairs cancel after the contribution of a smaller-value product was truncated. A powerful algorithm for this task is based on a long accumulator [3]. The accumulator (see Fig. 1) is a fixed-point accumulator which satisfies Proposition 1 (i.e. it is wide enough to fit any double-size floating-point accurate product). A few extra guard bits are added to avoid overflow after the

fixed-point addition. At the end of the accumulation phase the fixed-point accumulator is guaranteed to contain the exact dot product. If normalization occurs and the fixed-point accumulator content is read into a float, some rounding may occur, but the floating-point result is still guaranteed to be the most accurate possible given the result format restriction.

Example 2: Example 1 from Section II computed that F^2_{48,9} ⊂ Q_{558,303}. That means that the dot product of vectors from the space (F^2_{24,8})^N can be accurately computed using a fixed-point accumulator register Q_{558+guard,303}, if N < 2^guard, and one or more multipliers * : F^2_{24,8} × F^2_{24,8} → F^2_{48,9}, possibly working in parallel.

During the accumulation phase some long carry chains may occur. One technique is to use hardware "all one" flags and efficiently jump the carry. The experimentation with this technique and/or other such optimizations has to be specified in the same HDL where the hardware simulation is performed. By being able to mix the sets, the designer and experimenter would be able to concentrate on just increasing performance.

B. Predictable Accuracy for Computed Functions

When the precision of the numeric value is fixed, it is relatively easy to provide accurate function results, because the optimization is done once only, on known sizes. The support for variable precision brings the challenge of providing predictable accuracy for the computed functions. This can be done by selecting algorithms that are known to converge to a specified accuracy in a known number of steps, like CORDIC, or by estimating for each result an upper bound and a lower bound.

C. Support for Interval Arithmetic

A strong trend towards accurate arithmetic computations is building around interval arithmetic. It involves fully bounded intervals of real numbers instead of single values. Remarkably, intervals of real numbers can be defined precisely with the sparse sets of computer arithmetic values. Statements about numbers that are not representable can be made based on arithmetic executed on the intervals containing them, which are bounded by representable numbers. The semantics of an operation on intervals is that the result interval contains all the possible results for one operand from the first interval and the second operand from the second interval.⁴ However, interval arithmetic has much wider ramifications in scientific computing, verification of crisp (one value) arithmetic, and proof of critical systems.
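The containment semantics just described can be sketched with directed (outward) rounding onto a fixed-point grid; the grid width and helper names below are illustrative:

```python
from fractions import Fraction
import math

P = 8                                   # fractional bits of the toy grid

def down(x):                            # round toward -inf onto the grid
    return Fraction(math.floor(Fraction(x) * 2 ** P), 2 ** P)

def up(x):                              # round toward +inf onto the grid
    return Fraction(math.ceil(Fraction(x) * 2 ** P), 2 ** P)

def imul(a, b):
    # Result interval contains x*y for every x in a and y in b: take the
    # extreme endpoint products, rounding the lower bound down and the
    # upper bound up.
    prods = [x * y for x in a for y in b]
    return (down(min(prods)), up(max(prods)))

a = (Fraction(1, 3), Fraction(1, 2))
b = (Fraction(-2), Fraction(3))
lo, hi = imul(a, b)
assert lo <= min(x * y for x in a for y in b)
assert hi >= max(x * y for x in a for y in b)
print(lo, hi)                           # -1 3/2
```

The per-bound rounding direction is exactly the operation-level rounding control that the next paragraph argues HDLs should expose.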
Each interval operation ends up being split into a set of cases, and in each case specific operations with rounding up or rounding down are performed. The support for convenient rounding, specifiable at the level of the operation, is therefore a requirement which could enable the implementation of efficient interval arithmetic machines. The usual exceptions of floating-point arithmetic like underflow, overflow, division by zero, or invalid

⁴ EDA engineers are familiar with such techniques from the timing calculators.

operation do not occur in interval arithmetic. While this arithmetic is not applicable to all problems, it is becoming more important, and a request to start a study group for a standard dealing with interval arithmetic has already been submitted to the IEEE. Modern computer arithmetic has also extended the precision of interval arithmetic by considering intervals of the form

x = sum_{i=1}^{n} x_i + (ε_1, ε_2)

where x_i, ε_1, ε_2 ∈ F^2_{m,e}, x (bold x) is an interval with boundaries ε_1 + sum_{i=1}^{n} x_i and ε_2 + sum_{i=1}^{n} x_i, and the significands of all the x_i, ε_1, and ε_2 are non-overlapping if accumulated into an appropriate element from Q_{n,p}. Computations with such intervals involve mixing floating-point and fixed-point accumulators. An interesting generalization of the normalization operation would bring back the float representation of the sum from a long accumulator. This operation would look for the leading one, extract a mantissa of size m, and build a corresponding floating-point value. A useful HDL procedure would take an element a of Q_{n,p} and return the first floating-point value x and a - x. The sizes of x could either be passed in or inferred directly from the sizes of a, using again Proposition 1. Note that the operation a - x is a cross-family operation⁵. With the availability of mixed operations, the best choice for the hardware implementation of this kind of precise computation could again be well optimised, to take best advantage of the available hardware resources.

⁵ The operation is just a reset of a particular field. Another mode would just return an integer pointing to the last visited bit.

D. Migration between Floating and Fixed-Point Solutions

We can identify two phases in ESL design. In the first phase the designer's concern is only with the numeric values, with the verification of the concept, and with the creation of a golden model. In the second phase the focus is on the register sizes and encoding, and tradeoffs between accuracy (quantisation errors, overflows, etc.) and performance, area, and power are performed. Current HDLs do not support this two-phase approach to ESL design; users are forced to jump between software programs designed for scientific computations and HDLs, or between custom C and HDLs. A particularly interesting phase, the transition of the design from floating-point to fixed-point format, and more recently the decision about which format to use to continue the implementation, should be feasible just by changing the descriptors of the types within a single HDL, and not by forcing the transition of the whole design into another environment. Many times designers like to keep parts of a non-transitioned design running in one format and use the live stimulus to exercise the changed part. The recognised trend is to use more floating-point units, and that adds to the decision

complexity.

E. Design Exploration of the Multiple Word Length Paradigm

In DSP, when fixed point is used, there is a need to optimise the sizes of the registers, to avoid using larger-than-needed registers on one hand and to avoid unpleasant quantisation errors on the other. While this is done easily for linear systems, by using analytical approaches and some heuristics, the non-linear case (much more often encountered) has to be solved using simulation as well. Current simulations start with fixed-size registers. That is why the optimization loop includes re-elaboration of the simulator and, in many cases, a designer intervention in the loop. The automatic, tight loop is feasible only if changes to formats are made at run time.

F. Synthesis with Low Power Constraints

During low-power behavioural synthesis optimization, changes are made to a computation graph to explore the design space. Those changes include operand size changes, value encoding, and operator binding. In order to cover a large design space, decisions for the next best move are taken based on an estimation of a cost function which takes into consideration area, timing, and dynamic and static power. To determine the timing and the dynamic power, the activity is simulated and statistics are extracted. The time to execute classic re-elaboration of static objects for each small incremental change or exploratory move is prohibitive. A good solution is to provide functionality in the HDL which will allow changes of the object characteristics during the same elaborated simulation.

G. Base 10 Arithmetic

The recent IEEE Std. 754R [14] includes support for decimal floating point. With the known benefits of decimal arithmetic out of the question, the trend is that more devices supporting decimal arithmetic will be built. HDLs used to design and verify such systems should be the first to provide decimal arithmetic capabilities.
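The run-time exploration loop of trend E can be sketched in a few lines: sweep the fractional width p of a fixed-point format, measure the quantisation error of a small dot product against a full-precision reference, and keep the smallest p that meets a tolerance (the taps, samples, and tolerance below are illustrative):

```python
def quantize(x, p):
    # Round x onto the fixed-point grid with p fractional bits.
    return round(x * 2 ** p) / 2 ** p

coeffs = [0.101, -0.273, 0.512, -0.273, 0.101]   # toy filter taps
samples = [0.3, -0.7, 0.9, 0.1, -0.4]
reference = sum(c * s for c, s in zip(coeffs, samples))

tol = 1e-3
best = None
for p in range(2, 20):                           # dynamic format change
    y = sum(quantize(c, p) * quantize(s, p) for c, s in zip(coeffs, samples))
    if abs(y - reference) <= tol:
        best = p                                  # smallest adequate width
        break
print(best)
```

In an HDL with run-time format descriptors this whole sweep runs inside one elaborated simulation; without that capability, each iteration of the loop forces a re-elaboration.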
IV. REQUIREMENTS FOR HARDWARE DESCRIPTION LANGUAGES

To each trend discussed in Section III, a set of desired HDL capabilities was associated. The rationale was that if the capability were provided in the HDL, more progress would be achieved in the direction of the trend. This is essential given the need to fulfil the commercial requirement for the viability of the HDLs and the tools based on them. A clear requirement was then established for each capability. The summary of this exercise is provided in Table III.

TABLE III
ESSENTIAL HDL ARITHMETIC REQUIREMENTS
Modern Design Trends Reflected into HDL Requirements

A. Design Trend: Exact Dot Product Computation in Hardware
   HDL Desired Capability: Multiply pairs of floating-point numbers into a double-sized float, accumulate into a large fixed-point register, and use just one final rounding.
   Requirement: Mixing different sizes and different kinds (fixed or floating point) of objects in operations and in assignments, while observing the numeric value semantics.

B. Design Trend: Predictable Functions
   HDL Desired Capability: Predictable accuracy for all precision ranges.
   Requirement: Provide HDL predictable and accurate solutions for all common algebraic and transcendental functions (i.e. sqrt, sin, cos, tan, log, sh, ch, etc.).

C. Design Trend: Implement Interval Arithmetic Processors
   HDL Desired Capability: Operator-driven rounding mode.
   Requirement: Run-time easy change of rounding mode (specified in a procedure call, or flagged by a special operator symbol in the HDL).

D. Design Trend: Float-Fix Migration for DSP Design
   HDL Desired Capability: Manageable change between floating and fixed-point representations.
   Requirement: Provide a capability (i.e. a descriptor) to change numerical representations without any code re-write.

E. Design Trend: Multiple Word Length Exploration
   HDL Desired Capability: Run-time tight loops including simulation and format size changes driven by automatic measurements.
   Requirement: Support dynamic (run-time) format size changes.

F. Design Trend: Power Constrained Synthesis
   HDL Desired Capability: Run-time numerical representation changes during design convergence towards minimal power.
   Requirement: Support dynamic (run-time) representation changes.

G. Design Trend: Decimal Arithmetic
   HDL Desired Capability: Standard decimal floating point with accurate verification mechanisms.
   Requirement: Support decimal floating-point arithmetic, and customisable choices.

V. STEPS IN THE RIGHT DIRECTION

A few good steps to enhance the HDL capabilities to handle variable-precision types, and to provide for dynamic, run-time changes of the type descriptor, were taken by both standards organizations and the private sector. This section briefly discusses the new enhancements of the VHDL language IEEE P1076, in ballot at the time of this writing, and will use the result computed in Example 1, Section II, to present how the computation of the dot product in the space with 100 dimensions, (F^2_{24,8})^100, using an accumulator from Q_{558,303}, is described and simulated in Fintronic FinSimMath, which extends a popular HDL, the Verilog® HDL.

A. Enhancements to VHDL

There is good news for the VHDL supporters doing arithmetic with fixed-point and floating-point types. It appears that the proposed new draft contains capabilities to declare types representing values from F^2_{m,e} and from both Q_{n,p} and Q^{2's}_{n,p}. The sizes must be provided before elaboration, and cannot be changed during simulation. Conversion is explicit (it always requires a function), and there is a good set of supported functions. Of the requirements shown in Table III, A is feasible only with explicit conversions; B is supported, but the burden of bringing the result close to the desired value is left to the user, who has to specify (by intelligent guessing) things like the number of required iterations; C is possible; and D, E, F, G are not possible. Note that fulfilling requirement G is not out of reach, but it will require another specialized package.

B. FinSimMath Extensions to Verilog®

FinSimMath, an extension of Verilog for mathematical computations, is described in chapter 8 of FinSim's User's Guide [13]. Those extensions satisfy requirements A, B, C, D, E and F from Table III; G is also feasible. These requirements are satisfied for all families listed in Table I. A FinSimMath tutorial [15] provides running examples in extended Verilog that show how to: (1) modify during the simulation the format (floating point or fixed point), as well as the number of bits used for each field of the format, in the model of a low-pass filter; (2) perform, without the need of explicit conversion functions, arithmetic operations with operands of type complex (Cartesian or polar) or with matrices having elements of type complex, including the computation of the inverse of such matrices; (3) compute the pseudo-inverse of matrices; (4) separate data from its location using high-level constructs such as the "View as" and "InitM" constructs for multi-threading processing; (5) perform FFT and fast auto-correlation; (6) exchange FinSimMath data between modules; and (7) monitor special conditions such as overflow or underflow.

Example 3: The example selected here shows how the dot product in (F^2_{24,8})^100, a real space with 100 dimensions, is computed accurately using a long accumulator tmp2, which was designed based on the results obtained in Example 2. The full listing is given here just for information, and the details of the descriptor specifications are all available online in chapter 8 of [13]. The first for loop contains the code

tmp2 = 0;
for (i = 0; i < SIZE; i = i + 1)
  begin
    tmp1 = V1[i]*V2[i];
    tmp2 = tmp2 + tmp1;
  end
v = tmp2;
$display("using temporary registers: v = %k\n", v);

The line

tmp1 = V1[i]*V2[i];

is in fact performing an operation

* : F^2_{24,8} × F^2_{24,8} → F^2_{48,9}

and the line

tmp2 = tmp2 + tmp1;

performs

+ : F^2_{48,9} × Q^{2's}_{608,304} → Q^{2's}_{608,304}

while the assignment v = tmp2; which follows after the for loop has an implicit conversion from a fixed-point value into a floating-point value. The data for the example was chosen for a case where the result without using the long accumulator is wrong. This can be seen in the results printed by the simulator. The command $VpSetDescriptorInfo… before the last for loop of the code example is a dynamic change of the descriptor, reducing the mantissa. The loss in precision appears in the result.

A. Code for Example 3

module top;
`include "finsimmath.h"
parameter SIZE = 100;
VpDescriptor d1, d2, d3;
VpReg [0:32] V1[SIZE-1:0], V2[SIZE-1:0];
VpReg [0:283] tmp1;
VpReg [0:607] tmp2;
VpReg [0:32] v;
integer i;
initial
begin
  $VpSetDescriptorInfo(d1, 8, 24, `FLOATING,
    `TO_NEAREST_INTEGER_IF_TIE_TO_MINUS_INF,
    `SATURATION+`WARNING, 1);
  $VpSetDescriptorInfo(d2, 9, 48, `FLOATING,
    `TO_NEAREST_INTEGER_IF_TIE_TO_MINUS_INF,
    `SATURATION+`WARNING, 1);
  $VpSetDescriptorInfo(d3, 304, 304, `SIGN_MAGNITUDE,
    `TO_NEAREST_INTEGER_IF_TIE_TO_MINUS_INF,
    `SATURATION+`WARNING, 1);
  $VpSetDefaultOptions(8, 24, `FLOATING,
    `TO_NEAREST_INTEGER_IF_TIE_TO_MINUS_INF,
    `SATURATION+`WARNING, 1);
  $VpAssocDescrToData(V1, d1);
  $VpAssocDescrToData(V2, d1);
  $VpAssocDescrToData(v, d1);
  $VpAssocDescrToData(tmp1, d2);
  $VpAssocDescrToData(tmp2, d3);
  $InitM(V1, (($I1 == 0.0) ? -11.0 : 1+2**-13));
  $InitM(V2, (($I1 == 0.0) ? 9.0 : 1+2**-13));
  $PrintM(V1, "%k");
  $PrintM(V2, "%k");

  tmp2 = 0;
  for (i = 0; i < SIZE; i = i + 1)
    begin
      tmp1 = V1[i]*V2[i];
      tmp2 = tmp2 + tmp1;
    end
  v = tmp2;
  $display("using temporary registers: v = %k\n", v);

  v = 0;
  for (i = 0; i < SIZE; i = i + 1)
    begin
      v = v + V1[i]*V2[i];
    end
  $display("without using temporary registers: v = %k\n", v);

  $VpSetDescriptorInfo(d2, 8, 24, `FLOATING,
    `TO_NEAREST_INTEGER_IF_TIE_TO_MINUS_INF,
    `SATURATION+`WARNING, 1);
  tmp2 = 0;
  for (i = 0; i < SIZE; i = i + 1)
    begin
      tmp1 = V1[i]*V2[i];
      tmp2 = tmp2 + tmp1;
    end
  v = tmp2;
  $display("using temporary registers with small mantissae: v = %k\n", v);
end
endmodule

B. Example 3 Simulation Results

Simulating until no event ...
V1[ 0]=11000010.011000000000000000000000
V1[ 1]=00111111.000000000000100000000000
V1[ 2]=00111111.000000000000100000000000
...
V1[99]=00111111.000000000000100000000000
V2[ 0]=01000010.001000000000000000000000
V2[ 1]=00111111.000000000000100000000000
V2[ 2]=00111111.000000000000100000000000
...
V2[99]=00111111.000000000000100000000000
using temporary registers: v = 00111001.100011000000011000110000
without using temporary registers: v = 00111001.100011000000000000000000
using temporary registers with small mantissae: v = 00111001.100011000000000000000000
Ending at time 0s.
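The cancellation this example exercises can be reproduced outside the simulator. The Python sketch below mirrors the data pattern of Example 3 (with a 2**-40 perturbation instead of 2**-13, so that the low-order product term falls below double precision the way the 2**-26 term falls below single precision), comparing a naively rounded accumulation with an exact one:

```python
from fractions import Fraction

N = 100
V1 = [-11.0] + [1 + 2 ** -40] * (N - 1)
V2 = [9.0] + [1 + 2 ** -40] * (N - 1)

naive = 0.0
for a, b in zip(V1, V2):
    naive += a * b                 # each product rounds before accumulating

# Exact accumulation (standing in for the long fixed-point accumulator)
exact = sum(Fraction(a) * Fraction(b) for a, b in zip(V1, V2))

print(naive)
print(float(exact))                # differs from naive in the low bits
```

After the -99/+99 cancellation, only the tiny 2**-80 terms of the 99 products remain; the rounded products have already discarded them, so the naive result is short by exactly 99·2**-80.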

VI. CONCLUSIONS

The tremendous progress in computer technology should be accompanied by an extension of the mathematical capacity of the computer. A balanced standard of computer arithmetic should require that the basic components of modern computing (floating-point arithmetic, interval arithmetic, and an exact dot product) be provided by the computer's hardware. We presented how those and many other DSP and hardware arithmetic design problems benefit from enhancements to HDLs, including mixing formats and dynamic variable precision.

ACKNOWLEDGMENT

The author wishes to thank Dr. Alec Stanculescu for his help in developing and verifying the FinSimMath code example used in this presentation.

REFERENCES

[1] A. Zamfirescu, “Floating Point Types for Synthesis,” in Proc. of VIUF Fall 2000 Workshop, Orlando, Florida, Oct. 2000.

[2] A. N. Zamfirescu (co-author), J. P. Mermet (editor), Fundamentals and Standards in Hardware Description Languages, Kluwer Academic Publishers, Norwell, MA, 1993.

[3] U. Kulisch and W. Miranker, “The arithmetic of the digital computer: A new approach,” SIAM Rev., vol. 28, no. 1, pp. 1-40, 1986.

[4] A. N. Zamfirescu, “Numeric Types for Synthesis,” in Proc. of the VHDL International Users Forum, Fall 1992.

[5] A. Zamfirescu, “Modern Numeric Capabilities in Hardware Description Languages,” [Online]. Available: http://alex.zamfirescu.googlepages.com/num_May_10_2006.pdf

[6] Information technology -- Language independent arithmetic -- Part 1: Integer and floating point arithmetic, ISO/IEC 10967-1:1994.

[7] Information technology -- Language independent arithmetic -- Part 2: Elementary numerical functions, ISO/IEC 10967-2:2001.

[8] Information technology -- Language independent arithmetic -- Part 3: Complex integer and floating point arithmetic and complex elementary numerical functions, ISO/IEC 10967-3:2006.

[9] A. Sinha and A. P. Chandrakasan, “Energy efficient filtering using adaptive precision and variable voltage,” in Proc. Twelfth Annual IEEE International ASIC/SOC Conference, 1999, pp. 327-331.

[10] J. West, “Raytheon’s MONARCH,” [Online]. Available: http://insidehpc.com/2007/03/26/raytheons-monarch/

[11] J. Suh and J. O. McMahon, “Implementations of FIR for MONARCH Processor,” [Online]. Available: http://www.ll.mit.edu/HPEC/agendas/proc06/Day1/17_Suh_Abstract.pdf

[12] R. Kirchner and U. Kulisch, “Hardware support for interval arithmetic,” Reliable Computing, vol. 12, no. 3, pp. 225-237, 2006.

[13] FinSim's User's Guide, Fintronic USA, [Online]. Available: http://www.fintronic.com/manual/simug10.1.pdf

[14] IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2008, 2008.

[15] FinSimMath Tutorial, Fintronic USA, [Online]. Available: http://www.fintronic.com/finmath.html
