
Memory BIST area estimator using Artificial Neural Networks

Aymen Lamine, Nabil Chouba, and Laroussi Bouzaida
{aymen.lamine, nabil.chouba, laroussi.bouzaida}@st.com
STMicroelectronics
2083 La Gazelle Ariana, Tunisia

Abstract—Time-to-market constraints push more and more designers to make area estimations early in the design process. Today, estimating the Built-In Self Test (BIST) area is only possible once the BIST of each memory in the design has been synthesized. This is time consuming and not realistic for a large circuit such as an SOC, which can include hundreds of memories. In this paper we propose a push-button solution for BIST area estimation, called BARES, based on Artificial Neural Networks. Experiments have been performed on different STMICROELECTRONICS memory BISTs, and the results prove the efficiency of the proposed method.

Index Terms—Artificial Neural Network, Area estimation, Backpropagation algorithm, BIST, memory, Perceptron.

I. INTRODUCTION

Built-in self test (BIST) is the main DFT technique used for memory testing [10, 11, 12, 13]. Compared to other techniques, it offers good fault coverage, high test speed and direct test access to embedded memories; however, it increases the area and provides little information for diagnosis. Even though the BIST area is very small compared to the memory area, it is a real issue in today's circuits (e.g. SOC), where the number of integrated memories is continuously growing and the accumulated BIST areas lead to a significant global area overhead. When choosing a BIST strategy, designers face a dilemma: how to estimate the total area overhead, and hence the DFT memory cost, early in the design flow? How to save test run time while keeping the overall circuit area overhead under control? The problem is that the best estimate of the BIST area overhead available early in the design flow is obtained after RTL synthesis, which consumes both time and synthesis licenses, especially when the number of BISTs in the design is large. Designers therefore spend a lot of time exploring BIST strategies [10, 11, 12, 13] and options such as bitmap, redundancy, etc. In practice, SOC designers need an area estimation early in the design flow so that they can make modifications or optimizations to fulfill their

specifications. Time-to-market constraints push more and more designers to reduce design time, and SOC design contributes largely to this strategy, so it is not acceptable to spend a lot of time calculating the BIST area overhead for different architectural configurations. Thus, there is a need for a faster way of estimating, ideally an instant BIST area estimation for a given memory CUT.

Within STMICROELECTRONICS, a statistics-based technique has been developed to respond to this need. For confidentiality reasons we cannot reference it, but it instantly gives the designer a good estimation of the BIST area. However, it has to be customized for each kind of memory, and in some cases for special configurations, which makes the development of the estimator itself complex and time consuming. It is important to note that there is a large set of memories with different kinds of options, in addition to new ones that are continuously implemented following customers' requests and technology evolutions.

In this paper we present a new method implemented within a tool we named BARES. This method is built around the artificial neural network (ANN) technique. BARES instantly and accurately approximates the BIST area variation as a function of the memory parameters. The area estimation is based on multilayer neural networks integrated into the tool. These networks are trained, using the backpropagation learning algorithm, on the BIST areas of different memory CUTs with different parameters and BIST options. Experiments have been performed on different STMICROELECTRONICS memory BISTs, and the results prove the efficiency of the proposed method.

This paper is organized as follows: the second section describes the issue and presents an overview of the proposed solution. The third section focuses on the memory parameters that can affect the BIST area and should be considered by our estimator. The fourth section describes the learning and validation of the neural networks as well as their integration within the tool. Next, we present the results of our experiments and the level of accuracy we reached. Finally, conclusions constitute the last part of the paper.

II. ISSUE, STATE OF THE ART & PROPOSED SOLUTION

A. Core issue

For a given memory CUT (number of words, number of bits, MUX factor, etc.), calculate instantly, during the system-level design step and before synthesis, the area overhead due to BIST insertion. As a starting requirement, we decided that the delta between the estimated area and the area resulting from synthesis should be less than 10%.

As far as we know, this issue has never been treated in IEEE publications. The only project that succeeded in delivering a solution was conducted within one of the STMICROELECTRONICS divisions. We will not go through its details; it is the statistics-based technique mentioned in the introduction.

The solution we present uses artificial neural networks. Several ANNs are built, and the principle is to make them learn the BIST area variation while varying parameters such as the number of words or bits, across different memories or BIST-specific configurations such as redundancy and mask_bits, as well as synthesis constraints such as frequency. The learning phase produces a set of small files storing the pertinent ANN information (a set of specific weights). BARES assembles all those files, each one holding the ANN of a specific memory BIST configuration; once invoked through a simple command, it uses the appropriate ANN weights to calculate the estimated area instantly. In the following, we adopt the convention that integrating a memory within BARES means generating its set of ANN files and linking them to the BARES kernel.

III. PRELIMINARY STUDIES

In order to identify the parameters that influence the BIST area estimation, we carried out several preliminary studies. The idea was to concentrate the learning phase only on the parameters whose variation really impacts the area, and at the same time to increase the ANN convergence speed.

A. Memory & BIST specific parameters

Without going into details, we reached the conclusion that the following memory or BIST-specific characteristics have a real impact on the area when they vary:

• CMOS technology
• Words
• Bits
• BIST frequency
• Redundancy
• Mask_bits
• Synthesis timing constraint

The synthesis timing constraints are related to slack problems on the memory busses. To avoid these problems, we fixed the maximum delay on those busses to 1.1 ns, which is sufficient in our specific case.

For each memory, four ANNs are obtained through a learning phase:

• Standard configuration
• Mask_bits
• Redundancy
• Mask_bits and redundancy

During the learning phase of those four ANNs, the parameters Words, Bits and BIST frequency vary within limits that we fix according to memory and technology constraints.

On the other hand, we decided to ignore the MUX factor, which has no significant impact on the BIST area except in the specific case of small memories, where the MUX factor can change the BIST configuration itself. The results presented in Figure 1 show that:

• In 97.4% of cases, the area variation due to the MUX factor is lower than 4%.
• In 88.5% of cases, the area variation due to the MUX factor is lower than 2%.

The average variation is 1.06%.

Figure 1: BIST area variation for different memories with several MUX factors, compared to MUX factor 4.

B. Targeted memories

BARES targets SRAM memories:

• Single port SRAM
• Dual port SRAM

C. Black boxes or memory synthesis models

The scope of this study was to evaluate the influence on the BIST area of synthesizing the BIST with or without a memory synthesis model. The results (Table 1) show that, for the targeted memories, the area variation between using a memory synthesis model and treating the memory as a black box never exceeds 9%, while the average variation is about 2%. The cases where the difference exceeds 5% are very limited. This low variation allows us to use black boxes instead of memory models. This makes it possible to build a large database for the learning phase easily and quickly: we simply avoid generating a memory synthesis model (a long internal procedure) for each element of the database.
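As a quick check of the figures quoted for the black-box study, the variation can be recomputed from the area pairs listed in Table 1. The sketch below assumes that the variation is the absolute area difference relative to the area obtained with the memory model; the paper does not state the exact formula, so individual percentages may differ slightly from the published column. The names `TABLE_1` and `variation_percent` are illustrative only.

```python
# Area pairs (µm²) copied from Table 1: (words, bits, with model, without model).
TABLE_1 = [
    (128, 4, 3320.24, 3403.66),
    (256, 4, 3404.75, 3485.98),
    (2048, 4, 3641.83, 3730.74),
    (128, 8, 3521.11, 3584.76),
    (1024, 4, 3827.33, 3822.94),
    (256, 32, 4637.36, 4788.83),
    (2048, 32, 4876.64, 5024.81),
    (4096, 32, 4949.08, 5084.18),
    (128, 64, 6009.36, 6150.97),
    (512, 64, 6160.82, 6315.59),
    (4096, 64, 6392.42, 6560.35),
]


def variation_percent(with_model: float, without_model: float) -> float:
    """Relative area difference in percent (assumed definition, see lead-in)."""
    return abs(without_model - with_model) / with_model * 100.0


variations = [variation_percent(w, wo) for _, _, w, wo in TABLE_1]
mean_variation = sum(variations) / len(variations)
max_variation = max(variations)

print(f"mean variation: {mean_variation:.2f}%")
print(f"max variation:  {max_variation:.2f}%")
```

On these eleven rows the mean stays a little above 2% and every row is well below the 9% bound, which is consistent with the claim that black boxes are an acceptable substitute for memory models.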

Table 1: BIST area variation with and without use of a memory model

Memory parameters     BIST        Synthesized BIST area (µm²)      Variation
                      frequency   with memory    without memory    (%)
word    bit    mux    (MHz)       model          model
128     4      4      200         3320.24        3403.66           2.51
256     4      4      200         3404.75        3485.98           2.39
2048    4      4      200         3641.83        3730.74           2.44
128     8      4      200         3521.11        3584.76           1.81
1024    4      4      200         3827.33        3822.94           0.15
256     32     4      200         4637.36        4788.83           3.27
2048    32     4      200         4876.64        5024.81           3.04
4096    32     4      200         4949.08        5084.18           2.75
128     64     4      200         6009.36        6150.97           2.35
512     64     4      200         6160.82        6315.59           2.51
4096    64     4      200         6392.42        6560.35           2.63

IV. IMPLEMENTATION

A. The Multi-layer feed-forward networks

As mentioned before, the neural network used in our case is the multilayer feed-forward network with a backpropagation learning rule. A feed-forward network has a layered structure. Each layer consists of units which receive their inputs from units in the layer below (n-1) and send their outputs to units in the layer above (n+1). There are no connections within a layer. Although backpropagation can be applied to networks with any number of layers, it has been shown in [5], [6], [7] and [8] that a single layer of hidden units suffices to approximate any function with finitely many discontinuities to arbitrary precision, provided the activation functions of the hidden units are non-linear (the universal approximation theorem). In our application, a feed-forward network with three layers of hidden units is used, with a sigmoid activation function for the units. The backpropagation algorithm [10] is a gradient-descent method minimizing the squared-error cost function. It is the algorithm used for the learning phase.

1) Networks architecture

In our specific application, each network has 5 layers, and each layer has a specific number of neurons:

• Input layer: 3 neurons (fixed).
• 3 hidden layers: 4, 6 and 3 neurons to start the ANN building flow; these numbers can vary during optimizations.
• Output layer: 1 neuron (fixed).
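To make the starting architecture concrete, the following sketch builds a 3-4-6-3-1 feed-forward network with sigmoid units and runs one forward pass. It is an illustration only, not the BARES implementation: the function names, the presence of a bias weight on every neuron, the initial weight range and the example input are all assumptions. The three inputs are taken to be the normalized words, bits and BIST frequency, which is our reading of the parameters varied during learning.

```python
import math
import random

LAYER_SIZES = [3, 4, 6, 3, 1]  # input, three hidden layers, output


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(layer_sizes, rng, scale=0.05):
    """One layer is a list of neurons; one neuron is [bias, w_1, ..., w_n].

    Weights start as small random values (the range is an assumption).
    """
    return [
        [[rng.uniform(-scale, scale) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(network, inputs):
    """Propagate the inputs layer by layer and return the output activations."""
    activations = list(inputs)
    for layer in network:
        activations = [
            sigmoid(neuron[0] + sum(w * a for w, a in zip(neuron[1:], activations)))
            for neuron in layer
        ]
    return activations


rng = random.Random(0)
network = init_network(LAYER_SIZES, rng)

# Hypothetical normalized input: (words, bits, BIST frequency), each in ]0, 1[.
estimate = forward(network, [0.25, 0.5, 0.4])
n_parameters = sum(len(neuron) for layer in network for neuron in layer)
print(estimate, n_parameters)
```

With bias weights included, this starting topology has only 71 adjustable values, which fits the description of the learning phase producing a set of small files per network.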
2) ANN building flow

In the initial state, the number of neurons in each hidden layer is the same for all the ANNs. During the learning phase, the number of neurons can be increased or decreased depending on the learning convergence of each network (i.e. memory type). The connection weights of the neurons are all initialized to very small random values for all networks.

a) Learning rate

The learning procedure requires that the change in weight be proportional to the error delta between two output patterns. True gradient descent requires infinitesimal steps. The constant of proportionality is the learning rate. For practical purposes, we choose a learning rate that is as large as possible while keeping oscillations under control. In our case, we chose a learning rate of 0.3, based on several experimental tests.

b) Building databases

As mentioned before, the use of ANNs for BIST area estimation requires a large amount of data. These databases, which contain a large number of CUT configurations of a given memory together with their synthesized BIST areas, are used for the learning step of an ANN. They are tabular files containing the inputs (words, bits, mux and frequency) and the output (BIST area) of the ANN. The same databases are used later to test and validate the built networks.

The first step in integrating a memory within BARES is building the database. The decision to use black boxes for memories greatly simplified this step, since we only need to synthesize the BIST, which is generated automatically with our internal STMICROELECTRONICS BIST generator (ugnBIST). The task is done by an automatic database generator, which reads a specification file indicating the targeted memory and the BIST-specific configuration, then goes through the synthesis flow and reports the areas.

The resulting database is then randomly split into two parts: one used for the learning phase (75% of the initial database) and the other for test and validation (the remaining 25%). Finally, both databases are formatted to fit the input and output format of the neural network, where all values must be in the range ]0, 1[.

c) Learning step

During this process, a network, defined by its architecture and initial connection weights, is trained on a targeted database. The learning phase of the multilayer feed-forward network is supervised, since it uses the backpropagation learning rule. To keep sufficient control over the learning phase, we report both the error and the squared error during this stage. While the error is the principal criterion for ending the learning, the squared error lets us predict the behavior of the network. Thus we can decide early whether the network will converge or needs more optimization.

d) Learning optimization

• Learning rate optimization

One way to optimize the learning rate is to avoid oscillations. This is done by making the new weight change depend on the previous one, by adding a momentum term. When no momentum term is used, it takes a long time to reach the minimum with a low learning rate, while for high learning rates the minimum is never reached because of oscillations. When the momentum term is added, the minimum is reached faster and some local minima can be avoided. In our case we set the momentum to 0.25. This optimization method is the first one we invoke, for its efficiency; indeed, it is hard-coded in the backpropagation algorithm.

• Network architecture optimization

This optimization is done manually by editing the network architecture when the learning algorithm fails to converge or when we observe over-learning. In both cases the number of neurons in each hidden layer must be modified: in the first case we add some units, while in the second case we remove some of them.

• Database random selection

As mentioned in section IV-2-b, the database patterns are selected randomly, and the more random the selection, the more efficient the learning. If a learning process fails, we can restart it from the same initial database, without re-synthesizing, by doing another random selection. This optimization is usually used as the last alternative, a choice that came out of multiple experiments.

3) Validation step

In this stage, each configured network is tested using the validation database. For each memory CUT, the synthesized BIST area is compared to the estimated one and the error is calculated. If the delta error exceeds the threshold (10%), another neural network is built with a different architecture and trained again for the same memory CUT and BIST configuration. If the results comply, the neural network and its synaptic weight configuration are saved and integrated into BARES.
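The building flow and validation step described above can be sketched end to end as follows. This is a toy illustration in plain Python, not the BARES code: the database is produced by a made-up `fake_area` function instead of synthesis runs, and the parameter limits in `LIMITS`, the min-max scaling into [0.1, 0.9], the bias weights, the per-pattern (online) update and the epoch count are all assumptions. Only the random 75%/25% split, the learning rate (0.3), the momentum (0.25), the 3-4-6-3-1 starting architecture and the 10% threshold come from the text.

```python
import math
import random

LEARNING_RATE = 0.3   # value reported in the paper
MOMENTUM = 0.25       # value reported in the paper
THRESHOLD = 10.0      # maximum accepted delta error, in percent
LAYERS = [3, 4, 6, 3, 1]
LIMITS = [(128, 4096), (4, 64), (100, 400)]  # assumed words, bits, MHz ranges


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def scale(value, lo, hi):
    """Map [lo, hi] into [0.1, 0.9] so values stay strictly inside ]0, 1[."""
    return 0.1 + 0.8 * (value - lo) / (hi - lo)


def unscale(value, lo, hi):
    return lo + (value - 0.1) / 0.8 * (hi - lo)


def fake_area(words, bits, freq):
    """Made-up stand-in for a synthesized BIST area (µm²); not real data."""
    return 3000.0 + 45.0 * bits + 60.0 * math.log2(words) + 1.5 * freq


def forward(net, x):
    """Return the activations of every layer, input included."""
    acts = [list(x)]
    for layer in net:
        acts.append([sigmoid(n[0] + sum(w * a for w, a in zip(n[1:], acts[-1])))
                     for n in layer])
    return acts


def train_pattern(net, prev, x, target):
    """One online backpropagation step with a momentum term."""
    acts = forward(net, x)
    out = acts[-1][0]
    deltas = [(target - out) * out * (1.0 - out)]
    for li in range(len(net) - 1, -1, -1):
        inputs = [1.0] + acts[li]  # leading 1.0 feeds the bias weight
        # Back-propagate through the weights as they were before this update.
        below = [a * (1.0 - a) * sum(d * net[li][j][i + 1] for j, d in enumerate(deltas))
                 for i, a in enumerate(acts[li])]
        for j, d in enumerate(deltas):
            for i, a in enumerate(inputs):
                step = LEARNING_RATE * d * a + MOMENTUM * prev[li][j][i]
                net[li][j][i] += step
                prev[li][j][i] = step
        deltas = below
    return (target - out) ** 2


rng = random.Random(1)
database = []
for _ in range(80):
    cut = (rng.choice([128, 256, 512, 1024, 2048, 4096]),
           rng.randint(4, 64), rng.randint(100, 400))
    database.append((cut, fake_area(*cut)))

areas = [area for _, area in database]
A_LO, A_HI = min(areas), max(areas)
patterns = [([scale(v, lo, hi) for v, (lo, hi) in zip(cut, LIMITS)],
             scale(area, A_LO, A_HI)) for cut, area in database]

rng.shuffle(patterns)                      # random selection of the patterns
split = int(0.75 * len(patterns))          # 75% learning, 25% validation
learning, validation = patterns[:split], patterns[split:]

net = [[[rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)] for _ in range(n_out)]
       for n_in, n_out in zip(LAYERS[:-1], LAYERS[1:])]
prev = [[[0.0] * len(n) for n in layer] for layer in net]

for epoch in range(100):
    squared_error = sum(train_pattern(net, prev, x, t) for x, t in learning)

delta_errors = []
for x, t in validation:
    estimated = unscale(forward(net, x)[-1][0], A_LO, A_HI)
    synthesized = unscale(t, A_LO, A_HI)
    delta_errors.append(abs(estimated - synthesized) / synthesized * 100.0)

worst = max(delta_errors)
accepted = worst <= THRESHOLD
print(f"squared error: {squared_error:.4f}, worst delta error: {worst:.2f}%")
print("keep network" if accepted else "rebuild with another architecture")
```

Whether this toy run meets the threshold depends on the synthetic data, the seed and the number of epochs; the point is the order of the steps, not the accuracy. A rejected network corresponds to the case where the text calls for another random selection or a modified hidden-layer configuration.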

V. RESULTS

The results shown in this section are the delta errors between synthesized and estimated BIST areas, as obtained during the validation step (section IV-3). The histograms in Figures 2 to 7 plot the error distribution for single port and dual port memory BISTs, with the error percentage range on the X axis and the number of memory CUTs on the Y axis.

Experiments have been performed on a large set of STMICROELECTRONICS SRAM (single or dual port) memories. The worst delta never exceeded 10% (the threshold fixed for this estimator). Moreover, the maximum delta was only 6%, and in about 90% of cases we got a delta under 2%.

A. Single port

Figures 2, 3 and 4 show the results obtained for different kinds of single port memory CUTs with different BIST configurations.

For the SPSMALL memory BIST with default options, shown in Figure 2, only 5% of the memory CUTs had a delta error greater than 2%. Moreover, in about 75% of cases the delta error did not exceed 1%.

Figure 2: Error distribution for single port memory (SPSMALL) with default BIST options.

Figure 3 shows the results obtained for an SPHD memory BIST, which is larger than the SPSMALL one. The delta error exceeded 2% in only 7% of cases; however, the share of cases where the delta error was lower than 1% decreased to 63%.

Figure 3: Error distribution for single port memory (SPHD) with default BIST options.

The last example, shown in Figure 4, is the SPHS memory BIST used with both the redundancy and mask_bits options. With this configuration, this BIST is the biggest of the single port memory BISTs shown here. The results show that the delta error was greater than 2% in about 14% of cases, while the cases where it did not exceed 1% were only about 60%.

Figure 4: Error distribution for single port memory (SPHS) with redundancy and mask_bits BIST options enabled.

B. Dual port

Figures 5, 6 and 7 show the results obtained for different kinds of dual port memory CUTs with different BIST configurations.

For the DPHD memory BIST with default options, shown in Figure 5, only 11% of the memory CUTs had a delta error greater than 2%, and in about 60% of cases the delta error did not exceed 1%.

Figure 5: Error distribution for a dual port memory (DPHD) with default BIST options.

Figure 6 shows the results obtained for the same type of memory but with a different BIST configuration, namely with the mask_bits option enabled. The delta error exceeded 2% in only 17% of cases; however, the share of cases where the delta error was lower than 1% decreased to 52%.

Figure 6: Error distribution for a dual port memory (DPHD) with mask_bits BIST option enabled.

The last example, shown in Figure 7, is the DPHS memory BIST used with both the redundancy and mask_bits options. The results show that the delta error was greater than 2% in about 25% of cases, while the cases where it did not exceed 1% were only about 51%.

Figure 7: Error distribution for dual port memory (DPHS) with redundancy and mask_bits BIST options enabled.

VI. BARES MEMORY INTEGRATION PLATFORM

BARES is now used in two different divisions of STMICROELECTRONICS worldwide, and users are very satisfied with its efficiency. As adding new memories is time consuming, we developed a fully automatic integration platform (Figure 8). It covers database generation, learning steps, optimization and integration within the tool. For a new memory, the user just needs to prepare a specification file including all the necessary information about the memory, its BIST, the development platform and the CAD tools (BIST generator & synthesizer).

[Figure 8 flow diagram. Block labels: neural network topology; memory parameters; BIST parameters; synthesis and ugnBIST version; CMOS Corelib technology files; specification files; database generation; synthesis; test pattern; learning pattern; normalization; learning; testing resulting network; automatic neural network integration platform; result files used by BARES (neural network topology, synaptic weight).]

Figure 8: Automatic integration platform flow

VII. CONCLUSIONS

In this paper we described the methodology we followed to build our solution for BIST area estimation, and the steps that this solution, implemented as a software tool, follows in order to give an accurate BIST area estimation for a given memory and BIST configuration. This accuracy and flexibility were made possible by the use of Artificial Neural Networks. Building ANNs requires large databases covering as many memories and BIST configurations as possible. The integration platform we deliver with BARES (section VI) overcomes this constraint and offers users an extensible tool giving very efficient BIST area estimation.

REFERENCES

[1] H. Narazaki and A. L. Ralescu, "An improved synthesis method for multilayered neural networks using qualitative knowledge," IEEE Trans. on Fuzzy Systems, vol. 1, no. 2, May 1993.
[2] P. Ruzicka, "Learning neural networks with respect to weight errors," IEEE Trans. on Circuits and Systems - Fundamental Theory and Applications, vol. 40, no. 5, May 1993.
[3] B. M. Wilamowski, O. Kaynak, S. Iplikci and M. Ö. Efe, "An algorithm for fast convergence in training neural networks."
[4] B. Kröse and P. van der Smagt, "An introduction to neural networks," online document, Univ. of Amsterdam, 8th edition, Nov. 1996. Available: http://neuron.tuke.sk/math.chtf.stuba.sk/pub/vlado/NN_books_texts/Krose_Smagt_neuro-intro.pdf
[5] K. Hornik, M. Stinchcombe and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, 1989, pp. 359-366.
[6] K. Funahashi, "On the approximate realization of continuous mappings by neural networks," Neural Networks, vol. 2, 1989, pp. 183-192.
[7] G. Cybenko, "Approximation by superposition of a sigmoidal function," Mathematics of Control, Signals and Systems, vol. 2, 1989, pp. 303-314.
[8] E. J. Hartman, J. D. Keeler and J. M. Kowalski, "Layered neural networks with Gaussian hidden units as universal approximations," Neural Computation, vol. 2, 1990, pp. 210-215.
[9] C. Touzet, "Les réseaux de neurons artificiels, Introduction au connexionnisme," Jul. 1992. Available: http://www.up.univ-mrs.fr
[10] S. S. Gorsche, "An efficient memory fault-test technique for ASIC-based memories," SUPERCOMM/ICC '92, Discovering a New World of Communications, IEEE International Conference on Communications, Conference Record, 14-18 June 1992, vol. 1, pp. 136-141.
[11] Y. Wu and S. Gupta, "Built-in self-test for multi-port RAMs," Proceedings of the Sixth Asian Test Symposium (ATS '97), 17-19 Nov. 1997, pp. 398-403.
[12] V. Schober, S. Paul and O. Picot, "Memory built-in self-repair using redundant words," Proceedings of the International Test Conference, 30 Oct.-1 Nov. 2001, pp. 995-1001.
[13] S. Bahl and V. Srivastava, "Self-Programmable Shared BIST for Testing Multiple Memories," 13th European Test, 25-29 May 2008, pp. 91-96.
