IEEE International Workshop on Virtual and Intelligent Measurement Systems Budapest, Hungary, May 19−20, 2001.

High speed stand-alone visual decision maker device based on Focal Plane Analogic Processor Array Chip

L. Török, Á. Zarándy, T. Roska
{torok, zarandy, roska}@sztaki.hu

Analogical & Neural Computing Systems Laboratory, Computer and Automation Research Institute of the Hungarian Academy of Sciences, H-1111 Budapest, Kende u. 13-17, Hungary

Abstract
Newly emerging problems require high-speed decision making based on visual perception of the environment. A project was set up to construct a self-contained, intelligent-agent-like device that can act in real time and show collaborative behavior. As the hardware basis for decision making, the optical input of a cellular non-linear network (CNN) [1,2] chip implementation is combined with information received from cooperating devices via binary and serial ports. The proposed design is a compact, self-contained device prepared to operate stand-alone for up to 10 hours on medium-sized batteries while measuring, logging, and collaborating with its environment via a parallel port (for image transfer), an RS-232 port (using Modbus, Profibus, and PPP protocols), and binary I/Os. An intelligent power module, optical isolation, and watchdog capability were also considered.

Keywords
high speed imaging, stand-alone system, real-time imaging, Cellular Neural Network, Focal Plane Analogic Processor Array, decision maker

I. INTRODUCTION
The CNN Universal Machine is a massively parallel analog processor array architecture [3]. The current implementation, named ACE4k [4], already exceeds a computing capability of 1 tera-operation per second. Existing experimental installations are based on the predecessor of this chip family (ACE440), which was capable of sensing and processing images at 50,000 frames per second (fps) at a size of 20x22 pixels [5,6]. The construction of a self-contained CNN device that satisfies industrial criteria has been a long-standing need. The device introduced here uses a chip (ACE4k) with a size of 64x64 pixels [4]. Experiments show that its optical input operates in the range of 200-1000 fps, but its cell complexity, which is essential for algorithm optimization, is significantly higher. The successor chip generation (ACE16k), which is just entering production, is expected to operate at 10,000 fps at a size of 128x128 pixels.

Figure 1: The current status of the device and the ACE4k.

The Current Status of the Project
The hardware has been built. Software integration is under way. It is very likely that by the time of the conference the engine board will be fully programmed.

II. ACE4K IS NOW APPLIED AS FOCAL PLANE ANALOGIC PROCESSOR ARRAY
The central visual microprocessor is a CMOS implementation of the CNN Universal Machine [3]. The basic principle of the architecture is a grid of cells: the processing elements sit at the grid points and are connected to their neighbors along the grid edges. Each processing element computes two convolutions over its neighborhood with NxN matrices, called templates, and applies a sigmoid-like transfer function. One convolution is a feed-forward of the input map; the other is a feedback of the transformed current map (the current map is also called the state). The sum of these terms defines the derivative of the state value. The operation can be formalized by the following differential equation,

$$\dot{x}_{i,j} = -x_{i,j} + \sum_{C_{k,l} \in S_r(i,j)} A_{i,j;k,l}\, y_{k,l} + \sum_{C_{k,l} \in S_r(i,j)} B_{i,j;k,l}\, u_{k,l} + z_{i,j} \qquad (1)$$

with the output equation,

$$y_{i,j} = f(x_{i,j}) = \frac{1}{2}\left|x_{i,j} + 1\right| - \frac{1}{2}\left|x_{i,j} - 1\right| \qquad (2)$$

where

$x_{i,j}$ - current map state at position (i,j)
$y_{i,j}$ - output map
$u_{i,j}$ - input map
$z_{i,j}$ - independent bias value
$t$ - time
$A$, $B$ - template matrices.

In equation (1) the template values can be programmed to achieve the desired transformation of the map. The A and B terms are responsible for the feedback and the feed-forward operations, respectively [1,2]. There are extensions of this simple central equation that lead to non-linear forms [7]. They are much more powerful but cause an even greater increase in implementation complexity. Our chip, the ACE4k, implements this calculation in an analog way. The inherent convolution size, also called the template size, is 3x3. Owing to this massively parallel architecture, the speed of convergence, characterized by t_cnn, is on the order of a few microseconds. The ACE4k also contains several other elements needed for the role of central processing unit, such as memories for maps that can be used as input or output at will.

The major application area, among others, is image processing. Operation on segmented pictures is often a must, and it was possible to place binary logic on the chip next to the aforementioned processing element. For communication with the outside world the ACE4k has analog, binary, and control buses. The chip also contains integrated light sensors that can be used as a high-speed optical input. A sharp picture focused on the chip can be used as the input of a template operation.

III. HARDWARE ARCHITECTURE
From the traditional CPU point of view the ACE4k is a passive element. Its buses and lines differ significantly from conventional ones, so they must be interfaced to traditional buses using A/D converters and a PLD device. In this project we aimed to exploit the advantages of the ACE4k in a mobile box with considerable computing power in a compact form; hence a low-power Texas digital signal processor (DSP) is also embedded. Ports were added for communication with the outside world. Although the serial port (RS-232C) is the umbilical cord, optically isolated binary input/output ports were also placed on the board. If higher communication speed is required, for example for image transfer, the parallel port can be used. This can be useful in a collaborative environment, in which the overall decision about the current scene results from the collaborative work of several devices.
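The cell dynamics of Section II, equations (1) and (2), can be checked numerically with a short simulation. The sketch below is only an illustration under stated assumptions: it integrates equation (1) with a forward-Euler step for a 3x3 template, uses zero-valued cells outside the array as the boundary condition, and measures time in units of the cell time constant, so it does not model the analog timing of the ACE4k. The template values `A_EDGE`, `B_EDGE`, and `Z_EDGE` are a commonly quoted edge-detection template from the CNN literature, not values taken from this paper, and the function names are hypothetical.

```python
import numpy as np

def output(x):
    """Piecewise-linear output of equation (2): y = (|x + 1| - |x - 1|) / 2."""
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def neighborhood_sum(template, image, boundary=0.0):
    """Weighted sum over the 3x3 neighborhood S_1(i,j) of every cell.

    Cells outside the array are assumed to hold `boundary` (an assumption,
    the paper does not state the boundary condition).
    """
    padded = np.pad(image, 1, mode="constant", constant_values=boundary)
    rows, cols = image.shape
    total = np.zeros_like(image, dtype=float)
    for k in range(3):
        for l in range(3):
            total += template[k, l] * padded[k:k + rows, l:l + cols]
    return total

def run_cnn(u, A, B, z, x0=None, dt=0.05, steps=400):
    """Forward-Euler integration of equation (1) for a fixed input map u."""
    x = np.zeros_like(u, dtype=float) if x0 is None else x0.astype(float).copy()
    feed_forward = neighborhood_sum(B, u) + z  # constant during the transient
    for _ in range(steps):
        feedback = neighborhood_sum(A, output(x))
        x += dt * (-x + feedback + feed_forward)
    return output(x)

# Illustrative edge-detection template (assumed, not from the paper).
# Convention assumed here: +1 is black, -1 is white.
A_EDGE = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
B_EDGE = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)
Z_EDGE = -1.0

if __name__ == "__main__":
    u = -np.ones((8, 8))
    u[2:6, 2:6] = 1.0  # a black 4x4 square on a white background
    y = run_cnn(u, A_EDGE, B_EDGE, Z_EDGE)
    print((y > 0).astype(int))  # 1 marks cells that settled to black
```

With this template only the outline of the black square should settle to +1, while the interior and the background settle to -1. On the chip the same transient is computed by all cells in parallel in analog hardware, which is where the speed advantage comes from.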

Figure 2: Architecture of the CNN engine board. Several components make the device reliable yet handy in an industrial environment. A real-time clock, a flash storage device, and an intelligent power module are always a must in an embedded environment.

The board
Since the ACE4k is a low-voltage element (3.3 V), all components were selected from low-voltage CMOS families. The former designs that drove the ACE440 used the C25 member of the Texas DSP family, so for software compatibility one of its derivatives was selected. The LC206 runs at low voltage and can be clocked at 80 MHz, so with its serial and flash booting capability it seemed a reasonable choice. Running the board at 80 MHz, together with other constraints arising from analog operation, required a hand-made layout to minimize crosstalk, EMC problems, etc. The board now operates reliably at full speed.

The ACE4k requires a dual power supply because of its analogic construction, that is, analog and logic at the same time. Under certain circumstances the chip draws a current of 1.2 A from the analog supply, but only for a few nanoseconds. The board had to be prepared to supply this current without disturbing other analog and digital lines, using extensive ceramic and tantalum decoupling capacitors in parallel. Power lines follow a radial arrangement from the center of the board, named the power center. The middle two layers of the board were dedicated exclusively to power lines, but shared between analog and digital lines.

ACE4k communication elements
The chip's analog bus consists of 16 analog lines. In general they are used for both input and output, but our construction uses them exclusively for output. They are interfaced with two 8-channel A/D converters from Analog Devices (AD7829). Their conversion speed of 2 mega-samples per second (MSPS) at 8-bit precision is fully exploited. For interfacing the logic and control buses, one of the Xilinx CPLDs was found to meet our requirements regarding complexity, number of pins, macrocell speed, and low-voltage capability. For system bus interfacing (i.e., the processor, memory, and port buses) an advanced double-buffered 16-bit transceiver (SN74ALVCH16646) was used.

Power module
The power module is always crucial to system stability. It consists of two 1.5 A fixed-voltage low-dropout regulators (LDO) from Linear Technology (LT1083-33). Their ripple rejection and RMS output noise characteristics seemed to satisfy our requirements. Maxim's microprocessor supervisory circuit (MAX793) was designed for portable devices and provides backup-battery switchover among other features, such as a low-line indication signal (connected to the processor's non-maskable interrupt, NMI), a write-protection loop for CMOS memory, a watchdog, and manual reset. Its reset stability made this component an essential element of the system health monitoring part. A 12-bit serial A/D converter was added for power and temperature monitoring for a similar reason.
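As a rough feasibility check on the read-out path described under "ACE4k communication elements", the time needed to digitize one full gray-scale frame can be estimated from the figures given in the text. The sketch below is back-of-the-envelope arithmetic, not a measured result. It assumes that the quoted 2 MSPS is the aggregate rate of each converter (shared by its 8 multiplexed channels), that every pixel of the 64x64 array is read out once, and that chip settling and PLD overhead are negligible. The function and constant names are hypothetical.

```python
ARRAY_ROWS = 64                  # ACE4k array size quoted in the Introduction
ARRAY_COLS = 64
NUM_CONVERTERS = 2               # two AD7829 devices on the board
SAMPLES_PER_SECOND = 2_000_000   # 2 MSPS, ASSUMED to be per converter
BITS_PER_SAMPLE = 8

def frame_readout_time_s(rows=ARRAY_ROWS, cols=ARRAY_COLS,
                         converters=NUM_CONVERTERS, rate=SAMPLES_PER_SECOND):
    """Seconds needed to digitize every pixel of one frame."""
    return (rows * cols) / (converters * rate)

def max_readout_fps(**kwargs):
    """Upper bound on full-frame read-outs per second."""
    return 1.0 / frame_readout_time_s(**kwargs)

def readout_bandwidth_bytes_per_s(converters=NUM_CONVERTERS,
                                  rate=SAMPLES_PER_SECOND,
                                  bits=BITS_PER_SAMPLE):
    """Digital data rate the DSP side has to absorb during read-out."""
    return converters * rate * bits / 8

if __name__ == "__main__":
    print(f"frame read-out time: {frame_readout_time_s() * 1e3:.3f} ms")
    print(f"read-out limit:      {max_readout_fps():.1f} fps")
    print(f"data rate:           {readout_bandwidth_bytes_per_s() / 1e6:.1f} MB/s")
```

Under these assumptions a full frame takes about 1.02 ms to digitize, which bounds full gray-scale read-out at just under 1000 frames per second. Decisions computed on the chip and read out as a few values or a binary map would not be subject to the same limit.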

Accessing the outside world
RS-232 is by far the most viable way to do that. Software development eventually happens by downloading code from time to time over this umbilical cord. Terminal software, which communicates on this port as well, is the greatest help during debugging. The same port is to be used for communication with other systems using given protocols. A driver IC equipped with a charge pump (MAX3233) was selected for voltage conversion; it has an automatic shutdown/wake-up feature as well as 15 kV ESD protection. The parallel port is used for high-speed image transfer, but it is not intended to be used in an industrial environment. The communication components were placed on the second board, so they can easily be removed if space is a limitation (see Figure 1). The optically isolated binary I/O interface was placed on the base board.

IV. SOFTWARE ARCHITECTURE
Software architecture is one of the most critical parts from the application's point of view. Former software constructions supported a download-and-run-once methodology. The new architecture requires much more sophisticated handling of resources. Setting up communication while processing and interacting with the ACE4k obviously requires a multitasking environment. To exploit the highest computation speed and to compensate for typical analog signal problems, critical timings have to be met, so an independent real-time scheduler is designed into the device. uC/OS [8] was found to be small enough for easy handling, easy to port, and still powerful enough for timing jobs. The software architecture is layered. The kernel of the device consists of standard inter-process communication elements and device service components such as the serial port handler and the task serializer. Several tasks run on the user layer. One of them constantly monitors system health (temperature, stack usage). Monitoring can be done with standard terminal software. Depending on the designated application, either code generated from C or an interpreter can run; the interpreter can load and run code generated by PC software named Visual Mouse.

V. APPLICATION AREAS
The uniqueness of the device is its high-speed computation capability in remote places. It can process, and act on, events that even the human eye cannot recognize, in a fraction of the time. Such devices may identify subconscious phenomena that are not yet known, and allow assumptions to be tested that could not be tested otherwise.

Designated areas
It is hard to say what the major application area will be, but it is very likely that safety technology and medical systems, for example, need such computation speed.

The new area of industrial sensors requires devices that can process information and behave in an intelligent manner. The newly developed device is capable of running machine learning code; however, it is very unlikely that a complete learning course will be run on the embedded hardware. Its ports are designed for easy interfacing with commonly used PLCs or even with its counterparts. Implementation of standard protocols is still to be done. Experiments are designed to achieve person-independent facial expression recognition and neural-network-based truck coordination to target points.
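Since the standard protocols are still to be implemented, the following is only a sketch of what the PLC-facing side could involve for one of the protocols named in the abstract, Modbus over the RS-232 port. It shows the CRC-16 used by Modbus RTU framing (initial value 0xFFFF, reflected polynomial 0xA001, CRC appended low byte first). It is a host-side Python illustration, not the DSP firmware, and the helper names `build_rtu_frame` and `frame_is_valid` are hypothetical.

```python
def modbus_crc16(data: bytes) -> int:
    """CRC-16 as used by Modbus RTU (init 0xFFFF, reflected poly 0xA001)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

def build_rtu_frame(address: int, function: int, payload: bytes) -> bytes:
    """Assemble address, function code, payload, and CRC (low byte first)."""
    body = bytes([address, function]) + payload
    crc = modbus_crc16(body)
    return body + bytes([crc & 0xFF, crc >> 8])

def frame_is_valid(frame: bytes) -> bool:
    """A frame with a correct trailing CRC yields a CRC of zero overall."""
    return len(frame) >= 4 and modbus_crc16(frame) == 0

if __name__ == "__main__":
    # Example request: device 1, function 3 (read holding registers),
    # starting at register 0, ten registers.
    request = build_rtu_frame(1, 3, bytes([0x00, 0x00, 0x00, 0x0A]))
    print(request.hex(" "), frame_is_valid(request))
```

A table-driven CRC would be the usual choice on an embedded target where cycles matter; the bitwise form is used here because it is the easiest to check against the protocol description.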

Completed projects
Textures of equal "darkness" have been classified using templates optimized by a genetic algorithm [9]. A single template separated four different textures by transforming the average darkness of the pictures into four distinguishable levels. Textile error detection in which the pattern size is not larger than the chip size (e.g., 64x64) has been achieved. In operation, the device's optical input scans the surface of the textile while the device controls the sliding motor or gives a warning signal to the power loom (see Figure 3). The experiments were made on other equipment, but using only those operations that are available on the newly developed device.

Figure 3: Phases in the textile error detection: error-free texture; a knob appears in the ACE4k window; the knob clearly identified.

Acknowledgements
This work has been supported by the DICTAM (ESPRIT IS 1999-19007) project.

References
[1] L. O. Chua and L. Yang, "Cellular Neural Networks: Theory", IEEE Trans. on Circuits and Systems (CAS), Vol. 35, pp. 1257-1272, 1988.

[2] L. O. Chua and L. Yang, "Cellular Neural Networks: Applications", IEEE Trans. on Circuits and Systems (CAS), Vol. 35, pp. 1273-1290, 1988.

[3] T. Roska and L. O. Chua, "The CNN Universal Machine: An Analogic Array Computer", IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing (CAS-II), Vol. 40, No. 3, pp. 163-173, 1993.

[4] Á. Rodriguez-Vázquez, S. Espejo, R. Dominguez-Castro, and G. Linan, "The 64x64 Analog Input CNN Universal Machine Chip and its ARAM", Proceedings of International Symposium on Nonlinear Theory and Applications (NOLTA '98), pp. 667-670, Le Régent, Switzerland, ISBN 2-88074-381-5, 1998.

[5] S. Espejo, A. Rodriguez-Vázquez, R. A. Carmona, P. Földesy, Á. Zarándy, P. Szolgay, T. Szirányi, and T. Roska, "0.8 µm CMOS Two Dimensional Programmable Mixed-Signal Focal-Plane Array Processor with On-Chip Binary Imaging and Instruction Storage", IEEE Journal on Solid State Circuits, Vol. 32, No. 7, pp. 1013-1026, July 1997.

[6] Á. Zarándy, M. Csapodi, and T. Roska, "20 microsec Focal Plane Image Processing", Proceedings of IEEE Int. Workshop on Cellular Neural Networks and Their Applications (CNNA 2000), pp. 267-272, Catania, ISBN 0-7803-6344-2, 2000.

[7] Cs. Rekeczky, "Dynamic Spatio-Temporal Nonlinear Filtering and Detection on CNN Architecture - Theory, Modeling and Applications", Ph.D. Dissertation, Analogical and Neural Computing Systems Laboratory, Computer and Automation Institute, Hungarian Academy of Sciences, Budapest, 1998.

[8] uC/OS official web site: www.ucos-ii.com

[9] T. Szirányi and M. Csapodi, "Texture Classification and Segmentation by Cellular Neural Network using Genetic Learning", Computer Vision and Image Understanding (CVIU), Vol. 71, No. 3, pp. 255-270, 1998.
