JOURNAL OF COMPUTER SCIENCE AND ENGINEERING, VOLUME 12, ISSUE 1, MARCH 2012 5

Computer Architecture and Algorithms for High Performance Computing through Parallel and Distributed Processing

P. Venkata Subba Reddy

Abstract—High Performance Computing (HPC) is needed in many applications, from space science to Artificial Intelligence. HPC can be achieved through parallel and distributed computing. In this paper, parallel and distributed algorithms are discussed in relation to the parallel and distributed processors on which they run to achieve HPC. Programming concepts such as threads, fork, sockets and par do are discussed with some simple examples for HPC.

Index Terms—High Performance Computing, Parallel and Distributed Algorithms, Computer Architecture, Computer Programming


1 INTRODUCTION

Computer architecture and programming play an important role in High Performance Computing (HPC) for large applications, from space science to Artificial Intelligence []. Algorithms are problem-solving procedures that are later translated into a particular programming language for HPC, so there is a need to study algorithms for High Performance Computing. These algorithms must be designed to run in reasonable time on large problems such as weather forecasting, tsunamis, remote sensing, national calamities, defense, mineral exploration, finite-element analysis, Cloud Computing and Expert Systems.

The algorithms considered are non-recursive, recursive, parallel and distributed algorithms. The algorithms must be supported by the computer architecture. Computer architectures are characterized by Flynn's [2] classification: SISD, SIMD, MISD and MIMD. Most computer architectures support SIMD (Single Instruction, Multiple Data streams). The classes of computer architecture are VLSI processor, multiprocessor, vector processor and multiple processor (multicomputer) [1,3].

2 ALGORITHMS

There are non-recursive, recursive, parallel and distributed algorithms [5,10]. These algorithms, together with the corresponding computer programming and architectures, are discussed in the following sections.

© 2012 JCSE www.journalcse.co.uk

2.1 Non-Recursive Algorithms

Non-recursive algorithms are applied to problems systematically, and their efficiency is analyzed by counting a basic operation. Consider the algorithm for finding the maximum element of the array A[0 .. n-1]:

    maxval ← A[0]
    for i ← 1 to n-1 do
        if A[i] > maxval
            maxval ← A[i]
    return maxval

Let C(n) be the number of comparisons, where n is the input size. The comparison is executed once for each value of i from 1 to n-1, so

    C(n) = \sum_{i=1}^{n-1} 1 = n - 1 \in \Theta(n)

2.2 Recursive Algorithms

A recursive algorithm solves a problem by calling itself on a smaller instance; its cost is analyzed by counting the basic operation, which here is multiplication. For instance, n! = n * (n-1) * (n-2) * … * 1. Consider the algorithm F(n) for computing n!:

    if n = 0 return 1
    else return F(n-1) * n

The number of multiplications M(n) satisfies the recurrence

    M(n) = M(n-1) + 1 for n > 0, M(0) = 0

which solves to M(n) = n.

2.3 Parallel Algorithms

Parallel algorithms are designed for problems whose computations are independent of one another and can therefore be executed in parallel. For instance, the sum of the odd-numbered elements and the sum of the even-numbered elements of an array A[0 .. n-1] can be computed in parallel. In the algorithm below, oddsum accumulates the 1st, 3rd, 5th, … elements (indices 0, 2, 4, …) and evensum accumulates the 2nd, 4th, … elements (indices 1, 3, …); the two loops share no data and can run at the same time.

    oddsum ← 0, evensum ← 0

    for i ← 0 to n-1 step 2 do
        oddsum ← oddsum + A[i]
    return oddsum

    for i ← 1 to n-1 step 2 do
        evensum ← evensum + A[i]
    return evensum

Consider the algorithm for parallel matrix multiplication of A[n, n] and B[n, n]:

    for i ← 0 to n-1 do in parallel
        for j ← 0 to n-1 do in parallel
            C[i,j] ← 0
            for k ← 0 to n-1 do in parallel
                C[i,j] ← C[i,j] + A[i,k] * B[k,j]
    return C

The n * n entries C[i,j] are independent of one another. All iterations of the k loop update the same entry C[i,j], so the products A[i,k] * B[k,j] can be formed in parallel, but their accumulation must be carried out as a sum reduction.
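The matrix multiplication of Section 2.3 can be tried out with a small Python sketch. It is only an illustration under stated assumptions, not the paper's program: the function names `multiply_row` and `parallel_matmul` and the `workers` parameter are hypothetical, each row of the result is one independent task, and the sum over k is done sequentially inside each task. In CPython, threads share one interpreter lock, so the sketch shows the decomposition into independent tasks rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_row(row, b):
    """Return one row of the product: entry j is the sum over k of row[k] * b[k][j]."""
    n_cols = len(b[0])
    return [sum(row[k] * b[k][j] for k in range(len(b))) for j in range(n_cols)]


def parallel_matmul(a, b, workers=4):
    """Multiply matrices a and b (lists of lists), one task per row of a."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves input order, so row i of the result matches row i of a.
        return list(pool.map(lambda row: multiply_row(row, b), a))


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(parallel_matmul(a, b))  # [[19, 22], [43, 50]]
```

Splitting by rows keeps every task free of shared writes, which avoids the reduction problem of the innermost loop altogether.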

2.4 Distributed Algorithms

Distributed algorithms are designed for problems that are computed on distributed systems, where the executions are independent and distributed. Consider an algorithm that computes two transactions:

    if (fork()) {
        # Transaction 1 (runs in the parent process)
    }
    else {
        # Transaction 2 (runs in the child process)
    }

The algorithm is designed to execute on distributed systems.

3 PARALLEL AND DISTRIBUTED ARCHITECTURE AND PROGRAMMING

The parallel and distributed computer architecture of processors is defined through Flynn's classification (SISD, SIMD, MISD, MIMD).

[Figure: Parallel and Distributed Computer Systems. Parallel and distributed computers divide into tightly coupled multiprocessors and loosely coupled multicomputers, each either bus-based or switched.]

There are at least four classes of architecture: VLSI processor, multiprocessor, multiple processor (multicomputer) and vector processor. These architectures are discussed in the following sections.

3.1 VLSI Processor

A VLSI chip contains computer components such as processor arrays (Processing Elements, PEs), memory arrays and large-scale switching networks. The PEs communicate with one another to implement parallel algorithms on the VLSI chip.

[Figure: 4x4 mesh Processing Elements of VLSI Processor.]

Consider the parallel algorithm for the odd and even sums of n elements on a VLSI processor: oddsum and evensum are computed in parallel, with one of the PEs set to compute oddsum and another set to compute evensum.

A Perl program that uses threads in this way is given below. The statement "use threads" loads the threads module, with which one or more threads can be created. The program runs two independent routines in separate threads, one counting upward and one counting downward.

    use strict;
    use warnings;
    use threads;

    # Start two threads that run independently of each other.
    my $thr1 = threads->new(\&ascending);
    my $thr2 = threads->new(\&descending);

    # Print 1 .. 10 in ascending order.
    sub ascending {
        my $num = 0;
        do {
            $num = $num + 1;
            print " $num\n";
        } while ($num < 10);
    }

    # Print 10 .. 1 in descending order.
    sub descending {
        my $num = 10;
        do {
            print " $num\n";
            $num = $num - 1;
        } while ($num > 0);
    }

    # Wait for both threads to finish.
    $thr1->join;
    $thr2->join;

3.2 Multiprocessor

Multiprocessor computers have been modeled as n processors with shared memory, the Parallel Random Access Machine (PRAM). Parallel algorithms are implemented on the PRAM.

[Figure: Multiprocessor System. Processors P1, P2, …, Pn connected to a shared memory.]
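A shared-memory multiprocessor of this kind can be imitated with threads that read one common array. The following Python sketch is an illustration under assumptions, not the paper's program: the names `strided_sum` and `odd_even_sums` are hypothetical, the list `values` plays the role of the shared memory, and each thread stands in for one processor. As in Section 2.3, oddsum covers indices 0, 2, 4, … and evensum covers indices 1, 3, 5, ….

```python
import threading


def strided_sum(values, start, results, key):
    """Add values[start], values[start + 2], ... and store the total under key."""
    results[key] = sum(values[start::2])


def odd_even_sums(values):
    """Compute oddsum (indices 0, 2, ...) and evensum (indices 1, 3, ...) in two threads."""
    results = {}
    workers = [
        threading.Thread(target=strided_sum, args=(values, 0, results, "oddsum")),
        threading.Thread(target=strided_sum, args=(values, 1, results, "evensum")),
    ]
    for worker in workers:
        worker.start()
    # Wait for both "processors" before reading their results.
    for worker in workers:
        worker.join()
    return results["oddsum"], results["evensum"]


if __name__ == "__main__":
    print(odd_even_sums([1, 2, 3, 4, 5]))  # (9, 6)
```

Each thread writes to its own key, so no locking is needed; only the input array is shared, and it is read-only.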

Consider the parallel algorithm for the odd and even sums on a multiprocessing system. One processor computes oddsum:

    oddsum ← 0
    for i ← 0 to n-1 step 2 do
        oddsum ← oddsum + A[i]
    return oddsum

One of the PEs is set to evensum:

    evensum ← 0
    for i ← 1 to n-1 step 2 do
        evensum ← evensum + A[i]
    return evensum

The Perl program for the above problem uses fork for parallel processing.

3.3 Multi Computer System

A multicomputer consists of a number of computer systems connected by a sequence of routers and channels that form a message-passing interconnection network. Parallel algorithms are implemented over this network.

[Figure: Multi Computer System. Processor (P) and memory (M) nodes connected by a message-passing interconnection network.]

Multicomputer System / Distributed System

Consider parallel/distributed algorithms in a multicomputer/distributed system. These algorithms can be computed in two ways: client/server and remote procedure calls.

Remote Procedural Computation

The client requests data from the server, and the server sends the data from the server buffer. In the Perl client program below, the client connects to the server and prints the data it receives, such as the server time.

[Figure: Remote Procedural Computation. The host kernel sends a request and the target kernel returns a reply.]

    #!/usr/bin/perl
    # Perl client program
    use strict;
    use warnings;
    use IO::Socket;

    # Connect to the server on the local machine, port 7008.
    my $socket = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1',
        PeerPort => 7008,
        Proto    => 'tcp',
    ) or die "Couldn't connect to Server\n";

    # Read up to 1024 bytes of the server's reply and print it.
    my $recv_data;
    $socket->recv($recv_data, 1024);
    if ($recv_data) {
        print "Received: $recv_data\n";
    }
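The request/reply exchange of this section can also be tried within a single process. The Python sketch below is an illustration under assumptions and not the paper's program: the function names `serve_once`, `request` and `round_trip` are hypothetical, the server runs in a background thread instead of on a separate machine, the operating system chooses a free port instead of the fixed port 7008, and the reply is the result of the small computation 2 + 3.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one client, send the result of a small computation, then close."""
    conn, _addr = server_sock.accept()
    with conn:
        conn.sendall(str(2 + 3).encode("ascii"))


def request(host, port):
    """Connect to the server and return everything it sends as text."""
    chunks = []
    with socket.create_connection((host, port)) as sock:
        while True:
            data = sock.recv(1024)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii")


def round_trip():
    """Run the server in a background thread and query it once."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(server_sock,))
        server.start()
        reply = request("127.0.0.1", port)
        server.join()
    return reply


if __name__ == "__main__":
    print(round_trip())  # 5
```

The client reads until the server closes the connection, so it does not depend on the whole reply arriving in a single recv call.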

    #!/usr/bin/perl
    # Perl server program
    use strict;
    use warnings;
    use IO::Socket;

    $| = 1;    # unbuffered output

    # Small computation whose result is sent to each client.
    sub sum { return 2 + 3; }

    # Listen on the local machine, port 7008.
    my $socket = IO::Socket::INET->new(
        LocalHost => '127.0.0.1',
        LocalPort => 7008,
        Proto     => 'tcp',
        Listen    => 5,
        Reuse     => 1,
    );
    die "Couldn't open socket" unless $socket;
    print "\nTCPServer Waiting for client on port 7008";

    # Serve clients one at a time.
    while (1) {
        my $buf           = sum();
        my $client_socket = $socket->accept();
        my $peer_address  = $client_socket->peerhost();
        my $peer_port     = $client_socket->peerport();
        print "\n I got a connection from ( $peer_address , $peer_port ) ";
        $client_socket->send($buf);
        close $client_socket;
    }

The host machine sends data to the target machines; each target machine processes the data and sends the result back to the host machine. The Perl program for a distributed algorithm may be implemented using sockets and fork.

3.4 Vector Processors

Supercomputers are modeled with vector processors. A supercomputer is specified by the 5-tuple

    M = <N, C, I, M, R>

where

    N = number of processors
    C = set of instructions
    I = set of instructions for parallel execution
    M = set of processors
    R = set of routing functions

[Figure: The Architecture of Vector Supercomputer. Components shown: scalar processor (scalar control unit, scalar functional pipelines), vector processor (vector control unit, vector registers, vector functional pipelines), main memory (program and data), host computer, mass storage, I/O (user).]

Parallel algorithms are designed to compute big problems such as weather forecasting, remote sensing, mineral exploration and oceanography in parallel on supercomputers. Consider the algorithm for parallel matrix multiplication of A[n x n] and B[n x n], where n is very large. Processor P[i,j] is assigned to entry C[i,j]:

    for i ← 0 to n-1 do in parallel
        for j ← 0 to n-1 do in parallel
            P[i,j] set to C[i,j] ← 0
            for k ← 0 to n-1 do in parallel
                A[i,k] and B[k,j] broadcast to P[i,j]
                C[i,j] ← C[i,j] + A[i,k] * B[k,j]
    return C

The program for the above parallel matrix multiplication using par do in FORTRAN is given by

    ! a must be zero on entry; the j loop accumulates into a(i,k) (a sum reduction)
    par do 300 i = 1, n
        par do 200 j = 1, n
            par do 100 k = 1, n
                a(i,k) = a(i,k) + b(i,j) * c(j,k)
    100     continue
    200 continue
    300 continue

4 CONCLUSION

High Performance Computing is required when problems involve large computations. HPC can be achieved through parallel and distributed algorithms. The parallel and distributed algorithms are discussed in relation to computer architecture, and the classes of algorithms and of computer architecture are described. Programming concepts such as threads, fork, sockets and par do are discussed for HPC with some simple examples. The examples can be extended to large problems such as Grid Computing and Cloud Computing. Usually FORTRAN is used for HPC; Perl and Java programming are also useful for HPC [11].

ACKNOWLEDGMENT

The author wishes to thank the B.Tech., M.Tech. and Ph.D. students for their help.

REFERENCES

[1] Kai Hwang, Advanced Computer Architecture, McGraw-Hill, New Delhi, 1993.
[2] M. J. Flynn, "Some Computer Organizations and Their Effectiveness", IEEE Transactions on Computers, vol. C-21, no. 9, pp. 948-960, 1972.
[3] K. Hwang, "Advanced Parallel Processing and Supercomputer Architecture", Proceedings of IEEE, vol. 75, 1987.
[4] K. Hwang and F. A. Briggs, Computer Architecture and Parallel Processing, McGraw-Hill, New Delhi, 1992.
[5] Aho, Hopcroft and Ullman, Design and Analysis of Computer Algorithms, Pearson, 2002.
[6] Martin Brown, Perl: The Complete Reference, Tata McGraw-Hill, New Delhi, 2001.
[7] John C. Knight, "The current status of super computers", Computers & Structures, vol. 10, issues 1-2, pp. 401-409, April 1979.
[8] Horst D. Simon, Erich Strohmaier, Jack J. Dongarra, Hans W. Meuer, "The marketplace of high-performance computing", Parallel Computing, vol. 25, issues 13-14, pp. 1517-1544, December 1999.
[9] Guillermo L. Taboada, Sabela Ramos, Roberto R. Expósito, Juan Touriño, Ramón Doallo, "Java in the High-Performance Computing arena: Research, practice and experience", Science of Computer Programming, July 2011.
[10] N. Sim, D. Konovalov, D. Coomans, "High-Performance GRID Computing in Chemoinformatics", Comprehensive Chemometrics, pp. 507-539, 2009.
[11] P. Venkata Subba Reddy, "Object-Oriented Software Engineering through Java and Perl", CiiT International Journal of Software Engineering and Technology, July 2010.

Dr. P. Venkata Subba Reddy was Professor and Head, Department of Computer Science and Engineering, MeRITS, Udayagiri, India, during 2006-07. He is currently Associate Professor in the Department of Computer Science and Engineering, College of Engineering, Sri Venkateswara University, Tirupati, India, where he has worked since 1992. He received his Ph.D. in Artificial Intelligence in 1992 from Sri Venkateswara University, Tirupati, India. He held a postdoctoral/visiting fellowship in fuzzy algorithms under Prof. V. Rajaraman, SERC, IISc/JNCASR, Bangalore, India. He is actively engaged in teaching and research with B.Tech., M.Tech. and Ph.D. students. His research areas are fuzzy systems, database systems, software engineering, expert systems and natural language processing. He has published papers in reputed national and international journals. He is an Editor for JCSE.
