Book Review
Meng Joo Er and Richard J. Oentaryo, Nanyang Technological University, SINGAPORE

Computational Intelligence: Methods and Techniques, by Leszek Rutkowski, ISBN: 978-3-540-76287-4, Berlin: Springer-Verlag, 2008.

Digital Object Identifier 10.1109/MCI.2011.942585. Date of publication: 20 October 2011.
This book is about the methods and techniques popularly employed in the field of computational intelligence (CI). According to [1], CI can be broadly defined as “the branch of science studying problems for which there are no effective computational algorithms”. These include problems such as extracting meaning from perception, understanding language, and solving ill-defined vision tasks, for which formulating exact, analytical algorithms is either not possible or computationally very demanding. Seen from this viewpoint, it is instructive to define CI based on the problems studied, rather than the types of methods used to solve them. Nevertheless, the term CI has been interpreted in different, and somewhat narrower, ways. In [2], for instance, a system is regarded as computationally intelligent when it “deals only with numerical (low level) data, has a pattern recognition component, and does not use knowledge in the artificial intelligence (AI) sense”. Also, the book [3] introduces CI as “the study of adaptive mechanisms to enable or facilitate intelligent behavior in complex and changing environments. As such, computational intelligence combines
artificial neural networks, evolutionary computing, swarm intelligence, and fuzzy systems.” These are the main topics actively pursued by the present CI community, but such a definition seems to leave aside other key subjects such as the Bayesian foundations of learning, probabilistic reasoning, kernel methods, meta-learning algorithms, etc.

This book focuses on the fields of neural networks, fuzzy and rough systems, as well as evolutionary computation, in a similar vein to [3]. It provides in-depth treatment of both single (pure) CI techniques developed in each field and hybrid CI techniques that combine them. The book consists of ten chapters in total. In Chapter 1, CI is introduced as “solving various problems of artificial intelligence with the use of computers to perform numeral calculations”. This bears some similarity to the definition given in [1], but may not capture methods and techniques developed in classical AI well. It is further mentioned that such calculations involve the application of six major techniques: neural networks, fuzzy logic, evolutionary algorithms, rough sets, uncertain variables, and probabilistic methods. Some of these techniques are covered extensively in the book.
Chapter 2 presents selected key issues in AI, starting with some background information on the various definitions of AI as well as the classic Turing test and the “Chinese room” paradigm. The chapter proceeds by introducing and discussing selected issues in a number of AI fields: expert systems, robotics, natural language processing, and heuristics and search strategies. The next part of the chapter discusses several issues in cognitive science and modeling, aiming at understanding the human mind and its underlying phenomena. Following this are introductions to ant colony optimization, artificial life, and intelligent software bots. Some important perspectives on AI development are then given at the end of the chapter, summarizing the views of different experts in the field on what future machine intelligence would or should be.

Chapter 3 provides a comprehensive review of rough set theory (RST) as a means for granular knowledge representation, accompanied by many numerical illustrations. The chapter starts with a series of basic definitions in RST, such as information system, decision table, indiscernibility relation, lower/upper approximation of a set, positive/negative/boundary region of a set, lower/upper approximation of a family of sets, positive/negative/boundary region of a family of sets, etc. Subsequently, the chapter presents the analysis of decision tables based on RST, with which the dependencies among features (attributes) in an information system can be examined. The last part of the chapter is dedicated to the Learning from Examples based on Rough Sets (LERS) software, which can automatically generate a set of rules from entered examples and test the generated rules or rules prepared independently.
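To make the rough-set vocabulary above concrete, the following minimal Python sketch (written for this review, not taken from the book) computes indiscernibility classes for a toy information system and the lower and upper approximations of a target set; the objects, attributes, and values are invented purely for illustration.

```python
from collections import defaultdict

# Toy information system: objects described by condition attributes.
# The data and attribute names are invented purely for illustration.
objects = {
    "x1": {"headache": "yes", "temp": "high"},
    "x2": {"headache": "yes", "temp": "high"},
    "x3": {"headache": "no",  "temp": "normal"},
    "x4": {"headache": "no",  "temp": "high"},
}
target = {"x1", "x3"}  # the set X we want to approximate

def indiscernibility_classes(objs, attributes):
    """Group objects that are indistinguishable on the chosen attributes."""
    classes = defaultdict(set)
    for name, desc in objs.items():
        key = tuple(desc[a] for a in attributes)
        classes[key].add(name)
    return list(classes.values())

def approximations(objs, attributes, X):
    """Return the lower and upper approximations of X."""
    lower, upper = set(), set()
    for cls in indiscernibility_classes(objs, attributes):
        if cls <= X:   # class fully contained in X -> lower approximation
            lower |= cls
        if cls & X:    # class intersects X -> upper approximation
            upper |= cls
    return lower, upper

lower, upper = approximations(objects, ["headache", "temp"], target)
print("lower:", lower)            # {'x3'}: x1 is indiscernible from x2, which lies outside X
print("upper:", upper)            # {'x1', 'x2', 'x3'}
print("boundary:", upper - lower)
```

The boundary region, the difference between the upper and lower approximations, contains exactly those objects that the chosen attributes cannot classify unambiguously.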
Chapters 4 and 5 are essentially devoted to fuzzy set theory. In Chapter 4, the basic terms and definitions of (type-1) fuzzy sets are first introduced, followed by detailed discussions of basic fuzzy set and algebraic operations, the extension principle, fuzzy numbers, triangular norms and negations, and fuzzy relations. Subsequently, the chapter elaborates on the concept of approximate reasoning as well as methods for constructing fuzzy inference systems. Finally, some illustrative examples of the applications of fuzzy set theory in forecasting, planning, and decision-making domains are presented. Building upon the principles described in Chapter 4, Chapter 5 presents the extended concept of type-2 fuzzy sets, beginning with their basic definitions, set operations, and type-2 fuzzy relations. Based on these principles, the concept of type reduction is then presented, which involves transforming type-2 fuzzy sets into type-1 fuzzy sets prior to output defuzzification. The chapter concludes by presenting methods for designing type-2 fuzzy inference systems.
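As a small illustration of the type-1 machinery covered in Chapter 4, the sketch below evaluates the firing strength of a single fuzzy rule using a triangular membership function and two common triangular norms; the rule, membership parameters, and input values are invented for illustration and are not examples from the book.

```python
def triangular(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Two common triangular norms (t-norms) used to combine rule antecedents.
def t_norm_min(u, v):
    return min(u, v)        # minimum t-norm

def t_norm_product(u, v):
    return u * v            # algebraic product t-norm

# Invented rule: IF temperature is "warm" AND humidity is "high" THEN fan speed is "fast".
temp, humidity = 26.0, 75.0
mu_warm = triangular(temp, 20.0, 27.0, 34.0)       # degree to which 26 degC is "warm"
mu_high = triangular(humidity, 60.0, 80.0, 100.0)  # degree to which 75% is "high"

# Firing strength of the rule under each t-norm.
print("min t-norm:    ", t_norm_min(mu_warm, mu_high))
print("product t-norm:", t_norm_product(mu_warm, mu_high))
```

In a complete Mamdani-type inference system, such firing strengths would then clip or scale the consequent fuzzy sets before aggregation and defuzzification.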
Chapter 6 is about artificial neural networks. First, several popular mathematical models of the (biological) neuron are introduced, such as the perceptron, Adaline, sigmoidal, and Hebb neurons. The next part discusses the structure and functioning of multilayer feed-forward networks. A number of popular neural learning methods, e.g., the back-propagation, variable-metric, Levenberg-Marquardt, and recursive least squares algorithms, are also presented, along with discussions on the issue of selecting network structures. In addition, the chapter describes the idea of recurrent neural networks (having feedback connections), with prime examples being Hopfield, Hamming, Elman, real-time recurrent networks, and bidirectional associative memory. The chapter then discusses self-organizing neural networks with (unsupervised) competitive learning abilities, including winner-takes-all and winner-takes-most methods. The final part of the chapter covers the topics of adaptive resonance theory, radial basis function, and probabilistic neural networks.

Evolutionary algorithms (EA) constitute the main theme of Chapter 7. The chapter begins by discussing three major types of optimization methods: analytical, enumerative, and random optimization, and then how evolutionary algorithms differ from these traditional optimization methods. Subsequently, four families of evolutionary algorithms are presented: genetic algorithms, evolutionary strategies, and evolutionary and genetic programming. In addition, some advanced issues in EA are described, covering subjects such as the exploration-exploitation tradeoff, selection procedures, fitness scaling, reproduction procedures, coding of chromosomes, as well as crossover and mutation operations. The last part of the chapter describes hybrid techniques combining EA with neural networks and fuzzy systems, commonly referred to in the literature as evolutionary neural networks and evolutionary fuzzy systems, respectively.

Chapter 8 presents various algorithms for automatic data partitioning and clustering. This is a relatively short chapter, comprising only 21 pages. First, the definitions of hard, fuzzy, and possibilistic partitions are given. Next, several distance metrics commonly adopted in clustering methods are presented. The subsequent part of the chapter describes the five most popular data clustering algorithms: hard c-means (also widely known as k-means) clustering, fuzzy c-means clustering, possibilistic c-means clustering, the Gustafson-Kessel algorithm, and the fuzzy maximum likelihood estimates algorithm. Finally, the chapter provides several validity measures that can be used to evaluate the quality of clustering results.
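To give a flavor of the clustering methods listed for Chapter 8, here is a compact fuzzy c-means sketch (again written for this review rather than reproduced from the book); the synthetic two-blob data, the fuzzifier m = 2, and the stopping tolerance are illustrative choices.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal fuzzy c-means: X is (n_samples, n_features), c clusters, fuzzifier m."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)            # fuzzy partition: each row sums to 1
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]        # membership-weighted means
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / (dist ** (2.0 / (m - 1.0)))             # standard membership update
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Two illustrative blobs around (0, 0) and (5, 5).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(5, 0.5, (20, 2))])
centers, U = fuzzy_c_means(X, c=2)
print(centers)   # approximately the two blob centers
```

Replacing the soft membership update with a hard nearest-center assignment recovers ordinary k-means, which is one way to see how the hard and fuzzy c-means variants discussed in the chapter relate.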
Chapters 9 and 10 are centered around the topic of neuro-fuzzy systems (NFS), a hybrid approach that synergizes the learning, fault-tolerance, and parallelism traits of neural networks with the linguistic and approximate reasoning features of fuzzy rule-based systems. Three major types of NFS are described in Chapter 9: the Mamdani, logical, and Takagi-Sugeno models. For comparative analysis of these models, a number of approximation (regression) and classification datasets are used. The particular focus of this chapter is comparing the (Mamdani, logical, or Takagi-Sugeno) models with and without weights that reflect the importance of individual fuzzy rules and of input linguistic variables. It is concluded that incorporating the weights can significantly improve the performance of the models. In the second part of the chapter, learning algorithms for the NFS models are presented, focusing mainly on the back-propagation method. Lastly, the chapter discusses several model evaluation criteria that account for both system performance (e.g., error rate) and complexity (e.g., number of rules).
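The effect of the rule-importance weights discussed above can be illustrated with a zero-order Takagi-Sugeno sketch; the Gaussian membership functions, the two rules, and the weight values are hypothetical and are not taken from the book's experiments.

```python
import math

def gauss(x, center, sigma):
    """Gaussian membership function."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

# Zero-order Takagi-Sugeno rules: membership parameters, constant consequent, rule weight.
# All numbers are invented for illustration.
rules = [
    {"center": 2.0, "sigma": 1.0, "consequent": 0.5, "weight": 1.0},
    {"center": 6.0, "sigma": 1.5, "consequent": 3.0, "weight": 0.4},
]

def ts_output(x):
    """Weighted average of rule consequents, with firing strengths scaled by rule weights."""
    num = den = 0.0
    for r in rules:
        strength = r["weight"] * gauss(x, r["center"], r["sigma"])
        num += strength * r["consequent"]
        den += strength
    return num / den if den else 0.0

print(ts_output(4.0))   # output shifts toward rule 1 because rule 2 carries a lower weight
```

Setting both weights to 1.0 recovers the unweighted model, so the weights act as a tunable prior on how much each rule contributes to the aggregated output.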
The content of Chapter 10 is largely based on the book author's own research on the so-called flexible neuro-fuzzy system (flexible NFS) [4]. The main idea of the flexible NFS is to have the inference procedure (Mamdani or logical type) emerge as a result of learning rather than being specified and fixed a priori. This is achieved via the use of specially designed adjustable triangular norm operators. In the first part of the chapter, an introduction to the concepts of soft, parameterized, and adjustable triangular norms is given. Building upon these concepts, flexible NFS models that can smoothly change between the Mamdani and logical types are then formulated. The second part of the chapter describes the back-propagation learning procedures used to adapt the flexible NFS model, and finally, simulation studies using two approximation and classification datasets are presented to evaluate the efficacy of the proposed model.

All in all, this book offers a good reference to the state-of-the-art methods and techniques in CI. It contains hundreds of examples to illustrate various concepts of CI, and is perhaps the only book that presents fuzzy and rough set theories in such a friendly manner. One may find, however, that the book is not sufficiently comprehensive, owing to the lack of coverage of several prominent CI algorithms. For example, despite being a popular CI approach, swarm intelligence algorithms (e.g., ant colony optimization) receive only a brief introduction in Chapter 2 and are not elaborated further in the subsequent chapters. Other approaches not covered in the book include Bayesian methods, reinforcement learning, feature extraction and selection, ensemble systems, etc. It would also have been useful to link back and show how the methods and techniques described in Chapters 3-10 can be used to address the key AI issues raised in Chapter 2. Regardless, this book fulfills its main objective, which is to provide in-depth treatment of contemporary CI approaches,
from basic topics such as neural networks, fuzzy systems, and evolutionary algorithms to advanced topics such as flexible neuro-fuzzy systems. Hence, the book should serve as a useful reference for both established and young researchers.

Over the years, CI has come a long way and has been successfully employed in numerous real-world applications. Given the plethora of methods and techniques in CI, however, there is a need to formulate an integrative theory that can provide sound foundations and directions for the further development of the field. Several proposals for CI foundations have been outlined in [1], such as cognition and computing as compression, meta-learning via search in model spaces, a (dis)similarity-based framework for meta-learning, and a more general approach based on chains of transformations. A promising area in which such theories can be applied is the development of integrated cognitive architectures [5], which constitute generic blueprints for building intelligent agents, integrating and testing various models of knowledge representation, reasoning, learning, etc. Building such architectures would make it possible to create large-scale, multicomponent intelligent systems that are substantially more effective than those of today. This would, in turn, open the door to novel applications in tasks requiring robust learning and social cognition, thereby eventually realizing true, human-like machine intelligence.

References
[1] W. Duch and J. Mandziuk, Eds., Challenges for Computational Intelligence (Studies in Computational Intelligence, vol. 63). Berlin: Springer-Verlag, 2007.
[2] J. C. Bezdek, “What is computational intelligence?” in Computational Intelligence: Imitating Life. New York: IEEE Press, 1994, pp. 1–12.
[3] A. P. Engelbrecht, Computational Intelligence: An Introduction. Hoboken, NJ: Wiley, 2003.
[4] L. Rutkowski and K. Cpalka, “Flexible neuro-fuzzy systems,” IEEE Trans. Neural Networks, vol. 14, pp. 554–574, 2003.
[5] W. Duch, R. J. Oentaryo, and M. Pasquier, “Cognitive architectures: Where do we go from here?” in Frontiers in Artificial Intelligence and Applications, vol. 171, P. Wang et al., Eds. IOS Press, 2008, pp. 122–136.