UNIVERSITÀ DI ROMA “LA SAPIENZA”

FACOLTÀ DI SCIENZE MATEMATICHE, FISICHE E NATURALI

Dottorato in Fisica - Ph.D. in Physics XIX Ciclo November 2006

THESIS:

Statistical mechanics approach to language games

Candidate: Andrea Baronchelli
Supervisors: Prof. V. Loreto and Prof. L. Pietronero

in memory of Ulisse, a beloved cat

Contents

Acknowledgments  ix

List of publications  xi

Introduction  1

1 Modeling the emergence of language  7
  1.1 Introduction  7
  1.2 A landscape for different models  8
  1.3 The Naming Game  9
      1.3.1 The origin  9
      1.3.2 The model  11
      1.3.3 Behind the Naming Game  13
      1.3.4 Stages in language games  14
  1.4 The Evolutionary Language Game  16
  1.5 Models of opinion dynamics  17
  1.6 Interaction topologies  20

2 The Naming Game  23
  2.1 Introduction  23
  2.2 The model  24
      2.2.1 Basic phenomenology  26
  2.3 The role of the system size  28
      2.3.1 Scaling relations  28
      2.3.2 Rescaling curves  31
  2.4 The approach to convergence  34
      2.4.1 The domain of agents  34
      2.4.2 The domain of words  36
      2.4.3 Network view - The disorder-order transition  37
      2.4.4 The overlap functional  38
  2.5 Single games  41
  2.6 Convergence Word  45
  2.7 Symmetry Breaking - A controlled case  47
  2.8 Discussion and conclusions  49

3 The role of topology  53
  3.1 Introduction  53
  3.2 Regular lattices  54
      3.2.1 Analytical approach for 1-d lattices  57
      3.2.2 Extensions to higher dimensions  62
  3.3 Small-world networks  64
      3.3.1 A crossover between two regimes  65
  3.4 Complex networks  70
      3.4.1 Networks definition and main properties  71
      3.4.2 Direct and reverse Naming Game  73
      3.4.3 Global quantities  75
      3.4.4 Clusters statistics  78
      3.4.5 Effect of the degree heterogeneity  78
      3.4.6 Effect of the average degree and clustering  83
      3.4.7 Effect of hierarchical structures  86
      3.4.8 The role of community structures  88
  3.5 Conclusions  91

4 Microscopic activity patterns  95
  4.1 Introduction  95
  4.2 Agents activity on different topologies  96
  4.3 Master equation approach to agents internal dynamics  100
      4.3.1 Transition rates in the reorganization region  101
      4.3.2 Transition rates during the convergence process  103
      4.3.3 Validation of the adiabatic approximation  104
      4.3.4 General expression of the adiabatic solutions  105
  4.4 Explicit solution for some interesting cases  106
      4.4.1 The case of homogeneous networks  107
      4.4.2 High-degree nodes in heterogeneous networks  108
      4.4.3 Power-laws on the complete graph  109
  4.5 Conclusions and discussion  110

5 Beyond the Naming Game  113
  5.1 Introduction  113
  5.2 Stochastic update  114
      5.2.1 Some remarks on the generalized Naming Game  119
  5.3 Asymmetric update  120
      5.3.1 Stochastic asymmetric update  122
  5.4 Deterministic word-selection strategies  124
      5.4.1 The play-smart strategy  127
  5.5 A not efficient Naming Game  130
  5.6 Some remarks on weights  132
  5.7 Conclusions  135

6 Random walk on complex networks  137
  6.1 Introduction  137
  6.2 A new process - rings  139
  6.3 Explicit calculation for random graphs  141
      6.3.1 Static  141
      6.3.2 Dynamics  143
  6.4 Extensions to other networks  147
  6.5 Conclusions  151

Conclusions and outlook  153

A A short overview of complex networks  157
  A.1 Measures  157
  A.2 Examples of real networks  160
  A.3 Models  162

Bibliography  167

Acknowledgments

First of all I must thank Vittorio Loreto, with whom I have had the chance to work during these three years of PhD. With generosity, patience and trust he has guided me, always letting me follow my curiosity. I am profoundly grateful to him. I am also obliged to Luciano Pietronero, for having offered me the possibility to enter and work in the fascinating field of statistical mechanics and complex systems, and for the enlightening discussions we have had. Working with them has been a pleasure and an honor for me.

I have also benefited from close collaborations with fantastic people. First of all Luc Steels, whose enthusiasm made me discover the intriguing domain of Semiotic Dynamics. Then l'extraordinaire Alain Barrat, Emanuele Caglioti, Maddalena Felici, and, of course, my "Parmesan" great pal, Luca Dall'Asta. I believe they have all enriched me, and most of the work presented in this thesis would certainly not have been done without them. I must also mention Ciro Cattuto and Andrea Puglisi: our collaboration is more recent, but no less precious.

I spent most of the time of this thesis in the very stimulating environment of the Department of Physics at "La Sapienza" University in Roma. The presence of many people working in statistical mechanics, and the close ties with the INFM-CNR Institute of Complex Systems, have allowed me to meet many interesting people. Among them, I am particularly thankful to Andrea Baldassarri, Guido Caldarelli, Emmanuele Cappelluti, Claudio Castellano, Fabrizio Coccetti, Andrea Gabrielli, Francesco Rao, Vito Servedio, Francesco Sylos Labini and Stefano Zapperi for their kindness and for useful discussions. Then, of course, I cannot avoid mentioning my PhD "colleagues" with whom I shared, in different periods, the same office: Valentina Alfi, Benedetta Cerruti, Giulia De Masi, Miguel Ibañez de Berganza, Francesco Rizzo and Daniele Vilone. We have been companions of many nice moments.

Open exchanges of ideas have contributed to shape this thesis, and I thank Melanie Aurnhammer, Joachim de Beule, Bart De Vylder, Frédéric Kaplan, Martin Lötzsch, James Minett, Remi van Trijp, Wouter Van den Broeck, and William Wang. I also wish to acknowledge the Laboratoire de Physique Théorique of the Paris XI University at Orsay, for the hospitality I received in the fall of 2004.

I spent two pleasant and fruitful months there working in the group of Alain Barrat. Finally, I would like to thank Alessandro Vespignani, who accepted to be the external referee of my Ph.D. program and of this thesis, for his suggestions and encouragement.

Roma, November 2006

List of publications

The results presented in this thesis have been the object of a series of publications, which we list here grouped by topic.

• The Naming Game model (Chapter 2)

– A. Baronchelli, M. Felici, E. Caglioti, V. Loreto and L. Steels, Sharp transition towards shared lexicon in multi-agent systems, J. Stat. Mech. P06014 (2006).

– A. Baronchelli, E. Caglioti, V. Loreto and L. Steels, Complex systems approach to language games, Proceedings of the Second European Conference on Complex Systems ECCS'06 (2006).

– A. Baronchelli, M. Felici, E. Caglioti, V. Loreto and L. Steels, Self-organizing communication in language games, Proceedings of the First European Conference on Complex Systems ECCS'05 (2005).

• The role of topology on the Naming Game model (Chapter 3)

– A. Baronchelli, L. Dall'Asta, A. Barrat and V. Loreto, Topology induced coarsening in language games, Phys. Rev. E 73:015102(R) (2006).

– L. Dall'Asta, A. Baronchelli, A. Barrat and V. Loreto, Agreement dynamics on small-world networks, Europhys. Lett. 73:969 (2006).

– L. Dall'Asta, A. Baronchelli, A. Barrat and V. Loreto, Non-equilibrium dynamics of language games on complex networks, Phys. Rev. E 74:036105 (2006).

– A. Baronchelli, L. Dall'Asta, A. Barrat and V. Loreto, Bootstrapping communication in language games. Strategy, topology and all that, in "The Evolution of Language", ed. by A. Cangelosi, A. D. M. Smith and K. Smith, World Scientific Publishing Company (2006).

• Agents microscopic activity in the Naming Game (Chapter 4)

– L. Dall'Asta and A. Baronchelli, Microscopic activity patterns in the Naming Game, J. Phys. A: Math. Gen. 39:14851 (2006).

• Modified rules for the Naming Game (Chapter 5)

– A. Baronchelli, L. Dall'Asta, A. Barrat and V. Loreto, Nonequilibrium phase transition in negotiation dynamics, submitted for publication (2006).

– A. Baronchelli, L. Dall'Asta, A. Barrat and V. Loreto, Strategies for fast convergence in semiotic dynamics, Artificial Life X, edited by L. M. Rocha et al., pages 480-485, MIT Press (2006).

• Random walks on Complex Networks (Chapter 6)

– A. Baronchelli and V. Loreto, Ring structures and mean first passage time in networks, Phys. Rev. E 73:026103 (2006).

• Other works published during the PhD

– A. Baronchelli, E. Caglioti and V. Loreto, Measuring complexity with zippers, Eur. J. Phys. 26, S69-S77 (2005).

– A. Baronchelli, E. Caglioti and V. Loreto, Artificial sequences and complexity measures, J. Stat. Mech. P04002 (2005).

– A. Baronchelli, E. Caglioti, V. Loreto and E. Pizzi, Dictionary based methods for information extraction, Physica A 342, 294-300 (2004).

Introduction

In recent years, statistical mechanics and complex systems science have started to contribute to the study of the self-generating structures of language systems. Language can indeed be seen as a complex adaptive system [180], i.e. a set of socially accepted conventions constantly (re-)shaped by its users. In this thesis we introduce a simple model able to account for the emergence of a shared set of conventions in a population of agents, and we investigate it in depth, resorting both to numerical simulations and to analytical approaches. Our work answers some of the open questions in the study of language self-organization (e.g. the role of the system size, the role of topology, etc.). Moreover, it introduces different elements of novelty, such as memory and feedback, in the context of physicists' approaches to social dynamics.

Physics and complex systems. - In the last decades physicists have devoted great attention to the study of collective phenomena and complex systems. Large systems made up of simple components (for instance atoms or molecules, animal or human agents) can in fact self-organize, i.e. "acquire a functional, spatial or temporal structure without specific interference from the outside" [104]. More precisely, the constituents of such systems are able to develop a complex collective behavior that is not trivially deducible from the knowledge of the rules governing their mutual interactions [186, 187]. A classical example is the Ising model, in which many coupled spins produce the emergence of paramagnetic or ferromagnetic order. Often, however, the internal structure of the interacting entities has little impact on their collective behavior, which is largely determined only by the way in which they interact. This has made it possible to extend the field of application of statistical mechanics methods and tools to a wide range of systems, apparently much more complex than ideal ferromagnets, such as assemblies of molecules [42], of granular particles [77] and of biological entities [13] (bacteria, ants, birds, humans, etc.). Along this path, the recent past has witnessed an important development of the activities of statistical physicists in the area of social sciences (for a recent collection of papers see [97]).

Here single units are human beings, and both their internal description and their interaction rules are obviously extremely complex. However, simple models have been set forth to describe the emergence of global behaviors out of purely local rules. A typical example is provided by opinion dynamics models, in which agents update their state (or opinion) through interactions with other individuals, and the question is whether the population will end up with a single shared opinion, or whether more states will survive [175]. In general, the approach has been twofold. On the one hand, there has been the attempt to modify well-established models of statistical mechanics to capture the essence of the observed large-scale human dynamics [193]. On the other hand, many new agent-based models have been proposed. These are often quite hard to tackle with the usual analytical tools, so that computer simulations have acquired a central role. Yet, some of them, like for instance the Axelrod model [15], have attracted physicists' attention well beyond the problem they originated from, posing new and exciting challenges.

Language as a complex system. - As the interest of statistical physicists towards human dynamics was growing, an important revolution was taking place in linguistics. While the traditional paradigm, due to the work of Noam Chomsky, looked at language mainly as an aspect of individual psychology [62, 64], a more recent line of research has pointed out the importance of considering it as a product of social interactions [112]. Thus, the study of the self-organization and evolution of language and meaning has led to the idea that a community of language users can be seen as a complex dynamical system which collectively solves the problem of developing a shared communication system [180]. In this perspective, which has been adopted by the novel field of Semiotic Dynamics, the theoretical tools developed in statistical physics and complex systems science acquire a central role in the study of the self-generating structures of language systems.

From a scientific point of view, the interest of this subject is beyond question. Language is one of the most distinguishing traits of humans [63, 78, 65, 163], and it has always attracted considerable attention from scientists belonging to different fields [4]. Several different aspects have been investigated, like for instance the diversity of human languages [101, 58] or the development of language in children [150], not to mention the investigation of language structure [33]. But it is probably the fundamental issue of how language emerged and evolved [78, 113, 50] that has gained the most from the new contributions of complex systems science. Indeed, the possibility of testing different hypotheses resorting to computer-based models, or to experiments involving artificial agents, has shifted the focus onto how language evolved, reducing the speculation on why it appeared. Additionally, the issue of the emergence of language, traditionally of purely academic relevance, has been experiencing a strong renaissance due to new technological and social demands.

On the one hand, for example, new popular web-tools (such as del.icio.us [108] or Flickr [109]) enable human web-users to build up and maintain social networks founded on simple semantic systems, usually based on tags. By tracking the emergence of new tags, it is then possible to monitor the behavior of a real system which is, in some sense, self-organizing its own language [99, 57]. Thus, a deeper comprehension of how a language emerges would allow us to better interpret and understand social phenomena that are currently taking place. On the other hand, there is also a growing number of experiments in which artificial software agents or robots bootstrap a shared lexicon without human intervention [178, 120]. Their results will be needed when groups of robots have to deal autonomously with unforeseeable tasks in largely unknown environments, such as the exploration of distant planets or deep seas, hostile environments, etc. By definition it will not be possible to specify all the needed communication conventions and ontologies in advance, and robots will have to build up and negotiate their own communication system, situated and grounded in their ongoing activities [181]. Therefore, understanding the mechanisms that lead a population to converge on a shared communication system could provide helpful hints for the design of efficient agents and technological systems.

Complex networks. - Finally, a fundamental point has to be considered. Traditionally, both complex systems and semiotic dynamics models have mostly dealt with single units either able to interact with all the other units (the mean-field case), or sitting on the nodes of regular lattices. These situations are often unrealistic, but have the advantage of being accessible with the usual tools of statistical physics. Recently, however, the fast-growing field of complex networks has pointed out that many real systems can be represented as sets of nodes (the units of the system) connected by links (the interactions occurring between single units). Most of these networks, whether related to social, natural or technological systems, exhibit a set of peculiar properties whose characterization has become the object of an intense effort [6, 82, 162, 47]. For instance, most real networks are "small world" [192] and exhibit a scale-free degree distribution [16]. The first is the name given to the observation that the average distance between two nodes is small. The second is the fact that, defining the degree of a node as the number of links which connect it to other nodes, the degree distribution is heavy-tailed, often close to a power law. Hence, there are often a few individuals (the hubs) with a very large number of neighbors, whose impact on the different processes taking place on the network can be dramatic. In brief, complex network theory has become a fundamental element for the study and modeling of complex systems, since it provides crucial hints to address and understand relevant aspects of different real-world situations from a unified perspective.

In this thesis, we address the question of how new (linguistic) conventions, developed in local interactions among few individuals, can spread and become stable in the whole population.


In particular, we introduce a simple model in which agents build up a shared vocabulary out of pairwise negotiations and without any global coordination. Then, we investigate it resorting to the set of tools put forth in statistical mechanics to deal with complex systems. In general, we can say that our model fulfills two crucial requirements. First, it is reasonable from the point of view of linguistic dynamics. Second, it is very simple and allows for an extensive quantitative study. Indeed, we have defined it taking inspiration from a pre-existing model of semiotic dynamics, the so-called Naming Game [177, 176], and we have substantially reduced the complexity both of the agents' architecture and of their interaction rules. The choice to deal with the emergence of a shared vocabulary also meets a simplicity criterion, since this issue can be seen as the most basic step in the understanding of the emergence of language.

Our strategy has indeed generated some remarkable results. First of all, our effort towards simplicity has the valuable side effect of clarifying which elements are crucial to obtain the desired global coordination properties. As we shall see, they are indeed few and extremely simple, and this is one of the major contributions of this thesis. Then, we have performed an in-depth analysis of our model resorting both to numerical simulations and to analytical approaches. This kind of investigation, which is common in statistical physics, is on the other hand unprecedented in the field of semiotic dynamics. We have thus been able to answer some of the open questions in that field [179], identifying, for instance, the scaling laws governing the memory usage and the time required to reach a consensus, and clarifying the impact of complex interaction topologies on the global behavior of the population. Finally, as will become clear, our model can be successfully applied to different situations in which individuals take part in a decision process, such as problems of opinion formation. In this context, for example, it introduces several new features, like negotiation, memory and feedback, which are fundamental ingredients of the other fields of research we have interacted with. All these aspects have been almost neglected in most opinion-dynamics models so far, but are indeed important characteristics of most human social dynamics.

The thesis is organized as follows. In the first chapter we provide some necessary background to understand the original work presented in this thesis. First, we review the main directions of research in the complex systems approach to the evolution of language. In particular, we concentrate on the Naming Game model that inspired us, and on the orthogonal approach of the Evolutionary Language Game. We also describe some of the opinion dynamics models studied by physicists which, as will become clear, contain several points of interest for us.

Finally, we stress the importance of the underlying topology defining the set of possible interactions among the agents. In particular, we point out the relevance of complex network theory, whose elements are introduced throughout the thesis when needed. For convenience, however, we also provide a short overview of complex network theory in Appendix A.

In the second chapter, we introduce our model and we investigate it in the mean-field case, i.e. considering a totally unstructured population in which each agent can in principle interact with everybody else [22]. Employing both numerical simulations and analytical approaches, we first describe the evolution of the basic quantities of the model, and then we investigate its dynamics in many respects. We look at such issues as the role of the system size, the mechanisms leading the population to reach a consensus, the different timescales involved in the convergence process, etc. Then, having gained a clear picture of our model, we discuss in detail the different elements that characterize it.

In the third chapter we take into account the role of different underlying topologies, which turn out to play a crucial role in the overall dynamics of our model [21, 73, 74, 18]. We begin with low-dimensional lattices, on which the analytical techniques developed in statistical physics can be applied, and we move on to more complex interaction patterns, such as homogeneous and heterogeneous networks. This series of results allows us to identify the topological properties mainly responsible for the particular behaviors observed in the different cases.

In the fourth chapter, we continue to address the role of the topology in our model, but we focus on the microscopic activity patterns of single agents, rather than on the behavior of the global quantities [72]. Interestingly, it turns out that there is a non-trivial relation between the memory of an agent and the number of its neighbors. By means of a master equation approach, we show that the different behaviors can be classified taking into account only the first two moments of the degree distribution of the underlying network. Moreover, we identify two different temporal regimes depending on how close the system is to convergence.

In the fifth chapter, we return to the definition of our model. More precisely, we modify different aspects of the microscopic rules in order to better understand their role, or to include new features. We investigate each variant and obtain several interesting results, along with stimulating hints for future work. For example, we show that the introduction of a simple parameter leads to a series of transitions to asymptotic states different from consensus, while other modifications allow for better global performance [20] or make it possible to naturally introduce a broadcasting rule replacing the pairwise interaction scheme. We also perform some other modifications in order to compare our results with those of other pre-existing models.

In the sixth chapter, finally, we address a different problem.


Inspired by the semiotic dynamics issue of how new conventions spread in a population of agents, we focus on the general problem of the mean first passage time of a random walker on a complex network [23]. We propose an approximate model that drastically simplifies the problem, mapping the original Markov chain onto a new process with a much smaller number of states. The method provides very accurate results when random graphs are considered, but it is also reliable when more complex topologies are taken into account.

Chapter 1

Modeling the emergence of language

1.1 Introduction

In this thesis we propose a simple model of Semiotic Dynamics that addresses the issue of the emergence of a shared lexicon in a population of agents. Of course, the topic has already been investigated in different fields ranging from Artificial Intelligence to Cognitive Psychology, and our work cannot leave these contributions out of consideration. On the contrary, we were decisively inspired by pre-existing approaches. Moreover, physicists have been attracted by human dynamics for quite a long time, and their contributions provided us with fundamental hints.

In this chapter we provide the reader with some necessary background to understand the original work that will be discussed in detail in the rest of the thesis. In Sec. 1.2 we analyze the fundamental directions adopted in complex systems research to study the emergence of language. Sec. 1.3 deals with the Naming Game, which has been the fundamental inspiration for the model we shall introduce in the next chapter. Then, the Evolutionary Language Game is also reviewed (Sec. 1.4), as it is an important example of an orthogonal approach to the same problems. Sec. 1.5, on the other hand, deals with opinion dynamics models, which are examples of statistical physicists' efforts to describe and capture social dynamics. As will become clear in the following chapters, our approach to language has many points of contact with this area. Finally, in Sec. 1.6 we introduce the fundamental problem of topology. Indeed, when addressing issues related to social phenomena, a crucial aspect is understanding how different interaction patterns influence the dynamics of the considered problems. Complex networks thus become fundamental tools to deal with.


1.2 A landscape for different models

Language is an abstraction used to refer to a set of conventions socially shared by a group. More precisely, it is the union of a system of symbols and the rules by which they are manipulated. How these conventions come to be accepted at the population level is a very interesting question which does not have a uniquely accepted answer. Interestingly, however, in the last decade there has been a growing effort to tackle it resorting to agent-based models, or communication games, and to mathematical approaches [45, 49, 65, 182, 189]. The general issue has often been restricted, in this context, to that of the emergence of a shared vocabulary (i.e. a set of word-meaning associations), although there are remarkable exceptions that deal with higher-level tasks [152, 153, 44]. The literature in this field is already very rich, and we do not want to give a complete review of it. Instead, we prefer to provide a general picture of the different approaches, in order to set our work into a precise line of research.

A first distinction among the proposed models concerns sociobiological and sociocultural explanations. The sociobiological approach [111, 164], to which the well-known Evolutionary Language Game [152] also belongs, is based on the assumption that successful communicators, enjoying a selective advantage, are more likely to reproduce than worse communicators. Moreover, communication strategies are innate and are transmitted genetically across generations. Thus, if one of them is better than the others, in an evolutionary time-span it will displace all its rivals, possibly becoming the unique strategy of the population. The term strategy acquires a precise meaning in the context of each particular model. For instance, it can be a strategy for acquiring the lexicon of a language, i.e. a function from samplings of observed behaviors to acquired communicative behavior patterns [111, 156, 157, 154], or it can simply coincide with the lexicon of the parents [152], but other possibilities exist [182].

In this thesis we introduce a model, inspired by the Naming Game first proposed in [177, 176], that belongs to the sociocultural family [114, 176, 119, 129]. Here, good strategies do not provide higher reproductive success, but only better communication abilities. Agents can select better strategies exploiting cultural choice and direct feedback in communications. Moreover, innovations can be introduced thanks to the inventing ability of the agents. Thus, global coordination emerges over cultural timescales, and language is seen as an evolving and self-organized system [180].

Another important degree of freedom that differentiates individual models is the choice of the transmission structure among individuals. The three main forms are the vertical, the oblique (or role-model) and the horizontal (or peer-to-peer) schemes [37, 59]. Vertical transmission refers to cultural transferral across generations, for instance from parents to children, and is usually associated with the sociobiological approach.

In this framework, the actors of the transmission play different roles, since at each time the population is heterogeneous, made of already informed people and uninstructed learners. Such heterogeneity is maintained also in the oblique approach, in which teachers and pupils are present, but in this case they do not necessarily belong to different generations. Finally, in horizontal transmission all agents are peers and can act both as teachers and as students in each interaction. As we shall discuss later, we adopt this last mechanism.

A further, fundamental, distinction among different models concerns the adopted mechanism of social learning, describing how stable dispositions are transmitted among individuals [37]. The two main approaches are the so-called observational learning model and the operant conditioning model [170]. In the first, often associated with the sociobiological approach [111, 157, 154, 152], observation is the main ingredient of learning, and statistical sampling of observed behaviors determines their acquisition. The second, on the other hand, emphasizes the inferential nature of communication, in which the stimulus and the response to a stimulus play a central role. In our work, we adopt the operant conditioning approach, as in [114, 176, 129], according to which language learning is mainly functional, i.e. directed toward the communication of meaning.

In summary, our approach, inspired by the Naming Game [177], follows the sociocultural approach, adopting the horizontal cultural transmission scheme and the operant conditioning model. The next two sections are devoted to a deeper analysis of the Naming Game and of the Evolutionary Language Game [152], respectively. Thus, the differences between these two somehow orthogonal approaches to the emergence of a vocabulary will be clarified with concrete examples. It is however worth mentioning that the distinctions we have just discussed, though extremely significant, have to be considered only as references to classify existing models, since various hybrid approaches exist [182].

1.3 The Naming Game

1.3.1 The origin

The Naming Game model constitutes a fundamental inspiration for the original work presented in this thesis. It was expressly conceived to explore the role of self-organization in the evolution of language [177, 176]. In the original paper [177], dated 1995 (even though published in 1996), Luc Steels focused mainly on the formation of vocabularies, i.e. sets of mappings between words and meanings (for instance physical objects). In this context, each agent develops its own vocabulary in a random private fashion. But agents are forced to align their vocabularies in order to obtain the benefit of cooperating through communication.

Thus, a globally shared vocabulary emerges, or should emerge, as a result of local adjustments of individual word-meaning associations. We now briefly discuss the first implementation of the model, proposed in [177] and in the immediately subsequent [176].

Consider a population of N agents. Each agent has an associated set of meanings M_a. For instance, a certain agent a_1 may want to communicate to another agent a_2 the presence of a vital object in their environment. This communication is crucial for a_1 because it helps it to recover the good it looks for, but it is also important for a_2, which may perform a similar communication in the future. Thus, both agents benefit from communication (reciprocity), but only if it is successful, i.e. if they share the same vocabulary. Each agent has a set of words, W_a, and can randomly associate a word w to a given meaning m. In case it would like to draw the attention of someone else to m, it will therefore use the word w. Of course, the communicative success of the considered agent depends on the fraction of the population that adopts the same (m, w) coupling. Crucially, however, an agent can change its associations. In particular, given a certain success rate with a particular association, the agent will evaluate the possibility of changing it, randomly assigning a new word to the meaning. In the first version of the model this evaluation is done according to a sigmoid function that strongly promotes new associations when the success rate is low and abruptly encourages conservation above a given threshold of successes.

The communication evolves through successive conversations, i.e. events that involve a certain number of agents (two, in practical implementations) and meanings. Conversations are particular cases of language games, which, as already pointed out by Wittgenstein [195, 194], can be used to describe linguistic behavior, even if they can also include non-linguistic behavior, such as pointing. The two agents play asymmetric roles: one of them acts as speaker and the other as hearer. The speaker tries to draw the hearer's attention to a particular meaning using previously known words, or can create new associations for new meanings. If the hearer is able to identify the object, the game is a success; otherwise the game is a failure and the speaker can show the object to the hearer (for instance pointing at it), so that, if the hearer knows no word for that meaning, it can store the newly heard word. The agents evaluate their success, and eventually change their associations. In [177] it is shown with an example that the population actually converges on the word to be linked with a given meaning.

This first version of the Naming Game is of course still somewhat involved and contains many parameters, some of which may seem arbitrary, like for instance the sigmoid function for the evaluation of the word-meaning associations. However, the original paper clearly pointed out the idea that language is an autonomous system that forms itself in a self-organizing process. Moreover, remarkably, it also suggested a way of implementing and testing this hypothesis using computer-based simulations.
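The scheme just described can be made concrete with a minimal sketch. The following Python fragment is not the implementation of [177]: the population size, the word pool, the running estimate of the success rate and the sigmoid parameters are assumptions introduced here purely for illustration.

```python
import random
import math

# Illustrative sketch of a conversation of the kind described above.
# All numerical choices below are assumptions, not values from [177].
N_AGENTS, N_MEANINGS = 10, 5
WORDS = [f"w{i}" for i in range(50)]     # finite pool of word forms

class Agent:
    def __init__(self):
        self.vocab = {}                                    # meaning -> word
        self.success = {m: 0.5 for m in range(N_MEANINGS)} # per-meaning success rate

    def word_for(self, m):
        # create a random private association the first time a meaning is needed
        if m not in self.vocab:
            self.vocab[m] = random.choice(WORDS)
        return self.vocab[m]

    def maybe_reinvent(self, m, steepness=10.0, threshold=0.3):
        # sigmoid evaluation: a low success rate strongly promotes replacing the
        # association, a rate above the threshold encourages keeping it
        p_change = 1.0 / (1.0 + math.exp(steepness * (self.success[m] - threshold)))
        if random.random() < p_change:
            self.vocab[m] = random.choice(WORDS)

def conversation(speaker, hearer):
    m = random.randrange(N_MEANINGS)       # meaning the speaker wants to convey
    word = speaker.word_for(m)
    hearer_knows = m in hearer.vocab
    success = hearer_knows and hearer.vocab[m] == word
    for agent in (speaker, hearer):
        # running estimate of the success rate (an assumption of this sketch)
        agent.success[m] = 0.9 * agent.success[m] + 0.1 * (1.0 if success else 0.0)
    speaker.maybe_reinvent(m)
    if hearer_knows:
        hearer.maybe_reinvent(m)
    elif not success:
        hearer.vocab[m] = word             # speaker points; hearer stores the word
    return success

agents = [Agent() for _ in range(N_AGENTS)]
for _ in range(20000):
    speaker, hearer = random.sample(agents, 2)
    conversation(speaker, hearer)
```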


1.3.2 The model

The model we have described above has gone through many adjustments. We now describe one of the latest formulations of the Naming Game, which constituted a fundamental starting point for the original work of this thesis. In particular, we focus on the version of the game proposed in the "Naming Game" section of [179], using some more formal contributions from [129].

A population of N agents is embedded in an environment containing a certain number of objects. Each object can be identified by different features, or meanings. For instance, a red ball could be thought of both as a "red object" and as a "round object", and each of these descriptions could assure a non-ambiguous discrimination of the ball in an environment in which there are no other spherical or red objects. In [179] it is assumed that, for every agent, each meaning always refers only to a single object, and that a given object is conceptualized in the same way by all agents. Thus, in this context, the words "meaning" and "object" are equivalent. We will return to this point in the next section. Thus, all agents share a vector D of m associations:

\[
D = \{(o_1, d_1), (o_2, d_2), \ldots, (o_m, d_m)\}. \tag{1.1}
\]

Agent i is then characterized by a second vector L, which is not shared, containing associations between words and meanings:

\[
L^i = \{(d_1, w_1^i), (d_1, w_2^i), \ldots, (d_j, w_2^i), \ldots, (d_m, w_l^i)\}, \tag{1.2}
\]

where ambiguity can arise both because a given word refers to more than one object (homonymy) and because an object is associated with different words (synonymy). Next, each agent is given a finite collection of words W, whose size is w, which is shared by all individuals. Thus, in this implementation of the model, words are not invented, but are given a priori. The number of word-meaning associations is n = w × d, and each pair (d_k^i, w_l^i) is given a strength value v_{kl}^i, with v_{kl}^i ∈ [0, 1]. Hence, the lexicon L^i can be defined as a matrix whose rows and columns specify the strength of the associations between meanings and words:

\[
L^i =
\begin{pmatrix}
v_{11}^i & v_{12}^i & \cdots & v_{1w}^i \\
v_{21}^i & v_{22}^i & \cdots & v_{2w}^i \\
\vdots   & \vdots   & \ddots & \vdots   \\
v_{d1}^i & v_{d2}^i & \cdots & v_{dw}^i
\end{pmatrix}. \tag{1.3}
\]

As we have seen in Section 1.2, matrix (1.3) must be dynamically transformed according to the social learning process and the cultural transmission rule. The first determines how the matrix elements change after an interaction, and the latter determines whose values change (i.e. those of both speaker and hearer, or some other configuration). In our context the adopted social learning mechanism is the naming game (which gives the model its name). As we have seen, it is played by two individuals in a common environment. The aim is to develop a common vocabulary. Individuals play either as speaker or as hearer, so that the interaction rules are not symmetric. The cultural transmission scheme, on the other hand, is peer-to-peer interaction, or horizontal transmission.

In the Naming Game only the speaker, i, determines a discriminative meaning for the objects in the environment, even if the hearer, j, would in principle also be able to do so. The speaker utters the word w_a^i that is associated with the meaning d_k^i with the highest value v_{ka}^i. This word is received by the hearer, which checks all the meanings associated with it. The hearer j then selects the meaning d_l^j with the highest score v_{lb}^j linking it to the word under consideration, w_a^i. If both meanings are the same, the game is a success, otherwise it is a failure. To check whether the game was a success or not, the hearer may use some non-linguistic capability, such as pointing at the considered object.

After the game, both agents update their lexical matrices (1.3). There are several ways in which this can be done, but in all cases a further parameter δ is needed, with δ ∈ [0, 1]. In case of success, both agents increase the strength of the winning associations by δ, i.e. v_{ka}^i → v_{ka}^i + δ and v_{lb}^j → v_{lb}^j + δ. Moreover, they decrease the strength of the competing associations by a quantity smaller than, or equal to, δ, performing the so-called lateral inhibition. In case of failure, on the other hand, the strength of the used associations is decreased, while that of competing pairs can be increased.

Numerical experiments have shown that the combination of the agent definitions and interaction rules we have just presented allows a population of N individuals to develop a shared lexicon, so that they are sufficient ingredients to achieve the goal the model was proposed for. According to [179], some points must be considered crucial for the success of the model, since "when any of these characteristics of the agent architecture or the game are eliminated, the system does not work". They are:

1. Agents must be able to perform multiple associations (one form, or word, can be associated with many meanings and one meaning with many forms);

2. An agent must be able to record a score for each association;

3. Agents must be able to invent new words when no forms are available (in the framework discussed above this corresponds to requiring that the number of words be larger than the number of meanings, w > m);

4. Agents must perform lateral inhibition;

5. Agents must get feedback in case of failure.

In particular, it is worth stressing that, concerning point 2, there is the explicit affirmation that "the score is necessary for the agent to decide which meaning or which form should be preferred in a particular interaction", and, most remarkably, that "when random choices are made, lexicons do not converge". Remarkably, as we shall see in the next chapter, our work proves that this is not true.
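To fix ideas, the following sketch implements a single game between two agents whose lexica are the d × w matrices of Eq. (1.3). It reflects one possible reading of the rules summarized above, not the implementation of [179]: the value of δ, the amount of lateral inhibition, the clamping of the scores to [0, 1], and the omission of the optional reinforcement of competing pairs after a failure are all simplifying assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, w = 5, 20            # number of meanings and of available word forms (w > d)
delta, inhibition = 0.1, 0.1   # illustrative values, not prescribed by [179]

def play_game(L_speaker, L_hearer):
    k = rng.integers(d)                    # meaning chosen by the speaker
    a = int(np.argmax(L_speaker[k]))       # word with the highest strength v_{ka}
    l = int(np.argmax(L_hearer[:, a]))     # meaning the hearer links to that word
    success = (l == k)
    for L, row in ((L_speaker, k), (L_hearer, l)):
        if success:
            L[row, a] = min(1.0, L[row, a] + delta)       # reinforce the used pair
            others = np.arange(w) != a                    # lateral inhibition
            L[row, others] = np.maximum(0.0, L[row, others] - inhibition)
        else:
            L[row, a] = max(0.0, L[row, a] - delta)       # penalize the used pair
    return success

# two agents with random initial lexica (strengths in [0, 1]), one interaction
lex_i, lex_j = rng.random((d, w)), rng.random((d, w))
play_game(lex_i, lex_j)
```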

1.3.3 Behind the Naming Game

To a physicist, the Naming Game we have just explained may seem quite complicated. Indeed, the next chapters of this thesis are devoted to introducing and studying a simplified model. However, from a different point of view the perspective is exactly reversed, and the model may already appear to be only an abstraction. In particular, when aiming at investigating the actual mechanisms that allow an experimental evaluation of theoretical ideas on the emergence of language, one has to consider a large set of problems arising from the need to embody virtual agents in real bodies. It is most remarkable that such an experiment, involving real robots called Talking Heads, has indeed been run in the late '90s by a consortium of universities and research centers in Tokyo, Paris, Brussels, Amsterdam, London, Cambridge, Lausanne and San Jose [179]. It is thus worth pointing out briefly some of the major assumptions behind the Naming Game.

First of all, it is assumed that each individual can direct the others' attention towards a given object, i.e. that agents can perform some non-linguistic actions, like for instance pointing. The second and third assumptions are that agents must be able to perceive a scene correctly, and to apply meanings to it. The technical and conceptual problems raised by this point are so complex that even their mere enumeration would take us too far from our discussion. Finally, there is a very subtle, yet crucial, point that the Naming Game does not consider. In the previous section we said that meanings and objects were equivalent. This is clearly a very strong, and in general false, assumption. Indeed, in realistic games agents cannot inspect each other's brains. There is no feedback about the meaning of a word, but only about its referent, i.e. the object, in our case. After a successful interaction the two agents can only know that they apply the same word to the same referent, and in case of failure the speaker can only show the object it wanted to speak about. But it is not possible to transfer the right intended meaning, and often more than one meaning allows one to distinguish the referent from other objects in the same environment.

This is the famous gavagai problem, in which an anthropologist has to understand what a native, speaking an unknown language, might mean when he utters "gavagai" while pointing to a white rabbit scurrying by [165]. The assumption that in the Naming Game agents get direct feedback about the meaning of a word translates into requiring that all agents share the same perception, that they already share a repertoire of meanings, that for every agent a particular meaning always picks out a single referent, and that the same referent is conceptualized by the same meaning by every agent.

In conclusion, we can briefly sketch the actual steps that two embodied agents have to go through to play a language game:

1. Both agents have to perceive the scene. This involves the acquisition of an image and a post-processing to be performed on the basis of the sensory characteristics of each part of it, for instance the color, shape and movement of each object;

2. The speaker must conceptualize the scene. It has to find a category, or a combination of categories, that distinguishes the referent from the other objects in the context, and that will act as the meaning in the communication;

3. The speaker has to verbalize this conceptualization. It must use its language system to find words and syntactic constructions expressing the meaning;

4. The hearer must proceed in the opposite direction. It must interpret the utterance in order to find out which conceptualization constitutes the meaning;

5. It then has to apply the meaning to identify the referent in the perceived scene;

6. Finally, the hearer must act upon the outcome of meaning identification. It points to the referent it has identified. In this step both agents must co-ordinate through the external world.

Obviously we shall completely neglect this full picture in our work, assuming that our agents can perform all the tasks required to communicate. It is however convenient, in our opinion, to bear in mind what we are actually assuming.

1.3.4 Stages in language games

In the Naming Game agents aim to give proper names to objects. Meanings correspond to objects, and the goal is to understand how a population is able to agree on the use of conventions.

Stage | Meaning              | Form           | Breakthrough
------|----------------------|----------------|-----------------------
I     | Individuals          | Proper Names   | Convergence
II    | Single Categories    | Single words   | Co-evol. lang/meaning
III   | Multiple Categories  | Multiple Words | Compositionality

Table 1.1: First stages in the evolution of language.

This can be seen as the first of different steps in the evolution of a language. Indeed, both for research and engineering reasons it can be useful to identify different stages in the evolution of language, because simpler forms of communication can be studied before tackling more complex situations. Each stage thus corresponds to an increased level of complexity of both the meanings and the forms used by the agents, and requires a major breakthrough to be reached from simpler forms of language. Similar stage approaches are quite common in different fields, like for instance in biology, where this point of view is applied to the transitions that would lead from cells to more complex life forms [136]. However, as far as linguistics is concerned, it must be stressed that this way of casting the problem is the object of intense debate, and there is no consensus on the possible nature and number of levels. For completeness, we review here very briefly the schematization in seven stages proposed in [182], and we refer the interested reader to the same source for more details on this subject.

In Table 1.1 we report the first three stages identified in [182]. We have already discussed Stage I, which is addressed by the Naming Game. Stage II involves the notion of a single category and that of words to name it (for instance "smooth" or "red"). The aim is to understand how language and meanings can co-evolve at a global level. Different categories can then be needed to identify an object ("red small object"). In Stage III one or more words can be used to talk about the combination of categories. The goal is to observe the emergence of compositionality, i.e. of a communication system in which an utterance can contain several parts and the meaning of the whole is a combination of the meanings of the parts. The next four stages involve syntax and are summarized in Table 1.2. Understanding how agents can start to use syntactic patterns for linking the meanings of individual words is the goal of Stage IV. Stage V invokes the use of syntactic categories, such as nouns, verbs, etc. In Stage VI predicates referring to other predicates should appear (such as the adverb "very" in "the very big ball"). Finally, Stage VII arises when it becomes possible to use language to speak about language itself, thus allowing for a more rapid cultural spreading of linguistic conventions. Of course, the higher we go in this hierarchy, the fewer results have been obtained and the more controversial the debate becomes.

Stage | Meaning                        | Breakthrough
------|--------------------------------|-------------------------
IV    | Multiple objects + predicates  | Constructions
V     | id.                            | Meta-grammar
VI    | Second order predicates        | Second order expression
VII   | Meta-level                     | Level formation

Table 1.2: Higher syntactical stages in the evolution of language.

1.4 The Evolutionary Language Game

According to the sociobiological approach [111, 156, 157, 154], evolution is mainly responsible for both the origin and the emergence of natural language in humans [164]. Consequently, natural selection is the fundamental driving force to be introduced in models. Evolutionary game theory [135] was formulated with the aim of adapting classical game theory [158] to deal with evolutionary issues, such as the possibility for agents to adapt, learn and evolve. The approach is phenotypic, and the fitness of a certain phenotype is, roughly speaking, proportional to its diffusion in the population. The strategies of classical game theory are replaced by traits (genetic or cultural), which are inherited, possibly with mutations. The search for Nash equilibria becomes the quest for evolutionary stable strategies. A strategy is stable if a group adopting it cannot be invaded by another group adopting a different strategy. Finally, a fundamental assumption is that the payoff from a game is interpreted as the fitness of the agents involved in the game.

The Evolutionary Language Game [154, 152, 80] aims at modeling the emergence of language resorting to evolutionary game theory and to the concept of language game [195, 194] discussed in Sec. 1.3.1. We now analyze in some detail how the problem of the evolution of a common vocabulary is addressed in this framework [152]. A population of agents lives in an environment with n objects. Each of them has a repertoire of m words (or "signals", in the original terminology) to be associated with objects. Individuals are characterized by two matrices P and Q, which together form a language L. The entries p_{ij} of P denote the probability of speaking word j when seeing object i, whereas the entries q_{ij} of Q denote the probability for a listener to associate sound j with object i. When trying to communicate, two agents, identified by the two languages L and L', get the payoff:

\[
F(L, L') = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{m} \left( p_{ij}\, q'_{ji} + p'_{ij}\, q_{ji} \right). \tag{1.4}
\]

Thus, both agents are treated once as hearer and once as speaker, and, as in the Naming Game, they both receive a reward for successful communication. In each round of the game, every individual communicates with every other individual, and the accumulated payoffs are summed up. As we have said, the payoff is interpreted as fitness: agents with higher payoff have a higher survival chance and leave more offspring, who learn the language of their parents by sampling their responses to individual objects. The model can then be enriched by adding a probability of errors in perception. Moreover, the traits transmitted to the progeny can differ from the language itself. For instance, in [154] the agents inherit a learning strategy specifying whose language must be sampled: that of the parent, of a randomly selected agent, or of a successful communicator. It is also worth noting that the presence of two completely uncorrelated matrices for the production, P, and comprehension, Q, modes, already present in [111, 156, 157], has been criticized in [123], where a single matrix is adopted for both tasks.

The differences between the Evolutionary Language Game and the Naming Game are manifest. First of all, the fundamental assumptions are orthogonal, involving evolution and self-organization, respectively. Second, cultural traits (i.e. words) are transmitted horizontally in the case of the Naming Game and vertically in the case of the Evolutionary Language Game. Third, the Naming Game adopts the operant conditioning model of social learning, whereas the Evolutionary Language Game adopts the observational learning one. Moreover, it must be stressed that while the Naming Game was conceived to be experimentally testable with embodied agents, the Evolutionary Language Game prescribes highly abstract interaction rules, which rely on the possibility for the agents to inspect each other's languages. Finally, also in the case of the Evolutionary Language Game the emergence of a shared vocabulary represents only a first stage in the evolution of language. In [152], higher stages are proposed, ending up with the emergence of grammar.
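For illustration, the payoff of Eq. (1.4) can be computed directly from the two pairs of matrices. In the following sketch P is an n × m matrix of speaking probabilities (rows summing to one) and Q an m × n matrix of listening probabilities; the sizes and the random initialization are arbitrary assumptions used only to show the computation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_objects, m_signals = 5, 5        # illustrative sizes

def random_language():
    # row-normalized random matrices: P maps objects to signals, Q signals to objects
    P = rng.random((n_objects, m_signals))
    Q = rng.random((m_signals, n_objects))
    return P / P.sum(axis=1, keepdims=True), Q / Q.sum(axis=1, keepdims=True)

def payoff(lang1, lang2):
    (P1, Q1), (P2, Q2) = lang1, lang2
    # F(L, L') = 1/2 * sum_{i,j} ( p_ij q'_ji + p'_ij q_ji ), Eq. (1.4)
    return 0.5 * (np.sum(P1 * Q2.T) + np.sum(P2 * Q1.T))

L1, L2 = random_language(), random_language()
print(payoff(L1, L2))
```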

1.5 Models of opinion dynamics

Physicists have been attracted by human dynamics for quite a long time, and their contribution in this area has been an important source of inspiration for us. The traditional approach consists in "exporting", and possibly adapting, well-known models of statistical physics. In particular, one often hopes that simple schematizations could describe some universal properties connected, for instance, with the self-organization of real systems or the emergence of characteristic features, such as power laws [185]. As a consequence, the microscopic aspects of different models can be far from realistic, while the results aim at describing real global aspects of social phenomena [14, 10].

Different kinds of social dynamics have been investigated adopting this so-called "socio-physics" approach, ranging from diffusion processes (e.g. the spreading of innovations [169], ideas [32], knowledge [69], rumors [71] or gossip [131]) to the interesting area of strategic games (e.g. the minority game [60], the prisoner's dilemma [166], etc.). However, we want to focus here on the specific issue of opinion dynamics [175, 97], which has interesting points of contact with our main subject, the emergence of a shared vocabulary. In fact, following the sociocultural approach (Sec. 1.2), linguistic global coordination emerges from a negotiation process. Individuals compare their points of view and try to agree on a shared convention. It is then evident that this mechanism can be seen as very close to the one by which agents develop and modify their opinions under the influence of others.

In general, beyond the specific microscopic rules, most opinion-dynamics models aim at describing the evolution of a population of agents towards some clearly identifiable final state. This can be a situation of consensus, where all agents agree on a given opinion, of fragmentation, where most of the initial opinions survive, or of polarization, where only a finite number of initial opinions (usually two) survive. Again, the similarity with a linguistic scenario is striking: the final state could represent the agreement (or stable disagreement) on a given convention (e.g. the name to assign to an object).

A complete review of the large amount of work done in the field of opinion dynamics is outside the focus of this thesis. Instead, we describe some of the proposed models in order to give the reader a flavor of the state of the art in this field, focusing on those traits that present a closer connection with the model that we will introduce in the next chapter. In this perspective, we list below some of the most significant models.

The Axelrod model [15] describes agents with a rich repertoire of opinions. More precisely, individuals σ_i are endowed with F cultural traits (σ_{i,f}, f = 0, ..., F−1), each of which can assume one out of q values. The agents are embedded on a square lattice, and at each time step two of them are randomly selected along with a cultural trait f. If σ_{i,f} ≠ σ_{j,f} nothing happens. Otherwise another cultural trait f′ is randomly chosen and its value is set equal for the two agents, i.e. σ_{j,f′} → σ_{j,f′} = σ_{i,f′}. Absorbing states are those in which, across each bond, all features are equal (σ_{i,f} = σ_{j,f} ∀f) or all different (σ_{i,f} ≠ σ_{j,f} ∀f), and final configurations can be described in terms of distributions of clusters of homogeneous agents. In this respect, it is worth mentioning that a non-equilibrium phase transition from a culturally polarized phase (all agents belong to the same cluster) to a culturally fragmented one (finite-size, O(1), clusters) has been shown to occur as q grows [53, 121].

In the Sznajd-Weron and Sznajd model [185], agents are represented by spins, and the inspiring social dynamics is that of voters who have to choose between two candidates. The fundamental principle is that the opinion of pairs of agents influences that of their neighbors. So, if the considered spins "agree", they convince their nearest neighbors, otherwise each member of the pair causes its nearest neighbor to disagree with it.

be enriched by allowing single spins to have more states, or even a continuum of possible opinions. In this case the further ingredient of bounded confidence can be added: each agent is willing to be influenced by another agent only if their opinions are not too different, i.e. when they lie within a confidence bound. It turns out that consensus is almost always reached when the number of states is less than or equal to three, while such a consensus is rare for four or more possible opinions (with a confidence bound equal to one). When the opinions form a continuum, on the other hand, the individuals always agree in the end, for any confidence bound.

The Hegselmann and Krause model [105] describes the opinion formation of an individual as an averaging process over the opinions of all its neighbors, provided they fall inside a given confidence bound ε. This is the fundamental parameter of the model, since three different final states can be reached depending on its value. In particular, starting from a uniform distribution of opinions, which here are real numbers between 0 and 1, the final state consists of a plurality of stable opinions (small ε), a consensus on a single opinion (large ε), or, interestingly, a polarization of the population on two or three different opinions for intermediate values of the confidence bound.

In the model proposed by Deffuant et al. [79], agents perform pairwise interactions. Opinions are real numbers, and when two agents i and j meet, they make their opinions O_i and O_j closer by an amount µ|O_i − O_j|, with µ ≤ 0.5, provided that they find each other within their confidence bound, i.e. |O_i − O_j| < ε. Also in this case the fundamental parameter is ε, and its magnitude determines the number of opinions that survive in the final state. The parameter µ, on the other hand, only influences the convergence time.

The voter model [130, 124, 126, 30, 81, 90, 173] assigns to each agent a discrete variable, i.e. an opinion, that can assume two values. The dynamics then evolves following the rule that at each time step the opinion of a randomly chosen agent is made equal to that of one of its neighbors (itself selected at random). The process ends up in complete order on low-dimensional lattices, but different blocked configurations survive in more complex topologies [54, 188]. Finally, it is worth mentioning also the so-called Galam majority rule model [92, 93], which simply exploits the Ising model majority rule [61, 138].

All these models have been extensively studied and many variants have been proposed (see [175] for a recent review of some significant contributions). For our purposes it is worth remarking that, in all cases, bounded confidence is the mechanism determining the final state. In the Axelrod model, in particular, the idea that close agents (in the space of opinions) should become closer is part of the very definition of the model. Indeed, the phase transition found for a large number of possible values of each cultural trait can be seen as resulting from a modulation of the confidence bound. This is very interesting for us, since in Sec. 5.2 we shall observe a


similar consensus-to-polarization transition in our model, where it is generated by a completely different mechanism.
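To make the pairwise bounded-confidence rule of Deffuant et al. concrete, a minimal simulation sketch is given below. It is not taken from the cited works: the mean-field pairing, the parameter values and the function names are chosen purely for illustration, and the final cluster count is only a rough diagnostic.

import random

def deffuant(n_agents=200, epsilon=0.2, mu=0.5, n_steps=200_000, seed=1):
    """Sketch of the Deffuant et al. bounded-confidence dynamics.

    Opinions are real numbers in [0, 1]; two randomly chosen agents move
    their opinions closer by mu*|Oi - Oj| only if |Oi - Oj| < epsilon.
    """
    rng = random.Random(seed)
    opinions = [rng.random() for _ in range(n_agents)]
    for _ in range(n_steps):
        i, j = rng.sample(range(n_agents), 2)
        diff = opinions[i] - opinions[j]
        if abs(diff) < epsilon:            # confidence bound
            opinions[i] -= mu * diff       # the two agents move towards each other
            opinions[j] += mu * diff
    return opinions

if __name__ == "__main__":
    final = deffuant()
    # Roughly count the surviving opinion clusters by binning the final opinions.
    print("approximate surviving opinions:", sorted({round(o, 1) for o in final}))

With a small confidence bound ε several opinion clusters survive, while a large ε leads to a single consensus value, in line with the qualitative picture recalled above.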

1.6 Interaction topologies

In the Naming Game model described above, at each time step two agents are randomly chosen, thus implying that the population is completely unstructured (i.e. we are in the mean-field case). In general, this assumption is made by many of the models we have described in the previous sections. Alternatively, agents are embedded in regular topologies. For instance, in the Axelrod model individuals are placed on the nodes of a regular square lattice [15]. Both situations are of course highly unrealistic, since in real social systems the possible interactions of an agent are not unlimited, nor reduced to just four. Moreover, crucially, not all individuals necessarily have the same number of acquaintances. More realistic alternatives to regular structures are given by complex networks [6, 82, 162, 47]. A network is, roughly speaking, an ensemble of nodes connected by links (or edges). Examples of such structures are common, the Internet and the World Wide Web being the most obvious. However, networks are an extremely powerful tool to describe a wide range of different systems whenever one is interested in a coarse-grained analysis and does not want to take into account specific microscopic aspects [6, 82, 162, 47]. In this perspective, recent years have witnessed a great effort towards the characterization of several natural or social systems in terms of nodes and links representing their interactions. As examples we can mention social networks, in which people are the nodes and their social relations are the links [190]; scientific collaboration networks, where two scientists are connected by a link if they have co-authored at least one article [148]; metabolic networks, in which the nodes are the substrates and the edges are the chemical reactions in which the substrates participate [118]; and food webs, in which the nodes are species and the links represent predator-prey relationships [96]. The attention towards complex networks originated from the finding that the Internet and the World Wide Web presented unexpected topological properties, such as heterogeneous connectivity patterns [89]. Indeed, up to that point, there was a general consensus on the fact that the topology of real networks could be described in terms of regular structures (e.g. grids or lattices) or random graphs (i.e. networks in which all nodes have approximately the same connectivity [68]). Immediately afterwards it was realized that most natural or artificial networks share a certain number of features that are not captured by those models. Among the unforeseen characteristics of real networks, the most peculiar are the scale-free degree distribution [16] and the “small world” property [192]. The first is the fact that, denoting by k the degree of a node, i.e. the number of

links which connect it to other nodes, the degree distribution P(k) presents fat tails that often follow a power law behavior P(k) ∼ k^{−γ}, with 2 ≤ γ ≤ 3. This means that the second moment of the distribution diverges and there are large fluctuations in the connectivity of the nodes. The very few nodes with very high connectivity (the hubs) in general play a central role in the structural and dynamical properties of the system. The “small world” property, on the other hand, is the name attributed to the evidence that the minimal hop distance between each pair of nodes scales logarithmically with the network's size, instead of algebraically as in usual regular lattices. This happens also in the case of random graphs, but in complex networks it comes along with the presence of many triangles or small cycles (motifs), which are completely absent in random graphs. Statistical physics immediately showed itself to be a powerful and natural framework to characterize complex networks more precisely. At the same time, physicists have tried to reproduce the observed features. New models have been defined and extensively studied, and new paradigms have been introduced. Along with the topological properties of complex networks, however, also the dynamics, and its interplay with the topology itself, has been the object of great attention. First of all, the number of nodes and links in complex networks is not fixed, since nodes and links can appear and disappear at any time. Thus, models have had to take into account such processes as growth, or rewiring (the process of moving one end of a link from one node to another), which were not considered by traditional static models of graph generation. Then, the topology plays a major role also when one is interested in dynamical processes taking place on networks. These can be either phenomena naturally related to networks, like for instance routing problems, or traditional paradigms of statistical mechanics, like random walks or Ising-like models, whose known properties can be altered by the effects of a complex topology. Finally, there is the fascinating aspect of the interplay between the topology and the dynamics taking place on it, which has been addressed only more recently. This aspect becomes crucial when the characteristic timescales of the dynamics occurring on networks coincide with those of the network evolution and the two phenomena influence each other. In this thesis we shall carefully investigate the effect of different underlying topologies on the model we shall introduce in the next chapter. As we shall see, different interaction patterns dramatically affect the properties of our model, and we shall investigate many different situations, ranging from the mean-field fully connected graph, to one-dimensional ordered lattices, to artificial and real-world networks. In Chapter 6, finally, inspired by the problem of the spreading of conventions, we shall investigate some properties of random walks on networks. We shall discuss and present in detail the different properties and models when we need them, but for convenience Appendix A contains a short overview of the most fundamental aspects of


network theory.
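As a purely illustrative aside (not taken from this thesis or from the cited references), the contrast between heterogeneous and homogeneous connectivity patterns mentioned above can be reproduced with standard network generators; the Barabási-Albert preferential-attachment graph is used here only as a convenient example of a network with a fat-tailed degree distribution, and all parameter values are arbitrary.

import collections
import networkx as nx

# Heterogeneous (scale-free-like) network vs. a random graph with the same
# number of nodes and links: the former develops hubs, the latter has a
# narrow, roughly Poissonian degree distribution.
N, m = 10_000, 3
ba = nx.barabasi_albert_graph(N, m, seed=1)                  # preferential attachment
er = nx.gnm_random_graph(N, ba.number_of_edges(), seed=1)    # homogeneous reference

def degree_distribution(graph):
    counts = collections.Counter(d for _, d in graph.degree())
    n_nodes = graph.number_of_nodes()
    return {k: c / n_nodes for k, c in sorted(counts.items())}

print("maximum degree, BA graph:     ", max(d for _, d in ba.degree()))
print("maximum degree, random graph: ", max(d for _, d in er.degree()))
pk = degree_distribution(ba)    # decays roughly as a power law for large k
print("P(k) for the smallest degrees:", {k: round(p, 4) for k, p in list(pk.items())[:5]})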

Chapter 2

The Naming Game

2.1 Introduction

In this chapter we introduce and begin to analyze a simple model which is able to account for the emergence of a shared set of conventions in a population of agents [22]. Central control or coordination is absent, and agents perform only pairwise interactions following extremely uncomplicated rules. Our approach is inspired by the Naming Game model [177, 176], and inherits its conceptual background. This means that we adopt the sociocultural perspective together with the horizontal transmission structure and the operant conditioning model as a mechanism of social learning (see Sec. 1.2 and Sec. 1.3). Thus, our modus operandi is somewhat orthogonal to the classic socio-physics method. Indeed, we do not “export” nor adapt any statistical physics model to other disciplines (Sec. 1.5), but rather we “import” a pre-existing model into the statistical mechanics context. Since there is no possibility of any confusion, from now on we shall refer to our model as the “Naming Game” to stress its origins. Due to the simplicity of the interaction scheme, the dynamics of the model can be studied by means of both massive simulations and analytical approaches. This is a crucial point. Indeed, very often sociocultural frameworks lack quantitative investigations [179], contrary to what happens in sociobiological approaches, where the concepts and techniques of (evolutionary) game theory have been extensively exploited (Sec. 1.2), and in socio-physics models. For instance, we shall discuss in great detail how the main features of the process leading the population to a final convergence state scale with the population size, while in general other models concentrate on studying very small populations, even composed of only two players [172]. The price to pay for an extensive quantitative comprehension is that of taking into account a smaller number of aspects of the considered


phenomenon. Thus, our agents are indeed very simple and stylized, and are not suitable for deeper speculations on the actual internal communication mechanisms. In other words, as we want to study global properties in greater detail, we have to give up a rich agent structure. It is worth stressing, however, that this effort towards simplicity allows us to clarify which fundamental individual characteristics are needed to obtain the desired global coordination properties. As we shall see, they are indeed very simple, and this is an important result of our work. This chapter is aimed at introducing and studying the Naming Game in the mean-field case, i.e. when the population is unstructured and all agents can in principle interact with each other. In Sec. 2.2 we present the model and discuss its basic phenomenology. Sec. 2.3 is devoted to the study of the role of the population size. We investigate the scaling relations of some important quantities and provide analytical arguments that allow us to find the right exponents. In Sec. 2.4 we look in more detail at the mechanisms that give rise to convergence. In particular, we identify and explain the presence of a hidden timescale that rules the transition to the final consensus state. In Sec. 2.5 we focus on the relation between single simulation runs and averaged quantities, while in Sec. 2.6 we investigate the properties of the consensus word. We then analyze, in Sec. 2.7, a controlled case that helps us shed light on the nature of the symmetry breaking process that yields convergence. Finally, we discuss the most relevant features of our model and present some conclusions in Sec. 2.8.

2.2 The model

We present here the Naming Game model we have introduced in [22]. The game is played by a population of N agents who perform pairwise interactions in order to negotiate conventions, i.e. associations between forms and meanings, and it is able to describe the emergence of a global consensus among them. For the sake of simplicity the model does not take into account the possibility of homonymy, so that all meanings are independent and one can work with only one of them, without loss of generality. An example of such a game is that of a population that has to reach a consensus on the name (i.e. the form) to assign to an object (i.e. the meaning) exploiting only local interactions, and we will adopt this perspective in the following. However, as will become clear, the model is appropriate to address all those situations in which negotiation rules a decision process (e.g. opinion dynamics). Each individual is described by its inventory, i.e. a set of form-meaning pairs (i.e. names, words, opinions, etc.) which is empty at the beginning of the game (t = 0) and evolves dynamically in time. At each time step (t = 1, 2, ...) two agents are randomly selected and interact: one of them plays as “speaker”, the other one as “hearer”. The interactions obey the


Figure 2.1: Naming game interaction rules. The speaker selects randomly one of its words, or it invents a new word if its inventory is empty (i.e. we are at the beginning of the game). If the hearer does not know the uttered word, it simply adds it to its inventory, and the interaction is a failure. If, on the other hand, the hearer recognizes the spoken word, the interaction is a success, and both agents delete from their inventories all their words but the winning one.

following rules (Fig. 2.1):

- The speaker has to transmit a name to the hearer. If its inventory is empty, it invents a new word, otherwise it selects randomly one of the names it knows;

- If the hearer has the uttered name in its inventory, the game is a success, and both agents delete all their words but the winning one;

- If the hearer does not know the uttered word, the game is a failure, and the hearer inserts the word in its inventory.
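A minimal, self-contained simulation of these rules in the mean-field case is sketched below. It is only an illustration of the interaction scheme: function names, data structures and the way observables are sampled are our own choices, not part of the model definition. The sketch also records the global quantities N_w(t), N_d(t) and the outcome of each game, which are discussed in Sec. 2.2.1.

import random

def naming_game(n_agents=1000, seed=1):
    """Mean-field Naming Game: returns N_w(t), N_d(t) and the success (1)
    or failure (0) outcome of each game, one entry per time step."""
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]   # empty at t = 0
    next_word = 0                                    # fresh integers = new words (no homonymy)
    n_w, n_d, outcomes = [], [], []
    total_words = 0

    while True:
        speaker, hearer = rng.sample(range(n_agents), 2)
        inv_s, inv_h = inventories[speaker], inventories[hearer]

        if not inv_s:                                # invention
            inv_s.add(next_word)
            next_word += 1
            total_words += 1
        word = rng.choice(tuple(inv_s))              # random choice, no weights

        if word in inv_h:                            # success: both collapse to the winning word
            total_words -= len(inv_s) + len(inv_h) - 2
            inv_s.clear(); inv_s.add(word)
            inv_h.clear(); inv_h.add(word)
            outcomes.append(1)
        else:                                        # failure: the hearer learns the word
            inv_h.add(word)
            total_words += 1
            outcomes.append(0)

        n_w.append(total_words)
        n_d.append(len(set().union(*inventories)))   # number of distinct words (slow but clear)

        # convergence: every agent holds the same single word
        if total_words == n_agents and n_d[-1] == 1:
            return n_w, n_d, outcomes

if __name__ == "__main__":
    n_w, n_d, outcomes = naming_game(n_agents=200)
    print("convergence time:", len(outcomes), "games")
    print("peak of N_w(t):  ", max(n_w))

A running average of the recorded outcomes provides an estimate of the success rate S(t).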

With this scheme of interaction, the assumption of the absence of homonymy simply translates into ensuring that each newly invented word has never appeared before in the population. Thus, single objects are independent (i.e. it is impossible that two agents use the same word for two different objects) and their number becomes a trivial parameter of the model (see Sec. 5.5 for an explicit example). Indeed, one can treat an environment composed of an arbitrary number of objects by a simple controlled rescaling of the


findings of the case with one object. For this reason, as we mentioned above, we concentrate on the presence of one single object, without any loss of generality. It is also interesting to note that the problem of homonymy has been studied in great detail in the context of evolutionary game theory, and Komarova and Niyogi [123] have shown that languages with homonymy are evolutionarily unstable (see Sec. 1.4). However, it is obvious that homonymy is an essential aspect of human languages, while synonymy seems less relevant. The two authors solve this apparent paradox by noting that if we think of “words in a context” homonymy almost disappears, while synonymy acquires a much greater role. This observation also fits very well with our inferential model of learning, according to which we assume that agents are placed in a common environment and are able to point at referents (Sec. 1.2). So, after a failure, the speaker is able to point at the named object (or referent), and the hearer can in turn assign the new name to it. Another important assumption of the model is that two agents are randomly selected at each time step. This means that each agent can in principle talk to anybody else, i.e. that the population is completely unstructured (we can refer to this situation as the “mean-field” case). In fact, we are first of all interested in investigating this simpler case in greater detail. The role of different underlying topologies will be discussed in detail in Chapters 3 and 4. Finally, it is worth stressing that the random selection rule adopted by the speaker to select the word to be transmitted, and the absence of weights associated with words, expressly violate a previous paradigm according to which these were fundamental ingredients to make the population reach a global consensus ([179] and Sec. 1.3.2). Indeed, as we are going to show, they turn out to be unnecessary.

2.2.1 Basic phenomenology

The most basic quantities describing the state of the population at a given time t are: the total number of words present in the system, N_w(t), the number of different words known by the agents, N_d(t), and the success rate S(t), i.e. the probability of observing a successful interaction at a given time. In Figure 2.2 we report data concerning a population of N = 10^3 agents. The process starts with a trivial transient in which agents invent new words. There follows a longer period of time during which the N/2 (on average) different words are exchanged through unsuccessful interactions. The probability of a success taking place at this time is indeed very small (S(t) ≃ 0) since each agent knows only a few different words. As a consequence, the total number of words grows, while the number of different words remains constant. However, agents keep correlating their inventories, so that at a certain point the probability of a successful interaction ceases to be negligible. As fruitful interactions become more frequent, the total number of words at first reduces its growth and then starts decreasing, so that the N_w(t) curve presents a well-identified peak.


Figure 2.2: Basic global quantities. a) Total number of words present in the system, N_w(t); b) Number of different words, N_d(t); c) Success rate S(t), i.e. probability of observing a successful interaction at time t. The inset shows the linear behavior of S(t) at small times. All curves concern a population of N = 10^3 agents. The system reaches the final absorbing state, described by N_w(t) = N, N_d(t) = 1 and S(t) = 1, in which a global agreement on the form (name) to assign to the meaning (object) has been reached.

Moreover, after a while, some words start disappearing from the system. The process evolves with an abrupt increase in the number of successes, whose curve S(t) exhibits a characteristic “S-shaped” behavior, and a further reduction in the numbers of both total and different words. Finally, the dynamics ends when all agents have the same unique word and the system is in the attractive convergence state. It is worth noting that the developed communication system is not only effective (each agent understands all the others), but also efficient (no memory is wasted in the final state). From the inset of Figure 2.2 it is also clear that the S(t) curve exhibits a linear behavior at the beginning of the process: S(t) ∼ t/N^2. This can be understood by noting that, at early stages, most successful interactions involve agents which have already met in previous games. Thus the probability of a success is proportional to the ratio between the number of couples that have interacted before time t, whose order is O(t), and the total number of possible pairs, N(N − 1)/2. The linear growth ends in correspondence with the peak of the N_w curve, where it holds S(t) ∼ 1/N^{0.5}, and the success rate curve exhibits a bending afterwards, slowing down its growth until a sudden burst that corresponds to convergence.


Figure 2.3: Scaling with the population size N. In the upper graph the scaling of the peak and convergence times, t_max and t_conv, is reported, along with their difference, t_diff. All curves scale with the power law N^{1.5}. Note that the t_conv and t_diff scaling curves present characteristic log-periodic oscillations (see Sec. 2.3.2). The lower curve shows that the maximum number of words (peak height, N_w^max = N_w(t_max)) obeys the same power law scaling.

2.3 The role of the system size

2.3.1 Scaling relations

Now that we have a qualitative picture of the dynamics leading the system to convergence, it is natural to investigate the role played by the system size N. In particular, two fundamental aspects depend on N. The first is the time needed by the population to reach the final state, which we shall call the convergence time t_conv. The second concerns the cognitive effort, in terms of memory, required of each agent by the dynamics. This reaches its maximum in correspondence of the peak of the N_w(t) curve. Figure 2.3 shows the scaling behavior of the convergence time t_conv, and of the time and height of the peak of N_w(t), namely t_max and N_w^max = N_w(t_max). The difference time (t_conv − t_max) is also plotted. It turns out that all these quantities follow power law behaviors: t_max ∼ N^α, t_conv ∼ N^β, N_w^max ∼ N^γ and t_diff = (t_conv − t_max) ∼ N^δ, with exponents α ≈ β ≈ γ ≈ δ ≈ 1.5. The values for α and γ can be understood through simple analytical arguments. Indeed, assume that, when the total number of words is close


to the maximum, each agent has on average cN^a words, so that it holds γ = a + 1. If we assume also that the distribution of the different words among the inventories is uniform, we have that the probability for the speaker to play a given word is 1/(cN^a), while the probability that the hearer knows that word is 2cN^a/N (where N/2 is the number of different words present in the system). The equation for the evolution of the number of words then reads:

\frac{dN_w(t)}{dt} \propto \frac{1}{cN^a}\left(1 - \frac{2cN^a}{N}\right) - \frac{1}{cN^a}\,\frac{2cN^a}{N}\,2cN^a     (2.1)

where the first term is related to unsuccessful interactions (which increase N_w by one unit), while the second one to successful interactions (which decrease N_w by 2cN^a). At the maximum dN_w(t)/dt = 0, so that, in the thermodynamic limit N → ∞, the only possible value for the exponent is a = 1/2, which implies γ = 3/2, in perfect agreement with data from simulations. For the exponent α the procedure is analogous, but we have to use the linear behavior of the success rate and the relation a = 1/2 we have just obtained. The equation for N_w(t) can now be written as:

\frac{dN_w(t)}{dt} \propto \frac{1}{cN^{1/2}}\left(1 - \frac{ct}{N^2}\right) - \frac{1}{cN^{1/2}}\,\frac{ct}{N^2}\,2cN^{1/2}     (2.2)

If we impose dN_w(t)/dt = 0, we find that the time of the maximum has to scale with the right exponent α = 3/2 in the thermodynamic limit. The exponent for the convergence time, β, deserves a more articulate discussion, and we can only provide a more naive argument, even though it is well supported by numerical evidence. We concentrate on the scaling of the interval of time separating the peak of N_w(t) and the convergence, i.e. t_diff = (t_conv − t_max) ∼ N^δ ∼ N^{1.5}, since we already have an argument for the time of the peak of the total number of words, t_max. This is the time span required by the system to get rid of all the words but the one which survives in the final state. With the problem cast in this way, we argue that a crucial parameter is the maximum number of words the system stores at the beginning of the elimination phase. If we make the mean-field assumption that at t = t_max each agent has on average N_w^max/N ∼ √N words, we see that, by definition, in the interval t_diff each agent must have won at least once. This is a necessary condition to have convergence, and we want to explore on which timescale it is satisfied. Denoting by N̄(t) the number of agents who have not yet won a game at time t, we have:

N̄(t) = N (1 − p_s p_w)^t     (2.3)

where p_s = 1/N is the probability that an agent is randomly selected and p_w = S(t) is the probability of a success. The latter is O(1/N^{0.5}) at t_max, and stays around that value for a quite long time span afterwards. Indeed,


Figure 2.4: Evidence supporting the argument for the β exponent. Top: v(t) is the (non-normalized) histogram of the times at which agents play their first successful interaction, while V(t) is the cumulative curve. It is clear that up to a time very close to convergence there are still agents that have never won. Thus, the investigation of the first time at which V(t) = 1 provides a good estimate of t_conv. Data refer to a single run for a population of N = 10^5 agents. The N_d(t) curve is also plotted, for reference, while the vertical dashed grey line indicates the convergence time. Bottom: scaling of t_diff with N for a system in which, at the beginning of the process, half of the population knows word A and the other half word B. Thus, N_d(t = 0) = 2 and invention is eliminated. Experimental points are well fitted by t_diff ∼ N log N, as predicted by our argument (see text). A fit of the form t_diff ∼ N^δ, on the other hand, turns out to be less accurate (data not shown).

as we have seen, the success rate S(t) grows linearly until the peak, where S(t) = c t_max/N^2 ∼ 1/N^{0.5}, and exhibits a bending afterwards, before the final jump to S(t) = 1 (Fig. 2.2). If we insert this value into eq. (2.3), and we require the number of agents that have not yet won an interaction to be finite just before convergence, i.e. N̄(t_conv) ∼ O(1), we obtain t_diff ∼ N^{3/2} log N. Thus, the leading term of the difference time, t_diff ∼ N^{1.5}, is correctly recovered, and the necessary condition N̄(t_conv) ∼ O(1) turns out to be also sufficient. The possible presence of the logarithmic correction, on the other hand, cannot be appreciated in simulations, due also to the log-periodic oscillations in the t_diff curve (Sec. 2.3.2). Finally, it is worth noting that the S(t) ∼ 1/N^{0.5} behavior can also be understood by assuming that at the peak of N_w(t) each agent has O(N^{0.5}) words (mean-field assumption), and that the average number of words in common between two inventories is O(1)


Figure 2.5: Log-periodic oscillations of the convergence times. Rescaled values of t_conv and t_max are plotted along with their ratio. The rescaled convergence times exhibit global oscillations that are well fitted by the function t ∝ sin(c + c′ ln(x)), where c and c′ are constants whose values are c ≈ 1.0 and c′ ≈ 0.4.

(as confirmed by the numerical simulations shown in Fig. 2.12). We can test the hypotheses behind the above argument in two ways. First of all, we can investigate the distribution v(t) of the times at which agents perform their first successful interaction. Remarkably, Fig. 2.4 (top) shows that this distribution extends approximately up to t_conv, so that the time t*, at which V(t*) ≡ ∫_0^{t*} v(t′) dt′ = 1, turns out to provide a good estimate for t_conv. Then, we can validate our approach by studying a controlled case. Consider a simplified situation in which each agent starts the usual Naming Game knowing one of only two possible words, say A and B. Invention is then prevented, and for the peak of N_w(t) it holds N_w^max ∼ N. Noting that in this case we have S(t_max) ∼ O(1), and substituting this value into eq. (2.3), we obtain that t_diff ∼ N log N. Indeed, this prediction is confirmed by simulations also as far as the logarithmic correction is concerned (Fig. 2.4 (bottom)), and our approach is thus supported by a second validation.
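For convenience, the estimate used above can be written out compactly (a sketch under the mean-field assumptions stated in the text, treating the success probability S as constant at its value around the peak, with p_s = 1/N and p_w = S):

N̄(t) = N (1 - p_s p_w)^t \simeq N \exp\!\left(-\frac{S\,t}{N}\right),
\qquad N̄(t_{conv}) \sim O(1) \;\Rightarrow\; t_{diff} \sim \frac{N}{S}\,\ln N .

With S ∼ N^{-1/2}, as in the standard game around the peak, this gives t_diff ∼ N^{3/2} ln N, while with S ∼ O(1), as in the two-word controlled case, it gives t_diff ∼ N ln N.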

2.3.2 Rescaling curves

Since we know that the characteristic time required by the system to reach convergence scales as N^{1.5}, we would expect a transformation of the form t → t/N^{3/2} to yield a collapse of the global-quantity curves, such as S(t) or N_w(t), relative to systems of different sizes. However, this does not happen.


Figure 2.6: Rescaling of the success rate curves. Curves relative to different system sizes show different qualitative behavior if time is rescaled as t → (t/t_{S(t)=0.5} − 1), where t_{S(t)=0.5} ∼ N^{3/2}. Indeed, on this timescale, the transition between the initial disordered state and the final ordered one where S(t) ≈ 1 (i.e. the disorder-order transition) becomes steeper and steeper as N grows.

The first reason is that the curve of the scaling of the convergence time with N does follow a N^{3/2} trend, but presents a peculiar, seemingly oscillatory, behavior on a logarithmic scale. This is already visible from Figure 2.3, but is clearer in Figure 2.5, where it is shown that the curve t_conv/N^{3/2} is well fitted by a function of the type t ∝ sin(c + c′ ln(x)), where c and c′ are constants¹. In the same figure it is also shown that such oscillations are absent, or at least very reduced, in the curve of the peak times, t_max. The deviations of the convergence time scaling curve from a pure power law have the effect of scattering the rescaled curves, thus preventing any possible collapse. An easy solution to this problem is that of rescaling according to intrinsic features of each curve. In Figure 2.6, we have rescaled the success rate S(t) curves following the transformation t → t/t_{S(t)=0.5} − 1, where t_{S(t)=0.5} is the time at which the considered curve reaches the value 0.5 (with t_{S(t)=0.5} ∼ N^{1.5}, not shown). Interestingly, we note that the curves still do not collapse. In particular, the transition between a disordered state

¹ It must be noted that, since the supposed oscillations should take place on a logarithmic scale, it is hard to obtain data able to confirm their actual oscillatory behavior. Thus, the fit proposed here must be intended only as a possible suggestion on the true behavior of the irregularities of the t_conv scaling curve.


Figure 2.7: Collapse of the success rate curves. The time rescaling transformation t → (t − t_{S(t)=0.5}) / (t_{S(t)=0.5})^{5/6} makes the different S(t) curves collapse. Since the time at which the success rate is equal to 0.5 scales as N^{3/2} (data not shown), the transformation is equivalent to t → (t − N^{3/2})/N^{5/4}. The collapse shows that the disorder-order transition between an initial disordered state in which S(t) ≈ 0 and an ordered state in which S(t) ≈ 1 happens on a new timescale t ∼ N^θ with θ ≈ 5/4.

in which there is almost no communication between agents (S(t) ≈ 0), to the final ordered state in which most interactions are successful (S(t) ≈ 1), becomes steeper and steeper as N becomes larger. In other words, it is clear that the shape of the curves changes when we observe them on our rescaled timescale. Figure 2.6 suggests that the disorder-order transition happens on a new timescale t ∼ N^θ with θ < β, so that N^θ/t_conv → 0 when N → ∞ and the transition becomes instantaneous, on the rescaled timescale, in the thermodynamic limit. Indeed this is exactly the case and, as shown in Figure 2.7, the value θ = 5/4 and the transformation t → (t − N^{3/2})/N^{5/4} produce a good collapse of the success rate curves relative to different N. In the next section we shall show how the right value for θ can be derived with scaling arguments, after a deeper investigation of the model dynamics.


Figure 2.8: Evolution in time of the inventory sizes n (n = 1, ..., 15). f_n(t) is the fraction of agents whose inventory size is n at time t. The process ends with all agents having the same unique word in their inventory, so that f_1 = 1. Curves obtained by averaging 500 simulation runs on a population of N = 10^3 agents.

2.4 The approach to convergence

2.4.1 The domain of agents

We have seen that agents at first accumulate a growing number of words and then, as their interactions become more and more successful, reduce the size of their inventories up to the point in which all of them know the same unique word. More quantitatively, the evolution in time of the fraction f_n of agents with inventory size n is shown in Figure 2.8. The curves refer to a population of N = 10^3 agents and have been obtained averaging over several simulation runs. We see that the process starts with a rapid decrease of f_0 and a concomitant increase of the fraction of agents with larger inventories. After a while, however, successful interactions produce a new growth of the fraction of agents with small values of n. The process evolves up to the point in which all agents have the same unique word and f_1 = 1. Some of the initial-time regularities of the f_n curves can be easily described analytically. For instance, it is easy to write equations for the evolution of the number of species as long as S(t) = 0. We have:


Figure 2.9: Distribution P_n of the inventory sizes n. Curves obtained from a single simulation run for a population of N = 10^4 agents, for which t_max = 6.2 × 10^5 and t_conv = 1.3 × 10^6 time steps. Close to convergence the distribution is well described by a power law P_n ∼ n^{−7/6}.

\dot{f}_0 = -f_0     (2.4)

\dot{f}_{n \geq 1} = f_{n-1} - f_n     (2.5)

These trivial relations allow us to understand some features of the curves, like the exponential decay of f_0, or the fact that, at early times, each f_n (n > 0) crosses the corresponding f_{n−1} at its own maximum (as can be recovered by imposing \dot{f}_n = 0). However, generalizing eqs. (2.4) and (2.5) is not easy since, as the dynamics proceeds, one should take into account the correlations among inventories to estimate the probability of successful interactions, and the analytical solution of our Naming Game model is still lacking. More quantitative information can be obtained by looking at the distribution P_n of the inventory sizes n at fixed times, reported in Figure 2.9 for the case N = 10^4. We see that in the early stages most agents tend to have large inventories, thus determining a peak in the distribution. When agents start to understand each other, however, the peak disappears and large n values keep decreasing. Interestingly, in correspondence with the jump of the success rate that leads to convergence, the histogram can be described by a


power law distribution:

P_n \sim n^{-\sigma}\, g(n/\sqrt{N})     (2.6)

with the cut-off function g(x) = 1 for x ≪ 1 and g(x) = 0 for x ≫ 1. Numerically it turns out that 1 < σ < 3/2. To be more precise, Figure 2.9 shows that the value σ ≈ 7/6 provides a good fit of P_n at the transition, and simulations show that this is true irrespective of the system size. We shall return to the inventory size distribution in greater detail in Chapter 4, where we shall be able to understand qualitatively the behavior shown in Figure 2.9 in the more general framework of its dependence on the topology describing the set of possible interactions among the agents. Finally, it is also worth mentioning that, well before the transition, a larger number of words in the inventory of the hearer increases (linearly) the chances of success in an interaction (data not shown). The number of words known by the speaker, on the other hand, basically plays no role until the system is close to the transition. There, small inventories are likely to contain the most popular word, thus yielding a higher probability of success.
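Incidentally, the early-time equations (2.4)-(2.5) quoted above admit a simple closed-form solution (a sketch, in the time units in which the equations are written and with the initial condition f_0(0) = 1):

f_n(t) = \frac{t^{\,n} e^{-t}}{n!}, \qquad n \geq 0 ,

so that f_0 decays exponentially, and \dot{f}_n = f_{n-1} - f_n vanishes exactly when f_n = f_{n-1}, i.e. each curve crosses the previous one at its own maximum, as observed in Figure 2.8.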

2.4.2 The domain of words

While agents negotiate with each other, words compete to survive. In Figure 2.10 the rank distribution of words at fixed times is reported. The most popular word is given rank 1, the second one rank 2, and so on. The first part of the distribution is well described by a power law, with an exponent that decreases in time. Close to the disorder-order transition, however, the most popular word breaks the symmetry and abandons the power law behavior, which continues to describe well the remaining words. More precisely, the global distribution of the fraction of agents possessing the word of rank R, w(R), can be described as:

w(R) = w(1)\,\delta_{R,1} + \frac{N_w/N - w(1)}{\left[(N/2)^{1-\rho} - 2^{1-\rho}\right]/(1-\rho)}\; R^{-\rho}\, g\!\left(\frac{R}{N/2}\right)     (2.7)

where δ is the Kronecker delta (δ_{a,b} = 1 if a = b and δ_{a,b} = 0 if a ≠ b) and the normalization factors are derived by imposing that ∫_1^∞ w(R) dR = N_w/N.² On the other hand, from equation (2.6) one gets, by a simple integration, the relation N_w/N ∼ N^{1−σ/2} which, substituted into eq. (2.7), gives:

w(R)|_{R>1} \sim \frac{1}{N^{\sigma/2-\rho}}\; R^{-\rho}\, g\!\left(\frac{R}{N/2}\right)     (2.8)

² We use integrals instead of discrete sums, an approximation valid in the limit of large systems.


Figure 2.10: Distribution w(R) of the words of rank R. The most popular word has rank R = 1, the second R = 2, etc. The distribution follows a power law behavior w(R) ∼ R^{−ρ} with an exponent that varies in time, while for high ranks it is truncated at R ≈ N/2. Close to the disorder-order transition, however, the most diffused word abandons the distribution which, on the other hand, keeps describing the less popular words. Data come from a single simulation run and concern a population of N = 10^4 agents.

It follows that w(R)|_{R>1} → 0 as N → ∞, so that, in the thermodynamic limit, w(1) ∼ O(1), i.e. the number of players with the most popular word is a finite fraction of the whole population.
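The “simple integration” invoked above can be made explicit (a sketch, assuming the normalized form P_n ≃ C n^{−σ} g(n/√N) with a normalization constant C of order one and 1 < σ < 2, so that the first moment is dominated by the upper cut-off):

\frac{N_w}{N} = \sum_n n\,P_n \;\simeq\; C \int_1^{\sqrt{N}} n^{\,1-\sigma}\, dn \;\sim\; N^{\frac{2-\sigma}{2}} \;=\; N^{\,1-\sigma/2} .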

2.4.3 Network view - The disorder-order transition

We now need a more precise description of the convergence process. A profitable approach consists in mapping the agents onto the nodes of a network (see Figure 2.11). Two agents are connected by a link each time they both know the same word, so that multiple links are allowed. For example, if m out of the n words known by agent A are present also in the inventory of agent B, then they will be connected by m links. In the network, a word is represented by a fully connected subgraph, i.e. by a clique, and the final coherent state corresponds to a fully connected network with all pairs connected by only one link. When two players interact, a failure determines the propagation of a word, while a success can result in the elimination of a certain number of words competing with the one used. In the network view, as shown in Figure 2.11, this translates into a clique that


grows when one of its nodes acts as a speaker taking part in a failure, and is diminished when one (or two) of its nodes are involved in a successful interaction with a competing word. To understand why the disorder-order transition becomes steeper and steeper, if observed on the right timescale, we must investigate the dynamics that leads to convergence. If we make the hypothesis that, when N is large, just before the transition all the agents have the word that will dominate, the problem reduces to the study of the rate at which competing words disappear. In other words, the crucial information is how the number of deleted links in the network, M_d, scales with N. It holds:

M_d = \frac{N_w}{N} \int_2^{\infty} w^2(R)\, N\, dR \;\sim\; N^{\,3-\frac{3}{2}\sigma}     (2.9)

where N_w/N is the average number of words known by each agent, w(R) is the probability of having a word of rank R, and w(R)N is the number of agents that have that word (i.e. the size of the clique). On the other hand, considering the network structure, eq. (2.9) is the product of the average number of cliques involved in each deletion process [N_w/N], multiplied by an integral saying, in probability, which clique is involved [w(R)] and what its size is [w(R)N]. The integral over R starts from the first deletable word, i.e. the second most popular, because of the assumption that all the successes are due to the use of the most popular word. In our case, for σ ≈ 7/6, we obtain that M_d ∼ N^{5/4}. Thus, from equation (2.9), we have that the ratio M_d/N^{3/2} ∼ N^{−(3/2)(σ−1)} goes to zero for large systems (since σ ≈ 7/6, and in general σ > 1), and this explains the greater slope, on the system timescale, of the success rate curves for large populations (Figure 2.7).

2.4.4 The overlap functional

We have looked at all the timescales involved in the process leading the population to the final agreement state. Yet, we have not investigated whether this convergence state is always reached. Actually, this is the case, and trivial considerations allow us to clarify this point. First of all, it must be noticed that, according to the interaction rules of the agents, the agreement condition constitutes the only possible absorbing state of our model. The proof that convergence is always reached is then straightforward. Indeed, from any possible state there is always a non-zero probability of reaching an absorbing state in, for instance, 2(N − 1) interactions. For example, a possible sequence is as follows. A given agent speaks twice with each of the other (N − 1) agents, always using the same word A. After these 2(N − 1) interactions all the agents have only the word A. Denoting by p the probability of this sequence of 2(N − 1) steps, the probability that the system has not reached an absorbing state after 2(N − 1) iterations is smaller than or equal to (1 − p).


Figure 2.11: Agents network dynamics. Top left: a link between two agents (i.e. nodes) exists every time they have a word in common in their inventories, so that multiple links are possible. In this representation, a word corresponds to a fully connected (sub)set of agents, i.e. a clique; in the figure, the two cliques corresponding to the words WABAKU and VALEM are highlighted. Top right: the two blue-colored agents have just failed to communicate, so that the word VALEM has been transmitted to the agent placed at the top of the graphical representation. It therefore enters the enlarged clique corresponding to the transmitted word VALEM. Bottom: the two blue-colored agents have just succeeded using the word VALEM. The clique corresponding to the used word does not change in any respect, but the competing cliques (here that of WABAKU) are reduced.


Figure 2.12: Overlap functional O(t). Top: evolution in time of the overlap functional averaged over 1000 simulation runs (for a population of N = 10^3 agents). Curves for the success rate, S(t), and the average intersection between inventories, I(t), are also included. By definition, O(t) ≤ 1. It is evident that ⟨O(t + 1)⟩ > ⟨O(t)⟩ holds, which, along with the stronger ⟨O(t + 1)⟩ > O(t), valid for almost all configurations (not shown), indicates that the system will reach the final state of convergence where O(t) = 1. Bottom: the total number of words N_w(t) is plotted for reference.

Therefore, iterating this procedure, the probability that, starting from any state, the system has not reached an absorbing state after 2k(N − 1) iterations is smaller than (1 − p)^k, which vanishes exponentially with k. The above argument, though very simple and general, is exact. However, another perspective from which to address the problem of convergence consists in monitoring the lexical coherence of the system. To this purpose, we introduce the overlap functional O:

O(t) = \frac{2}{N(N-1)} \sum_{i>j} \frac{|a_i \cap a_j|}{k_i k_j}     (2.10)

where a_i is the i-th agent's inventory, whose size is k_i, and |a_i ∩ a_j| is the number of words in common between a_i and a_j. The overlap functional is a measure of the lexical coherence of the system and it is bounded, O(t) ≤ 1. At the beginning of the process it is equal to zero, O(t = 0) = 0, while at convergence it reaches its maximum, O(t = t_conv) = 1. From extensive numerical investigations it turns out that, averaged over several runs, the functional always grows, i.e. ⟨O(t + 1)⟩ > ⟨O(t)⟩ (see


Figure 2.12). Moreover, looking at single realizations, this functional grows almost always, i.e. ⟨O(t + 1)⟩ > O(t), except for a set of very rare configurations whose statistical weight appears to be negligible (data not shown). Even if it is not a proof in a rigorous sense, this monotonicity, combined with the fact that the functional is bounded, gives a strong indication that the system will indeed converge. It is also interesting to note that eq. (2.10) is very similar to the expression for the success rate S(t), which can formally be written as:

S(t) = \frac{1}{N(N-1)} \sum_{i>j} \left( \frac{|a_i \cap a_j|}{k_i} + \frac{|a_i \cap a_j|}{k_j} \right)     (2.11)

where the intersection between two inventories is divided only by the inventory size of the speaker. Figure 2.12 shows that these two quantities exhibit a very similar behavior. However, while the overlap functional is equal to 1 only at convergence, this is not true for the success rate: if all agents had identical inventories of size n > 1 we would have S(t) = 1 and O(t) = 1/n. For this reason the success rate is not a suitable functional to prove convergence. Finally, in Fig. 2.12 we have also plotted the average intersection between inventories, i.e.

I(t) = \frac{2}{N(N-1)} \sum_{i>j} |a_i \cap a_j|     (2.12)

Remarkably, it turns out that I(t) < 1 during the whole process, even though in principle this quantity is not bounded. In Chapter 4 we shall exploit this finding to discuss again and investigate in greater detail the microscopic activity patterns of the agents.
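The three quantities of eqs. (2.10)-(2.12) are straightforward to evaluate on a snapshot of the population. The following sketch (an illustration only, with names of our own choosing, assuming at least two agents and non-empty inventories) computes them directly from the definitions.

from itertools import combinations

def coherence_measures(inventories):
    """Compute O(t), S(t) and I(t) of eqs. (2.10)-(2.12) for a snapshot of
    the agents' inventories, given as a list of non-empty sets of words."""
    n = len(inventories)
    pairs = n * (n - 1) / 2
    overlap = success = intersection = 0.0
    for a_i, a_j in combinations(inventories, 2):
        common = len(a_i & a_j)
        k_i, k_j = len(a_i), len(a_j)
        overlap += common / (k_i * k_j)
        success += 0.5 * (common / k_i + common / k_j)   # either agent may be the speaker
        intersection += common
    return overlap / pairs, success / pairs, intersection / pairs

if __name__ == "__main__":
    # Three agents sharing the word "A" to different degrees.
    invs = [{"A"}, {"A", "B"}, {"A", "B", "C"}]
    O, S, I = coherence_measures(invs)
    print(f"O = {O:.3f}, S = {S:.3f}, I = {I:.3f}")

For identical inventories of size n > 1 this returns S = 1 but O = 1/n, which is precisely the difference between the two functionals noted above.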

2.5 Single games

We know that single realizations have a quite irregular behavior and can deviate significantly from the average curves (Fig. 2.2). It is therefore interesting to investigate to what extent average times and curves provide a good description of single processes. In Figure 2.13 (top) we have plotted the distribution of the peak times for a population of N = 10^3 agents. It is clear that the data cannot be fitted by a Gaussian distribution. The same peculiar behavior is shown also by the distribution of the convergence times (Fig. 2.13 (bottom)) and by that of the intervals between the time of the maximum number of words and the time of convergence (data not shown). Thus, the non-Gaussian behavior appears to be an intrinsic feature of the model. In fact, as shown in Figure 2.13 (bottom)


Figure 2.13: Peak and convergence time distributions. Top: the distribution of the peak times t_max clearly deviates from Gaussian behavior. Bottom: the cumulative distribution of the convergence times t_conv is well fitted by a Weibull-like distribution D(t) = e^{((t − g_0)/g_1)^{g_2}}, with fit parameters g_0 ≈ 4.9 × 10^4, g_1 ≈ 7.9 × 10^0 and g_2 ≈ 9.6 × 10^4. The same function also describes well the peak time distribution (data not shown). Data refer to a population of N = 10^3 agents and are the result of 10^6 simulation runs.

for the case of the convergence times, all these distributions turn out to be well fitted (in their cumulative form) by an extreme value distribution:

D(t) = e^{((t − g_0)/g_1)^{g_2}}     (2.13)

where g_0, g_1 and g_2 are fit parameters [103, 94]. Extreme value distributions originated from the study of the distribution of the maximum (or minimum) of a large set of independent and identically distributed variables [103, 94]. It turns out, however, that a generalization of these functions including a continuous shape parameter a, known as the Gumbel distribution G_a(x), has been observed in many models, ranging from turbulence and equilibrium critical systems [40] to non-equilibrium models related to self-organized criticality [39], to 1/f noise [12] and many other systems (see [31] and references therein). Our model provides another example. It must be noted, however, that there is no obvious theoretical explanation of the fact that extreme-value-like distributions are found also in the study of the fluctuations of global quantities. Yet, in many cases, these distributions are used simply as convenient fitting functions. Interestingly,


Figure 2.14: Single-agent convergence time distribution. We define the convergence time of a single agent as the last time at which it had to delete words after a successful interaction; f_conv(t) is the fraction of agents who reach convergence at time t. Top: distributions coming from 10 simulation runs are plotted. It is clear that distributions coming from different runs can be non-overlapping, i.e. the distance between the peaks of single curves can be much larger than the average width of the same curves (which does not exhibit any strong dependence on the single run). Bottom: a single distribution is analyzed, showing that it cannot be described by a Gaussian distribution. The last agent to converge determines the global convergence time. Curves are relative to a population of N = 10^5 agents.

it was recently shown that there is a connection between Gumbel functions and the statistics of global quantities expressed as sums of non-identically distributed random variables, without the need of invoking extremal processes [31]. We can therefore argue that there is not necessarily a hidden extreme value problem in our model. In any case, a more rigorous explanation of the presence of Gumbel-like distributions is left for future work. In Figure 2.14 (top) we show 10 single-run distributions of convergence times. Each curve illustrates the fraction of agents that converged at a given time in that run, f_conv(t). We consider the single-agent convergence time as the last time at which it had to delete words after a successful interaction. From the figure it is clear that the separation between the peaks of two different distributions can be much larger than the average width of a single curve. In other words, we see that the first moment of the distributions strongly depends on the single realization, while the second one does not. This information is crucial to interpret the curves shown in Figure 2.13 correctly. In fact, we now know that they are indeed representative of fluctuations


Figure 2.15: Correlation between peak and convergence times (τ_max and τ_conv, respectively). Each run is represented by a point in the scatter plot. The dashed line is τ_conv = τ_max and therefore no points can lie below it. The average times t_conv and t_max are also shown with a lighter (yellow) point at the center of the distribution (statistical errors are not visible on the scale of the graph).

occurring among different runs, and do not simply describe the behavior of the last converging agent in a scenario in which most agents always converge, on average, at the same run-independent time. In Figure 2.14 (bottom) it is shown that also single-run curves deviate from Gaussian behavior, showing long tails at large times. Given that the distributions of the convergence and peak times, and also that of their difference t_diff, behave in the same way, it is interesting to investigate whether there is any correlation between these two times. In Figure 2.15 we present a scatter plot in which the axes indicate τ_conv and τ_max, respectively the convergence and peak times of a single run (so that t_max = ⟨τ_max⟩ and t_conv = ⟨τ_conv⟩). It is clear that the correlation between these two times is very feeble. Indeed, the knowledge of τ_max does not allow any sharp prediction on when the population will reach convergence in the considered run. Finally, Figure 2.16 shows that the relative standard deviation of all the relevant global quantities (t_max, t_diff, t_conv and N_w^max) decreases slowly as the system size N grows. In general, if the ratio σ(x)/⟨x⟩ goes to zero as N increases, the system is said to exhibit self-averaging, and this seems to be


Figure 2.16: Scaling of the relative standard deviation σ(x)/⟨x⟩. The ratio between the standard deviation σ and the corresponding (average) quantity is plotted as a function of the system size. In all cases the ratio decreases slightly, or stays constant, as the population size N grows. In particular, the decrease is more evident for N_w^max and t_max, while the t_conv and t_diff curves are almost constant for large N. However, data from our simulations are not sufficient to conclude whether the Naming Game exhibits self-averaging. The standard deviation of x is defined as σ(x) = \sqrt{\frac{1}{N_{runs}-1} \sum_{i=1}^{N_{runs}} (x_i - ⟨x⟩)^2}, where x_i is the i-th measured value, ⟨x⟩ is the average value, and N_runs is the number of simulation runs (here, N_runs = 1000).

the case for the Naming Game. However, it is difficult to draw a definitive conclusion, due to the large amount of time needed to perform a significant number of simulation runs for large values of N. The system seems to show self-averaging as far as the peak height and the peak time are concerned, but this does not seem to be the case for the convergence time. In any case, it is worth mentioning that Lu, Korniss and Szymanski [133] conclude that a slightly modified version of our model does not display self-averaging when the population is embedded in random geometric networks (see also Sec. 5.3 for some more details on their work).

2.6 Convergence Word

As we have seen, the negotiation process leading agents to convergence can be seen as a competition process among different words. Only one of them


Figure 2.17: Word survival probability. Top: the probability that a given word becomes the dominating one (i.e. the only one to survive when the system reaches the convergence state) is plotted as a function of its normalized invention position (see text for details). Early invention is clearly an advantageous factor. Bottom: the survival probability is now plotted as a function of the invention time of the words. The experimental distribution can be fitted by an exponential of the form W ∼ (1/τ) exp(−t/τ), with τ ≈ 150. In both graphs, data have been obtained from 10^5 simulation runs of a population made of N = 10^3 agents.

will survive in the final state of the system. It is then interesting to ask whether it is possible to predict, to some extent, which word is going to dominate. According to the dynamical rules of our model, the only parameter that makes single words distinguishable is their creation time. Thus, it seems natural to investigate whether the moment at which a word is invented can affect its chances of surviving. It turns out that this is indeed the case, as shown in Figure 2.17. The upper graph shows the probability for a word to become the dominating one as a function of its normalized creation position. This means that each word is identified by its creation order: the first invented word is labeled as 1, the second as 2, and so on. To normalize the labels, they are then divided by the label of the last invented word. From the figure it is clear that early invented words have higher chances of survival. This advantage can be better quantified by plotting the winning probability of a word as a function of its invention time, as is done in the bottom graph of Figure 2.17. We find that data from simulations are well fitted by an exponential distribution of the form W = (1/τ) exp(−t/τ), indicating that


Figure 2.18: Role of the system size on the distribution of the winning word. The winning probability W is plotted as a function of the normalized invention position for three system sizes (N = 10^2, 10^3, 10^4). The advantage of early invention increases in larger populations.

the advantage of early invention is indeed quite strong. Finally, an interesting question concerns the behavior of the winning probability distribution as a function of the system size N. In Figure 2.18 we show the distributions, as a function of the normalized labels described above, for three different system sizes, N = 10^2, N = 10^3 and N = 10^4. The advantage of early creation increases with the system size, but the data at our disposal do not allow clear predictions about the behavior of the distribution in the thermodynamic (N → ∞) limit. As a matter of pure speculation, one could argue that the distribution collapses onto a Dirac delta on the first invented word [184].
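The measurement behind Figures 2.17 and 2.18 can be reproduced with a short simulation. The sketch below (illustrative only; names and parameters are ours) runs the minimal Naming Game on a complete graph, labels words by their invention order, and reports which label wins:

import random

def naming_game_winner(N, rng):
    # Return (invention order of the winning word, total number of inventions).
    inventories = [set() for _ in range(N)]
    n_invented = 0
    while True:
        speaker, hearer = rng.sample(range(N), 2)
        if not inventories[speaker]:            # empty inventory: invent a new word
            n_invented += 1
            inventories[speaker].add(n_invented)
        word = rng.choice(tuple(inventories[speaker]))
        if word in inventories[hearer]:         # success: both collapse to the word
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:                                   # failure: the hearer learns the word
            inventories[hearer].add(word)
        if all(inv == {word} for inv in inventories):   # global consensus reached
            return word, n_invented

rng = random.Random(1)
winner, total = naming_game_winner(200, rng)
print("winning word was invention number", winner, "out of", total)

Repeating this over many runs and histogramming winner/total gives curves analogous to those of Figure 2.18.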

2.7 Symmetry Breaking - A controlled case

In the previous sections we have seen that the word on which the agreement takes place is chosen by a symmetry breaking process (Section 2.4.2). This is true even if, as shown in Section 2.6, early invention increases the probability for a word to impose itself. Indeed, if we start from an artificial configuration in which each agent has a different word in its inventory, i.e. if we remove the influence of the invention process, the dynamics still ends up in the usual agreement state (data not shown). In particular, we can concentrate on the case in which only two words are present at the beginning of the process, say A and B, so that the population


can be divided into three classes: the fraction n_A of agents with only the word A, the fraction n_B of those with only the word B, and finally the fraction n_AB of agents with both words (see also [55] for a similar model). Describing the time evolution of the three species is straightforward:

ṅ_A  = −(1/2) n_A n_B + n_AB² + (1/2) n_A n_AB ,
ṅ_B  = −(1/2) n_A n_B + n_AB² + (1/2) n_B n_AB ,        (2.14)
ṅ_AB = n_A n_B − 2 n_AB² − (1/2) (n_A + n_B) n_AB .

The meaning of the different terms of the equations is clear. For instance, in the equation for ṅ_A, the term −(1/2) n_A n_B accounts for the case in which an agent with the word B transmits it to an agent with the word A; the term n_AB² takes into account the fact that two more agents with only the word A are created if two agents with both words happen to have a success with A; and the term (1/2) n_A n_AB accounts for an agent with only A having a success while speaking to an agent with both A and B.

The system of differential equations (2.14) is deterministic, and has three fixed points into which the system can collapse depending on the initial conditions. If n_A(t = 0) > n_B(t = 0) [n_B(t = 0) > n_A(t = 0)], then at the end of the evolution we reach the stable fixed point n_A = 1 [n_B = 1] and, obviously, n_B = n_AB = 0 [n_A = n_AB = 0]. If, on the other hand, we start from n_A(t = 0) = n_B(t = 0), then the equations lead to n_A = n_B = 2 n_AB = 0.4. The latter situation is clearly unstable, since any external perturbation makes the system fall into one of the two stable fixed points. Indeed, it is never observed in simulations: stochastic fluctuations always determine a symmetry breaking that forces a single word to prevail.

Equations (2.14), however, are not only a useful example to clarify the nature of the symmetry breaking process. They also describe the interaction between two different populations that converged separately on two distinct conventions. In this perspective, eqs. (2.14) predict that the larger population will impose its convention. In the absence of fluctuations, this is true even if the difference in size is very small: B will dominate if n_B(t = 0) = 0.5 + ε and n_A(t = 0) = 0.5 − ε, for any 0 < ε ≤ 0.5 (we take n_AB(t = 0) = 0). Figure 2.19 reports data from simulations in which the probability of success of the convention of the minority group, S(n_A), was monitored as a function of the fraction n_A (where n_A + n_B = 1). The fluctuation-free behavior is partly recovered as the total number of agents grows, and in fact it turns out that, for any given n_A < 0.5, the probability of success decreases as the system size is increased. Following eqs. (2.14), in the thermodynamic limit (N → ∞) this probability goes to zero.
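The behavior just described can be checked directly by integrating eqs. (2.14); the sketch below is a minimal illustration (simple Euler scheme, arbitrary step size), not the code used for the figures:

# Euler integration of eqs. (2.14): two conventions A and B competing.
def evolve(nA, nB, nAB, dt=0.01, steps=20000):
    for _ in range(steps):
        dA  = -0.5 * nA * nB + nAB**2 + 0.5 * nA * nAB
        dB  = -0.5 * nA * nB + nAB**2 + 0.5 * nB * nAB
        dAB =  nA * nB - 2.0 * nAB**2 - 0.5 * (nA + nB) * nAB
        nA, nB, nAB = nA + dt * dA, nB + dt * dB, nAB + dt * dAB
    return nA, nB, nAB

# Without fluctuations, a slight initial majority is enough for B to win.
print(evolve(0.49, 0.51, 0.0))    # approaches (0, 1, 0)
print(evolve(0.40, 0.40, 0.20))   # stays on the unstable fixed point (0.4, 0.4, 0.2)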


Figure 2.19: Resistance against invasion. Two populations that converged separately on conventions A and B merge. The figure shows the probability S(n_A) that convention A becomes the final accepted convention of the merged population, plotted versus the normalized size n_A (where n_A + n_B = 1) of the original population of A spreaders. As the total population size increases, the probability for the initially less diffused convention to impose itself decreases, as predicted by equations (2.14).

2.8 Discussion and conclusions

The Naming Game is a very simple model able to account for the emergence of a shared set of conventions in a population of agents. Its main characteristics are:

• The negotiation dynamics between individuals: the interaction rules are asymmetric and feedback is an essential ingredient to reach consensus;

• The memory of the agents: individuals can accumulate words, and wait before taking a decision;

• The absence of bounds on the inventory size: the number of words (or opinions, etc.) is neither fixed nor limited.

All these aspects derive from issues in Semiotic Dynamics and Artificial Intelligence, which provided the humus for our work. However, our model is suitable to describe all those cases in which agents adopt negotiation in a

decision process, i.e. the spreading of opinions, ideas, etc. It is then very interesting to note that the ingredients listed above are absent from most of the well-known opinion-dynamics models we have seen in Sec. 1.5. In the Axelrod model [15], for instance, each agent is endowed with a fixed-size vector of opinions, while in the model of Sznajd-Weron and Sznajd [185] and in the Voter model [130, 124, 126] the opinion can take only two discrete values, and an agent deterministically adopts the opinion of one of its neighbors. Also in the approach of Deffuant et al. [79] the opinion is modeled as a single variable and the evolution of two interacting agents is deterministic, while in the Hegselmann and Krause model [105] opinions evolve through an averaging process. Moreover, as we have seen, all these models include in some form the concept of bounded confidence, according to which two individuals do not interact if their opinions are not close enough; this concept is absent in the Naming Game (we shall return to this point in Sec. 5.2.1).

Regarding pre-existing Semiotic Dynamics models, on the other hand, our work introduces some important advances. First of all, our effort towards simple interaction rules has helped to point out which features are essential to achieve a consensus state. Remarkably, we have shown that weights associated with different words are not a crucial ingredient, thus contradicting a previously accepted paradigm (see Sec. 1.3.2). Moreover, we have performed (and will continue to perform in the next chapters) a comprehensive analysis of our model which, as far as we know, has no equivalent in earlier studies of other models.

From a general perspective, the contribution of our work is therefore twofold. On the one hand, we have applied a statistical physics approach to a problem that originated in a different field, simplifying the original models and performing a much more accurate analysis of the emerging global properties. On the other hand, we have brought back new concepts that were not included in the pre-existing, more traditional physics approaches to social modeling. It is also worth stressing that our Naming Game maintains a crucial aspect of the original model that inspired it [177, 176]: it is perfectly suitable for experiments with real artificial agents. Indeed, in order to preserve this precious inheritance, we have chosen a scheme of interaction that is directly implementable in the real world. Thus, along with numerical simulations and analytical approaches, our model can also be tested using true embodied agents, and we are in fact planning to run the first experiments in the near future, thanks to a collaboration with the group led by Prof. Dario Floreano at the École Polytechnique Fédérale de Lausanne.

In summary, in this chapter we have first defined our model. Thanks to the simplicity of the agents' interaction rules, we have been able to investigate its dynamics in depth, both with massive numerical simulations and, whenever possible, with analytical arguments. We have first investigated the basic features of the process leading the population to converge, and then focused on the scaling of the most important quantities with the system size. In this context, we have also unveiled a hidden timescale that rules the transition between the initial state, in which there is no communication among agents, and the final one, in which there is global agreement. We have then analyzed several other aspects of the whole process, such as its convergence properties, the relation between single runs and averaged curves, and the different probabilities for single words to impose themselves. We have also studied the elementary case in which only two words are present in the system, which can be interpreted as the merging of two converged populations and which clarifies the role of stochastic fluctuations in the convergence process.

Chapter 3

The role of topology

3.1 Introduction

In Chapter 2 we introduced our model with the prescription that, at each time step t = 1, 2, ..., two agents are randomly selected. The assumption behind this mean-field rule is that the population is not structured and that any agent can interact with anybody else. In general this is not true, and the topology on which the population is embedded identifies the set of possible interactions among individuals.

Recent years have witnessed the birth and fast development of the new field of complex networks [6, 82, 162, 47]. First of all, it was realized that a schematization in terms of nodes and links representing their interactions is a powerful tool to describe and analyze a large set of different systems, belonging to technological (the Internet, the web, etc.), natural (food webs, protein interaction networks, etc.) or social (networks of scientific collaborations, acquaintances, etc.) domains. Then, surprisingly, it was found that almost all the investigated systems share a number of peculiar and completely unexpected properties, which were not captured by the models known up to that moment. From our point of view, such structures, and the artificial models that attempt to reproduce them, constitute possibly the most natural interaction patterns on which to study the Naming Game. Thus, one of the purposes of this chapter is to understand the role of the different properties of complex networks.

Before analyzing the effects of an underlying complex topology, however, we study the effects of simpler situations (Sec. 3.2), namely low-dimensional regular lattices, discussing the results presented in [21]. Here, the number of neighbors is finite, the structure is regular and there is complete homogeneity among the agents. Indeed, it seems natural to start from simpler situations before moving to more complicated scenarios. Moreover, d-dimensional lattices have traditionally been used as underlying topologies for many classical models of statistical physics, and there are well-established methods to tackle them [134]. By means of numerical and analytical arguments, we show that low-dimensional grids induce a coarsening dynamics in the Naming Game, so that the time required by the system to converge becomes much longer. On the other hand, the finite connectivity keeps the memory required of each agent finite. Results concerning the mean-field case, on the one hand, and regular structures, on the other, act as fundamental references for understanding the role of the different properties of complex networks.

We start, in Section 3.3, by investigating the role of the small-world property (short average distance between any pair of nodes), which is one of the most characteristic features of many different networks. In particular we focus on a model, proposed by Watts and Strogatz [192, 191], which allows one to pass progressively from regular structures to random graphs by tuning a single parameter. The main result, presented in [73], is that the presence of shortcuts, linking agents otherwise far from each other, allows the system to recover the fast convergence typical of the mean-field case. The finite connectivity, on the other hand, keeps the required amount of memory finite, as in regular structures.

In Section 3.4 we explore systematically, mainly by means of computer simulations, most of the relevant features exhibited by complex networks, and we compare them with previously obtained results, referring to the work presented in [74]. Before doing that, however, it will be necessary to further specify the Naming Game rules, since it turns out that the criterion determining how the interacting agents are selected affects the dynamics significantly. Finally, in Section 3.5 we present some general conclusions, discussing our findings on the different interaction patterns we have analyzed.

An important remark is now in order. Complex networks are nowadays familiar to many scientists working in statistical physics or, in general, in multi-agent modeling. However, the subject is quite new, and we prefer not to assume that the reader is too confident with it. For this reason, we discuss the network measures and models each time they are first introduced, and we often recall definitions at subsequent appearances of important terms. Additionally, for the sake of readability, we have collected the most important information in Appendix A. In any case, in this thesis networks are exploited only as tools to study dynamical processes taking place on them, and we refer the interested reader to [6, 82, 162, 34, 47] for much more detailed and complete presentations.

3.2 Regular lattices

We now consider d-dimensional regular lattices with periodic boundary conditions as the topologies defining the set of possible interactions among the N agents playing the Naming Game described in Chapter 2. Thus, each agent


Figure 3.1: Basic global quantities. Time evolution, in mean-field and in finite dimensions, of the total number of words (or total used memory) Nw(t), of the number of different words in the system Nd(t), and of the average success rate S(t). N = 1024, average over 1000 realizations. The inset in the top graph shows the very slow convergence in finite dimensions.

corresponds to a node of the lattice and can interact only with its 2d nearest neighbors. As we are going to discuss, this situation significantly affects both the scaling of the time needed to reach consensus and the memory required of the agents. As we have seen before, the relevant quantities in the study of the Naming Game are the total number of words in the system Nw(t), which corresponds to the total memory used by the agents, the total number of different words Nd(t), and the average rate of success S(t) of the interactions. Fig. 3.1 displays the time evolution of these three quantities for the 1d and 2d low-dimensional cases, compared to the mean-field case. The first striking difference between low-dimensional grids and the mean-field situation appears from the very first steps of the dynamics, when we observe a very rapid growth of the success rate. As we have seen, in the initial phase the success rate is equal to the probability that two agents that have already played come to interact again, i.e. to t/E, where E is the number of possible interacting pairs. In the mean-field case E = N(N − 1)/2, so that S(t) ∼ 2t/[N(N − 1)], while in finite dimensions E = Nd and S(t) ∼ t/(Nd).


Figure 3.2: Scaling with the population size N. Scaling of the time at which the number of words is maximal, tmax , and of the time needed to obtain convergence, tconv , in 1 and 2 dimensions.

The success rate thus grows roughly N times faster in finite dimensions, as confirmed by numerics. At larger times, however, the eventual convergence is much slower in finite dimensions. The curves for Nw(t) and Nd(t) display in all cases a sharp increase at short times, a maximum at a given time tmax, and then a decay towards the consensus state, reached at tconv, in which all the agents share the same unique word. The short-time regime corresponds to the creation of many different words by the agents. After a time of order N, each agent has typically played once, and therefore O(N) different words have been invented (in fact, typically N/2): the total number of distinct words in the system grows and reaches a maximum that scales as N. In mean-field, each agent can interact with all the others, so that it can learn many different words. In contrast, in finite dimensions words can only spread locally, and each agent has access only to a finite number of different words. The total memory used scales as N, and the time tmax to reach the maximum number of words in the system scales as N^{α_d} with α_1 = α_2 = 1 (Fig. 3.2). Consequently, the same behavior is shown by the maximum number of words, which scales as Nw^max ∼ N^{γ_d} with γ_1 = γ_2 = 1. No plateau is observed in the total number of distinct words, since the coarsening of clusters of agents soon starts to eliminate words. Furthermore, the time needed to reach consensus, tconv, grows as N^{β_d} with β_1 ≈ 3 in d = 1 and β_2 ≈ 2 in d = 2.
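The sketch used above for the complete graph is easily adapted to a ring, which is enough to observe the slow, coarsening-dominated convergence (illustrative code and parameters, small sizes only, since t_conv grows roughly as N^3):

import random

def ng_on_ring(N, rng, max_games=10**8):
    # Minimal Naming Game on a 1d ring; returns the number of games to consensus.
    inventories = [set() for _ in range(N)]
    n_invented = 0
    for t in range(1, max_games + 1):
        s = rng.randrange(N)                    # speaker
        h = (s + rng.choice((-1, 1))) % N       # hearer: one of its two neighbors
        if not inventories[s]:                  # empty inventory: invent a new word
            n_invented += 1
            inventories[s].add(n_invented)
        word = rng.choice(tuple(inventories[s]))
        if word in inventories[h]:              # success: both collapse to the word
            inventories[s] = {word}
            inventories[h] = {word}
        else:                                   # failure: the hearer learns the word
            inventories[h].add(word)
        if t % (10 * N) == 0:                   # coarse periodic consensus check
            first = inventories[0]
            if len(first) == 1 and all(inv == first for inv in inventories):
                return t
    return max_games

rng = random.Random(2)
for N in (32, 64, 128):
    print(N, ng_on_ring(N, rng))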


Figure 3.3: Typical evolution of a one-dimensional system (N = 1000). Black color corresponds to interfaces (sites with more than one word). The other colors identify different single state clusters. The vertical axis represents the time (1000 × N sequential steps), the one-dimensional snapshots are reported on the horizontal axis.

3.2.1 Analytical approach for 1-d lattices

We now focus on the one-dimensional case, where the Naming Game dynamics can be recovered almost exactly by means of analytical arguments. In Figure 3.3 we report a typical evolution of the agents on a one-dimensional lattice, displaying one below the other a number of (linear) configurations corresponding to successive, equally separated time steps. Each agent having a single word in memory is represented by a colored point, while agents with larger inventories are shown in black. The figure clearly shows the growth of clusters of agents having one single word, driven by the diffusion of interfaces made of agents having more than one word in memory. The interfaces appear to be thin, so that the dynamics seems to be dominated by the competition of clusters of agents having the same unique word. This picture is not obvious a priori, since an agent having, e.g., two words in memory can propagate them to its neighbors, leading to possible clusters of agents having more than one word. To understand what happens at the edge between two clusters, we can consider a single interface between two of them: in each cluster, all the agents share the same unique word, say A in the left-hand cluster and B in the other. The interface is a string of length m composed of sites in which both states A and B are present. We call C_m this state (A + B)_m. A C_0 corresponds to two directly neighboring


Figure 3.4: Truncated Markov process associated with interface width dynamics. Schematic evolution of a C0 interface · · · AAABBB · · · , cut at the maximal width m = 3.

clusters (· · · AAABBB · · · ), while C_m means that the interface is composed of m sites in the state C = A + B (· · · AAAC · · · CBBB · · · ). Note that, in the actual dynamics, two clusters of states A and B can be separated by a more complex interface. For instance, a C_m interface can break down into two or more smaller sets of C-states spaced out by A or B clusters, causing the number of interfaces to grow. Numerical investigation shows, however, that such configurations are eliminated in the early times of the dynamics. An approximate expression for the stationary probability that two neighboring clusters are separated by a C_m interface can be computed in the following way. In a one-dimensional line composed of N sites and initially divided into two clusters of A and B, the probability of selecting the unique C_0 interface is 1/N, and the only possible product is a C_1 interface. Thus, there is a probability p_{0,1} = 1/N that a C_0 interface becomes a C_1 interface in a single time step; otherwise it stays in C_0. From C_1 the interface can evolve into a C_0 or a C_2 interface with probabilities p_{1,0} = 3/(2N) and p_{1,2} = 1/(2N), respectively. This procedure is easily extended to higher values of m. The numerics suggest that we can safely truncate this study at m ≤ 3. In this approximation, and considering that the local dynamics happens on a timescale much faster than the global one, the problem amounts to determining the stationary probabilities of the Markov chain reported in Fig. 3.4


Figure 3.5: Interface width in one dimension. Stationary probabilities P_m of interfaces of width m: the Markov chain approach to the statistics of the interfaces produces theoretical values in excellent agreement with simulations.

and defined by the transition matrix

        ( (N−1)/N    1/N        0          0                )
M  =    ( 3/(2N)     (N−2)/N    1/(2N)     0                )        (3.1)
        ( 1/N        3/(2N)     (N−3)/N    1/(2N)           )
        ( 1/N        1/N        3/(2N)     (N−4)/N + 1/(2N) )

in which the basis is {C0, C1, C2, C3} and the contribution 1/(2N) from C3 to C4 has been neglected (see Fig. 3.4). The stationary probability vector P = {P0, P1, P2, P3} is computed by imposing P(t + 1) − P(t) = 0, i.e. (M^T − I)P = 0, which gives

P0 = 133/227 ≈ 0.586,   P1 = 78/227 ≈ 0.344,
P2 = 14/227 ≈ 0.062,    P3 = 2/227 ≈ 0.0088.        (3.2)
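Equation (3.2) is easy to verify numerically; the following sketch (numpy, illustrative only) builds the matrix (3.1), imposes stationarity together with normalization, and recovers the fractions above (the result does not depend on N, which only sets the overall time scale):

import numpy as np

def interface_markov_matrix(N):
    # Transition matrix (3.1) on the basis {C0, C1, C2, C3}.
    return np.array([
        [(N - 1) / N, 1 / N,        0,           0],
        [3 / (2 * N), (N - 2) / N,  1 / (2 * N), 0],
        [1 / N,       3 / (2 * N),  (N - 3) / N, 1 / (2 * N)],
        [1 / N,       1 / N,        3 / (2 * N), (N - 4) / N + 1 / (2 * N)],
    ])

M = interface_markov_matrix(1000)
# Stationary vector: (M^T - I) P = 0 supplemented by the normalization sum(P) = 1.
A = np.vstack([M.T - np.eye(4), np.ones(4)])
b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
P, *_ = np.linalg.lstsq(A, b, rcond=None)
print(P)   # ~ [0.586, 0.344, 0.062, 0.009], i.e. 133/227, 78/227, 14/227, 2/227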

Direct numerical simulations of the evolution of a line · · · AAABBB · · · yield P0 ≈ 0.581, P1 ≈ 0.344, P2 ≈ 0.063, P3 ≈ 0.01, thus clearly confirming the correctness of our approximation (see Figure 3.5). Since the width of the interfaces remains small, as we have seen, we can assume that they are point-like objects localized around their central

position x: in the previously analyzed case, denoting by x_l the position of the right-most site of cluster A and by x_r the position of the left-most site of cluster B, it is given by x = (x_l + x_r)/2. An interaction involving sites of an interface, i.e. an interface transition C_m → C_{m'}, corresponds to a set of possible movements of the central position x. The set of transition rates is obtained by enumeration of all possible cases: denoting by W(x → x ± δ) the transition probability that an interface centered in x moves to the position x ± δ, in our approximation only three symmetric contributions are present, namely δ = ±1/2, δ = ±1 and δ = ±3/2. Collecting all the terms sketched in Figure 3.4 according to the displacement δ they produce, we obtain

W(x → x ± 1/2) = (1/2N) P0 + (1/N) P1 + (1/N) P2 + (1/2N) P3 ,
W(x → x ± 1)   = (1/2N) P2 + (1/2N) P3 ,                                    (3.3)
W(x → x ± 3/2) = (1/2N) P3 .

Using the expressions for the stationary probabilities P0, ..., P3, we finally get

W(x → x ± 1/2) = 319/(454N) ,
W(x → x ± 1)   = 8/(227N) ,                                                 (3.4)
W(x → x ± 3/2) = 1/(227N) .

These transition probabilities are the fundamental ingredient we need to write the master equation for the probability P(x, t) of finding the interface at position x at time t. In the limit of continuous time and space, i.e. writing

P(x, t + 1) − P(x, t) ≈ δt ∂P/∂t (x, t) ,
P(x + δx, t) ≈ P(x, t) + δx ∂P/∂x (x, t) + ((δx)²/2) ∂²P/∂x² (x, t) ,        (3.5)

the master equation reads

∂P(x, t)/∂t = (D/N) ∂²P(x, t)/∂x² ,        (3.6)

where D = 401/1816 ≈ 0.221 is the diffusion coefficient (in the appropriate dimensional units (δx)²/δt). These results are confirmed by numerical simulations, as illustrated in Fig. 3.6, where the numerical probability P(x, t) is shown to be a Gaussian


Figure 3.6: Evolution of the position of an interface · · · AAABBB · · · (N = 200). Top: evolution of the distribution P(x, t) at times t = 10^3, 10^4, 10^5. Bottom: evolution of the mean square displacement, showing a clear diffusive behavior ⟨x²⟩ = 2 D_exp t/N with a coefficient D_exp ≈ 0.224, in agreement with the theoretical prediction.

around the initial position, while the mean-square distance reached by the interface at time t follows the diffusion law ⟨x²⟩ = 2 D_exp t/N with D_exp ≈ 0.224 ≈ D. We then have a clear picture of the whole dynamical evolution of the Naming Game on a one-dimensional lattice. At short times, pairwise interactions create O(N) small clusters, divided by thin interfaces (see the first lines in Fig. 3.3). The number of interfaces at this stage can be estimated by the number of different words in the lattice, which is about N/2. The interfaces then start diffusing. When two interfaces meet, the cluster situated between them disappears, and the two interfaces coalesce. Such a coarsening leads to the well-known growth of the typical cluster size ξ as t^{1/2}. The density of interfaces, at which unsuccessful interactions can take place, decays as 1/√t, so that 1 − S(t) also decays as 1/√t. Moreover, starting from a lattice in which the agents have no words, a time of order N is needed to reach a cluster size of order 1, so that ξ in fact grows as √(t/N) (as also shown by the fact that the diffusion coefficient is D/N), which explains the time tconv ∼ N³ needed to reach consensus, i.e. to reach ξ = N.
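The diffusive picture can also be checked with a short Monte Carlo sketch (illustrative parameters and names, not the thesis code): a walker performing jumps with the rates of eq. (3.4) should have a mean square displacement close to 2Dt/N.

import random

def interface_walk(N, steps, rng):
    # Interface position performing jumps of 1/2, 1, 3/2 with the rates of eq. (3.4).
    moves = [(0.5, 319 / (454 * N)), (1.0, 8 / (227 * N)), (1.5, 1 / (227 * N))]
    x = 0.0
    for _ in range(steps):
        r, acc = rng.random(), 0.0
        for delta, w in moves:
            acc += 2 * w                  # factor 2: jumps to the right and to the left
            if r < acc:
                x += delta if rng.random() < 0.5 else -delta
                break
    return x

N, steps, runs = 200, 20000, 200
rng = random.Random(3)
msd = sum(interface_walk(N, steps, rng) ** 2 for _ in range(runs)) / runs
print(msd, "expected ~", 2 * (401 / 1816) * steps / N)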


Figure 3.7: Typical evolution of a two-dimensional system (N = L² = 10000). Clusters of agents sharing the same unique word are well defined. Snapshots taken at different times, increasing from the upper left picture in lexicographic order, are shown. It is remarkable that the width of the interfaces between clusters (agents with more than one word, in black) does not grow in time, indicating the presence of surface tension, which comes along with coarsening dynamics.

3.2.2 Extensions to higher dimensions

Results concerning the 1d lattice can be extended to the case of higher dimensions. As is clear from Figure 3.7, the interfaces between clusters, although quite rough, are well defined. Moreover, their width does not grow in time, which points to the existence of an effective surface tension. The numerical computation of the equal-time pair correlation function in dimension d = 2 indicates that the characteristic length scale ξ, over which two agents with the same unique word are found, grows as √(t/N) (not shown) (a time O(N) is needed to initialize the agents with at least one word and therefore to reach a cluster size of order 1), in agreement with coarsening dynamics for non-conserved fields [43]. From the point of view of cluster sizes, tconv corresponds to the time needed to reach ξ = N^{1/d}, so that we can argue that tconv ∼ N^{1+2/d}; this has been verified by numerical simulations in d = 2 and d = 3. This scaling and the observed coarsening behavior suggest that the upper critical dimension for this system is d = 4 [43]. A second measurement that can validate this picture is the time required for the implosion of a spherical cluster surrounded by an ideally infinite different cluster [81]. From the simulations shown in Figure 3.8, it turns out that the


Figure 3.8: Surface tension in the 2d regular lattice. The radius of a bubble of agents sharing the same unique word, surrounded by a sea of agents which agree on a different word, decreases as R(t) ∼ √(R0² − σt), as predicted by the theory of coarsening. Data refer to an initial radius R0 = 100 placed in a lattice of side L = 400, and concern a single simulation run.

radius of the bubble decreases as R(t) ∼ √(R0² − σt), where σ is a constant related to the surface tension, a result in agreement with the predictions of the theory of coarsening [43].

In summary, we have seen that the low-dimensional Naming Game presents a very different behavior compared to the mean-field case, and we have also identified the existence of a finite upper critical dimension. The low-dimensional dynamics is initially more effective: less memory per node is required, since agents are prevented from learning a large part of the many different words created. More precisely, the memory requirement is finite in the thermodynamic limit, in strong contrast with the √N requirement of the mean-field case. Moreover, the peak in the number of words is reached after O(N) interactions (tmax ∼ N in any dimension d), while in the mean-field case tmax ∼ N^{1.5}. On the other hand, the time required for convergence is much longer, scaling as tconv ∼ N^{1+2/d}. Thus, for d ≥ 4 the mean-field behavior is recovered (tconv ∼ N^{1.5}). Finally, it is worth noting that, in contrast with other models of opinion dynamics (e.g. the Voter model [30, 81]), the Naming Game presents an effective surface tension that is reminiscent of the nonequilibrium zero-temperature Ising model [43]. However, it is worth stressing again that the dynamics of the Naming Game is heavily affected by the fact that, in contrast with Ising-like models, the number of words (i.e. states)


Figure 3.9: The small-world network. Starting from a regular grid, links are rewired with probability p. Thus, for p = 0 the network is a regular lattice, while for p = 1 it becomes a totally random graph. At small but finite values of p (1/N ≪ p ≪ 1), the small-world property emerges. Along with short distances between nodes, many triangles are present, i.e. the clustering coefficient is high.

an agent can have is neither fixed a priori nor limited (see Chapter 4).

3.3 Small-world networks

Mean-field and low-dimensional structures are opposite poles. In the former, each agent can interact with anybody else; in the latter, the number of neighbors is very small, and the distance between two agents, in terms of intermediate individuals, scales algebraically with the system size. Real social networks [100], which are probably the most realistic topologies for the Naming Game model, can be thought of as lying somewhere in between the two cases we have considered so far, and, as we will see in the next chapter, they exhibit several peculiar properties [6, 82, 162, 47] (see also Appendix A). In particular, social networks are typically “small worlds” in which, on the one hand, the average distance between two agents is small [137], growing only logarithmically with the network size, and, on the other hand, many triangles are present, unlike in totally random networks (a random network is generated as follows: given a set of N nodes, all possible pairs are considered, and a link is added between two of them with probability p; see also Appendix A). In order to reconcile both properties, Watts and Strogatz have introduced the now famous small-world network model [192, 191], which allows one to interpolate between regular low-dimensional lattices and random networks by introducing a certain amount of random long-range connections into an initially regular network.

More precisely (see Fig. 3.9), starting from a one-dimensional lattice of N sites with periodic boundary conditions (i.e. a ring), each vertex being connected to its 2m nearest neighbors, a stochastic rewiring procedure is applied. The vertices are visited one after the other, and each link connecting a vertex to one of its m nearest neighbors in the clockwise sense is left in place with probability 1 − p, and with probability p is reconnected to a randomly chosen other vertex. Thus, for p = 0 the network retains a purely one-dimensional topology, while the random network structure is approached as p goes to 1. At small but finite p (1/N ≪ p ≪ 1), a small-world structure with short distances between nodes, together with a large clustering, is obtained. In this situation, the degree distribution decreases exponentially [27], so that the number of neighbors of each node is bounded.

In this section, we study the Naming Game on small-world networks. The main finding is that the small-world property allows the system to recover the fast convergence typical of the mean-field case. On the other hand, the finite connectivity implies finite memory requirements. Thus, as far as the Naming Game is concerned, the small-world topology turns out to be a very efficient solution. It is also worth stressing that the impact of long-range “shortcuts” on the behavior of various models has already been investigated in numerous cases: from the Ising model [27] to the spreading of epidemics [141, 161] or the evolution of random walks [115, 116]. Dynamics of models inspired by the social sciences are no exception, such as the Voter model [54, 188] or Axelrod's model of culture dissemination [15, 128].

3.3.1 A crossover between two regimes

For p = 0, we know from Section 3.2 that the dynamics proceeds by a slow coarsening of clusters of agents sharing the same state or word. At small p, on the other hand, the shortcuts are typically far from each other, with a typical distance 1/p between them. Thus, the early dynamics is not affected and proceeds as in dimension 1. In particular, at very short times many new words are invented and the success rate is small. After a time of order N, each agent has typically played once, and therefore O(N) different words have been invented. Since the number of neighbors of each site is restricted, each agent has access only to a finite number of different words, so that the average memory required per agent remains finite, as in finite dimensions and in contrast with the mean-field case. The dynamics then proceeds with the usual coarsening phenomenon as long as the clusters are typically one-dimensional, i.e. as long as the typical cluster size is smaller than 1/p. However, as the average cluster size reaches the typical distance ∼ 1/p between two shortcuts, a crossover phenomenon is bound to take place. Since the cluster size grows as √(t/N) (Sec. 3.2), this corresponds to a crossover time t_cross = O(N/p²). For times much larger than this crossover, one expects that the dynamics


Figure 3.10: Average number of words per agent in the system. Nw/N − 1 is plotted as a function of the rescaled time t/N, for small-world networks with ⟨k⟩ = 8 and N = 10^3 nodes, for various values of p. The curve for p = 0 is shown for reference, as well as p = 5·10^{-3}, 10^{-2}, 2·10^{-2}, 4·10^{-2}, 8·10^{-2}, from bottom to top on the left part of the curves. Larger values of p speed up the convergence.

is dominated by the existence of shortcuts and enters a mean-field-like behavior. The convergence time is thus expected to scale as N^{3/2} and not as N³. In order for this picture to hold, the crossover time N/p² needs to be much larger than 1, and much smaller than the consensus time N³ of the one-dimensional case. This means that we need p ≫ 1/N, which is indeed the necessary condition to obtain a small-world network. We can therefore expect the small-world topology to combine advantages from both finite-dimensional lattices and mean-field networks: on the one hand, only a finite memory per node is needed, as opposed to the O(N^{1/2}) of mean-field; on the other hand, the convergence time is expected to be much shorter than in finite dimensions. Figure 3.10 displays the evolution of the average number of words per agent as a function of time, for a small-world network with average degree ⟨k⟩ = 8 and various values of the rewiring probability p and of the size N. While Nw(t) in all cases decays to N after an initial peak whose height is proportional to N, the way in which this convergence is obtained depends on the parameters. At fixed N, for p = 0 a power-law behavior Nw/N − 1 ∝ 1/√t is observed, due to the one-dimensional coarsening


Figure 3.11: Rescaled curves of the average number of words per agent. The rescaling shows the collapse around the crossover time N/p². For each value of p, two values of the system size (N = 10^4 and N = 10^5) are displayed. The curves for different sizes are perfectly superimposed before the convergence.

process. As soon as p ≫ 1/N, however, deviations are observed and become stronger as p is increased: the decrease of Nw is first slowed down after the peak, but leads in the end to an exponential convergence. Both the intermediate slowing down and the faster convergence are enhanced as p increases. On the other hand, increasing the system size at fixed p corresponds to a slower convergence, even on the rescaled time t/N, with a longer and longer plateau at almost constant average used memory (not shown). As mentioned previously, a crossover phenomenon is expected when the one-dimensional clusters reach sizes of order 1/p, i.e. at a time of order N/p². By definition, all sites forming a cluster have only one word in memory, while the sites with more than one word are localized at the interfaces between clusters, whose number is then of order Np. The average excess memory per site (with respect to global consensus) is thus of order p, so that one expects Nw/N − 1 = p G(tp²/N). Figure 3.11 indeed shows that the data for (Nw/N − 1)/p for various values of p and N collapse when tp²/N is of order 1. On the other hand, Fig. 3.12 indicates that the convergence towards consensus is reached on a timescale of order N^{β_SM}, with β_SM ≈ 1.4 ± 0.1, close to the mean-field case N^{3/2} and in strong contrast with the N³ behavior of purely one-dimensional systems. Moreover, as also observed in mean-field, the transition to the final consensus becomes more and more abrupt as the system size increases. The scaling of the convergence time as a function of


Figure 3.12: Convergence at large times. The drop of Nw/N − 1 to 0 shows the convergence time on a rescaled timescale t/N^{1.4}. For each p, three different sizes are shown (N1 = 10^3 for the left peak, curves in black; N2 = 10^4 for the middle peak, curves in blue; N3 = 10^5, right peak, curves in red). On the N^{1.4} scale, the convergence becomes more and more abrupt as N increases. The inset displays the convergence time as a function of size for p = 0 (bullets), p = 0.01 (squares), p = 0.02 (diamonds), p = 0.04 (triangles), p = 0.08 (crosses); the dashed lines are proportional to N^3 and N^{1.4}.

the rewiring probability is, on the other hand, tconv ∼ p^{−1.4±0.1} (not shown), which is consistent with the fact that for p of order 1/N one should recover an essentially one-dimensional behavior, with convergence times of order N³. While the observation of Figs. 3.10 and 3.12 could convey the impression that, after the memory peak, the system tends to reach a stationary state whose length increases with N, the analysis of the evolution of the number of distinct words instead displays a continuous decrease (see Fig. 3.13). During this apparent plateau, therefore, the system is still evolving continuously towards consensus by eliminating redundant words. Figure 3.13, on the other hand, shows that the Nd(t) curves for various system sizes and values of p collapse well when properly rescaled around the crossover time N/p². The combination of the results concerning average inventory sizes and the number of distinct words corresponds to a picture in which clusters of agents sharing a common unique word compete during the time interval between the peak and the final consensus. It is thus interesting to measure how the average cluster size evolves with time and how it depends on the rewiring probability p. At p = 0, we know that the average cluster size ⟨s⟩ evolves


Figure 3.13: Number of different words. Curves refer to small-world networks with ⟨k⟩ = 8, p = 10^{-2} and p = 8·10^{-2}, and increasing sizes (from left to right): N = 10^3 (black), N = 10^4 (blue), N = 10^5 (red). Rescaling on both axes makes the curves collapse around the crossover time N/p². As in Fig. 3.11, two values of the system size (N = 10^4 and N = 10^5) are displayed for each p.

following a pure coarsening law ⟨s⟩ ∝ √t. As p increases, on the other hand, deviations are observed when the time reaches the crossover N/p², at a cluster size 1/p, as expected from the intuitive picture developed above. Figure 3.14 shows the collapse of the curves for different values of p around the crossover region, i.e. of ⟨s⟩p vs. tp²/N for tp²/N of order 1. Interestingly, the first deviation from the √t law corresponds to a slowing down of the cluster growth, in correspondence with the slowing down observed in Fig. 3.10. Because of the long-range links, indeed, the clusters are locally more stable, due to the presence of an effective 'pinning' of the interfaces near a shortcut. This effect is reminiscent of what happens for the Ising model on small-world networks [38] where, at low temperature, the local field transmitted by the shortcuts delays the passage of interfaces. Unlike the zero-temperature Ising limit, however, the present dynamics only slows down and is never blocked in disordered configurations. Strikingly, the final abrupt jump towards a unique cluster of size N starts earlier, and from a smaller average cluster size, as p is increased. Although not intuitive, this behavior can be explained as follows. As p increases, the clusters are smaller and are separated by more and more sites which have more than one word in memory (hence the larger value of Nw/N as p increases) and are more and more correlated. The sudden convergence to global consensus


Figure 3.14: Evolution of the cluster size. The curves (N = 10^4) are rescaled around the crossover region, where the initial coarsening regime is abandoned.

is thus obtained through a final fast agreement process among these sites.

In summary, we have shown how the addition of a finite number of long-range links to a regular one-dimensional structure (p ≫ 1/N) leads to a strong change in the dynamics, allowing a significant acceleration of the convergence process, which passes from the N³ dependence typical of the one-dimensional case to an N^{1.4} scaling, close to the mean-field one. The overall dynamics occurs in two stages, separated by a crossover time scaling as N/p².

3.4 Complex networks

As we have seen (Sec. 1.6), social interactions take place on networks that are neither mean-field-like nor regular lattices, but share a certain number of properties, such as the small-world property (the average topological distance between nodes increases very slowly, logarithmically or even more slowly, with the number of nodes) and the relative abundance of “hubs” (nodes with very large degree compared to the mean of the degree distribution P(k)). More precisely (see Appendix A for a more detailed discussion), the degree distributions are in many cases heterogeneous, with heavy tails that are often power-law (or “scale-free”) distributed: for a significant range of values of k, one has P(k) ∼ k^{−γ} [6, 82, 162, 47]. Moreover, social networks are often characterized by a large transitivity, which means that two neighbors of a given vertex are also connected to each other with large probability. Transitivity can be quantitatively measured by means of the clustering coefficient c_i of vertex i [192], defined as the ratio between the number m_i of edges existing between the k_i neighbors of i and its maximum possible value, i.e. c_i = 2 m_i / [k_i (k_i − 1)]. The average clustering coefficient, defined as C = Σ_i c_i / N, usually takes quite large values in real complex networks.

In order to investigate to what extent these properties affect the local and global dynamics of the Naming Game, in this Section we discuss the results of extensive simulations of the model with agents embedded on the nodes of various paradigmatic computer-generated network models. We refer to Appendix A for the definition of the various models we have used, along with a discussion of their properties. However, since this is the first time that we deal with such structures in this thesis, for greater convenience we recall them in the next paragraph, focusing on the most relevant aspects.
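As a minimal illustration of these definitions (standard library only; `neighbors` is an adjacency dictionary like the one built in the sketch of Sec. 3.3):

def clustering_coefficient(neighbors, i):
    # c_i = 2 m_i / (k_i (k_i - 1)): fraction of existing links among the neighbors of i.
    nbrs = list(neighbors[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    m = sum(1 for a in range(k) for b in range(a + 1, k) if nbrs[b] in neighbors[nbrs[a]])
    return 2.0 * m / (k * (k - 1))

def average_clustering(neighbors):
    # C = (1/N) * sum_i c_i
    return sum(clustering_coefficient(neighbors, i) for i in neighbors) / len(neighbors)

# Small example: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(clustering_coefficient(g, 0), average_clustering(g))   # 1.0 and 7/12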

3.4.1 Networks definition and main properties

Our aim is to understand the influence on the dynamics of the Naming Game of the most salient network properties, such as heterogeneity in the degree distribution, clustering and average degree. We therefore concentrate on a few network models that have indeed become paradigms of (unweighted) complex networks. For the purposes of this thesis (see also Chapter 4), an important distinction is that between homogeneous networks, whose degree distribution is peaked around the average value, and heterogeneous networks, whose degree distribution is skewed, presenting large fluctuations around the average value. The prototype of homogeneous networks is the uncorrelated random graph model proposed by Erdős and Rényi (ER model) [85, 86], whose construction consists in drawing an (undirected) edge with a fixed probability p between each possible pair out of N given vertices. The resulting graph shows a binomial degree distribution with average ⟨k⟩ ≈ Np, converging to a Poisson distribution for large N. If p is sufficiently small (of order 1/N), the graph is sparse and presents locally tree-like structures. In order to account for degree heterogeneity, other constructions have been proposed for random graphs with arbitrary degree distributions [139, 140, 2, 3, 98, 56]. In particular, we will consider the uncorrelated configuration (UC) model, which yields uncorrelated random graphs through the following construction: N vertices, with a fixed degree sequence taken from the desired degree distribution with a cut-off at √N, are connected randomly, avoiding multiple links and self-links. Since many real networks are not static but evolving, with new nodes entering and establishing connections to already existing nodes, many models of growing networks have also been introduced. We will consider the model introduced by Barabási and Albert (BA) [16], which has become one of the most famous models of complex heterogeneous networks, and is constructed as follows: starting from a small set of m interconnected nodes,


new nodes are introduced one by one. Each new node selects m older nodes according to the preferential attachment rule, i.e. with probability proportional to their degree, and creates links with them. The procedure stops when the required network size N is reached. The obtained network has average degree ⟨k⟩ = 2m, small clustering (of order 1/N) and a power-law degree distribution P(k) ∼ k^{−γ} with γ = 3.

The BA networks have small clustering, in contrast with social networks. It turns out that growing networks can also be constructed with a large clustering. In Ref. [83], indeed, Dorogovtsev et al. have proposed a model (DMS model) in which each new node connects to the two extremities of a randomly chosen edge, thus forming a triangle. Since the number of edges arriving at any node is in fact its degree, the probability of attaching the new node to an old node is proportional to its degree, and preferential attachment is recovered. The degree distribution is therefore the same as that of a BA model with m = 2, and the degree-degree correlations are equal as well. However, the clustering coefficient is large, approximately equal to 0.73 [25]. In order to tune the clustering, we can consider a generalization of this construction, in the spirit of the Holme-Kim model [106]: starting from m connected nodes (with m even), a new node is added at each time step; with probability q it is connected to m nodes chosen with the preferential attachment rule (BA step), and with probability 1 − q it is connected to the extremities of m/2 edges chosen at random (DMS-like step). The one-node and two-node properties (i.e. degree distribution and degree-degree correlations) are the same as those of the BA network, while the clustering spectrum, i.e. the average clustering coefficient of nodes of degree k, can be computed as C(k) = 2(1 − q)(k − m)/[k(k − 1)] + O(1/N) [25, 24]: changing m and q allows one to tune the value of the clustering coefficient.

Since the ER model also displays a low clustering, we moreover consider a purposely modified version of this random graph model (Clustered ER, or CER model) with tunable clustering. Given N nodes, each pair of nodes is considered with probability p; the two nodes are then linked with probability 1 − Q while, with probability Q, a third node (not already linked to either of them) is chosen and a triangle is formed. The clustering is thus proportional to Q (with p ∼ O(1/N) we can neglect the original clustering of the ER network), while the average degree is approximately given by ⟨k⟩ ≈ [3Q + (1 − Q)] pN = (2Q + 1) pN. Thus, in order to compare an ER and a CER network with the same ⟨k⟩, we have to tune p in the construction of the corresponding CER.

The next section contains the results of simulations of the minimal Naming Game with agents embedded on ER and BA networks. Our simulations have been carried out on networks of sizes ranging from 10^3 to 5·10^4 nodes, with results averaged over 20 runs per network realization and over 20 network realizations. Since the BA model has a particular hierarchical structure due to its growing construction, we have compared the


10

t/N Figure 3.15: Direct and reverse Naming game. Total memory Nw (top) and number of different words Nd (bottom) vs. rescaled time for two different strategies of pair selection on a BA network of N = 104 agents, with hki = 4. The reverse NG rule (black full line) converges much faster than the direct rule (red dashed line). ing results with the case of networks created with the UC model, in which the exponent of the degree distribution can moreover be varied. It turns out that the obtained behavior is very similar, so that we will display results for the BA model. The effect of clustering will be discussed using the mixed BA-DMS and the CER network models.

3.4.2 Direct and reverse Naming Game

Before analyzing the simulation results, it is necessary to note that the minimal Naming Game model itself, as described in Chapter 2, is not well-defined on complex heterogeneous topologies. The problem lies in the asymmetric interaction rule, according to which the two neighboring agents chosen to interact play different roles. One (the speaker) transmits a word and is thus more “active” than the other (the hearer). We should therefore specify whether, when choosing a pair, we first choose a speaker and then a hearer among the speaker's neighbors, or the reverse order. If the agents sit on a fully connected graph or on a regular lattice, they have equivalent neighborhoods, so the order is not important. On a generic network with degree distribution P(k), however, the degrees of the first chosen node and of its chosen neighbor are distributed, respectively, according to p_k ≡ P(k)


and to q_k ≡ k p_k / ⟨k⟩. The second node will therefore typically have a larger degree, and the asymmetry between speaker and hearer can couple to the asymmetry between a randomly chosen node and its randomly chosen neighbor, leading to different dynamical properties (this is the case, for example, in the Voter model, as studied in [183, 51, 52]). This is particularly relevant in heterogeneous networks, for which a neighbor of a randomly chosen node is a hub with relatively large probability. Thus, along with the usual rules, a further prescription is needed, stating how the roles are assigned to the selected agents. More precisely, there are three possibilities:

Direct Naming Game. A randomly chosen speaker selects (randomly) a hearer among its neighbors. This is probably the most natural generalization of the original rule. We call this strategy the direct Naming Game (or speaker-first selection rule). In this case, larger-degree nodes preferentially act as hearers.

Reverse Naming Game. The opposite strategy, called the reverse Naming Game (or hearer-first selection rule), can also be adopted: we choose the hearer at random and one of its neighbors as speaker. In this case the hubs are preferentially selected as speakers.

Neutral Naming Game. A neutral strategy for picking pairs of nodes is to consider the two extremities of an edge chosen uniformly at random. The roles of speaker and hearer are then assigned randomly, with equal probability, to the two nodes.

Figure 3.15 compares the evolution of the direct and the reverse Naming Game on a BA network of N = 10^4 agents with ⟨k⟩ = 4. In the case of the reverse rule, a larger memory is used although the number of different words created is smaller, and a faster convergence is obtained. This corresponds to the fact that the hubs, playing principally as speakers, can spread their words to a larger fraction of the agents, and remain more stable than when playing as hearers, enhancing the possibility of convergence. Interestingly, as noted for the Voter model [183, 51, 52], the scaling laws of the convergence time are indeed affected by the adopted selection procedure, except in very special cases (power-law degree distribution P(k) ∼ k^{−γ} with exponent γ = 3). However, in what follows we concentrate only on the direct Naming Game, and we do not carry out a similarly in-depth analysis of the other cases. This choice is due to the fact that the direct selection procedure, in which the speaker selects the agent it wants to talk to, is probably the most natural way to model real-world interactions among individuals or artificial agents. A detailed study of the reverse and mixed Naming Games, and their comparison with the analysis we are going to present, is left for future work (see also [18]).
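The three selection rules differ only in how the (speaker, hearer) pair is drawn; a minimal sketch (names ours, `neighbors` being an adjacency dictionary as in the previous sketches) makes the distinction explicit:

import random

def edge_list(neighbors):
    return [(a, b) for a in neighbors for b in neighbors[a] if a < b]

def pick_pair(neighbors, edges, strategy, rng):
    # Return (speaker, hearer) according to the chosen selection rule.
    nodes = list(neighbors)
    if strategy == "direct":            # speaker first, hearer among its neighbors
        speaker = rng.choice(nodes)
        hearer = rng.choice(list(neighbors[speaker]))
    elif strategy == "reverse":         # hearer first, speaker among its neighbors
        hearer = rng.choice(nodes)
        speaker = rng.choice(list(neighbors[hearer]))
    elif strategy == "neutral":         # uniformly random edge, roles assigned at random
        a, b = rng.choice(edges)
        speaker, hearer = (a, b) if rng.random() < 0.5 else (b, a)
    else:
        raise ValueError("unknown strategy: " + strategy)
    return speaker, hearer

rng = random.Random(6)
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
edges = edge_list(g)
print(pick_pair(g, edges, "direct", rng), pick_pair(g, edges, "reverse", rng))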


Figure 3.16: Number of words. ER random graph (left) and BA scale-free network (right) with ⟨k⟩ = 4 and sizes N = 10^3, 10^4, 5·10^4. Top: evolution of the average memory per agent Nw/N versus rescaled time t/N. For increasing sizes a plateau develops in the re-organization phase preceding the convergence. The heights of the peak and of the plateau collapse in this plot, showing that the total memory used scales with N. Bottom: evolution of the number of different words Nd in the system. (Nd − 1)/N is plotted in order to emphasize the convergence to the consensus state with Nd = 1. A steady decrease is observed even while the memory Nw displays a plateau. The mean-field (MF) case is also shown (for N = 10^3) for comparison.

3.4.3 Global quantities

We first study the global behavior of the system through the temporal evolution of the usual three main quantities: the total number N_w(t) of words in the system, the number of different words N_d(t), and the success rate S(t). All these quantities are of course averaged over a large number of runs and network realizations. In Fig. 3.16, we report the curves of N_w(t) and N_d(t) for ER (left) and BA networks (right) with N = 10^3, 10^4 and 5×10^4 nodes and average degree ⟨k⟩ = 4. The corresponding data for the mean-field case (with N = 10^3) are displayed as well for reference. The curves for the average use of memory N_w(t) show a rapid growth at short times, a peak and then a plateau whose length increases as the size of the system is increased (even when time is rescaled by the system size, as in Fig. 3.16). The time and height of the peak, and the height of the plateau, are proportional to N. These scaling properties are systematically studied in Fig. 3.17, which also shows that the convergence time t_conv scales as N^{1.4} for both ER and BA networks.
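A minimal simulation sketch of the direct Naming Game that records these observables is given below (Python, using networkx only to generate the graph; function names, parameters and the sampling strategy are illustrative, and this is not the code used to produce the thesis results). N_w(t) and N_d(t) are read off from the sampled snapshots, while S(t) is obtained as a running average of the recorded game outcomes.

```python
import random
import networkx as nx

def naming_game(G, t_max, sample_every=1000):
    """Direct Naming Game on a networkx graph G.
    Returns periodic samples of (t, N_w, N_d) and the outcome of every game."""
    inventories = {i: set() for i in G}
    nodes = list(G)
    next_word = 0                      # counter used to mint brand-new words
    samples, outcomes = [], []
    for t in range(t_max):
        speaker = random.choice(nodes)
        hearer = random.choice(list(G[speaker]))
        if not inventories[speaker]:   # empty inventory: the speaker invents a word
            inventories[speaker].add(next_word)
            next_word += 1
        word = random.choice(tuple(inventories[speaker]))
        if word in inventories[hearer]:        # success: both collapse onto `word`
            inventories[speaker] = {word}
            inventories[hearer] = {word}
            outcomes.append(1)
        else:                                  # failure: the hearer learns `word`
            inventories[hearer].add(word)
            outcomes.append(0)
        if t % sample_every == 0:
            n_w = sum(len(inv) for inv in inventories.values())
            n_d = len(set().union(*inventories.values()))
            samples.append((t, n_w, n_d))
    return samples, outcomes

# e.g. the giant component of an ER graph with N = 10^3 and <k> ~ 4
G = nx.gnp_random_graph(1000, 4 / 999, seed=1)
G = G.subgraph(max(nx.connected_components(G), key=len)).copy()
samples, outcomes = naming_game(G, t_max=200_000)
```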


Figure 3.17: Scaling behavior. Top: scaling with the system size N of the time of the memory peak (t_max) and of the convergence time (t_conv) for ER random graphs (left) and BA scale-free networks (right) with average degree ⟨k⟩ = 4. In both cases, the maximal memory is needed after a time proportional to the system size, while the time needed for convergence grows as N^β with β ≈ 1.4. Bottom: in both networks the necessary memory capacity (i.e. the maximal value N_w^max reached by N_w) scales linearly with the size of the network.

The apparent plateau of N_w does not, however, correspond to a steady state, as revealed by the continuous decrease of the number of different words N_d in the system: in this re-organization phase, the system keeps evolving by elimination of words, although the total memory in use hardly changes. The scaling law observed for the convergence time is a general, robust feature that is not affected by further topological details, such as the average degree, the clustering or the particular form of the degree distribution. We have checked the value of the exponent, 1.4 ± 0.1, for various ⟨k⟩, clustering levels, and exponents γ of the degree distribution P(k) ∼ k^{-γ} for scale-free networks constructed with the uncorrelated configuration model. All these parameters do instead affect the other quantities, such as the time and the value of the memory peak (see Section 3.4.6). Figures 3.1, 3.16 and 3.17 allow for a direct comparison between the networks investigated here and both the mean-field (MF) topology and the regular lattices. Thanks to the finite average connectivity, the memory peak scales only linearly with the system size N, and is reached after a time O(N), in contrast with MF (O(N^{1.5}) for both peak height and time) but similarly to the


Figure 3.18: Success rate. Top: temporal evolution of S(t) for BA scale-free networks with ⟨k⟩ = 4 and sizes N = 10^3 and 10^4. The dotted black line refers to the mean-field case (N = 10^3). Bottom: short-time behavior of S(t) for N = 10^3 and ⟨k⟩ = 4 (m = 2, circles), ⟨k⟩ = 8 (m = 4, squares), and ⟨k⟩ = 16 (m = 8, triangles). The fitted slopes are very close to the predicted value 2/(⟨k⟩N).

finite-dimensional case. The MF plateau observed in the number of different words, corresponding to the building of correlations between inventories with an increasing global memory in use and almost no cancellation of words, is replaced here by a slow continuous decrease of N_d with an almost constant memory in use. On the other hand, we have seen in Section 3.3 that, with respect to the slow coarsening process observed in finite-dimensional lattices, the small-world property of these networks speeds up the convergence towards the global consensus. Therefore, complex networks exhibiting small-world properties constitute an interesting trade-off between mean-field “temporal efficiency” and regular-lattice “storage optimization”. Figure 3.18 displays the success rate S(t) for BA networks with N = 10^3 (red full line) and 10^4 (blue dashed line) agents and ⟨k⟩ = 4. The behavior for ER networks is similar. The success rate for the mean-field case (N = 10^3) is also reported (black dotted line). The success rate increases linearly at very short times (bottom plot of Fig. 3.18) and then, after a plateau similar to the one observed for N_w, increases on a fast timescale towards 1. At short times most inventories are empty, and the success rate increases linearly, being equal to the probability that two agents interact twice, as we have seen for different topologies. As shown in Fig. 3.18 for BA networks, the slopes of S(t) are well described by t/E, where E = N⟨k⟩/2 is the number of possible interacting pairs (i.e. the number of links in the network). When t ∼ O(N), no inventory is empty anymore, words start spreading through unsuccessful interactions and S(t) displays a bending.
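The scaling exponents quoted here (t_max ∝ N, t_conv ∝ N^β with β ≈ 1.4) are extracted from least-squares fits in log-log coordinates. A minimal sketch of such a fit is shown below; the data points are synthetic placeholders generated only to illustrate the procedure and are not thesis measurements.

```python
import numpy as np

def fit_power_law(sizes, times):
    """Least-squares fit of t ~ A * N**beta in log-log coordinates; returns (A, beta)."""
    beta, log_a = np.polyfit(np.log(sizes), np.log(times), 1)
    return np.exp(log_a), beta

# Synthetic, illustrative data (NOT thesis measurements): t_conv ~ N^1.4 with noise.
rng = np.random.default_rng(0)
sizes = np.array([1e3, 3e3, 1e4, 3e4, 1e5])
times = 5.0 * sizes**1.4 * rng.lognormal(0.0, 0.05, size=sizes.size)
A, beta = fit_power_law(sizes, times)
print(f"t_conv ~ {A:.2f} * N^{beta:.2f}")
```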

3.4.4 Clusters statistics

We now turn our attention to the behavior of clusters of agents. Again, we call “cluster” any set of neighboring agents sharing a common unique word. In the case of agents embedded in low-dimensional lattices, we have seen that the dynamics of the Naming Game proceeds by formation of such clusters, which grow through a coarsening phenomenon: the average cluster size (resp. the number of clusters) increases (resp. decreases) algebraically with time (see Section 3.2). On generic networks, a different behavior is observed. As shown in Fig. 3.19 for the BA model (the behavior is very similar for the ER model), the number of clusters very rapidly reaches a plateau that lasts up to the convergence time, at which it suddenly falls to 1. Moreover, the normalized average cluster size remains very close to zero (in fact, of order 1/N) during the plateau, and converges to one with a similarly sudden transition. This transition becomes steeper when the average degree increases (and also when the size of the system increases), as also emphasized by sharper peaks in the variance of the cluster size. It is also interesting to monitor the number of agents with a given number of words: agents with only one word are part of clusters, while agents using more memory are propagating words from one part of the system to another. Fig. 3.20 shows the temporal evolution of the fractions of nodes with 1, 2 and 3 words. As for N_w and the cluster size, these quantities display plateaus whose length increases with the system size (even in rescaled time units t/N), and converge respectively to 1 and 0 abruptly at t_conv. Moreover, n_1 is much lower than what would be observed in a coarsening process, in which agents with more than one word are found only at the interfaces. The emerging picture is very different from the coarsening obtained on finite-dimensional lattices, although the initial formation of small clusters of agents reaching a local consensus through repeated interactions is similar. While a majority of nodes soon form small clusters, the fraction of nodes with more words is not negligible and decreases only at the end of the evolution. Therefore, the dynamics cannot be seen as a coarsening or growth of clusters, but rather as a slow building of correlations between inventories, in a way much more similar to what is observed in mean-field.
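Operationally, a cluster can be computed as a connected component of the subgraph spanned by single-word agents, keeping only the edges whose endpoints hold the same word. A minimal sketch (reusing the inventory dictionary of the simulation sketch above; the helper is hypothetical, not the thesis code):

```python
import networkx as nx

def cluster_stats(G, inventories):
    """Number and sizes of clusters, where a cluster is a maximal connected set of
    neighboring agents sharing the same unique word (single-word inventories)."""
    single = {i for i, inv in inventories.items() if len(inv) == 1}
    H = nx.Graph()
    H.add_nodes_from(single)
    for u, v in G.edges():
        if u in single and v in single and inventories[u] == inventories[v]:
            H.add_edge(u, v)
    sizes = [len(c) for c in nx.connected_components(H)]
    return len(sizes), sizes
```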

3.4.5 Effect of the degree heterogeneity

Global properties of dynamical processes are often affected by the heterogeneous character of the network topology [82, 162].


Figure 3.19: Cluster statistics. BA network with N = 10^4, ⟨k⟩ = 4 (circles), ⟨k⟩ = 8 (squares), ⟨k⟩ = 16 (crosses). From top to bottom: total number of clusters, average normalized cluster size ⟨s⟩/N, and fluctuations of the cluster size vs. time.

The previous subsection has shown, however, that, similarly to what happens for the Voter model [183, 51, 52], the dynamics of the Naming Game is similar on heterogeneous and homogeneous networks. Nonetheless, a more detailed analysis reveals that agents with different degrees present very different activity patterns, whose characterization is necessary to gain additional insight into the Naming Game dynamics (see Chapter 4). Let us first consider the average success rate S_k(t) of nodes of degree k. At the early stages of the dynamics it can be computed following the arguments of Section 3.4.3. The probability of choosing the edge i-j twice is

\[
\frac{t}{N}\left(\frac{1}{k_i} + \frac{1}{k_j}\right), \tag{3.7}
\]

i.e. the probability of choosing first i (with probability 1/N) and then j (with probability 1/k_i), or vice versa. Neglecting the correlations between k_i and k_j, we can average over all nodes


Figure 3.20: Coexistence of agents with different inventory sizes on networks. ER and BA networks with average degree ⟨k⟩ = 4. Fractions of nodes with one (n_1, black), two (n_2, red) and three (n_3, blue) words versus time. These fractions evolve only very slowly during the plateau displayed by the memory. The curves for three different sizes are reported: N = 10^3 (dotted line), N = 10^4 (dashed line) and N = 5×10^4 (full line).

i of fixed degree k_i = k, obtaining

\[
S_k(t) \simeq \frac{t}{N}\left(\frac{1}{k} + \left\langle \frac{1}{k} \right\rangle\right). \tag{3.8}
\]

Fig. 3.21 shows that, on uncorrelated scale-free networks (UC model), the data obtained by numerical simulations (circles) are in qualitative agreement with the direct calculation of the expression in Eq. (3.8) (crosses). These data, together with Eq. (3.8), show that at the very beginning the success rate grows linearly, but the effect of the degree heterogeneity is partially screened by the presence of the constant term ⟨1/k⟩. The same argument can be used to predict that the success rate should be essentially degree independent at larger times: S(t) is indeed always given by two terms, of which only the one referring to the node playing as speaker contains an explicit dependence on 1/k. The argument is only approximate, since the multiplicative prefactors contain non-negligible correlations due to the overlapping inventories. More precisely, these arguments are correct for a neutral Naming Game rule, but they should also hold for the direct Naming Game, in which the constant term, coming from the activity of nodes as hearers, is much more relevant for high degree nodes. Another interesting point concerns the memory peak. From Fig. 3.22 it is clear that the height of the memory peak is larger for nodes of larger degree. This can be understood considering that hubs act more frequently as hearers and therefore receive and collect the different words created in the


Figure 3.21: Success rate vs. node degree. The four plots correspond to snapshots of the success rate per degree class, S_k(t*), at times t*/N = 0.1, 0.2, 0.3, 0.4 in a power-law random graph with exponent γ = 2.2 and size N = 10^4 generated using the uncorrelated configuration model. The numerical computation of the success rate (black circles) qualitatively agrees with the numerical evaluation of the expression in Eq. (3.8) (red crosses).

various areas of the network they connect together.[2] In fact, the maximal memory used by a node of degree k is proportional to √k (see bottom panel of Fig. 3.22). For the mean-field case, all agents have degree k = N − 1 and the maximal value of the total memory N_w indeed scales as N√k ∼ N^{3/2}. Note, however, that in the general case the estimation of the peak of N_w is not as straightforward: this peak is a convolution of the peaks of the inventory sizes of single agents, which have distinct activity patterns and may reach their maximum memory at different times. The knowledge of the average maximal memory of a node of degree k is not sufficient to understand which degree classes play a major role in driving the dynamics towards consensus. More insight on this issue can be obtained by observing the behavior of the total number of different words in each degree class. Figure 3.23 shows the evolution of the number N_d(k, t) of different words in the class of nodes with degree k, for various values of k in a BA network with size N = 10^4 and ⟨k⟩ = 4. Two competing effects determine the differences between nodes: high degree nodes require more memory than low degree nodes (Fig. 3.22), but their number is much smaller.

[2] Note that for the reverse Naming Game, the hubs act more frequently as speakers and therefore accumulate far fewer different words. The required maximal memory then slightly decreases at large k.


Figure 3.22: Role of the degree on the average and maximum inventory size. BA model with m = 2 (i.e. ⟨k⟩ = 4), N = 5×10^4. Bottom: maximum memory used by a node as a function of its degree; the dashed line is ∝ √k. Top: average memory used by nodes of degree k, for various values of k. The lines show the total memory N_w(k, t) used by nodes of degree k at time t, normalized by the number N_k of nodes of degree k. The circles correspond to the bottom curve (k_0 = 5) rescaled by √(k/k_0), showing the scaling of the peaks. Note that the values of N_w(k, t)/N_k are averages over many runs that wash out fluctuations and therefore correspond to smaller values than the extremal values observed for N_w^max(k).

As a result, low degree classes overall store a larger number of different words (as shown in Fig. 3.23). This is due to the fact that during the initial phase, in which words are invented, low degree nodes are more often chosen as speakers and invent many different words. Each hub needs a larger memory, but the hubs in fact retain a smaller number of different words. After the peak in memory, the dynamical evolution displays a relatively fast decrease of N_d(k, t) for small k, while a plateau is observed at large k: words are progressively eliminated for low-k nodes, while the hubs, which act as intermediaries and are in contact with many agents, typically still have many words in their inventories. The role of the hubs, then, is to diffuse words throughout the network, and their property of connecting nodes with originally different words helps the system to converge. On the other hand, playing mostly as hearers, the hubs are not able to actively promote successful words, and their convergence follows that of the neighboring low-degree sites. In fact, once the low-degree nodes have successfully eliminated most of the different words created initially, the system globally converges on a faster timescale.


Figure 3.23: Total number of different words in classes of degree k. Data refer to a BA network with m = 2, N = 10^4, for values of k = 5, 20, 50, 100 (the curves for k = 50 and 100 are almost on top of each other). Low degree nodes, being the vast majority, store a cumulative number of different words larger than the high degree ones.

We note that the average memory N_w(k, t)/N_k converges slightly faster than N_d(k, t) (and that N_d(k, t) converges faster for larger k), showing that the very final phase consists in the late adoption of the consensus by the lowest degree nodes, in a sort of final cascade from the large to the small degrees.

3.4.6 Effect of the average degree and clustering

Social networks are generally sparse graphs (see Appendix A), but their structure is often characterized by a high local cohesiveness, which is the result of a very natural transitive property of many social interactions [100]. The simplest way to take these features into account in the dynamics of the Naming Game is to study the effects of changing the average degree and the clustering coefficient of the network. The effects of increasing the average degree on the behavior of the main global quantities are reported in Fig. 3.24. In both the ER (left) and BA (right) models, increasing the average degree produces an increase in the memory used, while the global convergence time is decreased. Note also that, while the behavior of the convergence time with N (i.e. a power law N^β with β ≈ 1.4) is very robust, the linear scaling of the memory peak properties (N_w^max ∝ N^α and t_max ∝ N^α with α = 1) is slightly altered by an


Figure 3.24: Role of the average degree. ER networks (left) and BA networks (right) with N = 10^4 agents and average degree ⟨k⟩ = 4, 8, 16. The increase of the average degree leads to a larger memory used (N_w, top) but a faster convergence. The maximum in the number of different words is not affected by the change in the average degree (bottom).

increase in the average degree (not shown). Increasing ⟨k⟩ at finite N indeed brings the system closer to the mean-field behavior, where the scaling of these quantities is non-linear (α_MF = 1.5); for fixed ⟨k⟩, however, the linear scaling is recovered at large enough sizes. Moreover, for larger average degree the number of nodes having only one word decreases (not shown); i.e. the system needs a more complicated re-organization phase that involves a larger number of agents with many words, but induces a faster convergence. In fact, the larger number of interaction possibilities provided by the additional connections allows for a better sharing of common words and for a more efficient correlation of the inventories, thus favoring a faster convergence. Note that the clustering changes slightly when the average degree is changed, but its variation is small enough for the two effects to be studied separately. Here we use other mechanisms to enhance clustering, summarized in the following two models defined in Section 3.4.1: clustered Erdős–Rényi (CER) random graphs and the mixed BA-DMS model. Figure 3.25 shows the effect of increasing the clustering at fixed average degree and degree distribution: the number of different words is not changed, but the average memory used is smaller and the convergence takes more time.


Figure 3.25: Effect of clustering. Curves of the total number of words N_w(t) and of the number of different words N_d(t) on random graphs (left) and scale-free networks (right) with N = 10^4. In order to inject triangles into the ER random graphs we have used the construction described in Section 3.4.1, obtaining clustered random graphs (CER model, with clustering coefficient proportional to Q) that are compared to standard ER graphs with equal average degree (⟨k⟩ = 6 and 10). Scale-free networks have been generated with the mixed BA-DMS model described in Section 3.4.1, in which the clustering coefficient is proportional to 1 − q. In both networks, higher clustering leads to a smaller required memory capacity but a larger convergence time.

Moreover, the memory peak at fixed k is smaller for larger clustering (not shown): it is more probable for a node to speak to two neighbors that share common words, because they are themselves connected and have already interacted, so that it is less probable to learn new words. Favored by the larger number of triangles, cliques of neighboring nodes learn the same word from the start, causing a slight increase in the fraction of nodes with only one word, as reported in Fig. 3.26 for both homogeneous and heterogeneous networks. At fixed average degree, i.e. at fixed global number of links, fewer connections are moreover available to transmit words from one part of the network to the other, since many links are used up in “local” triangles. The enhanced local coherence is therefore, in the long run, an obstacle to global convergence. We note that this effect is similar to the observation by Holme et al. [107] that, at fixed ⟨k⟩, more clustered networks are more vulnerable to attacks, since many links are “wasted” in redundant local connections.


Figure 3.26: Effect of enhanced clustering on the agents' inventory size. The fractions of agents with one (n_1, black), two (n_2, red) and three (n_3, blue) words are plotted. Top: we compare a clustered random graph (CER model, with clustering coefficient proportional to Q) to standard ER graphs, both with average degree ⟨k⟩ = 10. Bottom: scale-free networks generated with the mixed BA-DMS model described in Section 3.4.1, in which the clustering coefficient is proportional to 1 − q. In both cases there is a tendency to increase the fraction of agents with one word and to decrease the other fractions.

3.4.7 Effect of hierarchical structures

In the previous sections we have argued that networks with the small-world property display fast (mean-field like) convergence after a re-organization phase whose duration depends on other properties of the system. The small-world property holds when the diameter of the network grows slowly, i.e. logarithmically or slower, with its size N. This ensures that every part of the network is rapidly reachable from any other part, in contrast to what happens with regular lattices. Such a property therefore generically enhances the possibility of creating correlations between the inventories of the agents and of finally converging to a consensus. In this subsection, we show that this line of reasoning admits, surprisingly at first sight, some exceptions. The first (and easiest to understand) exception is given by scale-free trees, obtained by the preferential attachment procedure with m = 1.


Figure 3.27: Role of hierarchical structure. Evolution of the number of words for the Naming Game on hierarchical networks, namely the BA tree, the DMS model, the hierarchical BRV network [17], and the Random Apollonian Network. The case of the BA model with m = 4 is shown with a dashed line for reference.

In this case, as shown in Fig. 3.27, the convergence is reached very slowly, with N_w(t)/N − 1 decreasing as a power law of time. This is in contrast with the generic behavior, i.e. a plateau followed by an exponential convergence, shown for reference in Fig. 3.27 as well, but similar to the finite-dimensional lattices (the average cluster size also grows as a power law, in contrast with the data of Fig. 3.19). This is reminiscent of what happens for the Voter model [183, 51, 52], in which a power-law, instead of exponential, decrease of the fraction of active bonds is observed, and can be understood through the tree structure of the network. Indeed, from the viewpoint of the dynamics, a tree is formed by two ingredients: linear structures, on which the interfaces between clusters diffuse as in one-dimensional systems, and branching points, at which interfaces may be pinned and their motion slowed down. In fact, we have checked that similar (slow) power-law behaviors are also obtained for the Naming Game on Cayley trees (in which every node has the same degree) and on scale-free trees with different degree distributions (obtained through the generalized linear preferential attachment model). The slowness of this dynamical behavior is however rooted in a slightly more subtle consideration. As Fig. 3.27 shows, the Naming Game displays power-law convergence also on other heterogeneous networks that are not at all tree-like, such as the DMS model with m = 2 [83], in which at each step a triangle is created, the deterministic scale-free networks of Barabási, Ravasz and Vicsek (BRV) [17], or the Apollonian and Random Apollonian

Networks (RAN) [11, 198]. Let us briefly recall how these networks are constructed (a minimal construction sketch for the RAN case is given after this list):

- For the DMS model with m = 2, one adds at each step a new node, which is connected to the extremities of a randomly chosen edge.

- For the deterministic scale-free BRV networks, one starts (step 1) with two nodes connected to a root. At each step n, two units (of 3^{n-1} nodes each) identical to the network formed at the previous step are added, and each of the bottom 2^n nodes is connected to the root.

- The Random Apollonian Networks are embedded in a two-dimensional plane. One starts with a triangle; a node is added in the middle of the triangle and connected to the three previous nodes; at each step, a new node is added inside one of the existing triangles (chosen at random) and connected to its three corners, replacing the chosen triangle by three new, smaller triangles.
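As an illustration, a minimal sketch of the Random Apollonian Network construction recalled above could read as follows (Python with networkx; the function name and parameters are illustrative, and this is not the generator used for the thesis results):

```python
import random
import networkx as nx

def random_apollonian(n_nodes, seed=None):
    """Grow a Random Apollonian Network: start from a triangle and repeatedly insert
    a new node inside a randomly chosen triangular face, connected to its three corners."""
    rng = random.Random(seed)
    G = nx.complete_graph(3)              # initial triangle: nodes 0, 1, 2
    faces = [(0, 1, 2)]                   # active triangular faces
    new = 3
    while G.number_of_nodes() < n_nodes:
        a, b, c = faces.pop(rng.randrange(len(faces)))   # face chosen uniformly at random
        G.add_edges_from([(new, a), (new, b), (new, c)])
        faces.extend([(a, b, new), (b, c, new), (a, c, new)])
        new += 1
    return G
```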

All these networks share a very important but hard to quantify property: they are built hierarchically. This is particularly clear in the BRV case, since at each step the new network is formed by three identical sub-networks. In the RAN as well, hierarchically nested units can be identified with the triangles, each of which contains other, smaller triangles. Finally, in the DMS case, one can identify a unit at a certain scale as an edge together with the set of nodes that have been attached to the extremities of this edge or of the edges subsequently created within this unit. Because of these particular network organizations, each node belongs in fact to a given sub-hierarchical unit and, to go from one node to another node in a different sub-unit, a hierarchical path has to be followed. Trees represent a particular class of such structures, in which there exists only one path between two given nodes. In this sense, such networks, although small-world, present a structure that makes communication between different parts of the network more difficult. Each sub-unit can therefore converge towards a local consensus, which makes the global consensus more cumbersome to achieve. Such results show that the small-world property does not by itself guarantee an efficient convergence of dynamical processes such as the Naming Game, and that strongly hierarchical structures in fact slow down and obstruct such convergence.

3.4.8 The role of community structures

In contrast with other non-equilibrium models, such as those based on zero-temperature Glauber dynamics or the Voter model [183, 54, 38, 51, 52], we do not find any signature of the occurrence of metastable blocked states in any


Figure 3.28: Metastable states in networks with strong community structure. Each community is composed of c nodes so that there are N/c = 100 communities. The dashed ellipse and the arrow identify the metastable states with Nd = N/c different words.

relevant topology with quenched disorder. Indeed, while the total number of words displays a plateau whose length increases with the system size during the re-organization phase, the number of different words decreases continuously, revealing that the convergence is not a matter of fluctuations due to finite-size effects, but the result of an evolving self-organizing process. Such behavior makes the Naming Game a robust model of self-coordinated communication in any structured population of agents. A noticeable exception concerns the case of agents sitting on networks with strong community structure, i.e. networks composed of a certain number of internally highly connected groups interconnected by few links acting as bridges. Figure 3.28 reports the behavior of the Naming Game on such a network, composed of fully connected cliques of c nodes each, the various cliques being connected to each other by only one link. Simulations show that not only the total number of words, but also the number of different words displays a plateau whose duration increases with the size of the system. The number of different words in the plateau equals the number of communities, while the corresponding total number of words per node is about one, proving the existence of a real metastable state in which the system reaches a long-lasting multi-vocabulary configuration. Indeed, each community reaches internal consensus, but the weak connections between communities are not sufficient for words to propagate from one community to the other.
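A network of this kind is straightforward to generate. The sketch below builds one plausible realization: N/c fully connected cliques of c nodes each, arranged in a ring with a single bridging link between consecutive cliques (the precise placement of the bridges is an assumption, since the text does not specify it):

```python
import itertools
import networkx as nx

def ring_of_cliques(n_cliques, clique_size):
    """Fully connected communities of `clique_size` nodes, consecutive cliques
    joined by a single link (one possible reading of the topology described above)."""
    G = nx.Graph()
    for c in range(n_cliques):
        members = range(c * clique_size, (c + 1) * clique_size)
        G.add_edges_from(itertools.combinations(members, 2))   # internal clique links
        if c > 0:
            G.add_edge(c * clique_size, c * clique_size - 1)    # bridge to previous clique
    G.add_edge(0, n_cliques * clique_size - 1)                  # close the ring
    return G

# e.g. N = 10^4 agents split into N/c = 100 communities of c = 100 nodes
G = ring_of_cliques(100, 100)
```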


Figure 3.29: Metastable states in the WAN network. Top: the number of different words decreases through steps spaced out by long plateaus, while the total number of words is approximately constant and close to N = 3880. The fraction of boundary sites B(t), i.e. of sites which in principle could play an unsuccessful interaction in the next game, is approximately 20%. Bottom: histogram of the fraction of agents, n, storing each of the three surviving words at time t = 3×10^11, ordered in terms of their popularity (rank order). There is not a dominating word; instead, different clusters of agents are competing.

Interestingly, we find a similar behavior on a real network, namely the World-wide Air-transportation Network (WAN), in which vertices are airports and links are direct flights connecting them. The network is made of N = 3880 connected airports, with an average degree ⟨k⟩ = 9.7, the most connected node having k_c = 318 neighbors. The degree distribution is a power law, P(k) ∼ k^{-γ} f(k/k_c), with γ ≈ 2.0 and an exponential cutoff f(k/k_c). The network is small-world, since the average distance among nodes is ⟨l⟩ = 4.4, and, as usual, we have not considered weights on the links. In Figure 3.29, we show that a single run of the Naming Game on the WAN has not yet converged after t = 3×10^11 time-steps (and, in different simulations, we did not observe convergence after t = 10^12 interactions), while, for comparison, on a Barabási–Albert network of the same size and average degree, consensus is reached in a time of order O(10^6). The number of different words presents several decreasing steps, while the total number of words is approximately constant and close to N (i.e. the average number of words per agent is very close to unity). We also consider the number of boundary sites, B(t), defined as agents whose next game may be

a failure, and we see that they are approximately 20% of the population, i.e. B(t) ≈ 0.2. In the bottom panel of Figure 3.29 we also plot the rank histogram of the three different words surviving at time t = 3×10^11, showing that there is no strongly dominating word; on the contrary, different clusters of agents are competing. A rough analysis suggests that there is a correspondence between the observed clusters and the geographical coordinates of the airports: for instance, European and North American airports form two well separated clusters. These are, of course, only preliminary results, and a more detailed analysis of this and other real-world networks will be addressed in future work.
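The fraction of boundary sites B(t) can be estimated directly from a snapshot of the inventories. The sketch below assumes, as a working criterion, that an agent is a boundary site whenever at least one of its neighbors holds a different inventory, so that an unsuccessful game with that neighbor is possible; this reading of the definition is an assumption, not a prescription taken from the thesis.

```python
def boundary_fraction(G, inventories):
    """Fraction of agents whose next game could be a failure (assumed criterion:
    at least one neighbor holds an inventory different from the agent's own)."""
    boundary = sum(
        1 for i in G if any(inventories[i] != inventories[j] for j in G[i])
    )
    return boundary / G.number_of_nodes()
```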

3.5 Conclusions

In this chapter we have investigated the role of different interaction patterns on the global dynamics of the Naming Game. In d-dimensional regular lattices, each agent is connected to a finite number of neighbors (2d) and stores only a finite number of different words in its inventory at any given time. As a result, the total amount of memory used by the whole system grows as N, instead of the N^{3/2} of the mean-field topology. Local consensus appears at very early stages of the evolution, since neighboring agents tend to share the same unique word. The dynamics then proceeds through the coarsening of such clusters of agents sharing a common name; the interfaces between clusters are composed of agents who still have more than one possible name, and diffuse randomly. Because of this particular coarsening process, the average cluster size grows as √(t/N), and the time to convergence corresponds to the time needed for one cluster to reach the system size, i.e. a time N^{1+2/d} for d ≤ 4, which is then the upper critical dimension. In one dimension in particular, we have also shown analytically that the convergence is thus dramatically slowed down from O(N^{3/2}) to O(N^3). In summary, compared to the mean-field case, low-dimensional lattice systems require more time to reach consensus, but a lower use of memory. The next step has consisted in exploring the role of the small-world property (the average hopping distance between two nodes scales only logarithmically with the size of the network), which is a crucial signature of most complex networks. We have started from the model proposed by Watts and Strogatz [192, 191], which allows one to interpolate between one-dimensional regular structures and random graphs by tuning the rewiring probability p. Interestingly, we have been able to identify two regimes ruling the global dynamics of the Naming Game on such networks when 1/N ≪ p ≪ 1. Indeed, at the beginning everything goes as in the one-dimensional case but, after a time of order N/p^2, the small-world property ensures the propagation of different words out of the local scale, boosting the spreading process and


thus fostering convergence. More precisely, we find t_conv ∼ N^{β_SW}, with exponent β_SW ≈ 1.4 (the discrepancy with the mean-field exponent, β ≈ 1.5, may be due to logarithmic corrections that are unlikely to be captured by numerical scaling techniques). On the other hand, small-world networks ensure a finite memory requirement per agent, since the peak of the total number of words scales as N. This is due to the finite connectivity of these networks, which are sparse, i.e. whose average degree ⟨k⟩ is small compared to N. In summary, small-world networks constitute a perfect trade-off between low-dimensional and mean-field topologies: like the former, they require finite memory, but the presence of shortcuts allows for fast convergence. We have also explored a large number of complex network models in order to investigate the role of their different properties on the Naming Game. First of all, in all cases we have found that the convergence time scales as t_conv ∼ N^{β_SW}, with β_SW ≈ 1.4. Since all the considered networks share the small-world property, this finding strongly corroborates our analysis of the Watts and Strogatz model [192, 191], suggesting that this feature is indeed a sufficient condition for fast convergence. We have also confirmed that low connectivity is responsible for finite memory requirements. Beyond these unifying aspects, we have been able to point out important differences between homogeneous and heterogeneous networks. In homogeneous networks all nodes have a similar neighborhood and therefore a similar dynamical evolution, while in heterogeneous networks classes of nodes with different degree play different roles in the evolution of the game. In the direct Naming Game, high degree nodes are indeed more likely to be chosen as hearers and, consequently, they have larger inventory sizes. At the beginning, low degree nodes are much more involved than the hubs in the process of word generation. Local consensus is easily reached, and a large number of locally stable different words get in touch with higher degree nodes. The latter start to accumulate a large number of words in their inventories, acting as spreaders of names towards less connected agents and finally driving the convergence. From this viewpoint, the convergence pattern of the Naming Game on heterogeneous complex networks presents some similarities with the more studied epidemic spreading phenomena [28]. We shall explore the microscopic dynamical patterns in different topologies in greater detail in Chapter 4. The relation between topological properties and the dynamical evolution of the system is further characterized by a detailed study of the effects of varying the average degree and the clustering coefficient. These effects are equivalent on homogeneous and heterogeneous networks. While any increase of the average degree leads to a larger memory peak and a faster convergence, the growth of the clustering coefficient decreases the necessary memory, but the fast growth of local consensus delays the global convergence in the long run. The latter effect is particularly relevant for real social

3.5 Conclusions networks in which local cohesiveness is an important feature that cannot be neglected. Finally, the presence of hierarchical structures slows down the convergence time dramatically. We have shown that the main reason is that agents in different hierarchical levels are able to find rapidly a consensus, but a cluster cannot expand itself easily out of its original site due to competition with other clusters. This is exactly what happens when a strong community structure is present. In this case the number of different words can exhibit arbitrarily long plateaus whose height represents the number of competing communities. However, in the end the population is always able to reach a consensus, showing that the Naming Game is governed by a strongly converging dynamical rule. In conclusion, populations of agents with fixed complex topology do evolve towards a homogeneous state of consensus and efficient communication, except for cases in which hierarchical or community structures are very strong, and the detailed topological properties affect only the convergence pattern and time scale. The interesting issue of a possible interplay between topology and dynamics, in situations in which the agents are free of rearranging their connectivity patterns in relation to the dynamical evolution of the system, is not addressed in this thesis, but it is an important direction for future work.


Chapter 4

Microscopic activity patterns

4.1 Introduction

Like many other models of social interaction, the Naming Game is a non-equilibrium model in which the system eventually reaches a stationary state. The dynamical evolution of these systems is usually characterized by a temporal region in which the system reorganizes itself, followed by the sudden onset of a very fast convergence process induced by a symmetry breaking event. As we have seen, the Naming Game presents this type of dynamics when the agents are embedded both in a mean-field like topology, i.e. a complete graph, and in complex networks with the small-world property, which are undoubtedly the most realistic cases for models of social interaction. In this chapter, based on the work presented in [72], we concentrate on the analysis of single agent activity, which we started to investigate in Chapter 3. With respect to the usual global quantities studied in the previous chapters, this point of view allows us to point out deeper connections between the learning process of the agents (i.e. the dynamics of acquisition and deletion of words by a single agent) and the topological properties of the system. Indeed, it turns out that, far from the convergence process, the shape of the distribution of the number of words stored by an agent, i.e. its memory size, depends on purely topological properties of the system (i.e. the first two moments ⟨k⟩ and ⟨k²⟩ of the degree distribution). In particular, we show analytically, by means of a master equation approach, that homogeneous graphs, whose degree distribution is peaked around the average value, yield exponential distributions, while heterogeneous networks, characterized by large fluctuations of the agents' degree, give rise to half-normal distributions. During the convergence process, on the other hand, the master equation approach is still appropriate to describe the agents' internal dynamics, but with qualitatively different results. All these systems tend to develop a power-law memory-size distribution, which is a signature of the convergence process,


but it actually emerges only in the case of the complete graph (the ‘mean-field’ case). In other topologies, the cut-off sets in too early for the power law to be observed. The analytical and numerical study of the memory-size distributions therefore provides deep insights into the influence of the topology on the dynamics of the Naming Game. Moreover, the new findings are complementary to those we already know from the analysis of global observables, and allow for a deeper understanding of the observed phenomena. The chapter is organized as follows. The next section contains the main numerical results concerning the internal dynamics of individual agents in the Naming Game. In Section 4.3, the problem of determining the agents' internal dynamics is addressed using a master equation approach. Section 4.4 illustrates in detail some interesting cases. Conclusions and a discussion of the relevance of the present work are reported in Section 4.5.

4.2 Agents activity on different topologies

In this section, we study numerically the activity of an agent, focusing on the dynamics of its memory or inventory size, i.e. the number of words n_t stored in the inventory of a node at time t. In particular, the present analysis is conceived for topologies on which one cannot clearly identify a coarsening process leading to the nucleation and growth of clusters containing quiescent agents (e.g. the complete graph, homogeneous and heterogeneous random graphs, high-dimensional lattices, etc.). As we know, complex networks represent typical examples of such topological structures. In other topologies, such as low-dimensional lattices, the agents' internal activity is limited by the small number of words locally available (Chapter 3). An example of the different activity patterns in different topologies is reported in Fig. 4.1. The top panels show the different levels of activity displayed by low and high degree nodes in a heterogeneous BA network. Indeed, as already pointed out in Sec. 3.4.2, the asymmetry of the Naming Game interaction rules becomes relevant when the degree distribution of the network, P(k), has long tails. When selecting the two interacting agents, the first node is chosen with probability p_k ≡ P(k), while the hearer is chosen with probability q_k = k p_k/⟨k⟩. Then, since we study, as usual, the direct Naming Game, the hubs are more active, being preferentially chosen as hearers, and they may reach larger inventory sizes. In homogeneous networks (bottom-left panel) all agents display approximately the same level of activity. In this case we report an ER random graph with rather large average degree, so that the inventory may reach moderately large sizes. A magnification of the scales shows that the structure of the peaks is the same for all networks. The only


Figure 4.1: Temporal series of the inventory size of a single agent in different topologies. Top: series from a Barabási–Albert (BA) network with N = 10^4 nodes and average degree ⟨k⟩ = 10, for nodes of high degree (e.g. k = 414) and low degree (e.g. k = 10). Bottom: series for nodes in an Erdős–Rényi random graph (N = 10^4, ⟨k⟩ = 50) and in a one-dimensional ring (k = 2).

topology displaying clearly different results is the regular one-dimensional lattice (bottom-right panel), in which the inventory size does not exceed 2 because of the coarsening process (Sec. 3.2). A quantity that clearly points out the statistical differences in the activity of the nodes, depending both on their degree and on the topological structure of the network, is the probability distribution P_n(k|t) that a node of degree k has n words in its inventory at time t. The distribution is computed by averaging over the class of nodes of given degree k at a fixed time t. Fig. 4.2 displays typical inventory size distributions for the Naming Game on complex networks, computed in the reorganization region that precedes the convergence. The top panel of Fig. 4.2 reports P_n(k|t) for highly connected nodes in a heterogeneous network (the Barabási–Albert network), whereas the bottom panel shows the same data for nodes of typical degree in a homogeneous network (the Erdős–Rényi random graph). From the comparison of the curves at different times (in the reorganization region), it turns out that in both cases the functional form of the distribution does not change considerably in time; the time t enters the distributions as a simple parameter governing their amplitude and the position of the cut-off. Moreover, in homogeneous networks the shape of the distribution does


Figure 4.2: Parametric dependence on time of the distribution of the number of words P_n(k|t) in the reorganization region. Time has the effect of deforming the shape of the distributions, but does not change their functional form. Top: BA graph of N = 10^4 nodes with ⟨k⟩ = 10; only the set of nodes with k > 150 (hubs) is monitored; histograms come from measurements at different times t_1 and t_2 with t_2 − t_1 = 5×10^5 time-steps. Bottom: ER graph of N = 10^4 nodes and ⟨k⟩ = 10; measurements refer to the set of nodes with k > 70; t_2 − t_1 = 4×10^5 time-steps.

not actually depend on the degree of the node, since all nodes have degree approximately equal to the average degree ⟨k⟩. In heterogeneous networks, instead, a deep difference exists between the behavior of low and high degree nodes. Low degree nodes have no room to reach high values of n, so their distribution decays very rapidly (data not shown); for high degree nodes, on the contrary, the distribution extends over more than one decade and its form is much clearer. Apart from the behavior of low degree nodes, it is clear that the functional form of the distribution P_n(k|t) is different in homogeneous and heterogeneous networks. Homogeneous networks are characterized by exponential distributions, while high degree nodes in heterogeneous networks present faster decaying distributions, well approximated by half-normal distributions (i.e. with a Gauss-like shape). Both the homogeneous and the heterogeneous case differ from the mean-field case, in which the agents are placed on the


Figure 4.3: Inventory size distribution for the Naming Game on the complete graph during the convergence process. At the beginning, the peak at n ∼ √N gives way to a power law, with exponent approximately −1, that rapidly becomes steeper and steeper at low values of n. The numerical data are obtained from a single run of the Naming Game on a complete graph of N = 10^4 nodes, monitoring the whole temporal region of convergence. Note that we report single-run experiments since the temporal fluctuations of the convergence process are rather large (Sec. 2.5), so that averaging over many runs may alter the real value of the power-law exponent.

vertices of a complete graph. In that case, indeed, we have seen in Sec. 2.4.1 that, during the reorganization, the inventory size distribution is given by the superposition of an exponential and a delta function peaked around n ∼ √N. The reason for these differences will be elucidated in the next sections by means of an analytical approach to the problem. In contrast with the previous reorganization region, at convergence the main global quantities describing the dynamics accelerate: N_w(t) converges to N, while N_d(t) and S(t) go to 1, all through a super-exponentially fast process. Nevertheless, even in this region, the temporal scale of the global dynamics is much slower than that of the agents' activity, so that the fixed-time inventory size distribution P_n(k|t) is still a meaningful measure of the local activity. Here the mean-field case presents a more interesting phenomenology than sparse complex networks. We have seen in Figure 2.9 that, near the convergence, the complete graph develops a power-law inventory size distribution, with an exponential cut-off at n ≈ √N. Approaching the final consensus state, the slope of the power law becomes steeper and the

cut-off moves back towards 1, as shown in more detail in Figure 4.3. Similar power-law behaviors are not observed in any other topology, even though they might be expected on homogeneous random graphs, which, in the limit of large average connectivity, tend to the complete graph. Numerical simulations show instead that, in the region of convergence, both homogeneous and heterogeneous complex networks (such as the ER and BA models) present an exponential distribution of the inventory size (data not shown). The numerical results reported in this section point out that the microscopic agent activity is closely related to the global dynamics and to the topological properties of the system. In the next section, we will show that, even if the dynamics of the number of words held by a node is very complicated, mapping it onto a jump process allows for some more rigorous results that account for the behaviors found in the numerical simulations.
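In practice, P_n(k|t) is estimated from simulation snapshots by histogramming the inventory sizes within a degree class at a fixed time. A minimal measurement sketch (a hypothetical helper reusing the graph and inventory representation of the earlier simulation sketch; in practice the histogram is also averaged over independent runs):

```python
from collections import Counter

def inventory_size_distribution(G, inventories, k_min, k_max=None):
    """Empirical P_n(k|t) for the degree class k_min <= k (< k_max), computed from a
    single snapshot of the inventories."""
    counts, n_nodes = Counter(), 0
    for i in G:
        k = G.degree(i)
        if k >= k_min and (k_max is None or k < k_max):
            counts[len(inventories[i])] += 1
            n_nodes += 1
    if n_nodes == 0:
        return {}
    return {n: c / n_nodes for n, c in sorted(counts.items())}
```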

4.3 Master equation approach to agents internal dynamics

The jump process observed in the previous section and its statistics can be described using a master equation for the probability Pn (k|t) that an agent of degree k has inventory size n at time t. Formally, it reads

\[
\begin{aligned}
P_n(k|t+1) - P_n(k|t) ={}& W_k(n-1 \to n|t)\,P_{n-1}(k|t) - W_k(n \to n+1|t)\,P_n(k|t) \\
&- W_k(n \to 1|t)\,P_n(k|t), \qquad N_d(t) \ge n > 1, \\
P_1(k|t+1) - P_1(k|t) ={}& \sum_{j=2}^{N_d(t)} W_k(j \to 1|t)\,P_j(k|t) - W_k(1 \to 2|t)\,P_1(k|t),
\end{aligned}
\tag{4.1}
\]

where N_d(t) is the number of different words present in the system at time t, and P_n(k|t) depends a priori explicitly on time. Note that this equation describes the average temporal behavior of a class of agents with the same degree k. In order to get an expression for the transition rates, we call C_k(t) the number of different words that are accessible to a node of degree k at time t, i.e. that are present in its neighborhood. In the case of the complete graph, C_k(t) = C(t) = N_d(t). The small-world property characterizing many complex networks ensures that the quantity C_k(t) does not actually depend on k, since nodes with very different degree have access to the same repertoire of different words. Furthermore, most of the words present in the system are accessible to all nodes. In small-world topologies, indeed, there is an initial spreading of words throughout


the network that destroys local correlations. Consequently, we can safely approximate C_k(t) with C(t), and we expect C(t) ≤ N_d(t) and proportional to it. The case of low-dimensional lattices is different, since words can spread only locally, causing strong correlations between the inventories (Sec. 3.2). According to the numerical results exposed in Section 4.2, the behavior of P_n(k|t) allows us to separate the evolution of the system into two regimes: a reorganization region extending from the maximum of N_w(t) to the beginning of the convergence process, and a convergence region, involving the cascade process that leads the system to the final consensus state. In addition, P_n(k|t) assumes different shapes for different topologies. Interestingly, in both regions the temporal dependence of the distribution turns out to be only parametric, i.e. it has the only effect of deforming the shape during the evolution. In other words, the actual distribution should be well approximated by a quasi-stationary solution P_n(k|t) of the master equation, depending on time only parametrically. This means that the master equation can be solved by means of an adiabatic approximation, a method commonly used in the study of out-of-equilibrium systems with different time scales in the dynamics [91, 168]. For a dynamical system characterized by slow and fast modes, the adiabatic approximation consists in assuming that the fast modes relax immediately to their equilibrium conditions according to the instantaneous values of the slow modes, without perceiving their variation in time. In the present case, the microscopic agent activity is a fast mode, while the global quantities that enter the transition rates are slower modes. In order to prove the validity of the adiabatic approximation, we need the expressions of the transition rates W_k(a → b|t) from inventory size a to b at time t, in both dynamical regimes and for different topologies.

4.3.1 Transition rates in the reorganization region

In a general context, the expressions of the transition rates can be derived from the probability of a successful interaction, given by

\[
\mathrm{Prob}\{\mathrm{success}\} = \frac{|S \cap H|}{n_S}, \tag{4.2}
\]

where |S ∩ H| is the size of the intersection between the inventories of the speaker and the hearer, and n_S is the inventory size of the speaker. Note that the expression in Eq. (4.2) holds for every choice of the speaker-hearer pair, and its average over the population corresponds to the success rate S(t). As discussed in Chapter 2 (Sec. 2.4.4), in the reorganization region the intersection |S ∩ H| is on average close to zero and all words have approximately the same probability of appearing in the inventory of the speaker, justifying the assumption that the inventories are uncorrelated


Figure 4.4: Probability of winning and losing (only the term causing an increase of the number of words) for the BA and ER models, both with N = 5000 nodes and k ≈ 200 (for a BA network with ⟨k⟩ = 10) and k ≈ 70 (for an ER network with ⟨k⟩ = 50). Data were obtained by averaging over several runs (3×10^4) the probability of successful or unsuccessful interactions after t = 5×10^5 time-steps from the beginning of the process. Also in this case, time has only a parametric influence on the observed curves. Remarkably, not only the qualitative behaviors but also the measured values of the W's are in excellent agreement with the theoretical predictions of Sec. 4.4, once the right parameters are inserted. In the upper panel, lines represent the theoretical predictions of eqs. 4.24.

in all topologies with the small-world property. From this assumption it turns out that the intersection is well approximated by |S ∩ H| ≃ n_S n_H/N_d(t) (where n_S and n_H are the inventory sizes of the speaker and the hearer). Indeed, the fraction of all accessible words that are present in the inventory of the speaker is n_S/N_d(t); i.e. in each slot of the hearer's inventory there is a probability n_S/N_d(t) of finding a given word. Since the average number of common words is given by the product of this probability and the hearer's inventory size n_H, the result for |S ∩ H| follows. The expressions of the transition rates follow straightforwardly from the probability of a successful negotiation, |S ∩ H|/n_S ≃ n_H/N_d. Considering both the

4.3 Master equation approach to agents internal dynamics

103

probabilities for the agent playing as hearer and speaker, the transition rate Wrk (n → 1|t) reads Wrk (n → 1|t) ' pk

hnit n + qk , C(t) C(t)

(4.3)

where the average inventory size hnit comes from the mean-field hypothesis for the neighboring sites of a node playing as speaker, that is actually correct in all small-world topologies, and pk and qk are the probabilities of playing as speaker and as hearer respectively. The index r in Wkr is used to indicate that these transition rate are correct in the reorganization region. The inventory size may increase only when the agent plays as hearer, i.e. µ

Wrk (n

→ n + 1|t) ' qk

n 1− C(t)



.

(4.4)

In order to verify the above expressions for some specific cases, we have computed numerically the quantities Wrk (n → n + 1|t) and Wrk (n → 1|t), in the case of a BA network of N = 5 · 103 nodes and hki = 10 (top panel in Fig. 4.4) and for an ER model with N = 5 · 103 nodes and hki = 50 (bottom panel in Fig. 4.4). For heterogeneous networks, the numerical Wrk (n → 1|t) clearly show a linear growth of the quantity with n, in agreement with eq. (4.3), while the approximately constant behavior of Wrk (n → n + 1|t) with n can be fitted with an expression of the form eq. (4.4) only for very small values of n/C(t). On the other hand, Fig. 4.4 (Bottom: points out that in the case of homogeneous networks, in which all nodes have approximately the same behavior, both quantities are almost independent of n. The different behaviors of the transition rates are responsible of the different shape of the probability distribution Pn (k|t).

4.3.2

Transition rates during the convergence process

When the convergence process begins, the temporal behavior of all global quantities accelerates, and the expression of the success probability changes considerably. In chapters 2 and 3, we have seen that in all small-world topologies, the convergence is reached by means of a sort of cascade process, triggered by a symmetry breaking event in the space of the words. The word involved in the symmetry breaking starts to win, becoming more and more popular among the inventories. At the end of the process, when the global consensus is reached, this is the only surviving word. Therefore, as the system is close to the convergence, most of the successful interactions involves the most popular word, while positive negotiations involving different words rapidly disappear. The statistical behavior of the quantity |S∩H| depends now only on the properties of the most popular nS word. The average size of the intersection set |S ∩ H| is well expressed by

104

Microscopic activity patterns the probability αk (t) of finding the most popular word (or word) in both the inventories. During the convergence process, αk (t) is close to one. With this approximation we are neglecting the successful interactions due to less popular words, that we will show to have an effect for the dynamics on the complete graph (see section 4.4.3). According to this argument, the transition rates assume the following form, αk (t) αk (t) + qk , n hnit µ ¶ αk (t) Wck (n → n + 1|t) ' qk 1 − , hnit Wck (n → 1|t) ' pk

(4.5) (4.6)

where the index c is used to distinguish the expression of the transition rates during the convergence region from that of the reorganization regime.

4.3.3

Validation of the adiabatic approximation

In both the reorganization and the convergence regions, the validity of the adiabatic approximation can be proved computing the characteristic relaxation time of the non-equilibrium process described by the master equation in eq. (4.1) with transition rates of eqs. 4.3-4.4 or eq. (4.5). Given the (continuous, for simplicity) master equation ∂t P(t) = −WP(t),

(4.7)

the relaxation time τ is defined as the inverse of the real part of the smallest non-zero eigenvalue λ1 of the transition matrix W. The explicit diagonalization of the Markov transition matrix for a finite system may be demanding, but the order of magnitude of τ is easy to compute. We first note that ¯ in both cases. The size of W ¯ depends on the total number of W = pk W words Nw (t), and hence on the population size N . Its structure can however be found out simply, even if it is a laborious task. For instance, for highly connected nodes in a heterogeneous network it would read: 

k −

  k  C(t)  ¯ = W k  2  C(t)  k 3 C(t)

k k − C(t) −

0 0

k

0

0

k

0

k −2 C(t) −

0

k

k k −3 C(t)

        

(4.8) ¯ can be written where we have put Nw (t) = 4. In this case, the matrix W ¯ and it is easy to prove that, in the reorganization region ¯ = k W, as W

4.3 Master equation approach to agents internal dynamics

105

(when C(t) À 1) and in the limit of large systems, the eigenvalues of the ¯ are real and equal to −1. Thus, the eigenvalues of W ¯ are O(k/hki), W and λ1 ∝ qk , and the time necessary to reach the stationary state is τ ∼ O(1/qk ). The argument holds even close to the consensus state, where C(t), hnit , and αk (t) are of order 1, since the smallest non-zero eigenvalue is still proportional to qk . Note that, in all complex networks qk > 1/N , thus τ < N . The time-dependent quantities involved in the expressions of the transition rates, such as hnit and C(t) and αk (t), vary on a slower timescale (the characteristic timescale of the global system is t/N ), justifying the adiabatic approximation.

4.3.4

General expression of the adiabatic solutions

Mathematically, the adiabatic approximation consists in setting to zero the temporal derivative of the inventory size distribution, and looking at the stationary solution Pn (k|t), with parametric dependence on the time, that we call adiabatic solution. We compute the general adiabatic solution of the master equation in the two regions, while the most interesting cases are reported separately in the next section. Let us first consider a general complex network in the reorganization region. Plugging the expressions of the transition rates Wrk (n → n + 1|t) and Wrk (n → 1|t) into the stationary form of the master equation (eq. (4.1)), we get the following recursion relation, h

Pn (k|t) =

h

qk 1 −

n−1 C(t)

qk 1 − n C(t)

i

i

hnit n + pk C(t) + qk C(t)

Pn−1 (k|t) .

(4.9)

Then, introducing qk = kpk /hki = b(k)pk and eq. (4.9) can be rewritten as Pn (k|t) =

b(k)[1 − b(k) +

n−1 Since C(t) ¿ 1, we can write 1 − relation,

n−1 C(t)

n−1 C(t) ] Pn−1 (k|t) hnit C(t)

Pn (k|t) ' s(k, t)n−1 e ³

n−1 − C(t)

'e



(4.10)

, thus solving the recurrence

n(n−1) 2C(t)

´

.

P1 (k|t) ,

(4.11)

hnit . The normalization relation gives P1 (k|t). with s(k, t) = b(k)/ b(k) + C(t) The controlling parameter of the curve is s(k, t), that allows to tune the decay of the distribution between an exponential and a Gaussian-like tail. hnit A change of variable s(k, t) = 1 − ²(k, t) (with ²(k, t) = b(k)C(t) ) makes

evident that s(k, t)n ≈ e−²(k,t)n , therefore the curve has the behavior Pn (k|t) ∝ e

−²(k,t)n−

n(n−1) 2C(t)

.

(4.12)

106

Microscopic activity patterns The linear term dominates when hnit À b(k), i.e. in homogeneous topologies, while the quadratic term governs the shape of the distribution for the high-degree nodes in heterogeneous networks (hnit ¿ b(k)). This result is very interesting since it shows that heterogeneity is a necessary condition for agents to show a super-exponential decay in the inventory size distribution. When we are in the convergence region, on the other hand, to get the form of the memory size distribution we must insert the eqs. 4.5 into the stationary version of eq. (4.1), ¶

µ

µ



αk (t) ∂Pn (k|t) αk (t) Pn−1 (k|t) − qk 1 − Pn (k|t) = 0 = qk 1 − ∂t hnit hnit · ¸ αk (t) αk (t) − pk Pn (k|t) . + qk (4.13) n hnit We get the following recursive relation, 

Pn (k|t) = 

αk (t) hnit αk (t) 1 b(k) n

1− 1+

  Pn−1 (k|t) ,

(4.14)

in which b(k) = k/hki. The general solution is of the form Pn (k|t) ∝ n



αk (t) b(k)



e

αk (t) n hnit

,

(4.15)

showing that near the convergence, the inventory size distribution may develop a power-law structure. Nevertheless, in the section 4.2, we stated that from numerical data there is no evidence of power-law behaviors on complex networks. This can be explained looking at the terms of eq. (4.15). In homogeneous networks, the power-law has exponent close to 1 (since both αk (t) and b(k) are of order 1), but the cut-off imposed by the exponential distribution sets in at very low n, preventing the underlying power-law to be observed. The same argument holds for low-degree nodes in heterogeneous networks, but high-degree nodes should present sufficiently large inventories to see the power-law. However, in this case b(k) À 1, thus the exponent of the power-law is too small to be observed. The only case in which we are able to observe a power-law inventory size distribution is that of the complete graph, that presents some peculiarities and will be discussed separately in the next section.

4.4

Explicit solution for some interesting cases

In this section, we study more in detail the effects of the topology on the adiabatic solution of the master equation making explicit calculations in three interesting cases: in the reorganization region, we consider the activity

4.4 Explicit solution for some interesting cases

107

statistics of generic nodes in homogeneous random graphs and of hubs in heterogeneous scale-free networks; in the convergence region, we focus on the purely mean-field behavior of agents placed on a complete graph.

4.4.1

The case of homogeneous networks

As revealed by simulations reported in Fig. 4.4 (Bottom: the transition rates for homogeneous networks in the reorganization region are almost independent of the number of words in the inventory. In homogeneous networks qk ' b(k)pk , with b(k) ' O(1), and the nodes are in general equivalent, thus the number of words is approximately the same for every node, i.e. n ' hnit . The approximated expressions of the transition rates for a node of typical degree hki are Wrk (n → 1|t) ≈ pk hnit (1 + b(k))/C(t) ≈ 2pk hnit /C(t) µ ¶ hnit r Wk (n → n + 1|t) ≈ b(k)pk 1 − ≈ pk . C(t)

(4.16) (4.17)

Such approximations are in agreement with the data reported in Fig. 4.4bottom. Indeed, from simulations we obtain the value pk (k = 200) ' 2.4 × 10−4 that is in good agreement with the measured Wk (n → n + 1|t) ' 2.3 × 10−4 . The adiabatic condition for the master equation becomes 0= 0=

2 Pn−1 (hki|t) − Pn (hki|t) − hnit C(t) Pn (hki|t)

P∞

2 j=2 C(t) hnit Pj (hki|t)

n>1

− P1 (hki|t) .

(4.18)

The solution by recursion is very simple, Pn (hki|t) ≈ (1 − θ)θn−1 ,

θ=

1 1+

2hnit C(t)

.

(4.19)

Using the expansion of logarithm log(1 − ²) ' −², with ² = 1 − θ ' 2hnit /C(t), the previous formula gives the following exponential decay for the distribution of the number of words, tn 2hnit − 2hni e C(t) . (4.20) C(t) The exponential decay is in agreement with the numerical data. Knowing the complete form of the distribution (i.e. with the correct normalization prefactor), we can also roughly estimate hnit and C(t), at fixed time t, from a self-consistent relation for hnit , From eq. (4.20), we compute the approximate average value of hnit , i.e.

Pn (hki|t) '

hnit ≈

Z ∞ 1

nPn (hki|t)dn ,

(4.21)

108

Microscopic activity patterns and we get the self-consistent expression µ

hnit '

C(t) 2hnit

¶µ



1+

t 2hnit − 2hni e C(t) . C(t)

(4.22)

Now, introducing in eq. (4.22) the numerical value of hnit /C(t), it is possible to verify that the orders of magnitude of both hnit ∼ O(10) and C(t) ∼ O(102 ) are in agreement with their numerical estimates.

4.4.2

High-degree nodes in heterogeneous networks

Now we pass to describe the dynamics of the hubs in heterogeneous networks in the reorganization region of the system. In a direct Naming Game, a hub is preferentially chosen as hearer, by a factor b(k) = k/hki À 1, then in the transition rates we can neglect the terms associated with the speaker. We consider the following approximated expressions Wrk (n → 1|t) ' qk µ

Wrk (n → n + 1|t) ' qk 1 −

n C(t)



n , C(t)

(4.23)

' qk ,

(4.24)

in which the last approximation is justified by the fact that, in general, n/C(t) ¿ 1. Inserting real values of qk and C(t), the eqs. 4.24 are in agreement with the behaviors coming from the fit of the corresponding curve in Fig. 4.4 (Top). We can easily compute the adiabatic solution Pn (k|t) from eq. (4.1) µ



n 0 = qk Pn−1 (k|t) − qk + qk Pn (k|t) , C(t)

(4.25)

and we find recursively Pn (k|t) =

C(t) C(t)+n Pn−1|t (k)



=

C(t)2 (C(t)+n)(C(t)+n−1) Pn−2 (k|t)

n−1

C(t) Γ(C(t)+2) Γ(C(t)+n+1) P1 (k|t)

.

(4.26) (4.27)

P

Now, from the closure relation ∞ n=1 Pn (k|t) = 1 we get the expression of P1 (k|t), and the final form for Pn (k|t) becomes ·

¸

Γ(C(t) + 1) C(t)n−1 C(t)C(t)+1 e−C(t) , Pn (k|t) = Γ(C(t) + n + 1) γ(C(t) + 1, C(t))

(4.28)

where γ(a, x) is the lower incomplete Gamma function. The functional form of the stationary distribution is complicated, but exploiting Stirling approximations for Gamma functions we can easily write it into a much simpler

4.4 Explicit solution for some interesting cases

109

√ form. Indeed, using the expression Γ(x) ≈ 2πe−x xx−1/2 and the representation via Kummer hypergeometric functions for the incomplete Gamma function γ(a, x), we find that lim

x→+∞

Γ(x + 1) = const ' 2 , γ(x + 1, x)

(4.29)

and this value is correct in the range of x = C(t) À 1. Finally, using the asymptotic series expansion of Γ(x + n + 1) for large x, we get an expression that can be formally written as

Γ(x + n + 1) ≈

n

√ 2πe−x xx+n+1/2

× O(1) + Q[O((n + 1)2 )]x−1 + Q[O((n + 1)4 )]x−2 + . . .

o

,

(4.30)

in which Q[O((n + 1)l )] is a polynomial in (n + 1) of maximum degree l. Now, we can do the resummation of the series keeping at each order k in x only the highest term in the polynomial in (n + 1), whose coefficient is 2−k /k!, ∞ X √ √ 2 x−k (n + 1)2k 2πe−x xx+n+1/2 2πe−x+(n+1) /2x xx+n+1/2 . = k k!2 k=0 (4.31) Putting together all the ingredients, we find that a good approximation of the distribution of the number of words is given by (the half-Normal distribution)

Γ(x+n+1) ≈

s

Pn (k|t) '

−(n+1)2 2 e 2C(t) . πC(t)

(4.32)

Fitting numerical results in Fig. 4.4 (Top: with this expression provides values for C(t) ∼ O(102 ), showing that, as expected, on the BA model C(t) < Nd (t) ∼ O(102 ÷ 103 ).

4.4.3

Power-laws on the complete graph

The last interesting case consists in studying the inventory size distribution for agents on the complete graph, whose phenomenology has been investigated in Chapter 2. In the reorganization region, the √ mean-field dynamics is characterized by a large fraction of agents with O( N ) words in their inventories and another smaller fraction with √ exponentially distributed inventory sizes. The existence of a peak at O( N ) comes from the initial accumulation process, while the exponential part of the distribution is produced during the following reorganization regime. Since the most of the agents

110

Microscopic activity patterns √ have O( N ) words and the intersection between inventories is close to zero, we can write the following transition rates 2 1 √ N N µ ¶ 1 1 Wrk (n → n + 1|t) ≈ 1− √ N N Wrk (n → 1|t) ≈

(4.33) (4.34)

With the usual recurrence relation we compute the following adiabatic solution, √ − √2 n N ) + (1 − f (t)) e N , (4.35) √ with f (t) is the fraction of agents around N that tends to zero the convergence The interesting region is however the last one, during the convergence process, in which the inventory size distribution of the mean-field system develops a power-law structure. In eq. (4.15), we have shown that the expected distribution in the convergence region presents a power-law, that in the particular case of the complete graph should have an exponent close to 1 (since αk (t) ' 1). Nonetheless, Fig. 4.3 reveals that the slope −1 is correct only at the beginning of the convergence process, while later the slope increases, developing a bump in the range of small inventory sizes. Starting from the previous remark on the mixed distribution emerging in the reorganization region, we explain how the alteration of the power-law is due to the superposition of an exponential distribution. During the convergence process, the agents having access to the most popular word behaves following the transition rates in eq. (4.5) and their activity is at the origin of the power-law in Pn (k|t). The other agents, that have no √ access to the most popular word maintain an inventory of size about N and fall to 1 if the get a successful interaction. In other words, they keep on playing as in the reorganization region, generating an exponential distribution of the inventory sizes. Even if the fraction of these agents decreases in time, the superposition of the exponential on the power-law has the immediate effect of increasing the slope of the power-law at low n. In summary, we have provided an explanation of the behavior of the activity patterns of the Naming Game on the complete graph, pointing out some fundamental differences with respect to generic complex networks. Pn (k|t) ∝ f (t)δ(n −

4.5

Conclusions and discussion

We have studied the microscopic activity patterns in a population of agents playing the Naming Game. In previous chapters we have seen that the non-equilibrium dynamical behavior of the model presents very different

4.5 Conclusions and discussion

Figure 4.5: Summary of the microscopic activity: role of topology and temporal region. The left figure displays the situation in the reorganization region, in which the major effect is due to the increase of the degree fluctuations (the memory size distribution passes from an exponential to a half-normal distribution. In the right panel, we show the same picture for the convergence region, in which the final cascade process of convergence produces a power-law like memory size distribution. Such a distribution is however visible only in the purely mean-field case, while on generic complex networks it is covered up by exponential terms. The region at both large average degree and fluctuations is difficult to be explored, but should correspond to mixed distributions in which all previously classified behaviors may be observed.

features depending on the underlying topological properties of the system. The analysis, however, was mainly focused on the behavior of global quantities, while in this chapter we have investigated the microscopic activity patterns of single agents. Indeed, by means of numerical simulations and analytical approaches, we have shown that the negotiation process between agents is at the origin of a very rich internal activity in terms of variations of the inventory size. More precisely, our analysis has focused on the instantaneous activity statistics described by the distribution Pn (k|t) that an agent of degree k has an inventory of size n at time t. We have been able to explain its behavior in function of both the global temporal evolution and the underlying topology of the system. Apart from an initial transient, the dynamics of the Naming Game can be split in two temporal regions, namely the reorganization part and the convergence part. Fig. 4.5 summarizes our findings, showing the microscopic activity statistics in function of the first two moments of the degree distribution P (k), which turn out to be essential features of complex networks affecting the dynamics of the Pn (k|t). In the left panel of Fig. 4.5 we sketch the relation between topology and single agent activity in the reorganization region. Increasing the heterogeneity of the nodes the Pn (k|t)

111

112

Microscopic activity patterns shifts from an exponential to a super-exponential (half normal) regime. Increasing hki while preserving the homogeneity of the nodes, on √ the other hand, leads to a superposition of an exponential and a delta at N . A class of distributions mixing up all these features is observed for networks with diverging average degree and fluctuations (top-right corner of the plane). A similar summary can describe the effect of the topology also in the convergence region (Fig. 4.5, right panel): increasing the average degree the distribution moves from exponential to a superposition of an exponential and a power-law, while larger fluctuations destroy the power-law leaving only an exponential distribution. In general, the influence of topological properties of complex networks on the dynamical properties of processes taking place on them is the object of a vast interest in statistical physics community (see Sec. 1.6). However, only global properties are usually considered. Here, on the other hand, we have focused only on the internal dynamics of single agents, and we have found results providing further insights in the strong converging property of the corresponding global dynamics. Indeed, as we know, one of the most interesting aspects of the Naming Game is exactly that the number of words an agent can store is not fixed a priori, but, on the contrary, diverges in the thermodynamic limit in the mean-field case. This is a relevant difference with most of the well known models in various fields of statistical mechanics or opinion dynamics, such as the voter [130, 124] or the Axelrod [15] models, and we have investigated the deep consequences it has on the global behavior of the system. We conclude with a final remark. In order to compare the Naming Game with usual statistical mechanics models, it would be useful to shift our perspective and look at the waiting time between successive decision events. In the present case, a decision event corresponds to a successful interaction, so that the waiting time is directly proportional to the inventory size. In the non-equilibrium Glauber dynamics, for instance, a decision event is commonly associated to a spin flip. The corresponding waiting time is exponentially distributed during the dynamics, but at the convergence (to the ferromagnetic state) the waiting time between flips may diverge, and its distribution assumes a power-law shape. As we have shown, a similar behavior is observed and proved for the inventory size distribution in the mean-field Naming Game. Thus, on the basis of our results, the Naming Game inventory statistics can be compared to the waiting time statistics in other models. According to this analysis it should be interesting to further investigate the relation between global collective behavior and local dynamics of agents in other models with a similar non-poissonian individual dynamics. This is indeed a possible direction for future work.

Chapter 5

Beyond the Naming Game 5.1

Introduction

The Naming Game rules we have introduced (Sec 2.2) represent a remarkable simplification of the previous interaction schemes developed in the context of artificial intelligence and semiotic dynamics (Sec. 1.3), while, as we have seen, they yield the same results (e.g. the emergence of a final consensus). Moreover, they preserve the conceptual content, and hence the relevance, of their inspiring models, while the increased transparency allows for a much more detailed analysis. For this reason our approach is interesting also from the broader point of view of opinion-dynamics, whose models often lack clear microscopic connections with the systems they aim to describe (Sec. 1.5). In this chapter, we go back to the definition of our model in order to investigate the role of the different ingredients, and, at the same time, we explore the possibility of enriching it. We proceed by modifying in turn single rules, either simplifying them further or adding some new parameters, and we discuss the effects on the dynamics of the model. Of course, for each of the modified versions of the Naming Game we could perform the same analysis presented in the previous chapters, but this is not our purpose, nor would it be possible in this context. Instead, our aim is to understand better the different aspects of the model we are now familiar with, and, at the same time, to seek for helpful cues for future investigations. In Sec. 5.2 we generalize the Naming Game, introducing a parameter β which determines the probability for the agents involved in a successful interaction to update their inventories (thus, when β = 1 the usual Naming Game is recovered). An abrupt transition is observed at β = βc = 1/3. For β > βc , at convergence the population has reached consensus, while, for β < βc , several conventions coexist in the asymptotic state. We point out that this result is interesting in particular from the point of view of opiniondynamics models. In Sec. 5.3, on the other hand, we remove the symmetric update of the agents after a successful interaction. Interestingly, when only

114

Beyond the Naming Game the hearer performs the update, the overall dynamics is very similar to the usual one. We discuss how this result may have important consequences when the population is embedded in non mean-field topologies, where a new broadcasting scheme can replace pairwise interactions. Sec. 5.4 is devoted to the investigation of the role of the word-selection rule adopted by the speaker. It turns out that a simple deterministic procedure allows for a strong speeding-up of the convergence process. Sec. 5.5 deals with an ultrasimplified model, in which no update occurs after successes. It represent the limit of β = 0 of the generalized Naming Game described in Sec. 5.2, and can be solved analytically. Finally, in Sec. 5.6 we show briefly that the presence of weights in our model does not alter the dynamics in a significant way, while it makes the model considerably more complicated.

5.2 Stochastic update: a generalized Naming Game In our Naming Game, agents update their inventories after a successful interaction, deleting all the words but the one they have just agreed upon (Sec.2.2). Now, we introduce a parameter, β, stating which is the probability of this update. All the remaining rules remain unchanged, and the usual model is recovered for β = 1. For convenience, in Figure 5.1, we summarize the interaction rules followed, at any time t, by the two randomly extracted agents. Figure 5.2 shows the behavior of the convergence time tconv for various values of the post-success updating probability β. The final state is reached also for β < 1, even though the convergence time increases as beta decreases. However, most interestingly, there is a fast speeding up for β ' 1/3. More precisely, at β = 1/3 a transition occurs in the space of the final states reached by the system. For β > 1/3 the population is able to reach a consensus, while for β < 1/3 two or more conventions survive in the asymptotic state. The presence of a transition is very interesting and can be understood analytically with the simple exact argument that we propose below. The convergence time is defined as the time in which the system reaches the one-word adsorbing state. And such state is attained with a progressive elimination of competing words. Consequently, except for exponentially rare configurations, just before convergence there are only two words competing to survive, one of which is usually far more popular than the other (see Fig. 2.8). Thus, the seemingly artificial two-words scenario we have discussed in Section 2.7 (in which new words cannot be invented), turns out to be an helpful framework to understand the transition. The exact equations (2.14) can be easily generalized to the case in which the update is stochastic. They read:

5.2 Stochastic update

Figure 5.1: Stochastic update. In case of success, inventories are updated with probability β. In case of failure, on the other hand, the hearer adds the new word to its inventory as in the usual scheme. The usual rules are recovered for β = 1, so that the stochastic update constitutes a generalization of the Naming Game introduced in Chapter 2.

3 1 1 − nA nB + βn2AB + ( β − )nA nAB , (5.1) 2 4 4 3 1 1 n˙ B = − nA nB + βn2AB + ( β − )nB nAB , 2 4 4 3 1 n˙ AB = +nA nB − 2βn2AB − ( β − )(nA + nB )nAB , 4 4 where nA (nB ) is the fraction of agents with word A (B), and nAB is the fraction of agents with two words. Considering that nAB = 1 − nA − nB , the steady states of eq.(5.1) are given by n˙ A = n˙ B = 0. Subtracting the two equations we obtain that only two possibilities are given by nA = nB , nAB = 1 − nA − nB or nAB = 0, i.e. n˙ A

=

ˆ nA = 1, nB = 0, nAB = 0,

115

116

Beyond the Naming Game 10

6

N=500 N=1000 N=5000 4

10

10

tconv

tconv

10

2

10

6

4

2

10

0

100.3

0.32

0.34

0.38

0.36

β

0

10 0.2

0.3

0.4

0.5

0.6

β

0.7

0.8

0.9

1

Figure 5.2: Stochastic update. Convergence time in function of the post-success update probability β. Time is rescaled so as that tconv (β = 0) = 1 for all system sizes. At β = 1/3 there is an abrupt increase in convergence times. The inset shows that the transition becomes steeper and steeper as the population grows.

ˆ nA = 0, nB = 1, nAB = 0, ˆ nA = nB = b(β), nAB = 1 − 2b(β),

with −b(β)2 /2 + β(1 − 2b(β))2 + (3β − 1)b(β)(1 − 2b(β))/4 = 0 i.e. 10βb2 − (1 + 13β)b + 4β = 0, so that b(β) =

1 + 13β +

p

1 + 26β + 9β 2 . 20β

(5.2)

To study the stability of these states we can subtract the equations relative to n˙ A and n˙ B , obtaining: d(nA − nB ) 3β − 1 = (nA − nB )nAB . dt 4

(5.3)

Thus, if β > 1/3, d(nA − nB )/dt and nA − nB are of the same sign. The solution nA = nB is then unstable and the stable solutions are either nA = 1, nB = 0 or nA = 0, nB = 1. If β < 1/3, on the contrary, the solution nA = nB is stable and nA = 1 and nB = 1 are unstable. This means that the system can converge if β > 1/3 but not if β < 1/3, and the origin of the transition is revealed.

5.2 Stochastic update 10

tconv

10

10

117

8

N=500 N=1000 N=5000 -0.42 tconv ~ (β-1/3)

7

6

5

10

4

10 -4 10

10

-3

-2

10

β - 1/3

10

-1

0

10

Figure 5.3: Convergence time before transition. Convergence times are plotted in function of the rescaled parameter β − 1/3. Curves are well fitted by a power 1 law of the form tconv ∼ (β−1/3) χ with χ ≈ 0.42.

Before the transition, on the other hand, the effect of β is a slowing down in the time of convergence which is well described by the relation 1 tconv ∼ (β−1/3) χ with χ ≈ 0.42 (Fig. 5.3). It is more difficult to capture this behavior than the transition itself, and we leave the investigation of this phenomenon, which we report here for the sake of completeness, for future work. Recalling also the results of Section 2.7 , the situation, in function of β, can be summarized as follows: 1. For β > 1/3, convergence is always reached, since different equilibria are unstable. 2. For β < 1/3, there is only the absorbing state nA = nB = b(β) and nAB = 1 − 2b(β), which is reached from all initial configurations (except, of course, the trivial and quite pathological nx = 1, ny = 0, nxy = 0). b(β) is a real number function of β, with b(0) = 0 . 3. At the transition, β = 1/3, a numerical solution of the equations shows that they lead to final states that preserve the asymmetry of non symmetric initial conditions, but in which all the three species survive. If the initial conditions are symmetric, on the other hand, the final state is nA = nB ' 0.31 nAB ' 0.38.

118

Beyond the Naming Game

tx

10 10

9

7

10

5

0.1 0.4

βc(x)

t1 t2 t3 t4 t5 t6 t7 t8 t9

β

0.2

0.3

0.4

data fit: βc(x) = 1/3 - a*log(x)

0.3 0.2 0.1 1

2

3

x

4

5

6

7 8 9 10

Figure 5.4: Hierarchy of transitions for β < 1/3. Top: tx is the time at which the number of different words becomes x, i.e. in which Nd (tx ) = x for the first time (so that t1 ≡ tconv ) . The first nine transitions are plotted, and it is shown that they occur for decreasing values of β. Bottom: the values of β = βc (x) at which the different transitions occur are plotted in function of the number of different words x. Points corresponding to the first transitions are well fitted by a function of the form βc (x) = 1/3 − a × log x, where a = a(N ) is a parameter that depends on the system size (a = 0.071 in our case). The βc (x) dots plotted in figure correspond (somehow arbitrarily) to the values in which tx = 107 . All data refer to a population of N = 1000 agents.

When we consider the whole dynamics of the Naming Game with invention, the validity of eqs. 5.1 beyond the critical value 1/3 must be considered carefully. Indeed, their utility come from the fact that, when convergence is reached, the system spontaneously passes through a two words situation. But this is not granted for values of β smaller than 1/3. To see it clearly, it is sufficient to consider that, at β = 0, agents do not delete words, and the only absorbing state is reached when every agent stores all the N/2 (on average) words created during the invention phase (see Sec. 5.5). Thus, Figure 5.4 shows that, after the first transition from a two word to a single word state, there is a complete series of different transitions occurring at smaller values of β = βc (x), which concern the passage from a Nd = x to a Nd = x − 1 final state, where x is an integer value. It is worth noting that their numerical values are well fitted by the function βc (x) = 1/3 − a × log x, where a = a(N ) is a parameter that depends on the system size. At present we do not have a clear interpretation of this relation, but we believe that

5.2 Stochastic update it is a very interesting issue to be addressed in future work. On the contrary, single transition values can be derived by the corresponding x−word equations, whose structure becomes only more and more intricate for larger values of x.

5.2.1

Some remarks on the generalized Naming Game

Before concluding, it is interesting to make a general remark on the transition we have observed. We have identified a critical value of the update probability which divides in two distinct regions the space of final states of the Naming Game. For β > 1/3, the agents are able to reach a consensus on a unique convention. For β < 1/3, on the other hand, the population is made of different groups, whose number depends on the value of β, which are composed by agents storing the same words in the inventories. More precisely, we have seen that there is a hierarchical organization of transitions, from an x words final state to an x − 1 words asymptotic state, identified by values of the critical parameters β = βc (x), which are well fitted by βc (x) = 1/3 − a × log x, where a is a parameter that varies with the system size. Thus, for βc (x + 1) < β < βc (x) the system will asymptotically store x words. For instance, in the region βc (2) < β < βc (1) = 1/3, we have seen that the final state is made of three groups: GA , GB and GAB , whose members’ inventories contain the word A, B or both A and B. The relative abundance of these groups is stable in time, but there is a constant flux of agents between them (e.g. an agent in GA can pass to GB through GAB ). Since our model was originated in the context of the study of the emergence of language, we could interpret the non-homogeneous asymptotic states as describing the emergence of multi-lingual societies. However, as we have seen in Chapter 2, the Naming Game is suitable for modeling all those aspects of human dynamics that are based on a negotiation of conventions. Thus, if we see our results in the light of opinion dynamics (see Sec. 1.5), we could interpret our multi-word under-critical state as separating a region in which a consensus emerges from another one in which the population is polarized on different possibilities (for instance, different voting directions). The most common way to attain such state in many opinion dynamics models consists in endowing agents of bounded confidence, so that interactions are possible only among individuals whose opinions are not too different [175]. In our case, on the other hand, the mechanism is completely different, since we do not prevent agents from interacting, but rather we add some external noise, endowing them with a sort of distrust on different opinions, or, which is the same, with a tendency in staying with their opinions. Moreover, this latter mechanism incorporates the possibility that an agents changes its opinion even in the polarized state, while bounded confidence determines the emergence of a frozen final state. For all these reasons we believe that the transition in β is very interest-

119

120

Beyond the Naming Game ing, and we are planning to investigate the stochastic update rule in more details in the immediate future (see also [19]). First of all, it would be important to compare carefully our findings with the results obtained for different models (e.g. the kinetic Ising model, or the pair contact process), in which the presence of noise is known to be responsible for transitions from a “consensus” state to disordered configurations [155]. At the same time, we plan to examine the role of topology: preliminary results, which we are not presenting here, show that the transition occurs on all interaction patterns, whose properties can however alter the value βc = 1/3. Furthermore, an important direction consists in passing from a global parameter β, which is the same for all agents, to a quenched-disorder scenario in which each agent i is endowed with a quenched value of the distrust parameter βi , to be extracted by a probability distribution P (β). Finally, we think that the parameter β can be an important ingredient to deal with the interplay between topology and dynamics, when the network of possible interactions is shaped, at least to some extent, by the output of the agents interactions themselves.

5.3

Asymmetric update: broadcasting in the Naming Game

The modification we want to explore now is very simple. The rules are the same of the usual model, but for the fact that only one of the two agents updates its inventory after a success. We start from the case in which only the hearer deletes synonyms after a successful interaction (Fig. 5.5). In this case, the system evolves towards the usual final state. The curves are qualitatively very close to those of the usual NG (see Figure 2.2). Convergence is slower but remarkably, as shown in Figure 5.6, the scaling of the relevant quantities is not altered. This is not surprising, if we consider that we have preserved the backbone of the Naming Game scheme: an agent reacts deterministically, with the usual rules, to a word received by another agent. The update of the speaker, on the other hand, is a second order effect, that depends on the selected word and on the composition of the inventory of the hearer, and that, in any case, concerns only successful interactions. The new setting is very interesting since it corresponds to a simplification of the interaction rules which does not alter the overall dynamics of the model. However its main importance emerges when the population of agents is embedded in a non mean-field topology. Indeed, in this case one could imagine a broadcast communication rule according to which the speaker interacts simultaneously with all its neighbors. This possibility has been explored by Lu, Korniss and Szymanski [133], but, since they adopted the usual interaction rules (Sec. 2), in their scheme the speaker had to collect the answer of all its neighbors. Thus, if the speaker has degree k, a broadcasting

5.3 Asymmetric update

Figure 5.5: Asymmetric update. We explore which is the role of the symmetric post-success update scheme by preventing one of the agents from performing it. In Figure are shown the rules of a Naming Game in which only the hearer renews its inventory after a success.

unit consists in k + 1 communication events. If the update is performed only by the hearer, on the contrary, the k answers are no more needed, and the negotiation process is considerably simplified. The scaling of the main quantities for both ER and BA networks is presented in Figure 5.7. One time step corresponds to a single broadcast action. In these units, a broadcast in a mean field scenario would lead the system to converge in 1 time step. It is worth noting that the exponents are different from the usual case, and that BA and ER networks yield to different behaviors. In particular, the BA topology seems to be more effective than the ER one, possibly due to the important role played by the hubs, which act as powerful convention spreaders, even though they play mostly as hearers,. In this perspective, it is also worth mentioning that, using the broadcast rule, convergence is observed also when the population is embedded in the WAN network, with tconv ∼ 6.106 . As we have seen in Sec. 3.4.8, on this topology the usual rules are not able to lead the system to converge after 1012 time steps. Of course these are only preliminary results, and a more detailed analysis is requested, but, also in view of possible applications of the Naming Game, broadcast seems to be a promising improvement. The opposite case, in which only the speaker updates its inventory, seems quite artificial, as we have mentioned. Moreover, in this case it turns out

121

122

Beyond the Naming Game

8

tmax tconv

6

t ∝ Ν , α ≈ 1.5 tmax (hearer-only) tconv (hearer-only)

10

t

10 10 10

4

α

2

Nw

max

0

10 0 10 8 106 104 102 100 10 0 10

10 max

2

10

4

6

10

γ

Nw ∝ Ν , γ ≈1.5 hearer-only

10

2

10

4

6

10

N

Figure 5.6: Scaling relations. Scaling curves of the usual Naming Game (dots) are compared with those of the model in which only the hearer updates its inventory after a success (green lines). The scaling exponents are the same.

that the time needed by the population to converge on the same convention is strongly increased, and from numerics it seems to scale with the system size as tconv ∼ N β , with β ≈ 2 (Fig. 5.8). To understand this slowing down it is helpful to introduce again the stochasticity in the post-success update rule.

5.3.1

Stochastic asymmetric update

We now merge the stochastic update rule seen in Sec. 5.2, with the asymmetry of the update. So, only one of the two agents updates its inventory after a success, and this happens with probability β. It is straightforward to repeat the analysis performed in Sec. 5.2, i.e. to write down the equations for the evolution of the system when only two words are present and no invention is possible anymore (since we are far from the beginning of the process). In particular, it is interesting to study the case in which only the speaker updates its inventory after a success. The equivalent of eq.(5.3) reads d(nA − nB ) β−1 = (nA − nB )nAB , dt 4

(5.4)

where, as it is usual, nA (nB ) is the fraction of agents with word A (B), and nAB is that of agents storing both words in their inventories. Interestingly,

5.3 Asymmetric update

tmax tconv

6

10

123

α

α ≈ 1.1 β ≈ 1.2

t ∝ N , α ≈ 1.3 β

t ∝ N , β ≈ 1.35

t

4

10

ER

2

10

BA

6

10

max

Nw

max

Nw

γ

γ ≈ 1.2

∝ N , γ ≈ 1.1

4

10

2

BA

ER

10

1

10

2

10

3

10

N

4

10

51

10

2

10

3

10

4

10

5

10

N

Figure 5.7: Broadcasting on networks. The speaker talks simultaneously to all its neighbors, but it does not updates its inventory in any case after a communication event. Scaling of the main quantities for ER and BA networks with hki = 10 is shown.

we see that the transition occurs at β ≡ βc = 1, so that the system is naturally set in a (under-)critical region, even when the update is not stochastic. The slow convergence showed in Fig. 5.8 is then natural. The opposite case in which is the hearer to perform the update, on the other hand, presents a critical value βc = 1/2. The usual symmetric update rule is then more robust, at least for what concerns the role of β. In summary, we have seen that if the update following a success is asymmetric, the situation changes drastically depending on which is the agent that performs it. If it is the hearer, the dynamics stays very close to the case in which the update is symmetric. Moreover, the pair interaction rule can be naturally substituted by a broadcasting scheme, in which the speaker talks simultaneously to all its neighbors, which is of particular interest when the population is embedded in complex topologies. The opposite scheme in which only the speaker performs the update, on the contrary, is highly inefficient. We have given an analytical explanation of this behavior studying the equations describing the two-words state that precedes convergence. Finally, a remark is in order. Defining the asymmetric update rules, and particularly the more interesting case in which only the hearer deletes its synonyms, we are acting upon our simplified model of the Naming Game

124

Beyond the Naming Game

t

10

8

10

tmax tconv

α

t ∝ Ν , α ≈ 1.5 tmax (speaker-only) tconv (speaker-only)

4

Nw

max

0

10 0 10 8 106 104 102 100 10 0 10

10 max

2

10

4

6

10

γ

Nw ∝ Ν , γ ≈1.5 speaker-only

10

2

10

4

6

10

N

Figure 5.8: Only the speaker updates its inventory after a success. This rule is highly inefficient. The scaling of tconv with the system size is more unfavorable than in the usual symmetric case.

without considering the original problem it comes from. Indeed, for instance, these modified rules do not take into any account the possibility of homonymy, in which the hearer could be in need of answering to the speaker not only to communicate a success, but also to disambiguate two objects which, according to itself, have both the name uttered by the speaker. To tackle with such situations, further prescriptions would be in order. On the other hand, if we consider our Naming Game as model for opinion dynamics, for example, we see that the modified rules can acquire a precise meaning (the case of broadcast being the most obvious). In general, as mentioned above, all the modified rules we propose in this chapter are aimed only at clarifying some properties of our usual Naming Game, and, hopefully, at providing interesting hints for future work.

5.4 Deterministic word-selection strategies: higher performances for the Naming Game We now focus on the word selection criteria adopted by the speaker, referring to the material presented in [20]. In the usual Naming Game model (Sec. 2), agents, when playing as speakers, extract randomly a word from their inventories. This feature, along with the drastic deletion rule that follows a successful game, is the distinctive trait of the model. Indeed, as we have seen, most of the previously proposed models of semiotic dynamics prescribe

5.4 Deterministic word-selection strategies that a weight is associated to each word in each inventory, determining its probability of being chosen (see Sec. 1.3). As a natural consequence, the effect of a successful game consists in updating the weights, rewarding the weight associated with the winning word and possibly reducing the others. Such sophisticated structures can in principle lead to faster convergence, but make the models more complicated, compromising the possibility of a clear global-scale picture of the convergence process (see Sec. 5.5). In order to maintain the simplicity of the dynamical rules, it seems natural to alter the purely stochastic selection rule of the word chosen by the speaker. In the model previously described, all the words of a given agent’s inventory share a priori the same status. However, a simple parameter to distinguish between them is their “arrival time”, i.e. the time at which they enter in its inventory. In particular, two words are easily distinguished from the others: the last recorded one and the last one that gave rise to a successful game, i.e. the first that was recorded in the new inventory generated after the successful interaction. Natural strategies to investigate consist therefore in choosing systematically one of these particular words. We shall refer to these strategies as play-last and play-first respectively. Other selection rules are of course possible, but would be either more complicated or more artificial. The scaling behavior of the model when the play-last strategy is adopted is very interesting (not shown). The peak time and height scale respectively as tmax ∼ N α with α ≈ 1.3 and Nwmax ∼ N γ with γ ≈ 1.3, i.e. the used memory is reduced1 , while the convergence time scales as tconv ∼ N β with β ≈ 2.0. At the beginning of the process, playing the last registered word creates a positive feedback that enhances the probability of a success. In particular a circulating word has more probabilities of being played than with the usual stochastic rule, thus creating a scenario in which less circulating words are known by more agents. On the other hand the “last in first out” approach is highly ineffective when agents start to win, i.e. after the peak. In fact, the scaling tconv ∼ N β can be explained through simple analytical arguments. Let us denote by Na the number of agents having the word “a” as last recorded one. This number can increase by one unit if one of these agents is chosen as speaker, and one of the other agents is chosen as hearer, i.e. with probability Na /N × (1 − Na /N ); the probability to decrease Na of one unit is equal to the probability that one of these agents is a hearer and one of the others is a speaker, i.e. (1 − Na /N )Na /N . These two probabilities are perfectly balanced so that the resulting process for the density ρa = Na /N can be written as an unbiased random walk (with actually a diffusion coefficient ρa (1 − ρa )/N 2 ); it is then possible to show that the time 1

Remember that, with the usual stochastic word-selection rule, it holds tmax ∼ N α , tconv ∼ N β , tdif f = (tconv − tmax ) ∼ N δ , and Nwmax ∼ N γ with exponents α ≈ β ≈ γ ≈ δ ≈ 1.5 (Sec 2.3.1).

125

126

Beyond the Naming Game

6

tmax tconv tdiff

4

t ∝ Ν , δ ≈ 1.0

10

t

10 10 10

8

α

t ∝ Ν , α ≈ 1.3 δ

2

0

10 0 10

Nw

max

6

10 4 10 2 10 0 10 0 10

10 max

Nw

2

10

4

6

10

γ

∝ Ν , γ ≈ 1.3

10

2

10

4

6

10

N

Figure 5.9: Play-smart strategy - Scaling with the population size N . Top - For the time of the peak tmax ∼ N α , α ≈ 1.3, while for the convergence time we have tconv ∼ aN α + bN δ with δ ≈ 1.3, δ ≈ 1.0. Bottom - the maximum number of words scales as Nwmax ∼ N γ with γ ≈ 1.3. The play-smart rule gives rise to a more performing process, from the point of view of both convergence time and memory needed.

necessary for one of the ρa to reach 1 is of order N 2 . In summary, in this framework it is much more difficult to bring to convergence all the agents, since each residual competing word has a good probability of propagating to other individuals. The play-first strategy, on the other hand, leads to a faster convergence. Due to a sort of arbitrariness in the strategy before the first success of the speaker, the peak related quantities keep scaling as in the usual model, so that tmax ∼ N α and Nwmax ∼ N γ with α ≈ γ ≈ 1.5 (data not shown). This seems natural, since playing the first recorded word is essentially the same as extracting it randomly when most agents have only few words. In fact, in both cases no virtuous correlations or feedbacks are introduced between circulating and played words. However, playing the last word which gave rise to a successful interaction strongly improves the system-scale performances once the agents start to win. In particular it turns out that for the difference between the peak and convergence time we obtain (tconv − tmax ) ∼ N δ with δ ≈ 1.15 (data not shown), so that the behavior of the convergence time is the result of the combination of two different power law regimes, i.e. tconv ∼ aN α + bN δ . On the other hand, the stochastic rule leads to (tconv − tmax ) ∼ N 1.5 (Sec. 2.3.1). This means that the play-first strategy

5.4 Deterministic word-selection strategies

127

1 0.8

max

Nw(t) / Nw V(t)

0.6 0.4

0.2

0 0

5

1×10

2×10

t

5

3×10

5

4×10

5

Figure 5.10: Play-smart strategy - Fraction V (t) of agents who have played at least one successful game at time t. The transition between the initial condition, in which all agents play the last heard word, to the final one, in which agents play the word which took part in their last successful interaction, is continuous. The growth gets faster after the peak of Nw .

is able to reduce the time that the system has to wait before reaching the convergence, after the peak region. This seems the natural consequence of the fact that successful words increase their chances to be played while suppressing the spreading of other competitors.

5.4.1

The play-smart strategy

We have seen that, compared to the usual random extraction of the played word, the play-last strategy is more performing at the beginning of the process, while the play-first one allows to fasten the convergence of the process, even if it is effective only after the peak of the total number of words. It seems profitable, then, to define a third alternative strategy which results from the combination of the two we have just described. The new prescription, which we shall call play-smart, is the following: → If the speaker has never took part in a successful game, it plays the last word recorded; → Else, if the speaker has won at least once, it plays the last word it had

128

Beyond the Naming Game 1 1

3

S(t)

N=10

0.5

0.5 0

0 0.0

2.5×10

4

0

1×10

4

t

5.0×10

6

7.5×10

4

1

S(t)

N=10

4

stochastic play-last play-first play-smart

0.5

0 0.0

1.0×10

t

6

2.0×10

6

Figure 5.11: Success rate curves S(t) for the various strategies: stochastic, play-last, play-first and play-smart. At the beginning of the process the stochastic and play-first strategies yield similar success rates, but then the deterministic rule speeds up convergence. On the other hand also the play-smart and the play last evolve similarly at the beginning, but the latter reaches the final state much earlier through a steep jump. It is worth noting that for three strategies the S(t) curves present a characteristic “S-shaped” behavior, while in the play-last one the disorderorder transition is more continuous (see inset in the top figure). All curves, both for N = 103 and N = 104 , have been generated averaging over 3 × 103 simulation runs.

a communicative success with. The first rule will thus be applied mostly at the beginning, and as the system evolves, the second rule will be progressively adopted by more and more agents. Since the change in strategy is not imposed at a given time, but takes place gradually, in a way depending of the evolution of the system, such a strategy has also the interest of being in some sense self-adapting to the system’s actual state. In Figure 5.9, the scaling behaviors relative to the play-smart strategy are reported. Both the height and time of the maximum follow the scaling of the play-last strategy: tmax ∼ N α and Nwmax ∼ N γ with α ≈ γ ≈ 1.3. The convergence time, on the other hand, scales as a superposition of two power laws: tconv ∼ aN α + bN δ with α ≈ 1.3, δ ≈ 1.0. Thus, the global behavior determined by the play-smart modification is indeed less demanding in terms of both memory and time. In particular,

5.4 Deterministic word-selection strategies 2×10

4

Nw(t)

N=10 1×10

129

3

stochastic play-last play-first play-smart

4

3

5×10

0 0.0

2.5×10

4

4

t

5.0×10

7.5×10

4

5

4×10

5

N=10

4

Nw(t)

3×10

5

2×10

5

1×10

0 0.0

1.0×10

t

6

2.0×10

6

Figure 5.12: Total number of words Nw (t) for the various strategies: stochastic, play-last, play-first and play-smart. Due to different scaling behaviors of the process, differences become more and more relevant for larger N (top figure: N = 103 ; bottom figure: N = 104 ). The play-smart approach combines the advantages of play-last and play-first strategies.

while the lowering of the peak height yields in fact a slower convergence for the play-last strategy, the progressive self-driven change in strategy allows to fasten the convergence further than for the play-first strategy. It is also worth studying how the transition between the initial situation in which most agents play the last recorded word to that in which they play the last successful word takes place in the play-smart strategy. In other words we want to study the probability V (t) of finding an agent that has already been successful in at least one interaction at a given time (see Sec. 2.3.1 for the behavior of V (t) in the usual Naming Game). Results relative to a population of 104 agents are shown in Figure 5.10. Interestingly, the transition from the initial situation to the final one is continuous, and there is a sudden speeding up after the peak. Finally, in order to have an immediate feeling of what different playing word selection strategies imply, we report in Figures 5.11 and 5.12 the success rate S(t) and the total number of words, Nw (t) relative to the four strategies described previously, for two different sizes. The play-first and play-smart curves exhibit the same “S-shaped” behavior for S(t) as in the case of the stochastic model, while the play-last rule affects qualitatively the way in which the final state is reached. Indeed, in this case the transition between


the initial disordered state and the final ordered one is more continuous (see the inset in the top figure). Moreover, Figure 5.12 illustrates that the choice of the strategy has substantial quantitative consequences for both the memory and the time needed to reach convergence, even if the changes in the scaling behavior might at first appear rather limited (from N^1.5 to N^1.3). In particular, the play-smart strategy, which adapts itself to the state of the system, leads to a drastic reduction of the memory and time costs and thus to a dramatic increase in efficiency. In summary, we have shown that modifying the rule followed by the agents to select the word to be transmitted gives rise to a process which is less demanding in terms of the agents' memory usage and also leads to a faster convergence. Due to the possible utility of the Naming Game model as inspiration for the design of technological systems, we believe that the findings presented here are not only intrinsically interesting from a theoretical point of view, but can also be relevant for applications.

5.5

A not efficient Naming Game

To model the mere emergence of a state in which individuals can understand each other, we can simplify the Naming Game drastically. In particular, we do not need to make the agents update their inventories after a successful interaction. This corresponds to setting β = 0 in the context of the stochastic update rule defined in Section 5.2. All agents start with empty inventories and, at any time t, two randomly extracted individuals interact according to the following rules:

- The speaker randomly extracts a word contained in its inventory or, if its inventory is empty (i.e. it has never played before), it invents a new word. The word is then transmitted to the hearer.

- If the hearer does not know the uttered word, it adds it to its inventory and the interaction is a failure.

- If the hearer knows the uttered word, the interaction is a success.
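A minimal simulation of this simplified model could look as follows (a Python sketch; the function name and data structures are ours and purely illustrative):

```python
import random

def inefficient_naming_game(N, steps):
    """No-deletion Naming Game (beta = 0): inventories are sets,
    and nothing is ever removed after a success."""
    inventories = [set() for _ in range(N)]
    next_word = 0
    successes = []
    for _ in range(steps):
        s, h = random.sample(range(N), 2)        # speaker and hearer
        if not inventories[s]:                   # empty inventory: invent
            word = next_word
            next_word += 1
            inventories[s].add(word)
        else:
            word = random.choice(tuple(inventories[s]))
        if word in inventories[h]:
            successes.append(1)                  # success: nothing changes
        else:
            inventories[h].add(word)             # failure: hearer stores the word
            successes.append(0)
    return inventories, successes
```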

In the absorbing state, each inventory contains all the O(N/2) different words created by the population. Therefore, at convergence, all interactions are successes, i.e. S(t) = 1, but the required memory scales as O(N^2). In other words, the communication system is effective, since agents can understand each other, but it is not efficient, since, in the consensus state, they have to store a number of words which is proportional to the system size. The absence of the deletion rule drastically reduces the complexity of the model, and makes it possible to write exact master equations for the


Figure 5.13: The Naming Game without word deletion. The mean-field equations fit the simulation data perfectly. Since there is no deletion of words, in the final state every agent stores in its inventory all the words invented by the population (that are, on average, N/2). Curves refer to a population of N = 100 individuals.

evolution of the most relevant quantities. We call N_d^i(t) the number of words invented by agent i up to time t, and N_w^i(t) the number of words stored in its inventory. For generality we can also work with more than one object, so that N_o^i(t) is the number of objects agent i has dealt with at time t. We take the number of objects into account explicitly since this simple case allows us to give an example of the trivial impact of this parameter on the model (once homonymy has been neglected).



\begin{align*}
N_o^i(t+1) &= N_o^i(t) + \frac{2}{N}\left(1 - \frac{N_o^i(t)}{N_o^{max}}\right) \\
N_d^i(t+1) &= N_d^i(t) + \frac{1}{N}\left(1 - \frac{N_o^i(t)}{N_o^{max}}\right) \tag{5.5}\\
N_w^i(t+1) &= N_w^i(t) + \frac{1}{N}\left(1 - \frac{N_o^i(t)}{N_o^{max}}\right)
  + \frac{1}{N}\left[\frac{1}{N-1}\sum_{j\neq i}\left(1 - \frac{N_o^j(t)}{N_o^{max}}\right)
  + \frac{1}{N-1}\sum_{j\neq i}\frac{N_o^j(t)}{N_o^{max}}\left(1 - \frac{N_w^i(t)}{N_d(t)}\right)\right]
\end{align*}

The meaning of equations (5.5) is straightforward. The increments in the equations for N_o^i(t+1) and N_d^i(t+1) are proportional to the probability that the considered object has not yet been discovered. They differ only by a factor of 2, since a new word can be invented only by a speaker, while a new object is identified also by the hearer. The total number of known words, on the other hand, grows each time a new object is found, both when the agent plays as speaker (first term) and when it plays as hearer (first term inside the square brackets). Finally, after a failure, a hearer may have to store a new word even if it already knew the considered object (second term inside the square brackets).



The equations can be simplified within a mean-field approximation based on an average agent. Recovering the usual notation, we write the total number of words as N_w(t) = Σ_{i=1}^{N} N_w^i(t), the number of different words as N_d(t) = Σ_{i=1}^{N} N_d^i(t), and the number of discovered objects as N_o(t) = Σ_{i=1}^{N} N_o^i(t). The approximation consists in writing N_w^i(t) = N_w(t)/N, N_d^i(t) = N_d(t)/N, and N_o^i(t) = N_o(t)/N. Inserting these relations into equations (5.5), we obtain:



\begin{align*}
N_o(t+1) &= N_o(t) + 2\left(1 - \frac{N_o(t)}{N_o^{max} N}\right) \\
N_d(t+1) &= N_d(t) + \left(1 - \frac{N_o(t)}{N_o^{max} N}\right) \tag{5.6}\\
N_w(t+1) &= N_w(t) + 2\left(1 - \frac{N_o(t)}{N_o^{max} N}\right) + \frac{N_o(t)}{N_o^{max} N}\left(1 - \frac{N_w(t)}{N_d(t)\,N}\right)
\end{align*}

Figure 5.13 shows that equations (5.6) fit the simulation data very accurately, thus confirming the validity of the mean-field approximation. In the absorbing state it holds N_d ≃ (N/2) N_o and N_w ∼ (N^2/2) N_o. Thus, as mentioned above, the memory required in the convergence state, in which S(t) = 1, diverges as N goes to infinity, so that the communication system built up by the agents is not efficient. The number of objects, on the other hand, enters only as a simple multiplicative factor.
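For concreteness, the recursion (5.6) can be iterated numerically as in the following sketch (a minimal Python sketch assuming one interaction per time step; variable names are illustrative):

```python
def mean_field(N, n_objects, steps):
    """Iterate the mean-field recursion (5.6); n_objects plays the role of No_max."""
    No, Nd, Nw = 0.0, 0.0, 0.0
    history = []
    for _ in range(steps):
        undiscovered = 1.0 - No / (n_objects * N)
        No_next = No + 2.0 * undiscovered
        Nd_next = Nd + undiscovered
        Nw_next = Nw + 2.0 * undiscovered
        if Nd > 0:                               # extra word stored by the hearer
            Nw_next += (No / (n_objects * N)) * (1.0 - Nw / (Nd * N))
        No, Nd, Nw = No_next, Nd_next, Nw_next
        history.append((No, Nd, Nw))
    return history
```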

5.6

Some remarks on weights

In the original Naming Game model, competing synonyms were associated with weights, or scores (see Sec. 1.3). The model we have introduced in this thesis (Sec. 2), on the other hand, is able to account for the emergence of a shared lexicon without resorting to this additional parameter. The purpose of this section is to define a possible implementation of weights in our Naming Game model, inspired by the model proposed in [179], and to show that the overall dynamics is not altered in any significant way. By "significant" we mean "drastic enough to justify a considerable increase in the complexity of the model", which would make in-depth investigations much more costly. We define a weight, or score, as a real number associated with a word (and which somehow determines its fitness). The rules of a model including weights could be as follows. As usual, two agents are randomly selected, one to be the speaker and the other the hearer; then,

- The speaker selects from its inventory the word with the highest score. If two or more words share the highest weight, the agent selects randomly among them. Finally, if the inventory is empty, the speaker invents a new word, whose initial score will be ∆.


Figure 5.14: Scaling relations for the weighted model. Three weight settings are considered. Weights ensure a greater efficiency in terms of reduced memory requirements, and a slight speeding up of the convergence time. These improvements, however, strongly depend on the chosen values of the new parameters. The presence of weights requires a considerable increase in the complexity of the agents' internal architecture and, consequently, of their interaction rules.

- If the hearer has the uttered word in its inventory, the interaction is a success. Both agents increase the weight of the spoken word by an amount ∆_u. They also perform lateral inhibition, decreasing the scores of the competing words by an amount ∆_d.

- If the hearer does not have the uttered word in its inventory, it records it, and the interaction is a failure. The initial score of the new word is ∆, while the scores of all the other words are diminished by ∆_d. The speaker, on the other hand, decreases the weight of the spoken word by ∆_d^*.

Weights are bounded, and 0 ≤ ∆ ≤ 1. If the score associated with a word becomes smaller than zero, we prescribe that the word is deleted from the inventory. On the other hand, weights cannot become greater than one, and any further increase is simply discarded. Finally, we must assign the numerical values of ∆, ∆_u, ∆_d and ∆_d^*, which crucially affect the dynamics. The rules defined in Sec. 2.2 for the usual unweighted model are recovered by setting ∆ = ∆_u = 1, ∆_d = 1 + ε (∀ε > 0), and ∆_d^* = 0. In general, there are infinitely many combinations of the four parameters.
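A single interaction of the weighted model described above could be sketched as follows (a hedged Python sketch of the rules as stated here, with our own variable names D, Du, Dd, Dds standing for ∆, ∆_u, ∆_d, ∆_d^*; inventories are dictionaries mapping words to scores, and next_word can be, e.g., itertools.count()):

```python
import random

def weighted_game(speaker, hearer, next_word, D=0.5, Du=0.1, Dd=0.1, Dds=0.1):
    """One weighted Naming Game interaction; scores below zero delete the word,
    and scores are capped at one."""
    if not speaker:
        word = next(next_word)                   # empty inventory: invent
        speaker[word] = D
    else:
        top = max(speaker.values())
        word = random.choice([w for w, s in speaker.items() if s == top])
    if word in hearer:                           # success: reinforce + inhibit
        for inv in (speaker, hearer):
            inv[word] = min(1.0, inv[word] + Du)
            for other in [w for w in inv if w != word]:
                inv[other] -= Dd
                if inv[other] < 0:
                    del inv[other]
    else:                                        # failure
        for other in list(hearer):
            hearer[other] -= Dd
            if hearer[other] < 0:
                del hearer[other]
        hearer[word] = D                         # hearer records the new word
        speaker[word] -= Dds                     # speaker penalizes the spoken word
        if speaker[word] < 0:
            del speaker[word]
    return word
```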


Figure 5.15: Evolution in time of a Naming Game with weights. Curves for the total number of words, Nw(t), the number of different words, Nd(t), and the success rate, S(t), are presented (for a population of N = 1000 agents). The weights associated with the words in the agents' inventories reduce the height of the peak of Nw(t), and prevent the formation of the plateau in the Nd(t) curve. However, the overall dynamics is not qualitatively altered. After a first long period in which inventories correlate, convergence is reached with a sudden jump in the success rate, and a corresponding fast decay in the number of different words. The latter is clear from the inset of the Nd(t) graph, thanks to the double logarithmic scale on the axes.

In Figure 5.14, we fix ∆ = 0.5, ∆_d = ∆_d^* = 0.1 and we present curves for three values of ∆_u: a) ∆_u = ∆_d = 0.1, b) ∆_u = 3∆_d = 0.3 and c) ∆_u = ∆ = 5∆_d = 0.5. The scaling of the convergence time tconv with the system size is almost the same as in the usual unweighted case for setting c) (tconv ∼ N^1.5), while configurations a) and b) allow for some speeding up (tconv ∼ N^1.4). The time tmax and height Nw^max of the peak of the total number of words, on the contrary, scale respectively as tmax ∼ N^{1.35±0.1} and Nw^max ∼ N, while in the model without scores they scale as tmax ∼ Nw^max ∼ N^1.5. In Figure 5.15 we show the evolution in time of the total number of words, Nw(t), of the number of different words, Nd(t), and of the success rate S(t) for configuration a) described above, along with the corresponding curves of the unweighted model. The effect of weights is not only to reduce the height of the peak of Nw(t), as noted above, but also to prevent the plateau in the number of different words. However, the initial bending of the Nd(t) curve, which is to be ascribed to the inhibition rule adopted by the speaker (parameter ∆_d^*), does not have any significant effect on the

success rate, whose behavior is qualitatively unchanged. Thus, the general picture developed for the unweighted model remains valid also when scores are present. A long initial reorganization transient, in which inventories correlate, is followed by a sudden jump in the success rate and a corresponding fast reduction in the number of competing synonyms, which is highlighted by the logarithmic scale in the inset of Fig. 5.15. In conclusion, while weights associated with single words give some advantages in terms of reduced memory requirements and faster convergence, they do not alter the overall dynamics leading the population to converge. The price is however quite high, since both the agents' internal structure and the interaction rules become considerably more complex. Moreover, as we have seen, the introduction of weights requires the definition of a certain number of parameters (four, in our case) and further specifications of the model (e.g. the bounds on the weights and the rules describing what happens when they are reached), which weaken the transparency of the dynamics. In-depth analyses become much more costly, and analytical approaches more difficult. Moreover, preliminary results show that weights are not necessary even when homonymy is taken into account. For all these reasons we have decided not to consider weights in the work presented in this thesis, where we have favored the simpler unweighted Naming Game.

5.7

Conclusions

In this Chapter we have proposed and explored some modifications of the Naming Game introduced in Chapter 2. The results of our investigations shed further light on the role of the different rules that characterize the model, but they also open the way to some new promising directions for future work. Indeed, a very simple remodeling allows us, in certain cases, to improve the original scheme, or to generalize it in very fruitful ways. First of all, we have introduced a stochastic update rule, according to which agents update their inventories with a given probability β after a successful interaction (Sec. 5.2). The resulting model is thus a generalization of the original one, which is recovered for β = 1. Our analysis has shown that at β = βc = 1/3 there is the first of a series of transitions (occurring at lower values of β) which change the final state of the model. For β > βc convergence is characterized by a global consensus on a uniquely accepted word, while for β < βc several conventions coexist permanently (in the thermodynamic limit). We have been able to recover exactly, by means of an analytical approach, the value of the first transition. The result acquires particular interest from the point of view of socio-physics models, where such polarized final states are usually determined by the presence of a bounded confidence constraint imposed on the agents (see Sec. 1.5). Indeed, our mechanism, based on a tendency of the agents to stick to their


present opinions, is completely different. We have then altered the symmetry of the update that follows a success (Sec. 5.3). Remarkably, if only the hearer performs the update, the fundamental dynamical properties do not change, i.e. the scaling exponents of the main quantities are not altered. This finding allows us to introduce in a very natural way the possibility of broadcasting in the Naming Game, since a speaker, having transmitted a word to all its neighbors, does not have to collect their answers in order to perform an update of its own. From preliminary experiments, broadcasting turns out to speed up convergence on complex networks, particularly when they present a broad degree distribution. When the update is performed only by the speaker, on the other hand, convergence slows down drastically. The reason is that the speaker-only update rule sets the system in a critical situation corresponding to the β = 1/3 case of the symmetric update scheme. Another important aspect we have examined is the procedure describing which word has to be played by the speaker (Sec. 5.4). We have substituted the usual random extraction with extremely simple deterministic rules, which exploit only the information regarding the time at which the different words were inserted in the inventory of the agent. In particular, two of them can be distinguished, namely playing the oldest and playing the newest word. We have shown that simple deterministic rules can capitalize on, and somehow tune, the correlation among the inventories, thus increasing the efficiency of the Naming Game in terms of both the individual use of memory and the convergence time. Finally, we have investigated two more academic cases. The first is a Naming Game in which agents never delete words, and which is therefore not efficient (Sec. 5.5); it bears some interest because it corresponds to the β = 0 limit of the generalized Naming Game. The dynamics is very simple, and exact equations can be derived which describe the evolution in time of the main global quantities. Moreover, such equations can be simplified by a mean-field assumption, and we have shown that also in this case they describe perfectly the corresponding simulation data. Then, going back to the model that inspired our Naming Game, we have analyzed a possible implementation of scores to be associated with single words in the agents' inventories (Sec. 5.6). The new model is of course significantly more complex, so that in-depth analyses and analytical approaches become more difficult. Moreover, we have shown that, while weights can slightly improve the performance of the Naming Game (if the new parameters are tuned appropriately), they do not seem to alter the dynamics from a qualitative point of view, which is a further reason to concentrate on the unweighted model.

Chapter 6

Random walk on complex networks

6.1

Introduction

In the Naming Game, the population builds up a (proto-)communication system in which all the agents agree on the same unique convention, for example the name to assign to an object. Different conventions, which we call words, are created at the beginning of the process and compete to survive. In general, their chances depend on the time of their creation (Sec. 2.6) and on their popularity (Sec. 2.4.2). However, when the population is embedded in a complex topology, it is also important to consider which agents store a given word (see Chap. 3 and 4), or, conversely, what the chances are for a given node to be reached by a convention. It is then very interesting, for our purposes, to study how a non-trivial topology can affect the spreading of information [32, 69, 71, 131, 169]. In this perspective, we investigate the basic problem of a random walk on complex networks [110], with particular focus on the mean first passage time (MFPT) of a walker at a generic node. This, in fact, can be seen as a first possible schematization of the process in which an agent acknowledges the existence of a new convention. Thus, semiotic dynamics is the fundamental inspiring framework for the work we present in this last chapter, but in what follows we concentrate on the random-walk problem, addressing it from a strictly statistical-physics point of view and presenting the results contained in [23]. It is well known that the random walk is a fundamental process for exploring an environment [110, 167, 132, 29, 7, 196], and recently great attention has been devoted to the study of random walks on networks (see, for instance, [1, 8, 149, 151, 95, 174, 36, 171, 88, 197]). In this process a walker, situated


on a given node at time t, can be found with probability 1/k on any of the k neighbors of that node at time t + 1. In particular, we are interested in the mean first passage time (MFPT) at a node s for a random walker starting from a generic, unknown, node x. It is important to note here that Noh and Rieger [151] have derived an exact formula for the MFPT T_sj of a random walker between two nodes s and j in a generic finite network, exploiting the properties of Markov chains. However, we do not trivially average T_sj over all j ≠ s, a very costly operation; instead we introduce the concept of ring (see also [70]). In this perspective we study the graph as seen by node s, and partition it into rings according to the topological distance of the different nodes from s (see also [145, 174]). This allows us to map the original Markov problem (of N states) onto a new Markov chain of drastically reduced dimension (O(ln N/ ln⟨k⟩ × ln N/ ln⟨k⟩)) and, as a consequence, to calculate the MFPT at a generic node s with a reduced computational cost. On the other hand, with the new process the identity of the single target node s is lost, and all the nodes with the same connectivity (i.e. number of neighbors) become indistinguishable. Our explicit calculation is almost free of approximations only for Erdős-Rényi random graphs, for which we obtain an excellent agreement between theory and numerical simulations. The more disordered scenario of other complex networks makes the extension of our approach progressively more problematic. Nevertheless, quite surprisingly, we find that our approach is able to make very good predictions also for other synthetic networks, such as Barabási-Albert scale-free networks, and for at least two real-world graphs. In all these cases, the considered networks behave, with respect to the property studied, as if they were random graphs with the same average degree. Finally, our approach allows us to show that a random walker rapidly recovers the degree distribution of the network it is exploring. The chapter is organized as follows. In Sec. 6.2 the concept of ring is introduced and the new Markov process onto which the original problem can be mapped is defined. In Sec. 6.3 explicit calculations for the case of random graphs are performed. It is shown that the description of a random graph in terms of rings is very accurate, and that the theoretical predictions for the MFPT are in excellent agreement with results from simulations. Sec. 6.4, on the other hand, is devoted to the possible extension of the theory to other networks. Notwithstanding the difficulties that arise in the analytical extension of the theory, it is shown that the MFPTs of walkers on both artificial and real-life networks can be predicted quite accurately with our theory. It is also shown that a random walk can be used for the reconstruction of the degree distribution of a network. Conclusions are contained in Sec. 6.5.


Figure 6.1: Ring structures. Rings from the point of view of the target node s. Six nodes belong to the first ring (white vertices), so that n1 = 6, and 10 nodes belong to the second ring (red vertices), thus n2 = 10. Note that, by definition of ring, there are no connections between nodes in the second ring and the target node s.

6.2

A new process - rings

Every graph, or network, can be described in terms of an associated adjacency matrix A (see Appendix A for more details) whose element A_ij = 1 if nodes i and j are connected, and A_ij = 0 otherwise. We restrict ourselves to working with undirected graphs, i.e. A_ij = A_ji, which do not contain any links connecting a node with itself (A_ii = 0, ∀i), and we concentrate on the case in which they are connected, i.e. in which each pair of nodes i, j is connected by at least one path. From a random-walk point of view, the matrix A can be interpreted as the N × N symmetric transition matrix of the associated Markov process¹. We are interested in the problem of the average MFPT at a node s of degree k(s) of a random walker that started

¹ A random walk can be seen as a Markov process with the identification position-state.


from a different, unknown, node x. Our idea is to map the original Markov process A onto a much smaller process B, which will be asymmetric and will contain self-loops (i.e. B_ii ≠ 0). More precisely, we reduce the N × N matrix to an O(ln N/ ln⟨k⟩ × ln N/ ln⟨k⟩) matrix. Given the target node s, we start by subdividing the entire network into subnetworks, or rings (see also [145, 174]), r_l, with the following property:

r_l = \{\, \text{nodes } j \mid d_{sj} = l \,\} \tag{6.1}

where d_sj = d_js is the distance between nodes s and j, i.e. the smallest number of links that a random walker has to traverse to get from j to s (Fig. 6.1). These rings will be the states of the new matrix B. Their number, being proportional to the maximum distance between any two nodes in the network, i.e. to the diameter of the network, is O(ln N/ ln⟨k⟩) [66, 67] (where ⟨k⟩ is the average degree of the nodes of the graph). Other important quantities are the average number m_{r_l,r_{l+1}} ≡ m_{l,l+1} of links that connect the elements of r_l with the elements of r_{l+1}, and the average number m_{l,l} of links between nodes belonging to the same r_l. We trivially have m_{l,l−1} = m_{l−1,l} and m_{l,k} = 0 if |l − k| > 1. We now have all the elements to define the new process. We are no longer interested in the exact position of the random walker; the relevant information is now the ring in which the walker is. The matrix of this process has size (l_max + 1) × (l_max + 1), where l_max is the diameter of the original graph. The matrix has the following structure² (for the case l_max = 6):

B = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
b_{10} & b_{11} & b_{12} & 0 & 0 & 0 & 0 \\
0 & b_{21} & b_{22} & b_{23} & 0 & 0 & 0 \\
0 & 0 & b_{32} & b_{33} & b_{34} & 0 & 0 \\
0 & 0 & 0 & b_{43} & b_{44} & b_{45} & 0 \\
0 & 0 & 0 & 0 & b_{54} & b_{55} & b_{56} \\
0 & 0 & 0 & 0 & 0 & b_{65} & b_{66}
\end{pmatrix} \tag{6.2}

where b_ij = m_ij / (Σ_{k≠i} m_ik + 2 m_ii) for i ≠ j, and b_ii = 2 m_ii / (Σ_{k≠i} m_ik + 2 m_ii). b_ij thus represents the probability of going from ring i to ring j. By the definition of rings it is clear that it is not possible to move from a ring to a non-adjacent ring, while it is obviously possible to move inside a ring; in this case the number of links must be doubled to take into account that each internal link can be traversed in both directions. The elements of the first row of the matrix are set equal to 0 because we are interested in the first passage time at the target node s. The probability P_ij^(t) of going from

² For ease of notation, rows (and columns) are labeled with indices starting from 0 instead of 1.


state i to state j in t steps is given by (B^t)_ij. Setting b_01 = 1 would allow the walker to escape from node s, while b_00 = 1 should be used if we were interested in the probability that the walker has reached node s before time t. The probability F_{k(s)}(t) that the first passage at node s occurs at time t is then:

F_{k(s)}(t) = \sum_{l=1}^{l_{max}} \frac{n_l}{N-1}\,(B^t)_{l0} \tag{6.3}

where n_l is the number of nodes belonging to the ring r_l, and each matrix term is weighted with the probability that the random walker started in the ring corresponding to its row, i.e. n_l/(N − 1). The MFPT τ(k(s)) can then be calculated using eq. (6.3) as:

\tau(k(s)) = \sum_{l=1}^{\infty} l\,F_{k(s)}(l) \tag{6.4}
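To make the construction concrete, the following sketch builds the rings and the reduced matrix B of eq. (6.2) directly from a given graph, and then evaluates eqs. (6.3)-(6.4) by iterating the ring-occupation probabilities (a minimal Python/NumPy sketch; the graph is assumed connected and undirected, represented as a dictionary mapping each node to the set of its neighbors; all names are illustrative):

```python
from collections import deque
import numpy as np

def ring_partition(adj, s):
    """BFS distances from the target node s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def ring_matrix(adj, s):
    """Build the reduced transition matrix B of eq. (6.2) and the ring sizes n_l."""
    dist = ring_partition(adj, s)
    lmax = max(dist.values())
    n = np.zeros(lmax + 1)
    m = np.zeros((lmax + 1, lmax + 1))
    for u, du in dist.items():
        n[du] += 1
        for v in adj[u]:
            m[du, dist[v]] += 1             # internal links end up counted twice
    B = np.zeros((lmax + 1, lmax + 1))
    for i in range(1, lmax + 1):
        B[i, :] = m[i, :] / m[i, :].sum()   # row-normalized; row 0 stays absorbing
    return B, n

def mfpt(adj, s, tmax=100000):
    """First-passage distribution (6.3) and mean first passage time (6.4)."""
    B, n = ring_matrix(adj, s)
    N = int(n.sum())
    p = n / (N - 1)                         # walker starts on a random node != s
    p[0] = 0.0
    F, tau, t = [], 0.0, 0
    while t < tmax and p.sum() > 1e-12:
        t += 1
        p = p @ B                           # propagate ring occupation probabilities
        F.append(p[0])                      # probability of first arrival at s at time t
        tau += t * p[0]
        p[0] = 0.0                          # remove absorbed probability
    return tau, F
```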

6.3

Explicit calculation for random graphs

6.3.1

Static

We start by recalling the concept of a random graph (see also Appendix A). A random graph is obtained in the following manner: given a finite set of N isolated nodes, all the N × (N − 1)/2 pairs of nodes are considered and a link between two nodes is added with probability p. This yields (in the limit N → ∞) the Poisson distribution for the degree k of a node:

P(k) = \frac{\langle k\rangle^{k}}{k!}\,e^{-\langle k\rangle} \tag{6.5}

with ⟨k⟩ = p(N − 1). It is clear that such a graph does not contain any relevant correlations between nodes and degrees, and this will allow us to obtain exact average relations for the quantities illustrated in the previous section. The first important quantity to calculate is the average number n_l of elements of r_l. It holds

n_{l+1} = \left(N - \sum_{k=0}^{l} n_k\right)\left(1 - (1-p)^{n_l}\right) \tag{6.6}

where n_{l+1} is calculated as the expected number of nodes, not belonging to any interior ring, that are connected to at least one member of r_l. Figure 6.2 illustrates that eq. (6.6) is in excellent agreement with results from simulations. Obviously n_0 = 1 and n_1 = ⟨k⟩. However, if we know the degree k(s)


Figure 6.2: Nodes n_l per ring. The ring populations n_l are shown. Comparisons of the theoretical predictions from eq. (6.6) and data from simulations for different values of k(s) in a single E-R graph of size N = 10^3 and ⟨k⟩ = 6 are shown. The fractional error e, defined as the ratio between measured and calculated values, is also plotted for each represented value of k(s). The theoretical average quantities are in excellent agreement with single-graph measurements.

of node s, we can impose n_1 = k(s) and calculate the following n_{l>1} in the usual manner. In fact, this is the way in which we will use eq. (6.6). For the average number m_{l,l+1} of links that connect the elements of r_l with the elements of r_{l+1}, and the average number m_{l,l} of links between nodes belonging to the same r_l, we have:

\begin{align*}
m_{l,l+1} &= n_l\left(N - \sum_{k=0}^{l} n_k\right)p \tag{6.7}\\
m_{l,l} &= \frac{n_l(n_l-1)}{2}\,p
\end{align*}

where the fact that a link between two nodes exists with probability p is exploited. As a practical prescription we add that, when eq. (6.7) yields an unphysical m_{l,l} < 0, one has to redefine m_{l,l} = 0. Also these expressions, which are crucial for the construction of the B matrix of eq. (6.2), give predictions in excellent agreement with results from simulations (data not shown). As expected, in the limit p → 0 we have m_{l,l+1} → n_{l+1}, while,



Figure 6.3: Average ring degree. The average degree of the nodes belonging to different rings is shown as a function of the ring distance l for an ER random graph (N = 10^4 nodes and ⟨k⟩ = 6). A dependence on the ring distance appears clearly, as predicted by the theory. Results for different values of k(s) are shown (predictions shown only for the case k(s) = 6).

when p → 1, m_{l,l+1} → n_l × n_{l+1}. Note that from eq. (6.7) one has that n_l < 1 ⇒ m_{l,l} < 0, which clearly has no physical meaning. Before going on, it is interesting to make a remark. It is known [145, 146] that the nearest neighbors of a generic node have particular properties, e.g. an average degree different from the ⟨k⟩ of the graph. Our relations are able to predict this fact. Combining eq. (6.6) and eq. (6.7) it is easy to see that the average degree of the nodes belonging to r_l depends on l, being constant (and larger than ⟨k⟩) for low values of l and decreasing rapidly for l large enough. Data from the simulations reported in Figure 6.3 show that this prediction is correct. This agreement is not surprising, since eq. (6.6) and eq. (6.7) are separately in excellent agreement with simulations, and are thus able to predict very accurately the average degree of the nodes belonging to the same ring.
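The analytic construction for ER graphs can be summarized in a short sketch that iterates eq. (6.6) for the ring sizes, uses eq. (6.7) for the link counts, and assembles the matrix B; the result can then be fed into eqs. (6.3)-(6.4) exactly as in the earlier sketch (a minimal Python/NumPy sketch with illustrative names; the cutoff lmax and the stopping rule are our own practical choices):

```python
import numpy as np

def er_ring_matrix(N, mean_k, k_s, lmax=50):
    """Analytic rings for an ER graph: n_l from eq. (6.6), link counts from eq. (6.7)."""
    p = mean_k / (N - 1)
    n = [1.0, float(k_s)]                        # n_0 = 1, n_1 = k(s)
    while len(n) < lmax and sum(n) < N - 0.5:
        remaining = N - sum(n)
        n.append(remaining * (1.0 - (1.0 - p) ** n[-1]))   # eq. (6.6)
    L = len(n) - 1
    m = np.zeros((L + 1, L + 1))
    for l in range(L):
        m[l, l + 1] = m[l + 1, l] = n[l] * (N - sum(n[: l + 1])) * p   # eq. (6.7)
    for l in range(L + 1):
        m[l, l] = max(0.0, n[l] * (n[l] - 1.0) / 2.0 * p)              # eq. (6.7)
    B = np.zeros((L + 1, L + 1))
    for i in range(1, L + 1):
        denom = m[i, :].sum() + m[i, i]          # off-diagonal links + internal links counted twice
        B[i, :] = m[i, :] / denom
        B[i, i] = 2.0 * m[i, i] / denom
    return B, np.array(n)
```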

6.3.2

Dynamics

So far, we have shown that predictions of the above relations on the static properties of a random graph are correct. We now explore the predictions on the diffusion processes. Figure 6.4 shows, for different values of the degree k(s) of the target node, the comparison of the predicted values of Fk(s) (t),


Figure 6.4: First passage time distributions. FPT probability distributions, both measured and calculated, for an ER graph with ⟨k⟩ = 6 are presented for different values of k(s). Theoretical predictions are in excellent agreement with the data of a random walk on a single graph. Theoretical curves are obtained with eq. (6.3), and fit the relation F_{k(s)}(t) = (1/τ) exp(−t/τ) with τ obtained from eq. (6.4). In the inset the first part of the distribution is shown in more detail.

calculated with eq. (6.3), and the results from simulations. Data from simulations are obtained by selecting randomly, for each of the approximately N × N runs, one of the nodes of the selected degree k(s) present in the network to play the role of s, and one of the remaining N − 1 nodes to be the starting point of the random walker. The agreement between theory and simulations is very good. The exponential behavior, typical of finite ergodic Markov chains, has the form f(t) = (1/τ) exp(−t/τ). Figure 6.5 shows that the average MFPT τ(k(s)), calculated using eq. (6.4) (with the B matrix built using eqs. (6.6) and (6.7)), is in good agreement (though slightly smaller) with the values obtained in simulations. We shall return to the origin of this small disagreement between theory and simulations at the beginning of the next section. The relation τ(k(s)) = τ(1) × k(s)^{-1}, found both in theory and in simulations (see also [151]), can be explained with elementary qualitative probabilistic arguments. In fact, since, as shown in Figure 6.3, the average degree of the nodes in r_1 does not depend on k(s) (i.e. on the size n_1 of r_1), also the MFPT at a node of r_1 is independent of n_1. This means that, while the probability of passing from r_1 to s does not depend on n_1, a larger r_1 is visited more often than a smaller one. Combining these observations, it seems plausible to guess that the MFPT at a target node s with k(s) > 1


Figure 6.5: Mean first passage times. In the upper graph the MFPT, both measured and calculated (using eq. (6.4)), is reported for an ER graph of size N = 10^3 with ⟨k⟩ = 6. Error bars on measured values are not visible on the scale of the graph. The line τ(k(s)) ≃ τ(1) × k(s)^{-1} is also plotted. It holds τ(1)_sim = 7413 and τ(1)_calc = 7164 for the values obtained respectively from simulations and from the calculation. It can be noted that the order of magnitude of τ(1) is given by 2M, where M is the total number of links in the graph; in our case we have 2M ≃ ⟨k⟩ × N = 6000. In the lower graph the fractional error e, defined as the ratio between simulated and calculated MFPTs, is reported.

will be 1/k times the MFPT of a node s with k = 1, and this behavior is indeed observed. It is worth noting that both the k(s)^{-1} trend and the order of magnitude of τ(1) can be derived with a simple mean-field approach [171, 88]. In fact, once all possible correlations in the graph are neglected, the whole random-walk process can be approximated by a two-state Markov process, where the two states correspond to the walker being at the target node or at any other node. An easy calculation shows that the probability for the walker to arrive at a node s is given by q(s) = q(k(s)) = k(s)/2M, where M is the total number of links of the graph. For a fully connected graph this relation gives the exact value τ = τ(N − 1) = N − 1. In a random graph the mean-field approach gives better and better results the larger the mean degree ⟨k⟩ is. For small values of ⟨k⟩, only the order of magnitude of τ(1) is in fact predicted by this approach. The method based on rings, though less simple, is able to make more accurate predictions for all values of ⟨k⟩.
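For comparison with the theoretical values, a direct Monte Carlo estimate of the MFPT discussed in this section can be obtained as in the following sketch (Python; adj is assumed to be a dictionary mapping each node to a list of its neighbors; names are illustrative):

```python
import random

def simulate_mfpt(adj, s, runs=10000):
    """Direct Monte Carlo estimate of the MFPT at node s: start the walker
    on a random node != s and count the steps until it first hits s."""
    nodes = [x for x in adj if x != s]
    total = 0
    for _ in range(runs):
        x = random.choice(nodes)
        steps = 0
        while x != s:
            x = random.choice(adj[x])    # hop to a uniformly chosen neighbor
            steps += 1
        total += steps
    return total / runs
```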


Figure 6.6: First passage time distribution, averaged over all target nodes s. The distribution of the FPT is shown (ER graph: N = 10^3, ⟨k⟩ = 6). At each run both the target node s and the starting node of the random walker are randomly chosen, which means that in this distribution the degree of the target node is no longer fixed. The non-exponential curve is the result of the convolution of the several exponential curves obtained for fixed k(s). The theoretical curve, obtained according to eq. (6.8), is in excellent agreement with data from simulations performed on a single graph. The fractional error e, defined as the ratio between simulated and calculated data, is also plotted. For large times the poor statistics causes larger fluctuations of e.

Just for comparison, we report here the data shown in Figure 6.5 relative to a network of N = 10^3 nodes with ⟨k⟩ = 6: we have τ(1)_sim/τ(1)_calc = 1.03 and τ(1)_sim/2M = 1.24, where τ(1)_sim and τ(1)_calc are the MFPT obtained respectively from simulations and with the ring method. All the results discussed above allow us to explain the curve presented in Figure 6.6, which represents the distribution P(t) of the FPT in a graph when both s and the starting point of the walker are randomly chosen at each run. P(t) can be calculated here as the convolution of the several exponential FPT distributions F_{k(s)}(t) corresponding to the different values of k(s), each weighted with the probability of encountering a node of degree k(s) in the graph. More precisely, according to eq. (6.5), for each time step t every F_{k(s)}(t) must be weighted with the Poisson weight c_{k(s)}^{(⟨k⟩)} = (⟨k⟩^{k(s)}/k(s)!) e^{−⟨k⟩}. We have:


P(t) = \sum_{j=1}^{\infty} c_j^{(\langle k\rangle)}\,F_j(t)
     = \sum_{j=1}^{\infty} c_j^{(\langle k\rangle)}\,\frac{1}{\tau(j)}\,e^{-t/\tau(j)} \tag{6.8}

This relation can be written in a more compact way by exploiting the fact that τ(k(s)) = τ(1) × k(s)^{-1}. Defining Z = ⟨k⟩ × exp(−t/τ(1)), it holds:

P(t) = \sum_{j=1}^{\infty} \frac{Z^{j-1}\,e^{-\langle k\rangle}}{(j-1)!\;\tau(1)}\,Z = c\,Z\,e^{Z} \tag{6.9}

where c is the constant e^{-\langle k\rangle}/\tau(1).

6.4

Extensions to other networks

So far, we have described a method that allows us to calculate the average MFPT at a node s for a walker that started from a generic other node of the graph. We have then obtained exact (average) expressions for the case of random graphs. Unfortunately, the analytical extension of the relations found for this kind of graph to other graphs (such as, for example, scale-free networks) is difficult. This is due to the fact that eq. (6.6) and eq. (6.7) exploit the knowledge of the rules according to which a random graph is generated. In other words, the absence of correlations between nodes is the main feature those equations are based on. When correlations are present, the calculation of the number of nodes of the second ring, n_2, is already very difficult (for finite networks) and requires some empirical assumptions [146]. In addition, there is a more subtle reason that makes our method difficult to extend. Given the set of all nodes of a graph with a certain degree k, their first rings, although having the same number of nodes, present two kinds of fluctuations. On a global scale, the average degree k(r_1) of the nodes of the first ring does not have a unique value, but in general is distributed according to some probability density. On a local scale, on the other hand, a single ring is not made of identical nodes, and its average degree has a certain variance σ. In Figure 6.7 we show global and local fluctuations for both a random graph and a Barabási-Albert (BA) scale-free graph [16] (see Sec. A). The preferential attachment rule of the Barabási-Albert network generates a graph with a scale-free degree distribution P(k) ≃ k^{-c}, with c = 3. As is evident from Figure 6.7, BA graphs have larger fluctuations than random graphs.


Figure 6.7: First ring fluctuations. Fluctuations related to nodes of degree k = 3 are presented for ER (left) and BA (right) graphs of size N = 1 × 10^5 and ⟨k⟩ = 6. In the two top panels global fluctuations are analyzed: histograms of the number of nodes vs. the average degree k(r_1) of the nodes of the first ring r_1 for ER graphs (top-left) and BA graphs (top-right). Note the logarithmic scale on the ordinate of the BA graph. In the two bottom panels local fluctuations are analyzed: here the histograms represent the number of rings as a function of the ratio σ(r_1)/k(r_1), where σ(r_1) is the variance of the degree of the nodes belonging to each r_1, for ER graphs (bottom-left) and BA graphs (bottom-right). The higher degree of fluctuations of the BA graphs is evident.

With our method of rings, described before, these fluctuations cannot be taken into account. Matrix B (6.2) is in fact defined under both the assumptions of (i) equivalence of all the nodes with a given k (global homogeneity) and (ii) equivalence of all the nodes inside each ring (local homogeneity). The slight disagreement between our theory and the simulation results present in Figure 6.5 is thus easily explained in terms of local fluctuations of the first ring. In fact, as is easy to demonstrate using Lagrange multipliers, the assumed locally ordered configuration is the most advantageous one for a walker that has to reach the node s from the first ring. This is the reason why our calculated MFPTs are always smaller than those obtained from simulations. Notwithstanding these difficulties in extending our theory, we found a


Figure 6.8: Mean first passage times vs. the degree of the target node for different networks. Continuous lines with small filled points (upper graph) are obtained from eq. (6.4), i.e. for random ER graphs, while the empty symbols of different shapes come from simulations. The excellent agreement between theory and simulations, both for ER graphs (circles) and, less obviously, for BA graphs (squares), is evident. Data from real networks are also reported: also in this case the agreement with the theoretical predictions is good. Error bars are not visible on the scale of the graph. In the lower graph the fractional errors e, defined as the ratio between measured and calculated values, are reported both for the real networks and for the BA graph. Dashed lines correspond to e = 1.

quite surprising result, shown in Figure 6.8: given a BA graph with a given average degree ⟨k⟩, the average MFPT for a walker starting from a generic node and arriving at a node s of degree k(s) is almost equal to the corresponding average MFPT of the same random walk on a random graph with the same average degree. This means that our theory continues to predict the MFPT (and hence its exponential distribution) very well. It is remarkable that the theory also predicts well the MFPT at nodes with high degree, which are absent in the corresponding random graph. The ability of our theory to predict diffusion processes on BA graphs may be due to the modest presence of correlations between their nodes. Many properties of real networks, in fact, are not reproduced by the BA model. One important measure of correlations³ in a graph is the measure of the

³ We refer, as usual, to Appendix A for further details on network definitions and properties.


Figure 6.9: Degree distribution exploration. A random walker explores the degree distribution of two networks: a BA graph of size N = 10^5 and ⟨k⟩ = 6, and an ER random graph of size N = 10^5 and ⟨k⟩ = 15. Simulations in which a walker travels the networks for N time steps have been performed. Empty symbols represent the fraction of time spent on a node of degree k. Filled symbols joined by light lines are obtained with the relation P(k)k/⟨k⟩, where P(k) is the degree distribution of the considered network. The filled points fit the experimental data well, and, obviously, a longer walk would allow for even better agreement. The probability of finding a random walker on a node of a given degree k is related to the degree distribution of the graph via the inverse of the 1/k MFPT scaling relation (apart from a normalizing factor). For BA graphs it holds P(k) ∝ k^{-3}; the first part of the curve P(k)k/⟨k⟩ presented in the figure is indeed c × k^{-2}, where c is a constant. The region of higher degrees, on the other hand, grows linearly due to the fact that in a finite-size realization the statistics on high-degree nodes is poor and the degree distribution in this region is flat. We avoided any binning of the experimental or theoretical data to make clear that the walker's exploration of the network is not perfect.

average degree of the nearest neighbors (i.e. of the nodes of the first ring) of vertices of degree k, called k_nn [159]. While random and BA graphs have a flat k_nn, indicating the absence of strong correlations among nodes, many real networks exhibit either an assortative or a disassortative behavior. In the first case high-degree vertices tend to be neighbors of high-degree vertices, while in the second case they have a majority of low-degree neighbors. Another important measure of correlation is the clustering coefficient, which is proportional to the probability that two neighbors of a given node are also

neighbors of each other. Again, BA and random graphs, in which clustering is very poor, do not reproduce the clustering properties of many real networks. In order to check how far our theory can predict the MFPT on correlated graphs, we have performed two sets of experiments on real networks. We have considered in particular a network of the Internet at the level of Autonomous Systems [75, 162], which exhibits a disassortative mixing feature (see Appendix A), and a recently proposed scale-free brain functional network [84], which exhibits an assortative mixing feature as well as a strong clustering coefficient. The results for the MFPT on these two networks, as a function of the degree of the target node, are reported in Figure 6.8. Though the agreement between theory and simulation is no longer perfect, it remains good. In particular, we find again the approximate trend τ(k(s)) = τ(1) × k(s)^{-1}. Now we have all the elements to estimate the probability of finding a random walker on a node of a given degree k. On the one hand, in fact, it seems obvious that this probability is related to the fraction f(k) of nodes of that degree in the network, while on the other hand we now know that the MFPT at such a node is proportional, on average, to 1/k. It is then reasonable to argue that the probability for a random walker to be on a node of degree k is proportional to kf(k). We have tested this hypothesis in the experiment reported in Figure 6.9. In the experiment a walker has explored a BA network and an ER random graph with N = 10^5 nodes for N time steps. At each time step the degree of the visited node was recorded, and the normalized histogram of the fraction of time spent on nodes of each degree is reported in Figure 6.9. In the limit of infinitely many time steps this histogram would indicate exactly the probability of finding the walker on a node of a given k. According to our hypothesis, this histogram should be described by the function P(k)k/⟨k⟩, where P(k) is the degree distribution of the considered network, and the figure shows that this is in fact the case, already after a relatively small number of time steps. Finally, it is worth noting that the previous argument can be reversed. A walker able to record the degree of each node it traverses can be used to determine the degree distribution of the network it explores. In fact, if t(k) is the fraction of time spent on nodes of degree k, it holds f(k) ∝ t(k)/k. The average degree ⟨k⟩ is then trivially obtained by requiring the normalization of the estimated P(k).
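The reconstruction of the degree distribution from the occupation times of the walker, f(k) ∝ t(k)/k, can be sketched as follows (a minimal Python sketch with illustrative names; adj maps each node to a list of its neighbors):

```python
import random
from collections import Counter

def degree_distribution_from_walk(adj, steps, start=None):
    """Estimate P(k) from the fraction of time t(k) the walker spends on
    degree-k nodes, using f(k) proportional to t(k)/k as argued above."""
    x = start if start is not None else random.choice(list(adj))
    visits = Counter()
    for _ in range(steps):
        x = random.choice(adj[x])
        visits[len(adj[x])] += 1            # record the degree of the visited node
    unnormalised = {k: c / (k * steps) for k, c in visits.items()}
    z = sum(unnormalised.values())
    return {k: v / z for k, v in unnormalised.items()}
```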

6.5

Conclusions

Inspired by the fascinating issue of the spreading of information on complex topologies, we have addressed the problem of the computation of the


mean first passage time at a selected node s of random walkers starting from different nodes on a generic network. Thus, in this chapter we have not dealt directly with the Naming Game, but rather we have concentrated on a fundamental problem of statistical mechanics which can be seen as a useful conceptual reference for semiotic dynamics. We have introduced an approximate method, based on the concept of rings, which maps the original random-walk Markov process onto another Markov process in a much smaller space. This allows for a drastic reduction of the computational cost. Moreover, since the focus is on the target node that has to be reached by the walker, our technique is particularly suitable for the schematization of the problem of an individual that has to be reached by a piece of information which diffuses randomly in the population. In the case of ER random graphs we have been able to derive analytically all the quantities of interest, and we have shown that our method gives predictions, both for static and dynamic properties, in excellent agreement with the results found in simulations. We believe that our method is promising both for closer approaches to the convention dynamics in the Naming Game or other socio-physics models and, more generally, for the basic study of diffusion processes in complex environments. In this perspective, the difficulty of extending the analytical approach to more complex networks could represent a problem. However, quite surprisingly, we have found that the MFPTs calculated with our theory for ER graphs are in excellent agreement also with simulations of the dynamics on BA networks, and in good agreement with the results obtained with random walkers on two real networks, thus making our method an approximate, but easy, tool to predict MFPT-related quantities in many cases.

Conclusions and outlook

In this thesis we have focused on how new conventions propagate and eventually stabilize in a population of agents. We have developed a simple model, and we have studied it both numerically and analytically, exploiting tools and concepts of statistical physics and complex systems science (Chap. 2-5). Finally, inspired by the same set of problems, we have also investigated the behavior of random walkers on complex networks (Chap. 6). Our work has been stimulated by the recent developments in the interdisciplinary area that studies the emergence and evolution of language. Here, the idea that language can be seen as an evolving and self-organizing system, constantly (re-)shaped by its users, has recently gained ground. At the same time, computer-based models, and experiments with artificial agents, have become fundamental tools to investigate the dynamics ruling the emergence and the evolution of language. Compared with previously existing approaches, the contribution of this thesis is twofold. First of all, the model we have proposed is by far simpler than the ones we have been inspired by and, notwithstanding this, it retains all their most important conceptual elements. As a consequence, we have been able to perform a detailed quantitative investigation that, as far as we know, is unprecedented both in the detail of the analysis and in the number of features taken into account. Our approach also introduces some elements of novelty from the point of view of the opinion-dynamics-like models traditionally studied by physicists. Interactions between agents are based on negotiation, rather than on imitation (as, for instance, in the voter model). Microscopic rules are asymmetric, and feedback determines the output of each decision process. Moreover, individuals are endowed with memory, and they can wait before reaching an agreement. Finally, the number of states (or "conventions", or "words") an agent can store is neither fixed nor limited, and this feature leads to a complex and interesting microscopic dynamics. Having introduced our model, we first investigated its dynamics in the mean-field case (Chap. 2). We have pointed out which are the main timescales involved in the convergence process and which are the mechanisms allowing the population to reach a consensus. Then, we have focused on the role played by different interaction topologies, ranging from regular


lattices to complex networks, showing how the different topological properties affect the overall behavior of the agents (Chap. 3). Beyond influencing the global quantities, however, the topology also leaves a clear mark on the microscopic interaction patterns, and we have identified a non-trivial relation between the number of neighbors of an agent and its memory usage (Chap. 4). Next, we have critically reviewed the microscopic rules of our model, also collecting a number of interesting hints for future work (Chap. 5). Indeed, suitable modifications yield important results, such as a phase transition or an improvement of the overall performance of the population in the convergence process. Finally, we have addressed the problem of the first passage time of a random walker on complex networks, which can be seen as a first schematization of the process in which an individual (i.e. a node of the network) becomes aware of a new convention (i.e. the walker) (Chap. 6). Present and future work will move in several directions, and we have discussed the possible developments of the different results as we presented them, across the whole thesis. We briefly list here the topics we are currently addressing, or plan to investigate in the near future.

- In Sec. 5.2 we have seen that, by inserting a simple parameter in the definition of our model, a phase transition occurs. For values of the parameter above a certain threshold agents are able to reach a final consensus, while for smaller values two or more conventions survive in the asymptotic state. As we have discussed, the result is particularly interesting since the transition is not due to any bounded-confidence-like mechanism, as in other models of opinion dynamics, but rather is caused by a diminished disposition of the agents towards local agreement. So far, we have only looked at the mean-field case, where we have been able to explain our findings analytically, but we are now investigating the role of different interaction patterns. Preliminary results show that the transition always occurs, but the critical value of the parameter can vary with the underlying topology.

- In this thesis, we have addressed the effects that topology has on the dynamics of our model. In real systems (e.g. social networks), however, dynamics and topology often co-evolve. Topology describes the set of possible interactions at a given time, and the dynamics continuously reshapes the underlying landscape. We aim at addressing this issue in the context of our model. The phase transition we have discussed in Sec. 5.2 provides promising insights. Indeed, a single parameter allows one to build up an asymptotic state with an arbitrarily large number of conventions. Thus, as a next step, links could be rewired depending on the opinions of the agents, so that the population could split into different communities, or, on the contrary, so as to prevent the formation of isolated groups.

Moreover, while in Sec. 5.2 we have introduced a global parameter affecting the behavior of all agents, we plan to explore the possibility of endowing each individual with a given quenched value of the parameter, probably to be extracted from some probability distribution. Thus, from the viewpoint of the interplay between topology and dynamics, different agents could play different social roles, promoting the homogeneity or the fragmentation of the population.

- Even if we have extensively investigated our model in many respects, some questions are still open. For instance, we want to investigate further the role of memory and feedback on the overall dynamics. In particular, it would be interesting to explore whether one can define a simpler model which still exhibits a similar dynamics. Of course, the new rules could be more abstract, and possibly less significant for the modeling of real phenomena, but they could help to gain a deeper analytical comprehension of the different features we have introduced.

- In defining our model, we have assumed the absence of homonymy. Accordingly, we have been able to work with a single object, i.e. with a single inventory for each agent. At present, we are investigating the role of this assumption, and preliminary results show that our rules are perfectly able to deal with homonymy, without making any significant new assumption. More generally, it is worth recalling that, from the point of view of semiotic dynamics, we have addressed the most basic question in the understanding of the evolution of a language, i.e. the emergence of a shared repertoire of conventions, or more precisely, of form-meaning (or name-object) associations. The next step would be to describe how agents can agree on a set of categories (Sec. 1.3.4). The emergence of a shared set of categories is indeed an important breakthrough that increases the efficiency of a language, allowing for quick reference to large sets of objects. Starting from the results presented in this thesis, we are working on a model that copes with this higher-level task.

- Finally, in collaboration with a group of experimental researchers led by Prof. Dario Floreano at the Ecole Polytechnique Fédérale de Lausanne, we are planning to perform experiments with simple robots following the interaction rules we have defined. Indeed, as we have pointed out, our microscopic rules are suitable for actual implementation (Sec. 2.8), and the results of this thesis are relevant also from the point of view of applications. For instance, it has been suggested recently that our interaction scheme could find an application as an algorithm for the autonomous creation (or selection) of keys for encrypted communication in a community of sensor nodes [133]. The experiments we are setting up would be an important test to verify the reliability

155

156

Conclusions and outlook of our model when noise and other physical limitations, that are hard to foresee, become an essential part of the game.

Appendix A

A short overview of complex networks

The aim of this appendix is to provide the reader unfamiliar with the subject with some elements of complex network theory. We concentrate only on those elements which are fundamental for a clear understanding of the network-oriented chapters of this thesis, and refer to [6, 82, 162, 34, 47] for much more complete treatments of this fascinating subject. First we present some of the most important measures adopted to describe a graph, then we analyze some real-world examples of complex networks, and finally we describe some models of network generation.

A.1 Measures

A graph, or network, is the union of two sets, V and E, called the set of vertices (also nodes or sites) and the set of edges (also links or bonds), respectively. A convenient way to represent a graph is to plot the vertices as points and the edges as lines between them. From a mathematical point of view, all the information about a generic graph is contained in the adjacency matrix A, whose element a_ij = 1 if nodes i and j are connected, and a_ij = 0 otherwise. We shall consider here only undirected graphs, i.e. a_ij = a_ji, which do not contain any link connecting a node with itself (a_ii = 0 for all i). An important generalization is that of weighted graphs, in which the entries a_ij can assume real values different from 1. In this thesis we have dealt only with unweighted networks, and so we shall do here, but it must be stressed that most real-world networks are indeed weighted. Finally, we shall concentrate only on the case of connected graphs, i.e. graphs in which each pair of nodes i, j, with k(i), k(j) ≠ 0, is connected by at least one path. In what follows, we present some of the fundamental measures which characterize a network.

Degree distribution

The degree of a node i, k_i, is the number of connections linking it to other nodes. In matrix language, k_i = \sum_j a_{ij}. The histogram of the number of nodes with a given degree is the degree distribution, P(k), of a network, and it is probably the most important measure used to characterize a graph. Its first moment is the average degree, ⟨k⟩, which is given by:

\langle k \rangle = \frac{1}{N} \sum_k k \, P(k) = \frac{2E}{N},                (A.1)

where E is the number of links of the graph. The second moment of the distribution, ⟨k²⟩, which measures the fluctuations of the degree distribution, is also very important. Indeed, an important distinction is that between homogeneous and heterogeneous networks. The degree distribution of the former is peaked around the average value, while that of the latter is skewed, often fat-tailed, or more generally characterized by large fluctuations around the average value. Historically, the attention towards complex networks originated from the discovery that many real systems present such large fluctuations, with a few nodes having a very large number of connections (the hubs) and a vast majority of poorly connected vertices. In fact, most of the models existing up to that moment yielded very peaked degree distributions, in which all nodes have approximately the same degree.

Nearest neighbors average degree - (dis)assortativity

The average degree of the nearest neighbors of a given node i, denoted k_{nn,i}, is a second-order measure. Formally it is defined as [159]:

k_{nn,i} = \frac{1}{k_i} \sum_{j=1}^{N} a_{ij} k_j.                (A.2)

A more interesting question concerns the average nearest-neighbor degree of the nodes of a given degree k, which is a simple measure of the correlations between degrees. Given a node of degree k, we are interested in the probability P(k'|k) that it is linked with a node of degree k'. This quantity satisfies the normalization condition \sum_{k'} P(k'|k) = 1 and the detailed balance condition [35]

k \, P(k'|k) \, P(k) = k' \, P(k|k') \, P(k'),                (A.3)

corresponding to the absence of dangling bonds. If the graph is uncorrelated, i.e. if P(k'|k) does not depend on k, it holds

P(k'|k) = \frac{k' P(k')}{\langle k \rangle},                (A.4)

which follows from the normalization condition and eq. (A.3). In this formalism, the nearest neighbors of a node of degree k have an average degree given by:

k_{nn}(k) = \sum_{k'} k' \, P(k'|k).                (A.5)
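It is worth noting, as a reading aid, that inserting eq. (A.4) into eq. (A.5) shows that in an uncorrelated network k_{nn}(k) = \langle k^2 \rangle / \langle k \rangle, independent of k: any systematic increase or decrease of k_{nn}(k) with k therefore signals genuine degree-degree correlations.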

The behavior of k_{nn}(k) provides helpful insights into the structure of a network and is the fundamental brick of assortativity measures [144]. A network is said to be assortative if highly connected nodes are preferentially linked with each other, and the same happens for low-degree vertices. On the contrary, in a disassortative network the hubs are connected mostly with low-degree nodes, and vice-versa. In general, it turns out that most technological networks are disassortative, while social networks are assortative. Again, (dis)assortativity is a problematic property from the point of view of models, since most of them build up networks which are neither assortative nor disassortative, but present a flat k_{nn}(k) distribution.

Clustering coefficient

The clustering coefficient of a node i, C_i, is a third-order measure given by the ratio between the number of links connecting pairs of neighbors of the considered node, n_i, and the maximum theoretical value of this number, k_i(k_i - 1)/2 [192]. In other words, it is the ratio between the number of triangles of which i is a vertex and their maximum possible number:

C_i = \frac{2 n_i}{k_i (k_i - 1)} = \frac{\sum_{lm} a_{il} a_{lm} a_{mi}}{k_i (k_i - 1)}.                (A.6)

The clustering coefficient of a network with N nodes, C, is given by the average:

C = \frac{1}{N} \sum_{i=1}^{N} C_i.                (A.7)

It turns out that most real networks have a high clustering coefficient: another unforeseen property that pushed the interest toward this field.

Distance between nodes

The distance between two nodes i and j is defined as the number of edges traversed along the shortest path l_ij connecting them. Thus, in connected networks (the only case exploited in this thesis), the distance between any pair of nodes is finite. The largest distance between two nodes is called the diameter l_c of the network. The average distance ⟨l⟩ between vertices is thus given by

\langle l \rangle = \frac{2}{N(N-1)} \sum_{i<j} l_{ij} = \sum_l l \, P_l(l),                (A.8)

where P_l(l) is the probability distribution of the distances l in the network. In most real networks it turns out that this distribution is symmetric and sharply peaked around ⟨l⟩, which is then an important quantity to characterize them. Moreover, it turns out that the average distance is often "very small", hence the name small-world networks. More precisely, ⟨l⟩ scales logarithmically, or slower, with the system size. This property is already captured by random graphs, where ⟨l⟩ ∼ log N [66, 67], but it is absent in regular structures, where ⟨l⟩ ∼ N^{1/d}.

Centrality of a node

Centrality measures aim at quantifying the importance of a node in a network. Among the different centrality measures, we focus here on the so-called betweenness centrality [143, 98, 41], which is probably the most important. According to this measure, a node is more central if it is part of a large number of shortest paths connecting pairs of nodes in the network. More precisely, if L_ab is the number of shortest paths between nodes a and b, and L_aib is the number of these paths that pass through vertex i, the betweenness centrality of i, b_i, is defined as

b_i = \sum_{a \neq b} \frac{L_{aib}}{L_{ab}}.                (A.9)

The betweenness usually takes values of order O(N ) or larger. For instance, in a star graph in which N − 1 nodes are connected only to a central vertex, the betweenness of the latter is N (N −1)/2, while that of the others is N −1.
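To make the definitions of this section concrete, the following minimal Python sketch (not part of the thesis, added here purely for illustration; all function names are arbitrary) computes the degree, the degree histogram, the average nearest-neighbor degree of eq. (A.2), the clustering coefficient of eq. (A.6) and shortest-path distances directly from an adjacency matrix, using only the standard library.

from collections import Counter, deque

def degrees(A):
    """Degree k_i = sum_j a_ij for every node i."""
    return [sum(row) for row in A]

def degree_histogram(A):
    """Number of nodes with degree k (dividing by N gives the P(k) of eq. A.1)."""
    return Counter(degrees(A))

def average_degree(A):
    return sum(degrees(A)) / len(A)          # equals 2E/N

def knn(A, i):
    """Average degree of the nearest neighbors of node i (eq. A.2)."""
    k = degrees(A)
    neigh = [j for j, a in enumerate(A[i]) if a]
    return sum(k[j] for j in neigh) / len(neigh)

def clustering(A, i):
    """Clustering coefficient C_i (eq. A.6)."""
    neigh = [j for j, a in enumerate(A[i]) if a]
    k = len(neigh)
    if k < 2:
        return 0.0
    links = sum(A[u][v] for u in neigh for v in neigh if u < v)
    return 2.0 * links / (k * (k - 1))

def distances_from(A, s):
    """Shortest-path (hop) distances from node s, by breadth-first search."""
    dist = [None] * len(A)
    dist[s] = 0
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v, a in enumerate(A[u]):
            if a and dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Example: a 4-node path graph 0-1-2-3
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
print(average_degree(A), clustering(A, 1), distances_from(A, 0))

For the 4-node path graph used in the example, the script prints the average degree 1.5, the clustering C_1 = 0 and the distances [0, 1, 2, 3] from node 0.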

A.2 Examples of real networks

We now briefly describe the essential properties of some real-world networks, presenting some examples collected in the recent review by Boccaletti et al. [34]. As usual, we focus only on unweighted networks, but it is important to recall again that very often real networks are weighted, i.e. links are associated with real numbers describing somehow their properties (e.g. the traffic passing on them, the strength of a relationship between two nodes, etc.). Table A.1 summarizes data from different real networks [34].

Network     N        ⟨k⟩    ⟨l⟩    C      d/a   γ     Refs.
AS2001      11174    4.19   3.62   0.24   d     2.38  [162, 159, 160]
Routers     228263   2.80   9.5    0.03   a     2.18  [162, 159, 160]
Gnutella    709      3.6    4.3    0.014  d     2.19  [46]
Protein     2115     6.80   2.12   0.07   d     2.4   [117]
Math1999    57516    5.00   8.46   0.15   a     2.47  [76, 102]
Actors      225226   61     3.65   0.79   a     2.3   [192, 9]

Table A.1: Real networks. Measured quantities are: the number of nodes N, the average connectivity ⟨k⟩, the average distance between vertices ⟨l⟩, the average clustering coefficient C, the degree correlations (d: disassortative, a: assortative) and the exponent γ of the degree distribution (which is in all cases a power law P(k) ∼ k^{-γ}). The table is taken from [34].

The first three lines concern technological examples. AS2001 stands for the Internet at the autonomous system level (AS) as of April 2001, while Routers indicates the router-level graph representation of the Internet [162, 34]. Gnutella, on the other hand, is a peer-to-peer network used to share files by users over local and wide-area networks. In all cases, the average connectivity ⟨k⟩ is small with respect to N (i.e. they are sparse graphs), the average distance between nodes is small ("small-world" property), and the average clustering is large. The degree distribution is broad and can be described by a power law of the form P(k) ∼ k^{-γ}. The three graphs differ from the point of view of degree correlations. The AS and Gnutella graphs are in fact disassortative networks, with a tendency of nodes with large degree to be connected with poorly connected vertices, and vice-versa. In contrast, the Internet at the router level exhibits an assortative behavior, with a tendency of similar nodes to link with each other.

The fourth example of Table A.1 is a biological network, the protein-protein interaction network of yeast [117]. Here proteins are the nodes, and links are established if two proteins physically interact. The resulting graph is highly heterogeneous, exhibiting features similar to those of the technological examples we have discussed above. In particular, we notice that the degree correlations are disassortative, and that the average distance between vertices is remarkably small (⟨l⟩ = 2.12).

Finally, the properties of two social networks are summarized in the last two lines of Table A.1. A social network is by definition made of social entities (the nodes) connected each time that some kind of interaction exists between two of them. Math1999 is a network of scientific collaborations between mathematicians, in which two individuals are linked if they have co-authored at least one paper, while the movie actor collaboration network, based on the Internet Movie Database (www.imdb.com), links together actors that have been cast in the same movie. Of course, in both cases it would be very easy to add weights to the links, representing, for instance, the number of collaborations, but here, as usual, we consider the unweighted case. Like the examples we have considered before, these networks are "small-world" (small average distance), sparse (⟨k⟩ ≪ N) and highly clustered. The degree distribution is a power law, and degree correlations are assortative, meaning that similar nodes tend to establish connections with each other.

Figure A.1: Homogeneous graph. Left: degree distribution of an Erdős-Rényi random graph with N = 3 × 10^5 nodes and average connectivity ⟨k⟩ = 50. Right: a homogeneous graph.

A.3 Models

We now briefly describe some of the most representative models of networks. We also discuss their main properties, but we shall not derive them, referring, as before, the interested reader to more complete references [6, 147, 82, 162, 47, 34].

The Erdős-Rényi random graph

The model proposed by Erdős and Rényi [85, 86] has been for a long time the paradigm for network generation. A random graph is a static network obtained in the following manner: given a finite set of N isolated nodes, all the N(N − 1)/2 pairs of nodes are considered and a link between two vertices is added with probability p. In the two limits N → ∞ and p → 0, and provided that p > 1/N, this yields a Poisson distribution for the degree k of a node:

P(k) = \frac{\langle k \rangle^k}{k!} \, e^{-\langle k \rangle},                (A.10)

with ⟨k⟩ = p(N − 1). Thus, the average degree is representative of all the nodes in the graph, which indeed share homogeneous properties (Fig. A.1). Moreover, it is clear that such a graph does not contain any relevant correlations between nodes and degrees. The clustering coefficient of Erdős-Rényi graphs can be easily estimated by considering that, if the average degree is sufficiently smaller than the number of nodes, i.e. if the network is sparse enough, the probability that two nodes sharing a neighbor are also neighbors of each other is simply p. Finally, it can be easily shown that for the diameter l_c it holds

l_c \sim \frac{\log N}{\log \langle k \rangle}.                (A.11)
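As an illustration (this sketch is not from the thesis; the function name and the chosen parameters are arbitrary), the G(N, p) construction just described can be written in a few lines of Python:

import random

def erdos_renyi(N, p, seed=None):
    """Erdos-Renyi G(N, p) graph: each of the N(N-1)/2 pairs of nodes
    is linked independently with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i in range(N) for j in range(i + 1, N) if rng.random() < p]

# For large N the degree of a node follows the Poisson law of eq. (A.10),
# with average degree <k> = p(N - 1).
edges = erdos_renyi(1000, 0.01, seed=1)
print(2 * len(edges) / 1000)   # empirical average degree, close to p(N - 1) = 9.99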

Generalized random graphs

Random graphs can be generalized so as to present different degree distributions [139, 140, 2, 3, 98, 56]. An interesting recipe to generate such networks is the uncorrelated configuration model [56]. It prescribes to assign to each vertex i a number k_i of half-edges drawn from the desired degree probability P(k), with a cut-off at √N that prevents the emergence of undesired correlations. The network is then built by pairing half-edges belonging to different nodes at random (i.e., with uniform probability), avoiding multiple links. Given the tunability of the degree distribution, generalized random graphs allow one to model real-world networks better than simple Erdős-Rényi networks. However, their clustering coefficient C also vanishes in the thermodynamic limit, even if it can be significant for finite-size networks obtained from highly skewed degree distributions (see [34]).

The Watts-Strogatz small-world network

In most real-world complex networks, the average hopping distance between two vertices is very small, and it is possible to reach every vertex in a small number of steps. This is the so-called small-world property we mentioned above, which is exhibited also by random graphs. However, most real networks also show a strong clustering, i.e. they contain many triangles, while random graphs are locally tree-like. To recover both properties, Watts and Strogatz proposed a model that interpolates between a clustered regular lattice (with large average distance) and a small-world random graph (with low clustering) [192]. The fundamental ingredient of the model is the rewiring probability p. One starts with a one-dimensional regular lattice where each vertex is connected to its m nearest neighbors. Then vertices are sequentially visited: each link connecting the vertex to one of its neighbors in the clockwise sense is left in place with probability 1 − p, while with probability p it is rewired to another vertex (i.e. the other extremity of the link is removed from the neighboring node and connected to a randomly extracted node). For p = 1, the network is completely random (but it is not equivalent to a random graph, since m is a lower bound for the connectivity of a node). In the limit p → 0, on the other hand, the network is essentially a regular lattice. Interestingly, however, in the region 1/N ≪ p ≪ 1, the small-world property emerges along with a high average clustering C, which can be determined by noting that two neighbors in the regular structure remain neighbors with probability (1 − p):

\langle C \rangle = \frac{3(m-1)}{2(2m-1)} \, (1-p)^3.                (A.12)

Figure A.2: Heterogeneous graph. Left: degree distribution of a Barabási-Albert network with N = 3 × 10^5 nodes and m = 5. Right: a Barabási-Albert graph.

For the degree distribution, in the interesting range 1/N ≪ p ≪ 1, it holds [27]:

P(k) = \sum_{i=0}^{\min(k-m,\,m)} \binom{m}{i} (1-p)^i \, p^{m-i} \, \frac{(pm)^{k-m-i}}{(k-m-i)!} \, e^{-pm}.                (A.13)
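A minimal sketch of the rewiring procedure, again not taken from the thesis and written here only for illustration (we adopt the common convention of linking each node to m neighbors on each side of the ring, so that the initial degree is 2m), could look as follows:

import random

def watts_strogatz(N, m, p, seed=None):
    """Watts-Strogatz small-world network: a ring where each node is linked
    to its m nearest neighbors on each side, with each 'clockwise' link
    rewired to a random node with probability p."""
    rng = random.Random(seed)
    edges = set()
    for i in range(N):
        for s in range(1, m + 1):              # the m links in the clockwise sense
            j = (i + s) % N
            if rng.random() < p:               # rewire the far end of the link
                choices = [v for v in range(N)
                           if v != i and (min(i, v), max(i, v)) not in edges]
                if choices:
                    j = rng.choice(choices)
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)

# p = 0 gives a clustered regular ring, p = 1 an almost random graph;
# for 1/N << p << 1 one obtains small distances together with high clustering.
edges = watts_strogatz(1000, 3, 0.01, seed=1)
print(len(edges))   # close to N*m = 3000 links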

The Barabási-Albert model

The Barabási-Albert model was expressly conceived to model real growing networks, like the Web. Its main ingredients to obtain a power-law degree distribution are indeed growth and preferential attachment. New nodes constantly enter the network, and they attach to pre-existing vertices, selecting them on the basis of their degree. Thus, the so-called "rich get richer" effect is produced, since hubs tend to collect more new links than poorly connected nodes, and their degree keeps growing. The underlying assumption is that, in both technological and social networks, new nodes try to gain a good position by connecting with important and central nodes. The algorithm starts with a core of m_0 connected nodes (their properties do not alter the statistical properties of the network in the thermodynamic limit). At each time step (t = 1, 2, ..., N − m_0) a new node j enters the network and establishes m < m_0 links with pre-existing nodes, selecting a given node i with probability

\Pi_{j \to i} = \frac{k_i}{\sum_l k_l}.                (A.14)

The number of nodes at time t is N(t) = m_0 + t. Since the number of links is M ∼ m_0^2/2 + mt, the average degree is trivially ⟨k⟩ = 2m for large t. The degree distribution, on the other hand, has a power-law behavior P(k) ∼ k^{-γ} with γ = 3 [16] (Fig. A.2). Further properties of networks generated with the Barabási-Albert model are flat degree correlations and an almost vanishing clustering. It is also interesting to note that the linearity in k of the attachment kernel of eq. (A.14) is a necessary condition to get a power-law distribution [127, 125]. Finally, it is worth mentioning that the ideas of growth and/or preferential attachment introduced by the Barabási-Albert model have been extensively adopted by a series of different models, including node aging [122], fitness [48, 87], edge rewiring [5], limited information [142], etc.

The Dorogovtsev-Mendes-Samukhin model

An interesting variant of the Barabási-Albert model is the Dorogovtsev-Mendes-Samukhin model [83]. Here, each new vertex is attached to the two extremities of a randomly chosen link. Thus, the probability that a node is chosen is proportional to the number of its edges, i.e. to its degree. The important point is that with this scheme there is no need of any a priori knowledge of the degrees of the different vertices. Also in this case the degree distribution is a power law P(k) ∼ k^{-γ} with γ = 3, but, contrarily to what happens following the Barabási-Albert procedure, the network generated in this way is highly clustered (⟨C⟩ ≃ 0.74 for m = 1, where m is the number of links selected by each entering node) [26].
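As for the previous models, a short illustrative Python sketch (not part of the thesis; the stub-list trick is just one standard way of implementing linear preferential attachment) may help to clarify the growth rule of eq. (A.14):

import random

def barabasi_albert(N, m, m0=None, seed=None):
    """Growth with linear preferential attachment: each new node attaches to
    m existing nodes chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    m0 = m0 if m0 is not None else m + 1
    # initial core of m0 fully connected nodes
    edges = [(i, j) for i in range(m0) for j in range(i + 1, m0)]
    # 'stubs' lists every node once per attached link, so uniform sampling
    # from it is equivalent to sampling with probability k_i / sum_l k_l
    stubs = [v for e in edges for v in e]
    for new in range(m0, N):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((t, new))
        stubs.extend(targets)
        stubs.extend([new] * m)
    return edges

edges = barabasi_albert(10000, 5, seed=1)
print(2 * len(edges) / 10000)   # average degree, close to 2m = 10 for large N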

Bibliography [1] L. Adamic, R. Lukose, A. Puniyani, and B. Huberman. Search in power-law networks. Phys. Rev. E, 64(4):46135, 2001. [2] W. Aiello, F. Chung, and L. Lu. A random graph model for massive graphs. In Proceedings of the 32nd ACM Symposium on Theory of Computing, pages 171–180, 2000. [3] W. Aiello, F. Chung, and L. Lu. A random graph model for power law graphs. Experimental Mathematics, 10:53–66, 2001. [4] A. Akmajian, R. A. Demers, A. K. Farmer, and R. M. Harnish. Linguistics: An Introduction to Language and Communication. The MIT Press, (Cambridge, MA (USA)), 5 edition, 2001. [5] R. Albert and A.-L. Barab´asi. Topology of evolving networks: Local events and universality. Phys. Rev. Lett., 85(24):5234–5237, Dec 2000. [6] R. Albert and A.-L. Barab´asi. Statistical mechanics of complex networks. Review of Modern Physics, 74:559–564, 2002. [7] D. Aldous and J. Fill. Reversible markov chains and random walks on graphs. Book in preparation, 2001. [8] E. Almaas, R. V. Kulkarni, and D. Stroud. Scaling properties of random walks on small-world networks. Phys. Rev. E, 68(5):056105, 2003. [9] L. Amaral, A. Scala, M. Barthelemy, and H. Stanley. Classes of smallworld networks. Proc. Natl. Acad. Sci. USA, 97(21):11149, 2000. [10] P. W. Anderson, K. Arrow, and D. Pines. The Economy as an Evolving Complex System. Addison-Wesley, Redwood, CA (USA), 1988. [11] J. S. Andrade Jr, H. J. Herrmann, R. F. S. Andrade, and L. R. da Silva. Apollonian networks: Simultaneously scale-free, small world, euclidean, space filling, and with matching graphs. Phys. Rev. Lett., 94:018702, 2005.


BIBLIOGRAPHY [12] T. Antal, M. Droz, G. Gy¨orgyi, , and Z. R´acz. 1/f noise and extreme value statistics. Phys. Rev. Lett., 87:240601, 2001. [13] S. Auyang. Foundations of Complex-System Theories in Economics, Evolutionary Biology, Statistical Physics. Cambridge University Press, New York (USA), 1998. [14] R. Axelrod. The Complexity of Cooperation. Princeton University Press, 1997. [15] R. Axelrod. The dissemination of culture: A model with local convergence and global polarization. Journal of Conflict Resolution, 41(2):203–226, 1997. [16] A.-L. Barab´asi and R. Albert. Emergence of scaling in random networks. Science, 286:509, 1999. [17] A.-L. Barab´asi, E. Ravasz, and T. Vicsek. Deterministic scale-free networks. Physica A, 299:47, 2001. [18] A. Baronchelli, L. Dall’Asta, A. Barrat, and V. Loreto. Bootstrapping communication in language games: Strategy, topology and all that. In A. Cangelosi, A. D. M. Smith, and K. Smith, editors, The Evolution of Language: Proceedings of the 6th International Conference (EVOLANG6). World Scientific Publishing Company, 2006. [19] A. Baronchelli, L. Dall’Asta, A. Barrat, and V. Loreto. Nonequilibrium phase transition in negotiation dynamics. Submitted for publication, 2006. [20] A. Baronchelli, L. Dall’Asta, A. Barrat, and V. Loreto. Strategies for fast convergence in semiotic dynamics. In L. M. R. et al., editor, Artificial Life X, pages 480–485. MIT Press, 2006. [21] A. Baronchelli, L. Dall’Asta, A. Barrat, and V. Loreto. Topologyinduced coarsening in language games. Phys. Rev. E, 73(1):015102, 2006. [22] A. Baronchelli, M. Felici, E. Caglioti, V. Loreto, and L. Steels. Sharp transition towards shared vocabularies in multi-agent systems. Journal of Statistical Mechanics, P06014, 2006. [23] A. Baronchelli and V. Loreto. Ring structures and mean first passage time in networks. Phys. Rev. E, 73(2):26103, 2006. [24] A. Barrat. Unpublished, 2005. [25] A. Barrat and R. Pastor-Satorras. Rate equation approach for correlations in growing network models. Phys. Rev. E, 71:036127, 2005.

BIBLIOGRAPHY [26] A. Barrat and R. Pastor-Satorras. Rate equation approach for correlations in growing network models. Phys. Rev. E, 71(3):036127, 2005. [27] A. Barrat and M. Weigt. On the properties of small-world network models. Europ. Phys. J. B, 13:547, 2000. [28] M. Barth´elemy, A. Barrat, R. Pastor-Satorras, and A. Vespignani. Dynamical patterns of epidemic outbreaks in complex heterogeneous networks. Journal of Theoretical Biology, 235:275–288, 2005. [29] D. Ben-Avraham and S. Havlin. Diffusion and reactions in fractals and disordered systems. Cambridge University Press, Cambridge (UK), 2000. [30] E. Ben-Naim, L. Frachebourg, and P. L. Krapivsky. Coarsening and persistence in the voter model. Phys. Rev. E, 53:3078–3087, 1996. [31] E. Bertin. Global fluctuations and gumbel statistics. Phys. Rev. Lett., 95(17):170601, 2005. [32] L. M. Bettencourt, A. Cintron-Arias, D. I. Kaiser, and C. CastilloChavez. The power of a good idea: Quantitative modeling of the spread of ideas from epidemiological models. Physica A, 364:513, 2006. [33] D. Biber, S. Conrad, and R. Reppen. Corpus Linguistics - Investigating Language Structure and Use. Cambridge University Press, Cambridge (UK), 1998. [34] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang. Complex networks: Structure and dynamics. Physics Reports, 424:175–308, 2006. [35] M. Bogu˜ n´a and R. Pastor-Satorras. Epidemic spreading in correlated complex networks. Phys. Rev. E, 66(4):047104, 2002. [36] E. M. Bollt and D. ben Avraham. What is special about diffusion on scale-free nets? New Journal of Physics, 7:26, 2005. [37] R. Boyd and P. J. Richerson. Culture and the evolutionary process. University of Chicago Press, Chicago (USA), 1985. [38] D. Boyer and O. Miramontes. Interface motion and pinning in smallworld networks. Phys. Rev. E, 67:035102, 2003. [39] S. T. Bramwell, K. Christensen, J.-Y. Fortin, P. C. W. Holdsworth, H. J. Jensen, S. Lise, J. M. Lopez, M. Nicodemi, J.-F. Pinton, and M. Sellitto. Universal fluctuations in correlated systems. Phys. Rev. Lett., 84:3744, 2000.


BIBLIOGRAPHY [40] S. T. Bramwell, P. C. W. Holdsworth, and J.-F. Pinton. Universality of rare fluctuations in turbulence and critical phenomena. Nature, 396:552–554, 1998. [41] U. Brandes. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, 25, 25:163–177, 2001. [42] B. Bransden and C. Joachain. Physics of Atoms and Molecules. Longman Group Limited, New York (USA), 1983. [43] A. J. Bray. Theory of phase-ordering kinetics. Advances In Physics, 51(2):481–587, 2002. [44] H. Brighton and S. Kirby. The survival of the smallest: Stability conditions for the cultural evolution of compositional language. In J. Kelemen and P. Sosik, editors, ECAL01, pages 592–601. SpringerVerlag, 2001. [45] E. J. Briscoe, editor. Linguistic Evolution through Language Acquisition: Formal and Computational Models. Cambridge University Press, 2002. [46] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener. Graph structure in the Web. Computer Networks, 33(1-6):309–320, 2000. [47] G. Caldarelli. Scale-Free Networks: Complex Webs in Nature and Technology. Oxford University Press, 2006 (in press). [48] G. Caldarelli, A. Capocci, P. De Los Rios, and M. A. Mu˜ noz. Scalefree networks from varying vertex intrinsic fitness. Phys. Rev. Lett., 89(25):258702, Dec 2002. [49] A. Cangelosi and D. Parisi, editors. Simulating the Evolution of Language. Springer Verlag, London (UK), 2002. [50] A. Cangelosi, A. Smith, and K. Smith, editors. The Evolution of Language: Proceedings of the 6th International Conference on the Evolution of Language. Singapore: World Scientific, 2006. [51] C. Castellano. Effect of network topology on the ordering dynamics of voter models. AIP Conference Proceeding, 779:114, 2005. [52] C. Castellano, V. Loreto, A. Barrat, F. Cecconi, and D. Parisi. Comparison of voter and glauber ordering dynamics on networks. Phys. Rev. E, 71:066107, 2005.

BIBLIOGRAPHY [53] C. Castellano, M. Marsili, and A. Vespignani. Nonequilibrium phase transition in a model for social influence. Phys. Rev. Lett., 85:3536– 3539, 2000. [54] C. Castellano, D. Vilone, and A. Vespignani. Incomplete ordering of the voter model on small-world networks. Europhys. Lett., 63(1):153– 158, 2003. [55] X. Castello, V. M. Eguiluz, and M. S. Miguel. Ordering dynamics with two non-excluding options: Bilingualism in language competition. Arxiv preprint physics/0609079, 2006. [56] M. Catanzaro, M. Boguna, and R. Pastor-Satorras. Generation of uncorrelated random scale-free networks. Phys. Rev. E, 71:027103, 2005. [57] C. Cattuto, V. Loreto, and L. Pietronero. Collaborative tagging and semiotic dynamics. Arxiv preprint cs.CY/0605015, 2006. [58] L. L. Cavalli-Sforza and F. Cavalli-Sforza. The great human diasporas. Addison-Wesley, Reading, MA (USA), 1995. [59] L. L. Cavalli-Sforza and M. W. Feldman. Cultural Transmission and Evolution: A quantitative approach. Princeton University Press, 1981. [60] D. Challet, M. Marsili, and Y.-C. Zhang. Minority Games. Oxford University Press, November 2004. [61] P. Chen and S. Redner. Majority rule dynamics in finite dimensions. Phys. Rev. E, 71:036101, 2005. [62] N. Chomsky. Aspects of the Theory of Syntax. MIT Press, Cambridge, MA (USA), 1965. [63] N. Chomsky. Language and Mind. Harcourt Brace Jovanovich, New York (USA), 1972. [64] N. Chomsky. The minimalist program. MIT Press, Cambridge, MA (USA), 1995. [65] M. Christiansen and S. Kirby. Language evolution. Oxford University Press, 2003. [66] F. Chung and L. Lu. The Diameter of Sparse Random Graphs. Advances in Applied Mathematics, 26(4):257–279, 2001. [67] F. Chung and L. Lu. The average distances in random graphs with given expected degrees. Proc. Natl. Acad. Sci. USA, 99(25):15879– 15882, 2002.


BIBLIOGRAPHY [68] J. Clark and D. A. Holton. A first look at graph theory. World Scientific, 1991. [69] R. Cowan and N. Jonard. Network structure and the diffusion of knowledge. Journal of Economic Dynamics and Control, 28(8):1557– 1575, June 2004. [70] L. da Fontoura Costa and L. E. C. da Rocha. A generalized approach to complex networks. Arxiv preprint cond-mat/0408076 and condmat/0412761, 2004. [71] D. J. Daley and D. g. Kendall. Epidemics and rumors. Nature, 1964. [72] L. Dall’Asta and A. Baronchelli. Microscopic activity patterns in the naming game. J. Phys. A: Math. Gen., 39:14851–14867, 2006. [73] L. Dall’Asta, A. Baronchelli, A. Barrat, and V. Loreto. Agreement dynamics on small-world networks. Europhys. Lett., 73(6):969–975, 2006. [74] L. Dall’Asta, A. Baronchelli, A. Barrat, and V. Loreto. Nonequilibrium dynamics of language games on complex networks. Phys. Rev. E, 74:036105, 2006. [75] Data have been downloaded from the site www.cosin.org. [76] R. de Castro and J. Grossman. Famous trails to paul erd¨os. Mathematical Intelligencer, 21(3):51–63, 1999. [77] P. G. de Gennes. Granular matter: a tentative view. Rev. Mod. Phys., 71:S374–S382, 1999. [78] T. W. Deacon. The Symbolic Species: The Co-evolution of Language and the Brain. W.W. Norton, 1997. [79] G. Deffuant, D. Neau, F. Amblard, and G. Weisbuch. Mixing beliefs among interacting agents. Advances in Complex Systems, 3:87, 2000. [80] C. Di Chio and P. Di Chio. Game Theory and Linguistic Meaning, chapter Evolutionary models of language. Current Research in the Semantics/Pragmatics Interface. Elsevier, 2006. [81] I. Dornic, H. Chat´e, J. Chave, and H. Hinrichsen. Critical coarsening without surface tension: The universality class of the voter model. Phys. Rev. Lett., 87:045701, 2001. [82] S. Dorogovtsev and J. F. Mendes. Evolution of Networks: From Biological Nets to the Internet and WWW. Oxford University Press, 2003.

BIBLIOGRAPHY [83] S. N. Dorogovtsev, J. F. F. Mendes, and A. N. Samukhin. Sizedependent degree distribution of a scale-free growing network. Phys. Rev. E, 63:062101, 2001. [84] V. Egu´ıluz, D. Chialvo, G. Cecchi, M. Baliki, and A. Apkarian. ScaleFree Brain Functional Networks. Phys. Rev. Lett., 94(1):18102, 2005. [85] P. Erd˝os and A. R´enyi. On random graphs. Publicationes Mathematicae Debrecen, 6:290–297, 1959. [86] P. Erd˝os and A. R´enyi. On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences, 5:17–61, 1960. [87] G. Erg¨ un and G. Rodgers. Growing random networks with fitness. Physica A, 303(1-2):261–272, 2002. [88] T. S. Evans and J. P. Saramaki. Scale-free networks from selforganization. Phys. Rev. E, 72(2):026138, 2005. [89] M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. Proc. ACM SIGCOMM, Computer Communication Review, 29:251–262, 1999. [90] L. Frachebourg and P. L. Krapivsky. Exact results for kinetics of catalytic reactions. Phys. Rev. E, 53:R3009–R3012, 1996. [91] S. Franz and F. Ritort. Relaxation processes and entropic traps in the Backgammon model. J. Phys. A: Math. Gen, 30:L359–L365, 1997. [92] S. Galam. Social paradozes of majority rule voting and renormalization group. J. Stat. Phys., 61:943–951, 1990. [93] S. Galam. Rational group decision making: a random field ising model at t=0. Physica A, 238:66–80, 1997. [94] J. Galambos. The asymptotic theory of extreme order statistics. Wiley, New York (USA), 1978. [95] L. Gallos. Random walk and trapping processes on scale-free networks. Phys. Rev. E, 70(4):046116, 2004. [96] D. Garlaschelli, G. Caldarelli, and L. Pietronero. Universal scaling relations in food webs. Nature, 423:165, 2003. [97] P. L. Garrido, J. Marro, and M. A. Mu˜ noz, editors. Modeling cooperative behavior in the social sciences: Eighth Granada Lectures on Modeling Cooperative Behavior in the Social Sciences. AIP Conference Proceedings Volume 779, 2005.


BIBLIOGRAPHY [98] K.-I. Goh, B. Kahng, and D. Kim. Universal behaviour of load distribution in scale-free networks. Phys. Rev. Lett., 87:278701, 2001. [99] S. Golder and B. Huberman. The Structure of Collaborative Tagging Systems. Arxiv preprint cs.DL/0508082, 2005. [100] M. Granovetter. The strength of weak ties. American Journal of Sociology, 78:1360–1380, 1973. [101] J. H. Greenberg. Language, culture and communication. Stanford University Press, CA (USA), 1971. [102] J. Grossman and P. Ion. On a portion of the well-known collaboration graph. Congressus Numerantium, 108:129–131, 1995. [103] E. J. Gumbel. Statistics of extremes. Columbia University Press, New York (USA), 1958. [104] H. Haken. Information and Self-Organisation: a macroscopic approach to complex systems. Springer-Verlag, Berlin, 1988. [105] R. Hegselmann and U. Krause. Opinion dynamics and bounded confidence models, analysis and simulation. Journal of Artificial Societies and Social Simulation, 5(3):paper 2, 2002. [106] P. Holme and B. J. Kim. Growing scale-free networks with tunable clustering. Phys. Rev. E, 65:026107, 2002. [107] P. Holme, B. J. Kim, C. N. Yoon, and S. K. Han. Attack vulnerability of complex networks. Phys. Rev. E, 65:056109, 2002. [108] http://del.icio.us/. [109] http://www.flickr.com/. [110] B. Hughes. Random walks and random environments. Clarendon Press, Oxford (UK), 1995. [111] J. Hurford. Biological evolution of the saussurean sign as a component of the language acquisition device. Lingua, 77(2):187–222, 1989. [112] J. Hurford. Nativist and functional explanations in language acquisition. In I. M. Roca, editor, Logical Issues in Language Acquisition, pages 85–136. Foris, Dordrecht (NL), 1990. [113] J. Hurford, M. Studdert-Kennedy, and C. Knight. Approaches to the Evolution of Language: Social and Cognitive Bases. Cambridge University Press, Cambridge (UK), 1998.


[114] E. Hutchins and B. Hazlehurst. How to invent a lexicon: the development of shared symbols in interaction. In G. N. Gilbert and R. Conte, editors, Artificial Societies: The computer simulation of social life. UCL Press, London (UK), 1995. [115] F. Jasch and A. Blumen. Target problem on small-world networks. Phys. Rev. E, 63:041108, 2001. [116] F. Jasch and A. Blumen. Trapping of random walks on small-world networks. Phys. Rev. E, 64:066104, 2001. [117] H. Jeong, S. Mason, A.-L. Barab´asi, and Z. Oltvai. Lethality and centrality in protein networks. Nature, 411(6833):41–2, 2001. [118] H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A.-L. Barab´asi. The large-scale organization of metabolic networks. Nature, 407:651–654, October 2000. [119] J. Ke, J. Minett, C. Au, and W. Wang. Self-organization and selection in the emergence of vocabulary. Complexity, 7(3):41–54, 2002. [120] S. Kirby. Natural language from artificial life. In R. K. Standish, M. A. Bedau, and H. A. Abbass, editors, Artificial Life 8, pages 185– 215. The MIT press, 2002. [121] K. Klemm, V. Egu´ıluz, R. Toral, and M. San Miguel. Nonequilibrium transitions in complex networks: A model of social interaction. Phys. Rev. E, 67(2):026120, 2003. [122] K. Klemm and V. M. Egu´ıluz. Highly clustered scale-free networks. Phys. Rev. E, 65(3):036123, Feb 2002. [123] N. Komarova and P. Niyogi. Optimizing the mutual intelligibility of linguistic agents in a shared world. Artif. Intell., 154(1-2):1–42, 2004. [124] P. L. Krapivsky. Kinetics of monomer-monomer surface catalytic reactions. Phys. Rev. A, 45:1067–1072, 1992. [125] P. L. Krapivsky and S. Redner. Organization of growing random networks. Phys. Rev. E, 63(6):066123, May 2001. [126] P. L. Krapivsky and S. Redner. Dynamics of majority rule in two-state interacting spin systems. Phys. Rev. Lett., 90:238701, 2003. [127] P. L. Krapivsky, S. Redner, and F. Leyvraz. Connectivity of growing random networks. Phys. Rev. Lett., 85(21):4629–4632, Nov 2000. [128] M. Kuperman. Cultural propagation on social networks. preprint nlin.AO/0509004, 2005.


BIBLIOGRAPHY [129] T. Lenaerts, B. Jansen, K. Tuyls, and B. De Vylder. The evolutionary language game: An orthogonal approach. Journal of Theoretical Biology, 235(4):566–582, August 2005. [130] T. Ligett. Interacting particle systems. Springer-Verlag, New York (USA), 1985. [131] P. Lind, L. da Silva, J. Andrade Jr, and H. Herrmann. How Gossip Propagates. Arxiv preprint cond-mat/0603824, 2006. [132] L. Lov´asz. Random walks on graphs: A survey. In Combinatorics, Paul Erd¨ os is Eighty, pages 353–398. J´anos Bolyai Mathematical Society, Budapest, 1996. [133] Q. Lu, G. Korniss, and B. K. Szymanski. Naming games in spatiallyembedded random networks. In AAAI Fall Symposium Series: Interaction and Emergent Phenomena in Societies of Agents, 2006. [134] J. Marro and R. Dickman. Nonequilibrium Phase Transitions in Lattice Models. Cambridge University Press, Cambridge (UK), 1999. [135] J. Maynard Smith. Evolution and the theory of games. Cambridge University Press, Cambridge (UK), 1982. [136] J. Maynard Smith and E. Szathm´ary. The Major Transitions in Evolution. Freeman, Oxford (UK), 1995. [137] S. Milgram. The small world problem. Psychology Today, 2:60–67, 1967. [138] M. Mobilia and S. Redner. Majority versus minority dynamics: Phase transition in an interacting two spin system. Phys. Rev. E, 68:046106, 2003. [139] M. Molloy and B. Reed. A critical point for random graphs with a given degree sequence. In Proceedings of the Sixth International Seminar on Random Graphs and Probabilistic Methods in Combinatorics and Computer Science, “Random Graphs ’93” (Pozna´ n, 1993), volume 6, pages 161–179, 1995. [140] M. Molloy and B. Reed. The size of the giant component of a random graph with a given degree sequence. Combin. Probab. Comput., 7(3):295–305, 1998. [141] C. Moore and M. E. J. Newman. Epidemics and percolation in smallworld networks. Phys. Rev. E, 61:5678–5682, 2000.

BIBLIOGRAPHY [142] S. Mossa, M. Barth´el´emy, E. H. Stanley, and N. Amaral. Truncation of power law behavior in “scale-free” network models due to information filtering. Phys. Rev. Lett., 88(13):138701, 2002. [143] M. E. J. Newman. Scientific collaboration networks. ii. shortest paths, weighted networks, and centrality. Phys. Rev. E, 64:016132, 2001. [144] M. E. J. Newman. Assortative mixing in networks. Phys. Rev. Lett., 89:208701, 2002. [145] M. E. J. Newman. Ego-centered networks and the ripple effect. Social networks, 25(1):83–95, 2003. [146] M. E. J. Newman. Random graphs as models of networks. In S. Bornholdt and H. G. Schuster, editors, Handbook of Graphs and Networks, pages 35–68. Wiley-VCH, Berlin (Germany), 2003. [147] M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45:167, 2003. [148] M. E. J. Newman. Coauthorship networks and patterns of scientific collaboration. Proc. Natl. Acad. Sci. USA, pages 5200–5, Apr 2004. [149] M. E. J. Newman. A measure of betweenness centrality based on random walks. Social networks, 27(1):39–54, 2005. [150] E. L. Newport. Maturational constraints on language learning. Cognitive Science, 14(1):11–28, 1990. [151] J. Noh and H. Rieger. Random Walks on Complex Networks. Phys. Rev. Lett., 92(11):118701, 2004. [152] M. A. Nowak and D. C. Krakauer. The evolution of language. PNAS, 96(14):8028–8033, July 1999. [153] M. A. Nowak, J. B. Plotkin, and V. A. Jansen. The evolution of syntactic communication. Nature, 404(6777):495–498, March 2000. [154] M. A. Nowak, J. B. Plotkin, and J. D. Krakauer. The evolutionary language game. Journal Theoretical Biology, 200:147, 1999. [155] G. Odor. Universality classes in nonequilibrium lattice systems. Reviews of Modern Physics, 76:663, 2004. [156] M. Oliphant. Formal approaches to innate and learned communication: Laying the foundations for language. PhD thesis, University of California, San Diego, CA (USA), 1997.


BIBLIOGRAPHY [157] M. Oliphant and J. Batali. Learning and the emergence of coordinated communication. The newsletter of the Center for Research in Language, 11(1), 1997. [158] M. Osborne and A. Rubinstein. A Course in Game Theory. The MIT Press, Boston (USA), 1994. [159] R. Pastor-Satorras, A. V´azquez, and A. Vespignani. Dynamical and correlation properties of the internet. Phys. Rev. Lett., 87(25):258701, 2001. [160] R. Pastor-Satorras, A. V´azquez, and A. Vespignani. Topology, hierarchy, and correlations in internet graphs. Lecture Notes In Physics, 650:425–442, 2004. [161] R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scalefree networks. Phys. Rev. Lett., 86(14):3200, 2001. [162] R. Pastor-Satorras and A. Vespignani. Evolution and Structure of the Internet: A Statistical Physics Approach. Cambridge University Press, Cambridge (USA), 2004. [163] S. Pinker. The language instinct. Morrow, New York (USA), 1994. [164] S. Pinker and P. Bloom. Natural language and natural selection. Behavioral and Brain Sciences, 13(4):707–784, 1990. [165] W. V. O. Quine. Word and Object. The MIT Press, Cambridge, MA (USA), 1960. [166] A. Rapoport and A. M. Chammah. Prisoner’s dilemma: a study in conflict and cooperation. The University of Michigan Press, Ann Arbor (USA), 1965. [167] S. Redner. A Guide to First-Passage Processes. Cambridge University Press, Cambridge (UK), 2001. [168] F. Ritort and P. Sollich. Glassy dynamics of kinetically constrained models. Advances in Physics, 52(4):219–342, 2003. [169] E. Rogers. The diffusion of innovations. Free Press, New York (USA), 1995. [170] T. Rosenthal and B. Zimmerman. Social Learning and Cognition. Academic Press, New York (USA), 1978. [171] J. Saram¨aki and K. Kaski. Scale-free networks generated by random walkers. Physica A, 341:80–86, 2004.


[172] K. Smith, S. Kirby, and H. Brighton. Iterated learning: a framework for the emergence of language. Artificial Life, 9(4):371–386, 2003. [173] V. Sood and S. Redner. Voter model on heterogeneous graphs. Phys. Rev. Lett., 94:178701, 2005. [174] V. Sood, S. Redner, and D. ben Avraham. First-passage properties of the Erdős-Rényi random graph. Journal of Physics A: Mathematical and General, 38(1):109–123, 2005. [175] D. Stauffer. Sociophysics simulations II: opinion dynamics. In P. L. Garrido, J. Marro, and M. A. Muñoz, editors, Modeling cooperative behavior in the social sciences: Eighth Granada Lectures on Modeling Cooperative Behavior in the Social Sciences. AIP Conference Proceedings Volume 779, 2005. [176] L. Steels. A self-organizing spatial vocabulary. Artificial Life, 2(3):319–332, 1995.

[177] L. Steels. Self-organizing vocabularies. In C. G. Langton and K. Shimohara, editors, Artificial Life V, pages 179–184, Nara, Japan, 1996. [178] L. Steels. The synthetic modeling of language origins. Evolution of communication, 1:1, 1997. [179] L. Steels. The Talking Heads Experiment. Volume 1. Words and Meanings. Laboratorium, Antwerpen (Belgium), 1999. [180] L. Steels. Language as a complex adaptive system. In M. Schoenauer, editor, Proceedings of PPSN VI, Lecture Notes in Computer Science, Berlin (Germany), 2000. Springer-Verlag. [181] L. Steels. Evolving grounded communication for robots. Trends in Cognitive Sciences, 7(7):308–312, 7 2003. [182] L. Steels. The emergence and evolution of linguistic structure: from lexical to grammatical communication systems. Connection Science, 17:213–230, 2005. [183] K. Suchecki, V. M. Eguiluz, and M. S. Miguel. Voter model dynamics in complex networks: Role of dimensionality, disorder, and degree distribution. Phys. Rev. E, 72:036132, 2005. [184] E. Szathm´ary. Private communication, 2005. [185] K. Sznajd-Weron and J. Sznajd. Opinion evolution in closed community. International Journal of Modern Physics C, 11:1157, 2000. [186] T. Vicsek. A question of scale. Nature, 411:421, 2001.


BIBLIOGRAPHY [187] T. Vicsek. The bigger picture. Nature, 418:131, 2002. [188] D. Vilone and C. Castellano. Solution of voter model dynamics on annealed small-world networks. Phys. Rev. E, 69:016109, 2004. [189] W. Wang and J. Minett. The invasion of language: emergence, change and death. TRENDS in Ecology and Evolution, 20(5), 2005. [190] S. Wasserman and K. Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, New York and Cambridge, ENG, 1994. [191] D. J. Watts. Small-worlds: The Dynamics of Networks between Order and Randomness. Princeton University Press, Princeton, NJ (USA), 1999. [192] D. J. Watts and S. H. Strogatz. Collective dynamics of ’small world’ networks. Nature, 393:440, 1998. [193] W. Weidlich. Sociodynamics: A Systematic Approach to Mathematical Modelling in the Social Sciences. Harwood Academic Publishers, 2000. [194] L. Wittgenstein. Philosophical Investigations. (Translated by Anscombe, G.E.M.). Basil Blackwell, Oxford (UK), 1953. [195] L. Wittgenstein. Philosophische Untersuchungen. Suhrkamp Verlag, Frankfurt am Main (Germany), 1953. [196] W. Woess. Random Walks on Infinite Graphs and Groups. Cambridge University Press, Cambridge (UK), 2000. [197] S.-J. Yang. Exploring complex networks by walking on them. Phys. Rev. E, 71(1):016107, 2005. [198] T. Zhou, G. Yan, and B.-H. Wang. Maximal planar networks with large clustering coefficient and power-law degree distribution. Phys. Rev. E, 71:046141, 2005.
