Cifre PhD Proposal: “Learning in Blotto games and applications to modeling attention in social networks”

Keywords: Game theory, sequential learning, Blotto game, social networks, modeling

Supervisors
• Alonso Silva ([email protected]): https://www.bell-labs.com/usr/alonso.silva
• Patrick Loiseau ([email protected]): http://www.eurecom.fr/~loiseau/

Laboratories
• Alcatel-Lucent Bell Labs France, Nozay (near Paris), Mathematics of Complex and Dynamic Networks Department.
• EURECOM, Sophia-Antipolis (near Nice), Data Science Department.

Background and PhD topic description

The Colonel Blotto game is a fundamental model of strategic resource allocation: two players allocate a fixed amount of resources to a fixed number of battlefields with given values, each battlefield is then won by the player who allocated more resources to it, and each player seeks to maximize the aggregate value of the battlefields they win. It has recently attracted considerable interest in the theoretical and applied research communities because of its potential to model many important problems of resource allocation in strategic settings, ranging from international war to computer security. In particular, it provides a good model of competition for the attention of users in social networks. There, battlefields correspond to users and their values correspond to the value of convincing the users (e.g., in advertisement, it would be the value of the product bought by the user). Theoretical solutions of the Colonel Blotto game could therefore enable significant progress in designing strategies to allocate resources optimally to capture the attention of users in a social network, a topic of high importance in the online world, with applications for instance to advertisement campaigns or information propagation.

Applications of the Colonel Blotto game, however, have remained limited so far, mostly due to the lack of solutions of the game in realistic cases. Indeed, although the game was originally proposed by Borel in 1921 [2], the first Nash equilibrium solution was given only in 1950 [6], and only in a simple case (2 or 3 battlefields). In 2006, a Nash equilibrium solution was given for an arbitrary number of battlefields [9] (see also a survey in [10]), but only if all battlefields have the same value, which is not realistic in applications. In our recent work, we proposed first ideas towards a Nash equilibrium solution for arbitrary battlefield values [11], and towards a Nash equilibrium solution of the Blotto game on a graph [8]; but those ideas need to be developed to reach a general Nash equilibrium solution useful in the application to competition for the attention of users in social networks.

Another barrier for applications is that the Nash equilibrium solution assumes complete information on the players' payoffs, which is not always appropriate. The machine learning community has been very active in recent years in developing sequential learning methods that adjust strategies while learning the unknown payoff parameters, in particular in the classical setting of the multi-armed bandit problem [3, 5]. These methods, however, are not suited to a competitive environment such as the one modeled by the Colonel Blotto game, and developing learning algorithms in fully game-theoretic settings is currently an open problem.

The overall goal of this thesis will be to develop solutions of Blotto games in order to use them to model competition for the attention of users in social networks. Specifically, we will address the two key barriers mentioned above, that is:

(i) First, we will look for a general Nash equilibrium solution. In particular, we will consider arbitrary battlefield values, more than two players (in order to be able to model more than two competitors), and externalities on a graph (i.e., the fact that winning a battlefield has an effect on the value of neighboring battlefields in the social network). We will leverage the preliminary ideas mentioned above, and propose and analyze heuristics to compute approximate Nash equilibria in cases where an exact solution is not possible.

(ii) Second, we will develop sequential learning methods adapted to the game-theoretic setting, where several competitors are concurrently performing learning tasks whose outcomes depend on each other, and apply those methods to design strategies for the competitors to dynamically adjust their resource allocation while learning the users' values. To this end, we will combine ideas from the multi-armed bandit literature [3, 5] with game-theoretic ideas from repeated games [1, 4, 12] (see also [7]).
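
As an illustration of the two-player game described above, the payoff rule can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function name `blotto_payoffs` is hypothetical, and the tie-breaking rule (splitting the value of a tied battlefield equally) is an assumption, since the description above does not specify what happens when both players allocate the same amount.

```python
from typing import Sequence, Tuple


def blotto_payoffs(
    alloc_a: Sequence[float],
    alloc_b: Sequence[float],
    values: Sequence[float],
) -> Tuple[float, float]:
    """Return (payoff of player A, payoff of player B) for one play.

    alloc_a[i] and alloc_b[i] are the resources each player sends to
    battlefield i, and values[i] is the value of that battlefield.
    """
    if not (len(alloc_a) == len(alloc_b) == len(values)):
        raise ValueError("allocations and values must have the same length")

    payoff_a = 0.0
    payoff_b = 0.0
    for a, b, v in zip(alloc_a, alloc_b, values):
        if a > b:
            # Player A allocated more resources and wins the battlefield.
            payoff_a += v
        elif b > a:
            # Player B allocated more resources and wins the battlefield.
            payoff_b += v
        else:
            # ASSUMPTION: ties split the value equally (not stated in the text).
            payoff_a += v / 2
            payoff_b += v / 2
    return payoff_a, payoff_b


# Three battlefields with heterogeneous values; each player has a budget of 5.
example = blotto_payoffs([3, 2, 0], [1, 1, 3], [3, 2, 1])
```

In this example, player A wins the first two battlefields (values 3 and 2) and player B wins the third (value 1). In the social network application, each entry of `values` would stand for the value of convincing one user.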

Further information and application procedure

Candidates should have a strong background in mathematics (probability, and preferably learning, game theory/optimization, or both) and an interest in the application to modeling social networks. Interested candidates are invited to send the following documents to [email protected] and [email protected]:
• a detailed CV,
• a list of courses and grades in the last two years (at least),
• the names of 2-3 references willing to provide a recommendation letter for their application,
• a short statement of interest and any other information useful to evaluate the application.

The position will be open until filled, but the screening of applications will start on May 16, so interested candidates are invited to send their application material by May 16, 2016. The start of the PhD is expected in Fall 2016 (or after a minimum delay of 2 months due to administrative procedures). The PhD is fully funded. The PhD student will be mainly based in the Alcatel-Lucent Bell Labs France research center (in the Paris region) and will make short visits to EURECOM (in Sophia-Antipolis).

References
[1] R. J. Aumann and M. Maschler. Repeated Games with Incomplete Information. MIT Press, 1995.
[2] E. Borel. La théorie du jeu et les équations intégrales à noyau symétrique. Comptes Rendus de l’Académie des Sciences, 173(1304–1308):58, 1921.
[3] S. Bubeck and N. Cesa-Bianchi. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning, 5(1):1–122, 2012.
[4] F. Forges. Chapter 6: Repeated games of incomplete information: Non-zero-sum. In R. Aumann and S. Hart, editors, Handbook of Game Theory with Economic Applications, volume 1, pages 155–177. Elsevier, 1992.
[5] J. Gittins, K. Glazebrook, and R. Weber. Multi-Armed Bandit Allocation Indices. Wiley, 2011.
[6] O. Gross and R. Wagner. A continuous Colonel Blotto game. Rand, 1950.
[7] V. Kamble, P. Loiseau, and J. Walrand. Regret-optimal strategies for playing repeated games with discounted losses, 2016. Preprint, available as arXiv:1603.04981.
[8] A. M. Masucci and A. Silva. Strategic resource allocation for competitive influence in social networks. In Proceedings of Allerton, 2014.
[9] B. Roberson. The Colonel Blotto game. Economic Theory, 29(1):1–24, 2006.
[10] B. Roberson. Allocation games. In J. J. Cochran, L. A. Cox, P. Keskinocak, J. P. Kharoufeh, and J. C. Smith, editors, Wiley Encyclopedia of Operations Research and Management Science. John Wiley and Sons, Inc., 2010.
[11] G. Schwartz, P. Loiseau, and S. S. Sastry. The heterogeneous Colonel Blotto game. In Proceedings of NetGCooP, 2014.
[12] S. Sorin. A First Course on Zero Sum Repeated Games. Springer, 2002.

