Tutorial ACL 2016

Game Theory and Natural Language: Origin, Evolution and Processing

Rocco Tripodi and Marcello Pelillo

Tutorial at ACL 2016, Berlin, August 7, 2016

The slides (part1part2) and the code are now available!


Motivation

The development of game theory in the early 1940’s by John von Neumann was a reaction against the then dominant view that problems in economic theory can be formulated using standard methods from optimization theory. Indeed, most real-world economic problems involve conflicting interactions among decision-making agents that cannot be adequately captured by a single (global) objective function. The main idea behind game theory is to shift the emphasis from optimality criteria to equilibrium conditions. Game theory provides a framework to model complex scenarios, with applications in economics and social science but also in different fields of information technology. With the recent development of algorithmic game theory, it has been used to solve problems in computer vision, pattern recognition, machine learning and natural language processing.

Game-theoretic frameworks have been used in different ways to study language origin

[1, 2] and evolution [3]. Furthermore, the so-called game metaphor has been used by philosophers and linguists to explain how language evolved and how it works. Ludwig Wittgenstein, for example, famously introduced the concept of a language game to explain the conventional nature of language, and put forward the idea of the spontaneous formation of a common language that gradually emerges from the interactions among the speakers within a population.

This concept opens the way to the interpretation of language as a complex adaptive system composed of linguistic units and their interactions, which gives rise to the emergence of structural properties. It is the core part of many computational models of language that are based on classical game theory and evolutionary game theory. With the former it is possible to model how speakers form a signaling system in which the ambiguity of the symbols is minimized [4]; with the latter it is possible to model how speakers coordinate their linguistic choices according to the satisfaction that they have about the outcome of a communication act, converging to a common language. In the same vein, many other attempts have been proposed to explain how other characteristics of language follow similar dynamics. Nowak [5, 6] proposed a framework for the evolution of a common lexicon in a population, identifying what is the minimum reproductive rate of the words, which are maintained in the lexicon of a language and what is the maximum size of a lexicon. Grammatical aspects of language evolution were taken into account in [7], identifying the dynamics that involve grammatical acquisition. Aspects of syntax emergence have been studied in [8]. Studies on semantics and pragmatics have been proposed in [9] with the concept of equilibrium semantics, which tries to unify these two aspects of language. Strategic uses of language have been studied by Clark [10], using a game theoretic framework.

Game theory and especially evolutionary game theory, thanks to their ability to model interactive situations and to integrate information from multiple sources, have also been used to solve specific problems in natural language processing and information retrieval. Van Deemter [11] proposed game theory as a potential tool for natural language generation, motivated from the observation that within a game theoretic framework it is possible to interpret language generation as a decision problem among different linguistic forms. The use of game theory in this case is related to the application of a utility function, which is able to weight the choices in case of vagueness.

More recently, Tripodi and Pelillo [12, 13, 14, 15, 16] demonstrated how a game theoretic model could be used to solve the word sense disambiguation problem. They interpreted the words to be disambiguated as players and the senses as strategies, which each player has to choose in order to maximize its utility. The concept of utility has been used in different ways in the game theory literature; in general, it refers to the satisfaction that a player obtains from the outcome of a game. In this case it means increasing the textual coherence, since the equilibrium state of the system is reached when the data are consistently labeled. An application to document clustering has been proposed in [14]. Their model tries to overcome the problem of clustering streaming data. In fact many algorithm for document clustering have problem in clustering datasets that evolve over time. They consider each document as a player and each cluster as a strategy that a player has to choose. The players have to choose their class membership according to the decisions that similar documents are taking. In both cases the data are processed concurrently and the disambiguation/clustering of a single instance depends at least implicitly on the choices of the other instances in the system.

The goal of this tutorial is to offer an introduction to the basic concepts of game theory and to show its main applications in the study of language, from different perspectives. We shall assume no pre-existing knowledge of game theory by the audience, thereby making the tutorial self-contained and understandable by a non-expert. We plan to set up a webpage, which, among other things, will contain the tutorial’s slides and pointer to relevant literature and software.

References

  1. Pietarinen, Ahti-Veikko. 2007. Game theory and linguistic meaning. BRILL.
  2. Skyrms, Brian. 2010. Signals: Evolution, learning, and information. Oxford University Press.
  3. Nowak, Martin A, Natalia L Komarova, and Partha Niyogi. 2001. Evolution of universal grammar. Science, 291(5501):114–118.
  4. David Lewis. Convention: A philosophical study. Harvard University Press, 1967.
  5. Martin A Nowak. The basic reproductive ratio of a word, the maximum size of a lexicon. Journal of theoretical biology, 204(2):179–189, 2000.
  6. Martin A Nowak, Natalia L Komarova, and Partha Niyogi. Computational and evolutionary aspects of language. Nature, 417(6889):611–617, 2002
  7. Natalia L Komarova, Partha Niyogi, and Martin A Nowak. The evolutionary dynamics of grammar acquisition. Journal of theoretical biology, 209(1):43–59, 2001.
  8. Luc Steels. The origins of syntax in visually grounded robotic agents. Artificial Intelligence, 103(1):133–156, 1998
  9. Parikh, Prashant. Language and equilibrium. Cambridge, MA: MIT Press, 2010.
  10. Clark, Robin. “Meaningful games.” (2011).
  11. Van Deemter, Kees. “What game theory can do for NLG: the case of vague language.” Proceedings of the 12th European Workshop on Natural Language Generation. Association for Computational Linguistics, 2009.
  12. Tripodi, Rocco, and Marcello Pelillo. WSD-games: a Game-Theoretic Algorithm for Unsupervised Word Sense Disambiguation, Proceedings of SemEval-2015,2015.
  13. Tripodi, Rocco, and Marcello Pelillo, A Game Theoretic Model for Word Sense Disambiguation, Computational Linguistics, (to appear). arXiv preprint https://arxiv.org/pdf/1606.07711v4.pdf.
  14. Tripodi, Rocco, and Marcello Pelillo. “Document clustering games.” The 5th International Conference on Pattern Recognition Applications and Methods. 2016.
  15. Tripodi, Rocco, and Marcello Pelillo. “Document Clustering Games in Static and Dynamic Scenarios.” (ICPPAM2016, selected papers to appear in Springer LNCS). arXiv preprint http://arxiv.org/pdf/1607.02436v1.pdf, 2016.
  16. Tripodi, Rocco, Sebastiano Vascon, and Marcello Pelillo. “Context Aware Nonnegative Matrix Factorization Clustering.”. (Accepted to ICPR) arXiv preprint http://arxiv.org/pdf/1609.04628.pdf 2016.

Outline

1. Introduction to game theory

  a. What is a game?

  b. Mixed strategies, expected payoffs, Nash equilibria

  c. Evolutionary stable strategies, correlated equilibria

  d. Complexity and algorithmic issues

Coffee Break

2. Origin and evolution of language

  a. Origin of language

  b. Language evolution

3. Applications

  a. Word sense disambiguation

  b. Document clustering

About the presenters

Marcello Pelillo is Full Professor of Computer Science at Ca’ Foscari University, where he directs the European Centre for Living Technology (ECLT). He held visiting research positions at Yale University, McGill University, the University of Vienna, York University (UK), the University College London, and the National ICT Australia (NICTA). He has published more than 200 technical papers in refereed journals, handbooks, and conference proceedings in the areas of machine learning, pattern recognition and computer vision. He has initiated several conference series, including EMMCVPR in 1997, IWCV in 2008, SIMBAD in 2011, and he chairs the EMMCVPR and SIMBAD steering committees. He serves (has served) on the Editorial Boards of the journals IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), IET Computer Vision, Pattern Recognition, Brain Informatics, and he serves on the Advisory Board of the International Journal of Machine Learning and Cybernetics. He has served (serves) as Guest Editor for various special issues of IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Pattern Recognition, Pattern Recognition Letters. He is (or has been) scientific coordinator of several research projects, including SIMBAD, an EU-FP7 project devoted to similarity-based pattern analysis and recognition whose activity is described in a recently published Springer book, and he has recently won an award from the Samsung Global Research Outreach (GRO) program. Prof. Pelillo is a Fellow of the IEEE and a Fellow of the IAPR. He has recently been appointed IEEE SMC Distinguished Lecturer.

Rocco Tripodi is Post-doctoral researcher at European Centre for Living Technology (Ca’ Foscari University of Venice), where he is working on learning models based on game theoretic principles. He completed his PhD in Computer Science at Ca’ Foscari University in 2015 with a thesis titled “Evolutionary Game Theoretic Models for Natural Language Processing”. He was Research Assistant and Adjunct Professor at Ca’ Foscari University in 2010, where he worked on ontological semantics and taught Corpus Linguistics and Natural Language Processing for Online Applications. In 2015, he was visiting researcher at Staffordshire University (UK). His research interests are in the areas of machine learning and natural language processing. He is particularly interested in classification problems on networked data, from a semantic perspective and on the design, learning and evolution of linguistic communication systems. At present, he is working on game theoretic learning models, with applications to the analysis of multi-modal signals from interacting agents and on group dynamics.