Tutorials

LIST OF ACCEPTED TUTORIALS:

MORNING:

T1: Latent Structure Models for Natural Language Processing

André F. T. Martins, Tsvetomila Mihaylova, Nikita Nangia and Vlad Niculae

Latent structure models are a powerful tool for modeling compositional data, discovering linguistic structure, and building NLP pipelines. They are appealing for two main reasons: they allow incorporating structural bias during training, leading to more accurate models; and they allow discovering hidden linguistic structure, which provides better interpretability.

This tutorial will cover recent advances in discrete latent structure models. We discuss their motivation, potential, and limitations, then explore in detail three strategies for designing such models: gradient approximation, reinforcement learning, and end-to-end differentiable methods. We highlight connections among all these methods, enumerating their strengths and weaknesses. The models we present and analyze have been applied to a wide variety of NLP tasks, including sentiment analysis, natural language inference, language modeling, machine translation, and semantic parsing.

Examples and evaluation will be covered throughout. After attending the tutorial, a practitioner will be better informed about which method is best suited for their problem.

T2: Graph-Based Meaning Representations: Design and Processing

Alexander Koller, Stephan Oepen and Weiwei Sun

The last several years have seen extensive interest in encoding and processing sentence meaning in the form of labeled directed graphs. Frameworks instantiating this line of research include e.g. Abstract Meaning Representation, graph-based rendering of Minimal Recursion Semantics, Bilexical Semantic Dependency Graphs, and Universal Conceptual Cognitive Annotation.

Complementary to advanced vector-based representations of meaning, parsing to such hierarchically structured and discrete semantic representations has been a cornerstone of Natural Language Understanding since the early days and will continue to make essential contributions to ‘making sense’ of natural language. This tutorial will (a) briefly review relevant background in formal and linguistic semantics; (b) semi-formally define a unified abstract view on different flavors of semantic graphs and associated terminology; (c) survey common frameworks for graph-based meaning representation and available graph banks; and (d) offer a technical overview of a representative selection of different parsing approaches.

The ultimate goal is to provide a unified view on different semantic graph banks and associated parsing work and, thus, to reduce the barrier to entry for NLP developers and users to benefit from recent successes and best practices in this exciting field.

T3: Discourse Analysis and Its Applications

Shafiq Joty, Giuseppe Carenini, Raymond Ng and Gabriel Murray

Discourse processing is a suite of Natural Language Processing (NLP) tasks to uncover linguistic structures from texts at several levels, which can support many text mining applications. This involves identifying the topic structure, the coherence structure, the coreference structure, and the conversation structure for conversational discourse. Taken together, these structures can inform text summarization, essay scoring, sentiment analysis, machine translation, information extraction, question answering, and thread recovery.

The tutorial starts with an overview of basic concepts in discourse analysis: monologue vs. conversation, synchronous vs. asynchronous conversation, and key linguistic structures in discourse analysis. It then covers traditional machine learning methods along with the most recent work using deep learning, and compares their performance on benchmark datasets.

For each discourse structure we describe, we show its applications in downstream text mining tasks. Methods and metrics for evaluation are discussed in detail. We conclude the tutorial with an interactive discussion of future challenges and opportunities.

T4: Computational Analysis of Political Texts: Bridging Research Efforts Across Communities

Goran Glavaš, Federico Nanni and Simone Paolo Ponzetto

The usage of computational methods for the study of political texts has drastically expanded in scope, allowing for a sustained growth of the text-as-data community in political science. NLP methods have been extensively used for a number of analyses and tasks, including inferring policy positions of actors from textual evidence, detecting topics in political documents, and analyzing stylistic aspects of political communication (e.g., assessing the role of language ambiguity in framing the political agenda). Political scientists created resources and used available NLP methods to process textual data largely in isolation from the NLP community.

At the same time, NLP researchers addressed closely related tasks such as election prediction, ideology classification, and stance detection. These two communities remain largely disconnected from one another, with NLP researchers mostly unaware of interesting applications and use cases in political science and political scientists lagging behind in applying cutting-edge NLP methods to their problems. This tutorial will provide a comprehensive overview of the body of work on computational analysis of political texts. We first look at the role that textual data play in political analyses and then proceed to examine the concrete resources and tasks addressed by the text-as-data political science community.

Next, we present the research efforts carried out so far by the NLP community with a focus on methods for the topical analysis of political texts, covering both unsupervised topic induction and supervised topic classification studies. Finally, we conclude the tutorial by focusing on political text scaling, a challenging task on ideology detection from textual data, which is at the center of quantitative political science and has recently also attracted attention from NLP scholars.

T5: Wikipedia as a Resource for Text Analysis and Retrieval

Marius Pasca

Articles within Wikipedia collectively form what might be the largest publicly available, decentralized resource of unstructured or semi-structured knowledge, reflecting an ever-growing number of topics of interest to people in general and Web users in particular. This tutorial examines the role of Wikipedia in tasks related to text analysis and retrieval. Text analysis tasks that take advantage of Wikipedia include coreference resolution, word sense and entity disambiguation, and information extraction.

In information retrieval, a better understanding of the structure and meaning of queries helps in matching queries against documents, clustering search results, retrieving answers and entities, and retrieving knowledge panels for queries asking about popular entities. The tutorial reviews characteristics, advantages and limitations of Wikipedia relative to other existing, human-curated resources of knowledge; derivative resources, created by converting semi-structured content in Wikipedia into structured data; and the role of Wikipedia and its derivatives in text analysis and in enhancing information retrieval.

AFTERNOON:

T6: Deep Bayesian Natural Language Processing

Jen-Tzung Chien

This introductory tutorial addresses the advances in deep Bayesian learning for natural language with ubiquitous applications ranging from speech recognition to document summarization, text classification, text segmentation, information extraction, image caption generation, sentence generation, dialogue control, sentiment classification, recommendation systems, question answering and machine translation, to name a few. Traditionally, "deep learning" is taken to be a learning process where the inference or optimization is based on a real-valued deterministic model. The "semantic structure" in words, sentences, entities, actions and documents drawn from a large vocabulary may not be well expressed or correctly optimized in mathematical logic or computer programs. The "distribution function" in discrete or continuous latent variable models for natural language may not be properly decomposed or estimated.

This tutorial addresses the fundamentals of statistical models and neural networks, and focuses on a series of advanced Bayesian models and deep models including hierarchical Dirichlet process, Chinese restaurant process, hierarchical Pitman-Yor process, Indian buffet process, recurrent neural network, long short-term memory, sequence-to-sequence model, variational auto-encoder, generative adversarial network, attention mechanism, memory-augmented neural network, skip neural network, stochastic neural network, policy neural network, and Markov recurrent neural network. We present how these models are connected and why they work for a variety of applications on symbolic and complex patterns in natural language.

Variational inference and sampling methods are formulated to tackle the optimization of complicated models. Word and sentence embeddings, clustering and co-clustering are merged with linguistic and semantic constraints. A series of case studies is presented to tackle different issues in deep Bayesian learning and understanding. Finally, we point out a number of directions and outlooks for future studies.

T7: Unsupervised Cross-Lingual Representation Learning

Sebastian Ruder, Anders Søgaard and Ivan Vulić

In this tutorial, we provide a comprehensive survey of the exciting recent work on cutting-edge weakly-supervised and unsupervised cross-lingual word representations. After providing a brief history of supervised cross-lingual word representations, we focus on: 1) how to induce weakly-supervised and unsupervised cross-lingual word representations in truly resource-poor settings where bilingual supervision cannot be guaranteed; 2) critical examinations of different training conditions and requirements under which unsupervised algorithms can and cannot work effectively; 3) more robust methods that can mitigate instability issues and low performance for distant language pairs; 4) how to comprehensively evaluate such representations; and 5) diverse applications that benefit from cross-lingual word representations (e.g., MT, dialogue, cross-lingual sequence labeling and structured prediction applications, cross-lingual IR).

T8: Advances in Argument Mining

Katarzyna Budzynska and Chris Reed

Argument and debate form cornerstones of civilised society and of intellectual life. Processes of argumentation run our governments, structure scientific endeavour and frame religious belief. As our understanding of how arguments are assembled, are interpreted and have impact has improved, so it has become possible to frame computational questions about how it might be possible for machines to model and replicate the processes involved in identifying, reconstructing, interpreting and evaluating reasoning expressed in natural language arguments.

This course aims to introduce students to an exciting and dynamic area that has witnessed remarkable growth over the past 36 months. Argument mining builds on opinion mining, sentiment analysis and related tasks to automatically extract not just what people think, but why they hold the opinions they do. From being largely beyond the state of the art barely five years ago, there are now many hundreds of papers on the topic and millions of dollars of commercial and research investment. This tutorial provides a synthesis of the major advances in the area over the past three years.

T9: Storytelling from Structured Data and Knowledge Graphs: An NLG Perspective

Abhijit Mishra, Anirban Laha, Karthik Sankaranarayanan, Parag Jain and Saravanan Krishnan

In this tutorial, we discuss the foundational, methodological, and system development aspects of translating structured data (such as data in tabular form) and knowledge bases (such as knowledge graphs) into natural language discourses. The tutorial covers challenges and approaches for Natural Language Generation (NLG), with a primary focus on the (structured) data-to-text paradigm.

Our attendees will be able to take home the following: (1) the basic as well as trending ideas around how modern NLP and NLG techniques could be applied to describe and summarize data that is non-linguistic in nature or has some structure, and (2) a few interesting open-ended questions, which could lead to significant research contributions in the future. We will provide an overview of diverse approaches ranging from data representation techniques to domain-adaptable solutions for the data-to-text problem setting. Various solutions, from traditional rule-based or heuristic-driven systems through modern data-driven methods to recent deep neural architectures, will be discussed, followed by a brief discussion on evaluation and quality estimation.

Since large-scale, domain-independent labelled (parallel) data is rarely available for data-to-text problems, a significant portion of the tutorial will be dedicated to unsupervised, scalable, and domain-adaptable approaches.