Library

Natural Language Processing

Partha Niyogi

The Computational Nature of Language Learning and Evolution

The nature of the interplay between language learning and the evolution of a language over generational time is subtle. We can observe the learning of language by children and marvel at the phenomenon of language acquisition; the evolution of a language, however, is not so directly experienced. Language learning by children is robust and reliable, but it cannot be perfect or languages would never change―and English, for example, would not have evolved from the language of the Anglo-Saxon Chronicles. In this book Partha Niyogi introduces a framework for analyzing the precise nature of the relationship between learning by the individual and evolution of the population. Learning is the mechanism by which language is transferred from old speakers to new. Niyogi shows that the evolution of language over time will depend upon the learning procedure―that different learning algorithms may have different evolutionary consequences. He finds that the dynamics of language evolution are typically nonlinear, with bifurcations that can be seen as the natural explanatory construct for the dramatic patterns of change observed in historical linguistics. Niyogi investigates the roles of natural selection, communicative efficiency, and learning in the origin and evolution of language―in particular, whether natural selection is necessary for the emergence of shared languages. Over the years, historical linguists have postulated several accounts of documented language change. Additionally, biologists have postulated accounts of the evolution of communication systems in the animal world. This book creates a mathematical and computational framework within which to embed those accounts, offering a research tool to aid analysis in an area in which data is often sparse and speculation often plentiful.


Kees van Deemter

Computational Models of Referring

To communicate, speakers need to make it clear what they are talking about. The act of referring, which anchors words to things, is a fundamental aspect of language. In this book, Kees van Deemter shows that computational models of reference offer attractive tools for capturing the complexity of referring. Indeed, the models van Deemter presents cover many issues beyond the basic idea of referring to an object, including reference to sets, approximate descriptions, descriptions produced under uncertainty concerning the hearer's knowledge, and descriptions that aim to inform or influence the hearer. The book, which can be read as a case study in cognitive science, draws on perspectives from across the cognitive sciences, including philosophy, experimental psychology, formal logic, and computer science. Van Deemter advocates a combination of computational modeling and careful experimentation as the preferred method for expanding these insights. He then shows this method in action, covering a range of algorithms and a variety of methods for testing them. He shows that the method allows us to model logically complicated referring expressions, and demonstrates how we can gain an understanding of reference in situations where the speaker's knowledge is difficult to assess or where the referent resists exact definition. Finally, he proposes a program of research that addresses the open questions that remain in this area, arguing that this program can significantly enhance our understanding of human communication.


Michael Brady

Computational Models of Discourse

As the contributions to this book make clear, a fundamental change is taking place in the study of computational linguistics analogous to that which has taken place in the study of computer vision over the past few years and indicative of trends that are likely to affect future work in artificial intelligence generally. The first wave of efforts on machine translation and the formal mathematical study of parsing yielded little real insight into how natural language could be understood by computers or how computers could lead to an understanding of natural language. The current wave of research seeks both to include a wider and more realistic range of features found in human languages and to limit the dimensions of program goals. Some of the new programs embody for the first time constraints on human parsing which Chomsky has uncovered, for example. The isolation of constraints and the representations for their expression, rather than the design of mechanisms and ideas about process organization, is central to the work reported in this volume. And if present goals are somewhat less ambitious, they are also more realistic and more realizable.


M. Fernandez

The Handbook of Psycholinguistics

Incorporating approaches from linguistics and psychology, The Handbook of Psycholinguistics explores language processing and language acquisition from an array of perspectives and features cutting-edge research from cognitive science, neuroscience, and other related fields. The Handbook provides readers with a comprehensive review of the current state of the field, with an emphasis on research trends most likely to determine the shape of psycholinguistics in the years ahead. The chapters are organized into three parts, corresponding to the major areas of psycholinguistics: production, comprehension, and acquisition. The collection of chapters, written by a team of international scholars, incorporates multilingual populations and neurolinguistic dimensions. Each of the three sections also features an overview chapter in which readers are introduced to the different theoretical perspectives guiding research in the area covered in that section. Timely, comprehensive, and authoritative, The Handbook of Psycholinguistics is a valuable addition to the reference shelves of researchers in psychology, linguistics, and cognitive science, as well as advanced undergraduates and graduate students interested in how language works in the human mind and how language is acquired.


James Pustejovsky

Natural Language Annotation for Machine Learning

Create your own natural language training corpus for machine learning. Whether you're working with English, Chinese, or any other natural language, this hands-on book guides you through a proven annotation development cycle: the process of adding metadata to your training corpus to help ML algorithms work more efficiently. You don't need any programming or linguistics experience to get started. Using detailed examples at every step, you'll learn how the MATTER Annotation Development Process helps you Model, Annotate, Train, Evaluate, and Revise your training corpus. You also get a complete walkthrough of a real-world annotation project.

Define a clear annotation goal before collecting your dataset (corpus)
Learn tools for analyzing the linguistic content of your corpus
Build a model and specification for your annotation project
Examine the different annotation formats, from basic XML to the Linguistic Annotation Framework
Create a gold standard corpus that can be used to train and test ML algorithms
Select the ML algorithms that will process your annotated data
Evaluate the results and revise your annotation task
Learn how to use lightweight software for annotating texts and adjudicating the annotations

This book is a perfect companion to O'Reilly's Natural Language Processing with Python.


Steven Bird

Natural Language Processing with Python

This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication. Packed with examples and exercises, Natural Language Processing with Python will help you:

Extract information from unstructured text, either to guess the topic or identify named entities
Analyze linguistic structure in text, including parsing and semantic analysis
Access popular linguistic databases, including WordNet and treebanks
Integrate techniques drawn from fields as diverse as linguistics and artificial intelligence

This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.