Synopses & Reviews
Is the internet a suitable linguistic corpus? How can we use it in corpus techniques? What are the special properties that we need to be aware of? This book answers those questions.
The Web is an exponentially increasing source of language and corpus linguistics data. From gigantic static information resources to user-generated Web 2.0 content, the breadth and depth of information available is breathtaking - and bewildering. This book explores the theory and practice of the “web as corpus”. It looks at the most common tools and methods used and features a plethora of examples based on the author's own teaching experience. This book also bridges the gap between studies in computational linguistics, which emphasize technical aspects, and studies in corpus linguistics, which focus on the implications for language theory and use.
Gatto seeks to provide a comprehensive introduction to key issues that have been raised and the tools and resources that have been developed in the multifaceted research field known as Web as Corpus. More specifically, she presents, explains, and questions some of the theoretical implications of using the web as a corpus, while introducing a variety of methods that have become standard in the field. The web cannot be considered a traditional corpus in its own right, she says, but it is crucially important to be aware of the many ways its enormous potential can be exploited by corpus linguistics, and to acknowledge fruitful interaction between established methodological standards and novel approaches. Annotation ©2014 Ringgold, Inc., Portland, OR (protoview.com)
About the Author
Maristella Gatto is a Researcher and Lecturer in English Language and Translation at the Faculty of Modern Languages, University of Bari, Italy.
Table of Contents
1. Corpus, Concordance, Collocation: Basic Notions
2. From Body to Web: An Introduction to the Web as Corpus
3. Challenging Anarchy: Web Search from a Corpus Perspective
4. Beyond Ordinary Search Engines: Concordancing the Web
5. Building and Using Comparable Web Corpora: Tools and Methods
6. Sketches of Language and Culture from Large Web Corpora
7. From Download to Upload. The Web as Corpus in the Web 2.0 Era