1st Edition

Ellipsis and wa-marking in Japanese Conversation

By John Fry
Copyright 2003
220 Pages
Published by Routledge

    This book investigates the operation of two linguistic mechanisms, ellipsis and wa-marking, in a corpus of colloquial Japanese speech. Its data set is the CallHome Japanese (CHJ) corpus, a collection of transcripts and digitized speech data for 120 telephone conversations between native speakers of Japanese. To make the CHJ data useful for linguistic research, John Fry annotates the original transcripts with a comprehensive set of acoustic, phonetic, syntactic, and semantic tags. Fry demonstrates that Japanese conversation obeys certain principles of argument ellipsis that appear to be language-universal: namely, the tendency to omit transitive and human subjects and the tendency to express at most one argument per clause. He identifies a set of syntactic and semantic factors that correlate significantly with the ellipsis of grammatical particles following a noun phrase (NP). These factors include the grammatical construction type (question, idiom), the length of the NP, utterance length, the proximity of the NP to the predicate, and the animacy and definiteness of the NP. The animacy and definiteness constraints are of particular interest because these too seem to reflect language-universal principles. Analyzing the CHJ data further, Fry investigates the use and function of the topic-marking particle wa. His study identifies a set of semantic and prosodic properties that tend to distinguish wa from the subject-marking particle ga. The book shows that wa-phrases exhibit more prominent intonation, as measured by peak fundamental frequency (F0), than ga-phrases in the CHJ speech data, contradicting accounts which predict that ga-phrases, because they are associated with new information, should be more prominent.

    1 Introduction
    1.1 Overview
    1.1.1 Part I: The CHJ Corpus
    1.1.2 Part II: Ellipsis and wa-marking
    1.2 Notes to the Reader
    1.2.1 Intended Audience
    1.2.2 Japanese Language Examples

    I. The CHJ Corpus
    2 Corpora and Conversation
    2.1 Introduction to Part I
    2.2 Introduction to Language Corpora
    2.2.1 The Role of the Corpus in Linguistics
    2.2.2 Basic Features of Corpora
    2.2.3 Annotated Corpora
    2.3 Speech Corpora
    2.3.1 Spoken versus Written Language
    2.3.2 Planned Speech
    2.3.3 Pragmatic or Task-Oriented Dialogues
    2.3.4 Casual Conversations
    2.4 Characteristics of Conversation
    2.4.1 Turn-taking Behavior
    2.4.2 Backchannel Behavior
    2.4.3 Disfluencies
    2.4.4 Conversational Structure
    3 The CHJ Corpus
    3.1 The LDC CallHome Corpora
    3.2 About the CHJ Corpus
    3.3 About the Speakers
    3.4 The CHJ Transcripts
    3.4.1 Morphological Segmentation
    3.4.2 Size of the CHJ Corpus
    3.4.3 Other Transcription Conventions
    3.4.4 Alterations to the Transcripts
    4 Annotating the CHJ Corpus
    4.1 Introduction
    4.1.1 Native-Speaker Annotators
    4.1.2 NTT Goi-Taikei Semantic Dictionary
    4.2 The CHJ Lexicon
    4.2.1 Overview of the Lexicon
    4.2.2 GT Semantic Categories
    4.3 Semantic and POS Annotations
    4.3.1 Format of the Annotated Transcripts
    4.3.2 POS Annotations
    4.4 Predicate-Argument Annotations
    4.4.1 Structural Annotation
    4.4.2 Goi-Taikei Transfer Dictionary
    4.4.3 Hand-tagging of Predicate-Argument Relations
    4.4.4 Results of the Hand-tagging
    4.4.5 Predicate-Argument Annotation Format
    4.5 Acoustic Annotations
    4.5.1 Overview of Speech Processing
    4.5.2 F0 Measurements
    4.5.3 Word Segmentation
    4.5.4 Format of Acoustic Annotations

    II. Ellipsis and wa-marking
    5 Ellipsis
    5.1 Introduction to Part II
    5.2 Introduction to Ellipsis
    5.2.1 What Is Ellipsis?
    5.2.2 Examples of Ellipsis
    5.2.3 Functions of Ellipsis
    5.3 Argument Ellipsis
    5.3.1 Note on Zero Pronoun Resolution
    5.3.2 Argument Ellipsis in the CHJ Corpus
    5.3.3 Subject Ellipsis
    5.3.4 Ellipsis in Transitive and Intransitive Predicates
    5.3.5 Conclusion: Argument Ellipsis
    5.4 Particle Ellipsis
    5.4.1 Introduction
    5.4.2 Sex and Dialect
    5.4.3 Syntactic Factors in Particle Ellipsis
    5.4.4 Animacy and Definiteness
    5.4.5 Focus and Particle Ellipsis
    5.4.6 Conclusion: Particle Ellipsis
    6 Wa-marking
    6.1 Introduction
    6.1.1 Topic and Subject in Japanese
    6.1.2 Mechanics of Wa-marking
    6.2 Semantics of wa- and ga- Phrases
    6.2.1 Kuno's Taxonomy of wa and ga
    6.2.2 Categorical versus Thetic Judgments
    6.2.3 Wa as a Backgrounding Particle
    6.2.4 Old versus New Information
    6.2.5 File Card-Based Accounts of wa and ga
    6.2.6 The Strong Familiarity Condition
    6.2.7 Conclusion: Semantics of wa- and ga- Phrases
    6.3 Intonation of wa and ga
    6.3.1 Intonation and Focus
    6.3.2 F0 Correlates of wa- Phrases
    6.3.3 F0 Correlates of wa and ga in CHJ
    6.3.4 Conclusion: Intonation of wa and ga
    6.4 Properties of wa-marked Nouns
    6.4.1 Accessibility to wa-marking
    6.4.2 Semantic Properties of wa- and ga-marked Nouns
    6.4.3 Conclusion: Properties of wa-marked Nouns

    III. Appendices
    A Background on the Japanese Language
    A.1 Introduction
    A.2 Grammar
    A.3 Dialects
    A.4 Politeness and Formality
    A.5 Sentence-final Discourse Particles



    John Fry received his Ph.D. in Linguistics from Stanford University in 2002, and is currently a consultant at Stanford's Center for the Study of Language and Information (CSLI). His research interests include natural language processing, speech processing, and Japanese semantics and syntax.