
Compiling and Annotating a Learner Corpus for a Morphologically Rich Language

Bibliographic Details
Classification:	Electronic Book
Main Author:	Rosen, Alexandr
Other Authors:	Hana, Jiří; Vidová Hladká, Barbora
Format:	Electronic eBook
Language:	English
Published:	Prague : Karolinum Press, 2020.
Subjects:	Corpora (Linguistics); Czech language; Corpus (Linguistique); Tchèque (Langue)
Online Access:	Full text

Table of Contents:

Cover
Contents
List of abbreviations
Introduction
About this book
Reasons to study non-native Czech
Some properties of non-native Czech
Morphology
Syntax
Word segmentation
Learner corpus
Roadmap
Learner corpora
Terminology
Various types of learner corpora
The choice of texts
Annotation
Textual annotation
Linguistic annotation
Error annotation: correction
Error annotation: categorization
Annotation scheme
Data access
Some learner corpora
ASK
CLC
COPLE2
CroLTeC
Falko
ICLE
MERLIN
RLC
SweLL
Relationships of CzeSL with other learner corpora
Introducing the CzeSL project
Specifications of CzeSL
Intended usage
AKCES: the umbrella project
Procurement of texts
Text collection
Transcription
Anonymization
Metadata
Error annotation
Errors and learner language
More than one way to annotate errors in CzeSL
A wishlist for error annotation
Interference and other types of explanation
Interpretation in terms of TH
Word order
Style
Communication goal
The two-tier annotation scheme
Annotation scheme as a compromise
Why multiple tiers
How many tiers
Multiple tiers in a tabular format
Content of the tiers
A sample text with T1 vs. T2 corrections
Links between tiers
Error tags
Morphosyntactic references
Follow-up corrections
Alternative target hypotheses
Error tagset
Based on linguistic categories
Grammar-based vs. formal errors
Extent of the annotated unit
Grammar-based tags
Errors at T1
Errors at T2
Coarse-grained
An example of complex annotation
Evaluation of the manual tiered error annotation
Inter-annotator agreement (IAA)
A pilot annotation
IAA on all doubly-annotated texts
Error tags depend on target hypothesis
Possible causes of the annotators' disagreements
Formal tags
Automatic extension and modification of error annotation
Automatic detection of formal errors on T1
Formal orthographic errors
Formal errors sometimes influencing pronunciation
Formal errors influencing pronunciation
Other types of errors
Automatic classification of word-boundary errors
Implicit error annotation
Multi-dimensional error annotation (MD)
Focus on morphology
All annotation applied to the source text
Extent of the annotated unit
Alternative error domains
Source text, target hypothesis, annotated strings
Domains and features
Linguistic annotation
Annotation with tools for Standard Czech
Annotation of target hypothesis
Annotation of T1
Annotation of source texts
Annotation of interlanguage in UD
Tokenization
Part-of-speech and morphology
Lemmata
Syntactic structure
Evaluation
Annotation process
Overview of the annotation process
Transcription and anonymization of manuscripts
Tiered error annotation
Manual error annotation
