Multilingual corpora and multilingual corpus analysis.

This paper presents the metadata model of the EXMARaLDA system and its implementations. It will first take a look at existing metadata schemes for transcriptions of spoken language as well as written texts and emphasize their advantages and disadvantages. The paper will justify the decisions agai...

Bibliographic Details
Call Number:	Electronic Book
Main Author:	Velupillai, Viveka
Format:	Electronic eBook
Language:	English
Published:	[Place of publication not identified] : John Benjamins, 2012.
Series:	Hamburg studies on multilingualism ; v. 14
Subjects:	Multilingualism > Germany; Linguistic minorities > Germany; Contrastive linguistics; Corpora (Linguistics); Multilinguisme > Allemagne; Minorités linguistiques > Allemagne; Linguistique contrastive; Corpus (Linguistique); EDUCATION > Bilingual Education; Contrastive linguistics; Linguistic minorities; Multilingualism; Germany
Online Access:	Full text

Table of Contents:

Multilingual Corpora and Multilingual Corpus Analysis
Editorial page
Title page
LCC page
Dedication page
Table of contents
Introduction
Section 1. Learner and attrition corpora
The LeaP corpus: A multilingual corpus of spoken learner German and learner English
1. Introduction
2. LeaP corpus: Primary data
3. Corpus annotation
4. Corpus data format
5. Corpus search
6. Exploring fluency in second language learner speech with the LeaP corpus
7. Conclusion
References
Technological and methodological challenges in creating, annotating and sharing a learner corpus of
1. Introduction
2. The Hamburg Map Task Corpus
3. Manual interpretative annotation
4. Conclusion
References
Creation and analysis of a reading comprehension exercise corpus: Towards evaluating meaning in cont
1. Introduction
2. The Corpus of Reading Comprehension Exercises in German (CREG)
3. Corpus collection and the WELCOME tool
4. Inter-annotator agreement analysis for meaning assessment
5. Meaning assessment results
6. Avenues for future research
7. Summary
Acknowledgments
References
The ALeSKo learner corpus: Design, annotation, quantitative analyses
1. Introduction
2. Design of the corpus
3. Annotation layers
4. Quantitative descriptive analyses
5. Applications for the corpus
Acknowledgements
References
Corpora of spoken Spanish by simultaneous and successive German-Spanish bilingual and Spanish monoli
1. Introduction
2. Description of the corpora
3. Further research
Acknowledgements
References
Monolingual and bilingual phonoprosodic corpora of child German and child Spanish
1. Introduction
2. The PAIDUS corpus
3. The corpus PhonBLA
4. Concluding remarks
References
Pragmatic corpus analysis, exemplified by Turkish-German bilingual and monolingual data
1. An introductory note on methodology
2. The data: Corpus and constellation
3. Research questions and aspects of frequency
4. Procedures of quantitative analysis
5. Classification of search results
6. Contextual interpretation of the items
7. Extending the analysis: Interpretative procedures
8. Consequences and further research
Abbreviations and conventions
References
Corpus of Polish spoken in Germany: Collecting and analysing written & spoken data for investigating
1. Introduction
2. Participants of the study
3. Corpus design
4. Data acquisition and storage
5. Transcription
6. Corpus publication and reuse
References
The HABLA-Corpus (German-French and German-Italian)
1. Introduction
2. Research on simultaneous bilingualism and the weaker (heritage) language
3. Corpus design
4. Transcription
5. Availability
References
Appendix
Section 2. Language contact corpora
The Hamburg Corpus of Argentinean Spanish (HaCASpa)
1. Introduction
2. Argentinean Porteño Spanish as a contact variety: The role of multilingualism and Second Language
3. Corpus design
4. Main findings
5. Remaining issues
References
Ad hoc contact phenomena or established features of a contact variety? Evidence from corpus analysis
1. Introduction
2. The language situation on the Faroe Islands: Sociopolitical and linguistic factors
3. Written and spoken language in language contact
4. Corpus-based analyses of contact-induced transfer
5. The corpora
6. Case study: The use of subjunctions in conditional clauses in Faroe-Danish
7. Conclusion
References
Phonoprosodic corpus of spoken Catalan (PhonCAT)
1. Introduction: PhonCAT
2. Data collection
3. Data segmentation and coding
4. Collected data
5. Data analysis
6. Conclusions
References
Researching the intelligibility of a (German) dialect
1. Why passive knowledge of a dialect?
2. Focussing on language variation and its intelligibility in health care institutions
3. The corpus design
4. The annotation system
5. Evaluating the (results of) the annotating system
References
Annotating ambiguity: Insights from a corpus-based study on syntactic change in Old Swedish
1. Specific problems in historical corpora
2. The HaCOSSA corpus
3. Digital representation and linguistic annotation
4. Syntactic ambiguity in Old Swedish
5. Concluding remarks
References
Section 3. Interpreting corpora
Sharing community interpreting corpora: A pilot study
1. Introduction
2. Data for the pilot study
3. Technical heterogeneity of the data
4. Common platform for sharing the data: Integration of sound, text, and images
5. Common approaches to annotating the data
6. Conclusion and outlook
References
CoSi: A Corpus of Consecutive and Simultaneous Interpreting
1. Introduction
2. Corpus design
3. Corpus creation and editing
4. Corpus use
5. Getting access to the corpus
References
The corpus "Interpreting in Hospitals": Possible applications for research and communication trainin
1. The corpus "Interpreting in Hospitals": Design and background
2. The corpus "Interpreting in Hospitals" as a source for research
3. Using the corpus in communication trainings
4. Conclusions
References
Section 4. Comparable and parallel corpora
The GeWiss corpus: Comparing spoken academic German, English and Polish
1. Putting GeWiss into context: Motivation, aims and applications
2. The design of the GeWiss corpus
3. Data acquisition
4. Metadata
5. Transcription
6. Annotation
7. Perspectives
References
Corpora
Appendix 1
Appendix 2
Korpus C4: A distributed corpus of German varieties
1. A German variety corpus
2. Design of the Korpus C4
3. Corpus format and metadata
4. Access to the Korpus C4
5. Conclusion
References
Treebanks in translation studies: The CroCo Dependency Treebank
1. Introduction
2. The CroCo Dependency Bank
3. Treebanks in translation studies
4. Conclusion and outlook
References
Section 5. Corpus tools
Multilingual phonological corpus analysis: The tools behind the PhonBank Project
1. Introduction
2. PhonBank
3. Phon
4. A practical illustration
5. Outlook
References
Finding the balance between strict defaults and total openness: Collecting and managing metadata for
1. What is metadata?
2. Why metadata?
3. Metadata standards
4. ISLE Meta Data Initiative (IMDI)
5. Institut für Deutsche Sprache (IDS)
6. EXMARaLDA metadata
7. Using EXMARaLDA metadata
8. Possible enhancements to the toolset regarding metadata
9. Outlook
References
General index
Corpora index
Language index