Multilingual corpora and multilingual corpus analysis.
This paper presents the metadata model of the EXMARaLDA system and its implementations. It first reviews existing metadata schemes for transcriptions of spoken language and for written texts, highlighting their advantages and disadvantages. The paper will justify the decisions agai...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | [Place of publication not identified] : John Benjamins, 2012. |
Series: | Hamburg studies on multilingualism ; v. 14 |
Subjects: | |
Online access: | Full text |
Table of Contents:
- Multilingual Corpora and Multilingual Corpus Analysis
- Editorial page
- Title page
- LCC page
- Dedication page
- Table of contents
- Introduction
- Section 1. Learner and attrition corpora
- The LeaP corpus: A multilingual corpus of spoken learner German and learner English
- 1. Introduction
- 2. LeaP corpus: Primary data
- 3. Corpus annotation
- 4. Corpus data format
- 5. Corpus search
- 6. Exploring fluency in second language learner speech with the LeaP corpus
- 7. Conclusion
- References
- Technological and methodological challenges in creating, annotating and sharing a learner corpus of
- 1. Introduction
- 2. The Hamburg Map Task Corpus
- 3. Manual interpretative annotation
- 4. Conclusion
- References
- Creation and analysis of a reading comprehension exercise corpus: Towards evaluating meaning in cont
- 1. Introduction
- 2. The Corpus of Reading Comprehension Exercises in German (CREG)
- 3. Corpus collection and the WELCOME tool
- 4. Inter-annotator agreement analysis for meaning assessment
- 5. Meaning assessment results
- 6. Avenues for future research
- 7. Summary
- Acknowledgments
- References
- The ALeSKo learner corpus: Design - annotation - quantitative analyses
- 1. Introduction
- 2. Design of the corpus
- 3. Annotation layers
- 4. Quantitative descriptive analyses
- 5. Applications for the corpus
- Acknowledgements
- References
- Corpora of spoken Spanish by simultaneous and successive German-Spanish bilingual and Spanish monoli
- 1. Introduction
- 2. Description of the corpora
- 3. Further research
- Acknowledgements
- References
- Monolingual and bilingual phonoprosodic corpora of child German and child Spanish
- 1. Introduction
- 2. The PAIDUS corpus
- 3. The corpus PhonBLA
- 4. Concluding remarks
- References
- Pragmatic corpus analysis, exemplified by Turkish-German bilingual and monolingual data
- 1. An introductory note on methodology
- 2. The data: Corpus and constellation
- 3. Research questions and aspects of frequency
- 4. Procedures of quantitative analysis
- 5. Classification of search results
- 6. Contextual interpretation of the items
- 7. Extending the analysis: Interpretative procedures
- 8. Consequences and further research
- Abbreviations and conventions
- References
- Corpus of Polish spoken in Germany: Collecting and analysing written & spoken data for investigating
- 1. Introduction
- 2. Participants of the study
- 3. Corpus design
- 4. Data acquisition and storage
- 5. Transcription
- 6. Corpus publication and reuse
- References
- The HABLA-Corpus (German-French and German-Italian)
- 1. Introduction
- 2. Research on simultaneous bilingualism and the weaker (heritage) language
- 3. Corpus design
- 4. Transcription
- 5. Availability
- References
- Appendix
- Section 2. Language contact corpora
- The Hamburg Corpus of Argentinean Spanish (HaCASpa)
- 1. Introduction
- 2. Argentinean Porteño Spanish as a contact variety: The role of multilingualism and Second Language
- 3. Corpus design
- 4. Main findings
- 5. Remaining issues
- References
- Ad hoc contact phenomena or established features of a contact variety? Evidence from corpus analysis
- 1. Introduction
- 2. The language situation on the Faroe Islands: Sociopolitical and linguistic factors
- 3. Written and spoken language in language contact
- 4. Corpus-based analyses of contact-induced transfer
- 5. The corpora
- 6. Case study: The use of subjunctions in conditional clauses in Faroe-Danish
- 7. Conclusion
- References
- Phonoprosodic corpus of spoken Catalan (PhonCAT)
- 1. Introduction: PhonCAT
- 2. Data collection
- 3. Data segmentation and coding
- 4. Collected data
- 5. Data analysis
- 6. Conclusions
- References
- Researching the intelligibility of a (German) dialect
- 1. Why passive knowledge of a dialect?
- 2. Focussing on language variation and its intelligibility in health care institutions
- 3. The corpus design
- 4. The annotation system
- 5. Evaluating the (results of) the annotating system
- References
- Annotating ambiguity: Insights from a corpus-based study on syntactic change in Old Swedish
- 1. Specific problems in historical corpora
- 2. The HaCOSSA corpus
- 3. Digital representation and linguistic annotation
- 4. Syntactic ambiguity in Old Swedish
- 5. Concluding remarks
- References
- Section 3. Interpreting corpora
- Sharing community interpreting corpora: A pilot study
- 1. Introduction
- 2. Data for the pilot study
- 3. Technical heterogeneity of the data
- 4. Common platform for sharing the data: Integration of sound, text, and images
- 5. Common approaches to annotating the data
- 6. Conclusion and outlook
- References
- CoSi - A Corpus of Consecutive and Simultaneous Interpreting
- 1. Introduction
- 2. Corpus design
- 3. Corpus creation and editing
- 4. Corpus use
- 5. Getting access to the corpus
- References
- The corpus "Interpreting in Hospitals": Possible applications for research and communication trainin
- 1. The corpus "Interpreting in Hospitals": Design and background
- 2. The corpus "Interpreting in Hospitals" as a source for research
- 3. Using the corpus in communication trainings
- 4. Conclusions
- References
- Section 4. Comparable and parallel corpora
- The GeWiss corpus: Comparing spoken academic German, English and Polish
- 1. Putting GeWiss into context: Motivation, aims and applications
- 2. The design of the GeWiss corpus
- 3. Data acquisition
- 4. Metadata
- 5. Transcription
- 6. Annotation
- 7. Perspectives
- References
- Corpora
- Appendix 1
- Appendix 2
- Korpus C4: A distributed corpus of German varieties
- 1. A German variety corpus
- 2. Design of the Korpus C4
- 3. Corpus format and metadata
- 4. Access to the Korpus C4
- 5. Conclusion
- References
- Treebanks in translation studies: The CroCo Dependency Treebank
- 1. Introduction
- 2. The CroCo Dependency Bank
- 3. Treebanks in translation studies
- 4. Conclusion and outlook
- References
- Section 5. Corpus tools
- Multilingual phonological corpus analysis: The tools behind the PhonBank Project
- 1. Introduction
- 2. PhonBank
- 3. Phon
- 4. A practical illustration
- 5. Outlook
- References
- Finding the balance between strict defaults and total openness: Collecting and managing metadata for
- 1. What is metadata?
- 2. Why metadata?
- 3. Metadata standards
- 4. ISLE Meta Data Initiative (IMDI)
- 5. Institut für Deutsche Sprache (IDS)
- 6. EXMARaLDA metadata
- 7. Using EXMARaLDA metadata
- 8. Possible enhancements to the toolset regarding metadata
- 9. Outlook
- References
- General index
- Corpora index
- Language index