How to do linguistics with R : data exploration and statistical analysis /
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: |
Amsterdam ; Philadelphia :
John Benjamins Publishing Company,
[2015]
|
Subjects: | |
Online access: | Full text |
Table of contents:
- How to do Linguistics with R
- Title page
- LCC data
- Dedication page
- Table of contents
- Acknowledgements
- Introduction
- 1. Who is this book written for?
- 2. The quantitative turn in linguistics
- 3. How to use this textbook
- 1. What is statistics?
- What you will learn from this chapter:
- 1.1 Statistics and statistics
- 1.2 Formulating and testing your hypotheses
- 1.2.1 Null and alternative hypotheses
- 1.2.2 Those mysterious p-values…
- 1.2.3 Type I and Type II errors
- 1.2.4 One-tailed and two-tailed statistical tests
- 1.3 What statistics cannot do for you
- 1.4 Types of variables
- 1.5 Summary
- 2. Introduction to R
- What you will learn from this chapter:
- 2.1 Introduction
- 2.2 Installation of the basic distribution and add-on packages
- 2.3 First steps with R
- 2.3.1 Starting R
- 2.3.2 R syntax
- 2.3.3 Exiting from R or terminating a process
- 2.3.4 Getting help
- 2.4 Main types of R objects
- 2.5 RStudio
- 2.6 Importing and exporting your data and saving your graphs
- 2.6.1 Importing your data to R
- 2.6.2 Exporting your data from R
- 2.6.3 Saving your graphs
- 2.7 Summary
- 3. Descriptive statistics for quantitative variables
- What you will learn from this chapter:
- 3.1 Analysing the distribution of word lengths: Basic descriptive statistics
- 3.1.1 The data
- 3.1.2 Measures of central tendency
- 3.1.3 Measures of dispersion
- 3.2 Bad times, good times: Visualization of a distribution and finding outliers
- 3.3 Zipf's law and word frequency: Transformation of quantitative variables
- 3.4 Summary
- 4. How to explore qualitative variables
- What you will learn from this chapter:
- 4.1 Frequency tables, proportions and percentages
- 4.2 Visualization of categorical data
- 4.3 Basic Colour Terms: Deviations of Proportions in subcorpora
- 4.3.1 The data and hypothesis.
- 4.3.2 Deviation of proportions as a measure of dispersion
- 4.4 Summary
- 5. Comparing two groups
- What you will learn from this chapter:
- 5.1 Comparing group means (medians): An overview of the tests
- 5.2 Comparing the number of associations triggered by high- and low-frequency nouns with the help of an independent t-test
- 5.2.1 Data and hypothesis
- 5.2.2 Descriptive statistics and visualizations
- 5.2.3 Choosing an appropriate test to compare the measures of central tendency in two groups
- 5.2.4 Confidence intervals and standard errors
- 5.3 Comparing concreteness scores of high- and low-frequency nouns with the help of a two-tailed Wilcoxon test
- 5.3.1 Data and hypotheses
- 5.3.2 Descriptive statistics and visualizations: Strip charts and rug plots
- 5.3.3 Inferential statistics: The two-tailed Wilcoxon test
- 5.4 Comparing associations produced by native and non-native speakers: A paired one-tailed t-test
- 5.4.1 Creating simulation data
- 5.4.2 Performing the paired t-test
- 5.5 Summary
- 6. Relationships between two numerical variables
- What you will learn from this chapter:
- 6.1 What is correlation?
- 6.2 Word length and word recognition: The Pearson product-moment correlation coefficient
- 6.2.1 The data and hypothesis
- 6.2.2 Descriptive statistics and visualizations
- 6.2.3 Testing the significance of the correlation coefficient
- 6.3 Emergence of grammar from lexicon: Spearman's ρ and Kendall's τ.
- 6.3.1 The data and hypothesis
- 6.3.2 Exploring the data and computing correlation coefficients
- 6.4 Visualization of correlations between more than two variables with the help of correlograms
- 6.5 Summary
- 7. More on frequencies and reaction times
- What you will learn from this chapter:
- 7.1 The basic principles of linear regression analysis.
- 7.2 Putting several factors together: Predicting reaction times in a lexical decision task
- 7.2.1 Data and hypotheses
- 7.2.2 The lm() function and interpretation of its output
- 7.2.3 Selecting the explanatory variables
- 7.2.4 Checking for outliers and overly influential observations
- 7.2.5 Checking the regression assumptions
- 7.2.6 Testing and interpreting interactions
- 7.2.7 Checking for overfitting
- 7.2.8 Robust regression: Bootstrap
- 7.3 Summary
- 8. Finding differences between several groups
- What you will learn from this chapter:
- 8.1 What is ANOVA?
- 8.2 Motion events in Nicaraguan Sign Language: Independent one-way ANOVA
- 8.2.1 Theoretical background and data
- 8.2.2 Exploring the data
- 8.2.3 Assumptions of one-way parametric ANOVA
- 8.2.4 Performing parametric one-way ANOVA
- 8.2.5 Alternative tests
- 8.2.6 Post-hoc tests
- 8.3 Development of spatial modulations in Nicaraguan Sign Language: Independent factorial (two-way) ANOVA
- 8.3.1 The data and hypothesis
- 8.3.2 Descriptive statistics for different groups and interaction plot
- 8.3.3 Assumptions of parametric factorial ANOVA
- 8.3.4 ANOVA and orthogonal contrasts
- 8.3.5 Alternative tests
- 8.3.6 Post-hoc tests
- 8.4 Do native English and native Mandarin Chinese speakers conceptualize time differently? Repeated-measures and mixed-design ANOVA (mixed GLM method)
- 8.4.1 The data and hypothesis
- 8.4.2 Fitting a mixed-design ANOVA with the help of mixed GLM
- 8.4.3 Post-hoc tests
- 8.5 Summary
- 9. Measuring associations between two categorical variables
- What you will learn from this chapter:
- 9.1 Testing independence
- 9.2 The story of over is not over: Metaphoric and non-metaphoric uses in two registers (analysis of a 2-by-2 contingency table)
- 9.2.1 The data and hypothesis.
- 9.2.2 Visualizations, proportions and measures of effect size: Odds ratios, Cramér's V and the φ coefficient
- 9.2.3 Testing statistical significance: The χ²-test of independence
- 9.3 Metaphorical and non-metaphorical uses of see in four registers (analysis of a 4-by-2 table)
- 9.3.1 The data and hypothesis
- 9.3.2 Descriptive statistics and visualizations
- 9.3.3 Testing the statistical significance and analysing the residuals: The χ²-test and mosaic and association plots
- 9.4 Summary
- 10. Association measures
- What you will learn from this chapter:
- 10.1 Measures of association: A brief typology
- 10.1.1 Frequencies that you will need in order to compute association measures
- 10.1.2 Unidirectional (asymmetric) vs. bidirectional (symmetric) measures
- 10.1.3 Contingency-based vs. non-contingency-based measures
- 10.2 Case study: The Russian ditransitive construction and its collexemes
- 10.2.1 Theoretical background and data
- 10.2.2 Computation of some popular association measures
- 10.3 Summary
- 11. Geographic variation of quite: Distinctive collexeme analysis
- What you will learn from this chapter:
- 11.1 Introduction to distinctive collexeme analysis
- 11.2 Distinctive collexeme analysis of quite + ADJ in different varieties of English
- 11.2.1 Theoretical background and data
- 11.2.2 Simple distinctive collexeme analysis of quite + ADJ in British and American English
- 11.2.3 Multiple distinctive collexeme analysis: Quite + ADJ in the British, American and Canadian varieties of English
- 11.3 Summary
- 12. Probabilistic multifactorial grammar and lexicology
- What you will learn from this chapter:
- 12.1 Introduction to logistic regression
- 12.2 Logistic regression model of Dutch causative auxiliaries doen and laten
- 12.2.1 Theoretical background and data.
- 12.2.2 Fitting a binary logistic regression model: Main functions
- 12.2.3 Selection of variables
- 12.2.4 Testing possible interactions
- 12.2.5 Identifying outliers and overly influential observations
- 12.2.6 Checking the regression assumptions
- 12.2.7 Testing for overfitting
- 12.2.8 Interpretation of the model
- 12.3 Summary
- 13. Multinomial (polytomous) logistic regression models of three and more near synonyms
- What you will learn from this chapter:
- 13.1 What is multinomial regression?
- 13.2 Multinomial models of English permissive constructions
- 13.2.1 Data and hypotheses
- 13.2.2 Contrasting allow and permit with let
- 13.2.3 'One vs. rest' approach
- 13.3 Summary
- 14. Conditional inference trees and random forests
- What you will learn from this chapter:
- 14.1 Conditional inference trees and random forests
- 14.2 Conditional inference trees and random forests of three English causative constructions
- 14.2.1 The data and hypotheses
- 14.2.2 Fitting a conditional inference tree model
- 14.2.3 Random forests
- 14.3 Summary
- 15. Behavioural profiles, distance metrics and cluster analysis
- What you will learn from this chapter:
- 15.1 What are Behavioural Profiles?
- 15.2 Behavioural Profiles of English analytic causatives
- 15.2.1 Data and theoretical background
- 15.2.2 Computation of numeric BP vectors from the categorical data
- 15.2.3 Distance matrix
- 15.2.4 Hierarchical cluster analysis
- 15.2.4.1 Identifying the clusters
- 15.2.4.2 Interpretation of the cluster solution: Snake plots and effect size measures
- 15.2.4.3 Validation of a cluster solution
- 15.2.5 Partitioning methods
- 15.2.5.1 General introduction
- 15.2.5.2 Partitioning around centroids (k-means)
- 15.2.5.3 Partitioning around medoids
- 15.3 Summary
- 16. Introduction to Semantic Vector Spaces.