How to do linguistics with R : data exploration and statistical analysis /

Bibliographic Details
Classification: Electronic Book
Main author: Levshina, Natalia
Format: Electronic eBook
Language: English
Published: Amsterdam ; Philadelphia : John Benjamins Publishing Company, [2015]
Online access: Full text
Table of Contents:
  • How to do Linguistics with R
  • Title page
  • LCC data
  • Dedication page
  • Table of contents
  • Acknowledgements
  • Introduction
  • 1. Who is this book written for?
  • 2. The quantitative turn in linguistics
  • 3. How to use this textbook
  • 1. What is statistics?
  • What you will learn from this chapter:
  • 1.1 Statistics and statistics
  • 1.2 Formulating and testing your hypotheses
  • 1.2.1 Null and alternative hypotheses
  • 1.2.2 Those mysterious p-values…
  • 1.2.3 Type I and Type II errors
  • 1.2.4 One-tailed and two-tailed statistical tests
  • 1.3 What statistics cannot do for you
  • 1.4 Types of variables
  • 1.5 Summary
  • 2. Introduction to R
  • What you will learn from this chapter:
  • 2.1 Introduction
  • 2.2 Installation of the basic distribution and add-on packages
  • 2.3 First steps with R
  • 2.3.1 Starting R
  • 2.3.2 R syntax
  • 2.3.3 Exiting from R or terminating a process
  • 2.3.4 Getting help
  • 2.4 Main types of R objects
  • 2.5 RStudio
  • 2.6 Importing and exporting your data and saving your graphs
  • 2.6.1 Importing your data to R
  • 2.6.2 Exporting your data from R
  • 2.6.3 Saving your graphs
  • 2.7 Summary
  • 3. Descriptive statistics for quantitative variables
  • What you will learn from this chapter:
  • 3.1 Analysing the distribution of word lengths: Basic descriptive statistics
  • 3.1.1 The data
  • 3.1.2 Measures of central tendency
  • 3.1.3 Measures of dispersion
  • 3.2 Bad times, good times: Visualization of a distribution and finding outliers
  • 3.3 Zipf's law and word frequency: Transformation of quantitative variables
  • 3.4 Summary
  • 4. How to explore qualitative variables
  • What you will learn from this chapter:
  • 4.1 Frequency tables, proportions and percentages
  • 4.2 Visualization of categorical data
  • 4.3 Basic Colour Terms: Deviations of Proportions in subcorpora
  • 4.3.1 The data and hypothesis.
  • 4.3.2 Deviation of proportions as a measure of dispersion
  • 4.4 Summary
  • 5. Comparing two groups
  • What you will learn from this chapter:
  • 5.1 Comparing group means (medians): An overview of the tests
  • 5.2 Comparing the number of associations triggered by high- and low-frequency nouns with the help of an independent t-test
  • 5.2.1 Data and hypothesis
  • 5.2.2 Descriptive statistics and visualizations
  • 5.2.3 Choosing an appropriate test to compare the measures of central tendency in two groups
  • 5.2.4 Confidence intervals and standard errors
  • 5.3 Comparing concreteness scores of high- and low-frequency nouns with the help of a two-tailed Wilcoxon test
  • 5.3.1 Data and hypotheses
  • 5.3.2 Descriptive statistics and visualizations: Strip charts and rug plots
  • 5.3.3 Inferential statistics: The two-tailed Wilcoxon test
  • 5.4 Comparing associations produced by native and non-native speakers: A paired one-tailed t-test
  • 5.4.1 Creating simulation data
  • 5.4.2 Performing the paired t-test
  • 5.5 Summary
  • 6. Relationships between two numerical variables
  • What you will learn from this chapter:
  • 6.1 What is correlation?
  • 6.2 Word length and word recognition: The Pearson product-moment correlation coefficient
  • 6.2.1 The data and hypothesis
  • 6.2.2 Descriptive statistics and visualizations
  • 6.2.3 Testing the significance of the correlation coefficient
  • 6.3 Emergence of grammar from lexicon: Spearman's ρ and Kendall's τ.
  • 6.3.1 The data and hypothesis
  • 6.3.2 Exploring the data and computing correlation coefficients
  • 6.4 Visualization of correlations between more than two variables with the help of correlograms
  • 6.5 Summary
  • 7. More on frequencies and reaction times
  • What you will learn from this chapter:
  • 7.1 The basic principles of linear regression analysis.
  • 7.2 Putting several factors together: Predicting reaction times in a lexical decision task
  • 7.2.1 Data and hypotheses
  • 7.2.2 The lm() function and interpretation of its output
  • 7.2.3 Selecting the explanatory variables
  • 7.2.4 Checking for outliers and overly influential observations
  • 7.2.5 Checking the regression assumptions
  • 7.2.6 Testing and interpreting interactions
  • 7.2.7 Checking for overfitting
  • 7.2.8 Robust regression: Bootstrap
  • 7.3 Summary
  • 8. Finding differences between several groups
  • What you will learn from this chapter:
  • 8.1 What is ANOVA?
  • 8.2 Motion events in Nicaraguan Sign Language: Independent one-way ANOVA
  • 8.2.1 Theoretical background and data
  • 8.2.2 Exploring the data
  • 8.2.3 Assumptions of one-way parametric ANOVA
  • 8.2.4 Performing parametric one-way ANOVA
  • 8.2.5 Alternative tests
  • 8.2.6 Post-hoc tests
  • 8.3 Development of spatial modulations in Nicaraguan Sign Language: Independent factorial (two-way) ANOVA
  • 8.3.1 The data and hypothesis
  • 8.3.2 Descriptive statistics for different groups and interaction plot
  • 8.3.3 Assumptions of parametric factorial ANOVA
  • 8.3.4 ANOVA and orthogonal contrasts
  • 8.3.5 Alternative tests
  • 8.3.6 Post-hoc tests
  • 8.4 Do native English and native Mandarin Chinese speakers conceptualize time differently? Repeated-measures and mixed-design ANOVA (mixed GLM method)
  • 8.4.1 The data and hypothesis
  • 8.4.2 Fitting a mixed-design ANOVA with the help of mixed GLM
  • 8.4.3 Post-hoc tests
  • 8.5 Summary
  • 9. Measuring associations between two categorical variables
  • What you will learn from this chapter:
  • 9.1 Testing independence
  • 9.2 The story of over is not over: Metaphoric and non-metaphoric uses in two registers (analysis of a 2-by-2 contingency table)
  • 9.2.1 The data and hypothesis.
  • 9.2.2 Visualizations, proportions and measures of effect size: Odds ratios, Cramér's V and the φ coefficient
  • 9.2.3 Testing statistical significance: The χ²-test of independence
  • 9.3 Metaphorical and non-metaphorical uses of see in four registers (analysis of a 4-by-2 table)
  • 9.3.1 The data and hypothesis
  • 9.3.2 Descriptive statistics and visualizations
  • 9.3.3 Testing the statistical significance and analysing the residuals: The χ²-test and mosaic and association plots
  • 9.4 Summary
  • 10. Association measures
  • What you will learn from this chapter:
  • 10.1 Measures of association: A brief typology
  • 10.1.1 Frequencies that you will need in order to compute association measures
  • 10.1.2 Unidirectional (asymmetric) vs. bidirectional (symmetric) measures
  • 10.1.3 Contingency-based vs. non-contingency-based measures
  • 10.2 Case study: The Russian ditransitive construction and its collexemes
  • 10.2.1 Theoretical background and data
  • 10.2.2 Computation of some popular association measures
  • 10.3 Summary
  • 11. Geographic variation of quite: Distinctive collexeme analysis
  • What you will learn from this chapter:
  • 11.1 Introduction to distinctive collexeme analysis
  • 11.2 Distinctive collexeme analysis of quite + ADJ in different varieties of English
  • 11.2.1 Theoretical background and data
  • 11.2.2 Simple distinctive collexeme analysis of quite + ADJ in British and American English
  • 11.2.3 Multiple distinctive collexeme analysis: Quite + ADJ in the British, American and Canadian varieties of English
  • 11.3 Summary
  • 12. Probabilistic multifactorial grammar and lexicology
  • What you will learn from this chapter:
  • 12.1 Introduction to logistic regression
  • 12.2 Logistic regression model of Dutch causative auxiliaries doen and laten
  • 12.2.1 Theoretical background and data.
  • 12.2.2 Fitting a binary logistic regression model: Main functions
  • 12.2.3 Selection of variables
  • 12.2.4 Testing possible interactions
  • 12.2.5 Identifying outliers and overly influential observations
  • 12.2.6 Checking the regression assumptions
  • 12.2.7 Testing for overfitting
  • 12.2.8 Interpretation of the model
  • 12.3 Summary
  • 13. Multinomial (polytomous) logistic regression models of three and more near synonyms
  • What you will learn from this chapter:
  • 13.1 What is multinomial regression?
  • 13.2 Multinomial models of English permissive constructions
  • 13.2.1 Data and hypotheses
  • 13.2.2 Contrasting allow and permit with let
  • 13.2.3 'One vs. rest' approach
  • 13.3 Summary
  • 14. Conditional inference trees and random forests
  • What you will learn from this chapter:
  • 14.1 Conditional inference trees and random forests
  • 14.2 Conditional inference trees and random forests of three English causative constructions
  • 14.2.1 The data and hypotheses
  • 14.2.2 Fitting a conditional inference tree model
  • 14.2.3 Random forests
  • 14.3 Summary
  • 15. Behavioural profiles, distance metrics and cluster analysis
  • What you will learn from this chapter:
  • 15.1 What are Behavioural Profiles?
  • 15.2 Behavioural Profiles of English analytic causatives
  • 15.2.1 Data and theoretical background
  • 15.2.2 Computation of numeric BP vectors from the categorical data
  • 15.2.3 Distance matrix
  • 15.2.4 Hierarchical cluster analysis
  • 15.2.4.1 Identifying the clusters
  • 15.2.4.2 Interpretation of the cluster solution: Snake plots and effect size measures
  • 15.2.4.3 Validation of a cluster solution
  • 15.2.5 Partitioning methods
  • 15.2.5.1 General introduction
  • 15.2.5.2 Partitioning around centroids (k-means)
  • 15.2.5.3 Partitioning around medoids
  • 15.3 Summary
  • 16. Introduction to Semantic Vector Spaces.