Statistical learning for biomedical data
This accessible introduction to statistical learning machines explains the underlying principles in nontechnical language, with many examples and figures.
Classification: | Electronic book |
---|---|
Main author: | |
Other authors: | |
Format: | Electronic eBook |
Language: | English |
Published: | Cambridge ; New York : Cambridge University Press, 2011. |
Series: | Practical guides to biostatistics and epidemiology. |
Subjects: | |
Online access: | Full text |
Table of Contents:
- Part I. Introduction
- 1. Prologue
- 1.1. Machines that learn: some recent history
- 1.2. Twenty canonical questions
- 1.3. Outline of the book
- 1.4. A comment about example datasets
- 1.5. Software
- 2. The landscape of learning machines
- 2.1. Introduction
- 2.2. Types of data for learning machines
- 2.3. Will that be supervised or unsupervised?
- 2.4. An unsupervised example
- 2.5. More lack of supervision: where are the parents?
- 2.6. Engines, complex and primitive
- 2.7. Model richness means what, exactly?
- 2.8. Membership or probability of membership?
- 2.9. A taxonomy of machines?
- 2.10. A note of caution: one of many
- 2.11. Highlights from the theory
- 3. A mangle of machines
- 3.1. Introduction
- 3.2. Linear regression
- 3.3. Logistic regression
- 3.4. Linear discriminant
- 3.5. Bayes classifiers: regular and naïve
- 3.6. Logic regression
- 3.7. k-Nearest neighbors
- 3.8. Support vector machines
- 3.9. Neural networks
- 3.10. Boosting
- 3.11. Evolutionary and genetic algorithms
- 4. Three examples and several machines
- 4.1. Introduction
- 4.2. Simulated cholesterol data
- 4.3. Lupus data
- 4.4. Stroke data
- 4.5. Biomedical means unbalanced
- 4.6. Measures of machine performance
- 4.7. Linear analysis of cholesterol data
- 4.8. Nonlinear analysis of cholesterol data
- 4.9. Analysis of the lupus data
- 4.10. Analysis of the stroke data
- 4.11. Further analysis of the lupus and stroke data
- Part II. A machine toolkit
- 5. Logistic regression
- 5.1. Introduction
- 5.2. Inside and around the model
- 5.3. Interpreting the coefficients
- 5.4. Using logistic regression as a decision rule
- 5.5. Logistic regression applied to the cholesterol data
- 5.6. A cautionary note
- 5.7. Another cautionary note
- 5.8. Probability estimates and decision rules
- 5.9. Evaluating the goodness-of-fit of a logistic regression model
- 5.10. Calibrating a logistic regression
- 5.11. Beyond calibration
- 5.12. Logistic regression and reference models
- 6. A single decision tree
- 6.1. Introduction
- 6.2. Dropping down trees
- 6.3. Growing a tree
- 6.4. Selecting features, making splits
- 6.5. Good split, bad split
- 6.6. Finding good features for making splits
- 6.7. Misreading trees
- 6.8. Stopping and pruning rules
- 6.9. Using functions of the features
- 6.10. Unstable trees?
- 6.11. Variable importance: growing on trees?
- 6.12. Permuting for importance
- 6.13. The continuing mystery of trees
- 7. Random Forests: trees everywhere
- 7.1. Random Forests in less than five minutes
- 7.2. Random treks through the data
- 7.3. Random treks through the features
- 7.4. Walking through the forest
- 7.5. Weighted and unweighted voting
- 7.6. Finding subsets in the data using proximities
- 7.7. Applying Random Forests to the stroke data
- 7.8. Random Forests in the universe of machines
- Part III. Analysis fundamentals
- 8. Merely two variables
- 8.1. Introduction
- 8.2. Understanding correlations
- 8.3. Hazards of correlations
- 8.4. Correlations big and small
- 9. More than two variables
- 9.1. Introduction
- 9.2. Tiny problems, large consequences
- 9.3. Mathematics to the rescue?
- 9.4. Good models need not be unique
- 9.5. Contexts and coefficients
- 9.6. Interpreting and testing coefficients in models
- 9.7. Merging models, pooling lists, ranking features
- 10. Resampling methods
- 10.1. Introduction
- 10.2. The bootstrap
- 10.3. When the bootstrap works
- 10.4. When the bootstrap doesn't work
- 10.5. Resampling from a single group in different ways
- 10.6. Resampling from groups with unequal sizes
- 10.7. Resampling from small datasets
- 10.8. Permutation methods
- 10.9. Still more on permutation methods
- 11. Error analysis and model validation
- 11.1. Introduction
- 11.2. Errors? What errors?
- 11.3. Unbalanced data, unbalanced errors
- 11.4. Error analysis for a single machine
- 11.5. Cross-validation error estimation
- 11.6. Cross-validation or cross-training?
- 11.7. The leave-one-out method
- 11.8. The out-of-bag method
- 11.9. Intervals for error estimates for a single machine
- 11.10. Tossing random coins into the abyss
- 11.11. Error estimates for unbalanced data
- 11.12. Confidence intervals for comparing error values
- 11.13. Other measures of machine accuracy
- 11.14. Benchmarking and winning the lottery
- 11.15. Error analysis for predicting continuous outcomes
- Part IV. Machine strategies
- 12. Ensemble methods: let's take a vote
- 12.1. Pools of machines
- 12.2. Weak correlation with outcome can be good enough
- 12.3. Model averaging
- 13. Summary and conclusions
- 13.1. Where have we been?
- 13.2. So many machines
- 13.3. Binary decision or probability estimate?
- 13.4. Survival machines? Risk machines?
- 13.5. And where are we going?