
Introduction to statistical machine learning

Bibliographic Details
Classification: Electronic Book
Main Author: Sugiyama, Masashi (Author)
Format: Electronic eBook
Language: English
Published: Waltham, MA : Morgan Kaufmann, 2016.
Subjects: Machine learning > Statistical methods
Online Access: Full text

MARC

LEADER 00000cam a2200000 i 4500
001 SCIDIR_ocn930600944
003 OCoLC
005 20231120112038.0
006 m o d
007 cr cnu|||unuuu
008 151130s2016 mau ob 001 0 eng d
040 |a N$T  |b eng  |e rda  |e pn  |c N$T  |d YDXCP  |d EBLCP  |d CDX  |d IDEBK  |d COO  |d N$T  |d OCLCF  |d OPELS  |d UMI  |d OCLCQ  |d DEBSZ  |d DEBBG  |d OCLCQ  |d RRP  |d U3W  |d VT2  |d D6H  |d CEF  |d OCLCQ  |d WYU  |d OCLCQ  |d S2H  |d OCLCO  |d LVT  |d UUM  |d OCLCO  |d OCLCQ  |d OCLCO 
019 |a 930489423  |a 932322876  |a 1017877621  |a 1066527684  |a 1229588382  |a 1235824028 
020 |a 9780128023501  |q (electronic bk.) 
020 |a 0128023503  |q (electronic bk.) 
020 |z 9780128021217 
020 |z 0128021217  |q (pbk.) 
020 |z 9780128021217  |q (pbk.) 
035 |a (OCoLC)930600944  |z (OCoLC)930489423  |z (OCoLC)932322876  |z (OCoLC)1017877621  |z (OCoLC)1066527684  |z (OCoLC)1229588382  |z (OCoLC)1235824028 
050 4 |a Q325.5 
072 7 |a COM  |2 ukslc 
072 7 |a COM  |x 000000  |2 bisacsh 
082 0 4 |a 006.3/1  |2 23 
100 1 |a Sugiyama, Masashi,  |e author. 
245 1 0 |a Introduction to statistical machine learning /  |c Masashi Sugiyama. 
264 1 |a Waltham, MA :  |b Morgan Kaufmann,  |c 2016. 
264 4 |c ©2016 
300 |a 1 online resource 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
504 |a Includes bibliographical references and index. 
588 0 |a Online resource; title from PDF title page (ScienceDirect, viewed February 16, 2016). 
505 0 |a Front Cover -- Introduction to Statistical Machine Learning -- Copyright -- Table of Contents -- Biography -- Preface -- 1 INTRODUCTION -- 1 Statistical Machine Learning -- 1.1 Types of Learning -- 1.2 Examples of Machine Learning Tasks -- 1.2.1 Supervised Learning -- 1.2.2 Unsupervised Learning -- 1.2.3 Further Topics -- 1.3 Structure of This Textbook -- 2 STATISTICS AND PROBABILITY -- 2 Random Variables and Probability Distributions -- 2.1 Mathematical Preliminaries -- 2.2 Probability -- 2.3 Random Variable and Probability Distribution -- 2.4 Properties of Probability Distributions -- 2.4.1 Expectation, Median, and Mode -- 2.4.2 Variance and Standard Deviation -- 2.4.3 Skewness, Kurtosis, and Moments -- 2.5 Transformation of Random Variables -- 3 Examples of Discrete Probability Distributions -- 3.1 Discrete Uniform Distribution -- 3.2 Binomial Distribution -- 3.3 Hypergeometric Distribution -- 3.4 Poisson Distribution -- 3.5 Negative Binomial Distribution -- 3.6 Geometric Distribution -- 4 Examples of Continuous Probability Distributions -- 4.1 Continuous Uniform Distribution -- 4.2 Normal Distribution -- 4.3 Gamma Distribution, Exponential Distribution, and Chi-Squared Distribution -- 4.4 Beta Distribution -- 4.5 Cauchy Distribution and Laplace Distribution -- 4.6 t-Distribution and F-Distribution -- 5 Multidimensional Probability Distributions -- 5.1 Joint Probability Distribution -- 5.2 Conditional Probability Distribution -- 5.3 Contingency Table -- 5.4 Bayes' Theorem -- 5.5 Covariance and Correlation -- 5.6 Independence -- 6 Examples of Multidimensional Probability Distributions -- 6.1 Multinomial Distribution -- 6.2 Multivariate Normal Distribution -- 6.3 Dirichlet Distribution -- 6.4 Wishart Distribution -- 7 Sum of Independent Random Variables -- 7.1 Convolution -- 7.2 Reproductive Property -- 7.3 Law of Large Numbers. 
505 8 |a 7.4 Central Limit Theorem -- 8 Probability Inequalities -- 8.1 Union Bound -- 8.2 Inequalities for Probabilities -- 8.2.1 Markov's Inequality and Chernoff's Inequality -- 8.2.2 Cantelli's Inequality and Chebyshev's Inequality -- 8.3 Inequalities for Expectation -- 8.3.1 Jensen's Inequality -- 8.3.2 Hölder's Inequality and Schwarz's Inequality -- 8.3.3 Minkowski's Inequality -- 8.3.4 Kantorovich's Inequality -- 8.4 Inequalities for the Sum of Independent Random Variables -- 8.4.1 Chebyshev's Inequality and Chernoff's Inequality -- 8.4.2 Hoeffding's Inequality and Bernstein's Inequality -- 8.4.3 Bennett's Inequality -- 9 Statistical Estimation -- 9.1 Fundamentals of Statistical Estimation -- 9.2 Point Estimation -- 9.2.1 Parametric Density Estimation -- 9.2.2 Nonparametric Density Estimation -- 9.2.3 Regression and Classification -- 9.2.4 Model Selection -- 9.3 Interval Estimation -- 9.3.1 Interval Estimation for Expectation of Normal Samples -- 9.3.2 Bootstrap Confidence Interval -- 9.3.3 Bayesian Credible Interval -- 10 Hypothesis Testing -- 10.1 Fundamentals of Hypothesis Testing -- 10.2 Test for Expectation of Normal Samples -- 10.3 Neyman-Pearson Lemma -- 10.4 Test for Contingency Tables -- 10.5 Test for Difference in Expectations of Normal Samples -- 10.5.1 Two Samples without Correspondence -- 10.5.2 Two Samples with Correspondence -- 10.6 Nonparametric Test for Ranks -- 10.6.1 Two Samples without Correspondence -- 10.6.2 Two Samples with Correspondence -- 10.7 Monte Carlo Test -- 3 GENERATIVE APPROACH TO STATISTICAL PATTERN RECOGNITION -- 11 Pattern Recognition via Generative Model Estimation -- 11.1 Formulation of Pattern Recognition -- 11.2 Statistical Pattern Recognition -- 11.3 Criteria for Classifier Training -- 11.3.1 MAP Rule -- 11.3.2 Minimum Misclassification Rate Rule -- 11.3.3 Bayes Decision Rule -- 11.3.4 Discussion. 
505 8 |a 11.4 Generative and Discriminative Approaches -- 12 Maximum Likelihood Estimation -- 12.1 Definition -- 12.2 Gaussian Model -- 12.3 Computing the Class-Posterior Probability -- 12.4 Fisher's Linear Discriminant Analysis (FDA) -- 12.5 Hand-Written Digit Recognition -- 12.5.1 Preparation -- 12.5.2 Implementing Linear Discriminant Analysis -- 12.5.3 Multiclass Classification -- 13 Properties of Maximum Likelihood Estimation -- 13.1 Consistency -- 13.2 Asymptotic Unbiasedness -- 13.3 Asymptotic Efficiency -- 13.3.1 One-Dimensional Case -- 13.3.2 Multidimensional Cases -- 13.4 Asymptotic Normality -- 13.5 Summary -- 14 Model Selection for Maximum Likelihood Estimation -- 14.1 Model Selection -- 14.2 KL Divergence -- 14.3 AIC -- 14.4 Cross Validation -- 14.5 Discussion -- 15 Maximum Likelihood Estimation for Gaussian Mixture Model -- 15.1 Gaussian Mixture Model -- 15.2 MLE -- 15.3 Gradient Ascent Algorithm -- 15.4 EM Algorithm -- 16 Nonparametric Estimation -- 16.1 Histogram Method -- 16.2 Problem Formulation -- 16.3 KDE -- 16.3.1 Parzen Window Method -- 16.3.2 Smoothing with Kernels -- 16.3.3 Bandwidth Selection -- 16.4 NNDE -- 16.4.1 Nearest Neighbor Distance -- 16.4.2 Nearest Neighbor Classifier -- 17 Bayesian Inference -- 17.1 Bayesian Predictive Distribution -- 17.1.1 Definition -- 17.1.2 Comparison with MLE -- 17.1.3 Computational Issues -- 17.2 Conjugate Prior -- 17.3 MAP Estimation -- 17.4 Bayesian Model Selection -- 18 Analytic Approximation of Marginal Likelihood -- 18.1 Laplace Approximation -- 18.1.1 Approximation with Gaussian Density -- 18.1.2 Illustration -- 18.1.3 Application to Marginal Likelihood Approximation -- 18.1.4 Bayesian Information Criterion (BIC) -- 18.2 Variational Approximation -- 18.2.1 Variational Bayesian EM (VBEM) Algorithm -- 18.2.2 Relation to Ordinary EM Algorithm -- 19 Numerical Approximation of Predictive Distribution. 
505 8 |a 19.1 Monte Carlo Integration -- 19.2 Importance Sampling -- 19.3 Sampling Algorithms -- 19.3.1 Inverse Transform Sampling -- 19.3.2 Rejection Sampling -- 19.3.3 Markov Chain Monte Carlo (MCMC) Method -- 20 Bayesian Mixture Models -- 20.1 Gaussian Mixture Models -- 20.1.1 Bayesian Formulation -- 20.1.2 Variational Inference -- 20.1.3 Gibbs Sampling -- 20.2 Latent Dirichlet Allocation (LDA) -- 20.2.1 Topic Models -- 20.2.2 Bayesian Formulation -- 20.2.3 Gibbs Sampling -- 4 DISCRIMINATIVE APPROACH TO STATISTICAL MACHINE LEARNING -- 21 Learning Models -- 21.1 Linear-in-Parameter Model -- 21.2 Kernel Model -- 21.3 Hierarchical Model -- 22 Least Squares Regression -- 22.1 Method of LS -- 22.2 Solution for Linear-in-Parameter Model -- 22.3 Properties of LS Solution -- 22.4 Learning Algorithm for Large-Scale Data -- 22.5 Learning Algorithm for Hierarchical Model -- 23 Constrained LS Regression -- 23.1 Subspace-Constrained LS -- 23.2 ℓ2-Constrained LS -- 23.3 Model Selection -- 24 Sparse Regression -- 24.1 ℓ1-Constrained LS -- 24.2 Solving ℓ1-Constrained LS -- 24.3 Feature Selection by Sparse Learning -- 24.4 Various Extensions -- 24.4.1 Generalized ℓ1-Constrained LS -- 24.4.2 ℓp-Constrained LS -- 24.4.3 ℓ1+ℓ2-Constrained LS -- 24.4.4 ℓ1,2-Constrained LS -- 24.4.5 Trace Norm Constrained LS -- 25 Robust Regression -- 25.1 Nonrobustness of ℓ2-Loss Minimization -- 25.2 ℓ1-Loss Minimization -- 25.3 Huber Loss Minimization -- 25.3.1 Definition -- 25.3.2 Stochastic Gradient Algorithm -- 25.3.3 Iteratively Reweighted LS -- 25.3.4 ℓ1-Constrained Huber Loss Minimization -- 25.4 Tukey Loss Minimization -- 26 Least Squares Classification -- 26.1 Classification by LS Regression -- 26.2 0/1-Loss and Margin -- 26.3 Multiclass Classification -- 27 Support Vector Classification -- 27.1 Maximum Margin Classification -- 27.1.1 Hard Margin Support Vector Classification. 
505 8 |a 27.1.2 Soft Margin Support Vector Classification -- 27.2 Dual Optimization of Support Vector Classification -- 27.3 Sparseness of Dual Solution -- 27.4 Nonlinearization by Kernel Trick -- 27.5 Multiclass Extension -- 27.6 Loss Minimization View -- 27.6.1 Hinge Loss Minimization -- 27.6.2 Squared Hinge Loss Minimization -- 27.6.3 Ramp Loss Minimization -- 28 Probabilistic Classification -- 28.1 Logistic Regression -- 28.1.1 Logistic Model and MLE -- 28.1.2 Loss Minimization View -- 28.2 LS Probabilistic Classification -- 29 Structured Classification -- 29.1 Sequence Classification -- 29.2 Probabilistic Classification for Sequences -- 29.2.1 Conditional Random Field -- 29.2.2 MLE -- 29.2.3 Recursive Computation -- 29.2.4 Prediction for New Sample -- 29.3 Deterministic Classification for Sequences -- 5 FURTHER TOPICS -- 30 Ensemble Learning -- 30.1 Decision Stump Classifier -- 30.2 Bagging -- 30.3 Boosting -- 30.3.1 Adaboost -- 30.3.2 Loss Minimization View -- 30.4 General Ensemble Learning -- 31 Online Learning -- 31.1 Stochastic Gradient Descent -- 31.2 Passive-Aggressive Learning -- 31.2.1 Classification -- 31.2.2 Regression -- 31.3 Adaptive Regularization of Weight Vectors (AROW) -- 31.3.1 Uncertainty of Parameters -- 31.3.2 Classification -- 31.3.3 Regression -- 32 Confidence of Prediction -- 32.1 Predictive Variance for ℓ2-Regularized LS -- 32.2 Bootstrap Confidence Estimation -- 32.3 Applications -- 32.3.1 Time-series Prediction -- 32.3.2 Tuning Parameter Optimization -- 33 Semisupervised Learning -- 33.1 Manifold Regularization -- 33.1.1 Manifold Structure Brought by Input Samples -- 33.1.2 Computing the Solution -- 33.2 Covariate Shift Adaptation -- 33.2.1 Importance Weighted Learning -- 33.2.2 Relative Importance Weighted Learning -- 33.2.3 Importance Weighted Cross Validation -- 33.2.4 Importance Estimation. 
650 0 |a Machine learning  |x Statistical methods. 
650 6 |a Apprentissage automatique  |0 (CaQQLa)201-0131435  |x Méthodes statistiques.  |0 (CaQQLa)201-0373903 
650 7 |a COMPUTERS  |x General.  |2 bisacsh 
650 7 |a Machine learning  |x Statistical methods  |2 fast  |0 (OCoLC)fst01004801 
776 0 8 |i Print version:  |z 9780128021217 
856 4 0 |u https://sciencedirect.uam.elogim.com/science/book/9780128021217  |z Texto completo