MARC

LEADER 00000cam a2200000Ia 4500
001 OR_ocn857306630
003 OCoLC
005 20231017213018.0
006 m o d
007 cr cnu---unuuu
008 130830s2013 caua ob 001 0 eng d
040 |a N$T  |b eng  |e pn  |c N$T  |d TEFOD  |d YDXCP  |d IAI  |d UPM  |d OCLCF  |d COO  |d REB  |d TEFOD  |d EBLCP  |d OCLCQ  |d LND  |d OCLCQ  |d U3W  |d UOK  |d NTG  |d DKU  |d OCLCQ  |d WYU  |d ERL  |d NRC  |d ORZ  |d OCLCQ  |d INARC  |d OCLCO  |d AAA  |d OCLCO  |d OCL  |d OCLCQ  |d OCLCO  |d ORMDA 
016 7 |a 016444020  |2 Uk 
019 |a 993892975  |a 1002139944  |a 1285479965 
020 |a 9781449374280  |q (electronic bk.) 
020 |a 144937428X  |q (electronic bk.) 
020 |a 9781449374297  |q (electronic bk.) 
020 |a 1449374298  |q (electronic bk.) 
020 |z 9781449361327  |q (pbk.) 
020 |z 1449361323  |q (pbk.) 
029 1 |a AU@  |b 000056680022 
029 1 |a NZ1  |b 15317448 
029 1 |a AU@  |b 000067106810 
035 |a (OCoLC)857306630  |z (OCoLC)993892975  |z (OCoLC)1002139944  |z (OCoLC)1285479965 
037 |a 389C332D-C8AB-4374-B90C-F90840F70518  |b OverDrive, Inc.  |n http://www.overdrive.com 
037 |a 9781449374273  |b O'Reilly Media 
050 4 |a QA76.9.D343  |b P76 2013eb 
072 7 |a COM  |x 021040  |2 bisacsh 
082 0 4 |a 005.74  |2 23 
049 |a UAMI 
100 1 |a Provost, Foster,  |d 1964- 
245 1 0 |a Data science for business :  |b what you need to know about data mining and data-analytic thinking /  |c Foster Provost & Tom Fawcett. 
250 |a 1st ed. 
260 |a Sebastopol, CA :  |b O'Reilly Media,  |c 2013. 
300 |a 1 online resource (xviii, 384 pages) :  |b illustrations 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
504 |a Includes bibliographical references (pages 359-366) and index. 
588 0 |a Online resource; title from digital title page (viewed on April 02, 2019). 
505 0 |a Machine generated contents note: 1. Introduction: Data-Analytic Thinking -- The Ubiquity of Data Opportunities -- Example: Hurricane Frances -- Example: Predicting Customer Churn -- Data Science, Engineering, and Data-Driven Decision Making -- Data Processing and "Big Data" -- From Big Data 1.0 to Big Data 2.0 -- Data and Data Science Capability as a Strategic Asset -- Data-Analytic Thinking -- This Book -- Data Mining and Data Science, Revisited -- Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist -- Summary -- 2. Business Problems and Data Science Solutions -- Fundamental concepts: A set of canonical data mining tasks; The data mining process; Supervised versus unsupervised data mining -- From Business Problems to Data Mining Tasks -- Supervised Versus Unsupervised Methods -- Data Mining and Its Results -- The Data Mining Process -- Business Understanding -- Data Understanding -- Data Preparation -- Modeling -- Evaluation -- Deployment -- Implications for Managing the Data Science Team -- Other Analytics Techniques and Technologies -- Statistics -- Database Querying -- Data Warehousing -- Regression Analysis -- Machine Learning and Data Mining -- Answering Business Questions with These Techniques -- Summary -- 3. Introduction to Predictive Modeling: From Correlation to Supervised Segmentation -- Fundamental concepts: Identifying informative attributes; Segmenting data by progressive attribute selection -- Exemplary techniques: Finding correlations; Attribute/variable selection; Tree induction -- Models, Induction, and Prediction -- Supervised Segmentation -- Selecting Informative Attributes -- Example: Attribute Selection with Information Gain -- Supervised Segmentation with Tree-Structured Models -- Visualizing Segmentations -- Trees as Sets of Rules -- Probability Estimation -- Example: Addressing the Churn Problem with Tree Induction -- Summary -- 4. Fitting a Model to Data -- Fundamental concepts: Finding "optimal" model parameters based on data; Choosing the goal for data mining; Objective functions; Loss functions -- Exemplary techniques: Linear regression; Logistic regression; Support-vector machines -- Classification via Mathematical Functions -- Linear Discriminant Functions -- Optimizing an Objective Function -- An Example of Mining a Linear Discriminant from Data -- Linear Discriminant Functions for Scoring and Ranking Instances -- Support Vector Machines, Briefly -- Regression via Mathematical Functions -- Class Probability Estimation and Logistic "Regression" -- Logistic Regression: Some Technical Details -- Example: Logistic Regression versus Tree Induction -- Nonlinear Functions, Support Vector Machines, and Neural Networks -- Summary -- 5. Overfitting and Its Avoidance -- Fundamental concepts: Generalization; Fitting and overfitting; Complexity control -- Exemplary techniques: Cross-validation; Attribute selection; Tree pruning; Regularization -- Generalization -- Overfitting -- Overfitting Examined -- Holdout Data and Fitting Graphs -- Overfitting in Tree Induction -- Overfitting in Mathematical Functions -- Example: Overfitting Linear Functions -- Example: Why Is Overfitting Bad? -- From Holdout Evaluation to Cross-Validation -- The Churn Dataset Revisited -- Learning Curves -- Overfitting Avoidance and Complexity Control -- Avoiding Overfitting with Tree Induction -- A General Method for Avoiding Overfitting -- Avoiding Overfitting for Parameter Optimization -- Summary -- 6. Similarity, Neighbors, and Clusters -- Fundamental concepts: Calculating similarity of objects described by data; Using similarity for prediction; Clustering as similarity-based segmentation -- Exemplary techniques: Searching for similar entities; Nearest neighbor methods; Clustering methods; Distance metrics for calculating similarity -- Similarity and Distance -- Nearest-Neighbor Reasoning -- Example: Whiskey Analytics -- Nearest Neighbors for Predictive Modeling -- How Many Neighbors and How Much Influence? -- Geometric Interpretation, Overfitting, and Complexity Control -- Issues with Nearest-Neighbor Methods -- Some Important Technical Details Relating to Similarities and Neighbors -- Heterogeneous Attributes -- Other Distance Functions -- Combining Functions: Calculating Scores from Neighbors -- Clustering -- Example: Whiskey Analytics Revisited -- Hierarchical Clustering -- Nearest Neighbors Revisited: Clustering Around Centroids -- Example: Clustering Business News Stories -- Understanding the Results of Clustering -- Using Supervised Learning to Generate Cluster Descriptions -- Stepping Back: Solving a Business Problem Versus Data Exploration -- Summary -- 7. Decision Analytic Thinking I: What Is a Good Model? -- Fundamental concepts: Careful consideration of what is desired from data science results; Expected value as a key evaluation framework; Consideration of appropriate comparative baselines -- Exemplary techniques: Various evaluation metrics; Estimating costs and benefits; Calculating expected profit; Creating baseline methods for comparison -- Evaluating Classifiers -- Plain Accuracy and Its Problems -- The Confusion Matrix -- Problems with Unbalanced Classes -- Problems with Unequal Costs and Benefits -- Generalizing Beyond Classification -- A Key Analytical Framework: Expected Value -- Using Expected Value to Frame Classifier Use -- Using Expected Value to Frame Classifier Evaluation -- Evaluation, Baseline Performance, and Implications for Investments in Data -- Summary -- 8. Visualizing Model Performance -- Fundamental concepts: Visualization of model performance under various kinds of uncertainty; Further consideration of what is desired from data mining results -- Exemplary techniques: Profit curves; Cumulative response curves; Lift curves; ROC curves -- Ranking Instead of Classifying -- Profit Curves -- ROC Graphs and Curves -- The Area Under the ROC Curve (AUC) -- Cumulative Response and Lift Curves -- Example: Performance Analytics for Churn Modeling -- Summary -- 9. Evidence and Probabilities -- Fundamental concepts: Explicit evidence combination with Bayes' Rule; Probabilistic reasoning via assumptions of conditional independence -- Exemplary techniques: Naive Bayes classification; Evidence lift -- Example: Targeting Online Consumers With Advertisements -- Combining Evidence Probabilistically -- Joint Probability and Independence -- Bayes' Rule -- Applying Bayes' Rule to Data Science -- Conditional Independence and Naive Bayes -- Advantages and Disadvantages of Naive Bayes -- A Model of Evidence "Lift" -- Example: Evidence Lifts from Facebook "Likes" -- Evidence in Action: Targeting Consumers with Ads -- Summary -- 10. Representing and Mining Text -- Fundamental concepts: The importance of constructing mining-friendly data representations; Representation of text for data mining -- Exemplary techniques: Bag of words representation; TFIDF calculation; N-grams; Stemming; Named entity extraction; Topic models -- Why Text Is Important -- Why Text Is Difficult -- Representation -- Bag of Words -- Term Frequency -- Measuring Sparseness: Inverse Document Frequency -- Combining Them: TFIDF -- Example: Jazz Musicians -- The Relationship of IDF to Entropy -- Beyond Bag of Words -- N-gram Sequences -- Named Entity Extraction -- Topic Models -- Example: Mining News Stories to Predict Stock Price Movement -- The Task -- The Data -- Data Preprocessing -- Results -- Summary -- 11. Decision Analytic Thinking II: Toward Analytical Engineering -- Fundamental concept: Solving business problems with data science starts with analytical engineering: designing an analytical solution, based on the data, tools, and techniques available -- Exemplary technique: Expected value as a framework for data science solution design -- Targeting the Best Prospects for a Charity Mailing -- The Expected Value Framework: Decomposing the Business Problem and Recomposing the Solution Pieces -- A Brief Digression on Selection Bias -- Our Churn Example Revisited with Even More Sophistication -- The Expected Value Framework: Structuring a More Complicated Business Problem -- Assessing the Influence of the Incentive -- From an Expected Value Decomposition to a Data Science Solution -- Summary -- 12. 
505 0 |a Other Data Science Tasks and Techniques -- Fundamental concepts: Our fundamental concepts as the basis of many common data science techniques; The importance of familiarity with the building blocks of data science -- Exemplary techniques: Association and co-occurrences; Behavior profiling; Link prediction; Data reduction; Latent information mining; Movie recommendation; Bias-variance decomposition of error; Ensembles of models; Causal reasoning from data -- Co-occurrences and Associations: Finding Items That Go Together -- Measuring Surprise: Lift and Leverage -- Example: Beer and Lottery Tickets -- Associations Among Facebook Likes -- Profiling: Finding Typical Behavior -- Link Prediction and Social Recommendation -- Data Reduction, Latent Information, and Movie Recommendation -- Bias, Variance, and Ensemble Methods -- Data-Driven Causal Explanation and a Viral Marketing Example -- Summary -- 13. Data Science and Business Strategy -- Fundamental concepts: Our principles as the basis of success for a data-driven business; Acquiring and sustaining competitive advantage via data science; The importance of careful curation of data science capability -- Thinking Data-Analytically, Redux -- Achieving Competitive Advantage with Data Science -- Sustaining Competitive Advantage with Data Science -- Formidable Historical Advantage -- Unique Intellectual Property -- Unique Intangible Collateral Assets -- Superior Data Scientists -- Superior Data Science Management -- Attracting and Nurturing Data Scientists and Their Teams -- Examine Data Science Case Studies -- Be Ready to Accept Creative Ideas from Any Source -- Be Ready to Evaluate Proposals for Data Science Projects -- Example Data Mining Proposal. 
505 0 |a Note continued: Flaws in the Big Red Proposal -- A Firm's Data Science Maturity -- 14. Conclusion -- The Fundamental Concepts of Data Science -- Applying Our Fundamental Concepts to a New Problem: Mining Mobile Device Data -- Changing the Way We Think about Solutions to Business Problems -- What Data Can't Do: Humans in the Loop, Revisited -- Privacy, Ethics, and Mining Data About Individuals -- Is There More to Data Science? -- Final Example: From Crowd-Sourcing to Cloud-Sourcing -- Final Words. 
520 8 |a Annotation  |b This broad, deep, but not-too-technical guide introduces you to the fundamental principles of data science and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. By learning data science principles, you will understand the many data-mining techniques in use today. More importantly, these principles underpin the processes and strategies necessary to solve business problems through data mining techniques. 
590 |a O'Reilly  |b O'Reilly Online Learning: Academic/Public Library Edition 
650 0 |a Data mining. 
650 0 |a Big data. 
650 0 |a Information science. 
650 0 |a Business  |x Data processing. 
650 0 |a Commerce. 
650 2 |a Data Mining 
650 2 |a Information Science 
650 2 |a Commerce 
650 2 |a Electronic Data Processing 
650 6 |a Exploration de données (Informatique) 
650 6 |a Données volumineuses. 
650 6 |a Sciences de l'information. 
650 6 |a Gestion  |x Informatique. 
650 6 |a Commerce. 
650 6 |a Informatique. 
650 7 |a information science.  |2 aat 
650 7 |a COMPUTERS  |x Database Management  |x Data Warehousing.  |2 bisacsh 
650 7 |a Sciences de l'information.  |2 eclas 
650 7 |a Commerce  |2 fast 
650 7 |a Big data  |2 fast 
650 7 |a Business  |x Data processing  |2 fast 
650 7 |a Data mining  |2 fast 
650 7 |a Information science  |2 fast 
650 7 |a Data Mining  |2 gnd 
650 7 |a Big Data  |2 gnd 
650 7 |a Business Intelligence  |2 gnd 
650 7 |a Data mining.  |2 nli 
650 7 |a Big data.  |2 nli 
650 7 |a Business  |x Data processing.  |2 nli 
700 1 |a Fawcett, Tom. 
776 0 8 |i Print version:  |a Provost, Foster, 1964-  |t Data science for business.  |d Sebastopol, Calif. : O'Reilly, 2013  |z 1449361323  |z 9781449361327  |w (OCoLC)844460899 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781449374273/?ar  |z Full text (requires prior registration with institutional email) 
938 |a ProQuest Ebook Central  |b EBLB  |n EBL1323973 
938 |a EBSCOhost  |b EBSC  |n 619895 
938 |a YBP Library Services  |b YANK  |n 10906129 
938 |a Internet Archive  |b INAR  |n datascienceforbu0000prov 
994 |a 92  |b IZTAP