Practical Weak Supervision

Most data scientists and engineers today rely on quality labeled data to train their machine learning models. But building training sets manually is time-consuming and expensive, leaving many companies with unfinished ML projects. There's a more practical approach. In this book, Amit Bahree, Se...

Bibliographic Details
Main Authors:	Tok, Wee (Author), Bahree, Amit (Author), Filipi, Senja (Author)
Corporate Author:	Safari, an O'Reilly Media Company
Format:	Electronic eBook
Language:	English
Published:	O'Reilly Media, Inc., 2021.
Edition:	1st edition.
Online Access:	Full text (requires prior registration with an institutional email address)

Table of Contents:

Intro
Copyright
Table of Contents
Foreword by Xuedong Huang
Foreword by Alex Ratner
Preface
Who Should Read This Book
Navigating This Book
Conventions Used in This Book
Using Code Examples
O'Reilly Online Learning
How to Contact Us
Acknowledgments
Chapter 1. Introduction to Weak Supervision
What Is Weak Supervision?
Real-World Weak Supervision with Snorkel
Approaches to Weak Supervision
Incomplete Supervision
Inexact Supervision
Inaccurate Supervision
Data Programming
Getting Training Data
How Data Programming Is Helping Accelerate Software 2.0
Summary
Chapter 2. Diving into Data Programming with Snorkel
Snorkel, a Data Programming Framework
Getting Started with Labeling Functions
Applying the Labels to the Datasets
Analyzing the Labeling Performance
Using a Validation Set
Reaching Labeling Consensus with LabelModel
Intuition Behind LabelModel
LabelModel Parameter Estimation
Strategies to Improve the Labeling Functions
Data Augmentation with Snorkel Transformers
Data Augmentation Through Word Removal
Snorkel Preprocessors
Data Augmentation Through GPT-2 Prediction
Data Augmentation Through Translation
Applying the Transformation Functions to the Dataset
Summary
Chapter 3. Labeling in Action
Labeling a Text Dataset: Identifying Fake News
Exploring the Fake News Detection (FakeNewsNet) Dataset
Importing Snorkel and Setting Up Representative Constants
Fact-Checking Sites
Is the Speaker a "Liar"?
Twitter Profile and Botometer Score
Generating Agreements Between Weak Classifiers
Labeling an Images Dataset: Determining Indoor Versus Outdoor Images
Creating a Dataset of Images from Bing
Defining and Training Weak Classifiers in TensorFlow
Training the Various Classifiers
Weak Classifiers out of Image Tags
Deploying the Computer Vision Service
Interacting with the Computer Vision Service
Preparing the DataFrame
Learning a LabelModel
Summary
Chapter 4. Using the Snorkel-Labeled Dataset for Text Classification
Getting Started with Natural Language Processing (NLP)
Transformers
Hard Versus Probabilistic Labels
Using ktrain for Performing Text Classification
Data Preparation
Dealing with an Imbalanced Dataset
Training the Model
Using the Text Classification Model for Prediction
Finding a Good Learning Rate
Using Hugging Face and Transformers
Loading the Relevant Python Packages
Dataset Preparation
Checking Whether GPU Hardware Is Available
Performing Tokenization
Model Training
Testing the Fine-Tuned Model
Summary
Chapter 5. Using the Snorkel-Labeled Dataset for Image Classification
Visual Object Recognition Overview
Representing Image Features
Transfer Learning for Computer Vision
Using PyTorch for Image Classification
