Python Web Scraping Cookbook : Over 90 proven recipes to get you scraping with Python, microservices, Docker and AWS.
Python Web Scraping Cookbook is a solution-focused book that teaches techniques for developing high-performance scrapers and for dealing with cookies, hidden form fields, Ajax-based sites, proxies, and more. By the end of this book, you will be able to scrape websites more efficiently with more accurat...
Classification: | Electronic Book |
---|---|
Main author: | |
Other authors: | |
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham : Packt Publishing, 2018. |
Subjects: | |
Online access: | Full text |
Table of Contents:
- Cover; Title Page; Copyright and Credits; Contributors; Packt Upsell; Table of Contents; Preface; Chapter 1: Getting Started with Scraping; Introduction; Setting up a Python development environment; Getting ready; How to do it ... ; Scraping Python.org with Requests and Beautiful Soup; Getting ready ... ; How to do it ... ; How it works ... ; Scraping Python.org in urllib3 and Beautiful Soup; Getting ready ... ; How to do it ... ; How it works; There's more ... ; Scraping Python.org with Scrapy; Getting ready ... ; How to do it ... ; How it works; Scraping Python.org with Selenium and PhantomJS.
- Getting ready; How to do it ... ; How it works; There's more ... ; Chapter 2: Data Acquisition and Extraction; Introduction; How to parse websites and navigate the DOM using BeautifulSoup; Getting ready; How to do it ... ; How it works; There's more ... ; Searching the DOM with Beautiful Soup's find methods; Getting ready; How to do it ... ; Querying the DOM with XPath and lxml; Getting ready; How to do it ... ; How it works; There's more ... ; Querying data with XPath and CSS selectors; Getting ready; How to do it ... ; How it works; There's more ... ; Using Scrapy selectors; Getting ready; How to do it ...
- How it works; There's more ... ; Loading data in unicode / UTF-8; Getting ready; How to do it ... ; How it works; There's more ... ; Chapter 3: Processing Data; Introduction; Working with CSV and JSON data; Getting ready; How to do it; How it works; There's more ... ; Storing data using AWS S3; Getting ready; How to do it; How it works; There's more ... ; Storing data using MySQL; Getting ready; How to do it; How it works; There's more ... ; Storing data using PostgreSQL; Getting ready; How to do it; How it works; There's more ... ; Storing data in Elasticsearch; Getting ready; How to do it; How it works.
- There's more ... ; How to build robust ETL pipelines with AWS SQS; Getting ready; How to do it - posting messages to an AWS queue; How it works; How to do it - reading and processing messages; How it works; There's more ... ; Chapter 4: Working with Images, Audio, and other Assets; Introduction; Downloading media content from the web; Getting ready; How to do it; How it works; There's more ... ; Parsing a URL with urllib to get the filename; Getting ready; How to do it; How it works; There's more ... ; Determining the type of content for a URL; Getting ready; How to do it; How it works.
- There's more ... ; Determining the file extension from a content type; Getting ready; How to do it; How it works; There's more ... ; Downloading and saving images to the local file system; How to do it; How it works; There's more ... ; Downloading and saving images to S3; Getting ready; How to do it; How it works; There's more ... ; Generating thumbnails for images; Getting ready; How to do it; How it works; Taking a screenshot of a website; Getting ready; How to do it; How it works; Taking a screenshot of a website with an external service; Getting ready; How to do it; How it works; There's more ...