Getting Structured Data from the Internet - 9781484265758

BOOKS etc.
(529058)
Registered as a business seller
£40.82
RRP £49.99 (manufacturer's recommended retail price)
Save £9.17 (18% OFF)
Condition:
New
5 available, 1 sold
This item will be sent through eBay's Global Shipping Programme.
Includes international tracking, simplified customs clearance, and no extra charges at delivery.
Postage:
£26.25 International Priority Shipping to United States via eBay's Global Shipping Programme
This amount includes seller-specified domestic postage charges as well as applicable international postage, dispatch, and other fees. This amount is subject to change until you make payment. For additional information, see the Global Shipping Programme terms and conditions.
Located in: Aldershot, United Kingdom
Import charges: 
Free amount confirmed at checkout
This amount includes applicable customs duties, taxes, brokerage and other fees. This amount is subject to change until you make payment. For additional information, see the Global Shipping Programme terms and conditions.
Delivery:
Estimated between Fri, 27 Jun and Mon, 7 Jul to 94104
Estimated delivery dates reflect the seller's dispatch time, origin postcode, destination postcode and time of order receipt, and will depend on the delivery service selected and receipt of cleared payment. Delivery times may vary, especially during peak periods, and are an estimate only.
Includes international tracking
Returns:
60 days return. Buyer pays for return postage. If you use an eBay delivery label, it will be deducted from your refund amount.
Payments:
    Diners Club
International postage and import charges paid to Pitney Bowes Inc.

Shop with confidence

eBay Money Back Guarantee
Get the item you ordered or your money back.
Seller assumes all responsibility for this listing.
eBay item number:285317158767
Last updated on 12 Jun, 2025 11:16:57 BST

Item specifics

Condition
New: A new, unread, unused book in perfect condition with no missing or damaged pages. See the ...
Book Title
Getting Structured Data from the Internet
ISBN
9781484265758
Publication Year
2020
Type
Textbook
Format
Paperback
Language
English
Publication Name
Getting Structured Data from the Internet: Running Web Crawlers/Scrapers on a Big Data Production Scale
Item Height
254 mm
Author
Jay M. Patel
Publisher
Apress
Subject
Computer Science
Item Weight
795 g
Item Width
178 mm
Number of Pages
397 Pages

About this product

Product Information

Utilize web scraping at scale to quickly get unlimited amounts of free data available on the web into a structured format. This book teaches you to use Python scripts to crawl through websites at scale, scrape data from HTML and JavaScript-enabled pages, and convert it into structured data formats such as CSV, Excel, or JSON, or load it into a SQL database of your choice. This book goes beyond the basics of web scraping and covers advanced topics such as natural language processing (NLP) and text analytics to extract names of people, places, email addresses, contact details, etc., from a page at production scale using distributed big data techniques on an Amazon Web Services (AWS)-based cloud infrastructure. The book covers developing a robust data processing and ingestion pipeline on the Common Crawl corpus, a publicly available web crawl data set containing petabytes of data and listed on AWS's registry of open data. Getting Structured Data from the Internet also includes a step-by-step tutorial on deploying your own crawlers using a production web scraping framework (such as Scrapy) and dealing with real-world issues (such as breaking Captcha, proxy IP rotation, and more). Code used in the book is provided to help you understand the concepts in practice and write your own web crawler to power your business ideas.
What You Will Learn

  • Understand web scraping, its applications/uses, and how to avoid web scraping by hitting publicly available REST API endpoints to directly get data
  • Develop a web scraper and crawler from scratch using the lxml and BeautifulSoup libraries, and learn about scraping from JavaScript-enabled pages using Selenium
  • Use AWS-based cloud computing with EC2, S3, Athena, SQS, and SNS to analyze, extract, and store useful insights from crawled pages
  • Use SQL on PostgreSQL running on Amazon Relational Database Service (RDS) and on SQLite using SQLAlchemy
  • Review scikit-learn, Gensim, and spaCy to perform NLP tasks on scraped web pages such as named entity recognition, topic clustering (K-means, agglomerative clustering), topic modeling (LDA, NMF, LSI), topic classification (naive Bayes, gradient boosting classifier), and text similarity (cosine distance-based nearest neighbors)
  • Handle web archival file formats and explore Common Crawl open data on AWS
  • Illustrate practical applications for web crawl data by building a similar-website tool and a technology profiler similar to builtwith.com
  • Write scripts to create a web-scale backlinks database similar to Ahrefs.com, Moz.com, Majestic.com, etc., for search engine optimization (SEO), competitor research, and determining website domain authority and ranking
  • Use web crawl data to build a news sentiment analysis system or alternative financial analysis covering stock market trading signals
  • Write a production-ready crawler in Python using the Scrapy framework and deal with practical workarounds for Captchas, IP rotation, and more

Who This Book Is For

  • Primary audience: data analysts and scientists with little to no exposure to real-world data processing challenges
  • Secondary: experienced software developers doing web-heavy data processing who need a primer
  • Tertiary: business owners and startup founders who need to know more about implementation to better direct their technical team

Product Identifiers

Publisher
Apress
ISBN-13
9781484265758
eBay Product ID (ePID)
16046645930

Additional Product Features

Country/Region of Manufacture
United States

Item description from the seller

Seller business information

I certify that all my selling activities will comply with all EU laws and regulations.
VAT number: GB 976952259
I provide invoices with VAT separately displayed.
CRN: 06687355
About this seller

BOOKS etc.

99.5% positive feedback, 1.8M items sold

Joined Apr 2013
Usually responds within 24 hours
Registered as a business seller
BOOKS etc. is a UK online business based in Aldershot, Hampshire. We are a family-run company that aims to bring customers a fantastic selection of books at great prices, with friendly ...

Detailed seller ratings

Average for the last 12 months
Accurate description
5.0
Reasonable postage cost
5.0
Delivery time
4.9
Communication
4.9

Seller Feedback (588,709)

  • u***t (2244)- Feedback left by buyer.
    Past 6 months
    Verified purchase
    A smooth and professional transaction throughout. The item was exactly as described, clearly listed, and fairly priced. Communication from the seller was prompt, polite, and helpful, with dispatch confirmed quickly. The parcel was securely packaged and arrived in excellent condition, ahead of the expected delivery date. Care was taken at every stage of the process. I would be happy to buy from this seller again—many thanks for a reliable and well-handled sale.
  • w***r (65)- Feedback left by buyer.
    Past 6 months
    Verified purchase
    Item as described and despatched in quick time. Item arrived within 3 days and was well packaged. Item was much cheaper than I found elsewhere online. I had a query about my order and the seller responded on the same day and sorted everything out. Would happily buy from seller again
  • 0***1 (1207)- Feedback left by buyer.
    Past 6 months
    Verified purchase
    The book was described accurately and is in brand new condition. It was one of the cheapest new copies on EBay. It was posted in very secure cardboard packaging and was delivered quite quickly. To be honest, buyer communications weren’t great but I don’t massively care about them as long as I get the item - which I did.