Explore different web mining techniques to discover patterns, structures, and information from the web
- Get a complete overview of the basic and advanced concepts of Web mining.
- Work with easy-to-use open-source Python libraries for Web mining.
- Get familiar with the main application areas of Web mining.
Data Science is one of the fastest-growing fields across the globe and is predicted to create 11.5 million jobs by 2026, so job seekers with this skill set have many opportunities. One of the most sought-after areas of Data Science is mining information from the web. If you are an aspiring Data Scientist looking to learn different Web mining techniques, then this book is for you.
This book starts by covering the key concepts of Web mining and its taxonomy. It then explores the basics of Web scraping, its uses, and its components, followed by topics such as the legal aspects of scraping, data extraction and pre-processing, scraping dynamic websites, and CAPTCHA. The book also introduces you to the concepts of Opinion mining and Web structure mining. Furthermore, it covers Web graph mining, Web information extraction, Web search and hyperlinks, Hyperlink-Induced Topic Search (HITS), and the partitioning algorithms used for Web mining. Towards the end, the book teaches you different mining techniques for discovering interesting usage patterns in Web data.
By the end of the book, you will master the art of data extraction using Python.
WHAT YOU WILL LEARN
- Learn how to scrape data from any website with Python.
- Get familiar with the concepts of Opinion Mining and Sentiment Analysis.
- Use Web structure mining to discover structure information from the web.
- Learn how to collect and analyze social media data using Python.
- Use Web usage mining for predicting users' browsing behaviors.
WHO THIS BOOK IS FOR
The book is for anyone who wants to learn Web mining. Aspiring Data Scientists, Data Engineers, and Data Analysts who want to master Web mining will find this book very helpful.