Practical Machine Learning with Spark

Gourav Gupta, Dr. Manish Gupta, Dr. Inder Singh Gupta


ISBN: 9789391392086
eISBN: 9789391392130
Authors: Gourav Gupta, Dr. Manish Gupta, Dr. Inder Singh Gupta
Rights: Worldwide
Publishing Date: April 2022
Pages: 498
Dimension: 7.50 X 9.25
Book Type: Paperback
Explore the cosmic secrets of Distributed Processing for Deep Learning applications.

● In-depth practical demonstration of ML/DL concepts using a distributed framework.
● Covers graphical illustrations and visual explanations for ML/DL pipelines.
● Includes a live codebase for each of the NLP, computer vision, and machine learning applications.

This book provides the reader with an up-to-date explanation of Machine Learning and an in-depth, comprehensive, and straightforward understanding of the architectural techniques used to analyze data and derive predictive insights from it using Apache Spark.

The book walks readers through setting up Hadoop and Spark on-premises, in Docker, and on AWS. Readers will learn about Spark MLlib and how to use it in supervised and unsupervised machine learning scenarios. With the help of Spark, prominent technologies such as natural language processing and computer vision are evaluated and demonstrated in a realistic setting. Using the capabilities of Apache Spark, the book discusses the fundamental components that underlie natural language processing, computer vision, and machine learning, as well as how you can incorporate these technologies into your business processes.

Towards the end of the book, readers will learn about several deep learning frameworks, such as TensorFlow and PyTorch. Readers will also learn to execute distributed processing of deep learning problems using Spark.

● Learn how to get started with machine learning projects using Spark.
● See how to use Spark MLlib's design for machine learning and deep learning operations.
● Use Spark in tasks involving NLP, unsupervised learning, and computer vision.
● Experiment with Spark in a cloud environment and with AI pipeline workflows.
● Run deep learning applications on a distributed network.

This book is valuable for data engineers, machine learning engineers, data scientists, data architects, business analysts, and technical consultants worldwide. It would be beneficial to have some familiarity with the fundamentals of Hadoop and Python.


1. Introduction to Machine Learning

2. Apache Spark Environment Setup and Configuration

3. Apache Spark

4. Apache Spark MLlib

5. Supervised Learning with Spark

6. Un-Supervised Learning with Apache Spark

7. Natural Language Processing with Apache Spark

8. Recommendation Engine with Distributed Framework

9. Deep Learning with Spark

10. Computer Vision with Apache Spark

Mr. Gourav Gupta is a data specialist with 5+ years of experience in Big Data, Artificial Intelligence, Deep Learning, Augmented Intelligence, Internet of Things, and Digital Twin. He has worked on several interdisciplinary real-time projects that combine multiple digital technologies. His expertise is in architectural optimization and technical solutioning for Big Data, AI, Computer Vision, and Internet of Things. He also loves to write research articles and serves as a reviewer with Springer Journal.

LinkedIn Profile: Gourav Gupta

Dr. Manish Gupta is a 21st-century researcher, innovator, and entrepreneur. He completed his Ph.D. at the reputed Jawaharlal Nehru University, India. Presently, he works in the Department of Radiology, Perelman School of Medicine, University of Pennsylvania (UPENN), Philadelphia, USA. Prior to UPENN, Dr. Gupta worked at Gwangju Institute of Science and Technology, Gwangju, South Korea. In addition, he is a founding member and Chief Research Advisor of a digital healthcare startup (Arogya Pandit Private Limited) in India. He has filed patents and published several research articles in well-reputed SCI journals and international conferences/book chapters. His research interests include low-cost biosensor development, development and optimization of MRI pulse sequences, and MRI-based tumor classification using Machine Learning and Deep Learning. He is also working on several projects related to Big Data integration with Artificial Intelligence and Internet of Things. Dr. Gupta also loves to write poems and technical blogs.

LinkedIn Profile: Manish Gupta

Professor (Dr.) Inder Singh Gupta is a seismologist, statistician, mathematical modeler, and data science expert. He has 37+ years of experience in research, teaching, and serving as Principal Supervisor for many government-funded projects, along with numerous research publications in reputed international journals and conferences. He is also the author of many undergraduate and postgraduate mathematics books. He has retired from JVMGRR(PG) College, India, and currently serves as Chief Executive Officer of a digital healthcare startup (Arogya Pandit Private Limited, India) and as a visiting professor at the University of Adelaide, Australia.

LinkedIn Profile: Inder Singh Gupta
