Data Processing and Modeling with Hadoop

Vinicius Aquino do Vale

SKU: 9789391392369


ISBN: 9789391392284
eISBN: 9789391392369
Authors: Vinicius Aquino do Vale
Rights: Worldwide
Publishing Date: October 2021
Pages: 198
Dimensions: 6 × 9 inches
Book Type: Paperback

Understand data in a simple way using a data lake.


  • In-depth practical demonstration of Hadoop/YARN concepts with numerous examples.
  • Includes graphical illustrations and visual explanations for Hadoop commands and parameters.
  • Includes details of dimensional modeling and Data Vault modeling.
  • Includes details of how to create and define a structure for a data lake.


The book 'Data Processing and Modeling with Hadoop' explains how a distributed system works and its benefits in the big data era in a straightforward and clear manner. After reading the book, you will be able to plan and organize projects involving a massive amount of data.

The book describes the standards and technologies that aid in data management and compares them to other technology business standards. The reader receives practical guidance on how to separate data into zones and how to develop a model that can support data evolution. It also discusses security and the measures used to mitigate security risks. Self-service analytics, Data Lake, Data Vault 2.0, and Data Mesh are covered as well.

After reading this book, the reader will have a thorough understanding of how to structure a data lake, as well as the ability to plan, organize, and carry out the implementation of a data-driven business with full governance and security.


  • Learn the basic components of the Hadoop ecosystem.
  • Understand the structure, files, and zones of a Data Lake.
  • Learn to implement the security part of the Hadoop Ecosystem.
  • Learn to work with Data Vault 2.0 modeling.
  • Learn to develop a strategy to define good governance.
  • Learn new tools to work with Data and Big Data.


This book caters to big data developers, technical specialists, consultants, and students who want to build good proficiency in big data. Basic knowledge of SQL, data modeling, and development is helpful but not mandatory.

  1. Understanding the Current Moment
  2. Defining the Zones
  3. The Importance of Modeling
  4. Massive Parallel Processing
  5. Doing ETL/ELT
  6. A Little Governance
  7. Talking About Security
  8. What Are the Next Steps?

Vinicius Aquino do Vale is an experienced technical consultant who has been working with clients and partners for 15 years in the design of technological solutions. In his career, Vinicius has participated in large projects as a specialist in Big Data technologies, with advanced knowledge of the Hadoop ecosystem. He has worked on several Big Data projects in the largest companies in Brazil, assisting in architecture design, implementation, configuration, ingestion, analysis, and ETL. He participated in the construction of all data lake / smart data flows, integrated the entire system with analytics tools such as Qlik Sense, QlikView, Tableau, Metabase, and TIBCO Spotfire, and implemented security integration with AD / LDAP.

Vinicius served as an MBA professor, teaching NoSQL, Data Ingestion and Parallel Mass Processing classes, as well as speaking at IT events for IT companies and communities. Vinicius is a PostgreSQL, MongoDB and Cassandra database specialist, as well as a Linux Server specialist: CentOS, Debian, Red Hat, SUSE. Vinicius dedicated years to his own improvement, obtaining international certifications such as: ITIL, LPIC (Linux), OCJP (Oracle Certified Professional, Java SE 6 Programmer), OCE-WCD (Oracle Certified Expert, Java EE 6 Web Component Developer), OCE-JPAD (Oracle Certified Expert, Java EE 6 Java Persistence API Developer), OCE-EJB (Oracle Certified Expert, Java EE 6 Enterprise JavaBeans Developer), Hadoop Administrator (Cloudera), PostgreSQL (EnterpriseDB).

He has extensive experience in Java development for the Web, working with several technologies and frameworks, and can lead and coordinate projects using agile methodologies such as Scrum and Kanban. In addition, Vinicius is proficient with public clouds such as Google Cloud (GCP), Azure, and AWS.

In 2014, Vinicius founded his own education company, Sudoers, where he is the Founder and Professor of Technology, helping, training and mentoring young people to pursue careers in technology.

