Building Transformer Models with PyTorch 2.0

Prem Timsina


ISBN: 9789355517494
eISBN: 9789355519900
Rights: Worldwide
Edition: 2024
Pages: 310
Dimensions: 7.5 × 9.25 inches
Book Type: Paperback

This book covers the transformer architecture across a range of applications, including NLP, computer vision, speech processing, and predictive modeling with tabular data. It is a valuable resource for anyone looking to harness the power of the transformer architecture in their machine learning projects.

The book provides a step-by-step guide to building transformer models from scratch and fine-tuning pre-trained open-source models. It explores foundational model architectures, including GPT, ViT, Whisper, TabTransformer, and Stable Diffusion, along with the core principles for solving various problems with transformers. The book also covers transfer learning, model training, and fine-tuning, and discusses how to utilize recent models from Hugging Face. Additionally, it explores advanced topics such as model benchmarking, multimodal learning, reinforcement learning, and deploying and serving transformer models.

In short, this book offers a comprehensive guide to transformer models and their many applications.


  • Transformer architectures for single-modality and multimodal applications.
  • Practical guidelines to build and fine-tune transformer models.
  • Comprehensive code samples with detailed documentation.


  • Understand the core architecture of various foundational models, both unimodal and multimodal.
  • Follow a step-by-step approach to developing transformer-based Machine Learning models.
  • Utilize various open-source models to solve your business problems.
  • Train and fine-tune various open-source models using PyTorch 2.0 and the Hugging Face ecosystem.
  • Deploy and serve transformer models.
  • Apply best practices and guidelines for building transformer-based models.
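To give a flavor of the core mechanism the book builds on — scaled dot-product attention, the heart of the transformer architecture covered in Chapter 1 — here is a minimal, dependency-free Python sketch. It is illustrative only (the book itself works in PyTorch 2.0); the function names are our own, not the book's.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, on plain lists of lists."""
    d_k = len(K[0])
    output = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Each output row is a convex combination of the value rows.
        output.append([sum(w * v[j] for w, v in zip(weights, V))
                       for j in range(len(V[0]))])
    return output

# Tiny example: two tokens, two dimensions, self-attention (Q = K = V).
tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = scaled_dot_product_attention(tokens, tokens, tokens)
```

Because the attention weights for each query sum to one, every output row is a weighted average of the value rows; in the example above, each token's output leans toward its own value vector, since a token is most similar to itself.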


This book caters to data scientists, Machine Learning engineers, developers, and software architects interested in the world of generative AI.

  1. Transformer Architecture
  2. Hugging Face Ecosystem
  3. Transformer Model in PyTorch
  4. Transfer Learning with PyTorch and Hugging Face
  5. Large Language Models: BERT, GPT-3, and BART
  6. NLP Tasks with Transformers
  7. CV Model Anatomy: ViT, DETR, and DeiT
  8. Computer Vision Tasks with Transformers
  9. Speech Processing Model Anatomy: Whisper, SpeechT5, and Wav2Vec
  10. Speech Tasks with Transformers
  11. Transformer Architecture for Tabular Data Processing
  12. Transformers for Tabular Data Regression and Classification
  13. Multimodal Transformers, Architectures and Applications
  14. Reinforcement Learning for Transformers
  15. Model Export, Serving, and Deployment
  16. Transformer Model Interpretability and Experimental Visualization
  17. PyTorch Models: Best Practices and Debugging

Prem Timsina is the Director of Engineering at Mount Sinai Health Systems, where he oversees the development and implementation of machine learning data products. He has overseen multiple Machine Learning products that are used as clinical decision support tools at hospitals across New York City. With over 10 years of experience in the field, Dr. Timsina is a dedicated machine learning enthusiast who has tackled a variety of big data challenges using tools and techniques such as PyTorch, deep learning, generative AI, Apache Spark, and various NoSQL platforms. He has contributed to the field through more than 40 publications in machine learning, text mining, and big data analytics. He earned his Doctor of Science degree in Information Systems from Dakota State University.
