
SRE with AIOps

Regular price $39.95
Tax included. Shipping calculated at checkout.
In stock (100 units), ready to be shipped

ISBN: 9789378542343
eISBN: 9789378548642
Authors: Sunny Behl, Giridhar Kanikarapu
Rights: Worldwide
Edition: 2026
Pages: 282
Dimensions: 7.5 x 9.25 inches
Book Type: Paperback

As digital ecosystems grow more complex and customer expectations reach new heights, the convergence of site reliability engineering (SRE) and artificial intelligence for IT operations (AIOps) is redefining how modern enterprises ensure resilience, performance, and reliability at scale. Intelligent automation and data-driven operations are no longer optional; they are the foundation of competitive advantage. This book is your essential guide to merging these two powerful disciplines to build faster, smarter, and more resilient operations.

The book begins with the foundational principles of SRE (SLOs, SLIs, error budgets, and toil reduction) before progressing through AIOps tooling, observability, and the unified knowledge base. Readers then explore intelligent incident management, change and problem management, advanced anomaly detection using autoencoders and isolation forests, causal inference for root cause analysis, and the AIOps-powered SRE assistant. Later chapters cover chaos engineering, generative AI-powered SRE chatbots, and enterprise-scale AIOps adoption, culminating in a strategic roadmap for autonomous operations, predictive governance, and the role of LLMs and agentic AI in the future of reliability engineering.

By the end of this book, readers will possess both the strategic mindset and the technical depth to architect, lead, and scale intelligent operations. Whether you are an SRE practitioner, IT architect, or technology leader, you will be equipped to move from reactive firefighting to proactive, self-healing operations, delivering measurable reliability and business impact.

WHAT YOU WILL LEARN
● Apply SRE principles, SLOs, SLIs, and error budgets effectively.
● Evaluate and operationalize AIOps platforms for SRE goals.
● Build unified observability models from logs, metrics, and traces.
● Automate incident triage, correlation, and postmortem workflows.
● Deploy advanced anomaly detection using ML models.
● Design chaos engineering experiments to validate SLOs.
● Architect generative AI chatbots for incident and runbook automation.
● Scale AIOps across enterprise teams with measurable outcomes.

WHO THIS BOOK IS FOR
This book is for SREs, IT operations managers, cloud architects, and technology leaders who want to evolve from traditional operations to intelligent, AI-driven reliability practices. Readers should have intermediate experience in DevOps, SRE, or IT operations and a working familiarity with monitoring tools and cloud infrastructure.

1. SRE Principles Driving Modern Operations
2. AIOps Tools for SRE
3. AIOps Knowledgebase
4. Intelligent Incident Management for SREs
5. Streamlining Change and Problem Management
6. Path to Productivity and Reliability
7. Advanced Anomaly Detection
8. Causal Inference and Efficient Root Cause Analysis
9. Intelligent SRE Assistant
10. Chaos Engineering and Reliability Testing
11. Generative AI-powered SRE Chatbot
12. Scaling AIOps Across the Enterprise
13. Future Trends in SRE and AIOps

● Sunny Behl is the director and head of AIOps and digital production management at a major global financial institution that serves over 200 million customer accounts across more than 160 countries. With more than 20 years of experience transforming large-scale IT organizations, Sunny brings rare depth across AIOps, cloud-native platforms, site reliability engineering, DevOps, and agile methodologies.

He holds five US patents granted by the United States Patent and Trademark Office (USPTO): three for automated AIOps platform management and two for microservices anomaly detection. These patents reflect Sunny's commitment to translating engineering insight into innovation.

A Splunk Certified Power User and an SRE-certified professional through the DevOps Institute, Sunny also holds a Bachelor of Engineering in electronics and communication and has completed a postgraduate program in artificial intelligence and machine learning. He is based in Irving, Texas, where he plays a pivotal role in Citigroup's technology and business enablement strategy.

● Giridhar Kanikarapu is a distinguished technology leader at a global financial institution with over 13 years of experience driving digital transformation across industries, including travel management, healthcare, and retail banking. He holds a master's degree in computer science and brings deep expertise in cloud computing, artificial intelligence, and modern software engineering practices.

In his current role, Giridhar leads strategic technology modernization initiatives with a strong focus on site reliability engineering, AIOps, resilient production support, and scalable cloud-native architectures. He is known for delivering forward-looking solutions that leverage AI and cloud technologies to improve system reliability, accelerate innovation, and boost organizational agility. With a cross-industry perspective, he bridges legacy banking systems with next-generation digital capabilities, enabling enterprises to thrive in a rapidly evolving technology landscape.