Hybrid Resilience (H/R) Testing Model: AI-Driven Zero Downtime Deployment for Kubernetes

Haranath Rakshit,

Subhasis Banerjee

Research Square (Research Square), Journal Year: 2025, Volume and Issue: unknown

Published: April 8, 2025

Abstract Deploying applications in Kubernetes without any downtime is a major challenge. Current methods such as Rolling Updates, Blue-Green, Canary, and A/B Testing wait until a failure happens before taking action. This can lead to unexpected disruptions, manual rollbacks, and slow recovery. To solve this, we introduce the Hybrid Resilience (H/R) Model, a new way to predict and prevent failures before they happen. Our model combines AI-driven failure prediction, chaos engineering for resilience testing, automated rollback using a Markov decision process (MDP), and traffic-aware self-healing with Istio to ensure smooth, uninterrupted deployments. Instead of testing in a real-world setup, we focus on three theoretical metrics, Resilience Score (RS), Failure Probability (FP), and Chaos Impact (D), to measure how strong a deployment is before it goes live. The model provides a guideline for future research, helping developers move from fixing failures after they occur to preventing them before deployment, making deployments more reliable and self-healing.

Language: English

AI Workload Automation and Orchestration in Cloud Environments

Sachinkumar Anandpal Goswami,

Kashyap C. Patel,

Dhara Ashish Darji

et al.

Advances in computational intelligence and robotics book series, Journal Year: 2025, Volume and Issue: unknown, P. 33 - 70

Published: Jan. 17, 2025

Automation and orchestration of workloads in the cloud environment are critical for managing the ever-increasing complexity and scale of artificial intelligence applications. This chapter analyses the tools and techniques used to automate and optimize AI workflows and infrastructure. The focus is on the role that platforms such as Kubernetes play in distributed workloads, and on the benefits of automation for scaling, resource management, and fault tolerance. The chapter also discusses the integration of ML pipelines with cloud services for streamlined model training, deployment, and observation. The discussion emphasizes that workload automation maximizes productivity and minimizes downtime while all but guaranteeing high availability in the cloud. Case studies reflect the best practices adopted and the most common problems faced.

Language: English

Citations

0
