Migrating a production monolith to Kubernetes is one of the most impactful — and risky — infrastructure transformations an engineering team can undertake. Done poorly, it leads to extended outages and frustrated users. Done right, it unlocks a new level of scalability, resilience, and deployment velocity.
Here's a battle-tested strategy for achieving zero-downtime migration.
Phase 1: Containerize Without Migrating
The first mistake teams make is trying to containerize and migrate simultaneously. Instead, containerize your application first and deploy it alongside the existing infrastructure.
# Dockerfile for a Spring Boot monolith
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY target/application.jar app.jar
# Note: Kubernetes ignores Docker's HEALTHCHECK; this only covers plain `docker run`
# smoke tests. Readiness and liveness checks are configured as probes in the pod spec.
HEALTHCHECK --interval=10s --timeout=3s --retries=3 \
  CMD wget -qO- http://localhost:8080/actuator/health || exit 1
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
Run the containerized version in parallel with production. Route a small percentage of traffic to it using weighted load balancing. This validates that the container behaves identically to the bare-metal deployment.
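If a service mesh such as Istio sits in front of both deployments, the weighted split can be expressed declaratively. A minimal sketch, assuming Istio; the host names and the 5% weight are illustrative, not from the original setup:

```yaml
# Send 5% of traffic to the containerized deployment, 95% to the legacy backend
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: monolith-canary
spec:
  hosts:
    - monolith.example.com
  http:
    - route:
        - destination:
            host: monolith-legacy     # existing bare-metal/VM backend
          weight: 95
        - destination:
            host: monolith-container  # new containerized deployment
          weight: 5
```

Start small (1-5%), watch error rates and latency, and ratchet the weight up only when the two backends behave identically.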
Phase 2: The Strangler Fig Pattern
Rather than migrating everything at once, use the Strangler Fig pattern — gradually extract services and route traffic to them. Start with the least critical components.
The strangler fig grows around a host tree until it eventually replaces it entirely. Your new microservices should grow around the monolith the same way.
Identify bounded contexts within your monolith. Extract them one by one:
- Identify the boundary — find a module with clear inputs/outputs and minimal shared state
- Create the new service — build it as a standalone K8s deployment
- Proxy traffic — use an API gateway to route requests to the new service
- Validate in shadow mode — run both old and new in parallel, comparing outputs
- Cut over — once confidence is high, route 100% to the new service
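The proxy step above can be as simple as a path-based routing rule at the edge. A minimal sketch using a Kubernetes Ingress; the paths and service names are assumptions for illustration:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: strangler-routing
spec:
  rules:
    - http:
        paths:
          # The extracted service owns /payments; everything else stays on the monolith
          - path: /payments
            pathType: Prefix
            backend:
              service:
                name: payment-service
                port:
                  number: 8080
          - path: /
            pathType: Prefix
            backend:
              service:
                name: monolith
                port:
                  number: 8080
```

Each extraction adds one specific path rule while the catch-all `/` rule keeps the monolith as the default backend — the fig grows one branch at a time.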
Phase 3: Kubernetes Deployment Strategy
For zero-downtime deployments within Kubernetes, configure your deployments correctly:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # Never take a pod down before a new one is ready
      maxSurge: 1         # Add one pod before removing old ones
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
        - name: payment-service
          image: payment-service:1.0.0   # your containerized application image
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 15"]   # Graceful drain
Critical Configuration Points
- maxUnavailable: 0 — ensures a new pod is fully ready before old ones shut down
- Readiness probes — prevent traffic from reaching pods that aren't ready to serve
- preStop hook — allows in-flight requests to complete before pod termination
- PodDisruptionBudgets — limit how many pods voluntary disruptions (node drains, cluster upgrades) can evict at once
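A PodDisruptionBudget for the deployment above might look like this (the `app: payment-service` selector is an assumption matching the sketch's pod labels):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-service-pdb
spec:
  minAvailable: 2          # with 3 replicas, voluntary evictions may remove at most 1 pod
  selector:
    matchLabels:
      app: payment-service
```

Without this, a routine node drain can evict all replicas simultaneously — exactly the kind of self-inflicted outage a migration cannot afford.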
Phase 4: Database Migration
The hardest part is the database. Stateless services are easy to migrate — stateful components require careful planning:
- Read replicas first — point new services to read replicas; keep writes on the original database
- Change Data Capture (CDC) — use tools like Debezium to stream changes between databases
- Feature flags — toggle between old and new data paths without deployments
- Dual-write with reconciliation — write to both databases and reconcile differences
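As one sketch of the CDC approach, a Debezium source connector can be declared as a Strimzi KafkaConnector resource. This assumes Strimzi, Debezium 2.x, and a PostgreSQL source; every connection detail below is a placeholder:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: monolith-db-cdc
  labels:
    strimzi.io/cluster: my-connect-cluster   # your Kafka Connect cluster
spec:
  class: io.debezium.connector.postgresql.PostgresConnector
  tasksMax: 1
  config:
    database.hostname: legacy-db.internal    # the monolith's database
    database.port: 5432
    database.dbname: monolith
    table.include.list: public.orders,public.payments
    topic.prefix: monolith-cdc               # change events land on monolith-cdc.* topics
```

New services consume the change-event topics to build their own datastores, which keeps the original database as the single source of truth for writes until cutover.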
Monitoring the Migration
You cannot migrate what you cannot measure. Before starting, establish comprehensive observability:
- Golden signals — latency, traffic, errors, saturation for every service
- Distributed tracing — correlate requests across old and new infrastructure
- Dashboards — real-time comparison of old vs. new system performance
- Automated rollback — if error rate exceeds thresholds, automatically route back to the old system
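Automated rollback can be delegated to a progressive-delivery controller rather than built by hand. A sketch using Flagger, assuming it is installed alongside a supported mesh or ingress; the thresholds are illustrative:

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: payment-service
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  service:
    port: 8080
  analysis:
    interval: 1m
    threshold: 5            # roll back after 5 consecutive failed checks
    maxWeight: 50
    stepWeight: 10          # shift traffic in 10% increments
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99           # abort and roll back if success rate drops below 99%
        interval: 1m
```

The controller shifts traffic gradually, evaluates the metrics each interval, and automatically routes everything back to the stable version when the error budget is blown — no human in the loop at 3 a.m.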
Zero-downtime migration isn't a single event — it's a process. Take it slow, measure everything, and always have a rollback plan.