DevOps · CI/CD · Kubernetes · Self-directed

Spring PetClinic on AWS EKS — end-to-end Jenkins delivery

A full Jenkins pipeline that takes the Spring Boot PetClinic microservices app from a git push to a running deployment on AWS EKS, with a dev → QA → prod promotion flow, ECR image management, Selenium smoke tests, Rancher cluster operations, and Prometheus + Grafana observability.

Jenkins · AWS EKS · ECR · Rancher · Nexus · Maven · Docker · Kubernetes · Selenium · Prometheus · Grafana · Spring Boot

01 What it delivers

The Spring PetClinic reference app is split into a handful of services (customers, vets, visits, api-gateway, admin, config, discovery). Each one has its own Dockerfile and its own Jenkins pipeline stage — the unit of delivery is a service, not the whole app.

  • Three environments — dev, qa, prod — each an EKS namespace with its own config and secrets.
  • Image promotion via ECR tags: every build publishes a :<git-sha> tag, and promotion is a retag, not a rebuild.
  • Smoke tests with Selenium against the deployed environment before promotion to the next stage is allowed.
  • Rancher-managed cluster for day-2 operations — kubectl context, RBAC, and dashboards in one place.
  • Prometheus + Grafana wired to the cluster and the app's Micrometer endpoints.

02 The Jenkins pipeline

The Jenkinsfile is declarative, parameterized on target environment, and runs each stage only for services whose code changed in the current commit. That way a one-line fix to customers-service doesn't rebuild every image in the repo.
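
The change detection can be sketched in a few lines of shell. This is a minimal illustration rather than the pipeline's actual logic: it assumes each service lives in its own top-level directory of the monorepo, and the directory names in `SERVICES` are hypothetical.

```shell
# Sketch: map changed file paths to the services that own them.
# Assumption: one top-level directory per service; names are illustrative.
SERVICES="customers-service vets-service visits-service api-gateway admin-server config-server discovery-server"

# Reads file paths on stdin, prints each affected service once.
# Paths outside a service directory (README.md, a root pom.xml) are ignored.
changed_services() {
  cut -d/ -f1 | sort -u | while read -r dir; do
    for svc in $SERVICES; do
      if [ "$dir" = "$svc" ]; then echo "$svc"; fi
    done
  done
}

# Typical use in the Checkout stage:
#   git diff --name-only HEAD~1 HEAD | changed_services
```

A change to a shared file such as a parent POM touches no service directory, so a real pipeline would probably treat that case as "rebuild everything"; the sketch leaves it out.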

01 · Checkout: Pull the monorepo, compute which services changed.
02 · Build: Maven build per changed service; unit tests gate the stage.
03 · Package: docker build per service, tagged with the git SHA.
04 · Publish: Push to AWS ECR; Nexus acts as the backup Maven Central mirror.
05 · Deploy dev: Apply Kubernetes manifests into the petclinic-dev namespace on EKS.
06 · Smoke test: Selenium hits the gateway and verifies the happy path (owners, vets, visits).
07 · Promote QA: Manual approval, then retag ECR image and apply petclinic-qa manifests.
08 · Promote prod: Second approval gate. Same retag + apply into petclinic-prod.
09 · Observability: Grafana dashboards light up on the new pods via Prometheus discovery.
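
The retag in stages 07 and 08 can be done without pulling or rebuilding anything, by copying the image manifest to a new tag with the AWS CLI. The sketch below assumes one ECR repository per service and environment-named tags such as `qa` and `prod`; both naming choices are illustrative, not confirmed details of this pipeline.

```shell
# Sketch: promote an image by attaching a new tag to an existing manifest.
# Assumptions: repository name == service name, target tag == environment name.
# Usage: promote_image <repository> <git-sha> <target-tag> [region]
promote_image() (
  repo="$1"; sha="$2"; target="$3"; region="${4:-us-east-1}"

  # Fetch the manifest of the image that was already built and tested.
  manifest="$(aws ecr batch-get-image --region "$region" \
      --repository-name "$repo" --image-ids "imageTag=$sha" \
      --query 'images[0].imageManifest' --output text)" || exit 1

  # The CLI prints "None" when the query finds no such image.
  if [ -z "$manifest" ] || [ "$manifest" = "None" ]; then
    echo "no image found for $repo:$sha" >&2
    exit 1
  fi

  # Register the same manifest under the new tag; no layers are re-pushed.
  aws ecr put-image --region "$region" \
      --repository-name "$repo" --image-tag "$target" \
      --image-manifest "$manifest"
)

# Example: promote_image customers-service "$GIT_COMMIT" qa
```

Because the `qa` and `prod` tags move from release to release, this approach relies on the repository allowing mutable tags.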

03 Supporting infrastructure

Everything outside the app itself runs on AWS: the EKS cluster, the ECR registries, the Jenkins controller and agents, Nexus, and the monitoring stack. The cluster runs three worker nodes by default, and Rancher adds multi-cluster management on top.

# Namespaces: one per environment (RBAC and network policies are applied on top, not shown here)
kubectl create namespace petclinic-dev
kubectl create namespace petclinic-qa
kubectl create namespace petclinic-prod

# ECR login for Jenkins agent (via IRSA / instance role in the real pipeline)
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin \
      <account>.dkr.ecr.us-east-1.amazonaws.com

# Rollout to the target namespace (ENV is one of dev, qa, prod)
kubectl -n "petclinic-${ENV}" apply -f k8s/
kubectl -n "petclinic-${ENV}" rollout status deploy/customers-service
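
The rollout check above watches a single Deployment. A hedged extension is to wait on every service and fail on the first one that stalls, so the smoke tests never start against a half-rolled environment. Only `customers-service` is a name taken from the commands above; the other Deployment names and the 180-second timeout are assumptions.

```shell
# Sketch: wait for every Deployment in an environment to finish rolling out.
# Assumption: Deployment names match the service names listed here.
DEPLOYMENTS="config-server discovery-server customers-service vets-service visits-service api-gateway admin-server"

# Usage: wait_for_rollouts <env> [timeout]
wait_for_rollouts() (
  env_name="$1"; timeout="${2:-180s}"
  for d in $DEPLOYMENTS; do
    kubectl -n "petclinic-$env_name" rollout status "deploy/$d" --timeout="$timeout" || {
      echo "rollout of $d in petclinic-$env_name did not complete" >&2
      exit 1
    }
  done
)

# Example: wait_for_rollouts dev 120s
```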

04 Observability

Each Spring Boot service exposes Micrometer metrics at /actuator/prometheus. Prometheus finds the pods to scrape through its Kubernetes service discovery (kubernetes_sd_configs); Grafana dashboards show JVM health, HTTP latency percentiles, and per-pod resource usage side by side with the rollout timeline.

  • One Grafana dashboard per service with p50 / p95 / p99 latency panels.
  • Cluster-level dashboard for node CPU, memory, and pod count per namespace.
  • Alerts on elevated 5xx rate after a deploy — deliberately cheap and cheerful, not PagerDuty.
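
The post-deploy 5xx check can also be run by hand or from a pipeline step. The sketch below builds a PromQL ratio over `http_server_requests_seconds_count`, the request timer Micrometer publishes for Spring web endpoints, and compares the result to a threshold. The Prometheus URL, the `application` label, and the use of `jq` are assumptions; the label in particular depends on how metrics are tagged or relabeled in a given setup.

```shell
# Sketch: compute a service's 5xx ratio from Prometheus and check a threshold.
# Assumptions: PROM_URL and the "application" label are illustrative.
PROM_URL="${PROM_URL:-http://prometheus.monitoring.svc:9090}"

# Usage: error_ratio_query <service> [window]   (prints a PromQL expression)
error_ratio_query() {
  printf 'sum(rate(http_server_requests_seconds_count{application="%s",status=~"5.."}[%s])) / sum(rate(http_server_requests_seconds_count{application="%s"}[%s]))' \
    "$1" "${2:-5m}" "$1" "${2:-5m}"
}

# Usage: within_threshold <ratio> <max>   (exit 0 if ratio <= max)
within_threshold() {
  awk -v r="$1" -v t="$2" 'BEGIN { exit !(r + 0 <= t + 0) }'
}

# Runs the instant query through the Prometheus HTTP API (requires curl and jq).
fetch_error_ratio() {
  curl -fsS -G "$PROM_URL/api/v1/query" \
    --data-urlencode "query=$(error_ratio_query "$1")" \
    | jq -r '.data.result[0].value[1]'
}

# Example: within_threshold "$(fetch_error_ratio customers-service)" 0.05
```

A service with no traffic in the window yields no usable ratio, so a real check would need to decide how to treat that case before comparing.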

05 What this project taught me

Promotion by retag, not rebuild

The thing that runs in prod must be byte-identical to what was smoke-tested in QA. A rebuild on promotion can pick up a different base image or dependency version than the one that was tested, which reintroduces supply-chain risk for no benefit.

Pipelines per service, not per app

Monolithic pipelines optimize for the case where you change everything at once. In reality you almost never do. Per-service pipelines shorten feedback time because a change rebuilds and retests only the service it touches.

Rancher over raw kubectl

When multiple clusters and multiple namespaces enter the picture, Rancher's RBAC and context switching are worth the overhead of running the control plane.

Smoke tests gate promotion — not unit tests

Unit tests gate the build; Selenium against the real deployed URL gates promotion. Different questions, different tools.
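
As a rough sketch of that gate: wait until the deployed gateway answers, then run the smoke suite and let its exit code decide whether promotion proceeds. The gateway URL and the smoke-test command are placeholders supplied by the caller, and the `/actuator/health` probe assumes the gateway exposes the standard Spring Boot Actuator health endpoint.

```shell
# Sketch: gate promotion on a smoke suite run against the deployed URL.

# Retry a command until it succeeds or the attempt budget is spent.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() (
  attempts="$1"; delay="$2"; shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then exit 1; fi
    n=$((n + 1))
    sleep "$delay"
  done
)

# Wait for the gateway to respond, then run the smoke suite.
# Usage: smoke_gate <gateway-url> <smoke-command...>
smoke_gate() {
  url="$1"; shift
  retry 30 5 curl -fsS -o /dev/null "$url/actuator/health" || return 1
  "$@"
}

# Example (both arguments are hypothetical):
#   smoke_gate "https://gateway.dev.example.com" ./run-selenium-smoke.sh \
#     && echo "smoke tests passed, promotion may proceed"
```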