Course Project: Multi-Pod Multi-Container Pipeline on Kubernetes
Introduction
Motivation
The goal of this class is to equip students not only with theory but also with technical skills that apply to real-world problems. To accomplish this, we will follow a project-based learning approach: students start with a top-down project idea and gradually acquire the knowledge needed to implement it over the duration of the course.
Project idea
The project idea should revolve around a full-stack pipeline that involves multiple pods, at least one of which contains multiple containers. The pipeline will be deployed on Kubernetes, with consideration for authentication and authorization. This deployment process will teach students the concept of infrastructure as code and improve their administrative automation and scripting skills.
Example Project Idea
While groups are free to select their own project ideas, these selections need to reflect the complexity of the pipeline. To that end, we have the following template project idea: Image Classification Pipeline on Kubernetes
Project Overview
Design and deploy a simplified image classification pipeline on Kubernetes. The system demonstrates multi-pod orchestration and possibly multi-container pods using sidecars. At minimum, include:
Preprocessing Service (resizes/normalizes images)
Inference Service (serves a pre-trained model)
Storage Service (persists original/processed images and results)
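As an illustration of the multi-container (sidecar) pattern mentioned in the overview, the following is a minimal sketch rather than a required design. It builds a Pod manifest in Python and prints it as JSON, which kubectl accepts alongside YAML. The image name example.registry/preprocess:0.1.0, the port 8080, the /data mount path, and the log-tailing sidecar are all assumptions made for the example.

```python
import json


def sidecar_pod_manifest(name: str = "preprocess") -> dict:
    """Build a Pod with a main container and a sidecar sharing one volume."""
    # Both containers mount the same emptyDir, so the sidecar can read
    # files that the main container writes.
    mount = {"name": "shared-data", "mountPath": "/data"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    # Hypothetical image; replace with your own build.
                    "name": "preprocess",
                    "image": "example.registry/preprocess:0.1.0",
                    "ports": [{"containerPort": 8080}],
                    "volumeMounts": [mount],
                },
                {
                    # Sidecar: streams a log file assumed to be written
                    # by the main container to /data/app.log.
                    "name": "log-tailer",
                    "image": "busybox:1.36",
                    "command": [
                        "sh",
                        "-c",
                        "touch /data/app.log && tail -f /data/app.log",
                    ],
                    "volumeMounts": [mount],
                },
            ],
            "volumes": [{"name": "shared-data", "emptyDir": {}}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(sidecar_pod_manifest(), indent=2))
```

If the script is saved as, say, pod.py (a hypothetical file name), its output can be piped to kubectl apply -f - to create the Pod. In a real project the Pod would more likely be wrapped in a Deployment so that it is recreated after failures.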
It is possible to have in-depth projects that veer away from the multi-pod/multi-container requirements. All project selections need to be approved by the professor.
Milestone 1: Team Formation & Architecture (Design Only)
Goal
Form a team,
Define the pipeline architecture, and
Make clear technical choices before building.
Requirements
Form teams of 3–5 students.
Learn about your teammates’ capabilities and define roles (Lead/PM, DevOps, Backend/Inference, Preprocessing, QA/Docs).
Develop use cases and workflow:
Write 2–3 user stories and draw a data-flow diagram.
Example:
User story: As a user, I upload an image and retrieve the predicted label.
Restart/redeploy Pods and show persisted data remains accessible.
Testing:
Example: Include a small script or curl commands to demonstrate upload → preprocessing → inference → storage → retrieval.
Final operational considerations:
Health/readiness probes where appropriate.
Resource requests/limits for inference pods (lightweight, realistic).
Basic logs/metrics evidence.
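The probe and resource requirements above could be expressed roughly as follows. This is a sketch under stated assumptions, not a prescribed configuration: the image name, port 8080, the /ready and /healthz paths, and the CPU/memory figures are placeholders to adjust for your own inference service. The manifest is built as a Python dict and printed as JSON, which kubectl accepts.

```python
import json


def inference_deployment(replicas: int = 2) -> dict:
    """Build a Deployment with probes and resource requests/limits."""
    labels = {"app": "inference"}
    container = {
        "name": "inference",
        # Hypothetical image; replace with your own build.
        "image": "example.registry/inference:0.1.0",
        "ports": [{"containerPort": 8080}],
        # Readiness: keep the Pod out of Service endpoints until the
        # model is loaded (assumes the app serves GET /ready).
        "readinessProbe": {
            "httpGet": {"path": "/ready", "port": 8080},
            "initialDelaySeconds": 5,
            "periodSeconds": 10,
        },
        # Liveness: restart the container if it stops responding
        # (assumes the app serves GET /healthz).
        "livenessProbe": {
            "httpGet": {"path": "/healthz", "port": 8080},
            "initialDelaySeconds": 15,
            "periodSeconds": 20,
        },
        # Placeholder figures; measure your model before settling on them.
        "resources": {
            "requests": {"cpu": "250m", "memory": "256Mi"},
            "limits": {"cpu": "500m", "memory": "512Mi"},
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "inference", "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the Pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(inference_deployment(), indent=2))
```

Keeping the readiness and liveness endpoints separate lets a slow model load delay traffic without triggering restarts.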
Deliverables
Update the project repository with the manifests under storage/ (PV/PVC YAMLs, or a StatefulSet if you go that route).
Final end-to-end demo video (≤5 min)
Final technical report
PDF format, 1-inch margins, 10-pt font.
Content:
Front matter (names, title)
Overview (marketing!)
4–5 pages (undergraduate teams) or 6–7 pages (teams containing graduate students)
Architecture recap, data-flow diagram (updates if changed)
Pod grouping & any sidecar choices
Services and routing
Scaling/self-healing results
Persistence design & verification steps
Known limitations + next steps
Constraints & Hints
Prefer ClusterIP for internal comms; expose exactly one external entrypoint unless justified.
Keep input data small and predictable; pin versions.
If using AI-based services, start with a tiny pre-trained model (e.g., MobileNet-v2 or even a mock classifier) to focus on orchestration, not training.
Test early with kubectl port-forward and curl before wiring the full flow.
For grading reproducibility, ensure kubectl delete -f . && kubectl apply -f . cleanly reprovisions the stack.
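For the early testing suggested in the hints above, a small smoke test along these lines may help once a service is reachable through kubectl port-forward. It is only a sketch: the base URL, the POST /images and GET /results/<id> routes, and the JSON fields id and label are hypothetical and must be adapted to the API your team actually exposes.

```python
import json
import urllib.request
from typing import Callable, Optional

# Assumed local address, e.g. opened with kubectl port-forward.
BASE_URL = "http://localhost:8080"


def http_call(method: str, url: str, body: Optional[bytes] = None) -> bytes:
    """Send one HTTP request and return the raw response body."""
    req = urllib.request.Request(url, data=body, method=method)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()


def smoke_test(
    image: bytes,
    call: Callable[..., bytes] = http_call,
    base_url: str = BASE_URL,
) -> str:
    """Upload an image, fetch its result, and return the predicted label."""
    # Hypothetical API: POST /images responds with {"id": "..."}.
    uploaded = json.loads(call("POST", f"{base_url}/images", image))
    # Hypothetical API: GET /results/<id> responds with {"label": "..."}.
    result = json.loads(call("GET", f"{base_url}/results/{uploaded['id']}"))
    label = result["label"]
    if not label:
        raise AssertionError("pipeline returned an empty label")
    return label
```

A typical call would read a small test image from disk and pass its bytes to smoke_test. The call parameter exists so the flow can be exercised with a stub before any cluster is running; the same script can be rerun after reprovisioning the stack to confirm the pipeline still works end to end.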