From Prototype to Production: Deploying in the Cloud
Welcome to Cloud Engineering
Details
From working prototypes to production infrastructure
You built something that works. Now comes the hard part: making it work reliably, at scale.
This course takes you
from developer to cloud engineer
from local services to orchestrated microservices
from Docker to Kubernetes
You have built the app
We all remember RamCoin (a cryptocurrency mining stack)
Services:
rng: generates random data
hasher: hashes the data
worker: coordinates hashing jobs
redis: backend store
webui: frontend dashboard
Now you need to deploy it.
Problem: Prototype != Production
How do you run this system across multiple machines?
How do you recover from failure?
How do you update without downtime?
How do you manage logs, secrets, access controls?
We have options (but not all are equal)
Docker Compose
Great for dev/test
But no self-healing, no multi-node scaling (single node only), no declarative control
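For reference, the RamCoin stack above maps naturally onto a Compose file. This is a hypothetical sketch: the image names and the WebUI port are illustrative assumptions, not the actual course materials.

```yaml
# Hypothetical docker-compose.yml for the RamCoin stack.
# Image names and the published port are assumptions.
services:
  rng:
    image: ramcoin/rng        # generates random data
  hasher:
    image: ramcoin/hasher     # hashes the data
  worker:
    image: ramcoin/worker     # coordinates hashing jobs
    depends_on: [rng, hasher, redis]
  redis:
    image: redis              # backend store
  webui:
    image: ramcoin/webui      # frontend dashboard
    ports:
      - "8000:80"
    depends_on: [redis]
```

This is exactly where Compose shines, and where it stops: `docker compose up` brings the stack up on one machine, but nothing restarts a crashed service on another node or scales `worker` across hosts.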
Docker Swarm
Lightweight orchestration
But nearly deprecated and not widely supported
Kubernetes
Designed for large-scale, production deployments
Industry-standard
Built for resilience, scaling, and management
Details
If Docker Compose is your local test bed, Kubernetes is your data center.
What is Kubernetes?
Open-source container orchestration platform
Maintains desired state using declarative configuration (YAML)
Core resources:
Pods: smallest deployable unit
Services: expose pods
Deployments: manage pod replicas
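To make these resources concrete, here is a hypothetical manifest for one RamCoin service: a Deployment that keeps three replicas of `hasher` running, plus a Service that exposes those pods inside the cluster. The image name and port are illustrative assumptions.

```yaml
# Deployment: declares the desired state (3 hasher replicas);
# Kubernetes works to keep reality matching this spec.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hasher
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hasher
  template:
    metadata:
      labels:
        app: hasher
    spec:
      containers:
      - name: hasher
        image: ramcoin/hasher   # illustrative image name
        ports:
        - containerPort: 80
---
# Service: gives the hasher pods a stable in-cluster address,
# load-balancing across whichever replicas are currently healthy.
apiVersion: v1
kind: Service
metadata:
  name: hasher
spec:
  selector:
    app: hasher
  ports:
  - port: 80
```

Applying this with `kubectl apply -f hasher.yaml` is declarative: if a pod dies, the Deployment controller replaces it to restore the declared replica count.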
What does orchestrate mean?
Dictionary definition: to arrange or combine so as to achieve a desired or maximum effect
Kubernetes documentation: we tell Kubernetes what the desired state of our system is, and Kubernetes works to maintain that state
Before containerization/virtualization, we had clusters of computers running jobs.
Jobs = applications running on one or more computing nodes
Applications’ dependencies are tied to the supporting operating system on these nodes.
The cluster management system only needs to manage applications.
A container is more than an application.
A lightweight virtualization of an operating system and the components that help an application run, including external libraries.
A running container does not depend on the host computer’s libraries.
Is the management process the same as a cluster management system?
Kubernetes in Context
Kubernetes didn’t emerge from nowhere
It was inspired by Borg, Google’s internal cluster manager
Before Kubernetes: Borg, a cluster management system
Google’s Cluster Management System
First developed in 2003.
Abhishek Verma, Luis Pedrosa, Madhukar Korupolu, David Oppenheimer, Eric Tune, and John Wilkes. Large-scale cluster management at Google with Borg. In Proceedings of the Tenth European Conference on Computer Systems, p. 18. ACM, 2015.
Manages hundreds of thousands of jobs, from many thousands of different applications, across clusters of up to tens of thousands of machines.
Borg is the predecessor of Kubernetes. Understanding Borg helps explain the design decisions behind Kubernetes.
Kubernetes is perhaps the most popular open-source container orchestration system today, in both academia and industry.
Other container orchestration systems either:
Are being deprecated (Docker Swarm)
Integrate container management into an existing framework rather than developing a new management system (UC Berkeley’s Mesos and Twitter’s Aurora)
Borg runs:
Gmail
YouTube
Google Search
Details
Kubernetes is Borg’s spiritual child but open, modular, extensible
Inside Borg (2003-2015)
Manages 10K+ machines per cluster
Schedules hundreds of thousands of jobs
Uses a central master and node-level agents (“Borglets”)
Concepts:
Jobs
Tasks
Allocations
Cells
Borg’s concepts elaborated
Work is submitted to Borg as jobs, each of which has one or more tasks (each task corresponds to a binary).
Each job runs in one Borg cell, a set of machines managed as a single unit.
A Borg allocation defines a reserved set of resources on a machine in which one or more tasks can run.
Job types:
Long-running services that should never go down and that serve short-lived, latency-sensitive requests: Gmail, Google Docs, Web Search …
Batch jobs that take a few seconds to a few days to complete.
Borg cells allow for not just applications, but application frameworks
One master job and one or more worker jobs.
The framework can execute parallel applications itself.
Examples of frameworks running on top of Borg:
MapReduce
FlumeJava: Data-Parallel Pipelines
Millwheel: Fault-tolerant Stream Processing at Internet Scale
Pregel: Large-scale graph processing
Machines in cells belong to a single cluster, defined by the high-performance datacenter-scale network fabric connecting them.
How is this different from the traditional cluster model?
Borg’s architecture
Borg Master
Borglet
Scalability of Borg Master
Reported in the 2015 paper: the authors are unsure of the ultimate scalability limit (flex, anyone?)
A single master can manage many thousands of machines in a cell
Several cells see arrival rates of more than 10,000 tasks per minute
2020 Borg analysis report:
[Muhamad Tirmazi, Adam Barker, Nan Deng, Md E. Haque, Zhijing Gene Qin, Steven Hand, Mor Harchol-Balter, and John Wilkes. “Borg: the next generation.” In Proceedings of the Fifteenth European Conference on Computer Systems](https://dl.acm.org/doi/pdf/10.1145/3342195.3387517)