Kubernetes Guide for Beginners: Everything You Need to Know

Kunle·Last updated: 2026-01-25·14 min read

Kubernetes (often shortened to K8s) is an open-source platform that automates the deployment, scaling, and management of containerised applications across clusters of servers. If Docker packages your application into a container, Kubernetes is the system that decides where those containers run, keeps them running, and scales them up or down based on demand.

Originally built by Google and now maintained by the Cloud Native Computing Foundation (CNCF), Kubernetes has become the industry standard for container orchestration. 61% of companies running containers in production use Kubernetes, and every major cloud provider offers a managed Kubernetes service.

This guide covers everything you need to understand Kubernetes as a beginner: from the problem it solves, to its architecture, to the core concepts you'll work with daily.

Why Kubernetes exists

To understand Kubernetes, you need to understand the problem it solves.

Imagine you have a web application running in a Docker container on a single server. It works well. Then your application gets popular. One server can't handle the traffic. You need to run the same container on five servers, distribute traffic between them, and restart any container that crashes without your users noticing.

That's what Docker alone cannot do. Docker runs containers on a single machine. It has no concept of clusters, traffic distribution, or automatic recovery across servers.

Before Kubernetes, teams solved this with manual scripts, custom tooling, and a lot of hope. Operators would SSH into servers at 3am to restart crashed processes. Deployments meant taking the application offline, replacing the files, and praying it came back up cleanly.

Kubernetes automates all of this:

  • Scheduling: it decides which server runs which container, based on available resources
  • Scaling: it adds more container copies when demand increases and removes them when demand drops
  • Self-healing: if a container crashes, Kubernetes restarts it automatically. If a server dies, Kubernetes moves its containers to healthy servers
  • Networking: it assigns IP addresses, load-balances traffic, and manages service discovery
  • Rolling updates: it deploys new versions gradually, with zero downtime
  • Storage orchestration: it mounts persistent storage to containers that need it

In short: Kubernetes turns a collection of servers into a single, programmable platform for running containers. You tell Kubernetes what you want (three copies of this container, always running, with this much memory), and Kubernetes makes it happen.

Kubernetes architecture

Kubernetes runs as a cluster: a set of machines working together. A cluster has two types of components: the control plane (the brain) and worker nodes (the muscle).

Control plane

The control plane makes global decisions about the cluster: scheduling, scaling, and responding to events. It runs on one or more dedicated machines (called master nodes in older documentation).

API Server (kube-apiserver): the front door to the cluster. Every command you run with kubectl, every internal component communication, and every API call goes through the API server. It validates requests and updates the cluster state.

Scheduler (kube-scheduler): watches for newly created containers that have no assigned node. It evaluates which node has the best available resources (CPU, memory, GPU) and assigns the container to that node. Think of it as the air traffic controller deciding which runway each plane lands on.

Controller Manager (kube-controller-manager): runs a set of controllers that watch the cluster state and make adjustments. If you say "I want 3 copies of this container" and one dies, the ReplicaSet controller detects the gap and creates a replacement. Each controller is a reconciliation loop: compare desired state to actual state, take action if they differ.

etcd: a distributed key-value store that holds the entire cluster state. Every object in Kubernetes (every pod, service, and configuration) is stored in etcd. It's the single source of truth. If etcd is lost, the cluster is lost. That's why production clusters run etcd with replication and regular backups.

Worker nodes

Worker nodes run the actual application containers. A cluster typically has many worker nodes. Each one runs three components:

Kubelet: an agent that runs on every worker node. It receives instructions from the control plane ("run this container on this node") and ensures the container is running and healthy. If a container crashes, the kubelet restarts it. Think of the kubelet as the site manager at each building in a construction project.

Kube-proxy: manages networking rules on each node. It routes traffic to the correct containers, handles load balancing between container copies, and maintains network rules so containers can communicate with each other and the outside world.

Container runtime: the software that actually runs containers. Kubernetes supports several runtimes. containerd is the most common in production. Docker (via containerd) is what most people learn with first.

How it all fits together

  1. You submit a request to the API server: "Run 3 copies of my web application"
  2. The API server validates the request and stores it in etcd
  3. The scheduler sees three unscheduled containers and assigns each to a node with available resources
  4. The kubelet on each assigned node pulls the container image and starts the container
  5. Kube-proxy sets up networking rules so traffic can reach the containers
  6. The controller manager continuously watches: if a container dies, it creates a replacement

This loop runs constantly. Kubernetes is always reconciling the desired state (what you asked for) with the actual state (what's actually running). That's the core design principle: declarative configuration. You declare what you want, and Kubernetes figures out how to make it happen.

Core concepts explained

Kubernetes has a lot of terminology. Here are the concepts you'll work with from day one, each with a plain-English explanation and a YAML example.

Pods

A Pod is the smallest deployable unit in Kubernetes. It wraps one or more containers that share the same network space and storage. In practice, most Pods run a single container.

Analogy: If a container is an application in a box, a Pod is that box sitting on a shelf in a warehouse. Kubernetes manages Pods, not individual containers.

apiVersion: v1
kind: Pod
metadata:
  name: my-web-app
  labels:
    app: web
spec:
  containers:
    - name: web
      image: nginx:1.27
      ports:
        - containerPort: 80

You rarely create Pods directly. Instead, you create Deployments, which create and manage Pods for you.
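As a quick sketch of the workflow (assuming the manifest above is saved as pod.yaml), you could create and inspect this Pod with kubectl:

```shell
# Create the Pod from the manifest
kubectl apply -f pod.yaml

# Check that it is running
kubectl get pod my-web-app

# Delete it. Note that nothing recreates it,
# because no controller manages a bare Pod
kubectl delete pod my-web-app
```

That last point is exactly why Deployments exist: a Deployment would notice the missing Pod and replace it.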

Deployments

A Deployment manages a set of identical Pods. You tell it how many copies (replicas) you want, and it ensures that exact number is always running. If a Pod crashes, the Deployment creates a replacement. When you update the container image, the Deployment rolls out the change gradually.

Analogy: A Deployment is like a staffing manager. You say "I need 3 cashiers working at all times." If one goes home sick, the manager calls in a replacement. If you decide cashiers should wear new uniforms, the manager swaps them one at a time so the shop never closes.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "250m"
              memory: "256Mi"

This creates 3 Pods, each running nginx:1.27. If you change the image to nginx:1.28 and reapply, Kubernetes performs a rolling update, replacing Pods one at a time.
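For example, a rolling update can be triggered and observed from the command line (a sketch, assuming the Deployment above is already applied):

```shell
# Update the container image; this starts a rolling update
kubectl set image deployment/web-deployment web=nginx:1.28

# Watch the rollout replace Pods one at a time
kubectl rollout status deployment/web-deployment

# If the new version misbehaves, roll back to the previous revision
kubectl rollout undo deployment/web-deployment
```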

Services

A Service provides a stable network endpoint for a set of Pods. Pods are ephemeral: they get created and destroyed constantly, and each one gets a different IP address. A Service gives you a single, unchanging address that routes traffic to whichever Pods are currently running.

Analogy: A Service is like a restaurant's phone number. Staff come and go, but customers always dial the same number. The receptionist (Service) routes the call to whoever is available.

apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP

Service types:

  • ClusterIP (default): accessible only within the cluster
  • NodePort: exposes the Service on a static port on each node
  • LoadBalancer: provisions a cloud load balancer (on AWS, GCP, Azure) that routes external traffic to the Service
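Inside the cluster, other Pods reach a Service by its DNS name; from your own machine, kubectl port-forward is a quick way to test one (a sketch, assuming the web-service above exists in the default Namespace):

```shell
# From another Pod in the cluster: Services get a DNS name of the form
# <service>.<namespace>.svc.cluster.local
curl http://web-service.default.svc.cluster.local

# From your laptop: forward a local port to the Service for testing,
# then browse to http://localhost:8080
kubectl port-forward service/web-service 8080:80
```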

Ingress

An Ingress manages external HTTP/HTTPS access to Services inside the cluster. While a LoadBalancer Service gives you a raw IP, an Ingress lets you route traffic based on domain names and URL paths. It also handles TLS termination (HTTPS).

Analogy: If Services are departments in a building, an Ingress is the reception desk. Visitors say "I'm here for Sales" (sales.example.com) and the receptionist directs them to the right floor.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service
                port:
                  number: 80
  tls:
    - hosts:
        - app.example.com
      secretName: tls-secret

You need an Ingress Controller (such as NGINX Ingress Controller or AWS ALB Ingress Controller) installed in your cluster for Ingress resources to work.
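On a local Minikube cluster, for instance, the bundled NGINX Ingress Controller can be enabled with one command (on real clusters, installing via Helm is the usual route):

```shell
# Minikube ships an NGINX Ingress Controller as an addon
minikube addons enable ingress

# Verify the controller Pods are running
# (the namespace may differ on older Minikube versions)
kubectl get pods -n ingress-nginx
```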

ConfigMaps and Secrets

ConfigMaps store non-sensitive configuration data as key-value pairs. Secrets store sensitive data (passwords, API keys, tokens) in base64-encoded form. Note that base64 is encoding, not encryption: anyone with read access to a Secret can decode it, which is why production clusters restrict Secret access with RBAC and enable encryption at rest.

Analogy: ConfigMaps are the instruction manual left on a worker's desk. Secrets are the key card locked in a safe. Both provide information the application needs, but Secrets get extra protection.

# ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_HOST: "db.example.com"
  LOG_LEVEL: "info"
  MAX_CONNECTIONS: "100"
---
# Secret
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
data:
  DATABASE_PASSWORD: cGFzc3dvcmQxMjM=  # base64-encoded
  API_KEY: c2VjcmV0LWtleS12YWx1ZQ==

You can mount ConfigMaps and Secrets as environment variables or as files inside a container. This keeps configuration separate from the container image the same image works in development, staging, and production with different ConfigMaps.
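As a sketch of the environment-variable approach, a container spec can pull in every key from the ConfigMap and Secret above using envFrom (the names reference the objects defined above):

```yaml
# Fragment of a Pod or Deployment container spec
containers:
  - name: web
    image: nginx:1.27
    envFrom:
      - configMapRef:
          name: app-config    # exposes DATABASE_HOST, LOG_LEVEL, MAX_CONNECTIONS
      - secretRef:
          name: app-secrets   # exposes DATABASE_PASSWORD, API_KEY
```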

Namespaces

Namespaces divide a single cluster into virtual sub-clusters. They provide isolation for teams, environments, or applications. Resources in one Namespace don't interfere with resources in another.

Analogy: Namespaces are floors in an office building. Each floor has its own rooms, equipment, and staff. People on floor 3 don't accidentally walk into floor 5's meeting rooms.

apiVersion: v1
kind: Namespace
metadata:
  name: production
---
apiVersion: v1
kind: Namespace
metadata:
  name: staging

Common Namespace patterns:

  • Per environment: production, staging, development
  • Per team: team-frontend, team-backend, team-data
  • Per application: payments-app, user-service

Kubernetes ships with several default Namespaces, including default, kube-system (for cluster components), and kube-public.
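In day-to-day work, you target a Namespace with the -n flag or switch your default (these are standard kubectl commands):

```shell
# Create a Namespace
kubectl create namespace production

# List Pods in a specific Namespace
kubectl get pods -n production

# Make "production" the default Namespace for the current context
kubectl config set-context --current --namespace=production
```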

PersistentVolumes

Containers are ephemeral: when a Pod is destroyed, any data inside it is lost. PersistentVolumes (PVs) provide storage that survives Pod restarts and deletions. A PersistentVolumeClaim (PVC) is a request for storage, which Kubernetes matches to an available PersistentVolume.

Analogy: A PersistentVolume is a filing cabinet in the office. Workers (Pods) come and go, but the filing cabinet stays. A PVC is a request form: "I need a filing cabinet with at least 10GB of space."

# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: gp3
---
# Using the PVC in a Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: db-data
      volumes:
        - name: db-data
          persistentVolumeClaim:
            claimName: database-storage

On managed Kubernetes (EKS, GKE, AKS), the cloud provider automatically provisions the underlying disk when you create a PVC. The storageClassName determines the disk type (e.g., gp3 for AWS EBS).
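After applying the manifests above, you can watch the claim get bound to dynamically provisioned storage (a sketch; output columns vary by cluster):

```shell
# The claim should move from Pending to Bound once a disk is provisioned
kubectl get pvc database-storage

# On managed clusters, the matching PersistentVolume appears automatically
kubectl get pv
```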

Kubernetes vs Docker Compose

If you've used Docker Compose, you might wonder when to use Kubernetes instead. Here's a direct comparison:

Feature | Docker Compose | Kubernetes
Scope | Single machine | Cluster of machines
Scaling | Manual (scale: 5 in YAML) | Automatic (HPA based on metrics)
Self-healing | Restart policies only | Full self-healing across nodes
Load balancing | Basic (round-robin) | Advanced (Services, Ingress)
Rolling updates | Not built-in | Native, zero-downtime
Storage | Docker volumes (local) | PersistentVolumes (network-attached)
Networking | Single Docker network | Cluster-wide networking, DNS, network policies
Secrets management | Environment files | Encrypted Secrets, external vaults
Configuration | docker-compose.yml | Declarative YAML manifests
Setup complexity | Low (minutes) | High (hours to days for self-managed)
Best for | Development, small projects | Production, microservices, scale
Community/ecosystem | Moderate | Massive (Helm charts, operators, CNCF)

Rule of thumb: Use Docker Compose for local development and single-server deployments. Use Kubernetes when you need multi-server reliability, auto-scaling, or zero-downtime deployments.

Many teams use both: Docker Compose for local development, Kubernetes for staging and production.

When to use Kubernetes

Kubernetes is powerful but not always necessary. Here's a decision framework:

You likely need Kubernetes if:

  • You run multiple services that communicate with each other (microservices)
  • You need auto-scaling based on CPU, memory, or custom metrics
  • You require zero-downtime deployments
  • You need high availability across multiple servers or availability zones
  • Your team deploys frequently (multiple times per day)
  • You're running AI/ML workloads that need GPU scheduling
  • You need consistent environments across development, staging, and production

You probably don't need Kubernetes if:

  • You have a single application on one or two servers
  • Docker Compose handles your deployment needs
  • You have a very small team (1-3 developers) with no dedicated operations staff
  • Your application has low traffic with no scaling requirements
  • A Platform-as-a-Service (Heroku, Railway, Render) meets your needs

The middle ground: Managed Kubernetes services (EKS, GKE, AKS) reduce operational overhead significantly. The control plane is managed for you. If you're on the fence, starting with a managed service is a sensible compromise.

Getting started with Kubernetes

There are three main ways to run Kubernetes as a beginner:

1. Minikube

Minikube runs a single-node Kubernetes cluster on your local machine inside a virtual machine or container. It's the simplest way to start learning.

# Install minikube (macOS)
brew install minikube

# Start a cluster
minikube start

# Verify it's running
kubectl get nodes

# Deploy a sample application
kubectl create deployment hello --image=nginx:1.27
kubectl expose deployment hello --port=80 --type=NodePort

# Access the application
minikube service hello

Minikube is ideal for learning and experimentation. It supports most Kubernetes features, including Ingress, DNS, storage, and dashboard.

2. kind (Kubernetes in Docker)

kind runs Kubernetes clusters as Docker containers. It's faster to start than Minikube and is popular for CI/CD testing.

# Install kind
brew install kind

# Create a cluster
kind create cluster --name my-cluster

# Verify
kubectl get nodes

3. Managed Kubernetes (for production)

For real workloads, use a managed Kubernetes service from a cloud provider:

Provider | Service | Strengths
AWS | EKS (Elastic Kubernetes Service) | Largest ecosystem, deepest AWS integration
Google Cloud | GKE (Google Kubernetes Engine) | Created K8s, best managed experience
Azure | AKS (Azure Kubernetes Service) | Strong enterprise integration, free control plane

Managed services handle the control plane for you (API server, scheduler, etcd). You only manage the worker nodes and your application deployments. This removes the hardest part of running Kubernetes in production.

Our recommendation: Start with Minikube to learn concepts, then move to a managed service (EKS is most common in job postings) when you're comfortable with the core objects.

Kubernetes for AI workloads

One of the fastest-growing use cases for Kubernetes is running AI and machine learning workloads. Here's why Kubernetes has become essential for AI infrastructure.

The GPU scheduling problem

AI model training and inference require GPUs: expensive, specialised hardware. A single NVIDIA A100 GPU costs thousands of pounds. Organisations run clusters of these GPUs, and they need to share them efficiently across teams and workloads.

Kubernetes solves this with the NVIDIA device plugin, which makes GPUs schedulable resources just like CPU and memory:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-inference
spec:
  replicas: 2
  selector:
    matchLabels:
      app: inference
  template:
    metadata:
      labels:
        app: inference
    spec:
      containers:
        - name: model-server
          image: my-model-server:v1
          resources:
            limits:
              nvidia.com/gpu: 1  # Request 1 GPU per Pod
          ports:
            - containerPort: 8080

The scheduler automatically places this Pod on a node that has an available GPU. No manual assignment, no SSH-ing into GPU servers.
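You can confirm that a node advertises GPUs once the device plugin is installed (a sketch; the node name gpu-node-1 is hypothetical):

```shell
# Show allocatable GPUs on a node (requires the NVIDIA device plugin)
kubectl describe node gpu-node-1 | grep -i nvidia.com/gpu
```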

Why K8s is essential for AI infrastructure

Multi-tenancy: multiple teams share the same GPU cluster. Kubernetes Namespaces and resource quotas ensure fair allocation.

Auto-scaling: scale inference Pods up during peak demand and down during quiet periods. GPU time is expensive; auto-scaling prevents waste.
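The simplest version of this is a CPU-based Horizontal Pod Autoscaler, which you can create imperatively (a sketch; scaling on GPU utilisation itself requires custom metrics):

```shell
# Scale model-inference between 1 and 4 replicas, targeting 70% CPU
kubectl autoscale deployment model-inference --min=1 --max=4 --cpu-percent=70

# Inspect the autoscaler
kubectl get hpa model-inference
```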

Pipeline orchestration: tools like Kubeflow run end-to-end ML pipelines on Kubernetes: data preprocessing, model training, evaluation, and deployment all as Kubernetes jobs.

Model serving: frameworks like NVIDIA Triton and vLLM run as Kubernetes Deployments, serving multiple models concurrently with load balancing.

Monitoring: DCGM Exporter exposes GPU metrics (utilisation, memory, temperature) to Prometheus, giving teams visibility into GPU health and usage through Grafana dashboards.

The infrastructure behind products like ChatGPT runs on Kubernetes. Companies building AI products need engineers who understand both Kubernetes and GPU workload management, a combination that commands some of the highest salaries in tech.

Common kubectl commands

kubectl is the command-line tool for interacting with Kubernetes. Here are the commands you'll use most frequently:

Command | What it does
kubectl get pods | List all Pods in the current Namespace
kubectl get pods -A | List all Pods across all Namespaces
kubectl get deployments | List all Deployments
kubectl get services | List all Services
kubectl get nodes | List all nodes in the cluster
kubectl describe pod <name> | Show detailed information about a Pod
kubectl logs <pod-name> | View container logs
kubectl logs <pod-name> -f | Stream logs in real time
kubectl exec -it <pod-name> -- /bin/sh | Open a shell inside a running container
kubectl apply -f <file.yaml> | Create or update resources from a YAML file
kubectl delete -f <file.yaml> | Delete resources defined in a YAML file
kubectl scale deployment <name> --replicas=5 | Scale a Deployment to 5 Pods
kubectl rollout status deployment/<name> | Watch a rolling update in progress
kubectl rollout undo deployment/<name> | Roll back to the previous version
kubectl top pods | Show CPU and memory usage per Pod
kubectl get events --sort-by=.metadata.creationTimestamp | View recent cluster events
kubectl config get-contexts | List available cluster contexts
kubectl config use-context <name> | Switch to a different cluster

Tip: Add -o wide to any get command for additional details (node name, IP address). Add -o yaml to see the full YAML definition of any resource.

What to learn before Kubernetes

Kubernetes builds on several foundational skills. Trying to learn Kubernetes without these prerequisites is like learning calculus without algebra: technically possible, but unnecessarily painful.

  1. Linux fundamentals: navigating the filesystem, managing processes, reading logs, basic networking commands (curl, netstat, dig). Kubernetes runs on Linux, and debugging requires Linux skills.

  2. Networking basics: TCP/IP, DNS, ports, HTTP/HTTPS, load balancing concepts. Kubernetes networking is complex; understanding the fundamentals makes it manageable.

  3. Docker: building images with Dockerfiles, running containers, Docker Compose for multi-container setups, container registries. You need to understand what Kubernetes is orchestrating.

  4. YAML: Kubernetes configuration is written in YAML. Understand indentation, lists, maps, and multi-document files.
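The YAML features you will meet constantly fit in one small example (illustrative only, not a Kubernetes object):

```yaml
# Maps (key: value), nested by indentation (spaces, never tabs)
server:
  host: "example.com"
  port: 8080

# Lists use a leading dash
regions:
  - eu-west-2
  - us-east-1

---
# A second document in the same file, separated by ---
environment: staging
```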

The CloudPros curriculum covers Docker in Weeks 5-6 and Kubernetes in Weeks 12-13, ensuring you have all prerequisites before touching K8s.

What to learn next

Once you're comfortable with the core concepts in this guide, here's the progression:

  1. Helm: the package manager for Kubernetes. Helm charts template your YAML manifests, making them reusable and configurable. Most production applications are deployed via Helm.

  2. Horizontal Pod Autoscaler (HPA): automatically scales Pods based on CPU, memory, or custom metrics. Essential for production workloads.

  3. RBAC (Role-Based Access Control): controls who can do what in the cluster. Critical for multi-team environments.

  4. Network Policies: firewall rules between Pods. Define which Pods can communicate with which other Pods.

  5. GitOps with ArgoCD: define your desired cluster state in Git, and ArgoCD continuously syncs the cluster to match. The modern standard for Kubernetes deployment.

  6. Service Mesh (Istio/Linkerd): advanced networking: mutual TLS between services, traffic splitting, observability. Relevant for large microservice architectures.

For a full breakdown of the tools ecosystem, see our DevOps tools guide. To understand how Kubernetes fits into the broader DevOps landscape, read What is DevOps?. For a focused comparison of Docker and Kubernetes, see Docker vs Kubernetes: Which Should You Learn First? and Kubernetes Explained Simply.

Ola

Founder, CloudPros

Building the most hands-on DevOps bootcamp for the AI era. 16 weeks of real infrastructure, real projects, real career outcomes.