DevOps & Docker Interview
Questions & Answers
🌱 Beginner Questions (Q1–Q14)
DevOps is a culture, philosophy, and set of practices that unify software development (Dev) and operations (Ops) teams to deliver software faster, more reliably, and continuously.
The problem it solves: Historically, Dev teams wrote code and threw it "over the wall" to Ops teams to deploy and maintain. Different goals (Dev: ship fast, Ops: keep stable) caused friction, slow releases, blame culture, and fragile deployments.
- Continuous Integration (CI): Developers frequently merge code; automated tests run on every change.
- Continuous Delivery (CD): Code is always in a deployable state; releases are automated to staging.
- Continuous Deployment: Every passing build is automatically deployed to production.
- Infrastructure as Code (IaC): Servers and infrastructure defined in code, version-controlled.
- Monitoring & Feedback: Measure production metrics, feed insights back to development.
Docker is an open platform for building, shipping, and running applications in containers — lightweight, portable, self-contained units that package code along with all its dependencies.
The "works on my machine" problem: Before Docker, apps would work on a developer's laptop but fail in staging/production due to different OS versions, library versions, or environment variables. Docker solves this by shipping the environment along with the code.
Docker vs VMs: VMs virtualise an entire machine (OS + hardware). Containers share the host OS kernel — much lighter (MBs vs GBs), start in milliseconds vs minutes.
A Dockerfile is a text file containing a series of instructions that Docker reads to automatically build an image. Each instruction creates a new layer in the image.
| Instruction | Purpose |
|---|---|
| FROM | Base image to build from |
| WORKDIR | Set working directory inside container |
| COPY / ADD | Copy files into image (ADD also handles URLs & tar extraction) |
| RUN | Execute command during build (creates a layer) |
| ENV | Set environment variables |
| EXPOSE | Document which port the container listens on |
| CMD | Default command when container starts (overridable) |
| ENTRYPOINT | Main command (not easily overridden) |
| ARG | Build-time variables (not in final image) |
| VOLUME | Declare mount point for persistent data |
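Several of these instructions come together in even a small Dockerfile. Here is a minimal sketch for a hypothetical Node.js app (the file names, port, and start command are assumptions):

```dockerfile
# Base image to build from
FROM node:20-alpine
# Set working directory inside the container
WORKDIR /app
# Copy dependency manifests first, so this layer caches well
COPY package*.json ./
RUN npm ci --omit=dev
# Copy the application source
COPY . .
# Document the listening port
EXPOSE 3000
# Default command, overridable at docker run time
CMD ["node", "server.js"]
```

Build and run with `docker build -t myapp .` followed by `docker run -p 3000:3000 myapp`.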
| Feature | Docker Image | Docker Container |
|---|---|---|
| What it is | Blueprint / template (read-only layers) | Running instance of an image |
| State | Immutable — never changes | Has writable layer on top |
| Analogy | Like a class definition | Like an object/instance |
| Storage | Shared across containers | Adds thin writable layer per container |
| Created by | docker build | docker run |
| Data persistence | Persists until deleted | Data lost when removed (use volumes) |
Docker caches each layer. If a layer hasn't changed since the last build, Docker reuses the cached version — making subsequent builds much faster. Once a layer is invalidated, all subsequent layers are rebuilt.
Docker Compose is a tool for defining and running multi-container Docker applications using a single YAML file. Instead of running multiple docker run commands, you declare all services, networks, and volumes in one file.
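A minimal sketch of such a file (service names, images, ports, and the placeholder credentials are illustrative assumptions):

```yaml
# docker-compose.yml: one file describes the whole stack
services:
  web:
    build: .                 # build from the local Dockerfile
    ports:
      - "8080:3000"          # host:container
    depends_on:
      - db
    environment:
      DATABASE_URL: postgres://app:secret@db:5432/app
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret   # placeholder, not for production
      POSTGRES_DB: app
    volumes:
      - db-data:/var/lib/postgresql/data   # named volume for persistence
volumes:
  db-data:
```

`docker compose up -d` starts everything; `docker compose down` tears it down.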
Container filesystems are ephemeral — data written inside a container is lost when the container is removed. Volumes provide persistent storage that survives container restarts and removals.
| Type | Description | Best for |
|---|---|---|
| Named Volume | Managed by Docker, stored in Docker's storage area | Production databases, app state |
| Bind Mount | Maps a host directory into the container | Development (live code reload) |
| tmpfs Mount | Stored in host memory only (not persisted) | Sensitive data, temp files |
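The three mount types can be declared side by side in a Compose service (image name and paths are illustrative):

```yaml
services:
  app:
    image: myapp:latest            # assumed image name
    volumes:
      - app-state:/data            # named volume: survives container removal
      - ./src:/app/src             # bind mount: live code reload in dev
    tmpfs:
      - /tmp/scratch               # tmpfs: in-memory only, never persisted
volumes:
  app-state:
```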
CI/CD is the practice of automating the integration, testing, and delivery of code changes.
| Stage | What happens | Trigger |
|---|---|---|
| Continuous Integration | Merge frequently → run automated tests + static analysis on every push | Every git push / PR |
| Continuous Delivery | Every passing build is automatically deployed to staging/pre-prod. Release to prod requires manual approval. | After CI passes |
| Continuous Deployment | Every passing build automatically goes all the way to production. No manual step. | After CI passes (fully automated) |
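As a concrete sketch, the CI stage might look like this hypothetical GitHub Actions workflow (the repo layout, Node version, and test command are assumptions):

```yaml
# .github/workflows/ci.yml
name: CI
on: [push, pull_request]       # runs on every push / PR
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci            # install exact locked dependencies
      - run: npm test          # automated tests gate every change
```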
Kubernetes (K8s) is an open-source container orchestration platform that automates deployment, scaling, and management of containerised applications across a cluster of machines.
Problems it solves:
- Container scheduling: Decides which node in the cluster to run each container on based on resource availability.
- Auto-scaling: Automatically scales pods up/down based on CPU, memory, or custom metrics.
- Self-healing: Automatically restarts failed containers, replaces unhealthy nodes, kills containers that fail health checks.
- Rolling updates & rollbacks: Deploy new versions with zero downtime; instantly roll back if something goes wrong.
- Service discovery & load balancing: Pods get DNS names; Services load-balance traffic across them.
- Secret & config management: Store sensitive data and configuration separately from container images.
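A minimal Deployment plus Service sketch illustrating several of these features (names, image, and replica count are assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                      # K8s keeps 3 pods running (self-healing)
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myorg/web:1.0.0   # assumed image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service                      # load-balances across the pods
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```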
Infrastructure as Code defines and provisions infrastructure (servers, networks, databases, load balancers) using machine-readable configuration files — stored in version control — instead of manual processes or UIs.
- Benefits: Reproducible environments, version-controlled changes, peer review via PRs, automated provisioning, disaster recovery (recreate from code in minutes).
- Terraform (HashiCorp): Declarative HCL language. Cloud-agnostic — works with AWS, GCP, Azure. Manages state file tracking what's deployed. Most widely used IaC tool.
- AWS CloudFormation: AWS-native IaC in YAML/JSON. Tight AWS integration but vendor-locked.
- Pulumi: IaC using real programming languages (TypeScript, Python, Go). Great for complex logic.
- Ansible: Configuration management + provisioning in YAML (playbooks). Agentless — uses SSH.
- Helm: Package manager for Kubernetes. Charts are templated K8s manifests.
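A minimal Terraform sketch of the declarative style (the region, AMI ID, and tags are placeholders, not a tested configuration):

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "eu-west-1"            # assumed region
}

resource "aws_instance" "web" {
  ami           = "ami-12345678"  # placeholder AMI ID
  instance_type = "t3.micro"
  tags = {
    Name = "web-server"
  }
}
```

Running `terraform plan` shows what would change; `terraform apply` converges real infrastructure to this declared state.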
| Feature | Virtual Machine (VM) | Container |
|---|---|---|
| OS | Full OS per VM (GBs) | Shares host OS kernel (MBs) |
| Start time | Minutes | Milliseconds |
| Size | GBs | MBs |
| Isolation | Strong (hardware-level) | Process-level (namespace/cgroups) |
| Portability | Heavy, hypervisor dependent | Highly portable |
| Use case | Run different OS, strong security isolation | Microservices, CI/CD, cloud-native apps |
| Examples | VMware, VirtualBox, AWS EC2 instances | Docker, containerd, podman |
A .dockerignore file works like .gitignore — it tells Docker which files and directories to exclude when sending the build context to the Docker daemon. Smaller build context = faster builds and smaller images.
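A typical .dockerignore sketch (entries are common examples, adjust per project):

```
# .dockerignore: excluded from the build context
.git
node_modules
*.log
# never ship local secrets into the build context
.env
dist
```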
Never bake .env files or other secrets into Docker images. Secrets baked into images can be extracted from the image layers even if you delete them in a later layer. Use runtime environment variables or secret managers (Vault, AWS Secrets Manager) instead.

A container registry is a repository for storing, versioning, and distributing Docker images. Like GitHub for code, but for container images.
| Registry | Type | Notes |
|---|---|---|
| Docker Hub | Public/Private | Default registry. Free tier has pull limits. |
| AWS ECR | Private (AWS) | Tight IAM integration, lifecycle policies |
| Google Artifact Registry | Private (GCP) | Replaced GCR, supports multiple formats |
| Azure Container Registry | Private (Azure) | Integrated with AKS |
| GitHub Container Registry | Public/Private | Free for public, integrated with Actions |
| Harbor | Self-hosted | Open-source, vulnerability scanning |
⚡ Intermediate Questions (Q15–Q28)
Multi-stage builds use multiple FROM instructions in a single Dockerfile, allowing you to use a heavy build environment but produce a tiny final image containing only what's needed at runtime.
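A sketch for a hypothetical Go service (the module layout and binary name are assumptions):

```dockerfile
# Stage 1: heavy build environment with the full Go toolchain
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/server ./cmd/server

# Stage 2: tiny runtime image containing only the compiled binary
FROM gcr.io/distroless/static
COPY --from=builder /bin/server /server
ENTRYPOINT ["/server"]
```

The builder stage can weigh a gigabyte; the final image is only the binary plus a minimal base.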
| Feature | ConfigMap | Secret |
|---|---|---|
| Use for | Non-sensitive config (URLs, feature flags, ports) | Sensitive data (passwords, API keys, certs) |
| Encoding | Plain text | Base64 encoded (not encrypted by default!) |
| etcd storage | Unencrypted | Can be encrypted at rest (requires config) |
| Access | Env vars or mounted files | Env vars or mounted files |
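A side-by-side sketch (names and values are illustrative; the password is a placeholder):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  FEATURE_FLAG: "true"          # non-sensitive, stored as plain text
  API_URL: "https://api.example.com"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
stringData:                     # Kubernetes base64-encodes this on write
  DB_PASSWORD: changeme         # placeholder, use a secret manager in practice
```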
A Service routes traffic to pods inside the cluster. An Ingress is a layer 7 (HTTP/HTTPS) routing resource that routes external traffic to internal Services based on hostname/path rules.
Ingress Controllers: NGINX Ingress (most popular), Traefik, AWS ALB Ingress Controller, HAProxy. The controller reads Ingress resources and configures the underlying load balancer accordingly.
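A minimal Ingress sketch routing by host and path (the hostname and Service names are assumptions):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  ingressClassName: nginx          # assumes the NGINX Ingress Controller
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-svc      # internal Service for the API
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-svc      # internal Service for the frontend
                port:
                  number: 80
```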
The Horizontal Pod Autoscaler automatically scales the number of pod replicas in a Deployment based on observed CPU utilisation, memory, or custom metrics — ensuring your app handles traffic spikes without manual intervention.
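A sketch of an HPA targeting 70% average CPU (the Deployment name and replica bounds are assumptions):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above 70% average CPU
```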
| Probe | When fails | K8s action |
|---|---|---|
| Liveness | Container is alive but stuck (deadlock) | Restart the container |
| Readiness | Container is not ready to serve traffic (warming up, db connecting) | Remove from Service endpoints (stop routing traffic) |
| Startup | Slow-starting app hasn't started yet | Delays liveness/readiness checks until startup succeeds |
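All three probes on one container, shown as a fragment of a pod spec (endpoints and timings are illustrative):

```yaml
containers:
  - name: web
    image: myorg/web:1.0.0         # assumed image
    livenessProbe:                 # restart the container if stuck
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
    readinessProbe:                # gate traffic until ready
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5
    startupProbe:                  # allow slow startup before liveness kicks in
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30
      periodSeconds: 10
```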
| Network Type | Isolation | Use case |
|---|---|---|
| bridge (default) | Private network on a single host. On user-defined bridge networks, containers can reach each other by container name. | Multi-container apps on same host (Compose) |
| host | Container shares host's network namespace. No network isolation. | Max performance, when port mapping overhead matters |
| none | Complete network isolation | Batch jobs, maximum security |
| overlay | Spans multiple Docker hosts (swarm) | Docker Swarm, multi-host communication |
| macvlan | Container gets own MAC address on physical network | Legacy apps expecting direct network access |
Prometheus is a time-series metrics database that scrapes metrics from targets (apps, nodes, K8s). Grafana is a visualisation platform that queries Prometheus and displays dashboards.
Key Prometheus concepts: Scrape interval (how often to collect), retention period, PromQL (query language for metrics), AlertManager (route alerts to PagerDuty, Slack, email).
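A minimal prometheus.yml sketch (the job name and target are assumptions):

```yaml
global:
  scrape_interval: 15s            # how often Prometheus pulls metrics
scrape_configs:
  - job_name: myapp               # assumed job name
    static_configs:
      - targets: ["myapp:8080"]   # endpoint exposing /metrics
```

A PromQL query such as rate(http_requests_total[5m]) then turns a raw counter into a per-second rate for a Grafana panel (the metric name is an assumption).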
| Feature | Deployment | StatefulSet |
|---|---|---|
| Pod identity | Random names (myapp-xyz123) | Stable, ordered names (myapp-0, myapp-1) |
| Storage | Shared or ephemeral | Stable PersistentVolume per pod |
| Scaling order | Any order | Sequential (0→1→2 up, 2→1→0 down) |
| DNS | Service DNS only | Each pod gets stable DNS hostname |
| Use case | Stateless apps (web servers, APIs) | Stateful apps (databases, Kafka, Zookeeper) |
| Strategy | How it works | Risk | Cost |
|---|---|---|---|
| Rolling Update | Gradually replace old pods with new. Traffic shifts as pods become ready. | Both versions live simultaneously briefly | No extra cost |
| Blue/Green | Run two full environments (blue=current, green=new). Switch traffic all at once via DNS/LB. | All users cut over at once, but rollback is instant (switch back) | 2× infrastructure cost during switch |
| Canary | Send small % of traffic (5–10%) to new version. Monitor. Gradually increase or roll back. | Only affects small % of users if fails | Slightly more infra |
| Recreate | Stop all old pods, start all new ones. Downtime. | Downtime during update | Minimal |
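A fragment of a Deployment spec setting rolling-update parameters (values are illustrative):

```yaml
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # at most 1 extra pod during the rollout
      maxUnavailable: 0      # never drop below the desired replica count
```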
Resource requests and limits tell Kubernetes how much CPU and memory a container needs and the maximum it can use. This enables the scheduler to place pods efficiently and prevent resource contention.
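A fragment of a pod spec with requests and limits (the image and values are illustrative):

```yaml
containers:
  - name: api
    image: myorg/api:1.0.0       # assumed image
    resources:
      requests:                  # what the scheduler reserves for the pod
        cpu: "250m"              # 0.25 of a CPU core
        memory: "256Mi"
      limits:                    # hard cap enforced by the kernel
        cpu: "500m"
        memory: "512Mi"          # exceeding this gets the container OOMKilled
```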
Debugging tip: run kubectl describe pod and look for OOMKilled events, then increase the memory limit or fix the memory leak.

Helm is the package manager for Kubernetes. It bundles related K8s manifests into a Chart — a versioned, parameterisable package. Instead of managing dozens of YAML files, you deploy with one command and customise with values.
Chart structure: Chart.yaml (metadata), values.yaml (default values), templates/ (K8s manifests with Go templating), charts/ (dependencies).
GitOps is a DevOps practice where Git is the single source of truth for infrastructure and application configuration. The cluster continuously syncs itself to match the desired state declared in Git.
- Push-based (traditional CI/CD): CI pipeline pushes changes to the cluster (kubectl apply). Problem: pipeline needs cluster credentials, drift goes undetected.
- Pull-based (GitOps): An agent running inside the cluster watches a Git repo. When it detects drift (cluster ≠ Git), it automatically reconciles. Credentials never leave the cluster.
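A sketch of the pull-based model using an Argo CD Application resource (the repo URL, paths, and namespaces are placeholders):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/infra   # placeholder repo
    targetRevision: main
    path: apps/web                # manifests live here in Git
  destination:
    server: https://kubernetes.default.svc
    namespace: web
  syncPolicy:
    automated:
      selfHeal: true              # reconcile drift automatically
      prune: true                 # delete resources removed from Git
```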
Containers aren't magic — they're Linux processes using two kernel features: namespaces for isolation and cgroups for resource limits.
| Namespace | Isolates |
|---|---|
| pid | Process IDs — container sees only its own processes |
| net | Network interfaces — container gets its own network stack |
| mnt | Mount points — container has its own filesystem view |
| uts | Hostname and domain name |
| ipc | Inter-process communication |
| user | User/group IDs — map container root to non-root on host |
cgroups (control groups): Limit and account for resource usage (CPU, memory, I/O, network) per group of processes. When you set --memory=512m on a container, Docker configures a cgroup to enforce that limit.
🔥 Advanced Questions (Q29–Q40)
A Kubernetes Operator extends K8s with a custom controller that encodes operational knowledge for managing stateful applications. It uses Custom Resource Definitions (CRDs) to define new resource types, and a controller loop to reconcile desired vs actual state.
- CRD: Defines a new resource type (e.g., PostgresCluster). Users create instances of this resource just like Deployments.
- Controller: Watches CRD instances, compares desired state to actual state, makes changes to converge them — automated DBA/SRE knowledge.
- When to build one: Complex stateful app lifecycle that kubectl alone can't manage — automated backup/restore, version upgrades with data migration, failover, scaling with rebalancing.
Operator frameworks: Operator SDK (Go), Kopf (Python), kubebuilder. Popular operators: Prometheus Operator, Cert-Manager, Strimzi (Kafka), CloudNativePG.
A service mesh adds a transparent infrastructure layer for microservice-to-microservice communication — implementing cross-cutting concerns without code changes via sidecar proxies.
- Sidecar pattern: Each pod gets an injected proxy (Envoy for Istio) that intercepts all traffic in/out of the pod.
- mTLS: Mutual TLS encryption and authentication between all services — automatically, without app code changes.
- Traffic management: Fine-grained routing (canary, A/B, weight-based), retries, timeouts, circuit breaking — all configured as K8s resources.
- Observability: Distributed tracing (Jaeger), metrics (Prometheus), and access logs — automatically for all services.
- Istio: Feature-rich, more complex. Uses Envoy sidecar + Istiod control plane.
- Linkerd: Simpler, lighter, written in Rust. Ultra-low latency overhead (<1ms). Easier to operate.
- Cilium (eBPF-based): Next-gen — implements service mesh at the kernel level using eBPF, no sidecar required. Lower overhead.
The ELK Stack (now Elastic Stack) is the most popular open-source log management solution.
| Component | Role |
|---|---|
| Elasticsearch | Distributed search and analytics engine. Stores and indexes logs. Near real-time full-text search. |
| Logstash | Log pipeline — collects, transforms, and ships logs from multiple sources to Elasticsearch. |
| Kibana | Web UI for searching, visualising, and dashboarding Elasticsearch data. |
| Beats/Filebeat | Lightweight log shippers. Run on each server/pod to tail log files and send to Logstash or Elasticsearch. |
Alternative: Grafana Loki — Like Prometheus but for logs. Only indexes metadata (labels), not full text — much cheaper to run at scale. Pairs with Promtail (shipper) and Grafana (visualisation).
RBAC controls who can do what with which Kubernetes resources. It uses four main objects: Role, ClusterRole, RoleBinding, and ClusterRoleBinding.
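A sketch granting read-only access to pods in one namespace (the namespace and user name are assumptions):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-reader
rules:
  - apiGroups: [""]              # "" means the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding                # binds the Role to a subject
metadata:
  namespace: dev
  name: read-pods
subjects:
  - kind: User
    name: jane                   # assumed user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

A ClusterRole and ClusterRoleBinding follow the same shape but apply cluster-wide rather than to one namespace.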
Supply chain attacks target the software build process itself — injecting malicious code into dependencies, base images, or build systems. Container image security involves scanning, signing, and policy enforcement at every stage.
SLSA framework (Supply chain Levels for Software Artifacts): Google's framework for hardening the build pipeline — hermetic builds, provenance attestation, two-person review. Level 4 = highest assurance.
eBPF (extended Berkeley Packet Filter) allows custom programs to run safely inside the Linux kernel without changing kernel source or loading modules. It's revolutionising networking, security, and observability.
- How it works: Write eBPF programs in restricted C. Kernel verifier ensures safety. JIT-compiled to native instructions. Runs at hook points (syscalls, network, tracing events) with minimal overhead.
- Observability (Pixie, Hubble): Capture any system event — network connections, file I/O, syscalls — without modifying application code. Get golden signal metrics automatically for every service.
- Networking (Cilium): Replace kube-proxy with eBPF-based load balancing. Faster than iptables at scale. Service mesh capabilities without sidecars.
- Security (Falco, Tetragon): Runtime security — detect suspicious syscalls (privilege escalation, unexpected network connections) instantly at kernel level.
- Performance profiling (Parca, Pyroscope): Always-on continuous profiling with near-zero overhead. Profile any process in production without instrumentation.
Distributed tracing tracks a single request as it flows through multiple microservices, showing exactly where time is spent. Each request gets a unique trace ID propagated through all services via HTTP headers.
Tools: OpenTelemetry (standard SDK), Jaeger/Tempo (backends), Grafana Tempo (cheap, pairs with Loki+Prometheus), AWS X-Ray, Datadog APM.
By default, all pods in a Kubernetes cluster can communicate with each other. Network Policies act as a firewall — restricting which pods can talk to which other pods and on which ports.
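A sketch that only lets pods labelled app: api reach the database pods on port 5432 (labels and port are assumptions):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-api
spec:
  podSelector:
    matchLabels:
      app: db                # policy applies to database pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api       # only api pods may connect
      ports:
        - protocol: TCP
          port: 5432
```

Note that enforcing Network Policies requires a CNI plugin that supports them (e.g. Calico or Cilium).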
Chaos engineering deliberately injects failures into a system to discover weaknesses before they cause incidents. The goal is to build confidence that the system will withstand turbulent, real-world conditions.
- Principles: Start with a hypothesis (the system will recover from X). Run experiments. Measure impact. Fix weaknesses discovered.
- Types of experiments: Kill random pods, add network latency/packet loss, exhaust CPU/memory, kill nodes, trigger DNS failures, cut off database connections.
- SLOs & Error Budgets: Define SLOs (e.g. 99.9% availability). Track error budget consumption. When budget is at risk, freeze feature work, focus on reliability.
- Blameless postmortems: After incidents, analyse what happened systematically — not who to blame. Document timeline, root cause, contributing factors, action items. Share learnings broadly.
- On-call rotation: Developers who write code carry pagers. Shared ownership means better designed systems (you don't write fragile code if you're the one woken at 3am).
- Runbooks / Playbooks: Document step-by-step response procedures for known failure modes. Reduces MTTR when engineers are stressed at 2am.
- Feature flags: Decouple deployment from release. Deploy dark, enable per-user/percentage/cohort. Instant rollback without re-deploy.
- Disaster recovery drills: Regularly practice restoring from backup, failing over to secondary region, recovering from a database corruption scenario. DR that's never tested doesn't work.
- Capacity planning: Monitor resource trends, project growth, provision ahead of demand. Avoid reactive scaling that causes outages.
- Toil reduction: SRE principle — automate repetitive operational tasks. If a human does the same thing > twice, automate it. Track toil as a metric and reduce it.
🎉 The Complete Interview Hub is Live!
You've now covered all 11 topics — JavaScript, Git, Python, React, HTML & CSS, Node.js, SQL, TypeScript, System Design, DSA, and DevOps & Docker. Share RankWeb3 with anyone preparing for tech interviews.