DevOps & Docker Interview Questions & Answers (2026) – RankWeb3


📅 Updated: March 2026
⏱️ Read time: ~30 min
🎯 40 questions — Beginner to Advanced
✍️ By RankWeb3 Team
40 total questions: 14 beginner · 14 intermediate · 12 advanced

🌱 Beginner Questions (Q1–Q14)

Q1. What is DevOps and what problem does it solve?
Beginner · Very Common

DevOps is a culture, philosophy, and set of practices that unify software development (Dev) and operations (Ops) teams to deliver software faster, more reliably, and continuously.

The problem it solves: Historically, Dev teams wrote code and threw it "over the wall" to Ops teams to deploy and maintain. Different goals (Dev: ship fast, Ops: keep stable) caused friction, slow releases, blame culture, and fragile deployments.

  • Continuous Integration (CI): Developers frequently merge code; automated tests run on every change.
  • Continuous Delivery (CD): Code is always in a deployable state; releases are automated to staging.
  • Continuous Deployment: Every passing build is automatically deployed to production.
  • Infrastructure as Code (IaC): Servers and infrastructure defined in code, version-controlled.
  • Monitoring & Feedback: Measure production metrics, feed insights back to development.
💡 Key metrics: Deployment frequency (how often you ship), lead time (idea → production), mean time to recovery (MTTR), change failure rate. High-performing DevOps teams deploy multiple times per day with <1% failure rate.
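These metrics fall out of plain deployment records. A minimal sketch in Python, using a hypothetical four-deploy log (the dates and outcomes are invented for illustration):

```python
from datetime import datetime

# Hypothetical deployment log: (deployed_at, caused_failure, recovered_at)
deploys = [
    (datetime(2026, 3, 2, 9, 0),  False, None),
    (datetime(2026, 3, 2, 15, 0), True,  datetime(2026, 3, 2, 15, 30)),
    (datetime(2026, 3, 3, 11, 0), False, None),
    (datetime(2026, 3, 4, 10, 0), False, None),
]

days = (deploys[-1][0].date() - deploys[0][0].date()).days + 1
deploy_frequency = len(deploys) / days                   # deploys per day
failures = [d for d in deploys if d[1]]
change_failure_rate = len(failures) / len(deploys)       # failed / total
mttr = sum((rec - dep).total_seconds()
           for dep, _, rec in failures) / len(failures)  # seconds to recover

print(round(deploy_frequency, 2))   # 1.33 deploys/day
print(change_failure_rate)          # 0.25
print(mttr / 60)                    # 30.0 minutes
```

In practice these numbers come from your CI/CD system's deployment events and your incident tracker, not a hand-written list.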
Q2. What is Docker and what problem does it solve?
Beginner · Very Common

Docker is an open platform for building, shipping, and running applications in containers — lightweight, portable, self-contained units that package code along with all its dependencies.

The "works on my machine" problem: Before Docker, apps would work on a developer's laptop but fail in staging/production due to different OS versions, library versions, or environment variables. Docker solves this by shipping the environment along with the code.

Docker — Basic Workflow
# Build an image from a Dockerfile:
docker build -t myapp:1.0 .

# Run a container from the image:
docker run -d \
  --name myapp \
  -p 3000:3000 \
  -e NODE_ENV=production \
  myapp:1.0

# View running containers:
docker ps

# View logs:
docker logs myapp

# Stop & remove:
docker stop myapp && docker rm myapp

# Push to registry:
docker push myrepo/myapp:1.0

Docker vs VMs: VMs virtualise an entire machine (OS + hardware). Containers share the host OS kernel — much lighter (MBs vs GBs), start in milliseconds vs minutes.

Q3. What is a Dockerfile and what are its key instructions?
Beginner · Very Common

A Dockerfile is a text file containing a series of instructions that Docker reads to automatically build an image. Each instruction creates a new layer in the image.

Dockerfile — Node.js App Example
# Base image
FROM node:20-alpine

# Set working directory
WORKDIR /app

# Copy dependency files FIRST (layer caching!)
COPY package*.json ./

# Install dependencies
RUN npm ci --only=production

# Copy source code
COPY . .

# Expose port (documentation only — doesn't publish)
EXPOSE 3000

# Create non-root user (security best practice)
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

# Command to run when container starts
CMD ["node", "server.js"]
Key instructions:
  • FROM: base image to build from
  • WORKDIR: set the working directory inside the container
  • COPY / ADD: copy files into the image (ADD also handles URLs & tar extraction)
  • RUN: execute a command during build (creates a layer)
  • ENV: set environment variables
  • EXPOSE: document which port the container listens on
  • CMD: default command when the container starts (overridable)
  • ENTRYPOINT: main command (not easily overridden)
  • ARG: build-time variables (not in the final image)
  • VOLUME: declare a mount point for persistent data
Q4. What is the difference between a Docker image and a Docker container?
Beginner · Very Common
  • What it is: image = blueprint/template (read-only layers); container = running instance of an image
  • State: image is immutable and never changes; container has a writable layer on top
  • Analogy: image is like a class definition; container is like an object/instance
  • Storage: image layers are shared across containers; each container adds a thin writable layer
  • Created by: image via docker build; container via docker run
  • Data persistence: image persists until deleted; container data is lost when removed (use volumes)
Docker
# One image → many containers:
docker run -d --name web1 nginx:latest
docker run -d --name web2 nginx:latest
docker run -d --name web3 nginx:latest

# All 3 containers share the nginx image layers.
# Each has its own writable container layer on top.

# Image layers (union filesystem):
#   [Read-only] Layer 3: COPY app/ .
#   [Read-only] Layer 2: RUN npm install
#   [Read-only] Layer 1: FROM node:20
#   [Writable ] Container layer (per container)
Q5. What is Docker layer caching and how do you optimise it?
Beginner · Very Common

Docker caches each layer. If a layer hasn't changed since the last build, Docker reuses the cached version — making subsequent builds much faster. Once a layer is invalidated, all subsequent layers are rebuilt.

Dockerfile — Cache Optimisation
# ❌ BAD — copies everything first, npm install rebuilds every time:
COPY . .
RUN npm install

# ✅ GOOD — copy package.json first, install deps, THEN copy source:
COPY package*.json ./     # only changes when deps change
RUN npm ci                # cached unless package.json changed
COPY . .                  # only invalidates the last layers

# Ordering rules:
# 1. Put rarely-changing instructions early
# 2. Put frequently-changing instructions (COPY source) late
# 3. Combine RUN commands to reduce layers:
RUN apt-get update \
 && apt-get install -y curl \
 && rm -rf /var/lib/apt/lists/*   # clean up in the same layer!
Q6. What is Docker Compose and what is it used for?
Beginner · Very Common

Docker Compose is a tool for defining and running multi-container Docker applications using a single YAML file. Instead of running multiple docker run commands, you declare all services, networks, and volumes in one file.

YAML — docker-compose.yml
version: '3.9'
services:
  api:
    build: .
    ports: ["3000:3000"]
    environment:
      - NODE_ENV=production
      - DB_URL=mongodb://mongo:27017/mydb
    depends_on:
      mongo:
        condition: service_healthy
    restart: unless-stopped
  mongo:
    image: mongo:7
    volumes: ["mongo_data:/data/db"]
    healthcheck:
      test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
      interval: 10s
      retries: 5
  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
volumes:
  mongo_data:

# Run all services:
#   docker compose up -d
#   docker compose down
#   docker compose logs -f api
Q7. What are Docker volumes and why are they needed?
Beginner · Very Common

Container filesystems are ephemeral — data written inside a container is lost when the container is removed. Volumes provide persistent storage that survives container restarts and removals.

  • Named volume: managed by Docker, stored in Docker's storage area. Best for production databases and app state.
  • Bind mount: maps a host directory into the container. Best for development (live code reload).
  • tmpfs mount: stored in host memory only (not persisted). Best for sensitive data and temp files.
Docker — Volumes
# Named volume — managed by Docker:
docker volume create postgres_data
docker run -v postgres_data:/var/lib/postgresql/data postgres:16

# Bind mount — host directory ↔ container directory:
docker run -v $(pwd)/src:/app/src node:20   # live reload in dev

# In docker-compose.yml:
#   volumes:
#     - ./src:/app/src                          # bind mount
#     - postgres_data:/var/lib/postgresql/data  # named volume

# Volume commands:
docker volume ls
docker volume inspect postgres_data
docker volume rm postgres_data
Q8. What is CI/CD and what is the difference between Continuous Delivery and Continuous Deployment?
Beginner · Very Common

CI/CD is the practice of automating the integration, testing, and delivery of code changes.

  • Continuous Integration: merge frequently; run automated tests + static analysis on every push. Trigger: every git push / PR.
  • Continuous Delivery: every passing build is automatically deployed to staging/pre-prod; release to production requires manual approval. Trigger: after CI passes.
  • Continuous Deployment: every passing build goes all the way to production, with no manual step. Trigger: after CI passes (fully automated).
YAML — GitHub Actions CI/CD Pipeline
name: CI/CD Pipeline
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test

  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - run: docker build -t myapp:${{ github.sha }} .
      - run: docker push myrepo/myapp:${{ github.sha }}

  deploy-staging:
    needs: build-and-push
    runs-on: ubuntu-latest
    environment: staging              # auto-deploy to staging
    steps:
      - run: kubectl set image deployment/myapp ...

  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment:
      name: production                # requires manual approval
      url: https://myapp.com
    steps:
      - run: kubectl set image deployment/myapp ...
Q9. What is Kubernetes and what problems does it solve?
Beginner · Very Common

Kubernetes (K8s) is an open-source container orchestration platform that automates deployment, scaling, and management of containerised applications across a cluster of machines.

Problems it solves:

  • Container scheduling: Decides which node in the cluster to run each container on based on resource availability.
  • Auto-scaling: Automatically scales pods up/down based on CPU, memory, or custom metrics.
  • Self-healing: Automatically restarts failed containers, replaces unhealthy nodes, kills containers that fail health checks.
  • Rolling updates & rollbacks: Deploy new versions with zero downtime; instantly roll back if something goes wrong.
  • Service discovery & load balancing: Pods get DNS names; Services load-balance traffic across them.
  • Secret & config management: Store sensitive data and configuration separately from container images.
💡 Docker vs Kubernetes: Docker runs containers on a single machine. Kubernetes orchestrates containers across many machines. They complement each other — Docker packages the app, Kubernetes runs it at scale.
Q10. What are the core Kubernetes objects — Pod, Deployment, Service, and Namespace?
Beginner · Very Common
YAML — Kubernetes Core Objects
# POD — smallest deployable unit. Runs one or more containers.
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
spec:
  containers:
    - name: myapp
      image: myapp:1.0
      ports: [{containerPort: 3000}]
---
# DEPLOYMENT — manages a ReplicaSet of Pods. Handles rolling updates.
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate: {maxSurge: 1, maxUnavailable: 0}
---
# SERVICE — stable DNS name + IP that load-balances across matching pods.
apiVersion: v1
kind: Service
spec:
  selector: {app: myapp}             # routes to pods with this label
  ports: [{port: 80, targetPort: 3000}]
  type: ClusterIP                    # internal | NodePort | LoadBalancer
---
# NAMESPACE — logical isolation within a cluster.
#   kubectl create namespace production
#   kubectl get pods -n production
Q11. What is Infrastructure as Code (IaC) and what are common tools?
Beginner · Very Common

Infrastructure as Code defines and provisions infrastructure (servers, networks, databases, load balancers) using machine-readable configuration files — stored in version control — instead of manual processes or UIs.

  • Benefits: Reproducible environments, version-controlled changes, peer review via PRs, automated provisioning, disaster recovery (recreate from code in minutes).
  • Terraform (HashiCorp): Declarative HCL language. Cloud-agnostic — works with AWS, GCP, Azure. Manages state file tracking what's deployed. Most widely used IaC tool.
  • AWS CloudFormation: AWS-native IaC in YAML/JSON. Tight AWS integration but vendor-locked.
  • Pulumi: IaC using real programming languages (TypeScript, Python, Go). Great for complex logic.
  • Ansible: Configuration management + provisioning in YAML (playbooks). Agentless — uses SSH.
  • Helm: Package manager for Kubernetes. Charts are templated K8s manifests.
HCL — Terraform Example
# Provision an EC2 instance:
resource "aws_instance" "web" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t3.micro"

  tags = {
    Name = "web-server"
  }
}

# terraform init → terraform plan → terraform apply
Q12. What is the difference between containers and virtual machines?
Beginner · Very Common
  • OS: VM runs a full OS per VM (GBs); a container shares the host OS kernel (MBs)
  • Start time: VMs take minutes; containers start in milliseconds
  • Size: VMs are GBs; containers are MBs
  • Isolation: VMs are strong (hardware-level); containers are process-level (namespaces/cgroups)
  • Portability: VMs are heavy and hypervisor-dependent; containers are highly portable
  • Use case: VMs for running a different OS or strong security isolation; containers for microservices, CI/CD, cloud-native apps
  • Examples: VMware, VirtualBox, AWS EC2 instances (VMs); Docker, containerd, Podman (containers)
ℹ️ In practice, containers run inside VMs in cloud environments. AWS EKS nodes are EC2 VMs running containerd. You get the security of VMs and the efficiency of containers.
Q13. What is a .dockerignore file?
Beginner · Common

A .dockerignore file works like .gitignore — it tells Docker which files and directories to exclude when sending the build context to the Docker daemon. Smaller build context = faster builds and smaller images.

.dockerignore
# Dependencies (will be installed fresh)
node_modules/
vendor/

# Version control
.git/
.gitignore

# Local environment
.env
.env.local
.env.*.local

# Development files
*.md
*.test.js
*.spec.ts
coverage/
.nyc_output/

# Build artifacts
dist/
build/
*.log

# Docker files themselves
Dockerfile*
docker-compose*
⚠️ Never copy .env files into Docker images. Secrets baked into images can be extracted from image layers even if you delete them in a later layer. Use runtime environment variables or secret managers (Vault, AWS Secrets Manager) instead.
Q14. What is a container registry and what are common options?
Beginner · Common

A container registry is a repository for storing, versioning, and distributing Docker images. Like GitHub for code, but for container images.

  • Docker Hub: public/private. The default registry; the free tier has pull limits.
  • AWS ECR: private (AWS). Tight IAM integration, lifecycle policies.
  • Google Artifact Registry: private (GCP). Replaced GCR; supports multiple formats.
  • Azure Container Registry: private (Azure). Integrated with AKS.
  • GitHub Container Registry: public/private. Free for public images; integrated with Actions.
  • Harbor: self-hosted. Open source, with built-in vulnerability scanning.
Docker — Registry Operations
# Tag image for a registry:
docker tag myapp:1.0 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:1.0

# Login to ECR:
aws ecr get-login-password | docker login --username AWS \
  --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com

# Push image:
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:1.0

# Pull image:
docker pull 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:1.0

Intermediate Questions (Q15–Q28)

Q15. What is a multi-stage Docker build and why use it?
Intermediate · Very Common

Multi-stage builds use multiple FROM instructions in a single Dockerfile, allowing you to use a heavy build environment but produce a tiny final image containing only what's needed at runtime.

Dockerfile — Multi-Stage Build
# ── Stage 1: Builder ────────────────────────
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build            # compiles TypeScript, bundles, etc.

# ── Stage 2: Production ─────────────────────
FROM node:20-alpine AS production
WORKDIR /app

# Copy ONLY the build artifacts from the builder stage:
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules

USER node
CMD ["node", "dist/server.js"]

# Result:
#   Builder stage: ~1.2 GB (node:20 + devDependencies + source)
#   Final image:   ~180 MB (alpine + prod deps + dist only)
#   85% smaller image → faster pulls, reduced attack surface
Q16. What are Kubernetes ConfigMaps and Secrets?
Intermediate · Very Common
  • Use for: ConfigMap holds non-sensitive config (URLs, feature flags, ports); Secret holds sensitive data (passwords, API keys, certs)
  • Encoding: ConfigMap is plain text; Secret is base64-encoded (not encrypted by default!)
  • etcd storage: ConfigMap is unencrypted; Secret can be encrypted at rest (requires configuration)
  • Access: both are exposed as env vars or mounted files
YAML — ConfigMap & Secret
apiVersion: v1
kind: ConfigMap
metadata: {name: app-config}
data:
  API_URL: "https://api.example.com"
  LOG_LEVEL: "info"
---
apiVersion: v1
kind: Secret
metadata: {name: app-secret}
type: Opaque
data:
  DB_PASSWORD: bXlTZWNyZXRQYXNzd29yZA==   # base64
---
# Use in a Pod spec:
env:
  - name: API_URL
    valueFrom:
      configMapKeyRef: {name: app-config, key: API_URL}
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef: {name: app-secret, key: DB_PASSWORD}
⚠️ Kubernetes Secrets are NOT encrypted by default — they're just base64 encoded. Enable encryption at rest, use RBAC to restrict access, or integrate with external secret managers (AWS Secrets Manager, HashiCorp Vault, External Secrets Operator).
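Base64 is an encoding, not encryption. Anyone with read access to the Secret can recover the plaintext, which is easy to demonstrate by decoding the example value above:

```python
import base64

encoded = "bXlTZWNyZXRQYXNzd29yZA=="          # the DB_PASSWORD value above
decoded = base64.b64decode(encoded).decode()

print(decoded)  # mySecretPassword
```

This is exactly what piping a `kubectl get secret` value through `base64 -d` does, which is why RBAC and encryption at rest matter.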
Q17. What is Kubernetes Ingress and how does it differ from a Service?
Intermediate · Very Common

A Service routes traffic to pods inside the cluster. An Ingress is a layer 7 (HTTP/HTTPS) routing resource that routes external traffic to internal Services based on hostname/path rules.

YAML — Kubernetes Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  tls:
    - hosts: [myapp.com]
      secretName: myapp-tls
  rules:
    - host: myapp.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service: {name: api-service, port: {number: 80}}
          - path: /
            pathType: Prefix
            backend:
              service: {name: frontend-service, port: {number: 80}}

Ingress Controllers: NGINX Ingress (most popular), Traefik, AWS ALB Ingress Controller, HAProxy. The controller reads Ingress resources and configures the underlying load balancer accordingly.

Q18. What is Horizontal Pod Autoscaler (HPA) in Kubernetes?
Intermediate · Very Common

The Horizontal Pod Autoscaler automatically scales the number of pod replicas in a Deployment based on observed CPU utilisation, memory, or custom metrics — ensuring your app handles traffic spikes without manual intervention.

YAML + Shell — HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70    # scale up when CPU > 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

# Quick HPA via kubectl:
#   kubectl autoscale deployment myapp --cpu-percent=70 --min=2 --max=20
#   kubectl get hpa
💡 Requires Metrics Server installed in the cluster. For custom metrics (queue depth, RPS), use KEDA (Kubernetes Event-driven Autoscaling) which integrates with 50+ external systems including Kafka, SQS, Redis, Prometheus.
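Under the hood, the HPA derives the replica count from the ratio of observed to target metric. A minimal sketch of that scaling rule (the min/max defaults mirror the manifest above):

```python
import math

def desired_replicas(current: int, current_metric: float, target_metric: float,
                     min_replicas: int = 2, max_replicas: int = 20) -> int:
    """Core HPA rule: desired = ceil(current * currentMetric / targetMetric),
    clamped to the [minReplicas, maxReplicas] range."""
    desired = math.ceil(current * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

print(desired_replicas(4, 140, 70))   # 8 -> CPU at twice the target doubles the pods
print(desired_replicas(4, 35, 70))    # 2 -> low load scales down to minReplicas
```

The real controller also applies tolerance bands and stabilisation windows to avoid flapping, which this sketch omits.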
Q19. What are liveness, readiness, and startup probes in Kubernetes?
Intermediate · Very Common
  • Liveness: fails when the container is alive but stuck (e.g. deadlocked). K8s action: restart the container.
  • Readiness: fails when the container is not ready to serve traffic (warming up, connecting to the DB). K8s action: remove the pod from Service endpoints (stop routing traffic).
  • Startup: fails while a slow-starting app hasn't started yet. K8s action: delay liveness/readiness checks until startup succeeds.
YAML — Health Probes
containers:
  - name: myapp
    livenessProbe:
      httpGet: {path: /health, port: 3000}
      initialDelaySeconds: 15
      periodSeconds: 20
      failureThreshold: 3
    readinessProbe:
      httpGet: {path: /ready, port: 3000}
      periodSeconds: 5
      failureThreshold: 3
    startupProbe:
      httpGet: {path: /health, port: 3000}
      failureThreshold: 30     # 30 * 10s = 5 min to start
      periodSeconds: 10

# /health endpoint — is the app alive?
# /ready endpoint  — can it serve traffic? (checks DB, cache)
Q20. What is Docker networking? Explain bridge, host, and overlay networks.
Intermediate · Common
  • bridge (default): private network on a single host; containers communicate by container name. Use for multi-container apps on the same host (Compose).
  • host: the container shares the host's network namespace; no network isolation. Use for maximum performance, when port-mapping overhead matters.
  • none: complete network isolation. Use for batch jobs and maximum security.
  • overlay: spans multiple Docker hosts (Swarm). Use for Docker Swarm and multi-host communication.
  • macvlan: the container gets its own MAC address on the physical network. Use for legacy apps expecting direct network access.
Docker — Networking
# Create a custom bridge network:
docker network create mynet

# Run containers in the same network (they can talk by name):
docker run -d --network mynet --name api myapp
docker run -d --network mynet --name mongo mongo:7

# The api container can reach mongo at: mongodb://mongo:27017

# Inspect network:
docker network inspect mynet

# Compose auto-creates a bridge network per project;
# services reference each other by service name.
Q21. What is monitoring and observability in DevOps? What is the Prometheus + Grafana stack?
Intermediate · Very Common

Prometheus is a time-series metrics database that scrapes metrics from targets (apps, nodes, K8s). Grafana is a visualisation platform that queries Prometheus and displays dashboards.

Node.js — Expose Prometheus Metrics
const { Registry, Counter, Histogram } = require('prom-client');
const register = new Registry();

// Counter — monotonically increasing:
const httpRequests = new Counter({
  name: 'http_requests_total',
  help: 'Total HTTP requests',
  labelNames: ['method', 'status'],
  registers: [register],
});

// Histogram — request duration buckets:
const httpDuration = new Histogram({
  name: 'http_duration_seconds',
  help: 'HTTP request duration',
  labelNames: ['method', 'route'],
  registers: [register],
});

// Expose /metrics endpoint:
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});

Key Prometheus concepts: Scrape interval (how often to collect), retention period, PromQL (query language for metrics), AlertManager (route alerts to PagerDuty, Slack, email).
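With the counter and histogram above in place, a couple of typical PromQL queries look like this (metric names match the Node.js example; label sets are illustrative):

```promql
# Requests per second over the last 5 minutes, broken down by status:
sum(rate(http_requests_total[5m])) by (status)

# 95th-percentile request latency per route (uses the histogram buckets;
# the `le` label must be preserved for histogram_quantile to work):
histogram_quantile(0.95, sum(rate(http_duration_seconds_bucket[5m])) by (le, route))
```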

Q22. What is a Kubernetes StatefulSet vs Deployment?
Intermediate · Very Common
  • Pod identity: Deployment pods get random names (myapp-xyz123); StatefulSet pods get stable, ordered names (myapp-0, myapp-1)
  • Storage: Deployment storage is shared or ephemeral; a StatefulSet gives each pod a stable PersistentVolume
  • Scaling order: Deployment scales in any order; StatefulSet scales sequentially (0→1→2 up, 2→1→0 down)
  • DNS: Deployment pods have Service DNS only; each StatefulSet pod gets a stable DNS hostname
  • Use case: Deployment for stateless apps (web servers, APIs); StatefulSet for stateful apps (databases, Kafka, ZooKeeper)
YAML — StatefulSet (MongoDB)
kind: StatefulSet
spec:
  serviceName: "mongo"
  replicas: 3
  volumeClaimTemplates:          # each pod gets its own PVC!
    - metadata: {name: mongo-storage}
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests: {storage: 10Gi}

# Pods: mongo-0, mongo-1, mongo-2
# DNS:  mongo-0.mongo.default.svc.cluster.local
Q23. What is a rolling update vs blue/green vs canary deployment?
Intermediate · Very Common
  • Rolling update: gradually replace old pods with new; traffic shifts as pods become ready. Risk: both versions briefly live simultaneously. Cost: no extra infrastructure.
  • Blue/green: run two full environments (blue = current, green = new); switch all traffic at once via DNS/LB. Rollback is instant — just switch back. Cost: 2× infrastructure during the switch.
  • Canary: send a small share of traffic (5–10%) to the new version, monitor, then gradually increase or roll back. Risk: only a small share of users is affected on failure. Cost: slightly more infrastructure.
  • Recreate: stop all old pods, then start the new ones. Risk: downtime during the update. Cost: minimal.
YAML — Canary with Nginx Ingress
# Canary ingress — route 10% of traffic to the new version:
metadata:
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"   # 10%
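NGINX's canary-weight splits traffic probabilistically per request. For a sticky split, where each user consistently sees one version, the same idea can be sketched with hash-based bucketing (a hypothetical helper, not how the ingress controller is implemented):

```python
import hashlib

CANARY_WEIGHT = 10  # percent, matching the canary-weight annotation above

def routes_to_canary(user_id: str, weight: int = CANARY_WEIGHT) -> bool:
    """Stable per-user bucketing: hash the ID into 0-99, compare to the weight.
    The same user always lands in the same bucket, so routing is sticky."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < weight

users = [f"user-{i}" for i in range(1000)]
share = sum(routes_to_canary(u) for u in users) / len(users)
print(share)  # close to 0.10 for a large user set
```

NGINX supports this sticky style natively via canary-by-header/canary-by-cookie annotations.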
Q24. What are Kubernetes resource requests and limits?
Intermediate · Very Common

Resource requests and limits tell Kubernetes how much CPU and memory a container needs and the maximum it can use. This enables the scheduler to place pods efficiently and prevent resource contention.

YAML — Resource Requests & Limits
containers:
  - name: myapp
    resources:
      requests:               # guaranteed resources (used for scheduling)
        memory: "128Mi"       # mebibytes
        cpu: "100m"           # millicores (100m = 0.1 CPU core)
      limits:                 # maximum allowed
        memory: "512Mi"       # OOMKilled if exceeded
        cpu: "500m"           # throttled (not killed) if exceeded

# QoS classes (determine eviction priority):
#   Guaranteed: requests == limits → evicted last
#   Burstable:  requests < limits
#   BestEffort: no requests/limits → evicted first
⚠️ OOMKilled (Out Of Memory Killed) happens when a container exceeds its memory limit. Check kubectl describe pod for OOMKilled events. Increase the memory limit or fix the memory leak.
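The units trip people up: Mi is a power of two, m is a thousandth of a core. A small parser sketch for these quantity strings (simplified; the real Kubernetes quantity parser handles more suffixes and decimal values):

```python
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,   # binary (power-of-two) suffixes
         "K": 10**3, "M": 10**6, "G": 10**9}      # decimal suffixes

def parse_memory(q: str) -> int:
    """'128Mi' -> bytes. Mi = mebibytes (2^20); M = megabytes (10^6)."""
    for suffix, factor in UNITS.items():
        if q.endswith(suffix):
            return int(q[:-len(suffix)]) * factor
    return int(q)  # plain bytes

def parse_cpu(q: str) -> float:
    """'100m' -> 0.1 cores (millicores); '2' -> 2.0 cores."""
    return int(q[:-1]) / 1000 if q.endswith("m") else float(q)

print(parse_memory("128Mi"))  # 134217728 bytes
print(parse_cpu("100m"))      # 0.1
```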
Q25. What is Helm and why is it used with Kubernetes?
Intermediate · Very Common

Helm is the package manager for Kubernetes. It bundles related K8s manifests into a Chart — a versioned, parameterisable package. Instead of managing dozens of YAML files, you deploy with one command and customise with values.

Shell — Helm Commands
# Add a chart repository:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Install PostgreSQL with custom values:
helm install my-postgres bitnami/postgresql \
  --set auth.postgresPassword=mypassword \
  --set primary.persistence.size=20Gi \
  --namespace production

# Upgrade a release:
helm upgrade my-postgres bitnami/postgresql --set image.tag=16.2.0

# Rollback to the previous version:
helm rollback my-postgres 1

# Template a chart to see the generated YAML:
helm template my-chart ./mychart -f values-prod.yaml

# List releases:
helm list -n production

Chart structure: Chart.yaml (metadata), values.yaml (default values), templates/ (K8s manifests with Go templating), charts/ (dependencies).
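To make the templating concrete, here is a minimal sketch of how a template pulls from values.yaml (a hypothetical chart, using the standard .Values and .Chart built-ins):

```yaml
# templates/deployment.yaml (excerpt):
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"

# values.yaml (defaults, overridable via --set or -f):
#   replicaCount: 3
#   image:
#     repository: myrepo/myapp
#     tag: "1.0.0"
```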

Q26. What is GitOps and how does ArgoCD/Flux implement it?
Intermediate · Common

GitOps is a DevOps practice where Git is the single source of truth for infrastructure and application configuration. The cluster continuously syncs itself to match the desired state declared in Git.

  • Push-based (traditional CI/CD): CI pipeline pushes changes to the cluster (kubectl apply). Problem: pipeline needs cluster credentials, drift goes undetected.
  • Pull-based (GitOps): An agent running inside the cluster watches a Git repo. When it detects drift (cluster ≠ Git), it automatically reconciles. Credentials never leave the cluster.
Shell — ArgoCD
# Install ArgoCD:
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Create an Application (sync Git → cluster):
argocd app create myapp \
  --repo https://github.com/myorg/k8s-configs \
  --path ./production \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace production \
  --sync-policy automated        # auto-sync on git push

# Benefits:
#   - Full audit trail (who changed what, when)
#   - Easy rollback (git revert)
#   - Drift detection and auto-healing
#   - No kubectl credentials in CI pipelines
Q27. What are Linux namespaces and cgroups? How do they enable containers?
Intermediate · Common

Containers aren't magic — they're Linux processes using two kernel features: namespaces for isolation and cgroups for resource limits.

Namespaces and what they isolate:
  • pid: process IDs — the container sees only its own processes
  • net: network interfaces — the container gets its own network stack
  • mnt: mount points — the container has its own filesystem view
  • uts: hostname and domain name
  • ipc: inter-process communication
  • user: user/group IDs — maps container root to a non-root user on the host

cgroups (control groups): Limit and account for resource usage (CPU, memory, I/O, network) per group of processes. When you set --memory=512m on a container, Docker configures a cgroup to enforce that limit.

💡 Container = process(es) + namespaces + cgroups. There is no hypervisor, no separate kernel. This is why containers are so lightweight — they share the host kernel but see an isolated view of resources.
Q28. What is container security and what are Docker security best practices?
Intermediate · Very Common
Dockerfile & Shell — Security Best Practices
# 1. Use specific image tags (not :latest):
FROM node:20.11.1-alpine3.19          # pinned, reproducible

# 2. Run as a non-root user:
RUN addgroup -S app && adduser -S app -G app
USER app

# 3. Read-only filesystem at runtime:
docker run --read-only --tmpfs /tmp myapp

# 4. Drop all capabilities, add only what's needed:
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE myapp

# 5. No privilege escalation:
docker run --security-opt=no-new-privileges myapp

# 6. Scan images for vulnerabilities:
docker scout cves myapp:latest
trivy image myapp:latest

# 7. Use multi-stage builds (smaller attack surface).
# 8. Never store secrets in images or in ENV baked at build time.
# 9. Sign images with Docker Content Trust:
export DOCKER_CONTENT_TRUST=1

🔥 Advanced Questions (Q29–Q40)

Q29. What is a Kubernetes Operator and when would you build one?
Advanced · Common

A Kubernetes Operator extends K8s with a custom controller that encodes operational knowledge for managing stateful applications. It uses Custom Resource Definitions (CRDs) to define new resource types, and a controller loop to reconcile desired vs actual state.

  • CRD: Defines a new resource type (e.g., PostgresCluster). Users create instances of this resource just like Deployments.
  • Controller: Watches CRD instances, compares desired state to actual state, makes changes to converge them — automated DBA/SRE knowledge.
  • When to build one: Complex stateful app lifecycle that kubectl alone can't manage — automated backup/restore, version upgrades with data migration, failover, scaling with rebalancing.
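The controller side reduces to a reconcile loop: diff the desired state (the CRD spec) against the actual state (the status) and emit converging actions. A conceptual Python sketch, not tied to any real operator framework:

```python
def reconcile(desired: dict, actual: dict) -> list:
    """One reconcile pass: compare desired vs actual state and return
    the actions needed to converge them (conceptual sketch only)."""
    actions = []
    missing = desired["replicas"] - actual.get("replicas", 0)
    if missing > 0:
        actions.append(f"create {missing} replica(s)")
    if actual.get("version") != desired["version"]:
        actions.append(f"upgrade to {desired['version']}")
    if desired.get("backup") and not actual.get("last_backup"):
        actions.append("trigger initial backup")
    return actions

# Desired state from the CRD spec, actual state observed in the cluster:
spec = {"replicas": 3, "version": "16.2", "backup": True}
status = {"replicas": 2, "version": "16.1"}
print(reconcile(spec, status))
# ['create 1 replica(s)', 'upgrade to 16.2', 'trigger initial backup']
```

A real controller runs this loop continuously on watch events, so the cluster self-heals toward the declared state.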
Shell — Using an Operator (CrunchyData PGO)
# Install PGO (the CrunchyData PostgreSQL Operator):
kubectl apply -k https://github.com/CrunchyData/postgres-operator-examples/kustomize/install

# Create a PostgreSQL cluster (CRD instance):
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster          # custom resource!
metadata: {name: mydb}
spec:
  instances:
    - replicas: 3
      dataVolumeClaimSpec:
        resources: {requests: {storage: 100Gi}}
  backups:
    pgbackrest:
      repos: [{name: repo1, s3: {...}}]

# The operator handles: HA setup, streaming replication, backups, failover

Operator frameworks: Operator SDK (Go), Kopf (Python), kubebuilder. Popular operators: Prometheus Operator, Cert-Manager, Strimzi (Kafka), CloudNativePG.

Q30. What is a service mesh and what does Istio/Linkerd provide?
Advanced · Common

A service mesh adds a transparent infrastructure layer for microservice-to-microservice communication — implementing cross-cutting concerns without code changes via sidecar proxies.

  • Sidecar pattern: Each pod gets an injected proxy (Envoy for Istio) that intercepts all traffic in/out of the pod.
  • mTLS: Mutual TLS encryption and authentication between all services — automatically, without app code changes.
  • Traffic management: Fine-grained routing (canary, A/B, weight-based), retries, timeouts, circuit breaking — all configured as K8s resources.
  • Observability: Distributed tracing (Jaeger), metrics (Prometheus), and access logs — automatically for all services.
  • Istio: Feature-rich, more complex. Uses Envoy sidecar + Istiod control plane.
  • Linkerd: Simpler, lighter, written in Rust. Ultra-low latency overhead (<1ms). Easier to operate.
  • Cilium (eBPF-based): Next-gen — implements service mesh at the kernel level using eBPF, no sidecar required. Lower overhead.
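As an illustration of mesh traffic management, a weighted canary split in Istio might look like this (assumes v1/v2 subsets are defined in a matching DestinationRule; names are hypothetical):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts: [myapp]
  http:
    - route:
        - destination: {host: myapp, subset: v1}
          weight: 90
        - destination: {host: myapp, subset: v2}
          weight: 10    # 10% canary, no app code changes
```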
Q31. What is the ELK Stack and how is it used for log management?
Advanced · Very Common

The ELK Stack (now Elastic Stack) is the most popular open-source log management solution.

  • Elasticsearch: distributed search and analytics engine; stores and indexes logs with near-real-time full-text search
  • Logstash: log pipeline — collects, transforms, and ships logs from multiple sources to Elasticsearch
  • Kibana: web UI for searching, visualising, and dashboarding Elasticsearch data
  • Beats/Filebeat: lightweight log shippers; run on each server/pod to tail log files and send them to Logstash or Elasticsearch
YAML — Fluent Bit (K8s log shipping)
# Modern alternative: Grafana Loki + Promtail (pull-based, cheaper)
# Or Vector.dev (Rust-based, very fast)

# Fluent Bit DaemonSet — runs on every node:
kind: DaemonSet                  ← one pod per node
spec:
  containers:
    - image: fluent/fluent-bit:latest
      volumeMounts:
        - name: varlog
          mountPath: /var/log    ← reads all pod logs

Alternative: Grafana Loki — Like Prometheus but for logs. Only indexes metadata (labels), not full text — much cheaper to run at scale. Pairs with Promtail (shipper) and Grafana (visualisation).

32
What is Kubernetes RBAC (Role-Based Access Control)?
AdvancedVery Common
+

RBAC controls who can do what with which Kubernetes resources. It uses four main objects: Role, ClusterRole, RoleBinding, and ClusterRoleBinding.

YAML — Kubernetes RBAC
# Role — permissions scoped to a namespace:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata: {name: pod-reader, namespace: production}
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
# RoleBinding — assign role to a user/group/serviceaccount:
kind: RoleBinding
subjects:
  - kind: User
    name: "meraj@company.com"
  - kind: ServiceAccount
    name: "ci-pipeline"
roleRef:
  kind: Role
  name: pod-reader

# ClusterRole — cluster-wide permissions (no namespace)
# Principle of least privilege: give minimum permissions needed
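For cluster-scoped resources (nodes, namespaces, PersistentVolumes) a namespaced Role won't work — you need a ClusterRole plus a ClusterRoleBinding. A minimal sketch; the `node-reader` role and `ops-team` group are hypothetical names:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader              # no namespace — applies cluster-wide
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ops-read-nodes
subjects:
  - kind: Group
    name: "ops-team"
    apiGroup: rbac.authorization.k8s.io
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: node-reader
```

You can check effective permissions with `kubectl auth can-i list nodes --as=someone --as-group=ops-team`.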
33
How do you implement zero-downtime deployments in Kubernetes?
AdvancedVery Common
+
YAML — Zero-Downtime Deployment Config
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # allow 1 extra pod during update
      maxUnavailable: 0    # never reduce below desired
  template:
    spec:
      containers:
        - readinessProbe:  ← ensures pod is ready before traffic
            httpGet: {path: /ready, port: 3000}
          lifecycle:
            preStop:       ← graceful shutdown hook
              exec:
                command: ["/bin/sh", "-c", "sleep 5"]
      terminationGracePeriodSeconds: 30   ← wait for in-flight requests

# Zero-downtime checklist:
# 1. Readiness probe (don't send traffic to unready pods)
# 2. maxUnavailable: 0 (never scale down before new pod ready)
# 3. preStop hook + terminationGracePeriod (drain in-flight requests)
# 4. App handles SIGTERM gracefully (stop accepting, finish, exit)
# 5. Multiple replicas (at least 2)
34
What is container image scanning and supply chain security?
AdvancedCommon
+

Supply chain attacks target the software build process itself — injecting malicious code into dependencies, base images, or build systems. Container image security involves scanning, signing, and policy enforcement at every stage.

Shell — Supply Chain Security
# 1. Vulnerability scanning with Trivy:
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:latest

# 2. Generate SBOM (Software Bill of Materials):
syft myapp:latest -o spdx-json > sbom.json
# Tracks all packages in the image

# 3. Sign images with Sigstore/cosign:
cosign sign --key cosign.key myapp:latest
cosign verify --key cosign.pub myapp:latest

# 4. Enforce policies with Kyverno/OPA Gatekeeper:
#    - Reject pods using :latest tag
#    - Require images from approved registry only
#    - Require non-root user

# 5. CI/CD pipeline gates:
#    - Block deployment if critical CVEs found
#    - Block if image not signed by known key
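The policy-enforcement step (point 4 above) can be sketched with Kyverno. A minimal ClusterPolicy that rejects any pod whose container image uses a mutable `:latest` tag — the policy and rule names are illustrative:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce   # reject non-compliant pods (vs. Audit)
  rules:
    - name: require-image-tag
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Using a mutable ':latest' tag is not allowed."
        pattern:
          spec:
            containers:
              - image: "!*:latest"   # image must NOT end in :latest
```

With `validationFailureAction: Audit` instead, violations are only reported — useful for trialling a policy before enforcing it.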

SLSA framework (Supply-chain Levels for Software Artifacts): A framework, originated at Google and now maintained under the OpenSSF, for hardening the build pipeline — hermetic builds, provenance attestation, two-person review. The original v0.1 spec defined Levels 1–4 (Level 4 = highest assurance); the SLSA v1.0 spec tops out at Build Level 3.

35
What is eBPF and why is it transforming DevOps/networking?
Advanced
+

eBPF (extended Berkeley Packet Filter) allows custom programs to run safely inside the Linux kernel without changing kernel source or loading modules. It's revolutionising networking, security, and observability.

  • How it works: Write eBPF programs in restricted C. Kernel verifier ensures safety. JIT-compiled to native instructions. Runs at hook points (syscalls, network, tracing events) with minimal overhead.
  • Observability (Pixie, Hubble): Capture any system event — network connections, file I/O, syscalls — without modifying application code. Get golden signal metrics automatically for every service.
  • Networking (Cilium): Replace kube-proxy with eBPF-based load balancing. Faster than iptables at scale. Service mesh capabilities without sidecars.
  • Security (Falco, Tetragon): Runtime security — detect suspicious syscalls (privilege escalation, unexpected network connections) instantly at kernel level.
  • Performance profiling (Parca, Pyroscope): Always-on continuous profiling with near-zero overhead. Profile any process in production without instrumentation.
💡Why it matters: Traditional sidecars add ~100MB memory and ~1ms latency per service. eBPF achieves the same observability/security with <1% overhead, no code changes, and no sidecar injection.
36
What is distributed tracing and how do you implement it?
AdvancedVery Common
+

Distributed tracing tracks a single request as it flows through multiple microservices, showing exactly where time is spent. Each request gets a unique trace ID propagated through all services via HTTP headers.

Node.js — OpenTelemetry Tracing
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: 'http://jaeger:4318/v1/traces',
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});
sdk.start(); // auto-instruments Express, MongoDB, Redis, http

// Manual span for custom operations:
const { trace, SpanStatusCode } = require('@opentelemetry/api');
const tracer = trace.getTracer('myapp');

async function processOrder(orderId) {
  const span = tracer.startSpan('process_order');
  span.setAttribute('order.id', orderId);
  try {
    await fulfillOrder(orderId);
  } catch (err) {
    span.recordException(err);
    span.setStatus({ code: SpanStatusCode.ERROR });
    throw err;
  } finally {
    span.end();
  }
}

Tools: OpenTelemetry (standard SDK), Jaeger/Tempo (backends), Grafana Tempo (cheap, pairs with Loki+Prometheus), AWS X-Ray, Datadog APM.

37
What is Kubernetes network policy?
AdvancedCommon
+

By default, all pods in a Kubernetes cluster can communicate with each other. Network Policies act as a firewall — restricting which pods can talk to which other pods and on which ports.

YAML — Network Policy (Zero-Trust)
# Default deny all traffic in namespace:
kind: NetworkPolicy
spec:
  podSelector: {}                  # matches ALL pods
  policyTypes: [Ingress, Egress]   # no ingress/egress rules = deny all
---
# Allow only api → database traffic on port 5432:
kind: NetworkPolicy
metadata: {name: allow-api-to-db}
spec:
  podSelector:
    matchLabels: {app: postgres}
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector:
            matchLabels: {app: api}
      ports: [{protocol: TCP, port: 5432}]
⚠️Network Policies require a network plugin (CNI) that supports them — Calico, Cilium, Weave Net. The default kubenet CNI on many managed clusters does NOT enforce NetworkPolicies. Check before relying on them for security.
38
What is chaos engineering and how do you implement it?
AdvancedCommon
+

Chaos engineering deliberately injects failures into a system to discover weaknesses before they cause incidents. The goal is to build confidence that the system will withstand turbulent, real-world conditions.

  • Principles: Start with a hypothesis (the system will recover from X). Run experiments. Measure impact. Fix weaknesses discovered.
  • Types of experiments: Kill random pods, add network latency/packet loss, exhaust CPU/memory, kill nodes, trigger DNS failures, cut off database connections.
Shell — Chaos Engineering Tools
# 1. Chaos Mesh (CNCF project for Kubernetes):
kind: PodChaos
spec:
  action: pod-kill
  selector:
    namespaces: [production]
    labelSelectors: {app: myapp}
  scheduler: {cron: "@every 10m"}   ← continuous chaos

# 2. Litmus Chaos:
kubectl apply -f https://hub.litmuschaos.io/api/chaos/pod-delete

# 3. Simple pod deletion (poor man's chaos):
kubectl delete pod $(kubectl get pods -l app=myapp -o name | shuf -n 1)

# 4. Network chaos with tc (traffic control):
tc qdisc add dev eth0 root netem delay 100ms loss 5%

# GameDay: scheduled chaos experiments with entire team watching
39
What is the difference between PodDisruptionBudget, PodAffinity, and Taints/Tolerations?
Advanced
+
YAML — PDB, Affinity, Taints
# PodDisruptionBudget — limits voluntary disruptions:
kind: PodDisruptionBudget
spec:
  minAvailable: 2                  # always keep at least 2 pods up
  selector:
    matchLabels: {app: myapp}
# Prevents: kubectl drain node from evicting too many pods at once
---
# Pod Affinity — schedule near/far from other pods:
affinity:
  podAntiAffinity:                 # spread across nodes
    requiredDuringScheduling...:
      - labelSelector:
          matchLabels: {app: myapp}
        topologyKey: "kubernetes.io/hostname"
# Never schedule 2 myapp pods on same node
---
# Taints (on nodes) + Tolerations (on pods):
# Taint a node for GPU workloads only:
kubectl taint nodes gpu-node1 dedicated=gpu:NoSchedule
# Pod must tolerate the taint to be scheduled there:
tolerations:
  - key: "dedicated"
    value: "gpu"
    effect: "NoSchedule"
40
What are DevOps and SRE best practices for production reliability?
AdvancedVery Common
+
  • SLOs & Error Budgets: Define SLOs (e.g. 99.9% availability). Track error budget consumption. When budget is at risk, freeze feature work, focus on reliability.
  • Blameless postmortems: After incidents, analyse what happened systematically — not who to blame. Document timeline, root cause, contributing factors, action items. Share learnings broadly.
  • On-call rotation: Developers who write code carry pagers. Shared ownership means better designed systems (you don't write fragile code if you're the one woken at 3am).
  • Runbooks / Playbooks: Document step-by-step response procedures for known failure modes. Reduces MTTR when engineers are stressed at 2am.
  • Feature flags: Decouple deployment from release. Deploy dark, enable per-user/percentage/cohort. Instant rollback without re-deploy.
  • Disaster recovery drills: Regularly practice restoring from backup, failing over to secondary region, recovering from a database corruption scenario. DR that's never tested doesn't work.
  • Capacity planning: Monitor resource trends, project growth, provision ahead of demand. Avoid reactive scaling that causes outages.
  • Toil reduction: SRE principle — automate repetitive operational tasks. If a human does the same thing > twice, automate it. Track toil as a metric and reduce it.
💡Google SRE Book (free online): The authoritative reference for production engineering best practices. Key concepts: error budgets, toil, CRE, postmortems, progressive rollouts, capacity planning. Essential reading for senior engineers.

🎉 The Complete Interview Hub is Live!

You've now covered all 11 topics — JavaScript, Git, Python, React, HTML & CSS, Node.js, SQL, TypeScript, System Design, DSA, and DevOps & Docker. Share RankWeb3 with anyone preparing for tech interviews.
