SQL & Database Interview
Questions & Answers
🌱Beginner QuestionsQ1–Q10
A database is an organised collection of structured data stored and accessed electronically. A DBMS (Database Management System) is the software that manages databases (MySQL, PostgreSQL, SQLite, SQL Server, Oracle).
SQL (Structured Query Language) is the standard language for interacting with relational databases. It's used to create, read, update, and delete data (CRUD).
SQL commands are grouped into sub-languages: DDL (Data Definition: CREATE, ALTER, DROP), DML (Data Manipulation: SELECT, INSERT, UPDATE, DELETE), DCL (Data Control: GRANT, REVOKE), and TCL (Transaction Control: COMMIT, ROLLBACK).
- Primary Key: Uniquely identifies each row. Cannot be NULL. One per table (can be composite).
- Foreign Key: A column that references the primary key of another table. Enforces referential integrity.
- UNIQUE: All values in the column must be different. Can be NULL (unless also NOT NULL). Multiple unique constraints allowed per table.
WHERE and HAVING?| Feature | WHERE | HAVING |
|---|---|---|
| Filters | Individual rows | Groups (after GROUP BY) |
| Aggregate functions? | ❌ Cannot use (e.g. SUM, COUNT) | ✅ Yes |
| Execution order | Before GROUP BY | After GROUP BY |
| JOIN Type | Returns | Unmatched rows? |
|---|---|---|
INNER JOIN | Rows with matches in BOTH tables | ❌ Excluded |
LEFT JOIN | All left rows + matched right rows | Left: ✅ kept (right = NULL) |
RIGHT JOIN | All right rows + matched left rows | Right: ✅ kept (left = NULL) |
FULL OUTER JOIN | All rows from both tables | ✅ Both sides kept (NULL for no match) |
CROSS JOIN | Cartesian product (every combination) | N/A |
SELF JOIN | Table joined to itself | Used for hierarchies |
COUNT(*) which counts all rows including NULLs). COUNT(column) only counts non-NULL values.SQL clauses are written in one order but executed in a different order. Understanding this prevents confusing errors like "column in WHERE can't use SELECT alias".
DELETE, TRUNCATE, and DROP?| Command | Removes | WHERE clause? | Rollback? | Speed |
|---|---|---|---|---|
DELETE | Specific rows | ✅ Yes | ✅ Yes (DML) | Slow (row by row, logs each) |
TRUNCATE | All rows (keeps table) | ❌ No | ⚠️ Usually not | Fast (deallocates pages) |
DROP | Entire table + data + structure | ❌ No | ❌ Usually not | Instant |
NULL represents the absence of a value — not zero, not an empty string. NULL is unknown. This means any comparison with NULL (including = NULL) returns NULL (unknown), not TRUE or FALSE.
CHAR and VARCHAR?| Category | Types | Use case |
|---|---|---|
| Integer | INT, BIGINT, TINYINT, SMALLINT | IDs, counts, flags |
| Decimal | DECIMAL(p,s), FLOAT, DOUBLE | Money (DECIMAL), scientific (FLOAT) |
| String | CHAR(n), VARCHAR(n), TEXT | Fixed/variable length text |
| Date/Time | DATE, TIME, DATETIME, TIMESTAMP | Temporal data |
| Boolean | BOOLEAN / TINYINT(1) | Flags, status |
| Binary | BLOB, BINARY | Files, images (usually store path instead) |
| JSON | JSON | Semi-structured data (MySQL 5.7+, PG 9.2+) |
ORDER BY and LIMIT / OFFSET?⚡Intermediate QuestionsQ11–Q21
An index is a data structure (usually a B-tree) that the database maintains separately from the table to speed up data retrieval. Like a book's index — instead of reading every page, you jump directly to the right location.
When to add indexes: Columns frequently used in WHERE, JOIN ON, and ORDER BY. When NOT to: Small tables, columns rarely queried, columns that change very frequently (updates become expensive because the index must be updated too).
Normalization is the process of organizing a database to reduce data redundancy and improve data integrity by dividing large tables into smaller, related tables.
Window functions perform calculations across a set of rows related to the current row — without collapsing rows like GROUP BY does. They use the OVER() clause.
ROW_NUMBER() in a CTE, then filter WHERE rn <= N. This is one of the most common interview questions: "Find the top 3 highest-paid employees in each department."A transaction is a sequence of SQL operations that are treated as a single unit — either all succeed (COMMIT) or all fail (ROLLBACK). ACID defines the properties that guarantee reliability.
- A — Atomicity: All operations in a transaction succeed, or none do. (Bank transfer: debit + credit are one atomic unit.)
- C — Consistency: The database moves from one valid state to another. Constraints are never violated mid-transaction.
- I — Isolation: Concurrent transactions don't interfere. Uncommitted changes in one transaction aren't visible to others (to varying degrees, based on isolation level).
- D — Durability: Once committed, changes survive power failures, crashes, or errors. Data is persisted to disk.
| Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
READ UNCOMMITTED | ✅ Possible | ✅ Possible | ✅ Possible |
READ COMMITTED | ❌ Prevented | ✅ Possible | ✅ Possible |
REPEATABLE READ | ❌ Prevented | ❌ Prevented | ✅ Possible |
SERIALIZABLE | ❌ Prevented | ❌ Prevented | ❌ Prevented |
UNION and UNION ALL?| Feature | UNION | UNION ALL |
|---|---|---|
| Duplicates | ❌ Removed (slower) | ✅ Kept (faster) |
| Performance | Slower — sorts to dedupe | Faster — no deduplication |
| Use when | You need distinct results | You know results are distinct or want all rows |
| Feature | SQL (Relational) | NoSQL |
|---|---|---|
| Schema | Fixed, predefined | Flexible, dynamic |
| Data model | Tables + rows | Documents, key-value, graphs, columns |
| Relationships | Foreign keys + JOINs | Embedding or manual references |
| ACID | ✅ Full support | Varies (MongoDB: since v4) |
| Scaling | Vertical (scale up) | Horizontal (scale out) |
| Examples | PostgreSQL, MySQL, SQLite | MongoDB, Redis, Cassandra, DynamoDB |
| Best for | Complex relationships, reporting, financial | Flexible data, real-time, massive scale |
NoSQL types: Document (MongoDB — JSON-like), Key-Value (Redis — cache/sessions), Column-family (Cassandra — time-series, IoT), Graph (Neo4j — social networks, recommendations).
A view is a named, saved SQL query that you can query like a table. It doesn't store data itself — it executes the underlying query every time it's queried. Views simplify complex queries and can restrict data access.
🔥Advanced QuestionsQ22–Q30
The query optimizer analyzes your SQL query and generates an execution plan — deciding which indexes to use, the order of table joins, and whether to use nested loops, hash joins, or merge joins.
Partitioning splits a large table into smaller pieces within the same database server. Sharding distributes data across multiple database servers (horizontal scaling).
| Feature | Partitioning | Sharding |
|---|---|---|
| Location | Same server | Multiple servers |
| Transparency | Fully transparent to app | App must know shard routing |
| Use case | Very large tables (billions of rows) | Massive scale, global distribution |
| Complexity | Low | High (cross-shard queries, rebalancing) |
Replication copies data from one database (primary/master) to one or more databases (replicas/secondaries). It provides high availability, fault tolerance, and read scalability.
- Primary-Replica (Master-Slave): Writes go to primary, replicated to replicas. Replicas handle read-only queries. Most common pattern.
- Primary-Primary (Multi-Master): Both servers accept writes. Complex conflict resolution required.
- Synchronous replication: Transaction only commits after replica confirms write. Strong consistency, slower writes.
- Asynchronous replication: Primary commits immediately, replica catches up. Faster writes, but replica may lag (read stale data).
- Logical replication: Replicates data changes (row-level). Allows selectively replicating specific tables.
- Physical replication: Copies raw disk blocks. Faster but replica must be same PostgreSQL version.
A deadlock occurs when two or more transactions are each waiting for the other to release a lock — creating a circular dependency where none can proceed.
The N+1 problem occurs when you execute 1 query to fetch N records, then N additional queries to fetch related data for each record — instead of fetching everything in 1 or 2 queries.
The CAP theorem states that a distributed database system can only guarantee two of three properties simultaneously:
- C — Consistency: Every read receives the most recent write or an error. All nodes see the same data at the same time.
- A — Availability: Every request receives a response (not necessarily the most recent data). The system never returns an error.
- P — Partition Tolerance: The system continues operating even if network communication between nodes fails.
| Combination | Trade-off | Examples |
|---|---|---|
| CP (Consistency + Partition) | May reject requests to stay consistent | MongoDB, HBase, Zookeeper |
| AP (Availability + Partition) | May return stale data | Cassandra, CouchDB, DynamoDB |
| CA (Consistency + Availability) | Cannot tolerate network partition | Traditional RDBMS (single node) |
Creating a new database connection for every request is expensive — it involves TCP handshake, authentication, and session setup. A connection pool maintains a cache of pre-established connections that are reused across requests.
SQL injection is when an attacker inserts malicious SQL into a query by manipulating user-supplied input — potentially reading all data, bypassing authentication, or destroying the database.
Database migrations are version-controlled scripts that incrementally modify the database schema over time. They let teams evolve the schema safely, consistently, and reproducibly across all environments (dev, staging, production).
The Complete Interview Hub is Here
You've now covered all 7 core interview topics — JavaScript, Git, Python, React, HTML & CSS, Node.js, and SQL. Bookmark RankWeb3 and share it with anyone preparing for tech interviews.
← Node.js Q&A