SQL Basics and Commands
1. What is SQL? Explain its use in databases.
SQL (Structured Query Language) is a programming language designed for managing and interacting with relational databases. It allows users to create, read, update, and delete data stored in database tables. SQL is widely used for:
- Data Querying: Fetching specific data from large datasets.
- Data Manipulation: Inserting, updating, and deleting records.
- Schema Management: Creating and modifying database structures like tables and indexes.
- Access Control: Granting or revoking permissions.
2. What are the different types of SQL commands?
SQL commands are categorized into five main types:
- DDL (Data Definition Language): Deals with schema and structure creation.
- DML (Data Manipulation Language): Manages data within the schema.
- DCL (Data Control Language): Controls access and permissions.
- TCL (Transaction Control Language): Manages database transactions.
- DQL (Data Query Language): Primarily consists of the
SELECT
statement.
DDL (Data Definition Language)
Commands that define and modify the database schema:
- CREATE: Used to create tables, databases, views, etc.
CREATE TABLE Employees ( ID INT PRIMARY KEY, Name VARCHAR(50), Salary DECIMAL(10, 2) );
- ALTER: Modifies existing database objects.
ALTER TABLE Employees ADD COLUMN Department VARCHAR(50);
- DROP: Deletes objects from the database.
DROP TABLE Employees;
DML (Data Manipulation Language)
Commands to manipulate data:
- INSERT: Adds new data to tables.
INSERT INTO Employees (ID, Name, Salary) VALUES (1, 'John Doe', 50000);
- UPDATE: Modifies existing data.
UPDATE Employees SET Salary = 60000 WHERE ID = 1;
- DELETE: Removes data (but keeps table structure).
DELETE FROM Employees WHERE ID = 1;
DCL (Data Control Language)
Commands to manage permissions:
- GRANT: Provides access rights.
GRANT SELECT ON Employees TO User123;
- REVOKE: Removes access rights.
REVOKE SELECT ON Employees FROM User123;
TCL (Transaction Control Language)
Commands to handle transactions:
- COMMIT: Saves changes to the database.
- ROLLBACK: Undoes changes since the last COMMIT.
- SAVEPOINT: Creates a point to which a transaction can roll back.
SAVEPOINT Save1; ROLLBACK TO Save1;
3. What is the difference between DELETE and TRUNCATE?
Feature | DELETE | TRUNCATE |
---|---|---|
Purpose | Removes specific rows. | Removes all rows in a table. |
Rollback | Can be rolled back. | Cannot roll back in most DBMS. |
Speed | Slower (logs each row deletion). | Faster (resets table). |
Triggers | Activates triggers. | Does not activate triggers. |
4. What are primary keys and foreign keys?
- Primary Key: A unique identifier for table rows. It must contain unique, non-null values.
CREATE TABLE Students ( StudentID INT PRIMARY KEY, Name VARCHAR(50) );
- Foreign Key: A column that references a primary key in another table to establish relationships.
CREATE TABLE Enrollments ( EnrollmentID INT PRIMARY KEY, StudentID INT, FOREIGN KEY (StudentID) REFERENCES Students(StudentID) );
5. Explain the difference between WHERE and HAVING clauses.
- WHERE Clause: Filters rows before grouping.
- HAVING Clause: Filters groups after aggregation.
Example:
SELECT Department, AVG(Salary)
FROM Employees
WHERE Salary > 40000
GROUP BY Department
HAVING AVG(Salary) > 50000;
Joins in SQL
6. What is a join? Explain different types of joins.
Joins combine rows from two or more tables based on a related column.
-
INNER JOIN: Returns rows with matching values in both tables.
SELECT Employees.Name, Departments.DepartmentName FROM Employees INNER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
-
LEFT JOIN: Returns all rows from the left table and matched rows from the right table.
SELECT Employees.Name, Departments.DepartmentName FROM Employees LEFT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
-
RIGHT JOIN: Returns all rows from the right table and matched rows from the left table.
SELECT Employees.Name, Departments.DepartmentName FROM Employees RIGHT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
-
FULL JOIN: Returns all rows when there is a match in either table.
SELECT Employees.Name, Departments.DepartmentName FROM Employees FULL JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
Additional SQL Concepts
7. What are constraints? Name a few types of constraints in SQL.
Constraints enforce rules on data in tables:
- NOT NULL: Ensures values are not null.
- UNIQUE: Ensures all values in a column are unique.
- CHECK: Ensures values meet a condition.
CHECK (Salary > 0);
- DEFAULT: Provides a default value if none is specified.
- FOREIGN KEY: Enforces referential integrity.
- PRIMARY KEY: Combines NOT NULL and UNIQUE.
8. How is DISTINCT used in a query? Provide an example.
The DISTINCT
keyword removes duplicate rows.
SELECT DISTINCT Department
FROM Employees;
9. Explain the purpose of the LIKE operator. How do wildcards work?
The LIKE
operator filters rows based on a pattern:
%
: Matches zero or more characters._
: Matches exactly one character.
Example:
SELECT Name
FROM Employees
WHERE Name LIKE 'A%'; -- Names starting with 'A'
10. What is the difference between CHAR and VARCHAR data types?
Feature | CHAR | VARCHAR |
---|---|---|
Storage | Fixed-length. | Variable-length. |
Efficiency | Faster for fixed sizes. | Efficient for varying sizes. |
Padding | Pads with spaces. | Does not pad. |
Example:
CHAR(10) -- Always 10 characters.
VARCHAR(10) -- Up to 10 characters.
SQL Concepts Explained
1. Explain the concept of normalization. What are its advantages?
Normalization is a database design technique used to reduce data redundancy and improve data integrity. It organizes data into multiple related tables, ensuring each table has a single purpose.
Advantages:
- Reduces Data Redundancy: Eliminates duplicate data by splitting tables.
- Improves Data Integrity: Maintains consistent and accurate data.
- Efficient Storage: Saves space by storing data only once.
- Easier Maintenance: Simplifies updates and reduces anomalies.
- Improved Query Performance: Reduces the amount of data scanned.
Normalization Forms:
- 1NF (First Normal Form): Ensures columns have atomic values and a unique identifier.
- 2NF (Second Normal Form): Eliminates partial dependencies.
- 3NF (Third Normal Form): Removes transitive dependencies.
2. What is the difference between a VIEW and a TABLE?
Feature | VIEW | TABLE |
---|---|---|
Definition | A virtual table based on a query. | A physical structure to store data. |
Storage | Does not store data, only query. | Physically stores data. |
Updates | Can be updatable (with limitations). | Directly updatable. |
Performance | Slower (query executed each time). | Faster (data stored directly). |
Example:
-- VIEW
CREATE VIEW HighSalaryEmployees AS
SELECT * FROM Employees WHERE Salary > 50000;
-- TABLE
CREATE TABLE Employees (
ID INT PRIMARY KEY,
Name VARCHAR(50),
Salary DECIMAL(10, 2)
);
3. How does indexing improve query performance? What are the types of indexes?
Indexes improve query performance by enabling faster data retrieval. They create a data structure (e.g., B-tree or hash table) that reduces the number of rows scanned during a query.
Types of Indexes:
- Clustered Index: Sorts and stores table rows based on the indexed column.
- Non-Clustered Index: Creates a separate structure with pointers to the actual rows.
- Unique Index: Ensures no duplicate values in the indexed column.
- Full-Text Index: Facilitates efficient text searches.
Example:
CREATE INDEX idx_salary ON Employees(Salary);
4. What is the difference between GROUP BY and ORDER BY?
Feature | GROUP BY | ORDER BY |
---|---|---|
Purpose | Groups rows for aggregation. | Sorts rows for display. |
Functions | Works with aggregate functions. | Does not require aggregation. |
Example:
-- GROUP BY
SELECT Department, AVG(Salary)
FROM Employees
GROUP BY Department;
-- ORDER BY
SELECT Name, Salary
FROM Employees
ORDER BY Salary DESC;
5. Explain the concept of aggregate functions like SUM, AVG, COUNT, etc.
Aggregate functions perform calculations on a set of values and return a single result.
Function | Description | Example |
---|---|---|
SUM | Adds numeric values. | SUM(Salary) |
AVG | Calculates average. | AVG(Salary) |
COUNT | Counts rows. | COUNT(*) |
MAX | Finds the maximum value. | MAX(Salary) |
MIN | Finds the minimum value. | MIN(Salary) |
Example:
SELECT COUNT(*), AVG(Salary), SUM(Salary)
FROM Employees;
6. How would you retrieve duplicate records from a table?
To find duplicates, use the GROUP BY
clause with a HAVING
condition:
SELECT Name, COUNT(*)
FROM Employees
GROUP BY Name
HAVING COUNT(*) > 1;
7. What is a subquery? How is it different from a correlated subquery?
- Subquery: A query inside another query. It executes independently of the outer query.
- Correlated Subquery: A subquery that depends on the outer query for its values.
Example:
-- Subquery
SELECT Name
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);
-- Correlated Subquery
SELECT Name
FROM Employees e1
WHERE Salary > (SELECT AVG(Salary) FROM Employees e2 WHERE e1.Department = e2.Department);
8. Explain the use of the CASE statement in SQL.
The CASE
statement is used for conditional logic in SQL queries.
Example:
SELECT Name,
CASE
WHEN Salary > 50000 THEN 'High'
WHEN Salary BETWEEN 30000 AND 50000 THEN 'Medium'
ELSE 'Low'
END AS SalaryCategory
FROM Employees;
9. How can you implement a many-to-many relationship in SQL?
A many-to-many relationship is implemented using a junction table (bridge table) that connects two tables through foreign keys.
Example:
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
Name VARCHAR(50)
);
CREATE TABLE Courses (
CourseID INT PRIMARY KEY,
CourseName VARCHAR(50)
);
CREATE TABLE StudentCourses (
StudentID INT,
CourseID INT,
FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);
10. What is the difference between UNION and UNION ALL?
Feature | UNION | UNION ALL |
---|---|---|
Duplicates | Removes duplicate rows. | Includes duplicate rows. |
Performance | Slower (removes duplicates). | Faster (no duplicate check). |
Example:
-- UNION
SELECT Name FROM Employees
UNION
SELECT Name FROM Managers;
-- UNION ALL
SELECT Name FROM Employees
UNION ALL
SELECT Name FROM Managers;
SQL Concepts with Examples and Explanations
1. What are window functions? Provide examples of ROW_NUMBER(), RANK(), and DENSE_RANK().
Window functions perform calculations across a set of table rows related to the current row without collapsing the rows into a single output (like aggregate functions do).
Examples:
- ROW_NUMBER(): Assigns a unique number to each row within a partition.
- RANK(): Assigns a rank to each row, leaving gaps if there are ties.
- DENSE_RANK(): Similar to RANK but without gaps.
Example Query:
SELECT Name, Department, Salary,
ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS RowNumber,
RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS Rank,
DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS DenseRank
FROM Employees;
Output Example:
Name | Department | Salary | RowNumber | Rank | DenseRank |
---|---|---|---|---|---|
Alice | Sales | 70000 | 1 | 1 | 1 |
Bob | Sales | 60000 | 2 | 2 | 2 |
Charlie | Sales | 60000 | 3 | 2 | 2 |
2. How would you optimize a slow-running query?
Strategies for Query Optimization:
- Indexing:
- Use indexes on frequently queried columns.
- Example:
CREATE INDEX idx_salary ON Employees(Salary);
- *Avoid SELECT :
- Fetch only necessary columns.
- Example:
SELECT Name, Salary FROM Employees;
- Analyze Query Execution Plan:
- Use tools like
EXPLAIN
to identify bottlenecks.
- Use tools like
- Reduce Joins and Subqueries:
- Simplify queries and avoid unnecessary joins.
- Partitioning:
- Divide large tables into smaller, manageable parts.
- Caching Results:
- Cache frequent queries to avoid repeated calculations.
- Proper Use of WHERE Conditions:
- Ensure filters use indexed columns.
- Example:
WHERE Salary > 50000;
3. What is the difference between transactional databases and analytical databases in terms of SQL usage?
Feature | Transactional Database (OLTP) | Analytical Database (OLAP) |
---|---|---|
Purpose | Handle real-time transactions. | Perform data analysis and reporting. |
Data Model | Highly normalized. | Denormalized (star or snowflake schema). |
Query Type | Simple, short queries. | Complex, long-running queries. |
Examples | Bank systems, e-commerce apps. | Data warehouses, BI tools. |
4. How do you handle NULL values in SQL queries? Explain with examples.
Strategies to Handle NULLs:
- Use IS NULL or IS NOT NULL:
SELECT * FROM Employees WHERE Salary IS NULL;
- Use COALESCE():
- Replaces NULL with a default value.
SELECT Name, COALESCE(Salary, 0) AS Salary FROM Employees;
- Use NULL-safe Comparisons:
- Avoid
=
for NULLs; use IS NULL.
- Avoid
Example Query:
SELECT Name,
CASE
WHEN Salary IS NULL THEN 'Unknown'
ELSE 'Known'
END AS SalaryStatus
FROM Employees;
5. Explain the concept of stored procedures, functions, and triggers.
Feature | Stored Procedure | Function | Trigger |
---|---|---|---|
Definition | Precompiled SQL block for tasks. | Returns a single value/result. | Executes automatically on events. |
Execution | Explicitly called. | Invoked in SQL queries. | Automatically triggered. |
Use Case | Complex tasks or batch processing. | Calculations, transformations. | Auditing, enforcing constraints. |
Example:
- Stored Procedure:
CREATE PROCEDURE GetHighSalaries() AS BEGIN SELECT * FROM Employees WHERE Salary > 50000; END;
- Function:
CREATE FUNCTION GetBonus(@Salary INT) RETURNS INT AS BEGIN RETURN @Salary * 0.1; END;
- Trigger:
CREATE TRIGGER AuditSalaryChange ON Employees AFTER UPDATE AS BEGIN INSERT INTO AuditLog (EmployeeID, OldSalary, NewSalary) SELECT ID, DELETED.Salary, INSERTED.Salary FROM DELETED, INSERTED; END;
6. What is a CTE (Common Table Expression)? When should it be used?
A CTE is a temporary result set defined within a query for better readability and modularity.
When to Use:
- Complex queries with nested logic.
- Recursive queries.
- Reusing query logic multiple times.
Example:
WITH HighSalary AS (
SELECT Name, Salary
FROM Employees
WHERE Salary > 50000
)
SELECT * FROM HighSalary WHERE Name LIKE 'A%';
7. How would you design a database schema for a scalable application?
Key Practices:
- Normalization: Start with normalized schemas, then denormalize for performance.
- Indexing: Add indexes for frequently accessed columns.
- Partitioning: Divide large tables horizontally or vertically.
- Sharding: Distribute data across multiple databases.
- Use Proper Data Types: Optimize storage and performance.
- Implement Constraints: Enforce data integrity.
- Maintain Audit Trails: Track changes with logging tables.
8. What is ACID in database transactions? Why is it important?
ACID:
- Atomicity: Ensures transactions are all-or-nothing.
- Consistency: Guarantees database state remains valid.
- Isolation: Ensures transactions don’t interfere with each other.
- Durability: Guarantees data persistence after a transaction.
Importance: Ensures reliability and data integrity in multi-user environments.
9. Explain the difference between OLAP and OLTP.
Feature | OLAP (Online Analytical Processing) | OLTP (Online Transactional Processing) |
---|---|---|
Purpose | Analytical queries and reporting. | Real-time transaction handling. |
Schema | Denormalized. | Highly normalized. |
Query Type | Complex. | Simple, short queries. |
Data Volume | Large datasets. | Small, frequent updates. |
10. How would you implement pagination in SQL?
Example Query for Pagination:
-- Fetch rows 11 to 20
SELECT *
FROM Employees
ORDER BY ID
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;
Key Approaches:
- LIMIT and OFFSET (MySQL, PostgreSQL):
SELECT * FROM Employees LIMIT 10 OFFSET 10;
- ROW_NUMBER() (SQL Server, Oracle):
WITH Paginated AS ( SELECT *, ROW_NUMBER() OVER (ORDER BY ID) AS RowNum FROM Employees ) SELECT * FROM Paginated WHERE RowNum BETWEEN 11 AND 20;
SQL Queries for Advanced Scenarios
1. Find the second-highest salary in a table
SELECT MAX(Salary) AS SecondHighestSalary
FROM Employees
WHERE Salary < (SELECT MAX(Salary) FROM Employees);
Alternative using ROW_NUMBER()
:
WITH RankedSalaries AS (
SELECT Salary, ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum
FROM Employees
)
SELECT Salary AS SecondHighestSalary
FROM RankedSalaries
WHERE RowNum = 2;
2. Delete duplicate records from a table without using a temporary table
WITH CTE AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Name, Department, Salary ORDER BY ID) AS RowNum
FROM Employees
)
DELETE FROM Employees
WHERE ID IN (SELECT ID FROM CTE WHERE RowNum > 1);
3. Display the department with the highest number of employees
SELECT Department
FROM Employees
GROUP BY Department
ORDER BY COUNT(*) DESC
LIMIT 1; -- Use TOP 1 for SQL Server
4. Find the nth highest salary in a table
WITH RankedSalaries AS (
SELECT Salary, DENSE_RANK() OVER (ORDER BY Salary DESC) AS Rank
FROM Employees
)
SELECT Salary
FROM RankedSalaries
WHERE Rank = n; -- Replace 'n' with the desired rank
5. Transpose rows into columns using SQL
Example: Transpose department names into columns with employee counts.
SELECT
SUM(CASE WHEN Department = 'Sales' THEN 1 ELSE 0 END) AS Sales,
SUM(CASE WHEN Department = 'HR' THEN 1 ELSE 0 END) AS HR,
SUM(CASE WHEN Department = 'IT' THEN 1 ELSE 0 END) AS IT
FROM Employees;
6. Fetch employees who joined in the last 6 months
SELECT *
FROM Employees
WHERE JoinDate >= DATEADD(MONTH, -6, GETDATE()); -- For SQL Server
For MySQL:
SELECT *
FROM Employees
WHERE JoinDate >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH);
7. Calculate the cumulative sum of a column
SELECT Name, Salary,
SUM(Salary) OVER (ORDER BY ID) AS CumulativeSalary
FROM Employees;
8. Retrieve the top 3 salaries from each department
WITH RankedSalaries AS (
SELECT Name, Department, Salary,
ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS Rank
FROM Employees
)
SELECT Name, Department, Salary
FROM RankedSalaries
WHERE Rank <= 3;
9. Fetch records that do not have matching entries in another table
Example: Find employees without a department.
SELECT e.*
FROM Employees e
LEFT JOIN Departments d ON e.DepartmentID = d.ID
WHERE d.ID IS NULL;
10. List employees who have the same salary as another employee
SELECT e1.Name, e1.Salary
FROM Employees e1
JOIN Employees e2 ON e1.Salary = e2.Salary AND e1.ID <> e2.ID;