SQL Basics and Commands

1. What is SQL? Explain its use in databases.

SQL (Structured Query Language) is a programming language designed for managing and interacting with relational databases. It allows users to create, read, update, and delete data stored in database tables. SQL is widely used for:

  • Data Querying: Fetching specific data from large datasets.
  • Data Manipulation: Inserting, updating, and deleting records.
  • Schema Management: Creating and modifying database structures like tables and indexes.
  • Access Control: Granting or revoking permissions.

2. What are the different types of SQL commands?

SQL commands are categorized into five main types:

  • DDL (Data Definition Language): Deals with schema and structure creation.
  • DML (Data Manipulation Language): Manages data within the schema.
  • DCL (Data Control Language): Controls access and permissions.
  • TCL (Transaction Control Language): Manages database transactions.
  • DQL (Data Query Language): Primarily consists of the SELECT statement.

DDL (Data Definition Language)

Commands that define and modify the database schema:

  • CREATE: Used to create tables, databases, views, etc.
    CREATE TABLE Employees (
        ID INT PRIMARY KEY,
        Name VARCHAR(50),
        Salary DECIMAL(10, 2)
    );
    
  • ALTER: Modifies existing database objects.
    ALTER TABLE Employees ADD COLUMN Department VARCHAR(50);
    
  • DROP: Deletes objects from the database.
    DROP TABLE Employees;
    

DML (Data Manipulation Language)

Commands to manipulate data:

  • INSERT: Adds new data to tables.
    INSERT INTO Employees (ID, Name, Salary) VALUES (1, 'John Doe', 50000);
    
  • UPDATE: Modifies existing data.
    UPDATE Employees SET Salary = 60000 WHERE ID = 1;
    
  • DELETE: Removes data (but keeps table structure).
    DELETE FROM Employees WHERE ID = 1;
    

DCL (Data Control Language)

Commands to manage permissions:

  • GRANT: Provides access rights.
    GRANT SELECT ON Employees TO User123;
    
  • REVOKE: Removes access rights.
    REVOKE SELECT ON Employees FROM User123;
    

TCL (Transaction Control Language)

Commands to handle transactions:

  • COMMIT: Saves changes to the database.
  • ROLLBACK: Undoes changes since the last COMMIT.
  • SAVEPOINT: Creates a point to which a transaction can roll back.
    SAVEPOINT Save1;
    ROLLBACK TO Save1;
    

3. What is the difference between DELETE and TRUNCATE?

Feature DELETE TRUNCATE
Purpose Removes specific rows. Removes all rows in a table.
Rollback Can be rolled back. Cannot roll back in most DBMS.
Speed Slower (logs each row deletion). Faster (resets table).
Triggers Activates triggers. Does not activate triggers.

4. What are primary keys and foreign keys?

  • Primary Key: A unique identifier for table rows. It must contain unique, non-null values.
    CREATE TABLE Students (
        StudentID INT PRIMARY KEY,
        Name VARCHAR(50)
    );
    
  • Foreign Key: A column that references a primary key in another table to establish relationships.
    CREATE TABLE Enrollments (
        EnrollmentID INT PRIMARY KEY,
        StudentID INT,
        FOREIGN KEY (StudentID) REFERENCES Students(StudentID)
    );
    

5. Explain the difference between WHERE and HAVING clauses.

  • WHERE Clause: Filters rows before grouping.
  • HAVING Clause: Filters groups after aggregation.

Example:

SELECT Department, AVG(Salary)
FROM Employees
WHERE Salary > 40000
GROUP BY Department
HAVING AVG(Salary) > 50000;

Joins in SQL

6. What is a join? Explain different types of joins.

Joins combine rows from two or more tables based on a related column.

  • INNER JOIN: Returns rows with matching values in both tables.

    SELECT Employees.Name, Departments.DepartmentName
    FROM Employees
    INNER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
    
  • LEFT JOIN: Returns all rows from the left table and matched rows from the right table.

    SELECT Employees.Name, Departments.DepartmentName
    FROM Employees
    LEFT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
    
  • RIGHT JOIN: Returns all rows from the right table and matched rows from the left table.

    SELECT Employees.Name, Departments.DepartmentName
    FROM Employees
    RIGHT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
    
  • FULL JOIN: Returns all rows when there is a match in either table.

    SELECT Employees.Name, Departments.DepartmentName
    FROM Employees
    FULL JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
    

Additional SQL Concepts

7. What are constraints? Name a few types of constraints in SQL.

Constraints enforce rules on data in tables:

  • NOT NULL: Ensures values are not null.
  • UNIQUE: Ensures all values in a column are unique.
  • CHECK: Ensures values meet a condition.
    CHECK (Salary > 0);
    
  • DEFAULT: Provides a default value if none is specified.
  • FOREIGN KEY: Enforces referential integrity.
  • PRIMARY KEY: Combines NOT NULL and UNIQUE.

8. How is DISTINCT used in a query? Provide an example.

The DISTINCT keyword removes duplicate rows.

SELECT DISTINCT Department
FROM Employees;

9. Explain the purpose of the LIKE operator. How do wildcards work?

The LIKE operator filters rows based on a pattern:

  • %: Matches zero or more characters.
  • _: Matches exactly one character.

Example:

SELECT Name
FROM Employees
WHERE Name LIKE 'A%';  -- Names starting with 'A'

10. What is the difference between CHAR and VARCHAR data types?

Feature CHAR VARCHAR
Storage Fixed-length. Variable-length.
Efficiency Faster for fixed sizes. Efficient for varying sizes.
Padding Pads with spaces. Does not pad.

Example:

CHAR(10)   -- Always 10 characters.
VARCHAR(10) -- Up to 10 characters.

SQL Concepts Explained

1. Explain the concept of normalization. What are its advantages?

Normalization is a database design technique used to reduce data redundancy and improve data integrity. It organizes data into multiple related tables, ensuring each table has a single purpose.

Advantages:

  1. Reduces Data Redundancy: Eliminates duplicate data by splitting tables.
  2. Improves Data Integrity: Maintains consistent and accurate data.
  3. Efficient Storage: Saves space by storing data only once.
  4. Easier Maintenance: Simplifies updates and reduces anomalies.
  5. Improved Query Performance: Reduces the amount of data scanned.

Normalization Forms:

  • 1NF (First Normal Form): Ensures columns have atomic values and a unique identifier.
  • 2NF (Second Normal Form): Eliminates partial dependencies.
  • 3NF (Third Normal Form): Removes transitive dependencies.

2. What is the difference between a VIEW and a TABLE?

Feature VIEW TABLE
Definition A virtual table based on a query. A physical structure to store data.
Storage Does not store data, only query. Physically stores data.
Updates Can be updatable (with limitations). Directly updatable.
Performance Slower (query executed each time). Faster (data stored directly).

Example:

-- VIEW
CREATE VIEW HighSalaryEmployees AS
SELECT * FROM Employees WHERE Salary > 50000;

-- TABLE
CREATE TABLE Employees (
    ID INT PRIMARY KEY,
    Name VARCHAR(50),
    Salary DECIMAL(10, 2)
);

3. How does indexing improve query performance? What are the types of indexes?

Indexes improve query performance by enabling faster data retrieval. They create a data structure (e.g., B-tree or hash table) that reduces the number of rows scanned during a query.

Types of Indexes:

  1. Clustered Index: Sorts and stores table rows based on the indexed column.
  2. Non-Clustered Index: Creates a separate structure with pointers to the actual rows.
  3. Unique Index: Ensures no duplicate values in the indexed column.
  4. Full-Text Index: Facilitates efficient text searches.

Example:

CREATE INDEX idx_salary ON Employees(Salary);

4. What is the difference between GROUP BY and ORDER BY?

Feature GROUP BY ORDER BY
Purpose Groups rows for aggregation. Sorts rows for display.
Functions Works with aggregate functions. Does not require aggregation.

Example:

-- GROUP BY
SELECT Department, AVG(Salary)
FROM Employees
GROUP BY Department;

-- ORDER BY
SELECT Name, Salary
FROM Employees
ORDER BY Salary DESC;

5. Explain the concept of aggregate functions like SUM, AVG, COUNT, etc.

Aggregate functions perform calculations on a set of values and return a single result.

Function Description Example
SUM Adds numeric values. SUM(Salary)
AVG Calculates average. AVG(Salary)
COUNT Counts rows. COUNT(*)
MAX Finds the maximum value. MAX(Salary)
MIN Finds the minimum value. MIN(Salary)

Example:

SELECT COUNT(*), AVG(Salary), SUM(Salary)
FROM Employees;

6. How would you retrieve duplicate records from a table?

To find duplicates, use the GROUP BY clause with a HAVING condition:

SELECT Name, COUNT(*)
FROM Employees
GROUP BY Name
HAVING COUNT(*) > 1;

7. What is a subquery? How is it different from a correlated subquery?

  • Subquery: A query inside another query. It executes independently of the outer query.
  • Correlated Subquery: A subquery that depends on the outer query for its values.

Example:

-- Subquery
SELECT Name
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);

-- Correlated Subquery
SELECT Name
FROM Employees e1
WHERE Salary > (SELECT AVG(Salary) FROM Employees e2 WHERE e1.Department = e2.Department);

8. Explain the use of the CASE statement in SQL.

The CASE statement is used for conditional logic in SQL queries.

Example:

SELECT Name,
       CASE
           WHEN Salary > 50000 THEN 'High'
           WHEN Salary BETWEEN 30000 AND 50000 THEN 'Medium'
           ELSE 'Low'
       END AS SalaryCategory
FROM Employees;

9. How can you implement a many-to-many relationship in SQL?

A many-to-many relationship is implemented using a junction table (bridge table) that connects two tables through foreign keys.

Example:

CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    Name VARCHAR(50)
);

CREATE TABLE Courses (
    CourseID INT PRIMARY KEY,
    CourseName VARCHAR(50)
);

CREATE TABLE StudentCourses (
    StudentID INT,
    CourseID INT,
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);

10. What is the difference between UNION and UNION ALL?

Feature UNION UNION ALL
Duplicates Removes duplicate rows. Includes duplicate rows.
Performance Slower (removes duplicates). Faster (no duplicate check).

Example:

-- UNION
SELECT Name FROM Employees
UNION
SELECT Name FROM Managers;

-- UNION ALL
SELECT Name FROM Employees
UNION ALL
SELECT Name FROM Managers;

SQL Concepts with Examples and Explanations


1. What are window functions? Provide examples of ROW_NUMBER(), RANK(), and DENSE_RANK().

Window functions perform calculations across a set of table rows related to the current row without collapsing the rows into a single output (like aggregate functions do).

Examples:

  • ROW_NUMBER(): Assigns a unique number to each row within a partition.
  • RANK(): Assigns a rank to each row, leaving gaps if there are ties.
  • DENSE_RANK(): Similar to RANK but without gaps.

Example Query:

SELECT Name, Department, Salary,
       ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS RowNumber,
       RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS Rank,
       DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS DenseRank
FROM Employees;

Output Example:

Name Department Salary RowNumber Rank DenseRank
Alice Sales 70000 1 1 1
Bob Sales 60000 2 2 2
Charlie Sales 60000 3 2 2

2. How would you optimize a slow-running query?

Strategies for Query Optimization:

  1. Indexing:
    • Use indexes on frequently queried columns.
    • Example: CREATE INDEX idx_salary ON Employees(Salary);
  2. *Avoid SELECT :
    • Fetch only necessary columns.
    • Example: SELECT Name, Salary FROM Employees;
  3. Analyze Query Execution Plan:
    • Use tools like EXPLAIN to identify bottlenecks.
  4. Reduce Joins and Subqueries:
    • Simplify queries and avoid unnecessary joins.
  5. Partitioning:
    • Divide large tables into smaller, manageable parts.
  6. Caching Results:
    • Cache frequent queries to avoid repeated calculations.
  7. Proper Use of WHERE Conditions:
    • Ensure filters use indexed columns.
    • Example: WHERE Salary > 50000;

3. What is the difference between transactional databases and analytical databases in terms of SQL usage?

Feature Transactional Database (OLTP) Analytical Database (OLAP)
Purpose Handle real-time transactions. Perform data analysis and reporting.
Data Model Highly normalized. Denormalized (star or snowflake schema).
Query Type Simple, short queries. Complex, long-running queries.
Examples Bank systems, e-commerce apps. Data warehouses, BI tools.

4. How do you handle NULL values in SQL queries? Explain with examples.

Strategies to Handle NULLs:

  1. Use IS NULL or IS NOT NULL:
    SELECT * FROM Employees WHERE Salary IS NULL;
    
  2. Use COALESCE():
    • Replaces NULL with a default value.
    SELECT Name, COALESCE(Salary, 0) AS Salary FROM Employees;
    
  3. Use NULL-safe Comparisons:
    • Avoid = for NULLs; use IS NULL.

Example Query:

SELECT Name,
       CASE
           WHEN Salary IS NULL THEN 'Unknown'
           ELSE 'Known'
       END AS SalaryStatus
FROM Employees;

5. Explain the concept of stored procedures, functions, and triggers.

Feature Stored Procedure Function Trigger
Definition Precompiled SQL block for tasks. Returns a single value/result. Executes automatically on events.
Execution Explicitly called. Invoked in SQL queries. Automatically triggered.
Use Case Complex tasks or batch processing. Calculations, transformations. Auditing, enforcing constraints.

Example:

  • Stored Procedure:
    CREATE PROCEDURE GetHighSalaries()
    AS
    BEGIN
        SELECT * FROM Employees WHERE Salary > 50000;
    END;
    
  • Function:
    CREATE FUNCTION GetBonus(@Salary INT)
    RETURNS INT
    AS
    BEGIN
        RETURN @Salary * 0.1;
    END;
    
  • Trigger:
    CREATE TRIGGER AuditSalaryChange
    ON Employees
    AFTER UPDATE
    AS
    BEGIN
        INSERT INTO AuditLog (EmployeeID, OldSalary, NewSalary)
        SELECT ID, DELETED.Salary, INSERTED.Salary
        FROM DELETED, INSERTED;
    END;
    

6. What is a CTE (Common Table Expression)? When should it be used?

A CTE is a temporary result set defined within a query for better readability and modularity.

When to Use:

  1. Complex queries with nested logic.
  2. Recursive queries.
  3. Reusing query logic multiple times.

Example:

WITH HighSalary AS (
    SELECT Name, Salary
    FROM Employees
    WHERE Salary > 50000
)
SELECT * FROM HighSalary WHERE Name LIKE 'A%';

7. How would you design a database schema for a scalable application?

Key Practices:

  1. Normalization: Start with normalized schemas, then denormalize for performance.
  2. Indexing: Add indexes for frequently accessed columns.
  3. Partitioning: Divide large tables horizontally or vertically.
  4. Sharding: Distribute data across multiple databases.
  5. Use Proper Data Types: Optimize storage and performance.
  6. Implement Constraints: Enforce data integrity.
  7. Maintain Audit Trails: Track changes with logging tables.

8. What is ACID in database transactions? Why is it important?

ACID:

  1. Atomicity: Ensures transactions are all-or-nothing.
  2. Consistency: Guarantees database state remains valid.
  3. Isolation: Ensures transactions don’t interfere with each other.
  4. Durability: Guarantees data persistence after a transaction.

Importance: Ensures reliability and data integrity in multi-user environments.


9. Explain the difference between OLAP and OLTP.

Feature OLAP (Online Analytical Processing) OLTP (Online Transactional Processing)
Purpose Analytical queries and reporting. Real-time transaction handling.
Schema Denormalized. Highly normalized.
Query Type Complex. Simple, short queries.
Data Volume Large datasets. Small, frequent updates.

10. How would you implement pagination in SQL?

Example Query for Pagination:

-- Fetch rows 11 to 20
SELECT *
FROM Employees
ORDER BY ID
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;

Key Approaches:

  1. LIMIT and OFFSET (MySQL, PostgreSQL):
    SELECT * FROM Employees LIMIT 10 OFFSET 10;
    
  2. ROW_NUMBER() (SQL Server, Oracle):
    WITH Paginated AS (
        SELECT *, ROW_NUMBER() OVER (ORDER BY ID) AS RowNum
        FROM Employees
    )
    SELECT * FROM Paginated WHERE RowNum BETWEEN 11 AND 20;
    

SQL Queries for Advanced Scenarios


1. Find the second-highest salary in a table

SELECT MAX(Salary) AS SecondHighestSalary
FROM Employees
WHERE Salary < (SELECT MAX(Salary) FROM Employees);

Alternative using ROW_NUMBER():

WITH RankedSalaries AS (
    SELECT Salary, ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum
    FROM Employees
)
SELECT Salary AS SecondHighestSalary
FROM RankedSalaries
WHERE RowNum = 2;

2. Delete duplicate records from a table without using a temporary table

WITH CTE AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Name, Department, Salary ORDER BY ID) AS RowNum
    FROM Employees
)
DELETE FROM Employees
WHERE ID IN (SELECT ID FROM CTE WHERE RowNum > 1);

3. Display the department with the highest number of employees

SELECT Department
FROM Employees
GROUP BY Department
ORDER BY COUNT(*) DESC
LIMIT 1; -- Use TOP 1 for SQL Server

4. Find the nth highest salary in a table

WITH RankedSalaries AS (
    SELECT Salary, DENSE_RANK() OVER (ORDER BY Salary DESC) AS Rank
    FROM Employees
)
SELECT Salary
FROM RankedSalaries
WHERE Rank = n; -- Replace 'n' with the desired rank

5. Transpose rows into columns using SQL

Example: Transpose department names into columns with employee counts.

SELECT 
    SUM(CASE WHEN Department = 'Sales' THEN 1 ELSE 0 END) AS Sales,
    SUM(CASE WHEN Department = 'HR' THEN 1 ELSE 0 END) AS HR,
    SUM(CASE WHEN Department = 'IT' THEN 1 ELSE 0 END) AS IT
FROM Employees;

6. Fetch employees who joined in the last 6 months

SELECT *
FROM Employees
WHERE JoinDate >= DATEADD(MONTH, -6, GETDATE()); -- For SQL Server

For MySQL:

SELECT *
FROM Employees
WHERE JoinDate >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH);

7. Calculate the cumulative sum of a column

SELECT Name, Salary,
       SUM(Salary) OVER (ORDER BY ID) AS CumulativeSalary
FROM Employees;

8. Retrieve the top 3 salaries from each department

WITH RankedSalaries AS (
    SELECT Name, Department, Salary,
           ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS Rank
    FROM Employees
)
SELECT Name, Department, Salary
FROM RankedSalaries
WHERE Rank <= 3;

9. Fetch records that do not have matching entries in another table

Example: Find employees without a department.

SELECT e.*
FROM Employees e
LEFT JOIN Departments d ON e.DepartmentID = d.ID
WHERE d.ID IS NULL;

10. List employees who have the same salary as another employee

SELECT e1.Name, e1.Salary
FROM Employees e1
JOIN Employees e2 ON e1.Salary = e2.Salary AND e1.ID <> e2.ID;