Skip to content

Latest commit

Β 

History

History
121 lines (87 loc) Β· 4.94 KB

File metadata and controls

121 lines (87 loc) Β· 4.94 KB

πŸ“ Topic 10.2: INNER JOIN (The Intersection)

INNER JOIN is the most commonly used join in SQL. It answers the question: "Show me only the rows that exist in both tables." If a row appears in one table but has no match in the other, it is completely excluded.


1. Definition

INNER JOIN: Returns only those rows where the join condition is satisfied in both tables. Rows without a matching partner on either side are silently excluded from the result.


2. Why This Concept Exists

In many business scenarios, you only care about complete relationships:

  • A shipping manifest only needs customers who have actually placed an order.
  • A payroll report only needs employees who are assigned to a department.
  • An invoice only shows products that were actually purchased.

INNER JOIN automatically filters out "incomplete" or "unlinked" data.


3. How It Works β€” The Join Buffer (PRO LEVEL)

MySQL uses a Nested Loop Join (NLJ) by default for INNER JOINs:

  1. Reads a row from the Outer Table (usually the smaller table).
  2. For each row, looks up the matching row in the Inner Table using the join key.
  3. If a match is found, a combined row is produced.
  4. If no match is found, that row is silently discarded.

The Join Buffer Optimization: For large datasets where indexes aren't perfect, MySQL loads chunks of the outer table into a RAM buffer (the Join Buffer) and scans the inner table once per chunk rather than once per row. Setting join_buffer_size in my.ini can speed this up significantly.


4. Syntax / Implementation β€” Full Cheat Sheet

-- Standard syntax (most common)
SELECT c.Customer_Name, o.Order_Date, o.Total_Amount
FROM Customers c
INNER JOIN Orders o ON c.Customer_ID = o.Customer_ID;

-- The USING shorthand (when column names are identical in both tables)
SELECT c.Customer_Name, o.Order_Date
FROM Customers c
INNER JOIN Orders o USING (Customer_ID);

-- With WHERE and ORDER BY
SELECT p.Product_Name, s.Supplier_Name, p.Stock
FROM Products p
INNER JOIN Suppliers s ON p.Supplier_ID = s.Supplier_ID
WHERE p.Stock > 0
ORDER BY p.Product_Name ASC;

-- Multi-column JOIN condition (for composite keys)
SELECT * FROM Inventory i
INNER JOIN Order_Items oi ON i.Branch_ID = oi.Branch_ID AND i.Item_ID = oi.Item_ID;

5. Real-Life Examples

Example 1 β€” Employee Directory with Department Name:

SELECT e.First_Name, e.Last_Name, d.Dept_Name, d.Location
FROM Employees e
INNER JOIN Departments d ON e.Dept_ID = d.Dept_ID;
-- Result: Only employees with an assigned department appear.
-- (Interns with NULL Dept_ID are automatically excluded)

Example 2 β€” E-commerce Order Report:

SELECT 
    c.Customer_Name, 
    o.Order_ID, 
    p.Product_Name, 
    oi.Quantity,
    oi.Quantity * p.Price AS Line_Total
FROM Orders o
INNER JOIN Customers c ON o.Customer_ID = c.Customer_ID
INNER JOIN Order_Items oi ON o.Order_ID = oi.Order_ID
INNER JOIN Products p ON oi.Product_ID = p.Product_ID;

Example 3 β€” School: Students with their class teacher:

SELECT s.Student_Name, t.Teacher_Name, c.Class_Name
FROM Students s
INNER JOIN Classes c ON s.Class_ID = c.Class_ID
INNER JOIN Teachers t ON c.Teacher_ID = t.Teacher_ID;

6. Common Mistakes

  • Using INNER JOIN when you should use LEFT JOIN: If a customer has no orders and you use INNER JOIN, they vanish from results. Use LEFT JOIN when you want "all rows from one side, even without a match."
  • The old comma syntax: SELECT * FROM T1, T2 WHERE T1.ID = T2.ID is technically an INNER JOIN but is an outdated style. It's harder to read, can accidentally become a CROSS JOIN, and doesn't scale to OUTER JOINs cleanly. Always use explicit INNER JOIN ... ON.
  • Not indexing the join column: If Orders.Customer_ID has no index and there are 5 million orders, MySQL full-scans 5 million rows for every single customer lookup.

7. Tips & Best Practices

  • Always alias your tables (c for Customers, o for Orders) β€” it makes queries readable and prevents "ambiguous column" errors when both tables have similarly named columns.
  • Use EXPLAIN to verify the join is using an index (type: ref or type: eq_ref) vs. a full scan (type: ALL).
  • The USING keyword reduces verbosity when the column names are identical in both tables, but use it carefully β€” if a future schema change renames one column, the USING clause will silently start joining on the wrong column.

8. Mini Practice Tasks

  • Task 1: Write a query joining Patients(Patient_ID, Name, Doctor_ID) and Doctors(Doctor_ID, Doctor_Name) to show each patient's name and their assigned doctor's name.
  • Task 2: If Customer_ID = 50 exists in Customers but has no corresponding rows in Orders, will that customer appear in an INNER JOIN result? Why or why not?
  • Task 3: Rewrite this old-style join using modern syntax: SELECT * FROM Employees, Departments WHERE Employees.Dept_ID = Departments.Dept_ID;