| comments | true | |
|---|---|---|
| difficulty | Hard | |
| edit_url | https://github.com/doocs/leetcode/edit/main/solution/3700-3799/3764.Most%20Common%20Course%20Pairs/README_EN.md | |
| tags |
|
Table: course_completions
+-------------------+---------+ | Column Name | Type | +-------------------+---------+ | user_id | int | | course_id | int | | course_name | varchar | | completion_date | date | | course_rating | int | +-------------------+---------+ (user_id, course_id) is the combination of columns with unique values for this table. Each row represents a completed course by a user with their rating (1-5 scale).
Write a solution to identify skill mastery pathways by analyzing course completion sequences among top-performing students:
- Consider only top-performing students (those who completed at least
5courses with an average rating of4or higher). - For each top performer, identify the sequence of courses they completed in chronological order.
- Find all consecutive course pairs (
Course A → Course B) taken by these students. - Return the pair frequency, identifying which course transitions are most common among high achievers.
Return the result table ordered by pair frequency in descending order and then by first course name and second course name in ascending order.
The result format is in the following example.
Example:
Input:
course_completions table:
+---------+-----------+------------------+-----------------+---------------+ | user_id | course_id | course_name | completion_date | course_rating | +---------+-----------+------------------+-----------------+---------------+ | 1 | 101 | Python Basics | 2024-01-05 | 5 | | 1 | 102 | SQL Fundamentals | 2024-02-10 | 4 | | 1 | 103 | JavaScript | 2024-03-15 | 5 | | 1 | 104 | React Basics | 2024-04-20 | 4 | | 1 | 105 | Node.js | 2024-05-25 | 5 | | 1 | 106 | Docker | 2024-06-30 | 4 | | 2 | 101 | Python Basics | 2024-01-08 | 4 | | 2 | 104 | React Basics | 2024-02-14 | 5 | | 2 | 105 | Node.js | 2024-03-20 | 4 | | 2 | 106 | Docker | 2024-04-25 | 5 | | 2 | 107 | AWS Fundamentals | 2024-05-30 | 4 | | 3 | 101 | Python Basics | 2024-01-10 | 3 | | 3 | 102 | SQL Fundamentals | 2024-02-12 | 3 | | 3 | 103 | JavaScript | 2024-03-18 | 3 | | 3 | 104 | React Basics | 2024-04-22 | 2 | | 3 | 105 | Node.js | 2024-05-28 | 3 | | 4 | 101 | Python Basics | 2024-01-12 | 5 | | 4 | 108 | Data Science | 2024-02-16 | 5 | | 4 | 109 | Machine Learning | 2024-03-22 | 5 | +---------+-----------+------------------+-----------------+---------------+
Output:
+------------------+------------------+------------------+ | first_course | second_course | transition_count | +------------------+------------------+------------------+ | Node.js | Docker | 2 | | React Basics | Node.js | 2 | | Docker | AWS Fundamentals | 1 | | JavaScript | React Basics | 1 | | Python Basics | React Basics | 1 | | Python Basics | SQL Fundamentals | 1 | | SQL Fundamentals | JavaScript | 1 | +------------------+------------------+------------------+
Explanation:
- User 1: Completed 6 courses with average rating 4.5 (qualifies as top performer)
- User 2: Completed 5 courses with average rating 4.4 (qualifies as top performer)
- User 3: Completed 5 courses but average rating is 2.8 (does not qualify)
- User 4: Completed only 3 courses (does not qualify)
- Course Pairs Among Top Performers:
- User 1: Python Basics → SQL Fundamentals → JavaScript → React Basics → Node.js → Docker
- User 2: Python Basics → React Basics → Node.js → Docker → AWS Fundamentals
- Most common transitions: Node.js → Docker (2 times), React Basics → Node.js (2 times)
Results are ordered by transition_count in descending order, then by first_course in ascending order, and then by second_course in ascending order.
We first filter out all top students, denoted as top_students, i.e., students who have completed at least 5 courses with an average rating of at least 4. Then for each top student, we sort by completion time and find all consecutive course pairs, denoted as course_pairs. Finally, we group and count all course pairs, calculate the occurrence count of each course pair, and output the results sorted as required.
# Write your MySQL query statement below
WITH
top_students AS (
SELECT user_id
FROM course_completions
GROUP BY user_id
HAVING COUNT(1) >= 5 AND AVG(course_rating) >= 4
),
course_pairs AS (
SELECT
course_name AS first_course,
LEAD(course_name) OVER (
PARTITION BY user_id
ORDER BY completion_date
) second_course
FROM
top_students
JOIN course_completions USING (user_id)
)
SELECT
*,
COUNT(1) transition_count
FROM course_pairs
WHERE second_course IS NOT NULL
GROUP BY 1, 2
ORDER BY 3 DESC, 1, 2;import pandas as pd
def topLearnerCourseTransitions(course_completions: pd.DataFrame) -> pd.DataFrame:
grp = course_completions.groupby("user_id")
top_students = grp.filter(
lambda df: df.shape[0] >= 5 and df["course_rating"].mean() >= 4
)["user_id"].unique()
df = course_completions[course_completions["user_id"].isin(top_students)].copy()
df = df.sort_values(["user_id", "completion_date"])
df["second_course"] = df.groupby("user_id")["course_name"].shift(-1)
df["first_course"] = df["course_name"]
pairs = df[df["second_course"].notna()][["first_course", "second_course"]]
result = (
pairs.groupby(["first_course", "second_course"])
.size()
.reset_index(name="transition_count")
.sort_values(
["transition_count", "first_course", "second_course"],
ascending=[False, True, True],
key=lambda col: col.str.lower() if col.dtype == "object" else col,
)
.reset_index(drop=True)
)
return result