题解 | 子查询分层聚合 | 每个创作者每月的涨粉率及截止当前的总粉丝量
每个创作者每月的涨粉率及截止当前的总粉丝量
https://www.nowcoder.com/practice/d337c95650f640cca29c85201aecff84
SELECT author, month, fans_growth_rate,
SUM(month_fans) OVER(
PARTITION BY author
ORDER BY month
) AS total_fans
FROM(
SELECT author,
DATE_FORMAT(start_time, '%Y-%m') AS month,
ROUND(SUM(follow_delta)/COUNT(1),3) AS fans_growth_rate,
SUM(follow_delta) AS month_fans
FROM tb_video_info v
JOIN(
SELECT * , CASE
WHEN if_follow = 0 THEN 0
WHEN if_follow = 1 THEN 1
WHEN if_follow = 2 THEN -1 END AS follow_delta
FROM tb_user_video_log
) u
ON u.video_id = v.video_id
WHERE YEAR(start_time) = "2021"
GROUP BY author, DATE_FORMAT(start_time, '%Y-%m')
) AS o
ORDER BY author ASC, total_fans ASC
SQL题解笔记(按“子查询由内到外”拆解)
🧠 总体思路
先在内层做“数据清洗 + 行级计算”,再在中层做“分组聚合”,最后在外层做“窗口累计”。
① 最内层:行为重编码(CASE WHEN)
SELECT *,
CASE
WHEN if_follow = 0 THEN 0
WHEN if_follow = 1 THEN 1
WHEN if_follow = 2 THEN -1
END AS follow_delta
FROM tb_user_video_log
🎯 作用
把“关注行为”转化为“粉丝变化量”:
- 关注 →
+1 - 未关注 →
0 - 取关 →
-1
📌 知识点:CASE WHEN
模板
CASE
WHEN 条件1 THEN 值1
WHEN 条件2 THEN 值2
ELSE 默认值
END
适用场景
- 行为映射(点赞/关注/购买)
- 分类编码
- 构造“增减量”
② 中间层:分组聚合(GROUP BY)
SELECT author,
DATE_FORMAT(start_time, '%Y-%m') AS month,
ROUND(SUM(follow_delta) / COUNT(1), 3) AS fans_growth_rate,
SUM(follow_delta) AS month_fans
FROM ...
GROUP BY author, DATE_FORMAT(start_time, '%Y-%m')
🎯 作用
按「作者 + 月份」统计:
fans_growth_rate:当月涨粉率month_fans:当月净增粉丝数
📌 知识点 1:DATE_FORMAT
模板
DATE_FORMAT(datetime_col, '%Y-%m')
作用
把时间转成“年月分组”,避免:
MONTH(start_time) -- ❌ 会跨年混合
📌 知识点 2:SUM / COUNT / AVG
关键性质(0/1模型)
如果字段是 0/1:
SUM(x) = 总次数 AVG(x) = 比率
但这里是 +1 / 0 / -1:
SUM(follow_delta)→ 净增长SUM / COUNT→ 平均增长率
📌 知识点 3:GROUP BY
模板
GROUP BY 分组字段1, 分组字段2
本题重点
GROUP BY author, month
③ 外层:窗口函数累计(OVER)
SELECT author,
month,
fans_growth_rate,
SUM(month_fans) OVER(
PARTITION BY author
ORDER BY month
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS total_fans
FROM o
🎯 作用
计算:
截至当前月份的累计粉丝数
📌 知识点:窗口函数(SUM OVER)
模板
SUM(字段) OVER(
PARTITION BY 分组字段
ORDER BY 时间字段
)
含义拆解
PARTITION BY author | 每个作者单独计算 |
ORDER BY month | 按时间排序 |
SUM(...) | 累加 |
推荐完整写法
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
👉 表示:从第一行累计到当前行
④ 完整结构总结
SELECT ... 窗口函数
FROM(
SELECT ... GROUP BY
FROM(
SELECT ... CASE WHEN
FROM 明细表
) t1
JOIN 维度表
GROUP BY ...
) t2
⑤ 本题核心流程图
原始日志 ↓ CASE WHEN(行为 → 数值) ↓ GROUP BY(作者 + 月) ↓ 得到:month_fans / growth_rate ↓ 窗口函数 SUM OVER ↓ 得到:累计粉丝 total_fans
⑥ 高频速记
🔹 三层结构
- 内层:CASE WHEN(行级处理)
- 中层:GROUP BY(聚合)
- 外层:OVER(累计/排名)
🔹 三个核心函数
DATE_FORMAT → 时间分组 SUM / AVG → 统计指标 SUM(...) OVER → 累计值
🔹 一句话总结
先把行为变成数值(+1/-1),再按月汇总,最后按时间累计。