题解 | #国庆期间每类视频点赞量和转发量#
国庆期间每类视频点赞量和转发量
https://www.nowcoder.com/practice/f90ce4ee521f400db741486209914a11
看了高赞回答,自己再做一遍
1. 求出每类tag每天的点赞量和转发量
SELECT tag, DATE_FORMAT(start_time,'%Y-%m-%d') dt, sum(if_like) like_cnt, sum(if_retweet) retweet_cnt FROM tb_user_video_log log LEFT JOIN tb_video_info info USING(video_id) GROUP BY tag, dt
2. 统计累计7天的点赞,一周内最大单天转发量
SELECT tag, dt, sum(like_cnt) over(partition by tag order by dt ROWS 6 preceding) sum_like_cnt_7d, max(retweet_cnt) over(partition by tag order by dt ROWS 6 preceding) max_retweet_cnt_7d from (SELECT tag, DATE_FORMAT(start_time,'%Y-%m-%d') dt, sum(if_like) like_cnt, sum(if_retweet) retweet_cnt FROM tb_user_video_log log LEFT JOIN tb_video_info info USING(video_id) WHERE DATE(start_time) BETWEEN DATE_ADD('2021-10-01', INTERVAL -6 DAY) AND '2021-10-03' #如果日期隔天,为了不让它统计到之前的数据 GROUP BY tag, dt ) t
3.筛选出2021年国庆头3天的数据,不能直接筛选,因为有partition by函数
with t1 as (SELECT tag, dt, sum(like_cnt) over(partition by tag order by dt ROWS 6 preceding) sum_like_cnt_7d, max(retweet_cnt) over(partition by tag order by dt ROWS 6 preceding) max_retweet_cnt_7d from (SELECT tag, DATE_FORMAT(start_time,'%Y-%m-%d') dt, sum(if_like) like_cnt, sum(if_retweet) retweet_cnt FROM tb_user_video_log log LEFT JOIN tb_video_info info USING(video_id) WHERE DATE(start_time) BETWEEN DATE_ADD('2021-10-01', INTERVAL -6 DAY) AND '2021-10-03' #如果日期隔天,为了不让它统计到之前的数据 GROUP BY tag, dt ) t) SELECT * from t1 WHERE dt BETWEEN '2021-10-01' and '2021-10-03' order by tag desc, dt
#注意:'2021-10-01' 不能写成'2021-10-1' ,不然查不出来
#有一个疑问,开窗向前取6行会取到七天之前的数据吗?这个应该不会,因为每天都统计了数据,不存在数据会跳着取
优化代码:
SELECT * from (SELECT tag, DATE_FORMAT(start_time,'%Y-%m-%d') dt, sum(sum(if_like)) over(partition by tag order by DATE_FORMAT(start_time,'%Y-%m-%d') rows 6 preceding) sum_like_cnt_7d, max(sum(if_retweet)) over(partition by tag order by DATE_FORMAT(start_time,'%Y-%m-%d') rows 6 preceding) max_retweet_cnt_7d FROM tb_user_video_log log LEFT JOIN tb_video_info info USING(video_id) WHERE DATE(start_time) BETWEEN DATE_ADD('2021-10-01', INTERVAL -6 DAY) AND '2021-10-03' #如果日期隔天,为了不让它统计到之前的数据 GROUP BY tag, dt) t WHERE dt BETWEEN '2021-10-01' and '2021-10-03' order by tag desc, dt