题解 | 近一个月发布的视频中热度最高的top3视频
近一个月发布的视频中热度最高的top3视频
https://www.nowcoder.com/practice/0226c7b2541c41e59c3b8aec588b09ff
with tmp as ( select coalesce(sum(if(TIMESTAMPDIFF(second, u.start_time, u.end_time) >= v.duration,1,0))*1.0/IF(count(*)=0,1,count(*)),0) as overper, ifnull(sum(u.if_like),0) as likeper, sum(IF(u.comment_id is not null,1,0)) as comper, sum(u.if_retweet) as reper, max(u.end_time) as maxtime, v.video_id as video_id, datediff((select max(end_time) from tb_user_video_log),max(u.end_time)) as renoday from tb_video_info as v,tb_user_video_log as u where v.video_id=u.video_id and datediff(( select max(end_time) from tb_user_video_log ),v.release_time)<30 group by v.video_id ) select tmp.video_id, round((100*tmp.overper+5*tmp.likeper+3*tmp.comper+2*tmp.reper)*(1/(tmp.renoday+1))) as hot_index from tmp order by hot_index desc limit 3
- 时间差计算错误 错误代码: minute(u.end_time - u.start_time) -- 错误:返回分钟部分而非总时长 数据验证(以id=4为例): start_time = 2021-10-03 11:00:50 end_time = 2021-10-03 11:01:35 TIMESTAMPDIFF(MINUTE, ...)返回 0分钟(实际时长45秒),但TIMESTAMPDIFF(SECOND, ...)返回 45秒。 修正方案: TIMESTAMPDIFF(SECOND, u.start_time, u.end_time) -- 以秒为单位计算总时长
- 空值判断错误 错误代码: IF(u.comment_id != null, 1, 0) -- 错误:应使用 IS NOT NULL 数据验证:id=1的comment_id为NULL,错误代码返回0,正确代码应返回0。 修正方案: IF(u.comment_id IS NOT NULL, 1, 0)
- GROUP BY字段不完整 错误代码: GROUP BY v.video_id SELECT v.video_id, v.id -- v.id未聚合或分组 数据验证:若多个视频(不同v.id)共享同一video_id,v.id值将不可控。 修正方案: GROUP BY v.video_id, v.id -- 确保所有非聚合字段被分组 三、数据逻辑与计算错误
- 除零错误 错误代码: SUM(...) / COUNT() -- 当COUNT()=0时崩溃 修正方案: COALESCE(SUM(...) / NULLIF(COUNT(*), 0), 0)
- JOIN逻辑冗余 错误代码: FROM tmp JOIN reno ON tmp.id = reno.id -- 冗余JOIN导致数据重复 数据验证:tmp与reno均含id字段,但未明确主键唯一性。 修正方案: -- 直接合并计算,避免JOIN SELECT ..., (1 / (DATEDIFF(max_maxtime, maxtime) + 1)) FROM tmp 四、测试数据验证发现的问题
- 时间差单位误解 数据记录id=5: start_time = 2021-10-02 10:59:05 end_time = 2021-10-02 11:00:04 实际时长:59秒,但TIMESTAMPDIFF(MINUTE)返回 0分钟。 结论:需改用SECOND单位计算。
- 分母有效性验证 全表COUNT()为11:若某视频无观看记录(COUNT()=0),需避免除零错误。
- sum和count使用错误
- 对于时间间隔,比如前30天是包含当天的
- 两个日期相减date1-date2,返回day,使用day(date1-date2)无法得到,应该用datediff(end,start),注意和timestampdiff(minute,start,end)