题解 | #连续两次作答试卷的最大时间窗#
连续两次作答试卷的最大时间窗
https://www.nowcoder.com/practice/9dcc0eebb8394e79ada1d4d4e979d73c
/*位移开窗连续两次的时间差形成新的表t1,从t1中取出uid,按照uid分组,取出时间差的最大值*/
select uid
,days_windows
,round(days_windows * all_avg,2) AS avg_exam_cnt
from(
select uid
,max(timestampdiff(day,start_time,time_lag))+1 AS days_windows
,count(start_time) / (timestampdiff(day,min(start_time),max(start_time))+1) AS all_avg
from(
select uid
,date(start_time) AS start_time
/*下一次作答时间*/
,date(lead(start_time,1,null) over(partition by uid order by start_time asc)) AS time_lag
from exam_record
where year(start_time) = 2021
)t1
group by uid
)t2
where days_windows > 1
order by days_windows desc
,avg_exam_cnt desc
1、用窗口函数lead()增加一列time_lag记录下一次答题时间,统一转化为date()日期格式,这样在使用timestampdiff()时只会计算日期差,原始时间格式用timestampdiff()计算时会考虑到具体的时间
2、开窗函数得到表t1,按照uid 进行group by,可以得到每一次答题和下一次答题的时间窗的最大值,即可得到最大时间窗。
同时可以得到第一次答题时间min(start_time)和最后一次答题时间max(start_time)的天数差,从而求得这段时间内的平均每天作答试卷数。形成表t2
3、t2的最大时间窗*平均每天作答试卷数,可以得到在days_windows内的作答试卷数量。

