题解 | #试卷完成数同比2020年的增长率及排名变化#
每月及截止当月的答题情况
http://www.nowcoder.com/practice/1ce93d5cec5c4243930fc5e8efaaca1e
看了大家的题解主要以做表的连接为主,这里分享一下不用连接的做法;
步骤拆解:
- 输出每个月份的用户明细并求出用户最早出现的月份作为子表;
- 求出每个月的月活用户数和新增用户数;
- 求出截止当月的单月最大新增用户数、截止当月的累积用户数,结果按月份升序输出;
第一步,输出每个月份的用户明细并求出用户最早出现的月份作为子表:
- 输出不同月份:date_format(start_time,'%Y%m') as start_month
- 在不破坏表结构的情况下输出每个用户最早出现的月份early_month:min(date_format(start_time,'%Y%m')) over(partition by uid order by date_format(start_time,'%Y%m')) as early_month;
(select
uid,
date_format(start_time,'%Y%m') as start_month,
min(date_format(start_time,'%Y%m')) over(partition by uid
order by date_format(start_time,'%Y%m')) as early_month
from exam_record) as t1
第二步,求出每个月的月活用户数和新增用户数;
- 根据月份分组输出月活用户数(注意用distinct去重):count(distinct uid) as mau,group by start_month;
- 根据月份分组输出新增用户数,early_month是每个用户最早出现的月份(新增用户即start_month等于early_month的uid去重):count(distinct if(start_month=early_month,uid,null)) as month_add_cv;
第三步,求出截止当月的单月最大新增用户数、截止当月的累积用户数:
- 求出截至当月的单月最大新增用户数(按月份升序扩大窗口求最大值):max(month_add_cv) over(order by start_month);
- 求出截至当月的累积用户数(按月份升序扩大窗口求累计值):sum(month_add_cv) over(order by start_month);
- 结果按月份升序输出:order by start_month;
最终代码拼接如下:
select
start_month,mau,month_add_cv,
max(month_add_cv) over(order by start_month),
sum(month_add_cv) over(order by start_month)
from
(select
start_month,
count(distinct uid) as mau,
count(distinct if(start_month=early_month,uid,null)) as month_add_cv
from
(select
uid,
date_format(start_time,'%Y%m') as start_month,
min(date_format(start_time,'%Y%m')) over(partition by uid
order by date_format(start_time,'%Y%m')) as early_month
from exam_record) as t1
group by start_month) as t2
order by start_month;