题解 | #每天的日活数及新用户占比#
每天的日活数及新用户占比
https://www.nowcoder.com/practice/dbbc9b03794a48f6b34f1131b1a903eb
1.做标记,将表里记录数据标记为当天是否新用户,是否跨天
1)是否新用户,在今天之前没有出现过
if(a.uid not in(select distinct b.uid from tb_user_log b where a.in_time > b.in_time),1,0) as if_new
2)是否跨天且跨天后,该用户是否出现过(跨天的在后面可以在第二天活跃数加1)
case when datediff(date(a.out_time),date(a.in_time))=1
and a.uid not in(select distinct b.uid from tb_user_log b where date(b.in_time)=date(a.out_time))then 1 else 0 end as if_kua
2.计算每日的活跃数,新用户总数,跨天的数据
select it,count(distinct uid) u,sum(if_new) n,sum(if_kua) k from t group by it
3.因为活跃数需加上前一天跨天的用户,使用窗口函数获取按时间排序的上一条数据的k值,当k>0时,说明前一天有数据跨天且不在count(distinct uid)里,那么活跃数 u 需加上k,否则活跃数仅为u
case when lag(k) over(order by it)>0 then (lag(k) over(order by it))+u else u end as hy
4.计算总数,排序
完整代码:
with t as(
select a.uid,date(a.in_time) it,date(a.out_time) ot,
if(a.uid not in(select distinct b.uid from tb_user_log b where a.in_time > b.in_time),1,0) as if_new,
case when datediff(date(a.out_time),date(a.in_time))=1
and a.uid not in(select distinct b.uid from tb_user_log b where date(b.in_time)=date(a.out_time))then 1 else 0 end as if_kua
from tb_user_log a
order by it
),
t1 as(
select it,count(distinct uid) u,sum(if_new) n,sum(if_kua) k from t group by it
),
t2 as(
select it,case when lag(k) over(order by it)>0 then (lag(k) over(order by it))+u else u end as hy,n from t1
)
select it,hy,round(n/hy,2) from t2 order by it