题解 | #每天的日活数及新用户占比#
每天的日活数及新用户占比
https://www.nowcoder.com/practice/dbbc9b03794a48f6b34f1131b1a903eb
这里先贴个不用窗口函数的常规思路的解法:
首先是in_time, out_time跨天都计为活跃的问题, 代码如下, 用union合并, 这样in_time和out_time两列转一列, 在根据日期分组, count(distinct t1.uid) 得到每日dau
select date(t1.active) as active, count(distinct t1.uid) as dau from ( select ul.in_time as active, ul.uid from tb_user_log ul union select ul.out_time as active, ul.uid from tb_user_log ul ) t1 group by date(t1.active)
其次是每日新增uv计算, 代码如下, 先根据uid分组, 求得min(date(in_time)) 得到初始登录日期, 外层在根据min_date分组, 对uid进行去重统计, 得到每天得新增用户数.
select min_date, count(distinct t1.uid) as new_uv from ( select ul.uid, min(date(ul.in_time)) as min_date from tb_user_log ul group by ul.uid ) t1 group by min_date
最后, 我们已经获取了每日得新增用户数, 每日活跃用户数, 那么只需根据日期关联, 即可求得答案
注意点: 左连接, 当没有新增活跃用户时, 为null, 需要置为0
select t2.active, t2.dau, round(ifnull(t3.new_uv, 0)/t2.dau, 2) as uv_new_ratio from ( select date(t1.active) as active, count(distinct t1.uid) as dau from ( select ul.in_time as active, ul.uid from tb_user_log ul union select ul.out_time as active, ul.uid from tb_user_log ul ) t1 group by date(t1.active) ) t2 left join ( select min_date, count(distinct t1.uid) as new_uv from ( select ul.uid, min(date(ul.in_time)) as min_date from tb_user_log ul group by ul.uid ) t1 group by min_date ) t3 on t2.active=t3.min_date order by t2.active
接着是使用窗口函数求解
窗口函数: count() over(partition by uid order by time) 根据用户分组, 登录时间进行排序, 然后计数, 会得到一列, 当用户为第一次登录时, count() 值为1, 然后依次累加, 这里也可以用使用: min(in_time) over(partition by uid) as min_login, 再将该列和正常时间比较, in_time = min_login就是新用户, 甚至可以用rank() over(partition by uid order by time) 都行.
小声吐槽, 我们公司用的mysql5.7版本的, 都不支持窗口函数
select t2.mau_date, count(distinct uid) as dau, round(sum(if(times=1,1,0))/count(uid), 2) from ( select *, count(*) over (partition by uid order by t1.mau_date) as times from ( select distinct uid,date(in_time) as mau_date from tb_user_log union select distinct uid,date(out_time) as mau_date from tb_user_log ) t1 ) t2 GROUP by t2.mau_date order by t2.mau_date#数据分析师##数据分析工程师#