题解 | #每天的日活数及新用户占比#

每天的日活数及新用户占比

https://www.nowcoder.com/practice/dbbc9b03794a48f6b34f1131b1a903eb

这里先贴个不用窗口函数的常规思路的解法:


首先是in_time, out_time跨天都计为活跃的问题, 代码如下, 用union合并, 这样in_time和out_time两列转一列, 在根据日期分组, count(distinct t1.uid) 得到每日dau

    select  date(t1.active) as active,
        count(distinct t1.uid) as dau
    from (
        select ul.in_time as active,
            ul.uid
        from tb_user_log ul
        union
        select ul.out_time as active,
            ul.uid
        from tb_user_log ul
    ) t1
    group by date(t1.active)

其次是每日新增uv计算, 代码如下, 先根据uid分组, 求得min(date(in_time)) 得到初始登录日期, 外层在根据min_date分组, 对uid进行去重统计, 得到每天得新增用户数.

    select min_date,
        count(distinct t1.uid) as new_uv
    from (
        select ul.uid,
            min(date(ul.in_time)) as min_date
        from tb_user_log ul
        group by ul.uid
    ) t1
    group by min_date

最后, 我们已经获取了每日得新增用户数, 每日活跃用户数, 那么只需根据日期关联, 即可求得答案
注意点: 左连接, 当没有新增活跃用户时, 为null, 需要置为0

select t2.active,
    t2.dau,
    round(ifnull(t3.new_uv, 0)/t2.dau, 2) as uv_new_ratio
from (
    select  date(t1.active) as active,
        count(distinct t1.uid) as dau
    from (
        select ul.in_time as active,
            ul.uid
        from tb_user_log ul
        union
        select ul.out_time as active,
            ul.uid
        from tb_user_log ul
    ) t1
    group by date(t1.active)
) t2
left join (
    select min_date,
        count(distinct t1.uid) as new_uv
    from (
        select ul.uid,
            min(date(ul.in_time)) as min_date
        from tb_user_log ul
        group by ul.uid
    ) t1
    group by min_date
) t3 on t2.active=t3.min_date
order by t2.active

接着是使用窗口函数求解

窗口函数: count() over(partition by uid order by time) 根据用户分组, 登录时间进行排序, 然后计数, 会得到一列, 当用户为第一次登录时, count() 值为1, 然后依次累加, 这里也可以用使用: min(in_time) over(partition by uid) as min_login, 再将该列和正常时间比较, in_time = min_login就是新用户, 甚至可以用rank() over(partition by uid order by time) 都行.

小声吐槽, 我们公司用的mysql5.7版本的, 都不支持窗口函数

select t2.mau_date,
count(distinct uid) as dau,
    round(sum(if(times=1,1,0))/count(uid), 2)
from (
    select *,
        count(*) over (partition by uid order by t1.mau_date) as times
    from (
        select distinct uid,date(in_time) as mau_date
        from tb_user_log
        union
        select distinct uid,date(out_time) as mau_date
        from tb_user_log
    ) t1
) t2
GROUP by t2.mau_date
order by t2.mau_date
#数据分析师##数据分析工程师#
全部评论
学到了
点赞
送花
回复
分享
发布于 2022-07-28 22:16
求解,为什么是t2 left join t1 呢,新手有点不理解
点赞
送花
回复
分享
发布于 2022-10-02 23:18 贵州
滴滴
校招火热招聘中
官网直投
最外层的count做除数的时候,应该有个distinct去重
点赞
送花
回复
分享
发布于 2023-11-06 08:25 江苏

相关推荐

4 3 评论
分享
牛客网
牛客企业服务