题解|有间隔日期的连接|#新用户次日留存率#

2021年11月每天新用户的次日留存率

https://www.nowcoder.com/practice/1fc0e75f07434ef5ba4f1fb2aa83a450

## 思路:分别创建‘每天新用户表’和‘用户活跃表’  ->   使用‘第一天新用户且第二天仍活跃’进行连接,保留相关记录  ->   分组计算留存率
## 额外的思考:一般不会存在第一天in_time,第三天才out_time的情况,所以把in_time和out_time使用UNION进行去重取并集作为活跃日期是合理的
## 第一步:每天新用户表 和 用户活跃表
# (SELECT uid,MIN(DATE(in_time)) AS dt
# FROM tb_user_log
# GROUP BY uid) t1  #每天新用户表


# (SELECT uid,DATE(in_time) AS dt
# FROM tb_user_log
# UINON
# SELECT uid,DATE(out_time) AS dt
# FROM tb_user_log) t2  #用户活跃表


# ## 第二步:使用‘第一天新用户第二天活跃的时间’ 作为连接条件
# t1 LEFT JOIN t2
# ON t1.uid = t2.uid
# AND t1.dt = DATE_SUB(t2.dt,INTERVAL 1 DAY)  ## 或者写为:t2.dt = DATE_ADD(t1.dt,INTERVAL 1 DAY)
# ## 对这里连接的理解,如果不存在相差1天的日期,那么这行就是只保留了t1作为基准表进行左连接的记录,右侧没有连接记录的地方就是空值。
# ## 所以分别COUNT t2.uid就是第二天仍存在, COUNT t1.uid就是第一天新用户

## 第三步:计算留存率
SELECT t1.dt,ROUND(COUNT(t2.uid)/COUNT(t1.uid),2) AS uv_rate
FROM(
    SELECT uid,MIN(DATE(in_time)) AS dt
    FROM tb_user_log
    GROUP BY uid) t1
LEFT JOIN(
    SELECT uid,DATE(in_time) AS dt
    FROM tb_user_log
    UNION
    SELECT uid,DATE(out_time) AS dt
    FROM tb_user_log)t2
ON t1.uid = t2.uid
AND t1.dt = DATE_SUB(t2.dt,INTERVAL 1 DAY) 
WHERE DATE_FORMAT(t1.dt,'%Y-%m') = '2021-11'
GROUP BY t1.dt
ORDER BY t1.dt

####如果需要同时输出次日&7日留存率按照以下进行:
SELECT t1.dt,ROUND(COUNT(t2.uid)/COUNT(t1.uid),2) AS uv_rate,ROUND(COUNT(t3.uid)/COUNT(t1.uid),2) AS uv_rate_7d
FROM(
    SELECT uid,MIN(DATE(in_time)) AS dt
    FROM tb_user_log
    GROUP BY uid) t1
LEFT JOIN(
    SELECT uid,DATE(in_time) AS dt
    FROM tb_user_log
    UNION
    SELECT uid,DATE(out_time) AS dt
    FROM tb_user_log)t2
ON t1.uid = t2.uid
AND t1.dt = DATE_SUB(t2.dt,INTERVAL 1 DAY) 
LEFT JOIN(
    SELECT uid,DATE(in_time) AS dt
    FROM tb_user_log
    UNION
    SELECT uid,DATE(out_time) AS dt
    FROM tb_user_log)t3
ON t1.uid = t3.uid
AND t1.dt = DATE_SUB(t3.dt,INTERVAL 6 DAY) 
WHERE DATE_FORMAT(t1.dt,'%Y-%m') = '2021-11'
GROUP BY t1.dt
ORDER BY t1.dt

全部评论

相关推荐

点赞 收藏 评论
分享
牛客网
牛客企业服务