题解 | #牛客每个人最近的登录日期(四)#
牛客每个人最近的登录日期(四)
https://www.nowcoder.com/practice/e524dc7450234395aa21c75303a42b0a
#法一:通过分组用户,先找出每个用户第一次登录的日期,形成新表,此表意义=每日登录的新用户,然后对这个新表按日期分组,聚合函数统计用户数即可
/* select
t1.fir as date,
count(t1.user_id) as new
from
(
select
min(date) as fir,
user_id
from login
group by user_id
) as t1
group by t1.fir
order by date */
#0的统计不进来
/*select
l1.date as date,
count(case when l2.user_id is not null then 1 else 0 end ) as new
from
login l1 left join login l2
on l1.id=l2.id
where (l2.date, l2.user_id) in
(
select
min(date) as fir,
user_id
from login
group by user_id
)
group by l1.date
order by date */
#法2做不到,count case when end 只记录结果集的1,若无1,返回的是null,不返回0
#ifnull(A,B),A非null返回A,A=null返回B,即日期有新用户数,就返回新用户数,无则返回0,所以需要有一列统计新用户数
/*select
date
from login as l2
where (l2.date, l2.user_id) in
(
select
min(date),
user_id
from login
group by user_id
) */
#挑出用户以及对应的用户第一天登录的日期,形成新表,按日期分组,用count统计每日新登陆数,??count不统计NULL,如果数据是0,还是会返回1,null则不加入计算,还有一个问题,那些没有新用户登录的天数这样子就被忽略了。还是需要自连接
/*select
Y1.t1 as date
,ifnull(Y1.t2,0) as new
from
(select
l1.date as t1
,count( case when l2.user_id is not null then 1 end) as t2
#count case when 如果结果集一直没有1,则无结果返回的是null
from
login l1 left join login l2
on l1.id=l2.id
where (l2.date, l2.user_id) in
(
select
min(date) ,
user_id
from login
group by user_id
)
group by l1.date)as Y1 */
#Y1记录的是每日的新用户登录数,并且无新用户登记为NULL
#成功了
/*select
l1.date as t1
,sum(case when l3.user_id is not null then 1 else 100 end) as h2
#1.count(1)与count(*)得到的结果一致,包含null值。
#2.count(字段)不计算null值
#3.count(null)结果恒为0+count(case when then end 没有指定else时,不满足when后的条件,则返回null),所以不满足when后的条件时,count(case when then end)=0.或者写成count(case when then else null end
from
login l1 left join
(select *
from login l2
where (l2.date, l2.user_id) in
(
select
min(date) ,
user_id
from login
group by user_id
) ) as l3
#妈的,原来连接也能用子查询
on l1.id=l3.id
group by l1.date*/
#左连接后,右表已被剔除的行全为NULL,按左表日期分组,右表中非新用户为NULL
#试一下别人的新思路,即添加新的一行来说明用户在这一天的登录是第几次
select
t1.date as date,
count(case when t1.t_rank=1 then 1 else null end ) as new
from
(
select
*,
row_number()over(partition by user_id order by date asc) as t_rank
from
login
) as t1
group by date
#此时每行都有序号表示对某用户是新登录日or多次登录日
借鉴别人的方法,实在是太绝妙了,精简了很多。
把计天数用窗口函数直接转移成组间排名
窗口函数能在不动原数据的情况下进行额外操作
而where则是改动了原数据