最大连续回答问题天数大于等于3天的用户及其对应等级
某乎问答最大连续回答问题天数大于等于3天的用户及其对应等级
https://www.nowcoder.com/practice/e080f8a685bc4af3b47749ca3310f1fd
select
a.author_id,
b.author_level,
a.days_cnt
FROM
(
select
t1.author_id,
count(distinct t1.answer_date) days_cnt
from
(
select
t.author_id,
t.answer_date,
t.answer_date - t.ranking dt
FROM
(
select
author_id,
answer_date,
dense_rank() over (
PARTITION BY
author_id
order by
answer_date
) ranking
from
answer_tb
) t
) t1
GROUP BY
t1.author_id,
t1.dt
having
count(distinct t1.answer_date) >= 3
) a
LEFT JOIN author_tb b on a.author_id = b.author_id
order by
a.author_id;
思路重点,在于计算连续签到的天数,
【核心思想】在于如果连续签到,则签到日期与签到日期的排序差值所得的日期是一样的
【解题思路】
- 步骤1:排序
- 步骤2:作差
- 步骤3:分组计数
第一步:排序。按照回答日期,因为一天可能回答多个问题,所以用dense_rank()
select author_id,answer_date,dense_rank() over (PARTITION BY author_id order by answer_date) ranking from answer_tb
第二步:作差。用answer_date-ranking,得到日期dt。(dt是判断连续天数的关键)
select t.author_id,t.answer_date,t.answer_date-t.ranking dt FROM (select author_id,answer_date,dense_rank() over
(PARTITION BY author_id order by answer_date) ranking from answer_tb) t
实际输出:(取一部分看结果)
101 | 2021-11-01 | 20211100 第一行与第二行用户id和dt均相同,所以需要去重,去重 |
101 | 2021-11-01 | 20211100 之后利用count计数,此时只会将同一用户同一id计数为1 |
101 | 2021-11-02 | 20211100 |
101 | 2021-11-03 | 20211100 3 |
102 | 2021-11-01 | 20211100 |
102 | 2021-11-01 | 20211100 1 |
103 | 2021-11-01 | 20211100 1 |
第三步:分组。按照author_id和dt分组,依照answer_date去重计数,本题涉及连续回答天数>=3,所以分组后,还需使用having筛选数据
select t1.author_id,count(distinct t1.answer_date) days_cnt from (select t.author_id,t.answer_date,t.answer_date-t.rankingdt FROM (select author_id,answer_date,dense_rank() over (PARTITION BY author_id order by answer_date) ranking from
answer_tb) t) t1 GROUP BY t1.author_id,t1.dt having count(distinct t1.answer_date)>=3;
实际输出:
101 | 3 |
第四步:查找用户等级。将author_tb表与步骤三中的表作表连接即可,最后再按author_id升序排列。
-------------------------------------------------------------------------------------
rank,dense_rank,row_number窗口函数的区别:
rank(排名函数)
RANK函数用于为每一行分配一个排名值,如果有多个行具有相同的排序值,它们会共享相同的排名,下一个排名将会跳过这些相同排名的行。
dense_rank(稠密排名函数)
DENSE_RANK函数也为每一行分配一个排名值,但与RANK函数不同,它不会跳过排名相同的行,而是按照排序顺序连续分配排名,因此可能会出现相同的排名值
row_number(行号函数)
ROW_NUMBER函数为每一行分配一个唯一的行号,不考虑排序值是否相同。每一行都会获得一个递增的整数行号,从1开始。
查看2道真题和解析