现有用户信息表user_info(uid用户ID,nick_name昵称, achievement成就值, level等级, job职业方向, register_time注册时间):
试卷信息表examination_info(exam_id试卷ID, tag试卷类别, difficulty试卷难度, duration考试时长, release_time发布时间):
试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分):
请统计SQL试卷上未完成率较高的50%用户中,6级和7级用户在有试卷作答记录的近三个月中,每个月的答卷数目和完成数目。按用户ID、月份升序排序。
id | uid | nick_name | achievement | level | job | register_time |
1 | 1001 | 牛客1号 | 3200 | 7 | 算法 | 2020-01-01 10:00:00 |
2 | 1002 | 牛客2号 | 2500 | 6 | 算法 | 2020-01-01 10:00:00 |
3 | 1003 | 牛客3号♂ | 2200 | 5 | 算法 | 2020-01-01 10:00:00 |
试卷信息表examination_info(exam_id试卷ID, tag试卷类别, difficulty试卷难度, duration考试时长, release_time发布时间):
id | exam_id | tag | difficulty | duration | release_time |
1 | 9001 | SQL | hard | 60 | 2020-01-01 10:00:00 |
2 | 9002 | SQL | hard | 80 | 2020-01-01 10:00:00 |
3 | 9003 | 算法 | hard | 80 | 2020-01-01 10:00:00 |
4 | 9004 | PYTHON | medium | 70 | 2020-01-01 10:00:00 |
试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分):
id | uid | exam_id | start_time | submit_time | score |
1 | 1001 | 9001 | 2020-01-01 09:01:01 | 2020-01-01 09:21:59 | 90 |
15 | 1002 | 9001 | 2020-01-01 18:01:01 | 2020-01-01 18:59:02 | 90 |
13 | 1001 | 9001 | 2020-01-02 10:01:01 | 2020-01-02 10:31:01 | 89 |
2 | 1002 | 9001 | 2020-01-20 10:01:01 | | |
3 | 1002 | 9001 | 2020-02-01 12:11:01 | | |
5 | 1001 | 9001 | 2020-03-01 12:01:01 | | |
6 | 1002 | 9001 | 2020-03-01 12:01:01 | 2020-03-01 12:41:01 | 90 |
4 | 1003 | 9001 | 2020-03-01 19:01:01 | | |
7 | 1002 | 9001 | 2020-05-02 19:01:01 | 2020-05-02 19:32:00 | 90 |
14 | 1001 | 9002 | 2020-01-01 12:11:01 | | |
8 | 1001 | 9002 | 2020-01-02 19:01:01 | 2020-01-02 19:59:01 | 69 |
9 | 1001 | 9002 | 2020-02-02 12:01:01 | 2020-02-02 12:20:01 | 99 |
10 | 1002 | 9002 | 2020-02-02 12:01:01 | | |
11 | 1002 | 9002 | 2020-02-02 12:01:01 | 2020-02-02 12:43:01 | 81 |
12 | 1002 | 9002 | 2020-03-02 12:11:01 | | |
17 | 1001 | 9002 | 2020-05-05 18:01:01 | | |
16 | 1002 | 9003 | 2020-05-06 12:01:01 | | |
由示例数据结果输出如下:
uid | start_month | total_cnt | complete_cnt |
1002 | 202002 | 3 | 1 |
1002 | 202003 | 2 | 1 |
1002 | 202005 | 2 | 1 |
解释:各个用户对SQL试卷的未完成数、作答总数、未完成率如下:
uid | incomplete_cnt | total_cnt | incomplete_rate |
1001 | 3 | 7 | 0.4286 |
1002 | 4 | 8 | 0.5000 |
1003 | 1 | 1 | 1.0000 |
1001、1002、1003分别排在1.0、0.5、0.0的位置,因此较高的50%用户(排位<=0.5)为1002、1003;
1003不是6级或7级;
有试卷作答记录的近三个月为202005、202003、202002;
这三个月里1002的作答题数分别为3、2、2,完成数目分别为1、1、1。
示例1
输入
drop table if exists examination_info,user_info,exam_record; CREATE TABLE examination_info ( id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID', exam_id int UNIQUE NOT NULL COMMENT '试卷ID', tag varchar(32) COMMENT '类别标签', difficulty varchar(8) COMMENT '难度', duration int NOT NULL COMMENT '时长', release_time datetime COMMENT '发布时间' )CHARACTER SET utf8 COLLATE utf8_general_ci; CREATE TABLE user_info ( id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID', uid int UNIQUE NOT NULL COMMENT '用户ID', `nick_name` varchar(64) COMMENT '昵称', achievement int COMMENT '成就值', level int COMMENT '用户等级', job varchar(32) COMMENT '职业方向', register_time datetime COMMENT '注册时间' )CHARACTER SET utf8 COLLATE utf8_general_ci; CREATE TABLE exam_record ( id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID', uid int NOT NULL COMMENT '用户ID', exam_id int NOT NULL COMMENT '试卷ID', start_time datetime NOT NULL COMMENT '开始时间', submit_time datetime COMMENT '提交时间', score tinyint COMMENT '得分' )CHARACTER SET utf8 COLLATE utf8_general_ci; INSERT INTO user_info(uid,`nick_name`,achievement,`level`,job,register_time) VALUES (1001, '牛客1', 3200, 7, '算法', '2020-01-01 10:00:00'), (1002, '牛客2号', 2500, 6, '算法', '2020-01-01 10:00:00'), (1003, '牛客3号♂', 2200, 5, '算法', '2020-01-01 10:00:00'); INSERT INTO examination_info(exam_id,tag,difficulty,duration,release_time) VALUES (9001, 'SQL', 'hard', 60, '2020-01-01 10:00:00'), (9002, 'SQL', 'hard', 80, '2020-01-01 10:00:00'), (9003, '算法', 'hard', 80, '2020-01-01 10:00:00'), (9004, 'PYTHON', 'medium', 70, '2020-01-01 10:00:00'); INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES (1001, 9001, '2020-01-01 09:01:01', '2020-01-01 09:21:59', 90), (1002, 9001, '2020-01-20 10:01:01', null, null), (1002, 9001, '2020-02-01 12:11:01', null, null), (1003, 9001, '2020-03-01 19:01:01', null, null), (1001, 9001, '2020-03-01 12:01:01', null, null), (1002, 9001, '2020-03-01 12:01:01', '2020-03-01 12:41:01', 90), (1002, 9001, '2020-05-02 19:01:01', '2020-05-02 19:32:00', 90), (1001, 9002, '2020-01-02 19:01:01', '2020-01-02 19:59:01', 69), (1001, 9002, '2020-02-02 12:01:01', '2020-02-02 12:20:01', 99), (1002, 9002, '2020-02-02 12:01:01', null, null), (1002, 9002, '2020-02-02 12:01:01', '2020-02-02 12:43:01', 81), (1002, 9002, '2020-03-02 12:11:01', null, null), (1001, 9001, '2020-01-02 10:01:01', '2020-01-02 10:31:01', 89), (1001, 9002, '2020-01-01 12:11:01', null, null), (1002, 9001, '2020-01-01 18:01:01', '2020-01-01 18:59:02', 90), (1002, 9003, '2020-05-06 12:01:01', null, null), (1001, 9002, '2020-05-05 18:01:01', null, null);
输出
1002|202002|3|1 1002|202003|2|1 1002|202005|2|1
uid,start_month,count(start_month) as total_cnt,count(submit_time) as complete_cnt
from
(select
uid,date_format(start_time,'%Y%m') as start_month,submit_time,
dense_rank() over(partition by uid order by substr(start_time,1,7)desc ) as ranking
from exam_record
)a
where ranking<4
and uid in
(
select uid
from
(select uid,level,ntile(2)over( order by incomplete_rate desc) as ranking
from
(
select a.uid,level,count(start_time) as total_times,count(submit_time) as submit_times,1-count(submit_time)/count(start_time) as incomplete_rate
from exam_record a
left join examination_info b on a.exam_id = b.exam_id
left join user_info c on a.uid = c.uid
where tag='SQL'
group by uid,level
)aa
)bb
where ranking =1
and level in (6,7)
)
group by uid,start_month
order by uid asc,start_month asc
发表于 2022-11-22 15:49:38
回复(1)
更多回答
# 第一步,先找出未完成率前50%高的用户ID,注意这里需要的sql试卷 with rote_tab as (select t.uid,t.f_rote,row_number()over(order by t.f_rote desc,uid) as rank2 ,count(t.uid)over(partition by t.tag)as cnt from (select er.uid,ef.tag,(sum(if(submit_time is null,1,0))/count(start_time)) as f_rote from exam_record er left join examination_info ef on ef.exam_id=er.exam_id where tag='SQL' group by uid ) t) select #第四步,分用户和月份进行数据统计;同时需要注意,统计的试卷数是所有类型的,不是之前仅有SQL类型 uid ,start_month ,count(start_time) as total_cnt ,count(submit_time) as complete_cnt from ( select # 第三步,利用窗口函数对每个用户的月份进行降序排序,以便找出最近的三个月; uid ,start_time ,submit_time ,date_format(start_time,'%Y%m') as start_month ,dense_rank()over(partition by uid order by date_format(start_time,'%Y%m') desc) as rank3 from exam_record where uid in (select distinct er.uid from exam_record er left join user_info uf on uf.uid=er.uid where er.uid in (select uid from rote_tab where rank2<=round(cnt/2,0)) and uf.level in (6,7)) # 第二步,进一步找出满足等级为6或7的用户ID ) t2 where rank3<=3 group by uid,start_month order by uid,start_month
编辑于 2021-11-17 11:03:33
回复(20)
统计SQL试卷上未完成率:
with t1 as ( select uid, sum(if(submit_time is null,1,0)) as incomplete_cnt, count(start_time) as total_cnt, sum(if(submit_time is null,1,0))/count(start_time) as incomplete_rate from exam_record where exam_id in ( select exam_id from examination_info where tag = 'SQL' ) group by uid )
请统计SQL试卷上用户未完成率中的百分比排位
,t2 as ( select *, percent_rank() over(order by incomplete_rate desc) as prk from t1 )!月份: 有试卷作答记录的近三个月
试过了datediff和timestampdiff发现很难实现,因为对处理过的月份函数不再适用
所以先对月份进行格式转换 再用窗口函数进行降序排序 找出最近3个月
, t3 as ( select id, uid, date_format(start_time,'%Y%m') as start_month, dense_rank() over(order by date_format(start_time,'%Y%m') desc) as drk, submit_time from exam_record )最终统计每个用户每个月的答卷数目和完成数目
条件:1. SQL试卷上未完成率较高的50%用户
2. 6级和7级用户
3. 有试卷作答记录的近三个月
完整代码如下
select uid, start_month, count(start_month) as total_cnt, count(submit_time) as complete_cnt from t3 where uid in (select uid from t2 where prk <= 0.5) and uid in (select uid from user_info where level in (6,7)) and drk<=3 group by uid,start_month order by uid,start_month
with t1 as ( select uid, sum(if(submit_time is null,1,0)) as incomplete_cnt, count(start_time) as total_cnt, sum(if(submit_time is null,1,0))/count(start_time) as incomplete_rate from exam_record where exam_id in ( select exam_id from examination_info where tag = 'SQL' ) group by uid ) ,t2 as ( select *, percent_rank() over(order by incomplete_rate desc) as prk from t1 ) , t3 as ( select id, uid, date_format(start_time,'%Y%m') as start_month, dense_rank() over(order by date_format(start_time,'%Y%m') desc) as drk, submit_time from exam_record ) select uid, start_month, count(start_month) as total_cnt, count(submit_time) as complete_cnt from t3 where uid in (select uid from t2 where prk <= 0.5) and uid in (select uid from user_info where level in (6,7)) and drk<=3 group by uid,start_month order by uid,start_month
发表于 2022-08-06 11:17:01
回复(0)
第一遍写的代码感觉逻辑上没问题好久没通过
select uid, date_format(start_time,'%Y%m') as start_month, count(id)as total_cnt, count(score)as complete_cnt from ( select b.*, dense_rank()over(partition by a.uid order by date_format(b.start_time,'%Y%m') desc) as r1 from user_info a left join exam_record b on a.uid=b.uid where a.level>=6 and a.uid in(select uid from( select er.uid, percent_rank()over(partition by er.uid order by sum(case when er.score is null then 1 else 0 end)/count(er.id) desc) as r2 from exam_record er left join examination_info e on er.exam_id=e.exam_id where e.tag='SQL' group by er.uid)t1 where t1.r2<=0.5 ) )t where r1<=3 group by uid,start_month order by uid,start_month后来发现percent_rank()函数对已分组的表不用再次分组(partition by)
select uid, date_format(start_time,'%Y%m') as start_month, count(id)as total_cnt, count(score)as complete_cnt from ( select b.*, dense_rank()over(partition by uid order by date_format(start_time,'%Y%m') desc) as r1 from user_info a join exam_record b on a.uid=b.uid where a.level>=6 and a.uid in(select uid from( select er.uid, percent_rank()over(order by sum(case when er.score is null then 1 else 0 end)/count(er.id) desc) as r2 from exam_record er join examination_info e on er.exam_id=e.exam_id where e.tag='SQL' group by er.uid)t1 where r2<=0.5 ) )t2 where r1<=3 group by uid,start_month order by uid,start_month
发表于 2022-01-25 14:48:07
回复(3)
【场景】:等级值百分比、近几个月
【分类】:专用窗口函数、嵌套子查询
分析思路
难点:
统计SQL试卷用户未完成率中的百分比排位使用percent_rank()
(1)统计SQL试卷未完成率
- [条件]:tag = 'SQL'
- [使用]:join ;count(); if()
(2)统计SQL试卷用户未完成率中的百分比排位
- [使用]:percent_rank()
(3)统计SQL试卷用户未完成率中的百分比排位中大于50%,并且是6级和7级用户有试卷作答记录的近三个月
- [使用]:dense_rank();date_format(start_time,'%Y%m')
(4)最终统计每个用户每个月的答卷数目和完成数目
- [使用]:where ranking <= 3;group by uid,start_month;order by uid,start_month
最终结果
select 查询结果 [用户ID;月份;答题数目;完成数目] from 从哪张表中查询数据[多个join连接的表] where 查询条件 [百分比排位中大于50%] group by 分组条件 [用户ID;月份] order by 对查询结果排序 [用户ID、月份升序];
扩展:
保留日期中年月的方法
- substring(start_time,1,7);
- replace(substring(start_time,1,7),'-','');
- date_format(start_time,'%Y%m')
- date(start_time)
求解代码
方法一
with子句
with temp as( #统计SQL试卷未完成率 select uid, count(if(submit_time is null,1,null)) as incomplete_cnt,#未完成数 count(start_time) as total_cnt, #作答总数 count(if(submit_time is null,1,null))/count(start_time) as incomplete_rate #未完成率 from user_info join exam_record using(uid) join examination_info using(exam_id) where tag = 'SQL' group by uid ) ,temp1 as( #统计SQL试卷用户未完成率中的百分比排位 select uid, percent_rank() over(order by incomplete_rate) as nt_complete from temp ) ,temp2 as( #6级和7级用户并且有试卷作答记录的近三个月 select uid, start_time, submit_time, date_format(start_time,%Y%m) as start_month, dense_rank() over(partition by uid order by (date_format(start_time,'%y%m')) desc) as ranking from exam_record where uid in ( select uid from temp1 join user_info using(uid) join exam_record using(uid) where nt_complete >= 0.5 and level between 6 and 7 ) ) #最终统计每个用户每个月的答卷数目和完成数目 select uid, start_month, count(start_time) as total_cnt, count(if(submit_time is null,null,1)) as complete_cnt from temp2 where ranking <= 3 group by uid,start_month order by uid,start_month
优化代码
同时进行未完成率的筛选和百分比排位
with temp as( #统计SQL试卷用户未完成率中的百分比排位 select uid, percent_rank() over(order by count(if(submit_time is null,1,null))/count(start_time)) as nt_complete from user_info join exam_record using(uid) join examination_info using(exam_id) where tag = 'SQL' group by uid ) ,temp1 as( #6级和7级用户并且有试卷作答记录的近三个月 select uid, start_time, submit_time, date_format(start_time,'%Y%m') as start_month, dense_rank() over(partition by uid order by (date_format(start_time,'%Y%m')) desc) as ranking from exam_record where uid in ( select uid from temp join user_info using(uid) join exam_record using(uid) where nt_complete >= 0.5 and level between 6 and 7 ) ) #最终统计每个用户每个月的答卷数目和完成数目 select uid, start_month, count(start_time) as total_cnt, count(if(submit_time is null,null,1)) as complete_cnt from temp1 where ranking <= 3 group by uid,start_month order by uid,start_month
temp 表输出结果:
1 1001|0.000 2 1002|0.500 3 1003|1.000
temp1 表输出结果:
1 1002|2020-05-06 12:01:01|None|202005|1 2 1002|2020-05-02 19:01:01|2020-05-02 19:32:00|202005|1 3 1002|2020-03-02 12:11:01|None|202003|2 4 1002|2020-03-01 12:01:01|2020-03-01 12:41:01|202003|2 5 1002|2020-02-02 12:01:01|2020-02-02 12:43:01|202002|3 6 1002|2020-02-02 12:01:01|None|202002|3 7 1002|2020-02-01 12:11:01|None|202002|3 8 1002|2020-01-01 18:01:01|2020-01-01 18:59:02|202001|4 9 1002|2020-01-20 10:01:01|None|202001|4
方法二
嵌套子查询
#最终统计每个用户每个月的答卷数目和完成数目 select uid, start_month, count(start_time) as total_cnt, count(if(submit_time is null,null,1)) as complete_cnt from( #6级和7级用户并且有试卷作答记录的近三个月 select uid, start_time, submit_time, date_format(start_time,'%Y%m') as start_month, dense_rank() over(partition by uid order by (date_format(start_time,'%Y%m')) desc) as ranking from exam_record where uid in ( select uid from( #统计SQL试卷用户未完成率中的百分比排位 select uid, percent_rank() over(order by count(if(submit_time is null,1,null))/count(start_time)) as nt_complete from user_info join exam_record using(uid) join examination_info using(exam_id) where tag = 'SQL' group by uid ) temp join user_info using(uid) join exam_record using(uid) where nt_complete >= 0.5 and level between 6 and 7 ) ) temp1 where ranking <= 3 group by uid,start_month order by uid,start_month
发表于 2022-11-11 23:06:07
回复(0)
SELECT uid, DATE_FORMAT(start_time, '%Y%m') start_month, count(uid) total_cnt, count(score) complete_cnt FROM (SELECT b.*, DENSE_RANK() over (partition by uid order by DATE_FORMAT(start_time, '%Y%m') desc) ranking FROM user_info a LEFT JOIN exam_record b ON a.uid = b.uid WHERE a.level in (6,7) AND a.uid in (SELECT uid from (SELECT b.uid, PERCENT_RANK() over (partition by a.tag order by ROUND((COUNT(b.uid) - COUNT(b.SCORE)) / COUNT(b.uid), 4)) ranking FROM examination_info a left join exam_record b on a.exam_id = b.exam_id where a.tag = 'SQL' group by b.uid) a WHERE ranking >= 0.5)) a where ranking <= 3 group by uid, start_month order by uid, start_month
发表于 2022-01-11 17:06:29
回复(0)
select h.uid,date_format(h.start_time,'%Y%m') as start_month,count(start_time) as total_cnt,count(submit_time) as complete_cnt
from (select uid,start_time,submit_time from (select uid,start_time,submit_time,dense_rank() over (partition by uid order by date_format(start_time,'%Y%m')desc) as ranking
from exam_record )a where ranking<=3)h where uid in (select uid from (select uid,level,row_number() over (order by (incomplete_cnt/total_cnt) desc,uid) as incomplete_rank from
(select a.uid,level,sum(case when submit_time is null then 1 else 0 end) incomplete_cnt,count(start_time) total_cnt
from exam_record a left join examination_info b on a.exam_id=b.exam_id left join user_info c on a.uid=c.uid where tag='sql'
group by uid)t)n where n.level in (6,7) and incomplete_rank <= ceiling((select count(distinct uid) as total_user
from exam_record)/2))
group by h.uid,date_format(start_time,'%Y%m')
from (select uid,start_time,submit_time from (select uid,start_time,submit_time,dense_rank() over (partition by uid order by date_format(start_time,'%Y%m')desc) as ranking
from exam_record )a where ranking<=3)h where uid in (select uid from (select uid,level,row_number() over (order by (incomplete_cnt/total_cnt) desc,uid) as incomplete_rank from
(select a.uid,level,sum(case when submit_time is null then 1 else 0 end) incomplete_cnt,count(start_time) total_cnt
from exam_record a left join examination_info b on a.exam_id=b.exam_id left join user_info c on a.uid=c.uid where tag='sql'
group by uid)t)n where n.level in (6,7) and incomplete_rank <= ceiling((select count(distinct uid) as total_user
from exam_record)/2))
group by h.uid,date_format(start_time,'%Y%m')
order by start_month,total_cnt desc
理解“SQL试卷上未完成率较高的50%用户”这句话比较难,其实就是先按未完成率降序排序后,用未完成率 排序<=总人数一半即可筛出,因为排序的人数和count distictn(uid)是一致的,要求得到未完成率较高的一半用户,就要<=总人数一半。需用到ceiling向上取整,如果是六个人,那未完成率较高的就是6/2=3人,但是如果是5个人,一半就是2.5,向上取整也是3个人
发表于 2021-11-03 10:29:12
回复(6)
select t1.uid,t1.month_d,count(*)as total_cnt,count(t1.submit_time) as complete_cnt from -- 先求出未完成率的排名 (select uid,count(submit_time is null or null)/count(start_time) as num, PERCENT_RANK() over(order by count(submit_time is null or null)/count(start_time) ) as ranking from exam_record left join examination_info using(exam_id) where tag = 'SQL' group by uid)t inner join ( -- 再求出近三个月的练习记录 select uid,date_format(start_time,'%Y%m') as month_d ,submit_time,exam_id ,dense_rank() over( partition by uid order by date_format(start_time,'%Y%m') desc ) as ranking from exam_record left join user_info using(uid) where level in(6,7) )t1 USING (uid) where t1.ranking <=3 and t.ranking >=0.5 -- 使用限制找到符合条件的记录 group by t1.uid,t1.month_d order by t1.uid,t1.month_d
注意求的是所有的答题次数和完成的次数。
注意sql试卷是限制未完成率排名
6,7级用户是限制做题记录
细节较多,写起来比较费劲。在占位百分比计算时最好在本地测试一下,分开运行看一下结果。
发表于 2022-08-23 17:39:00
回复(0)
怎么理解这段话:"SQL试卷上未完成率较高的50%用户中,6级和7级用户在有试卷作答记录的近三个月中"?
答:首先给出用户未完成率的percent_rank,取小于等于50%的,然后再剔除非6/7级的用户和非SQL试卷的用户。
P.S. 如果先剔除非6/7级的用户和非SQL试卷的再排序,就会出错
select uid,start_month,count(start_time) total_cnt,count(submit_time) complete_cnt from (select uid, start_time,submit_time,date_format(start_time,'%Y%m') start_month,dense_rank() over (partition by uid order by date_format(start_time,'%Y%m') desc) lm from exam_record where uid in (select d.uid from (select *,percent_rank() over (order by incomplete_rate desc ) rk from (select a.uid,tag,sum(if(submit_time is null,1,0)) incomplete_cnt, count(start_time) total_cnt, sum(if(submit_time is null,1,0))/ count(start_time) incomplete_rate from exam_record a left join examination_info b on a.exam_id=b.exam_id left join user_info c on a.uid=c.uid #where tag='SQL' and level between 6 and 7 group by uid) c) d left join user_info f on d.uid=f.uid where rk<=0.5 and tag='SQL' and level between 6 and 7) ) e where lm<=3 group by uid,start_month order by uid,start_month;
发表于 2021-12-20 17:37:18
回复(3)
select uid, start_month, total_cnt, complete_cnt from(select uid, date_format(start_time,'%Y%m') start_month, count(start_time) total_cnt, count(submit_time) complete_cnt, rank()over(partition by uid order by date_format(start_time,'%Y%m')desc) month_rank from(select uid from(select uid, percent_rank() over (order by count(submit_time)/count(start_time)) rank1 from exam_record where exam_id in(select exam_id from examination_info where tag='SQL') group by uid)t1 where rank1<=0.5 and uid in(select uid from user_info where level in(6,7)))t2 left join exam_record using (uid) group by uid,start_month)t3 where month_rank in(1,2,3) order by 1,2
发表于 2022-09-10 12:31:04
回复(0)
select uid,start_month,total_cnt,complete_cnt from (select uid, date_format(start_time,'%Y%m') start_month, count(start_time) total_cnt, count(submit_time) complete_cnt, dense_rank() over (partition by uid order by date_format(start_time,'%Y%m') desc) t_rank from exam_record where uid in (select a.uid from (select uid, avg(case when submit_time is null then 1 else 0 end) incomplete_rate, percent_rank() over (order by avg(case when submit_time is null then 1 else 0 end) desc) pr from exam_record inner join examination_info on exam_record.exam_id = examination_info.exam_id where examination_info.tag = 'SQL' group by uid) as a inner join user_info ui on ui.uid = a.uid where (ui.level = 6&nbs***bsp;ui.level = 7) and pr <= 0.5) group by uid,start_month order by uid,start_month) as b where t_rank <= 3
发表于 2021-11-20 14:47:36
回复(1)
大佬们 我这sql哪有问题呀 debug半天了呜呜呜
select t3.uid, t3.start_month, t3.total_cnt, t3.complete_cnt
from
(select t2.uid, date_format(t2.start_time,'%Y%m') as start_month, count(t2.start_time) as total_cnt, count(t2.submit_time) as complete_cnt, count(if(t2.submit_time is null,1,null))/count(t2.start_time) as incomplete_rate
from
(select *,
dense_rank()over(partition by t.uid
order by date_format(t.start_time, '%Y%m') desc) as ranking
from
(select uid,e3.exam_id,start_time,submit_time,score,e2.tag from exam_record as e3
left join examination_info as e2
on e3.exam_id = e2.exam_id
) as t
where t.tag = 'SQL') as t2
where t2.ranking <= 3
group by t2.uid, start_month
order by t2.uid asc, start_month asc) as t3
where t3.incomplete_rate >= 0.5 and t3.uid in (select uid from user_info where level >= 6);
编辑于 2024-04-25 10:54:56
回复(0)
# 一步步拆分,不断试错
with zb as
(select er.uid,tag,level,date_format(start_time,'%Y%m') start_month,submit_time,dense_rank() over
(partition by uid order by date_format(start_time,'%Y%m') desc) rk
from user_info ui,examination_info ei,exam_record er
where ui.uid = er.uid and ei.exam_id = er.exam_id)
select uid,start_month,count(start_month) total_cnt,count(submit_time) complete_cnt
from zb
where rk <= 3 and level in (6,7) and uid in
(select uid from
(select *,b/a avg_1,percent_rank() over(order by b/a) rk1 from
(select uid,count(start_month) a ,count(start_month)-count(submit_time) b
from zb
where tag = "SQL"
group by uid) z1) z2 where rk1 >=0.5)
group by uid,start_month
order by uid,start_month;
编辑于 2023-12-01 21:52:29
回复(0)
select uid, date_format(start_time,'%Y%m') start_month, count(id) total_cnt, count(score) complete_cnt from ( select a.*, dense_rank() over(partition by uid order by date_format(start_time,' %Y%m') desc) r1 from exam_record a join user_info b using(uid) where level>=6 and uid in ( select uid from ( select uid,percent_rank() over(order by sum(if(score is null,1,0))/count( er.id)desc) r2 from exam_record er join examination_info eo using(exam_id) where tag='SQL' group by uid ) t1 where r2<=0.5 ) ) t2 where r1<=3 group by 1,2 order by 1,22.
with t1 as (select er.uid, date_format(start_time,'%Y%m') start_month, score, dense_rank()over(partition by uid order by date_format(start_time,'%Y%m') desc) d_r from exam_record er join user_info using(uid)), t2 as (select er.uid, percent_rank()over(order by (count(1)-count(score))/count(1) desc) p_r from exam_record er join examination_info using(exam_id) where tag='SQL' group by 1) select uid, start_month, count(1) total_cnt, count(score) complete_cnt from t1 where uid in (select uid from user_info where level in (6,7)) and uid in (select uid from t2 where p_r<=0.5) and d_r<4 group by 1,2 order by 1,2
编辑于 2023-12-15 22:36:51
回复(0)
-- 解法一: select t.uid, date_format(start_time,'%Y%m') as start_month, count(start_time) as total_cnt, count(submit_time) as complete_cnt from ( select *, dense_rank() over(partition by uid order by date_format(start_time,'%Y%m') desc) as ranking from exam_record ) t right join user_info ui on ui.uid = t.uid where level in (6,7) and t.ranking <= 3 and t.uid in ( select uid from ( select uid, percent_rank() over(order by sum(isnull(submit_time)) / count(start_time) desc) pr from exam_record er join examination_info ei on er.exam_id = ei.exam_id where tag = 'SQL' group by uid ) t2 where pr <= 0.5 ) group by t.uid,date_format(start_time,'%Y%m') order by t.uid,date_format(start_time,'%Y%m'); -- 解法二: with t1 as ( select uid, date_format(start_time,'%Y%m') start_month, count(start_time) total_cnt, count(submit_time) complete_cnt, row_number() over(partition by uid order by date_format(start_time,'%Y%m') desc) rn from exam_record er join examination_info ei on er.exam_id = ei.exam_id where uid in ( select uid from ( select uid, percent_rank() over(order by sum(isnull(submit_time)) / count(start_time) desc) irpr from exam_record er join examination_info ei on er.exam_id = ei.exam_id where tag = 'SQL' group by uid ) t2 where irpr <= 0.5) group by uid,start_month ) select uid, start_month, total_cnt, complete_cnt from t1 where uid in (select uid from user_info where level in (6,7)) and rn <= 3 order by uid,start_month;
发表于 2023-08-25 20:10:40
回复(0)
大佬们,我这sql哪边有问题啊?我提交的时候会多出来1002的数据
select
a.uid,
a.time,
total_cnt,
a.complete_cnt
from
(
select
uid,
date_format (start_time, '%Y%m') as time,
dense_rank() over (
partition by
uid
order by
uid,
date_format (start_time, '%Y%m') desc
) ranking,
count(uid) - count(submit_time) incomplete_cnt,
count(uid) total_cnt,
(1 - count(submit_time) / count(uid)) incomplete_rate,
count(submit_time) complete_cnt
from
exam_record
where
uid in (
select
uid
from
exam_record
where
uid in (
select distinct
uid
from
exam_record
where
uid in (
select
uid
from
user_info
where
level in (6, 7)
)
and exam_id in (
select
exam_id
from
examination_info
where
tag = 'SQL'
)
)
and exam_id in (
select
exam_id
from
examination_info
where
tag = 'SQL'
)
group by
uid
having
(count(submit_time) / count(start_time)) <= 0.5
)
group by
uid,
time
) a
where
a.ranking < 4
order by
a.uid,
a.time
发表于 2023-05-04 14:08:55
回复(1)
全网第一,不接受反驳
with tb as (select uid, date_format(start_time,'%Y%m') start_month, count(start_time) total_cnt, count(score) complete_cnt, row_number() over(partition by uid order by date_format(start_time,'%Y%m') desc) rw from exam_record where uid in (select uid from (select *,max(rn) over() m from (select *,row_number() over(order by rate)-1 rn from (select uid,1-count(score)/count(1) rate from exam_record group by uid)tb1 )tb2 )tb3 where rn >= 0.5*m) and uid in (select uid from user_info where level in (6,7)) group by uid,date_format(start_time,'%Y%m')) select uid, start_month, total_cnt, complete_cnt from tb where rw <= 3 order by uid,start_month;
发表于 2023-04-16 14:03:57
回复(1)
求助:为什么这么写5月份的答卷数少一个呢?
with aa as (
select er.uid,
(1-count(submit_time)/count(start_time)) incomplete_rate,
row_number() over (order by (1-count(submit_time)/count(start_time)) asc) rk
from exam_record er
inner join user_info ui on ui.uid=er.uid
inner join examination_info ei on ei.exam_id=er.exam_id
where tag='SQL'
group by er.uid
)
select bb.uid,date_format(start_time,'%Y%m') start_month,
count(start_time) total_cnt,
count(submit_time) complete_cnt
from
(select er.uid,start_time,submit_time,
dense_rank() over (partition by er.uid order by date_format(start_time,'%Y-%m') desc) month_rk
from exam_record er
inner join user_info ui on ui.uid=er.uid
inner join examination_info ei on ei.exam_id=er.exam_id
where tag='SQL'
and er.uid in (select aa.uid from aa where rk/(select max(rk) from aa)>0.5)
and (level=6 or level=7)) bb
where month_rk<=3
group by bb.uid,start_month
order by uid ,start_month ;
发表于 2023-01-30 16:49:04
回复(0)
#按步骤走,平均20分钟写完主要逻辑,10分钟检查一些细节 # 每个人近三个有试卷作答记录的月份 with temp1 as( select uid,dat from( select uid,dat,row_number() over(partition by uid order by dat desc) r from ( select distinct uid,substr((date(start_time)),1,7) dat from exam_record )a )b where r<4 ), # 统计SQL试卷上未完成率较高的50%用户中,6级和7级用户 temp2 as( select uid from( select u.uid, max(level) level, count(submit_time) cnt1, count(start_time) cnt2, percent_rank() over(order by count(submit_time)/count(start_time) desc) percent from user_info u join exam_record e1 on u.uid=e1.uid join examination_info e2 on e1.exam_id=e2.exam_id where tag="SQL" group by u.uid )a where percent>=0.5 and level>5 ) # 在有试卷作答记录的近三个月中,temp1已求得 # 每个月的答卷数目和完成数目。 select temp1.uid, date_format(start_time,"%Y%m") start_month, count(start_time) total_cnt, count(submit_time) complete_cnt from temp1 join temp2 on temp1.uid=temp2.uid join exam_record e1 on temp1.uid=e1.uid and temp1.dat=substr((date(e1.start_time)),1,7) group by temp1.uid,start_month order by temp1.uid,start_month # 按用户ID、月份升序排序。
发表于 2022-12-14 10:24:21
回复(0)
#未完成率较高的50%用户,以未完成率排序,选择排名50%以上的uid
WITH t as (
SELECT *
FROM (
SELECT uid,
PERCENT_RANK()over( ORDER BY SUM(CASE WHEN submit_time is NULL THEN 1 ELSE 0 END) /count(start_time)) rate_incomplete_rate
-- 对未完成率进行分数排序
FROM exam_record
LEFT JOIN user_info
using(uid)
LEFT JOIN examination_info
using(exam_id)
WHERE tag='SQL'
GROUP BY uid ) as A
WHERE rate_incomplete_rate >=0.5 -- 查找未完成率排名高于50%的用户
AND uid IN (SELECT uid FROM user_info WHERE level =6 or level=7)
)
SELECT uid,
DATE_FORMAT(start_time,'%Y%m') as start_month,
COUNT(start_time) as total_cnt, #答卷数目
COUNT(submit_time) as complete_cnt #完成数目
FROM
(SELECT *,
dense_rank()over(partition by uid order by date_format(start_time,'%Y%m')DESC) AS time_rk -- 近三个月中,用DENSE_RANK是因为需要三个不同的月份,并列的不算
FROM exam_record) as t1
RIGHT JOIN t #这边用RIGHT JOIN是因为uid在t里面已经选出来了,t左联结t1,也就是t1右联结t
USING(uid)
WHERE time_rk<=3 -- 查找作答时间最近的3个月
GROUP BY uid,start_month
ORDER BY uid,start_month -- 按照用户id和月份进行升序排序
发表于 2022-10-21 11:51:44
回复(0)
select uid,date_format(start_time,'%Y%m') start_month,count(start_time) total_cnt,count(submit_time) complete_cnt
from
(
select uid,exam_id,start_time,submit_time
,dense_rank() over (partition by uid order by date_format(start_time,'%Y%m') desc) ranking
from exam_record
where uid in
(
select uid
from(
select uid,count(submit_time)/count(start_time),percent_rank()over(order by count(submit_time)/count(start_time)) pr
from exam_record
where exam_id in(select exam_id from examination_info where tag = 'SQL')
group by uid
) a
where pr<=0.5 and uid in (select uid from user_info where level in(6,7))
)
) b
where ranking <=3
group by uid, start_month
order by uid,start_month
发表于 2022-10-10 17:34:08
回复(0)