首页 > 试题广场 >

平均活跃天数和月活人数

[编程题]平均活跃天数和月活人数
  • 热度指数:183532 时间限制:C/C++ 1秒,其他语言2秒 空间限制:C/C++ 256M,其他语言512M
  • 算法知识视频讲解
用户在牛客试卷作答区作答记录存储在表exam_record中,内容如下:
exam_record表(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分)
请计算2021年每个月里试卷作答区用户平均月活跃天数avg_active_days和月度活跃人数mau,上面数据的示例输出如下:
解释:2021年7月有2人活跃,共活跃了3天(1001活跃1天,1002活跃2天),平均活跃天数1.5;2021年9月有4人活跃,共活跃了5天,平均活跃天数1.25,结果保留2位小数。
注:此处活跃指有交卷行为。
示例1

输入

drop table if exists exam_record;
CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2021-07-02 09:01:01', '2021-07-02 09:21:01', 80),
(1002, 9001, '2021-09-05 19:01:01', '2021-09-05 19:40:01', 81),
(1002, 9002, '2021-09-02 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', null, null),
(1002, 9001, '2021-07-02 19:01:01', '2021-07-02 19:30:01', 82),
(1002, 9002, '2021-07-05 18:01:01', '2021-07-05 18:59:02', 90),
(1003, 9002, '2021-07-06 12:01:01', null, null),
(1003, 9003, '2021-09-07 10:01:01', '2021-09-07 10:31:01', 86),
(1004, 9003, '2021-09-06 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 81),
(1005, 9001, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 88),
(1006, 9002, '2021-09-02 12:11:01', '2021-09-02 12:31:01', 89);

输出

202107|1.50|2
202109|1.25|4
SELECT
    tmp.month,
    ROUND(COUNT(DISTINCT CONCAT(uid, '-', split_day)) / COUNT(DISTINCT uid), 2) AS avg_active_days,
    COUNT(DISTINCT uid) AS mau
FROM (
    SELECT
        uid,
        exam_id,
        start_time,
        submit_time,
        score,
        MONTH(submit_time) AS split_month,
        DAY(submit_time) AS split_day,
        CONCAT('2021', LPAD(MONTH(submit_time), 2, '0')) AS month
    FROM
        exam_record
    WHERE
        YEAR(submit_time) = '2021'
) tmp
GROUP BY tmp.month;

发表于 2025-04-27 17:37:20 回复(0)
SELECT DATE_FORMAT(date, "%Y%m") AS month,
    ROUND(COUNT(*) / COUNT(DISTINCT(uid)), 2) AS avg_active_days,
    ROUND(COUNT(DISTINCT(uid)), 2) AS mau
FROM(
    SELECT uid, DATE(start_time) AS date
    FROM exam_record
    WHERE score IS NOT NULL AND YEAR(start_time) = 2021
    GROUP BY uid, DATE(start_time)
)t
GROUP BY DATE_FORMAT(date, "%Y%m");
发表于 2025-03-31 21:56:06 回复(0)

with a as
(select date_format(start_time,'%Y%m') as month
    ,uid,submit_time  from exam_record
   where submit_time is not null )

select month
,round(count(distinct uid,day(submit_time))/count(distinct uid),2) avg_active_days
,count(distinct uid) mau

from a
group by month
order by month
发表于 2025-03-31 15:06:06 回复(0)
#计算2021年每个月里试卷作答区用户平均月活跃天数avg_active_days和月度活跃人数mau
#活跃指有交卷行为
#活跃天数,一个人一天提交2次也算活跃一天!所以要么先处理日期,去重,要么在计数时候进行日期和人员分组
/*另外增加知识点ORACLE: count(distinct A,B)和count(A,B) 都报错
MYSQL:   count(distinct A,B)是可以的                                      
                count(A,B) 报错  
                pg数据库报错  
*/      
select date_format(submit_time,'%Y%m') month,round(count(distinct uid,date_format(submit_time,'%Y%m%d'))/count(distinct uid),2)  as avg_active_days,count(distinct uid)  as mau
from exam_record
where submit_time is not null
and year(submit_time)='2021'
group by date_format(submit_time,'%Y%m')
发表于 2024-12-10 14:54:51 回复(0)
SELECT
    month,
    ROUND(SUM(cnt) / COUNT(DISTINCT uid),2) AS avg_active_days,
    COUNT(DISTINCT uid) AS mau
FROM(
SELECT
DATE_FORMAT(submit_time,'%Y%m') AS month,
uid,
COUNT(DISTINCT DATE_FORMAT(submit_time,'%Y%m%d')) AS cnt
FROM exam_record
WHERE submit_time IS NOT NULL
GROUP BY DATE_FORMAT(submit_time,'%Y%m'),uid
) AS sb1
WHERE month LIKE '2021%'
GROUP BY month;
发表于 2024-11-26 22:20:02 回复(0)
-- 2021年每个月
-- 平均月活跃天数, 此活跃指有交卷行为
-- 月度活跃人数   


SELECT
    DATE_FORMAT(submit_time, '%Y%m') AS `month`  -- 2021年每个月
    ,ROUND(  -- 平均月活跃天数
        COUNT(DISTINCT uid, DATE_FORMAT(submit_time, '%Y%m%d')) 
        / 
        COUNT(DISTINCT uid), 2
     ) AS avg_active_days
    ,COUNT(DISTINCT uid) AS mau  -- 月度活跃人数
FROM exam_record
WHERE YEAR(submit_time) = 2021
GROUP BY `month`
;
发表于 2024-08-22 21:52:03 回复(0)
关键在于要对用户活跃去重,一天可能有人做两次题
select
    date_format (date, "%Y%m") month,
    round(count(date) / count(distinct uid), 2) avg_active_days,
    round(count(distinct uid), 2) mau
from
    (
        select distinct
            uid,
            date (submit_time) date
        from
            exam_record
        where
            year (submit_time) = 2021
            and submit_time is not null
    ) tmp
group by
    month

发表于 2024-08-04 16:33:31 回复(0)
SELECT DATE_FORMAT(start_time,'%Y%m') month,
ROUND(COUNT(DISTINCT uid,DATE_FORMAT(start_time,'%Y%m%d'))/COUNT(DISTINCT uid),2)
 avg_active_days,
COUNT(DISTINCT uid) mau
FROM exam_record
WHERE YEAR(start_time) = 2021
AND submit_time is not NULL
GROUP BY month;
这里为什么DATE_FORMAT(start_time,'%Y%m')在select不在group by,是因为group by不可以命名新列,但select 可以。但依然可以执行的原因是,mysql提供了group by 引用 select 新列的灵活性,原本group by执行顺序在select 前,按道理应该在group by那也加上DATE_FORMAT(start_time,'%Y%m')
发表于 2024-08-04 15:39:23 回复(0)
select
concat (substring(time, 1, 4), '', substring(time, 6, 2)) as month,
round(count(time) / count(distinct name), 2) as avg_active_days,
round(count(distinct name), 2) as mau
from
(
select
substring(submit_time, 1, 10) as time,
uid as name,
count(uid) as times
from
exam_record
where
year (submit_time) = 2021
group by
time,
name
) as p
group by
month


发表于 2024-07-18 19:36:23 回复(0)
错误解法:
select
    date_format(start_time, '%Y%m') as month,
    round(count(distinct day(start_time))/count(distinct uid),2) as avg_active_days,
    count(distinct uid) as mau
from exam_record
where year(start_time) = 2021 and submit_time is not null
group by month
错因:
round(count(distinct day(start_time))/count(distinct uid),2)
其中错误部分为
count(distinct day(start_time) 
解释:9.1 uid 1001 和 1002 都活跃,活跃次数为2。但该命令统计的活跃次数为1。
发表于 2024-07-18 14:03:56 回复(0)
select DATE_FORMAT(submit_time,'%Y%m') month,
round(count(distinct (DATE_FORMAT(submit_time,'%Y%m%d')+uid))/count(distinct uid),2) as avg_active_days,
count(distinct uid) mau
from exam_record
WHERE YEAR(submit_time)=2021 and submit_time is not null
GROUP BY DATE_FORMAT(submit_time,'%Y%m')
# 为什么这样写会不对呢?日期+uid组成唯一标识再计算总活跃天数应该没问题啊
# 为什么要用count对uid和日期进行组合呢,组合之后不是会出现不存在的日期+uid了吗?请各位大佬帮忙解答一下谢谢
发表于 2024-07-14 23:22:03 回复(0)
# 月活跃人数:group by moth(submit_time) count(uid)
# 平均月活跃天数=该月有用户活跃(交卷)的总天数/该月活跃(交卷)总人数(distinct uid)
# 这里对我来说,最难的是计算平均活跃天数
# count(distinct uid,date_format(submit_time,'%Y%m%d')
# 看了评论区才知道,加uid的目的是为了把不同用户跟其相应的做题日期能对应上,如果不加则会记录一条,容易出现少计算的情况
# count(distinct 参数1,参数2) 只要两个参数有一个不一样就会返回该条记录
select
date_format(submit_time,'%Y%m') as'moth',
round(count(distinct uid,date_format(submit_time,"%Y%m%d"))/count(distinct uid),2) as avg_active_days,
count(distinct uid) as'mau'
from
exam_record
where submit_time is not null
and
date_format(submit_time,'%Y')=2021
group by 1




发表于 2024-07-14 16:52:58 回复(0)

distinct作用于多列

示例2.1
select distinct name, id from A

执行后结果如下:

实际上是根据name和id两个字段来去重的,这种方式Access和SQL Server同时支持。

发表于 2024-04-30 21:39:00 回复(0)
我不会,我太垃圾了
发表于 2024-04-09 19:48:19 回复(0)
# 调试多次得出的正确代码
select DATE_FORMAT(submit_time,'%Y%m') as month,
       round(count(distinct uid,day(submit_time))/count(distinct uid),2) as avg_active_days,
       count(distinct uid) as mau
from exam_record
where year(submit_time)=2021
group by month
order by month;
编辑于 2024-03-25 10:49:07 回复(0)
select  date_format(start_time,'%Y%m') date
,round(count(distinct uid,date_format(submit_time,'%Y%m%d'))/count(distinct uid),2)
,count(distinct uid)
from exam_record 
where submit_time is not null and year(start_time)=2021
group by date

编辑于 2023-12-12 10:07:52 回复(0)
select
    replace(t1.month,'-','') as month,
    round(avg(t1.active_days),2) as avg_active_days,
    count(distinct uid) as mau
from 
    (select 
        uid,
        mid(start_time,1,7) as month,
        count(distinct day(start_time)) as active_days
    from 
        exam_record
    where
        score is not null
        and year(start_time) = 2021
    group by
        uid,mid(start_time,1,7)) as t1
group by
    t1.month


  1. 内部查询(Subquery):

    首先,从exam_record表中选择数据。条件是score字段不为null,并且start_time字段的年份为2021。然后,通过group by子句,将数据按照uid(用户ID)和start_time的年月(使用mid(start_time,1,7)提取年份和月份)分组。在每个组内,计算每个用户在该月内的活跃天数(使用count(distinct day(start_time)))。

    这个内部查询返回了三列数据:uid(用户ID),month(年月份),和active_days(活跃天数)。

  2. 外部查询:

    在内部查询的基础上,外部查询对内部查询的结果再进行处理。首先,使用replace(t1.month,'-','')去掉month中的破折号,将年月份格式变为没有分隔符的字符串。然后,计算每个月的平均活跃天数(使用round(avg(t1.active_days),2)计算平均值并保留两位小数),以及每个月的活跃用户数(使用count(distinct uid)计算不同用户的数量)。

  3. 结果分组:

    最后,使用group by t1.month将结果按照月份进行分组,得到每个月的平均活跃天数和活跃用户数。

发表于 2023-10-06 17:17:35 回复(0)