




统计活跃用户核心是定义“活跃”和“时间窗口”,再用SQL聚合计算;关键在理清业务逻辑,如DAU、WAU、MAU及留存率的差异化实现与口径对齐。
统计活跃用户核心是定义“活跃”和“时间窗口”,再用SQL聚合计算。关键不在复杂语法,而在理清业务逻辑:比如“近7天登录过的用户”和“连续3天登录的用户”,写法完全不同。
先和产品、运营对齐口径,避免技术实现和业务需求脱节。常见定义包括:
假设日志表 user_behavior 含字段:user_id、event_time(datetime)、event_type(如'login'、'click'):
SELECT COUNT(DISTINCT user_id) AS dau FROM user_behavior WHERE DATE(event_time) = '2025-06-15' AND event_type = 'login';
注意点:
复计数统计“截至今天,过去7天活跃过的用户总数”,用日期范围动态筛选:
SELECT COUNT(DISTINCT user_id) AS wau FROM user_behavior WHERE event_time >= DATE_SUB(CURDATE(), INTERVAL 6 DAY) AND event_time
说明:
CURRENT_DATE - INTERVAL '6 days',ClickHouse用 today() - 6
需要两步:先找出某日新增用户,再查他们在后续日期是否回归。可用自连接或窗口函数:
WITH first_login AS ( SELECT user_id, MIN(DATE(event_time)) AS first_date FROM user_behavior WHERE event_type = 'login' GROUP BY user_id ), retention AS ( SELECT f.first_date, COUNT(DISTINCT b.user_id) AS retained_cnt FROM first_login f LEFT JOIN user_behavior b ON f.user_id = b.user_id AND DATE(b.event_time) = DATE_ADD(f.first_date, INTERVAL 1 DAY) AND b.event_type = 'login' GROUP BY f.first_date ) SELECT first_date, COALESCE(retained_cnt, 0) / COUNT(*) AS retention_rate FROM first_login f LEFT JOIN retention r ON f.first_date = r.first_date GROUP BY first_date;
要点: