概念
电商业务中需要计算用户留存情况,在时间维度上可以分为用户次日、多日后的留存情况,用留存率表示。留存率计算方式如下:
次日留存率 = 当日新用户在次日登录的数量 / 当日新用户注册数量
三日留存率 = 当日新用户在第四天登录的数量 / 当日新用户注册数量
七日留存率 = 当日新用户在第八天登录的数量 / 当日新用户注册数量
其他时间的留存率计算同理。
情况一
假设现在有两张表,一张是注册信息表register_info,另一张是登录信息表login_info,两个表格数据如下。
现在分别计算出每日的新增用户数,以及次日、三日、七日留存率。
首先,将两张表做一个连接,用datediff函数选出用户注册后七日仍有登录的数据(条件1),连接条件是用户id相同(条件2)。查询代码如下:
select r.uid, date(r.register_date) as rt, date(l.login_time) as lt, datediff(l.login_time, r.register_date) as tdiff from register_info r left join login_info l on r.uid=l.uid and date(l.login_time) between date(r.register_date)+interval 1 day and date(r.register_date)+interval 7 day order by uid asc
查询结果如下:(这里每个用户的注册时间只有一个,所以会得出每个用户所有注册登录时间的排列组合便于后续计算)
之后利用case条件判断和count函数计算出各种留存率情况。代码如下:
select rt as 日期, count(distinct uid) as 新增用户数, count(distinct case when tdiff=1 then uid end)/count(distinct uid) as 次日留存率, count(distinct case when tdiff=3 then uid end)/count(distinct uid) as 三日留存率, count(distinct case when tdiff=7 then uid end)/count(distinct uid) as 七日留存率 from ( select r.uid, date(r.register_date) as rt, date(l.login_time) as lt, datediff(l.login_time, r.register_date) as tdiff from register_info r left join login_info l on r.uid=l.uid and date(l.login_time) between date(r.register_date)+interval 1 day and date(r.register_date)+interval 7 day order by uid asc ) t1 group by rt
得到结果如下:
情况二
假如现在只有一张用户登录信息表user_login_info,其中默认每个用户最早的登录时间就是注册时间,表格数据如下:
所以我们需要先用两个子查询得出类似(1)中的注册表t1和登录表t2,之后操作与(1)中相同,代码如下:
select count(distinct uid) 新增用户数, count(distinct case when tdiff=1 then uid end)/count(distinct uid) as 次日留存率, count(distinct case when tdiff=3 then uid end)/count(distinct uid) as 三日留存率, count(distinct case when tdiff=7 then uid end)/count(distinct uid) as 七日留存率 from ( select t1.uid, t1.rt, t2.lt, datediff(t2.lt,t1.rt) as tdiff from ( (select user_id as uid, min(date(login_time)) as rt from user_login_info group by user_id) t1 LEFT JOIN (select user_id as uid, date(login_time) as lt from user_login_info) t2 on t1.uid=t2.uid and t2.lt between t1.rt + interval 1 day and t1.rt + interval 7 day ) ) t3 group by rt order by rt asc
其中t3就是情况一中得到的t1.
查询结果如下:
到此这篇关于SQL计算用户留存率问题的文章就介绍到这了,更多相关SQL 用户留存率内容请搜索IT俱乐部以前的文章或继续浏览下面的相关文章希望大家以后多多支持IT俱乐部!