SQL综合案例之电商漏斗转化分析,pv,uv及

news/2024/11/25 7:40:33/

 漏斗模型示例:

不同的业务场景有不同的业务路径 : 有先后顺序, 事件可以出现多次

注册转化漏斗 : 启动APP --> APP注册页面--->注册结果 -->提交订单-->支付成功

搜购转化漏斗 : 搜索商品--> 点击商品--->加入购物车-->提交订单-->支付成功

秒杀活动选购转化漏斗: 点击秒杀活动-->参加活动--->参与秒杀-->秒杀成功--->成功支付

 电商的购买转化漏斗模型图:

 

处理步骤 :

明确漏斗名称:购买转化漏斗

起始事件:浏览了商品的详情页

目标事件:支付

业务流程事件链路:详情页->购物车->下单页->支付

[事件之间有没有时间间隔要求 , 链路中相邻的两个事件是否可以有其他事件]

 

需求:求购买转化漏斗模型的转换率(事件和事件之间没有时间间隔要求,并且相邻两个事件可以去干其他的事)
1.每一个步骤的uv
2.相对的转换率(下一个步骤的uv/上一个步骤的UV),绝对的转换率(当前步骤的UV第一步骤的UV)关心的事件:e1,e2,e4,e5  ==> 先后顺序不能乱-- 准备数据
user_id  event_id   event_action  event_time
u001,e1,view_detail_page,2022-11-01 01:10:21
u001,e2,add_bag_page,2022-11-01 01:11:13
u001,e3,collect_goods_page,2022-11-01 02:07:11
u002,e3,collect_goods_page,2022-11-01 01:10:21
u002,e4,order_detail_page,2022-11-01 01:11:13
u002,e5,pay_detail_page,2022-11-01 02:07:11
u002,e6,click_adver_page,2022-11-01 13:07:23
u002,e7,home_page,2022-11-01 08:18:12
u002,e8,list_detail_page,2022-11-01 23:34:29
u002,e1,view_detail_page,2022-11-01 11:25:32
u002,e2,add_bag_page,2022-11-01 12:41:21
u002,e3,collect_goods_page,2022-11-01 16:21:15
u002,e4,order_detail_page,2022-11-01 21:41:12
u003,e5,pay_detail_page,2022-11-01 01:10:21
u003,e6,click_adver_page,2022-11-01 01:11:13
u003,e7,home_page,2022-11-01 02:07:11
u001,e4,order_detail_page,2022-11-01 13:07:23
u001,e5,pay_detail_page,2022-11-01 08:18:12
u001,e6,click_adver_page,2022-11-01 23:34:29
u001,e7,home_page,2022-11-01 11:25:32
u001,e8,list_detail_page,2022-11-01 12:41:21
u001,e1,view_detail_page,2022-11-01 16:21:15
u001,e2,add_bag_page,2022-11-01 21:41:12
u003,e8,list_detail_page,2022-11-01 13:07:23
u003,e1,view_detail_page,2022-11-01 08:18:12
u003,e2,add_bag_page,2022-11-01 23:34:29
u003,e3,collect_goods_page,2022-11-01 11:25:32
u003,e4,order_detail_page,2022-11-01 12:41:21
u003,e5,pay_detail_page,2022-11-01 16:21:15
u003,e6,click_adver_page,2022-11-01 21:41:12
u004,e7,home_page,2022-11-01 01:10:21
u004,e8,list_detail_page,2022-11-01 01:11:13
u004,e1,view_detail_page,2022-11-01 02:07:11
u004,e2,add_bag_page,2022-11-01 13:07:23
u004,e3,collect_goods_page,2022-11-01 08:18:12
u004,e4,order_detail_page,2022-11-01 23:34:29
u004,e5,pay_detail_page,2022-11-01 11:25:32
u004,e6,click_adver_page,2022-11-01 12:41:21
u004,e7,home_page,2022-11-01 16:21:15
u004,e8,list_detail_page,2022-11-01 21:41:12
u005,e1,view_detail_page,2022-11-01 01:10:21
u005,e2,add_bag_page,2022-11-01 01:11:13
u005,e3,collect_goods_page,2022-11-01 02:07:11
u005,e4,order_detail_page,2022-11-01 13:07:23
u005,e5,pay_detail_page,2022-11-01 08:18:12
u005,e6,click_adver_page,2022-11-01 23:34:29
u005,e7,home_page,2022-11-01 11:25:32
u005,e8,list_detail_page,2022-11-01 12:41:21
u005,e1,view_detail_page,2022-11-01 16:21:15
u005,e2,add_bag_page,2022-11-01 21:41:12
u005,e3,collect_goods_page,2022-11-01 01:10:21
u006,e4,order_detail_page,2022-11-01 01:11:13
u006,e5,pay_detail_page,2022-11-01 02:07:11
u006,e6,click_adver_page,2022-11-01 13:07:23
u006,e7,home_page,2022-11-01 08:18:12
u006,e8,list_detail_page,2022-11-01 23:34:29
u006,e1,view_detail_page,2022-11-01 11:25:32
u006,e2,add_bag_page,2022-11-01 12:41:21
u006,e3,collect_goods_page,2022-11-01 16:21:15
u006,e4,order_detail_page,2022-11-01 21:41:12
u006,e5,pay_detail_page,2022-11-01 23:10:21
u006,e6,click_adver_page,2022-11-01 01:11:13
u007,e7,home_page,2022-11-01 02:07:11
u007,e8,list_detail_page,2022-11-01 13:07:23
u007,e1,view_detail_page,2022-11-01 08:18:12
u007,e2,add_bag_page,2022-11-01 23:34:29
u007,e3,collect_goods_page,2022-11-01 11:25:32
u007,e4,order_detail_page,2022-11-01 12:41:21
u007,e5,pay_detail_page,2022-11-01 16:21:15
u007,e6,click_adver_page,2022-11-01 21:41:12
u007,e7,home_page,2022-11-01 01:10:21
u008,e8,list_detail_page,2022-11-01 01:11:13
u008,e1,view_detail_page,2022-11-01 02:07:11
u008,e2,add_bag_page,2022-11-01 13:07:23
u008,e3,collect_goods_page,2022-11-01 08:18:12
u008,e4,order_detail_page,2022-11-01 23:34:29
u008,e5,pay_detail_page,2022-11-01 11:25:32
u008,e6,click_adver_page,2022-11-01 12:41:21
u008,e7,home_page,2022-11-01 16:21:15
u008,e8,list_detail_page,2022-11-01 21:41:12
u008,e1,view_detail_page,2022-11-01 01:10:21
u009,e2,add_bag_page,2022-11-01 01:11:13
u009,e3,collect_goods_page,2022-11-01 02:07:11
u009,e4,order_detail_page,2022-11-01 13:07:23
u009,e5,pay_detail_page,2022-11-01 08:18:12
u009,e6,click_adver_page,2022-11-01 23:34:29
u009,e7,home_page,2022-11-01 11:25:32
u009,e8,list_detail_page,2022-11-01 12:41:21
u009,e1,view_detail_page,2022-11-01 16:21:15
u009,e2,add_bag_page,2022-11-01 21:41:12
u009,e3,collect_goods_page,2022-11-01 01:10:21
u010,e4,order_detail_page,2022-11-01 01:11:13
u010,e5,pay_detail_page,2022-11-01 02:07:11
u010,e6,click_adver_page,2022-11-01 13:07:23
u010,e7,home_page,2022-11-01 08:18:12
u010,e8,list_detail_page,2022-11-01 23:34:29
u010,e5,pay_detail_page,2022-11-01 11:25:32
u010,e6,click_adver_page,2022-11-01 12:41:21
u010,e7,home_page,2022-11-01 16:21:15
u010,e8,list_detail_page,2022-11-01 21:41:12-- 创建表
drop table if exists event_info_log;
create table event_info_log
(
user_id varchar(20),
event_id varchar(20),
event_action varchar(20),
event_time datetime
)
DUPLICATE KEY(user_id)
DISTRIBUTED BY HASH(user_id) BUCKETS 1;-- 通过本地文件的方式导入数据
curl \-u root: \-H "label:event_info_log" \-H "column_separator:," \-T /root/data/event_log.txt \http://linux01:8040/api/test/event_info_log/_stream_load

 逻辑分析:

1. 先将用户的事件序列,按照漏斗模型定义的条件进行过滤,留下满足条件的事件

2. 将同一个人的满足条件的事件ID收集到数组,按时间先后排序,拼接成字符串

3. 将拼接好的字符串,匹配漏斗模型抽象出来的正则表达式

方法一:

--1. 先将用户的事件序列,按照漏斗模型定义的条件进行过滤,留下满足条件的事件
--2. 将同一个人的满足条件的事件ID收集到数组,按时间先后排序,拼接成字符串
--3. 将拼接好的字符串,匹配漏斗模型抽象出来的正则表达式1.筛选时间条件,确定每个人的事件序列
select 
user_id,
max(event_ll) as event_seq  
from 
(
select 
user_id,
group_concat(event_id)over(partition by user_id order by report_date) as event_ll
from 
(select user_id,event_id,report_datefrom event_info_logwhere event_id in ('e1','e2','e4','e5')and to_date(report_date) = '2022-11-01'order by user_id,report_date
) as temp
) as temp2
group by user_id;+---------+------------------------+
| user_id | event_ll               |
+---------+------------------------+
| u006    | e4, e5, e1, e2, e4, e5 |
| u007    | e1, e4, e5, e2         |
| u005    | e1, e2, e5, e4, e1, e2 |
| u004    | e1, e5, e2, e4         |
| u010    | e4, e5, e5             |
| u001    | e1, e2, e5, e4, e1, e2 |
| u003    | e5, e1, e4, e5, e2     |
| u002    | e4, e5, e1, e2, e4     |
| u008    | e1, e1, e5, e2, e4     |
| u009    | e2, e5, e4, e1, e2     |
+---------+------------------------+2.确定匹配规则模型
selectuser_id,'购买转化漏斗' as funnel_name ,case-- 正则匹配,先触发过e1,在触发过e2,在触发过e4,在触发过e5when    event_seq  rlike('e1.*e2.*e4.*e5') then 4-- 正则匹配,先触发过e1,在触发过e2,在触发过e4when    event_seq  rlike('e1.*e2.*e4') then 3-- 正则匹配,先触发过e1,在触发过e2when    event_seq  rlike('e1.*e2') then 2-- 正则匹配,只触发过e1when    event_seq  rlike('e1') then 1else 0 end step
from 
(select 
user_id,
max(event_ll) as event_seq  
from 
(
select 
user_id,
group_concat(event_id)over(partition by user_id order by report_date) as event_ll
from 
(select user_id,event_id,report_datefrom event_info_logwhere event_id in ('e1','e2','e4','e5')and to_date(report_date) = '2022-11-01'order by user_id,report_date
) as temp
) as temp2
group by user_id
) as tmp3;+---------+--------------------+------+
| user_id | funnel_name        | step |
+---------+--------------------+------+
| u006    | 购买转化漏斗       |    4 |
| u007    | 购买转化漏斗       |    2 |
| u005    | 购买转化漏斗       |    3 |
| u004    | 购买转化漏斗       |    3 |
| u010    | 购买转化漏斗       |    0 |
| u001    | 购买转化漏斗       |    3 |
| u003    | 购买转化漏斗       |    2 |
| u002    | 购买转化漏斗       |    3 |
| u008    | 购买转化漏斗       |    3 |
| u009    | 购买转化漏斗       |    2 |
+---------+--------------------+------+-- 最后计算转换率
select funnel_name,sum(if(step >= 1 ,1,0)) as step1,sum(if(step >= 2 ,1,0)) as step2,sum(if(step >= 3 ,1,0)) as step3,sum(if(step >= 4 ,1,0)) as step4,round(sum(if(step >= 2 ,1,0))/sum(if(step >= 1 ,1,0)),2) as 'step1->step2_radio',round(sum(if(step >= 3 ,1,0))/sum(if(step >= 2 ,1,0)),2) as 'step2->step3_radio',round(sum(if(step >= 4 ,1,0))/sum(if(step >= 3 ,1,0)),2) as 'step3->step4_radio'
from 
(select'购买转化漏斗' as funnel_name ,case-- 正则匹配,先触发过e1,在触发过e2,在触发过e4,在触发过e5when    event_seq  regexp('e1.*e2.*e4.*e5') then 4-- 正则匹配,先触发过e1,在触发过e2,在触发过e4when    event_seq  regexp('e1.*e2.*.*e4') then 3-- 正则匹配,先触发过e1,在触发过e2when    event_seq  regexp('e1.*e2') then 2-- 正则匹配,只触发过e1when    event_seq  regexp('e1') then 1else 0 end stepfrom (select user_id,max(event_seq) as event_seq from -- 因为在doris1.1版本中还不支持数组,所以拼接字符串的时候还没办法排序(select user_id,-- 用开窗的方式进行排序,然后在有序的按照时间升序,将事件拼接group_concat(concat(report_date,'_',event_id),'|')over(partition by user_id order by report_date) as event_seqfrom event_info_log where to_date(report_date) = '2022-11-01'and event_id in('e1','e4','e5','e2')) as tmp group by user_id) as t1 
) as t2
group by funnel_name;+--------------------+-------+-------+-------+-------+--------------------+--------------------+--------------------+
| funnel_name        | step1 | step2 | step3 | step4 | step1->step2_radio | step2->step3_radio | step3->step4_radio |
+--------------------+-------+-------+-------+-------+--------------------+--------------------+--------------------+
| 购买转化漏斗       |     9 |     9 |     6 |     1 |                  1 |               0.67 |               0.17 |
+--------------------+-------+-------+-------+-------+--------------------+--------------------+--------------------+

 方法二:

1.按照时间排序,将所有事件全部拿出来,拼成一个字符串selectuser_id,max(sz)eventhingfrom(selectuser_id,group_concat(event_id)over(partition by user_id order by event_time asc)szfromevent_info_log)t1group by user_id;+---------+--------------------------------------------+
| user_id | eventhing                                  |
+---------+--------------------------------------------+
| u006    | e6, e4, e5, e7, e1, e2, e6, e3, e4, e5, e8 |
| u007    | e7, e7, e1, e3, e4, e8, e5, e6, e2         |
| u005    | e1, e3, e2, e3, e5, e7, e8, e4, e1, e2, e6 |
| u004    | e7, e8, e1, e3, e5, e6, e2, e7, e8, e4     |
| u010    | e4, e5, e7, e5, e6, e6, e7, e8, e8         |
| u001    | e1, e2, e3, e5, e7, e8, e4, e1, e2, e6     |
| u003    | e5, e6, e7, e1, e3, e4, e8, e5, e6, e2     |
| u002    | e3, e4, e5, e7, e1, e2, e6, e3, e4, e8     |
| u008    | e1, e8, e1, e3, e5, e6, e2, e7, e8, e4     |
| u009    | e3, e2, e3, e5, e7, e8, e4, e1, e2, e6     |
+---------+--------------------------------------------+2.-- 正则匹配select"电商的漏斗模型" as funnel_name,sum(if(step>=1,1,0))as step1_uv,sum(if(step>=2,1,0))as step2_uv,sum(if(step>=3,1,0))as step2_uv,sum(if(step>=4,1,0))as step2_uvfrom(selectuser_id,case when eventhing rlike('e1.*e2.*e4.*e5') then 4when eventhing rlike('e1.*e2.*e4') then 3when eventhing rlike('e1.*e2') then 2when eventhing rlike('e1') then 1else 0 end as stepfrom(selectuser_id,max(sz)eventhingfrom(selectuser_id,group_concat(event_id)over(partition by user_id order by event_time asc)szfromevent_info_log)t1group by user_id)t2)t3

 


http://www.ppmy.cn/news/108653.html

相关文章

Ansible变量

Ansible变量 变量命名规则 变量的名字由字母、下划线和数字组成,必须以字母开头如: good_a ok_b 传递一个YAML/JSON的形式(注意不管是YAML还是JSON,它们的最终格式一定要是字典) # cat a.json {"name"…

做接口测试时遇到接口加密了该如何处理

对明文编码生成信息摘要,以防止被篡改。比如MD5使用的是Hash算法,无论多长的输入,MD5都会输出长度为128bits的一个串。 摘要算法不要秘钥,客户端和服务端采用相同的摘要算法即可针对同一段明文获取一致的密文。 对称加密 对称加…

小学课后兴趣班选课平台的设计与实现(ASP.NET,SQLServer)

系统功能模块设计 中小学课后兴趣班选课平台包括前台功能模块和后台功能模块:前台功能模块是给会员使用的功能模块,在前台功能模块中会员可以实现在线注册,登录,查看发布的新闻资讯信息,查看教师,在线留言&…

腾讯云轻量应用服务器CPU主频多少?型号?

腾讯云轻量应用服务器CPU型号是什么?轻量服务器处理器主频?腾讯云服务器网账号下的CPU处理器型号为2.5GHz主频的Intel(R) Xeon(R) Gold 6133 CPU和2.4GHz主频Intel(R) Xeon(R) CPU E5-26xx v4,腾讯云轻量应用服务器不支持指定底层物理服务器的…

在Linux系统实现服务器端和客户端的套接字通信

目录 一.创建一个socket文件夹用来存放编写的服务器端和客户端程序 二.编写服务器端代码 三.编写客户端代码 四.编译c语言程序 五.断开连接 六.可能涉及到的一些没接触过的知识点 一.创建一个socket文件夹用来存放编写的服务器端和客户端程序 (我系统里的文件…

【多线程】sleep( ) 和 wait( ) 区别面试常问到

文章目录 前言区别1:使用限制区别2:使用场景区别3:所属类区别4:释放锁区别5:线程切换 前言 sleep(休眠) 和 wait(等待) 方法是 Java 多线程中常用的两个方法,它们有什么区别及一些该注意的地方有哪些呢&am…

Django_models模型--03

1. 配置MySQL表🐕🐕🐕 1.1 安装mysql🐅🐅 MySQL Workbench 8.0 CE 1.2 MySQL驱动🐈🐈 1.21 使用mysqlclient🐆 使用mysqlclientpip install mysqlclient( 如果上面的命令安装失…

2023年春秋杯网络安全联赛春季赛Reverse题目复现

文章目录 一.sum1. 分析程序逻辑2.解数独矩阵3.解题脚本 二.Poisoned_tea_CHELL1. 重新识别函数及程序逻辑分析2.IDA动态调试(attach附加调试)3. 输入选项进行单步调试4.解题脚本 一.sum 1. 分析程序逻辑 这里直接贴上当时分析的结果,根据程序行为不难猜出是数独问题(还得靠猜…