1.行程和用户(262)
表:
Trips
+-------------+----------+ | Column Name | Type | +-------------+----------+ | id | int | | client_id | int | | driver_id | int | | city_id | int | | status | enum | | request_at | date | +-------------+----------+ id 是这张表的主键。 这张表中存所有出租车的行程信息。每段行程有唯一 id ,其中 client_id 和 driver_id 是 Users 表中 users_id 的外键。 status 是一个表示行程状态的枚举类型,枚举成员为(‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’) 。表:
Users
+-------------+----------+ | Column Name | Type | +-------------+----------+ | users_id | int | | banned | enum | | role | enum | +-------------+----------+ users_id 是这张表的主键。 这张表中存所有用户,每个用户都有一个唯一的 users_id ,role 是一个表示用户身份的枚举类型,枚举成员为 (‘client’, ‘driver’, ‘partner’) 。 banned 是一个表示用户是否被禁止的枚举类型,枚举成员为 (‘Yes’, ‘No’) 。取消率 的计算方式如下:(被司机或乘客取消的非禁止用户生成的订单数量) / (非禁止用户生成的订单总数)。
写一段 SQL 语句查出
"2013-10-01"
至"2013-10-03"
期间非禁止用户(乘客和司机都必须未被禁止)的取消率。非禁止用户即 banned 为 No 的用户,禁止用户即 banned 为 Yes 的用户。返回结果表中的数据可以按任意顺序组织。其中取消率
Cancellation Rate
需要四舍五入保留 两位小数 。
查询结果格式如下例所示。
示例:
输入:
Trips 表:
+----+-----------+-----------+---------+---------------------+------------+
| id | client_id | driver_id | city_id | status | request_at |
+----+-----------+-----------+---------+---------------------+------------+
| 1 | 1 | 10 | 1 | completed | 2013-10-01 |
| 2 | 2 | 11 | 1 | cancelled_by_driver | 2013-10-01 |
| 3 | 3 | 12 | 6 | completed | 2013-10-01 |
| 4 | 4 | 13 | 6 | cancelled_by_client | 2013-10-01 |
| 5 | 1 | 10 | 1 | completed | 2013-10-02 |
| 6 | 2 | 11 | 6 | completed | 2013-10-02 |
| 7 | 3 | 12 | 6 | completed | 2013-10-02 |
| 8 | 2 | 12 | 12 | completed | 2013-10-03 |
| 9 | 3 | 10 | 12 | completed | 2013-10-03 |
| 10 | 4 | 13 | 12 | cancelled_by_driver | 2013-10-03 |
+----+-----------+-----------+---------+---------------------+------------+Users 表:
+----------+--------+--------+
| users_id | banned | role |
+----------+--------+--------+
| 1 | No | client |
| 2 | Yes | client |
| 3 | No | client |
| 4 | No | client |
| 10 | No | driver |
| 11 | No | driver |
| 12 | No | driver |
| 13 | No | driver |
+----------+--------+--------+
输出:
+------------+-------------------+
| Day | Cancellation Rate |
+------------+-------------------+
| 2013-10-01 | 0.33 |
| 2013-10-02 | 0.00 |
| 2013-10-03 | 0.50 |
+------------+-------------------+
解释:
2013-10-01:- 共有 4 条请求,其中 2 条取消。- 然而,id=2 的请求是由禁止用户(user_id=2)发出的,所以计算时应当忽略它。- 因此,总共有 3 条非禁止请求参与计算,其中 1 条取消。- 取消率为 (1 / 3) = 0.33
2013-10-02:- 共有 3 条请求,其中 0 条取消。- 然而,id=6 的请求是由禁止用户发出的,所以计算时应当忽略它。- 因此,总共有 2 条非禁止请求参与计算,其中 0 条取消。- 取消率为 (0 / 2) = 0.00
2013-10-03:- 共有 3 条请求,其中 1 条取消。- 然而,id=8 的请求是由禁止用户发出的,所以计算时应当忽略它。- 因此,总共有 2 条非禁止请求参与计算,其中 1 条取消。- 取消率为 (1 / 2) = 0.50
方法一(Left Join)
补充知识:
ROUND(A,B)-A为需要四舍五入的值,B为保留小数位数
SUM(A)-A为求和字段
IF(A,B,C)-若A对,则值为B,否则为C
本题思路:
先求出来被禁止的用户和司机,然后再取反,然后再筛选一下时间,这就找到了非禁止用户的范围,接下来只需判断订单状态即可。
SELECT T.request_at AS `Day`, ROUND(SUM(IF(T.STATUS = 'completed',0,1))/ COUNT(T.STATUS),2) AS `Cancellation Rate`
FROM trips AS T LEFT JOIN
(SELECT users_idFROM usersWHERE banned = 'Yes'
) AS A ON (T.Client_Id = A.users_id)
LEFT JOIN (SELECT users_idFROM usersWHERE banned = 'Yes'
) AS A1
ON (T.Driver_Id = A1.users_id)
WHERE A.users_id IS NULL AND A1.users_id IS NULL AND T.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY T.request_at
方法二(NOT IN)
本题思路:
与方法一思路差不多,就是一个用的not in,一个用左连接,这里推荐方法一。
SELECT T.request_at AS `Day`, ROUND(SUM(IF(T.STATUS = 'completed',0,1))/ COUNT(T.STATUS),2) AS `Cancellation Rate`
FROM trips AS T
WHERE
T.Client_Id NOT IN (SELECT users_idFROM usersWHERE banned = 'Yes'
)
AND
T.Driver_Id NOT IN (SELECT users_idFROM usersWHERE banned = 'Yes'
)
AND T.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY T.request_at
方法三(Join)
# Write your MySQL query statement below
SELECT T.request_at AS `Day`, ROUND(SUM(IF(T.STATUS = 'completed',0,1))/ COUNT(T.STATUS),2) AS `Cancellation Rate`
FROM Trips AS T
JOIN Users AS U1 ON (T.client_id = U1.users_id AND U1.banned ='No')
JOIN Users AS U2 ON (T.driver_id = U2.users_id AND U2.banned ='No')
WHERE T.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY T.request_at
2.游戏玩法分析I(511)
活动表
Activity
:+--------------+---------+ | Column Name | Type | +--------------+---------+ | player_id | int | | device_id | int | | event_date | date | | games_played | int | +--------------+---------+ 在 SQL 中,表的主键是 (player_id, event_date)。 这张表展示了一些游戏玩家在游戏平台上的行为活动。 每行数据记录了一名玩家在退出平台之前,当天使用同一台设备登录平台后打开的游戏的数目(可能是 0 个)。查询每位玩家 第一次登陆平台的日期。
查询结果的格式如下所示:
Activity 表:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+Result 表:
+-----------+-------------+
| player_id | first_login |
+-----------+-------------+
| 1 | 2016-03-01 |
| 2 | 2017-06-25 |
| 3 | 2016-03-02 |
+-----------+-------------+
方法一(min)
select player_id,min(event_date) first_login from activity group by player_id
3.游戏玩法分析I(550)
Table:
Activity
+--------------+---------+ | Column Name | Type | +--------------+---------+ | player_id | int | | device_id | int | | event_date | date | | games_played | int | +--------------+---------+ (player_id,event_date)是此表的主键。 这张表显示了某些游戏的玩家的活动情况。 每一行是一个玩家的记录,他在某一天使用某个设备注销之前登录并玩了很多游戏(可能是 0)。编写一个 SQL 查询,报告在首次登录的第二天再次登录的玩家的比率,四舍五入到小数点后两位。换句话说,您需要计算从首次登录日期开始至少连续两天登录的玩家的数量,然后除以玩家总数。
查询结果格式如下所示:
Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-03-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+Result table:
+-----------+
| fraction |
+-----------+
| 0.33 |
+-----------+
只有 ID 为 1 的玩家在第一天登录后才重新登录,所以答案是 1/3 = 0.33
方法一(AVG)
avg(a)-a为某字段,求某字段平均值
datediff(a,b)=1,a比b的日期大一,b是a的昨天
思路分析
先求出每个玩家第一次登录日期,然后求出左连接上第二天的,没有则为null,进而求解。
# Write your MySQL query statement below
select round(avg(a.event_date is not null), 2) fraction
from (select player_id, min(event_date) as loginfrom activitygroup by player_id) p
left join activity a
on p.player_id=a.player_id and datediff(a.event_date, p.login)=1