MySQL Window Functions: A Complete Guide to OVER()

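Updated: 2025-12-31 11:36:59   Author: 代码or搬砖
MySQL window functions perform calculations over a result set: they compute over a group of rows (the window) and return a value for every row without reducing the number of rows. This article walks through the OVER() clause and the functions used with it.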

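I. Overview of Window Functions

1. What is a window function?

A window function performs a calculation over a set of rows (called the "window") and returns one value for each row. Unlike an aggregate function used with GROUP BY, a window function does not reduce the number of rows.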

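2. Window functions vs. aggregate functions

Feature           | Window function                              | Aggregate function
------------------|----------------------------------------------|---------------------------------------
Rows returned     | Same as the number of input rows             | Usually fewer (one per GROUP BY group)
Grouping effect   | Keeps every row and adds the computed value  | Returns one row per group
Where allowed     | SELECT list or ORDER BY clause               | SELECT list or HAVING clause
Typical functions | ROW_NUMBER(), RANK(), SUM() OVER()           | SUM(), COUNT(), AVG()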
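
The contrast in the table can be seen on a three-row example. This is a minimal sketch: the temporary table demo_sales and its values are hypothetical and exist only for this illustration.

```sql
-- Hypothetical scratch table used only for this illustration
CREATE TEMPORARY TABLE demo_sales (
    salesperson VARCHAR(50),
    amount DECIMAL(10, 2)
);
INSERT INTO demo_sales VALUES
('A', 1000.00),
('A', 1500.00),
('B', 2000.00);

-- Aggregate function with GROUP BY: the 3 input rows collapse into 2 rows
-- (A -> 2500.00, B -> 2000.00)
SELECT salesperson, SUM(amount) AS total
FROM demo_sales
GROUP BY salesperson;

-- Window function: all 3 rows are kept, and each row carries the total
-- of its own partition (2500.00, 2500.00, 2000.00)
SELECT salesperson, amount,
       SUM(amount) OVER (PARTITION BY salesperson) AS total
FROM demo_sales;
```

The same SUM() appears in both queries; only the OVER clause decides whether rows are collapsed or preserved.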

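3. Basic syntax

window_function([arguments]) OVER (
  [PARTITION BY <partition columns>]
  [ORDER BY <sort columns> [ASC | DESC]]
  [{ROWS | RANGE} BETWEEN <frame start> AND <frame end>]
)
  • GROUP BY cannot be placed inside OVER(); use PARTITION BY instead.
  • The PARTITION BY clause specifies the columns that divide the rows into partitions.
  • The ORDER BY clause specifies how the rows are ordered within each partition.
  • The ROWS BETWEEN (or RANGE BETWEEN) clause specifies the window frame, that is, where the frame starts and ends relative to the current row.

The frame clause is used less often in practice, so some references leave it out of the syntax description and focus on PARTITION BY and ORDER BY:

window_function([arguments]) OVER ([PARTITION BY <partition columns>] [ORDER BY <sort columns>])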

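II. Core Components of a Window Function

1. PARTITION BY - the partition clause

Divides the rows into partitions; the function is computed independently within each partition.

-- Create test data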
CREATE TABLE sales (
    id INT PRIMARY KEY AUTO_INCREMENT,
    salesperson VARCHAR(50),
    region VARCHAR(50),
    sale_date DATE,
    amount DECIMAL(10, 2)
);
INSERT INTO sales (salesperson, region, sale_date, amount) VALUES
('张三', '北京', '2024-01-01', 1000.00),
('张三', '北京', '2024-01-02', 1500.00),
('李四', '上海', '2024-01-01', 2000.00),
('李四', '上海', '2024-01-02', 2500.00),
('王五', '北京', '2024-01-01', 1200.00),
('王五', '北京', '2024-01-03', 1800.00),
('赵六', '广州', '2024-01-02', 2200.00);
-- Partitioned calculations
SELECT 
    salesperson,
    sale_date,
    amount,
    -- Total sales of each salesperson
    SUM(amount) OVER (PARTITION BY salesperson) AS total_by_person,
    -- Total sales of each region
    SUM(amount) OVER (PARTITION BY region) AS total_by_region,
    -- No partition (grand total over all rows)
    SUM(amount) OVER () AS grand_total
FROM sales
ORDER BY salesperson, sale_date;

Output:

salesperson | sale_date  | amount | total_by_person | total_by_region | grand_total
------------|------------|--------|-----------------|-----------------|------------
张三        | 2024-01-01 | 1000.00| 2500.00         | 5500.00         | 12200.00
张三        | 2024-01-02 | 1500.00| 2500.00         | 5500.00         | 12200.00
李四        | 2024-01-01 | 2000.00| 4500.00         | 4500.00         | 12200.00
李四        | 2024-01-02 | 2500.00| 4500.00         | 4500.00         | 12200.00
王五        | 2024-01-01 | 1200.00| 3000.00         | 5500.00         | 12200.00
王五        | 2024-01-03 | 1800.00| 3000.00         | 5500.00         | 12200.00
赵六        | 2024-01-02 | 2200.00| 2200.00         | 2200.00         | 12200.00

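2. ORDER BY - the ordering clause

Sorts the rows within each partition, which affects ranking functions and cumulative calculations.

SELECT 
    salesperson,
    sale_date,
    amount,
    -- Row number by amount, highest first (within the partition)
    ROW_NUMBER() OVER (PARTITION BY salesperson ORDER BY amount DESC) AS rn,
    -- Running total (within the partition, ordered by date)
    SUM(amount) OVER (PARTITION BY salesperson ORDER BY sale_date) AS running_total,
    -- Moving average (the current row and up to 2 preceding rows)
    AVG(amount) OVER (PARTITION BY salesperson ORDER BY sale_date 
         ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_3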
FROM sales
ORDER BY salesperson, sale_date;

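3. Aggregate window functions

Many tutorials divide the commonly used window functions into two groups: aggregate window functions and dedicated window functions. Aggregate window functions have the same names and the same behavior as the ordinary aggregate functions. In use, the difference is the OVER clause, which lets the aggregate be computed per window without collapsing rows. The main ones are:

Function | Purpose
---------|------------------------------------------
SUM      | Sum of the values in the given column
AVG      | Average of the values in the given column
COUNT    | Number of rows or of non-NULL values
MAX      | Largest value in the given column
MIN      | Smallest value in the given column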

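4. Dedicated window functions

Common dedicated window functions:

Function     | Category     | Description
-------------|--------------|------------------------------------------------------------
RANK         | Ranking      | Rank in which ties share a number and leave gaps after them (e.g. 1, 2, 2, 4)
DENSE_RANK   | Ranking      | Rank in which ties share a number and no gaps are left (e.g. 1, 2, 2, 3)
ROW_NUMBER   | Ranking      | Unique consecutive row number within each partition, in ORDER BY order (e.g. 1, 2, 3, 4)
PERCENT_RANK | Distribution | (rank - 1) / (rows - 1) for each row; a value from 0 to 1
CUME_DIST    | Distribution | Rows in the partition with a value less than or equal to the current row's value, divided by the total rows in the partition; a value greater than 0 and at most 1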

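5. ROWS BETWEEN - the window frame clause

Defines the range of rows (the frame) over which the window function is computed.

-- Examples of different window frames
SELECT 
    salesperson,
    sale_date,
    amount,
    -- No ORDER BY and no frame clause: the frame is the whole partition
    SUM(amount) OVER (PARTITION BY salesperson) AS total_all,
    -- ROWS mode: frame bounds count physical rows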
    SUM(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS cumulative_rows,
    -- RANGE mode: frame bounds are based on ORDER BY values (rows with equal values are peers and enter the frame together)
    SUM(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS cumulative_range,
    -- Sliding window: the current row and the 2 rows before it
    SUM(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS sum_last_3,
    -- One row before and one row after the current row
    SUM(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
    ) AS sum_neighbors
FROM sales
ORDER BY salesperson, sale_date;
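
In the sales data above no salesperson has two rows on the same date, so the ROWS and RANGE running totals come out identical. The difference only shows up when the ORDER BY column has ties. The sketch below assumes a hypothetical temporary table demo_ties in which two rows share a sale_date.

```sql
-- Hypothetical scratch table: two rows share the date 2024-01-02
CREATE TEMPORARY TABLE demo_ties (
    sale_date DATE,
    amount DECIMAL(10, 2)
);
INSERT INTO demo_ties VALUES
('2024-01-01', 10.00),
('2024-01-02', 20.00),
('2024-01-02', 30.00),
('2024-01-03', 40.00);

SELECT
    sale_date,
    amount,
    -- ROWS: the frame ends at the current physical row, so the two tied rows
    -- get different running sums (which one comes first is not guaranteed)
    SUM(amount) OVER (ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sum_rows,
    -- RANGE: the frame ends at the last peer of the current row, so both
    -- rows dated 2024-01-02 get the same value (10 + 20 + 30 = 60)
    SUM(amount) OVER (ORDER BY sale_date
        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sum_range
FROM demo_ties
ORDER BY sale_date, amount;
```

For a deterministic running total with ROWS, a tie-breaking column (for example a unique id) can be added to the window's ORDER BY.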

III. Window Functions by Category

1. Ranking Functions

-- Create test data
CREATE TABLE employees (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(50),
    department VARCHAR(50),
    salary DECIMAL(10, 2)
);
INSERT INTO employees (name, department, salary) VALUES
('张三', '技术部', 8000.00),
('李四', '技术部', 9000.00),
('王五', '技术部', 9500.00),
('赵六', '技术部', 9000.00),
('钱七', '销售部', 7000.00),
('孙八', '销售部', 8500.00),
('周九', '销售部', 8500.00),
('吴十', '销售部', 7500.00);
-- 1. ROW_NUMBER(): consecutive, non-repeating row numbers
SELECT 
    name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;
-- 2. RANK(): ranking with gaps (equal values share a rank, the next rank skips ahead)
SELECT 
    name,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_num
FROM employees;
-- 3. DENSE_RANK(): ranking without gaps (equal values share a rank, the next rank follows directly)
SELECT 
    name,
    department,
    salary,
    DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank_num
FROM employees;
-- 4. NTILE(n): distributes the rows of each partition into n buckets
SELECT 
    name,
    department,
    salary,
    NTILE(4) OVER (PARTITION BY department ORDER BY salary DESC) AS quartile
FROM employees;

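Output comparison:

department | name | salary | ROW_NUMBER | RANK | DENSE_RANK | NTILE(4)
-----------|------|--------|------------|------|------------|---------
技术部     | 王五 | 9500   | 1          | 1    | 1          | 1
技术部     | 李四 | 9000   | 2          | 2    | 2          | 2
技术部     | 赵六 | 9000   | 3          | 2    | 2          | 3
技术部     | 张三 | 8000   | 4          | 4    | 3          | 4
销售部     | 孙八 | 8500   | 1          | 1    | 1          | 1
销售部     | 周九 | 8500   | 2          | 1    | 1          | 2
销售部     | 吴十 | 7500   | 3          | 3    | 2          | 3
销售部     | 钱七 | 7000   | 4          | 4    | 3          | 4

Each department has exactly four rows, so NTILE(4) places one row in each bucket. Rows with equal salaries (李四 and 赵六, 孙八 and 周九) may swap their ROW_NUMBER and NTILE values between runs, because the order among ties is not deterministic.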
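
A common use of the ranking functions is a "top N per group" query. MySQL allows window functions only in the SELECT list and the ORDER BY clause, not in WHERE, so the rank is computed in a derived table and filtered outside it. The following is a sketch under that assumption, using the employees table created above; the alias names ranked and salary_rank are chosen for illustration.

```sql
-- Employees on the top 2 salary levels of each department
SELECT name, department, salary, salary_rank
FROM (
    SELECT
        name,
        department,
        salary,
        -- DENSE_RANK keeps tied salaries on the same level without gaps
        DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
    FROM employees
) AS ranked
WHERE salary_rank <= 2
ORDER BY department, salary_rank;
```

With the test data this keeps six rows, three per department, because each department has a tie on one of its top two levels. Replacing DENSE_RANK() with ROW_NUMBER() would instead return exactly two rows per department, with ties broken arbitrarily.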

2. Distribution Functions

-- 5. PERCENT_RANK(): percentage rank, (rank - 1) / (total_rows - 1)
SELECT 
    name,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary) AS rank_num,
    -- PERCENT_RANK is a reserved word in MySQL 8.0, so the alias uses a different name
    PERCENT_RANK() OVER (PARTITION BY department ORDER BY salary) AS pct_rank
FROM employees;
-- 6. CUME_DIST(): cumulative distribution (rows with a value <= the current value / total rows)
SELECT 
    name,
    department,
    salary,
    -- CUME_DIST is also a reserved word, hence the alias cume_dist_val
    CUME_DIST() OVER (PARTITION BY department ORDER BY salary) AS cume_dist_val
FROM employees;
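-- 7. PERCENTILE_CONT(): MySQL 8.0 does not implement this function.
--    The continuous median can be emulated with ROW_NUMBER() and COUNT():
SELECT 
    department,
    AVG(salary) AS median_salary
FROM (
    SELECT 
        department,
        salary,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary) AS rn,
        COUNT(*) OVER (PARTITION BY department) AS cnt
    FROM employees
) AS ranked
WHERE rn IN (FLOOR((cnt + 1) / 2), CEIL((cnt + 1) / 2))
GROUP BY department;
-- 8. PERCENTILE_DISC(): MySQL 8.0 does not implement this function either.
--    The discrete median is the smallest value whose CUME_DIST() is at least 0.5:
SELECT 
    department,
    MIN(salary) AS median_salary
FROM (
    SELECT department, salary,
           CUME_DIST() OVER (PARTITION BY department ORDER BY salary) AS cd
    FROM employees
) AS dist
WHERE cd >= 0.5
GROUP BY department;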
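
The formulas quoted for PERCENT_RANK() and CUME_DIST() can be checked by computing them by hand from RANK() and COUNT(). This is an illustrative sketch against the employees table above; the alias names (pct_rank, pct_rank_manual, cume_dist_val, cume_dist_manual) are assumptions made for this example.

```sql
SELECT
    name,
    department,
    salary,
    PERCENT_RANK() OVER (PARTITION BY department ORDER BY salary) AS pct_rank,
    -- (rank - 1) / (rows in partition - 1)
    (RANK() OVER (PARTITION BY department ORDER BY salary) - 1)
        / (COUNT(*) OVER (PARTITION BY department) - 1) AS pct_rank_manual,
    CUME_DIST() OVER (PARTITION BY department ORDER BY salary) AS cume_dist_val,
    -- With ORDER BY and no frame clause, COUNT(*) covers all rows up to and
    -- including the peers of the current row, i.e. rows with salary <= current
    COUNT(*) OVER (PARTITION BY department ORDER BY salary)
        / COUNT(*) OVER (PARTITION BY department) AS cume_dist_manual
FROM employees
ORDER BY department, salary;
```

For 技术部 (salaries 8000, 9000, 9000, 9500) both percent-rank columns come to 0, 1/3, 1/3, 1 and both cumulative-distribution columns to 0.25, 0.75, 0.75, 1, although the built-in functions return floating-point values while the manual divisions return decimals. In a partition with a single row the manual percent rank divides by zero and yields NULL, whereas PERCENT_RANK() returns 0.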

3. Value Functions

-- 9. LAG(column, n, default): value from n rows before the current row
SELECT 
    name,
    department,
    salary,
    LAG(salary, 1, 0) OVER (PARTITION BY department ORDER BY salary) AS prev_salary,
    -- With the default 0, the first row of each partition reports its own salary as the difference
    salary - LAG(salary, 1, 0) OVER (PARTITION BY department ORDER BY salary) AS salary_diff
FROM employees;
-- 10. LEAD(column, n, default): value from n rows after the current row
SELECT 
    salesperson,
    region,
    sale_date,
    amount,
    LEAD(amount, 1, 0) OVER (PARTITION BY salesperson ORDER BY sale_date) AS next_amount,
    LEAD(sale_date, 1, NULL) OVER (PARTITION BY salesperson ORDER BY sale_date) AS next_date
FROM sales;
-- 11. FIRST_VALUE(column): first value in the window frame
SELECT 
    name,
    department,
    salary,
    FIRST_VALUE(salary) OVER (
        PARTITION BY department 
        ORDER BY salary 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS lowest_salary,
    salary - FIRST_VALUE(salary) OVER (
        PARTITION BY department 
        ORDER BY salary 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS diff_from_lowest
FROM employees;
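-- 12. LAST_VALUE(column): last value in the window frame (watch out for the default frame!)
SELECT 
    name,
    department,
    salary,
    -- Pitfall: with ORDER BY, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW,
    -- so this returns the current row's own salary instead of the partition's last value
    LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary) AS wrong_last_value,
    -- Correct: specify a frame that covers the whole partition
    LAST_VALUE(salary) OVER (
        PARTITION BY department 
        ORDER BY salary 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS correct_last_value,
    -- NTH_VALUE(column, n) returns the n-th value in the frame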
    NTH_VALUE(salary, 1) OVER (
        PARTITION BY department 
        ORDER BY salary 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS first_salary,
    NTH_VALUE(salary, 2) OVER (
        PARTITION BY department 
        ORDER BY salary 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS second_salary
FROM employees;
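
The queries above repeat the same partition, ordering and frame several times. MySQL 8.0 also supports a named window declared in a WINDOW clause, which can reduce that repetition. The following is a sketch of the same idea with one shared definition; the window name w and the column aliases are arbitrary choices for this example.

```sql
SELECT
    name,
    department,
    salary,
    FIRST_VALUE(salary) OVER w AS lowest_salary,
    LAST_VALUE(salary) OVER w AS highest_salary,
    NTH_VALUE(salary, 2) OVER w AS second_salary
FROM employees
-- One definition shared by all three functions; the explicit frame covers
-- the whole partition, which avoids the LAST_VALUE default-frame pitfall
WINDOW w AS (
    PARTITION BY department
    ORDER BY salary
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);
```

Keeping the frame in a single place also makes it harder for one function to silently fall back to the default frame.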

4. Aggregate functions as window functions

-- Most aggregate functions can also be used as window functions
SELECT 
    salesperson,
    region,
    sale_date,
    amount,
    -- Aggregate functions
    COUNT(*) OVER (PARTITION BY salesperson) AS total_transactions,
    SUM(amount) OVER (PARTITION BY salesperson) AS total_amount,
    AVG(amount) OVER (PARTITION BY salesperson) AS avg_amount,
    MAX(amount) OVER (PARTITION BY salesperson) AS max_amount,
    MIN(amount) OVER (PARTITION BY salesperson) AS min_amount,
    -- Standard deviation and variance (MySQL 8.0+)
    STDDEV(amount) OVER (PARTITION BY salesperson) AS std_amount,
    VARIANCE(amount) OVER (PARTITION BY salesperson) AS var_amount,
    -- Cumulative aggregate
    SUM(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total,
    -- Moving average
    AVG(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3,
    -- Percentage of the salesperson's total
    amount * 100.0 / SUM(amount) OVER (PARTITION BY salesperson) AS percentage
FROM sales
ORDER BY salesperson, sale_date;
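
GROUP BY cannot appear inside OVER(), but a window function can be applied to the result of a GROUP BY in the same query, because window functions are evaluated after grouping. A minimal sketch, assuming the sales table above: the inner SUM(amount) is the ordinary per-day aggregate and the outer SUM(...) OVER accumulates those daily totals.

```sql
SELECT
    sale_date,
    -- Ordinary aggregate: one total per day
    SUM(amount) AS daily_total,
    -- Window function over the grouped rows: running total of the daily totals
    SUM(SUM(amount)) OVER (ORDER BY sale_date) AS running_total
FROM sales
GROUP BY sale_date
ORDER BY sale_date;
```

With the test data the daily totals are 4200.00, 6200.00 and 1800.00, and the running total ends at 12200.00, which matches the grand_total shown earlier.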
