The over() window function in Oracle--2

Window functions

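Window functions are also called OLAP (Online Analytical Processing) functions, meaning functions that analyze database data in real time. Oracle and SQL Server also call them analytic functions. The syntax is as follows:

<window function> OVER ([PARTITION BY <column list>]
ORDER BY <sort column list> [frame])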

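In this notation, items in <> are required and items in [] are optional, so PARTITION BY and the frame can be omitted while ORDER BY is written as required. Ranking functions such as RANK need ORDER BY to be meaningful; aggregate functions used as window functions also accept an OVER clause without ORDER BY, in which case the whole partition is aggregated. The frame restricts the range of rows being aggregated and takes one of these forms: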

(ROWS | RANGE) BETWEEN (UNBOUNDED | [num]) PRECEDING AND ([num] PRECEDING | CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING)
(ROWS | RANGE) BETWEEN CURRENT ROW AND (CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING)
(ROWS | RANGE) BETWEEN [num] FOLLOWING AND (UNBOUNDED | [num]) FOLLOWING
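
To make the frame clause concrete, here is a minimal sketch of a three-row moving sum. It is not from the original article: it uses Python's built-in sqlite3 module as a stand-in engine (assuming the bundled SQLite is 3.25 or later, the release that added window functions), and the prices table and its values are hypothetical.

```python
import sqlite3

# Hypothetical five-row table, used only to illustrate the frame clause.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (id INTEGER, price INTEGER)")
conn.executemany("INSERT INTO prices VALUES (?, ?)",
                 [(1, 100), (2, 200), (3, 300), (4, 400), (5, 500)])

# The frame covers the previous row, the current row and the next row.
query = """
SELECT id, price,
       SUM(price) OVER (ORDER BY id
                        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS moving_sum
FROM prices
ORDER BY id
"""
moving = conn.execute(query).fetchall()
for row in moving:
    print(row)
```

At the first and last rows the frame is simply cut off at the edge of the partition, so those sums cover two rows instead of three.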

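Window functions fall into two groups:
    1) Aggregate functions that can be used as window functions
      SUM: sum
      MIN: minimum
      MAX: maximum
      AVG: average
      COUNT: count

    2) Dedicated window functions
      RANK: ranking with gaps after ties, e.g. 1, 1, 3
      DENSE_RANK: ranking without gaps, e.g. 1, 1, 2
      ROW_NUMBER: sequential numbering with no repeated values, e.g. 1, 2, 3
      FIRST_VALUE: returns the first value in the window frame
      LAST_VALUE: returns the last value in the window frame
      LAG: LAG(col, n, DEFAULT) returns the value of col from the row n rows before the current row in the partition, or DEFAULT when no such row exists
      LEAD: LEAD(col, n, DEFAULT) returns the value of col from the row n rows after the current row in the partition, or DEFAULT when no such row exists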

Window functions in practice

First, create a product table (the statements below use MySQL syntax)

create table product (
  product_id int(4) COMMENT 'ID',
  product_name varchar(10) COMMENT 'product name',
  product_type varchar(10) COMMENT 'product type',
  sale_price int(4) COMMENT 'price'
)ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='product list';

Insert the data

insert into product(product_id,product_name,product_type,sale_price) values(1,'叉子','廚房用具',500),(2,'擦菜板','廚房用具',880),
(3,'菜刀','廚房用具',3000),(4,'高壓鍋','廚房用具',6800),(5,'T恤衫','衣服',1000),(6,'運動T恤','衣服',4000),(7,'圓珠筆','辦公用品',100),(8,'打孔器','辦公用品',500);

The resulting table contains the eight rows inserted above.

1) Aggregate functions that can be used as window functions

  • SUM: sum (running total)
SELECT product_id, product_name, product_type, sale_price,
SUM(sale_price) OVER (PARTITION BY product_type ORDER BY sale_price RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS current_sum
FROM product;

SELECT product_id, product_name, product_type, sale_price,
SUM(sale_price) OVER (ORDER BY sale_price) AS current_sum
FROM product;
-- The statement above and the statement below return the same result
SELECT product_id, product_name, product_type, sale_price,
SUM(sale_price) OVER (ORDER BY sale_price RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS current_sum
FROM product;

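Note: when ORDER BY is present, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. ROWS and RANGE differ in how they delimit the frame. ROWS counts physical rows: for the first row the sum covers row 1 only, and for the second row it covers rows 1 to 2. RANGE works on the ORDER BY value, here sale_price: at sale_price=100 the sum covers values 100 to 100, and at sale_price=500 it covers values 100 to 500, so rows that share the same sale_price all receive the same sum. The ROWS version of the query is: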

SELECT product_id, product_name, product_type, sale_price,
SUM(sale_price) OVER (ORDER BY sale_price ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS current_sum
FROM product;
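
The difference between ROWS and RANGE only shows up when the ORDER BY column has ties, and the product table has one: two items are priced 500. The following is a minimal sketch that puts both running sums side by side. It assumes Python's built-in sqlite3 module as a stand-in engine (SQLite 3.25 or later); the column aliases range_sum and rows_sum are names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER, product_name TEXT, "
             "product_type TEXT, sale_price INTEGER)")
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", [
    (1, "叉子", "廚房用具", 500), (2, "擦菜板", "廚房用具", 880),
    (3, "菜刀", "廚房用具", 3000), (4, "高壓鍋", "廚房用具", 6800),
    (5, "T恤衫", "衣服", 1000), (6, "運動T恤", "衣服", 4000),
    (7, "圓珠筆", "辦公用品", 100), (8, "打孔器", "辦公用品", 500),
])

# Same ORDER BY, same bounds; only the frame unit differs.
query = """
SELECT sale_price,
       SUM(sale_price) OVER (ORDER BY sale_price
           RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum,
       SUM(sale_price) OVER (ORDER BY sale_price
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum
FROM product
ORDER BY sale_price, rows_sum
"""
sums = conn.execute(query).fetchall()
for row in sums:
    print(row)
```

Both rows priced 500 get a range_sum of 1100 because RANGE treats them as peers, while rows_sum grows one row at a time (600, then 1100). Which of the two tied products receives 600 is not defined by the ORDER BY, which is why the sketch selects only the price.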

  • MIN, MAX, AVG, COUNT
SELECT product_id, product_name, product_type, sale_price,
MIN(sale_price) OVER (PARTITION BY product_type ORDER BY sale_price) AS current_min,
MAX(sale_price) OVER (PARTITION BY product_type ORDER BY sale_price) AS current_max,
AVG(sale_price) OVER (PARTITION BY product_type ORDER BY sale_price) AS current_avg,
COUNT(sale_price) OVER (PARTITION BY product_type ORDER BY sale_price) AS current_count
FROM product;

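Note: the default frame is again RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, and RANGE works on values. Taking COUNT as the example: in the 辦公用品 (office supplies) partition, the first row covers sale_price values 100 to 100, a single value, so the count is 1; the second row covers 100 to 500, so the count is 2. In the 廚房用具 (kitchenware) partition, the first row covers 500 to 500 and, since that partition holds only one row priced 500, the count is 1. The remaining rows follow the same pattern.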
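
The counts described in the note can be checked with a short sketch. It assumes Python's built-in sqlite3 module as a stand-in engine (SQLite 3.25 or later). The helper counts_by_type and the extra "stapler" row are hypothetical additions for this illustration; the extra row is there only to show what the default RANGE frame does when a partition contains a tie.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER, product_name TEXT, "
             "product_type TEXT, sale_price INTEGER)")
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", [
    (1, "叉子", "廚房用具", 500), (2, "擦菜板", "廚房用具", 880),
    (3, "菜刀", "廚房用具", 3000), (4, "高壓鍋", "廚房用具", 6800),
    (5, "T恤衫", "衣服", 1000), (6, "運動T恤", "衣服", 4000),
    (7, "圓珠筆", "辦公用品", 100), (8, "打孔器", "辦公用品", 500),
])

query = """
SELECT product_type, sale_price,
       COUNT(sale_price) OVER (PARTITION BY product_type
                               ORDER BY sale_price) AS current_count
FROM product
ORDER BY product_type, sale_price
"""

def counts_by_type(connection):
    """Map each product_type to its running counts, in price order."""
    result = defaultdict(list)
    for product_type, _price, current_count in connection.execute(query):
        result[product_type].append(current_count)
    return dict(result)

counts = counts_by_type(conn)
print(counts)

# Hypothetical extra row (not in the article): a second office item priced 500.
conn.execute("INSERT INTO product VALUES (9, 'stapler', '辦公用品', 500)")
counts_with_tie = counts_by_type(conn)
print(counts_with_tie["辦公用品"])
```

With the original eight rows every partition counts 1, 2, 3 and so on. Once the tie is added, both office items priced 500 report a count of 3, because the default RANGE frame includes all peers of the current row.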

2) Dedicated window functions

  • RANK, DENSE_RANK, ROW_NUMBER
-- ranking functions always work on the whole partition, so no frame clause is given
SELECT product_id, product_name, product_type, sale_price,
rank() OVER (PARTITION BY product_type ORDER BY sale_price) AS current_rk,
dense_rank() OVER (PARTITION BY product_type ORDER BY sale_price) AS current_drk,
row_number() OVER (PARTITION BY product_type ORDER BY sale_price) AS current_rn
FROM product;

Note: RANK leaves gaps after ties, DENSE_RANK numbers consecutively without gaps, and ROW_NUMBER simply numbers the rows in order.
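
Within each product_type no two prices are equal, so the three ranking functions return identical numbers in the partitioned query. The sketch below drops PARTITION BY so that the two items priced 500 become a tie and the functions diverge. It assumes Python's built-in sqlite3 module as a stand-in engine (SQLite 3.25 or later); the aliases rk, drk and rn are names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER, product_name TEXT, "
             "product_type TEXT, sale_price INTEGER)")
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", [
    (1, "叉子", "廚房用具", 500), (2, "擦菜板", "廚房用具", 880),
    (3, "菜刀", "廚房用具", 3000), (4, "高壓鍋", "廚房用具", 6800),
    (5, "T恤衫", "衣服", 1000), (6, "運動T恤", "衣服", 4000),
    (7, "圓珠筆", "辦公用品", 100), (8, "打孔器", "辦公用品", 500),
])

# No PARTITION BY: the whole table is one window, so the two 500s tie.
query = """
SELECT sale_price,
       RANK()       OVER (ORDER BY sale_price) AS rk,
       DENSE_RANK() OVER (ORDER BY sale_price) AS drk,
       ROW_NUMBER() OVER (ORDER BY sale_price) AS rn
FROM product
ORDER BY sale_price, rn
"""
ranks = conn.execute(query).fetchall()
for row in ranks:
    print(row)
```

After the tie at rank 2, RANK jumps to 4, DENSE_RANK continues with 3, and ROW_NUMBER never repeats a number.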

  • FIRST_VALUE, LAST_VALUE
SELECT product_id, product_name, product_type, sale_price,
FIRST_VALUE(sale_price) OVER (PARTITION BY product_type ORDER BY sale_price) AS current_FV,
LAST_VALUE(sale_price) OVER (PARTITION BY product_type ORDER BY sale_price) AS current_LV
FROM product;
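
One consequence of the default frame is worth seeing in code: because the frame ends at the current row, LAST_VALUE in the query above returns the current row's own sale_price, not the highest price in the partition. The sketch below compares it with an explicit frame that spans the whole partition. It assumes Python's built-in sqlite3 module as a stand-in engine (SQLite 3.25 or later); the aliases lv_default and lv_full are names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER, product_name TEXT, "
             "product_type TEXT, sale_price INTEGER)")
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", [
    (1, "叉子", "廚房用具", 500), (2, "擦菜板", "廚房用具", 880),
    (3, "菜刀", "廚房用具", 3000), (4, "高壓鍋", "廚房用具", 6800),
    (5, "T恤衫", "衣服", 1000), (6, "運動T恤", "衣服", 4000),
    (7, "圓珠筆", "辦公用品", 100), (8, "打孔器", "辦公用品", 500),
])

query = """
SELECT product_type, sale_price,
       LAST_VALUE(sale_price) OVER (PARTITION BY product_type
                                    ORDER BY sale_price) AS lv_default,
       LAST_VALUE(sale_price) OVER (PARTITION BY product_type
                                    ORDER BY sale_price
                                    ROWS BETWEEN UNBOUNDED PRECEDING
                                             AND UNBOUNDED FOLLOWING) AS lv_full
FROM product
ORDER BY product_type, sale_price
"""
last_values = conn.execute(query).fetchall()
for row in last_values:
    print(row)
```

In the kitchenware partition lv_default reads 500, 880, 3000, 6800 row by row, while lv_full is 6800 on every row.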

  • LAG, LEAD
SELECT product_id, product_name, product_type, sale_price,
LAG(sale_price,1) OVER (PARTITION BY product_type ORDER BY sale_price) AS current_LAG,
LEAD(sale_price,1) OVER (PARTITION BY product_type ORDER BY sale_price) AS current_LEAD
FROM product;
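
The query above omits the third argument, so the first row of each partition has no previous row and LAG returns NULL; the same happens to LEAD on the last row. A minimal sketch of the DEFAULT argument follows. It assumes Python's built-in sqlite3 module as a stand-in engine (SQLite 3.25 or later); the aliases lag_null, lag_zero and lead_zero, and the choice of 0 as the default, are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER, product_name TEXT, "
             "product_type TEXT, sale_price INTEGER)")
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", [
    (1, "叉子", "廚房用具", 500), (2, "擦菜板", "廚房用具", 880),
    (3, "菜刀", "廚房用具", 3000), (4, "高壓鍋", "廚房用具", 6800),
    (5, "T恤衫", "衣服", 1000), (6, "運動T恤", "衣服", 4000),
    (7, "圓珠筆", "辦公用品", 100), (8, "打孔器", "辦公用品", 500),
])

query = """
SELECT product_type, sale_price,
       LAG(sale_price, 1)     OVER (PARTITION BY product_type
                                    ORDER BY sale_price) AS lag_null,
       LAG(sale_price, 1, 0)  OVER (PARTITION BY product_type
                                    ORDER BY sale_price) AS lag_zero,
       LEAD(sale_price, 1, 0) OVER (PARTITION BY product_type
                                    ORDER BY sale_price) AS lead_zero
FROM product
ORDER BY product_type, sale_price
"""
shifted = conn.execute(query).fetchall()

# Show only the clothing partition (two rows) to keep the output short.
clothes = [row[1:] for row in shifted if row[0] == "衣服"]
for row in clothes:
    print(row)
```

NULL comes back to Python as None. Supplying a default is useful when the shifted value feeds arithmetic such as a difference from the previous price, where a NULL would otherwise propagate.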

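Summary

Window functions combine the grouping of the GROUP BY clause with the sorting of the ORDER BY clause. Unlike GROUP BY, however, the PARTITION BY clause does not aggregate rows, which means PARTITION BY does not reduce the number of rows.

The set of records produced by PARTITION BY is called a window. "Window" here does not mean a window in a building; it denotes a range. This is where the name "window function" comes from.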
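
The row-count difference between GROUP BY and PARTITION BY can be sketched as follows, using the same product data. This assumes Python's built-in sqlite3 module as a stand-in engine (SQLite 3.25 or later); the alias type_total is a name chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER, product_name TEXT, "
             "product_type TEXT, sale_price INTEGER)")
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", [
    (1, "叉子", "廚房用具", 500), (2, "擦菜板", "廚房用具", 880),
    (3, "菜刀", "廚房用具", 3000), (4, "高壓鍋", "廚房用具", 6800),
    (5, "T恤衫", "衣服", 1000), (6, "運動T恤", "衣服", 4000),
    (7, "圓珠筆", "辦公用品", 100), (8, "打孔器", "辦公用品", 500),
])

# GROUP BY collapses each product_type into a single row.
grouped = conn.execute(
    "SELECT product_type, SUM(sale_price) AS type_total "
    "FROM product GROUP BY product_type"
).fetchall()

# PARTITION BY keeps every row and attaches the partition total to it.
windowed = conn.execute(
    "SELECT product_id, product_type, "
    "SUM(sale_price) OVER (PARTITION BY product_type) AS type_total "
    "FROM product"
).fetchall()

print(len(grouped), "rows from GROUP BY")
print(len(windowed), "rows from PARTITION BY")
```

Both queries compute the same three totals; the windowed version repeats each total on every row of its partition, which is what allows a detail column such as product_id to sit next to the aggregate.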

Original: https://zhuanlan.zhihu.com/p/273846136