postgres over 用法

总结下postgres中over的用法。

func() OVER( [PRITITION BY col1] ORDER BY col2 [DESC ] )

常用窗口函数汇总:

row_number(): 从当前开始,不间断,如1,2,3,4,5,6
rank() :从当前开始,会间断,如1,2,2,4,5,6
dense_rank():从当前开始不间断,但会重复,如1,2,2,3,4,5
percent_rank():从当前开始,计算在分组中的比例,如0,0.25,0.25,0.75,1,0,1 从0-1不断地循环
cume_dist():当前行的排序除以分组的数量,如分组有4行,则值为0.25,0.5,0.75,1
ntile(num_buckets integer):从1到当前值,除以分组的的数量,尽可能使分布平均
lag(value any [, offset integer [, default any ]]):偏移量函数,取滞后值,如lag(column_name,2,0)表示字段偏移量为2,没有则用default值代替,这里是0,不写默认是null
lead(value any [, offset integer [, default any ]]):偏移量函数,取提前值,类上
first_value(value any):返回窗口框架中的第一个值
last_value(value any):返回窗口框架中的最后一个值
nth_value(value any, nth integer):返回窗口框架中的指定值,如nth_value(salary,2),则表示返回字段salary的第二个窗口函数值

假设有如下的数据表:

depname empno salary enroll_date
develop 10 5200 2007-08-01
sales 1 5000 2006-10-01
personnel 5 3500 2007-12-10
sales 4 4800 2007-08-08
sales 6 5500 2007-01-02
personnel 2 3900 2006-12-23
develop 7 4200 2008-01-01
develop 9 4500 2008-01-01
sales 3 4800 2007-08-01
develop 8 6000 2006-10-01
develop 11 5200 2007-08-15

a. 统计各部门的总薪水,平均薪水和部门的详细情况

select sum(salary) OVER (PARTITION BY depname),avg(salary) OVER (PARTITION BY depname),* from empsalary;
  sum  |          avg          |  depname  | empno | salary | enroll_date 
-------+-----------------------+-----------+-------+--------+-------------
 25100 | 5020.0000000000000000 | develop   |    10 |   5200 | 2007-08-01
 25100 | 5020.0000000000000000 | develop   |     7 |   4200 | 2008-01-01
 25100 | 5020.0000000000000000 | develop   |     9 |   4500 | 2008-01-01
 25100 | 5020.0000000000000000 | develop   |     8 |   6000 | 2006-10-01
 25100 | 5020.0000000000000000 | develop   |    11 |   5200 | 2007-08-15
  7400 | 3700.0000000000000000 | personnel |     2 |   3900 | 2006-12-23
  7400 | 3700.0000000000000000 | personnel |     5 |   3500 | 2007-12-10
 20100 | 5025.0000000000000000 | sales     |     3 |   4800 | 2007-08-01
 20100 | 5025.0000000000000000 | sales     |     1 |   5000 | 2006-10-01
 20100 | 5025.0000000000000000 | sales     |     4 |   4800 | 2007-08-08
 20100 | 5025.0000000000000000 | sales     |     6 |   5500 | 2007-01-02
(11 rows)

b.统计人员在所在部门的薪水排名情况

select rank() OVER (PARTITION BY depname ORDER BY salary),* from empsalary;
 rank |  depname  | empno | salary | enroll_date 
------+-----------+-------+--------+-------------
    1 | develop   |     7 |   4200 | 2008-01-01
    2 | develop   |     9 |   4500 | 2008-01-01
    3 | develop   |    10 |   5200 | 2007-08-01
    3 | develop   |    11 |   5200 | 2007-08-15
    5 | develop   |     8 |   6000 | 2006-10-01
    1 | personnel |     5 |   3500 | 2007-12-10
    2 | personnel |     2 |   3900 | 2006-12-23
    1 | sales     |     4 |   4800 | 2007-08-08
    1 | sales     |     3 |   4800 | 2007-08-01
    3 | sales     |     1 |   5000 | 2006-10-01
    4 | sales     |     6 |   5500 | 2007-01-02
(11 rows)

TODO


参考文档:https://my.oschina.net/Kenyon/blog/79543