MySQL 8.0.2 Windows Function
Q :
What is sql windows function ?
Ans : a windows sql function will perform
aggregation for each row in result set ,
letting each row in the result set , leatting each row retain its
identity .
Example :
Example :
mysql>
CREATE TABLE t(i INT);
mysql> INSERT INTO t VALUES (1),(2),(3),(4);
mysql> SELECT SUM(i) AS sum FROM t;
+------+
| sum |
+------+
| 10 |
+------+
mysql> INSERT INTO t VALUES (1),(2),(3),(4);
mysql> SELECT SUM(i) AS sum FROM t;
+------+
| sum |
+------+
| 10 |
+------+
mysql>
SELECT i, SUM(i) OVER () AS sum FROM t;
+------+------+
| i | sum
|
+------+------+
| 1 |
10 |
| 2 |
10 |
| 3 |
10 |
| 4 |
10 |
So you can see only by using over()
windows function , sum is showing
along with each value that has been summed .
Q: what are
types of windows sql function ?
Ans : sql aggregate function can be used in two
ways :
A: - Windows Function .
B:
- Specialized window function .
A: windows function : count
,sum avg, min ,max , BIT_OR ,BIT_AND ,BIT_XOR
,STDDEV_POP (STD,STDDEV) ,STDDEV_SAMP ,VAR_POP and VAR_SAMP .
B : Specialized window function : RANK , DENSE_RANK , PERCNENT_RANK , CUME
_DIST ,NTLE ,ROW_NUMBER , FIRST_VALUE ,LAST_VALUE ,NTH_VALUE , LEAD AND LAG .
,STDDEV_POP (STD,STDDEV) ,STDDEV_SAMP ,VAR_POP and VAR_SAMP .
B : Specialized window function : RANK , DENSE_RANK , PERCNENT_RANK , CUME
_DIST ,NTLE ,ROW_NUMBER , FIRST_VALUE ,LAST_VALUE ,NTH_VALUE , LEAD AND LAG .
Q: what are
partition , row ordering , determinacy , the window frame ,row peers ,physical and logical window frame bounds
?
Ans :
The Partition : CREATE TABLE sales(employee VARCHAR(50), `date` DATE, sale INT);
The Partition : CREATE TABLE sales(employee VARCHAR(50), `date` DATE, sale INT);
INSERT INTO
sales VALUES ('odin', '2017-03-01', 200),
('odin', '2017-04-01',
300),
('odin', '2017-05-01',
400),
('thor', '2017-03-01',
400),
('thor', '2017-04-01',
300),
('thor', '2017-05-01',
500);
mysql>
SELECT employee, SUM(sale) FROM sales GROUP BY employee;
+----------+-----------+
| employee |
SUM(sale) |
+----------+-----------+
| odin |
900 |
| thor |
1200 |
+----------+-----------+
mysql>
SELECT employee, date, sale, SUM(sale) OVER (PARTITION BY employee) AS sum FROM
sales;
+----------+------------+------+------+
| employee |
date | sale | sum |
+----------+------------+------+------+
| odin | 2017-03-01 | 200 |
900 |
| odin | 2017-04-01 | 300 |
900 |
| odin | 2017-05-01 | 400 |
900 |
| thor | 2017-03-01 | 400 | 1200 |
| thor | 2017-04-01 | 300 | 1200 |
| thor | 2017-05-01 | 500 | 1200 |
+----------+------------+------+------+
By above two
example you can use over() with partition of
for
getting sum along with grouping of
employee .
Example 2:
mysql> SELECT employee, MONTHNAME(date), sale, SUM(sale) OVER (PARTITION BY MONTH(date)) AS sum FROM sales;
mysql> SELECT employee, MONTHNAME(date), sale, SUM(sale) OVER (PARTITION BY MONTH(date)) AS sum FROM sales;
+----------+-----------------+------+------+
| employee |
MONTHNAME(date) | sale | sum |
+----------+-----------------+------+------+
| odin | March |
200 | 600 |
| thor | March |
400 | 600 |
| odin | April |
300 | 600 |
| thor | April |
300 | 600 |
| odin |
May | 400 |
900 |
| thor | May |
500 | 900 |
Order by :
mysql> SELECT employee, sale, date, SUM(sale) OVER (PARTITION by employee ORDER BY date) AS cum_sales FROM sales;
mysql> SELECT employee, sale, date, SUM(sale) OVER (PARTITION by employee ORDER BY date) AS cum_sales FROM sales;
+----------+------+------------+-----------+
| employee |
sale | date | cum_sales |
+----------+------+------------+-----------+
| odin |
200 | 2017-03-01 | 200 |
| odin |
300 | 2017-04-01 | 500 |
| odin |
400 | 2017-05-01 | 900 |
| thor |
400 | 2017-03-01 | 400 |
| thor |
300 | 2017-04-01 | 700 |
| thor |
500 | 2017-05-01 | 1200 |
+----------+------+------------+-----------+
Windows can
be specified by physical(rows)
and logical(range) boundaries .
mysql>
SELECT employee, sale, date,
SUM(sale) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_sales FROM sales;
+----------+------+------------+-----------+
| employee |
sale | date | cum_sales |
+----------+------+------------+-----------+
| odin |
200 | 2017-03-01 | 200 |
| thor |
400 | 2017-03-01 | 600 |
| odin |
300 | 2017-04-01 | 900 |
| thor |
300 | 2017-04-01 | 1200 |
| odin |
400 | 2017-05-01 | 1600 |
| thor |
500 | 2017-05-01 | 2100 |
+----------+------+------------+-----------+
Digression: :if you omit an ORDER BY, there is no way to determine
which row comes before another row, so all of the rows in the partition can be
considered peers, and hence a degenerate result of:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT
ROW /* and its peers */).
Determinacy :
mysql>
SELECT employee, sale, date, SUM(sale) OVER (ORDER BY date ROWS UNBOUNDED
PRECEDING) AS cum_sales FROM sales;
+----------+------+------------+-----------+
| employee |
sale | date | cum_sales |
+----------+------+------------+-----------+
| odin |
200 | 2017-03-01 | 200 |
| thor |
400 | 2017-03-01 | 600 |
| odin |
300 | 2017-04-01 | 900 |
| thor |
300 | 2017-04-01 | 1200 |
| odin |
400 | 2017-05-01 | 1600 |
| thor |
500 | 2017-05-01 | 2100 |
| odin |
200 | 2017-06-01 | 2300 |
| thor |
400 | 2017-06-01 | 2700 |
| odin |
600 | 2017-07-01 | 3300 |
| thor |
600 | 2017-07-01 | 3900 |
| odin |
100 | 2017-08-01 | 4000 |
| thor |
150 | 2017-08-01 | 4150 |
+----------+------+------------+-----------+
Movable window
Frame :
If anyone don’t
want to aggregate over all values in partition , then we use moving
averages :
Example :
TRUNCATE sales;
INSERT INTO
sales VALUES ('odin', '2017-03-01', 200),
('odin', '2017-04-01',
300),
('odin', '2017-05-01',
400),
('odin', '2017-06-01',
200),
('odin', '2017-07-01',
600),
('odin', '2017-08-01',
100),
('thor', '2017-03-01',
400),
('thor', '2017-04-01',
300),
('thor', '2017-05-01',
500),
('thor', '2017-06-01',
400),
('thor', '2017-07-01',
600),
('thor', '2017-08-01',
150);
mysql>
SELECT MONTH(date), SUM(sale),
AVG(SUM(sale)) OVER (ORDER BY
MONTH(date)
RANGE BETWEEN 1 PRECEDING
AND 1 FOLLOWING) AS sliding_avg
FROM sales GROUP BY MONTH(date);
+-------------+-----------+-------------+
| month(date) |
SUM(sale) | sliding_avg |
+-------------+-----------+-------------+
| 3 | 600 |
600.0000 |
| 4 | 600 |
700.0000 |
| 5 | 900 |
700.0000 |
| 6 | 600 |
900.0000 |
| 7 | 1200 |
683.3333 |
| 8 | 250 |
725.0000 |
+-------------+-----------+----------
Same result we
can obtained below complex query :
WITH sums AS (
SELECT SUM(t2.sale) AS sum, t2.date FROM
sales AS t2 GROUP BY t2.date )
SELECT t1.date, ( SELECT SUM(sums.sum) / COUNT(sums.sum)
FROM sums WHERE
MONTH(sums.date) - MONTH(t1.date) BETWEEN -1 AND 1) FROM sales AS t1, sums
MONTH(sums.date) - MONTH(t1.date) BETWEEN -1 AND 1) FROM sales AS t1, sums
GROUP BY
date
ORDER BY t1.date;
And even
without CTEs .
SELECT t1.date,
SUM(sale),
(SELECT SUM(sums.sum) / COUNT(sums.sum)
FROM
(SELECT SUM(t2.sale) AS sum, t2.date
FROM sales AS t2
GROUP BY t2.date) sums WHERE MONTH(sums.date) - MONTH(t1.date)
BETWEEN -1 AND 1 ) AS sliding_avg FROM sales AS t1 GROUP BY date
ORDER BY t1.date;
Comments
Post a Comment