MySQL 8.0.2 Windows Function

Q : What is sql windows function  ?
Ans  : a windows sql function will perform aggregation for each row in result set  , letting  each row in  the result set , leatting each row retain its identity .
Example 
:
mysql> CREATE TABLE t(i INT);
mysql> INSERT INTO t VALUES (1),(2),(3),(4);
mysql> SELECT SUM(i) AS sum FROM t;
+------+
| sum  |
+------+
|   10 |
+------+

mysql> SELECT i, SUM(i) OVER () AS sum FROM t;
+------+------+
| i    | sum  |
+------+------+
|    1 |   10 |
|    2 |   10 |
|    3 |   10 |
|    4 |   10 |

So  you can see only by  using over()  windows function  , sum is showing along with  each value that  has been summed .

Q: what are types of windows sql function  ?
Ans :  sql aggregate function can be used in two ways  :
            A: -  Windows Function .
            B:  -    Specialized window function .
 
     A: windows function  :  count ,sum avg, min ,max , BIT_OR ,BIT_AND ,BIT_XOR
          ,STDDEV_POP  (STD,STDDEV) ,STDDEV_SAMP ,VAR_POP and VAR_SAMP   .
     B : Specialized window function  : RANK , DENSE_RANK , PERCNENT_RANK , CUME
        _DIST ,NTLE ,
ROW_NUMBER , FIRST_VALUE ,LAST_VALUE ,NTH_VALUE , LEAD           AND LAG .


Q: what are partition , row ordering , determinacy , the window frame ,row peers  ,physical and logical window  frame bounds  ?
Ans :
  The Partition  : 
CREATE TABLE sales(employee VARCHAR(50), `date` DATE, sale INT);
     INSERT INTO sales VALUES ('odin', '2017-03-01', 200),
                         ('odin', '2017-04-01', 300),
                         ('odin', '2017-05-01', 400),
                         ('thor', '2017-03-01', 400),
                         ('thor', '2017-04-01', 300),
                         ('thor', '2017-05-01', 500);

mysql> SELECT employee, SUM(sale) FROM sales GROUP BY employee;
+----------+-----------+
| employee | SUM(sale) |
+----------+-----------+
| odin     |       900 |
| thor     |      1200 |
+----------+-----------+


mysql> SELECT employee, date, sale, SUM(sale) OVER (PARTITION BY employee) AS sum FROM sales;
+----------+------------+------+------+
| employee | date       | sale | sum  |
+----------+------------+------+------+
| odin     | 2017-03-01 |  200 |  900 |
| odin     | 2017-04-01 |  300 |  900 |
| odin     | 2017-05-01 |  400 |  900 |
| thor     | 2017-03-01 |  400 | 1200 |
| thor     | 2017-04-01 |  300 | 1200 |
| thor     | 2017-05-01 |  500 | 1200 |
+----------+------------+------+------+

By above two example you can use over() with partition of    for getting  sum along with  grouping of  employee .

Example 2:
mysql> SELECT employee, MONTHNAME(date), sale, SUM(sale) OVER (PARTITION BY MONTH(date)) AS sum FROM sales;
+----------+-----------------+------+------+
| employee | MONTHNAME(date) | sale | sum  |
+----------+-----------------+------+------+
| odin     | March           |  200 |  600 |
| thor     | March           |  400 |  600 |
| odin     | April           |  300 |  600 |
| thor     | April           |  300 |  600 |
| odin     | May             |  400 |  900 |
| thor     | May             |  500 |  900 |







Order by  :
mysql> SELECT employee, sale, date, SUM(sale) OVER (PARTITION by employee ORDER BY date) AS cum_sales FROM sales;
+----------+------+------------+-----------+
| employee | sale | date       | cum_sales |
+----------+------+------------+-----------+
| odin     |  200 | 2017-03-01 |       200 |
| odin     |  300 | 2017-04-01 |       500 |
| odin     |  400 | 2017-05-01 |       900 |
| thor     |  400 | 2017-03-01 |       400 |
| thor     |  300 | 2017-04-01 |       700 |
| thor     |  500 | 2017-05-01 |      1200 |
+----------+------+------------+-----------+

Windows can  be specified by physical(rows)  and logical(range)  boundaries .

mysql> SELECT employee, sale, date,
       SUM(sale) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING              AND CURRENT ROW) AS cum_sales FROM sales;
+----------+------+------------+-----------+
| employee | sale | date       | cum_sales |
+----------+------+------------+-----------+
| odin     |  200 | 2017-03-01 |       200 |
| thor     |  400 | 2017-03-01 |       600 |
| odin     |  300 | 2017-04-01 |       900 |
| thor     |  300 | 2017-04-01 |      1200 |
| odin     |  400 | 2017-05-01 |      1600 |
| thor     |  500 | 2017-05-01 |      2100 |
+----------+------+------------+-----------+


Digression: :if you omit an ORDER BY, there is no way to determine which row comes before another row, so all of the rows in the partition can be considered peers, and hence a degenerate result of:
   (PARTITION by employee /* ORDER BY */
  RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW /* and its peers */).


Determinacy :
mysql> SELECT employee, sale, date, SUM(sale) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING) AS cum_sales FROM sales;
+----------+------+------------+-----------+
| employee | sale | date       | cum_sales |
+----------+------+------------+-----------+
| odin     |  200 | 2017-03-01 |       200 |
| thor     |  400 | 2017-03-01 |       600 |
| odin     |  300 | 2017-04-01 |       900 |
| thor     |  300 | 2017-04-01 |      1200 |
| odin     |  400 | 2017-05-01 |      1600 |
| thor     |  500 | 2017-05-01 |      2100 |
| odin     |  200 | 2017-06-01 |      2300 |
| thor     |  400 | 2017-06-01 |      2700 |
| odin     |  600 | 2017-07-01 |      3300 |
| thor     |  600 | 2017-07-01 |      3900 |
| odin     |  100 | 2017-08-01 |      4000 |
| thor     |  150 | 2017-08-01 |      4150 |
+----------+------+------------+-----------+


Movable window Frame :

If anyone don’t want to aggregate over all values in partition , then we use moving averages  :

Example  :

TRUNCATE sales;
INSERT INTO sales VALUES ('odin', '2017-03-01', 200),
                         ('odin', '2017-04-01', 300),
                         ('odin', '2017-05-01', 400),
                         ('odin', '2017-06-01', 200),
                         ('odin', '2017-07-01', 600),
                         ('odin', '2017-08-01', 100),
                         ('thor', '2017-03-01', 400),
                         ('thor', '2017-04-01', 300),
                         ('thor', '2017-05-01', 500),
                         ('thor', '2017-06-01', 400),
                         ('thor', '2017-07-01', 600),
                         ('thor', '2017-08-01', 150);                        

mysql> SELECT MONTH(date), SUM(sale),
              AVG(SUM(sale)) OVER (ORDER BY MONTH(date)
               RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sliding_avg
                FROM sales GROUP BY MONTH(date);

+-------------+-----------+-------------+
| month(date) | SUM(sale) | sliding_avg |
+-------------+-----------+-------------+
|           3 |       600 |    600.0000 |
|           4 |       600 |    700.0000 |
|           5 |       900 |    700.0000 |
|           6 |       600 |    900.0000 |
|           7 |      1200 |    683.3333 |
|           8 |       250 |    725.0000 |
+-------------+-----------+----------


Same result we can obtained below complex query  :

WITH sums AS (
   SELECT SUM(t2.sale) AS sum, t2.date FROM sales AS t2  GROUP BY t2.date )
   SELECT t1.date, ( SELECT SUM(sums.sum) / COUNT(sums.sum) FROM sums WHERE      
   MONTH(sums.date) - MONTH(t1.date)  BETWEEN -1 AND 1) 
FROM sales AS t1, sums
   GROUP BY  date
   ORDER BY t1.date;

And even without CTEs .

SELECT t1.date, SUM(sale),
      (SELECT SUM(sums.sum) / COUNT(sums.sum) FROM
      (SELECT SUM(t2.sale) AS sum, t2.date FROM sales AS t2
      GROUP BY t2.date) sums WHERE MONTH(sums.date) - MONTH(t1.date)
      BETWEEN -1 AND 1 ) AS sliding_avg FROM sales AS t1 GROUP BY date
      ORDER BY t1.date;


Comments

Popular posts from this blog

Install Mysql from ZIP without MySQL Installer or Exe

Logging in Python .

Dictionary in Python