Home Window Functions
Post
Cancel

Window Functions

  • Grouping data in T-SQL

    1
    2
    3
    4
    5
    6
    7
    
          SELECT
              SalesPerson,
              SalesYear,
              CurrentQuota,
              ModifiedDate
          FROM SaleGoal
          WHERE SalesYear = 2011
    
  • Window syntax in T-SQL

    • Create the window with OVER caluse
    • PARTITION BY creates the frame
    • If you do not include PARTITION BY the frame is the entire table
    • To arrange the results, use ORDER BY
    • Allows aggregations to be created at the same time as the window
    1
    2
    
          -- Create a Window Data Grouping
              OVER (PARTITION BY SalesYear ORDER BY SalesYear)
    
  • Window Functions (SUM)

    1
    2
    3
    4
    5
    6
    7
    8
    
          SELECT
              SalesPerson, 
              SalesYear,
              CurrentQuota,
              SUM(CurrentQuota)
              OVER (PARTITION BY SalesYear) AS YearlyTotal,
              ModifiedDate AS ModDate
          FROM SaleGoal
    
  • Window Functions (COUNT)

    1
    2
    3
    4
    5
    6
    7
    8
    
          SELECT
              SalesPerson,
              SalesYear,
              CurrentQuota,
              COUNT(CurrentQuota)
              OVER (PARTITION BY SalesYear) AS QuotaPerYear,
              ModifiedDate as ModDate
          FROM SaleGoal 
    

Common Window Functions

  • FIRST_VALUE() and LAST_VALUE()

    • FIRST_VALUE() returns the first value in the window
    • LAST_VALUE() returns the last value in the window

      Note that for FIRST_VALUE and LAST_VALUE the ORDER BY command is required

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    
          -- Select the columns
          SELECT 
              SalesPerson,
              SalesYear,
              CurrentQuota,
              -- First value from every window
              FIRST_VALUE(CurrentQuota)
              OVER (PARTITION BY SalesYear ORDER BY ModifiedDate) AS StartQuota,
              -- Last value from every window
              LAST_VALUE(CurrentQuota)
              OVER (PARTITION BY SalesYear ORDER BY ModifiedDate) AS
              EndQuota,
              ModifiedDate as ModDate
          FROM SaleGoal
    

    image

  • Gettting the next value with LEAD()

    • Provides the ability to query the value from the next row
    • NextQuota Column is created by using LEAD()
    • Requires the use of ORDER BY to order the rows
    1
    2
    3
    4
    5
    6
    7
    8
    9
    
          SELECT
              SalesPerson,
              SalesYear,
              CurrentQuota,
          -- Create a window function to get the values from the next row
              LEAD(CurrentQuota)
              OVER (PARTITION BY SalesYear ORDER BY ModifiedDate) AS NextQuota,
              ModifiedDate AS ModDate
          FROM SaleGoal
    
  • Getting the previous value with LAG()

    • Provides the ability to query the value from the previous row
    • PreviousQuota Column is created by using LAG()
    • Requires the use of ORDER BY to order the rows
    1
    2
    3
    4
    5
    6
    7
    8
    9
    
          SELECT
              SalesPerson,
              SalesYear,
              CurrentQuota,
          -- Create a window function to get the values from the previous row
              LAG(CurrentQuota)
              OVER (PARTITION BY SalesYear ORDER BY ModifiedDate) AS PreviousQuota,
              ModifiedDate AS ModDate
          FROM SaleGoal
    

Increasing Window Complexity

  • Reviewing Aggregations

    1
    2
    3
    4
    5
    6
    7
    8
    
          SELECT
              SalesPerson, 
              SalesYear,
              CurrentQuota,
              SUM(CurrentQuota)
              OVER (PARTITION BY SalesYear) AS YearlyTotal,
              ModifiedDate AS ModDate
          FROM SaleGoal
    
  • Adding ORDER BY to an aggregation

    1
    2
    3
    4
    5
    6
    7
    8
    
          SELECT
              SalesPerson,
              SalesYear,
              CurrentQuota,
              SUM(CurrentQuota)
              OVER (PARTITION BY SalesYear ORDER BY SalesPerson) AS YearlyTotal,
              ModifiedDate AS ModDate
          FROM SaleGoal
    
  • Creating a running total with ORDER BY

    1
    2
    3
    4
    5
    6
    7
    8
    
          SELECT
              SalesPerson,
              SalesYear,
              CurrentQuota,
              SUM(CurrentQuota)
              OVER (PARTITION BY SalesYear ORDER BY ModifiedDate) AS RunningTotal,
              ModifiedDate AS ModDate
          FROM SaleGoal
    
  • Adding row numbers

    • ROW_NUMBER()sequentially numbers the rows in the window
    • ORDER BY is required when using ROW_NUMBER()
    1
    2
    3
    4
    5
    6
    7
    
          SELECT
              SalesPerson,
              SalesYear,
              CurrentQuota,
              ROW_NUMBER()
              OVER (PARTITION BY SalesPerson ORDER BY SalesYear) AS QuotabySalesPerson
          FROM SaleGoal
    

Using Windows For Calculating Statistics

  • Calculating the Standard Deviation

    • Calculate standard deviation either for the entire table or for each window
    • STDEV() calculates the standard deviation

    Calculating the standard deviation for the entire table

    1
    2
    3
    4
    5
    6
    7
    8
    
          SELECT
              SalesPerson,
              SalesYear,
              CurrentQuota,
              STDEV(CurrentQuota)
              OVER() AS StandardDev,
              ModifiedDate as ModDate
          FROM SaleGoal
    

    Calculating the standard deviation for each partition

    1
    2
    3
    4
    5
    6
    7
    8
    
          SELECT 
              SalesPerson,
              SalesYeaer,
              CurrentQuota,
              STDEV(CurrentQuota)
              OVER (PARTITION BY SalesYear ORDER BY SalesYear) AS StDev,
              ModifiedDate AS ModDate
          FROM SaleGoal
    
  • Calculating the Mode

    • Mode is the value which appears the most often in your data
    • To calculate mode:
      • Create a CTE containing an ordered count of values using ROW_NUMBER
      • Write a query using the CTE to pick the value with the highest row number
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    
          WITH QuotaCount AS (
              SELECT
                  SalesPerson,
                  SalesYear,
                  CurrentQuota,
                  ROW_NUMBER()
                  OVER (PARTITION BY CurrentQuota ORDER BY CurrentQuota) AS QuotaList
              FROM SaleGoal
          )
            
          SELECT 
              CurrentQuota, 
              QuotaList AS Mode
          FROM QuotaCount
          WHERE QuotaList In (SELECT MAX(QuotaList) FROM QuotaCount)
    
  • Advanced Window Functinos

    image

This post is licensed under CC BY 4.0 by the author.