Home Basic Aggregate Functions
Post
Cancel

Basic Aggregate Functions

  • Key Aggregation Functions

    • Counts
      • COUNT()
      • COUNT_BIG()
      • COUNT(DISTINCT)
    • Other Aggregates
      • SUM()
      • MIN()
      • MAX()
  • What counts with COUNT()

    • Number of Rows
      • COUNT(*)
      • COUNT(1)
      • COUNT(1/0)
    • Non-NULL Values
      • COUNT(d.YR)
      • COUNT(NULLIF(d.YR, 1990))
  • Distinct Counts

    1
    2
    3
    4
    
          SELECT
              COUNT(DISTINCT c.CalendarYear) AS Years,
              COUNT(DISTINCT NULLIF(c.CalendarYear, 2010)) AS Y2
          FROM dbo.Calendar c;
    
  • Filtering aggregates with CASE

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    
          SELECT
              MAX(CASE WHEN ir.IncidentTypeId = 1
                      THEN ir.IncidentDate
                      ELSE NULL
                  END) AS I1,
              MAX(CASE WHEN ir.IncidentTypeID = 2
                      THEN ir.IncidentDate
                      ELSE NULL
                  END) AS I2,
          FROM dbo.IncidentRollup ir; 
    

  • Statistical Aggregate Functions

    • AVG() : Mean
    • STDEV(): Standard Deviation
    • STDEVP(): Population Stadard Deviation
    • VAR(): Variance
    • VARP(): Population Variance
  • What about median?

    1
    2
    3
    4
    5
    
          SELECT
              TOP(1) PERCENTILE_CONT(0.5)
                  WITHIN GROUP (ORDER BY l.SomeVal DESC)
                  OVER () AS MedianIncidents
          FROM dbo.LargeTable l;
    
  • The cost of median

     MedianMean
    Est. Cost95.7%4.3%
    Duration68.5s0.37s
    CPU68.5s8.1s
    Reads72,560,94639,468
    Writes87,9820

Downsampling and Upsampling Data

  • Data in nature

    1
    2
    3
    
          SELECT 
              SomeDate
          FROM dbo.SomeTable
    

    image

  • Downsampling Data

    1
    2
    3
    
          SELECT
              CAST(SomeDate AS DATE) AS SomeDate
          FROM dbo.SomeTable
    

    image

  • Further Downsampling

    1
    2
    3
    
          SELECT
              DATEADD(HOUR, DATEDIFF(HOUR,0,SomeDate), 0) AS SomeDate
          FROM dbo.SomeTable
    

    DATEDIFF(HOUR,0,'2019-08-11 06:21:16') = 1,048,470
    DATEADD(HOUR< 1048748,0) = 2019-08-11 06:00:00

    image

  • What about upsampling?

    DownsamplingUpsampling
    Aggregate dataDisaggregate data
    Can usually sum or coutn resultsNeed an allocation rule
    Provides a higher-level picture of the dataProvides artifical granularity
    Acceptable for most purposesAcceptable for data generation, calculated averages

Grouping by ROLLUP, CUBE, and GROUPING SETS

  • Hierarchial rollups with ROLLUP

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    
          SELECT
              t.Month,
              t.Day,
              SUM(t.Events) AS Events
          FROM TABLE
          GROUP BY
              t.Month,
              t.Day
          WITH ROLLUP
          ORDER BY
              t.Month,
              t.Day;
    
  • Cartesian aggregation with CUBE

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    
          SELECT
              t.IncidentType,
              t.Office,
              SUM(t.Events) AS Events
          FROM TABLE
          GROUP BY
              t.IncidentType,
              t.Office
          WITH ROLLUP
          ORDER BY
              t.IncidentType,
              t.Office;
    
  • Define grouping sets with GROUPING SETS

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    
          SELECT
              t.IncidenType,
              t.Office,
              SUM(t.Events) AS Events
          FROM TABLE
          GROUP BY GROUPING SETS
          (
              (t.IncidentType, t.Office),
              ()
          )
          ORDER BY
              t.IncidentType,
              t.Office;
    
This post is licensed under CC BY 4.0 by the author.