Glossary/SQL Aggregation, GROUP BY, Window Functions & Constraints
SQL & Databases

SQL Aggregation, GROUP BY, Window Functions & Constraints

Summarising data, computing analytics, and enforcing business rules in SQL.


Definition

Aggregation functions (COUNT, SUM, AVG, MIN, MAX) summarise groups of rows into single values. GROUP BY partitions rows into groups for aggregation. HAVING filters those groups. Window functions compute aggregates over a sliding window of rows without collapsing them — enabling ranking, running totals, and moving averages. SQL integrity constraints enforce data correctness automatically at the database layer.

Aggregate functions and GROUP BY

Complete aggregation — e-commerce analytics

-- Orders(OrderID, CustomerID, Amount, Region, OrderDate, Status)

-- Basic aggregation
SELECT
    COUNT(*)              AS TotalOrders,
    COUNT(DISTINCT CustomerID) AS UniqueCustomers,
    SUM(Amount)           AS TotalRevenue,
    AVG(Amount)           AS AvgOrderValue,
    MIN(Amount)           AS SmallestOrder,
    MAX(Amount)           AS LargestOrder
FROM Orders WHERE Status = 'Completed';

-- GROUP BY with HAVING
SELECT
    Region,
    EXTRACT(YEAR FROM OrderDate) AS Year,
    COUNT(*)                     AS OrderCount,
    SUM(Amount)                  AS Revenue
FROM Orders WHERE Status = 'Completed'
GROUP BY Region, EXTRACT(YEAR FROM OrderDate)
HAVING SUM(Amount) > 100000
ORDER BY Revenue DESC;

-- ROLLUP: adds subtotals + grand total
SELECT
    COALESCE(Region, 'ALL')  AS Region,
    COALESCE(Status, 'ALL')  AS Status,
    SUM(Amount) AS Revenue
FROM Orders
GROUP BY ROLLUP(Region, Status);

-- Conditional aggregation (CASE inside SUM)
SELECT
    SUM(CASE WHEN Status = 'Completed' THEN Amount ELSE 0 END) AS CompletedRev,
    SUM(CASE WHEN Status = 'Cancelled' THEN 1     ELSE 0 END) AS CancelCount
FROM Orders;

Window functions — analytics without losing rows

Window functions: ranking, running totals, LAG/LEAD

-- Ranking
SELECT Name, Salary, DeptID,
    ROW_NUMBER() OVER (ORDER BY Salary DESC)                     AS GlobalRank,
    RANK()       OVER (ORDER BY Salary DESC)                     AS RankWithGaps,
    DENSE_RANK() OVER (ORDER BY Salary DESC)                     AS RankNoGaps,
    RANK()       OVER (PARTITION BY DeptID ORDER BY Salary DESC) AS DeptRank,
    NTILE(4)     OVER (ORDER BY Salary DESC)                     AS SalaryQuartile
FROM Employee;
-- ROW_NUMBER: 1,2,3,4,5     (no ties)
-- RANK:       1,2,2,4,5     (tie gets same rank, skips next)
-- DENSE_RANK: 1,2,2,3,4     (tie gets same rank, no skip)

-- Running total and 7-day moving average
SELECT OrderDate, Amount,
    SUM(Amount) OVER (ORDER BY OrderDate
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal,
    AVG(Amount) OVER (ORDER BY OrderDate
                      ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)         AS MovingAvg7Day
FROM Orders WHERE CustomerID = 101;

-- LAG/LEAD: access previous/next row
SELECT OrderDate, Amount,
    LAG(Amount)  OVER (ORDER BY OrderDate) AS PrevAmount,
    LEAD(Amount) OVER (ORDER BY OrderDate) AS NextAmount,
    Amount - LAG(Amount) OVER (ORDER BY OrderDate) AS ChangeFromPrev
FROM Orders WHERE CustomerID = 101;

SQL integrity constraints

All constraint types in one table

CREATE TABLE Product (
    ProductID    SERIAL         PRIMARY KEY,        -- auto-increment + NOT NULL + UNIQUE
    SKU          VARCHAR(20)    NOT NULL UNIQUE,    -- alternate key
    Name         VARCHAR(200)   NOT NULL,
    Price        DECIMAL(10,2)  NOT NULL CHECK (Price >= 0),
    DiscountPct  DECIMAL(5,2)   DEFAULT 0 CHECK (DiscountPct BETWEEN 0 AND 100),
    CategoryID   INT            NOT NULL
                                REFERENCES Category(CategoryID)
                                ON DELETE RESTRICT ON UPDATE CASCADE,
    Stock        INT            NOT NULL DEFAULT 0 CHECK (Stock >= 0),
    -- Table-level constraint across multiple columns
    CONSTRAINT chk_min_sale_price CHECK (Price * (1 - DiscountPct/100) >= Price * 0.1)
);

Practice questions

  1. Difference between RANK() and DENSE_RANK()? (Answer: Both give same rank to ties. RANK() skips next ranks after tie (1,2,2,4). DENSE_RANK() does not skip (1,2,2,3). Use DENSE_RANK for top-N filtering.)
  2. SELECT DeptID, AVG(Salary) FROM Employee HAVING AVG(Salary) > 50000 — is this valid? (Answer: Yes — without GROUP BY, entire table is one group. Returns one row if company average exceeds 50000.)
  3. COUNT(*) vs COUNT(column_name)? (Answer: COUNT(*) counts all rows including NULLs. COUNT(column) counts only non-NULL values in that column.)
  4. What does PARTITION BY do in a window function? (Answer: Resets the window computation for each partition group — like GROUP BY but without collapsing rows. Each partition gets its own independent ranking/aggregation.)
  5. What SQL constraint prevents EndDate before StartDate? (Answer: CHECK (EndDate >= StartDate) — user-defined integrity constraint.)

On LumiChats

Window functions are essential for data analysis. Ask LumiChats: 'Find the top 3 earners in each department with their rank and department average' — generates the complete PARTITION BY and RANK() query with explanation.

Try it free

Try LumiChats for ₹69

39+ AI models. Study Mode with page-locked answers. Agent Mode with code execution. Pay only on days you use it.

Get Started — ₹69/day

Related Terms

3 terms