Grouping & aggregation

Median and percentiles in SQL

3 min read

Most warehouses have an ordered-set aggregate for percentiles; MySQL and SQLite don’t, so they need a window trick.

-- PostgreSQL, Snowflake, DuckDB  (WITHIN GROUP ordered-set aggregate)
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amount) AS median
FROM orders;
-- 0.9 for the 90th percentile, etc. Snowflake and DuckDB also have MEDIAN(amount).

-- BigQuery  (no WITHIN GROUP — use the window form or approx)
SELECT DISTINCT PERCENTILE_CONT(amount, 0.5) OVER () AS median FROM orders;
-- faster on big tables:
SELECT APPROX_QUANTILES(amount, 100)[OFFSET(50)] AS median_approx FROM orders;

MySQL 8+ and SQLite have no built-in — find the middle row(s) with ROW_NUMBER:

-- MySQL 8+ / SQLite 3.25+  (average the middle one or two rows)
SELECT AVG(amount) AS median
FROM (
  SELECT amount,
    ROW_NUMBER() OVER (ORDER BY amount) AS rn,
    COUNT(*)     OVER ()                AS cnt
  FROM orders
) t
WHERE rn IN (FLOOR((cnt + 1) / 2), CEIL((cnt + 1) / 2));

PERCENTILE_CONT vs PERCENTILE_DISC. _CONT interpolates between the two middle values (the usual “median”); _DISC returns an actual value from the data. For an even row count they can differ — the window workaround above mirrors _CONT by averaging the two middle rows.

Open the full version (with copy buttons) ↗

← All recipes