4 SQL Aggregation
Coding
SQL
Database
Aggregation
This lecture discusses SQL Aggregation, including computing on a column,
GROUP BY
, and HAVING
clauses.
Aggregation
- Computing on a column
- We often want to compute something across the values in a column
SUM
,AVG
,MIN
, andMAX
- We can also count the number of tuples (rows) by using
COUNT(*)
SELECT *
FROM Took;
SELECT grade
FROM Took;
SELECT AVG(grade)
FROM Took;
SELECT AVG(grade) AS myAVG /* It is recommend to always rename */
FROM Took;
/* One can combine multiple aggregations */
SELECT MAX(grade), AVG(grade), COUNT(*), MIN(sID)
FROM Took;
GROUP BY
- If we follow a
SELECT-FROM-WHERE
expression withGROUP BY <attributes>
- The rows are grouped together according to the values of those attributes, and
- any aggregations is applied only within each group
SELECT oID, AVG(grade) AS offAvg
FROM Took
GROUP BY oID;
- Note:
SELECT
cannot include un-aggregated columns
/* The following will produce an error */
SELECT oID, AVG(grade) AS offAvg, sID
FROM Took
GROUP BY oID;
- Restrictions on Aggregation:
- If any aggregation is used, then each element of the
SELECT
list must be either- aggregated, or
- an attribute on the
GROUP BY
list.
- If any aggregation is used, then each element of the
HAVING
- Sometimes, we want to keep some groups and eliminate others from our result set.
WHERE
let us decide which tuples to keep - performed before the groupingHAVING
decides which groups to keep - performed after the grouping
...
GROUP BY <attributes>
HAVING <condition>
- Only groups satisfying the condition are kept.
- Requirements on
HAVING
clauses:- Outside subqueries,
HAVING
may refer to attributes only if they are either:- aggregated, or
- an attribute on the
GROUP BY
list
- This requirement is the same as
SELECT
clause.
- Outside subqueries,
SELECT oID, AVG(grade) as offAvg
FROM Took
GROUP BY oID
HAVING AVG(grade) > 80;
/* Cannot use offAvg here. HAVING is executed before SELECT*/
/* The following two queries are WRONG / produce undesired results */
SELECT oID, AVG(grade) as offAvg
FROM Took
GROUP BY oID
WHERE AVG(grade) > 80; /* Incorrect! */
SELECT oID, AVG(grade) as offAvg
FROM Took
WHERE AVG(grade) > 80 /* Incorrect! */
GROUP BY oID;
/* Another HAVING example */
SELECT oID, avg(grade) as offAvg
GROUP BY oID
HAVING oID <= 5
ORDER BY oID;
Order of Execution of a SQL Query
Query Order | Execution Order |
---|---|
SELECT |
FROM |
FROM |
WHERE |
WHERE |
GROUP BY |
GROUP BY |
HAVING |
HAVING |
SELECT |
ORDER BY |
ORDER BY |
- However, note that we can order by a column even if it is not selected.
/* The following query is legal. */
SELECT sID
FROM Took
GROUP BY sID
HAVING AVG(grade) > 80
ORDER BY AVG(grade)