SELECT Group By
A SELECT statement clause that divides the query result into groups of rows, usually for the purpose of performing one or more aggregations on each group. The SELECT statement returns one row per group.
Syntax​
GROUP BY {
column_expression
} [ ,...n ]
Arguments​
column_expression​
Specifies a column belonging to a blockchain. The column must appear in the FROM clause of the SELECT statement, but is not required to appear in the SELECT list. If a column appears in the select list and is not included in the group-by expression it must be aggregated using an aggregate function.
The column expression cannot contain:
- A column alias that is defined in the SELECT list. It can use a column alias for a derived blockchain that is defined in the FROM clause.
GROUP BY column-expression [ ,...n ]​
Groups the SELECT statement results according to the values in a list of one or more column expressions.
For example, this query creates a sales
table with columns for country
, region
, and sales
.
It inserts four rows and two of the rows have matching values for country
and region
.
CREATE BLOCKCHAIN sales HISTORICAL
(country STRING SIZE = 50 PACKED, region STRING SIZE = 50 PACKED, sales INT32);
INSERT sales VALUES ("Canada", "Alberta", 100), ("Canada", "British Columbia", 200),
("Canada", "British Columbia", 300), ("Canada", "British Columbia", 300),
("Canada", "British Columbia", 300), ("United States", "Montana", 100);
The sales
blockchain contains these rows:
country | region | sales |
---|---|---|
Canada | Alberta | 100 |
Canada | British Columbia | 200 |
Canada | British Columbia | 300 |
United States | Montana | 100 |
This next query groups country
and region
and returns the aggregate sum for each combination of values.
SELECT country, region, SUM(sales) AS totalsales
FROM sales
GROUP BY country, region;
The query result has 3 rows since there are 3 combinations of values for country
and region
. The TotalSales for Canada and British Columbia is the sum of two rows.
country | region | totalsales |
---|---|---|
Canada | Alberta | 100 |
Canada | British Columbia | 500 |
United States | Montana | 100 |
General Remarks​
How GROUP BY interacts with the SELECT statement​
SELECT list:
- Vector aggregates. If aggregate functions are included in the SELECT list, GROUP BY calculates a summary value for each group. These are known as vector aggregates.
WHERE clause:
- bSQL removes rows that do not meet the conditions in the WHERE clause before any grouping operation is performed.
HAVING clause:
- bSQL uses the having clause to filter groups in the result set.
ORDER BY clause:
- Use the ORDER BY clause to order the result set. The GROUP BY clause does not order the result set.
NULL values:
- If a grouping column contains NULL values, all NULL values are considered equal, and they are collected into a single group.
Examples​
The following examples use the DEMO DATABASE.
A. Use a simple GROUP BY clause​
The following example retrieves the total for each secotor
from the companies
blockchain.
SELECT sector, COUNT(*) AS total
FROM financial.companies
GROUP BY sector;
B. Use a GROUP BY clause with multiple tables​
The following example retrieves the number of sectors and the sum of their prices.
SELECT c.sector, COUNT(*) AS count, SUM(p.price) AS totalprice
FROM pricing AS p
JOIN companies AS c
ON c.symbol = p.symbol
GROUP BY c.sector;
C. Use a GROUP BY clause with a HAVING clause​
The following example uses the HAVING
clause to specify which of the groups generated in the GROUP BY
clause should be included in the result set.
SELECT c.sector, COUNT(*) AS count, SUM(p.price) AS totalprice
FROM pricing AS p
JOIN companies AS c
ON c.symbol = p.symbol
GROUP BY c.sector
HAVING SUM(p.price) > 10000;