SQL GROUP BY Clause
The SQL GROUP BY statement is used in conjunction with the aggregate functions to arrange identical data into groups.
The SQL GROUP BY statement is used to collect data across multiple records and group the result-set by one or multiple columns.
In the SQL GROUP BY statement must contains a aggregate function in the SQL query.
The SQL GROUP BY statement can be applied in any valid SQL SELECT statement.
Related Links
Sql GROUP BY Syntax
The basic syntax is:
SELECT column_name1, column_name2, aggregate_function(column_name)
FROM table_name
GROUP BY column_name1, column_name2;
The order of group by statement will look the below statement:
SELECT column_name1, column_name2, aggregate_function(column_name)
FROM table_name
[WHERE condition]
GROUP BY column_name1, column_name2
[ORDER BY column_name1, column_name2];
Note:
- In every GROUP BY statement must contain atleast one aggregate function.
- You can group only the column which you listed in SELECT statement.
Sample Database Table - Books
BookId | BookName | BookPrice | RelYear | DomainName | AuthorName |
---|---|---|---|---|---|
1 | Oracle Interview Questions | 120 | 2015 | Optimization | Balan |
2 | SQL Puzzles & Answers | 136.33 | 2014 | Database | Balan |
3 | The Microsoft Data Warehouse | 84.22 | 2015 | Performance | Siva Kumar |
4 | Jump Start MySQL | 178.69 | 2012 | Performance | Padmavathi |
5 | Head First SQL | 150 | 2011 | Performance | Balan |
6 | Pro SQL Azure | 200 | 2015 | Optimization | Keshavan |
SQL GROUP BY Example
The following SQL statement will calculates total price into "DomainName" wise from the "Books" table:
SELECT DomainName, SUM(BookPrice) As 'Total Price'
FROM Books
GROUP BY DomainName;
The result of above query is:
DomainName | Total Price |
---|---|
Database | 136.33 |
Optimization | 320 |
Performance | 412.91 |
The following SQL statement will calculates total price into "DomainName" wise and select records that has "RelYear" < 2014 from the "Books" table:
SELECT DomainName, SUM(BookPrice) As 'Total Price'
FROM Books
WHERE RelYear < 2014
GROUP BY DomainName;
The result of above query is:
DomainName | Total Price |
---|---|
Database | 136.33 |
Performance | 328.69 |
SQL GROUP BY More Than One Column Example
The following SQL statement will calculates total price into "DomainName" and "RelYear" wise and select records that has "RelYear" greater than "2010" and less than "2015" from the "Books" table:
SELECT DomainName, RelYear, SUM(BookPrice) As 'Total Price'
FROM Books
WHERE RelYear > 2010 AND RelYear < 2015
GROUP BY DomainName, RelYear;
The result of above query is:
DomainName | RelYear | Total Price |
---|---|---|
Database | 2014 | 136.33 |
Performance | 2011 | 150 |
Performance | 2012 | 178.69 |
The following SQL statement will calculates total price into "DomainName" and "RelYear" wise, than order the result by "RelYear" column from the "Books" table:
SELECT DomainName, RelYear, SUM(BookPrice) As 'Total Price'
FROM Books
WHERE RelYear > 2010 AND RelYear < 2015
GROUP BY DomainName, RelYear
ORDER BY RelYear;
The result of above query is:
DomainName | RelYear | Total Price |
---|---|---|
Performance | 2011 | 150 |
Performance | 2012 | 178.69 |
Database | 2014 | 136.33 |
Related Links
SQL Groupby 5 Minutes
Example 1: If you require a timing compensation to keep a transaction running after it has failed abnormally. The application must be run every 5 minutes as per the requirement. The logic is straightforward. Simply type a query statement.
SELECT * FROM table name t WHERE condition one
AND condition two
AND t. time <= CONCAT(date_format(DATE_SUB(NOW(), INTERVAL 5 MINUTE),'%Y%m%d%H%i%S'),'000')
LIMIT 0, 1000 GROUP BY date;
Example 2: Every n seconds, a monitoring system collects data (n is approximately 10 but varies). I'd like to group the data into five-minute intervals. Is there a way to group the timestamp information into 5-minute pieces so that grouping works:
SELECT FLOOR(UNIX_TIMESTAMP(timestamp)/(5 * 60)) AS timekey
FROM table
GROUP BY timekey;
SQL Groupby 7 Days
Method 1: The GROUP BY clause is used with DATEPART() to group data by week. We also use it to display the week number in the SELECT clause.
SQL Server uses the DATEFIRST option to determine which weekday should be considered the first day of the week.
If the DATEFIRST value is the first day of the week is:
1 Monday
2 Tuesday
3 Wednesday
4 Thursday
5 Friday
6 Saturday
7 Sunday
DATEPART() with week uses the week numbering:
- If Sunday is the first day of the week (DATEFIRST = 7), Week 2 starts on Sunday 6 January and ends on Saturday 12 January.
- If Monday is the first day of the week (DATEFIRST = 1), Week 2 starts on Monday 7 January and ends on Sunday 13 January.
- If Thursday is the first day of the week (DATEFIRST = 4), Week 2 starts on Thursday 3 January and ends on Wednesday 9 January.
Note: The DATEFIRST setting depends on SQL Server's language version. The default value for U.S. English is 7 (i.e. Sunday).
Have a look at the query below:
SELECT DATEPART(week, RegistrationDate) AS Week,
COUNT(CustomerID) AS Registrations
FROM Customers
WHERE '20180101' <= RegistrationDate
AND RegistrationDate < '20190101'
GROUP BY DATEPART(week, RegistrationDate)
ORDER BY DATEPART(week, RegistrationDate);
Output:
Week Registrations
1 62
2 112
... ...
52 98
Look at the article How to Get the First Day of the Week if you have trouble reading week numbers.
The week where the year finishes and the next begins is frequently split when using DATEPART() with week. To put it another way, the last few days of December fall into weeks 52/53 of the previous year, while the first few days of January fall into week 1 of the new year.
Method 2: Using the MySQL WEEK function, you may get the date's week number. Based on the Server System Variables, the start of the week is set to Sunday or Monday by default. Otherwise, it can be set while the function is being used.
So the query
SELECT SUM(cost) AS total, CONCAT(date, ' - ', date + INTERVAL 6 DAY) AS week
FROM daily_expense
GROUP BY WEEK(date)
ORDER BY WEEK(date)
Output:
+-------+-------------------------+
| total | week |
+-------+-------------------------+
| 83 | 2010-07-31 - 2010-08-06 |
| 427 | 2010-08-07 - 2010-08-13 |
| 345 | 2010-08-14 - 2010-08-20 |
| 389 | 2010-08-21 - 2010-08-27 |
| 318 | 2010-08-28 - 2010-09-03 |
| 404 | 2010-09-04 - 2010-09-10 |
| 272 | 2010-09-11 - 2010-09-17 |
| 40 | 2010-09-13 - 2010-09-19 |
+-------+-------------------------+
SQL Groupby Alias
In SQL Server, create a group by for the alias column. In general, we can't utilise alias column names in query functions like where, group by, order by, and so forth. To implement group by for the alias column, every sql query statements must be wrapped in a subquery and grouping must be done outside of the statement.
Because of the logical sequence of processing, SQL Server does not enable you to reference the alias in the GROUP BY clause. The alias is not known when the GROUP BY clause is analysed because it is executed before the SELECT clause.
Example 1: Using the alias in the ORDER BY clause:
SELECT ItemName, FirstLetter, COUNT(ItemName)
FROM(
SELECT ItemName, SUBSTRING(ItemName, 1, 1) AS FirstLetter FROM table1
) ItemNames
Example 2: To implement group by to alias column name in sql server we need to write the query like as shown below:
Select Joindate,count(Id) as NumberofEmployees
FROM (
Select Id,Name,CreatedDate as Joindate from @temp
) subdata
group by Joindate
If you observe above query “JoinDate” is an alias column name in sub query and we are using same “JoinDate” alias column in group by outside of that statements in sql server.
Example 3: Because the GROUP BY clause is run before the SELECT clause in Oracle and SQL Server, you can't utilize a term defined in the SELECT clause in the GROUP BY clause.
SELECT
itemName as ItemName,
substring(itemName, 1,1) as FirstLetter,
Count(itemName)
FROM table1
GROUP BY itemName, FirstLetter
SQL Groupby All
Example 1: Kirkland and Tacoma aren't listed because no employees from these cities were employed in or after 1993. The GROUP BY clause usually only affects rows that remains after the WHERE filter has been applied.
You can produce empty groups—or groups with zero rows for cities that the WHERE clause filters out—by adding the ALL option to the GROUP BY clause. The consequences of appending the ALL option to the GROUP BY clause are shown in Table 3:
SELECT City, COUNT(*) AS
NumEmps FROM dbo.Employees
WHERE HireDate >= '19930101'
GROUP BY ALL City;
Kirkland and Tacoma now appear in the query results, each with a count of 0, showing that no employees from these cities were hired in or after 1993. Including these cities in the result set indicates that the Employees table contains employees from these cities—just not employees hired in or after 1993.
Get the same result without using the GROUP BY ALL option, but the alternatives are more complex.
Example 2: SQL GROUP BY ALL
SELECT start_Date,
SUM(salary) Total
FROM employee
WHERE start_Date BETWEEN '7/1/2001' AND '7/31/2005'
GROUP BY ALL Region;
GO
Output:
Column 'employee.start_Date' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Example 3: SQL you may not be aware of: The "ALL" option when using a GROUP BY:
replace the SUM(CASE…) expression by using GROUP BY ALL, instead of just a standard GROUP BY, like this:
select CustomerID, sum(Amount) as TotalAmount
from Sales
where ProductID = 1
group by all CustomerID
Output:
CustomerID TotalAmount
———– ———————
1 1010.00
2 NULL
3 7730.00
Warning: Null value is eliminated by an aggregate or other SET operation.
SQL Groupby and Orderby
GROUP BY and ORDER BY are clauses (or statements) that perform the same task of sorting query results.
GROUP BY and ORDER BY are two significant SQL keywords for organising data. They serve extremely distinct goals, so much so that they can be used individually or simultaneously.
As a result of sorting, data with similar values will be grouped together, making it easier for you to manually evaluate afterwards. However, GROUP BY is the SQL method for analyzing comparable data.
The DESC keyword is used to sort the result set in descending order.
It's frequently used with aggregate functions like AVG(), MAX(), COUNT(), MIN(), and so on. One thing to keep in mind regarding the group by clause is that tuples are grouped based on their attribute values' similarity.
Without aggregation operations, GROUP BY has no actual purpose. GROUP BY can sometimes provide results in sorted order, but you shouldn't count on it. The sequence in which results are returned is non-deterministic and is determined by the query's execution by the database engine.
The GROUP BY statement comes before the ORDER BY statement since the latter works with the query's final result.
Groupby | Orderby |
---|---|
The ORDER BY clause sorts the query result based on certain columns. | The goal of the GROUP BY clause is to sum up unique combinations of column values. |
The group by statement is used to group rows with the same value together. | The Order by statement, on the other hand, sorts the result set in ascending or descending order. |
It's possible that it'll be allowed in the CREATE VIEW statement. | It isn't used in the CREATE VIEW statement. |
It is always used before the order by keyword in a select statement. | It is always used after the group by keyword in a select statement. |
An attribute cannot be in the aggregate function's group by statement. | When using the order by statement method, the attribute can be placed under the aggregate function. |
The tuples are sorted in the group by clause based on their attribute values' matching. | The result-set is sorted in ascending or descending order in the order by clause. |
The display of tuples is controlled by grouping (rows). | The order by clause determines how columns are displayed. |
The WHERE clause is followed by the GROUP BY clause. | The ORDER BY clause comes after the GROUP BY clause. |
Order By Syntax : SELECT column_1, column_2, column_3........... FROM Table_Name ORDER BY column_1, column_2, column_3....... ASC|DESC; Table_Name: Name of the table. ASC: keyword for ascending order DESC: keyword for descending order |
Group By Syntax : SELECT function_Name(column_1), column_2 FROM Table_Name WHERE condition GROUP BY column_1, column_2 ORDER BY column_1, column_2; |
Example 1: This situation occasionally tricks people into believing that GROUP BY sorts results. In actuality, there is no guarantee that GROUP BY will display results in ascending order. If you need results in a specific order, you have to do it yourself like below:
SELECT style,
COUNT(Name)
FROM beers
GROUP BY style
ORDER BY style
Therefore, GROUP BY works nicely along with ORDER BY.
Example 2: Before we get into their differences consider the general setup of the SELECT statement:
SELECT columnlist
From table
GROUP BY columnA, columnB
ORDER BY columnlist
The ORDER BY clause is at the end of the list. This can help you figure out that the ORDER BY statement is used to order the query's final result. In reality, it can be used to arrange GROUP BY clause results.
SQL Groupby Calculated Column
Example 1: Add the same calculation to the GROUP BY clause:
select dateadd(day, -7, Convert(DateTime, mwspp.DateDue)
+ (7 - datepart(weekday, mwspp.DateDue))),
sum(mwspp.QtyRequired)
from manufacturingweekshortagepartpurchasing mwspp
where mwspp.buildScheduleSimID = 10109 and mwspp.partID = 8366
group by dateadd(day, -7, Convert(DateTime, mwspp.DateDue)
+ (7 - datepart(weekday, mwspp.DateDue)))
order by dateadd(day, -7, Convert(DateTime, mwspp.DateDue)
+ (7 - datepart(weekday, mwspp.DateDue)))
Example 2: GROUP BY calculated column:
SELECT PAYMENTNO, AMOUNT * 100 AS CENTS
FROM PENALTIES
GROUP BY CENTS
ORDER BY CENTS;
Output:
+-----------+----------+
| PAYMENTNO | CENTS |
+-----------+----------+
| 5 | 2500.00 |
| 7 | 3000.00 |
| 4 | 5000.00 |
| 2 | 7500.00 |
| 1 | 10000.00 |
+-----------+----------+
5 rows in set (0.00 sec)
SQL Groupby Case
In SQL, the case statement returns a value based on a condition. Along with the Where, Order By, and Group By clauses, we may utilize a Case statement in select queries. It's also possible to utilize it in the Insert statement. We'll look at the CASE statement and its different applications in this post.
Example 1: Compute the total count of products such as "SQL" and "Oracle" using the query below, which includes a Group By clause and a CASE statement:
SELECT
CASE WHEN Products LIKE '%SQL%' THEN 'MSSQL'
WHEN Products LIKE '%ORACLE%' THEN 'Oracle'
ELSE NULL END Products ,
SUM(Count) OverallCount
FROM @Tmp
GROUP BY CASE WHEN Products LIKE '%SQL%' THEN 'MSSQL'
WHEN Products LIKE '%ORACLE%' THEN 'Oracle'
ELSE NULL END
Example 2: We can use the CASE expression we’ve built to create summary groups. In the following SQL we’re grouping the data by PriceRange. Summary statistics on the minimum, maximum, and average ListPrice are created.
SELECT
CASE
WHEN ListPrice = 0 THEN 'No Price'
WHEN ListPrice > 0 AND ListPrice <= 50 THEN 'Low'
WHEN ListPrice > 50 AND ListPrice <= 150 THEN 'Medium'
WHEN ListPrice > 150 AND ListPrice <= 500 THEN 'High'
WHEN ListPrice > 500 AND ListPrice <= 1000 THEN 'Very High'
WHEN ListPrice > 1000 AND ListPrice <= 2000 THEN 'Highest'
WHEN ListPrice > 2000 THEN 'Luxury'
ELSE 'UNLISTED'
END as PriceRange,
Min(ListPrice) as MinPrice,
Max(ListPrice) as MaxPrice,
AVG(ListPrice) as AvgPrice,
Count(ListPrice) as NumberOfProducts
FROM Production.Product
GROUP BY CASE
WHEN ListPrice = 0 THEN 'No Price'
WHEN ListPrice > 0 AND ListPrice <= 50 THEN 'Low'
WHEN ListPrice > 50 AND ListPrice <= 150 THEN 'Medium'
WHEN ListPrice > 150 AND ListPrice <= 500 THEN 'High'
WHEN ListPrice > 500 AND ListPrice <= 1000 THEN 'Very High'
WHEN ListPrice > 1000 AND ListPrice <= 2000 THEN 'Highest'
WHEN ListPrice > 2000 THEN 'Luxury'
ELSE 'UNLISTED'
END
ORDER BY MinPrice
Unlike the ORDER BY clause, we can’t reference the column alias PriceRange in the GROUP BY. The entire CASE expression has to be repeated.
Example 3: Repeat the entire CASE statement, you should remove the AS year_group column naming when you copy/paste into the GROUP BY clause:
SELECT CASE WHEN year = 'FR' THEN 'FR'
WHEN year = 'SO' THEN 'SO'
WHEN year = 'JR' THEN 'JR'
WHEN year = 'SR' THEN 'SR'
ELSE 'No Year Data' END AS year_group,
COUNT(1) AS count
FROM benn.college_football_players
GROUP BY CASE WHEN year = 'FR' THEN 'FR'
WHEN year = 'SO' THEN 'SO'
WHEN year = 'JR' THEN 'JR'
WHEN year = 'SR' THEN 'SR'
ELSE 'No Year Data' END
Example 4: When there are naming conflicts or when the output column (the CASE expression) is confused with the source column result, which has different content.
GROUP BY model.name, attempt.type, attempt.result
You need to GROUP BY your CASE expression instead of your source column:
GROUP BY model.name, attempt.type
,CASE WHEN attempt.result = 0 THEN 0 ELSE 1 END
Example 5: CASE statement in a GROUP BY clause.
Because a given collection of data may be needed in slightly different contexts by different groups inside the company, you'll see this a lot. This, like many of my example queries, should be written better. It, like many of my sample queries, reflects what I observe in the wild (and for those keeping track at home, I'm now testing with the WideWorldImporters database).
CREATE PROCEDURE dbo.InvoiceGrouping (@x INT)
AS
SELECT SUM(il.UnitPrice),
COUNT(i.ContactPersonID),
COUNT(i.AccountsPersonID),
COUNT(i.SalespersonPersonID)
FROM Sales.Invoices AS i
JOIN Sales.InvoiceLines AS il
ON il.InvoiceID = i.InvoiceID
GROUP BY CASE WHEN @x = 7 THEN i.ContactPersonID
WHEN @x = 15 THEN i.AccountsPersonID
ELSE i.SalespersonPersonID
END;
GO
Running this for any given value above, 7, 15 or other, you’ll get the same execution plan, regardless of the column used in the GROUP BY. However, Parameter Sniffing is still something of a factor. When you group this data by SalesPersonID, you only get 10 rows back. This will be shown as the estimated number of rows returned if some value other than 7 or 15 is used as a parameter.
SQL Groupby Concatenate Column
To concatenate strings in MySQL using GROUP BY, use GROUP_CONCAT() with a SEPARATOR parameter of comma(') or space (' '), for example.
This function returns a string containing all of the non-NULL values from a group concatenated together.
Syntax:
GROUP_CONCAT(expr);
Example 1: script to perform concatenation of strings by applying GROUP BY in SQL Server.
SELECT [EmpID],
STUFF((
SELECT ', ' + [EmpName] + ':' + CAST([DeptName] AS VARCHAR(MAX))
FROM tbl_GroupStringTable
WHERE (EmpID = Results.EmpID)
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)')
,1,2,'') AS NameValues
FROM tbl_GroupStringTable Results
GROUP BY EmpID
Example 2: use GROUP BY to concatenate the strings in MySQL. Perform GROUP BY on the basis of Id and concatenate the strings using GROUP_CONCAT() function in MySQL.
select Id,group_concat(Name SEPARATOR ',') as GroupConcatDemo from GroupConcatenateDemo
group by Id;
Output:
+------+-----------------+
| Id | GroupConcatDemo |
+------+-----------------+
| 10 | Larry,Elon,Bob |
| 11 | Mike,Sam |
| 12 | John |
+------+-----------------+
SQL Groupby Count
The query will produce the number of items as the count for each subgroup produced based on the table column values or expressions if the COUNT() function is used in conjunction with GROUP BY. A set of identical values (on a column) will be handled as a separate entity.
The count() function is an aggregate function that is used to determine the number of rows that meet the specified criteria. When we use the COUNT() function in conjunction with the GROUP BY clause, the number of counts for each of the given grouped rows in the table query will be displayed.
Using the table column value or any expression, the GROUP BY clause creates a cluster of rows into a type of summary table row. We also utilize MySQL aggregate functions with the GROUP BY clause to group entries having a calculated value in the column. MAX, MIN, COUNT, SUM, and AVG are examples of this, which are used with a SELECT statement to offer information about each group in the result set.
Syntax:
SELECT attribute1 , COUNT(attribute2)
FROM table_name
GROUP BY attribute1
Example 1: Query to find the number of students in each branch:
SELECT stu_branch, COUNT(stu_id) AS number_of_students
FROM student_marks
GROUP BY stu_branch
Example 2: To get data of 'working_area' and number of agents for this 'working_area' from the 'agents' table with the following condition:
'working_area' should come uniquely, the following SQL statement can be used :
SELECT working_area, COUNT(*)
FROM agents
GROUP BY working_area;
Example 3: To determine the number of payment transactions completed by each staff, group the entries in the payment table by the values in the staff_id column and use the COUNT() function to get the number:
SELECT staff_id,
COUNT (payment_id)
FROM
payment
GROUP BY
staff_id;
The GROUP BY clause separates the payment rows into groups and organises them according to the value in the staff id_column. It uses the COUNT() function to return the number of rows for each group.
Example 4: MySQL GROUP BY COUNT is a query that groups data based on a specific column value, and each group has its own count number that represents the group's identical values. The following SQL statement is used to perform this information in the result set rows.
SELECT CustomerName, City, COUNT(*) FROM Customer GROUP BY City;
SQL Groupby Count Condition
The groupby clause with some count condition.
Example 1: Query that counts the number of Upgrades and group by name:
SELECT Name, COUNT(Upgraded) as 'Total Upgrades'
WHERE Upgrade = '1'
GROUP BY Name
Output:
Name Total Upgrades
Jones 2
Smith 1
Brown 3
Other wise, Use a HAVING clause to filter an aggregated column with group by and condition.
SELECT id, count(oID)
FROM MyTable
GROUP BY oID
HAVING count(oID) = 1
Example 2: Query to GROUP BY COUNT with WHERE clause:
mysql> SELECT * FROM GroupByWithWhereClause
-> WHERE IsDeleted= 0 AND MoneyStatus= 'done'
-> GROUP BY SUBSTR(UserId,1,3)
-> HAVING COUNT(*) > 1
-> ORDER BY Id DESC;
Output:
+----+-----------+-------------+--------+
| Id | IsDeleted | MoneyStatus | UserId |
+----+-----------+-------------+--------+
| 18 | 0 | done | 106 |
| 8 | 0 | done | 103 |
| 2 | 0 | done | 101 |
+----+-----------+-------------+--------+
SQL Groupby Count Distinct
For each GROUP BY result, the COUNTDISTINCT function returns the number of unique values in a field.
COUNTDISTINCT can only be used on single-assigned characteristics, not multi-assigned ones. The use of a multi-assign attribute produces erroneous results.
Example 1: The project's slow query has both COUNT(DISTINCT) and GROUP BY. The query response time is quite slow. The query of 260,000 records takes 18 seconds. Here's the sql code:
SELECT
a.channel_code AS channelCode,
a.channel_name AS channelName,
DATE_FORMAT(a.create_date, '%Y') AS orderDate,
COUNT(DISTINCT a.order_no) AS orderCount,
COUNT(DISTINCT a.user_id) AS userCount,
SUM(a.payment) AS totalAmount,
SUM(a.content_rate_fee) AS rateAmount
FROM
user_order a
WHERE
a.del_flag = '0'
AND a.create_date >= '2017'
AND a.create_date <= '2018'
GROUP BY
a.channel_code,
DATE_FORMAT(a.create_date, '%Y') DESC
LIMIT 80,20
Example 2: For example, for the following records:
- Record 1: Size=small, Color=red
- Record 2: Size=small, Color=blue
- Record 3: Size=small, Color=red
- Record 4: Size=small
The following statement returns for each size the number of different values for the Color attribute:
RETURN result AS
SELECT COUNTDISTINCT (Color) as Total
GROUP BY Size
Output:
There is only one group, and hence only one record, because all of the records have the same Size value. Because the Color attribute has two distinct values: red and blue, the Total value for this group is 2.
Example 3: COUNT DISTINCT and GROUP BY:
select Coder
, count(distinct course)
, count(*)
from offerings
group by Coder;
Output:
CODER COUNT(DISTINCTCOURSE) COUNT(*)
---------- --------------------- ----------
1 2 3
4 2 2
8 2 2
11 1 1
13 2 2
3 3
SQL Groupby Date Range
Example 1: The months, quarters, and years are organized as follows. Between the zero date and the provided date, the integer number of equivalent units is added back to the zero date. That places us at the start of the period. The end is found in the same way, with the exception that we add the number that is one bigger than the difference. That provides us the start of the next period, from which we subtract one day to get the right ending date.
SELECT PeriodStart,
PeriodEnd,
Count = SUM(Count)
FROM (
SELECT
PeriodStart = CASE @Frequency
WHEN 'day' THEN Date
WHEN 'week' THEN DATEADD(DAY, 1 - DATEPART(WEEKDAY, Date), Date)
WHEN 'month' THEN DATEADD(MONTH, DATEDIFF(MONTH, 0, Date), 0)
WHEN 'quarter' THEN DATEADD(QUARTER, DATEDIFF(QUARTER, 0, Date), 0)
WHEN 'year' THEN DATEADD(YEAR, DATEDIFF(YEAR, 0, Date), 0)
END,
PeriodEnd = CASE @Frequency
WHEN 'day' THEN Date
WHEN 'week' THEN DATEADD(DAY, 7 - DATEPART(WEEKDAY, Date), Date)
WHEN 'month' THEN DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, Date) + 1, 0))
WHEN 'quarter' THEN DATEADD(DAY, -1, DATEADD(QUARTER, DATEDIFF(QUARTER, 0, Date) + 1, 0))
WHEN 'year' THEN DATEADD(DAY, -1, DATEADD(YEAR, DATEDIFF(YEAR, 0, Date) + 1, 0))
END,
Count
FROM atable
WHERE Date BETWEEN @DateStart AND @DateEnd
) s
GROUP BY
PeriodStart,
PeriodEnd
EXEC spReport '1/1/2011', '12/31/2011', 'day':
Output:
PeriodStart PeriodEnd Count
----------- ---------- -----
2011-11-15 2011-11-15 6
2011-12-16 2011-12-16 9
2011-12-17 2011-12-17 2
2011-12-18 2011-12-18 5
Example 2: GROUP BY DATE Range:
WITH FixDate AS (SELECT CAST(chardate as date) AS realdate
FROM #temp),
MyRange AS (SELECT CASE WHEN datediff(year,RealDate,getdate()) < 13 THEN '<13'
WHEN datediff(year,RealDate,getdate()) between 13 and 18 THEN '13-18'
WHEN datediff(year,RealDate,getdate()) between 19 and 35 THEN '19-35'
WHEN datediff(year,RealDate,getdate()) between 36 and 55 THEN '36-55'
WHEN datediff(year,RealDate,getdate()) > 55 THEN '>55' END AS [Range]
FROM FixDate)
SELECT [Range], count(1)
FROM MyRange
GROUP BY [Range]
Example 3: day to begin and end at 04:00:00 as in my earlier query.
I have tried several queries without success, but am unsure of how to proceed. For instance:
SELECT `date`, SUM(`quantity`)
FROM `liquids`
WHERE ((`date` = DATE_ADD(`date`, INTERVAL 0 DAY) AND `time` > '04:00:00')
OR
(`date` = DATE_ADD(`date`, INTERVAL 1 DAY) AND `time` <= '04:00:00'))
GROUP BY `date`;
Example 4: Group on -almost- anything you want to. If you can get something in a column to show what you want, you can group on it. So for just two periods you could simply group on an if:
SELECT * FROM table
GROUP BY if(stamp between "2005-07-01" and "2006-02-01", 0, 1)
SQL Groupby Distinct
GROUP BY and DISTINCT It was a comparison that revealed that GROUP BY is a preferable option than DISTINCT in most cases.
By deleting duplicates, the DISTINCT and GROUP BY clauses minimise the amount of returned rows in the result set.
When you want to apply an aggregate function to one or more columns, however, you should use the GROUP BY clause.
Example 1: Patients with at least two weight readings above 150 on the same visit:
SELECT DISTINCT p.patient_id
FROM PATIENTS p
WHERE p.weight_val > 150
GROUP BY p.patient_id, p.visit_id
HAVING COUNT(*) >= 2
Example 2: It is equivalent to the following query that uses the DISTINCT operator :
SELECT DISTINCT
city,
state,
zip_code
FROM
sales.customers;
GROUP BY city;
Example 3: One of the query comparisons I provided in that post was for a sub-query between a GROUP BY and DISTINCT, demonstrating that the DISTINCT is significantly slower since it must fetch the Product Name for every entry in the Sales database, rather than only for each individual ProductID. This is clear from the query plans, which show that the Aggregate in the first query works with data from just one table rather than the join results. Oh, and the results of both queries are the same: 266 rows.
select distinct od.ProductID,
(select Name
from Production.Product p
where p.ProductID = od.ProductID) as ProductName
from Sales.SalesOrderDetail od;
SQL Groupby Exists
Here we have discussed how SQL EXIST can work with GROUP BY in a select statement.
To get 'cust_code', 'cust_name', 'cust_city' and 'grade' from the 'customer' table, with following conditions:
'grade' in 'customer' table must be 2, more than 2 agents are present in grade 2, 'grade' in customer table should make a group, the following SQL statement can be used :
SELECT cust_code,cust_name,cust_city,grade
FROM customer
WHERE grade=2 AND
EXISTS(
SELECT COUNT(*) FROM customer
WHERE grade=2
GROUP BY grade
HAVING COUNT(*)>2);
SQL Groupby First Characters
Example 1: perform some aggregation function on EMPLOYEE_ID if you want that to work :
select substr(first_name,1,1) as alpha, count(employee_id)
from employees
group by substr(first_name,1,1)
Example 2: group by the first 5 characters of the address too, you can do this:
select firstname, MAX(address) AS Address
from t
group by firstname, SUBSTRING(address,0,5)
Example 3: SQL Server also allows you to specify an expression for the columns in the GROUP BY clause, as I have down in the code below:
SELECT CONVERT(CHAR(7),H.OrderDate,120) AS [YYYY-MM]
, SUM(LineTotal) AS SummarizedLineTotal
FROM AdventureWorks2012.Sales.SalesOrderDetail D
JOIN AdventureWorks2012.Sales.SalesOrderHeader H
ON D.SalesOrderId = H.SalesOrderID
GROUP BY CONVERT(CHAR(7),H.OrderDate,120)
ORDER BY SummarizedLineTotal DESC;
This code groups the data by the year and month of the OrderDate. By using the expression CONVERT(CHAR(7),H.OrderDate,120) I told SQL Server to take the first seven characters of the OrderDate ODBC canonical date format, which is the YYYY-MM portion of the OrderDate. Being able to summarize my data based on this expression allowed me to determine the total SummarizeLineTotal value for a given year and month. By using an expression in the GROUP BY clause and ordering based on the LineTotal amount I can determine which year and month had the greatest and least SummarizeLineTotal amount.
SQL Groupby vs Having
Main Article :- Sql difference between GROUP BY and HAVING Clause
The SQL HAVING clause allows you to filter data after it has been grouped using the GROUP BY clause in SQL.
The aggregate function with the GROUP BY clause is similar to the Having Clause. With aggregate functions, the HAVING clause is utilised instead of the WHERE clause. The GROUP BY Clause, on the other hand, groups rows with the same values into summary rows.
The where clause is combined with the having clause to discover rows that meet particular criteria. After the group, the having clause is always utilised. By clause.
The SQL SELECT statement is combined with the GROUP BY Clause. Only column names, aggregate functions, constants, and expressions can be utilized in the SELECT statement used in the GROUP BY clause.
When all of the data in a table is grouped together. There will be occasions when we need to limit the results to a certain set of criteria. In such circumstances, the HAVING clause can be used.
If you add the conditions after the term HAVING, any entries that fulfil these requirements will be displayed after the GROUP BY, while the rest will be hidden.
Groupby | Having |
---|---|
MYSQL is a database management system. BY GROUPING Clause is a type of expression that collects data from numerous records and returns a record with one or more columns. | SQL The GROUP BY clause's Having Clause is used to limit the results returned. |
Having an additional condition to the query. | The groupby clause organizes data into groups based on a specific column or row. |
In aggregate functions, having can be used without the groupby clause, in which case it works as a where clause. | With the select statement, groupby can be used without the having clause. |
Aggregate functions can be included in the having clause. | It isn't possible to have aggregate functions in it. |
It uses several conditions to limit the query output. | It divides the output into categories based on certain rows or columns. |
Example 1: Suppose we want to know all the release years for movie category id 8. We would use the following script to achieve our results.
SELECT * FROM `movies` GROUP BY `category_id`,`year_released` HAVING `category_id` = 8;
Output:
movie_id title director year_released category_id
9 Honey mooners John Schultz 2005 8
5 Daddy's Little Girls NULL 2007 8
Note only movies with category id 8 have been affected by our GROUP BY clause.
Example 2: sql groupby having:
SELECT COUNT (SALARIES) AS COUNT_SALARIES, EMPLOYEES
FROM EMPLOYEES
GROUP BY SALARIES
HAVING COUNT(SALARIES) > 1;
Example 3: Let’s say we have our average student grade per student. And let’s say we want to only show students where their average is above 52.
SELECT student_id,
AVG(student_grade) AS avg_grade
FROM sample_group_table
GROUP BY student_id
HAVING AVG(student_grade) >= 52
ORDER BY student_id;
Output:
STUDENT_ID AVG_GRADE
2 52.85352273
3 53.276
7 53.86227723
9 53.02617021
You can see there are a few records here, and these are the only students with an average grade of 52 or over.
SQL Groupby Like
LIKE function is not supported in the GROUP BY clause.
Example 1: To overcome those problem use this below query:
SELECT x.term,
COUNT(*)
FROM (SELECT CASE
WHEN CHARINDEX('Fall_2009', t.column) > 0 THEN
SUBSTRING(t.column, CHARINDEX('Fall_2009', t.column), LEN(t.column))
WHEN CHARINDEX('Spring_2009', t.column) > 0 THEN
SUBSTRING(t.column, CHARINDEX('Spring_2009', t.column), LEN(t.column))
ELSE
NULL
END as TERM
FROM TABLE t) x
GROUP BY x.term
Example 2:
SELECT
EXTRACT(YEAR FROM date_created) AS yr, -- for each year
EXTRACT(MONTH FROM date_created) AS mth, -- & month combination
count(*) AS c, -- count all rows
SUM(CASE WHEN confirmed = 1 THEN 1 ELSE 0 END) -- count only confirmed rows
FROM users
WHERE country = 'DE'
GROUP BY yr, mth
SQL Groupby Month
SQL GROUP BY month is used to acquire grouped data in a summarised format in SQL tables depending on each of the months.
When we need monthly data from an organisation, such as sales, submissions, or tests, we can use the group by clause and define the grouping criteria as the month record value present in the table records, or we can simply retrieve the month value using SQL functions from date and time data typed column values to group the results by month.
- We must utilize the datename function in SQL to arrange queries by month name. There are two parameters in this function.
- The first is interval; day names should be written as dw, while month names should be written as m or mm.
- The date parameter is the second one. We may also use the month method to get just the month index.
Syntax:
The syntax of the group by month clause is as follows:
SELECT
column1, column2,..., columnm, aggregate_function(columni)
FROM
target_table
WHERE
conditions_or_constraints
GROUP BY expressionDerivingMonthOfColumn ;
expressionDerivingMonthOfColumn – This is the column that will be used as a criterion in the MYSQL query to construct groups based on month value. The criterion can be assigned to a specific column name or several column names. We can even mention expressions as a criterion for grouping, which can include SQL methods for retrieving month from date and time-related data types. It could alternatively be the name of a single column that stores the month value for those records.
Example 1: List the number of students born each month:
Select DateName(mm,birthdate) as MName,count(*) as Count from students
group by DATEName(mm,birthdate)
Example 2: The retrieved records will only be a single record for the rows with the same values for month, and data will be displayed in a monthly format. Group the resultset of the educba_ articles table contents based on a column named month so that the retrieved records will only be a single record for the rows with the same values for month. We'll also get the average monthly rate and the month's name from the query statement. Our query statement will be as follows :
SELECT
AVG(a.` rate`),a.` month`
FROM
educba_articles a
GROUP BY a.` month`;
Example 3: SQL query to group by month name.
select date_format(order_date, '%M'),sum(amount)
from sales
group by date_format(order_date, '%M');
Output:
+-------------------------------+-------------+
| date_format(order_date, '%M') | sum(amount) |
+-------------------------------+-------------+
| December | 600 |
| November | 400 |
| October | 250 |
+-------------------------------+-------------+
In the above SQL query we use date_format(order_date, “%M”) to convert a date column into month name, and use sum column to add up the sales amounts.
SQL Groupby Month and Year
The MySQL function DATE FORMAT() can be used to group months and years. Also utilized is the GROUP BY clause.
Syntax:
SELECT DATE_FORMAT(yourColumnName, '%m-%Y') from yourTableName
GROUP BY MONTH(yourColumnName), YEAR(yourColumnName)DESC;
Example 1: Let’s run the below code to count the employe hire_date in month and year:
SELECT
DATE_TRUNC('month',hire_date)
AS hire_date_to_month,
DATE_TRUNC('year',hire_date)
AS hire_date_to_year,
COUNT(emp_no) AS count
FROM employee
GROUP BY DATE_TRUNC('month',hire_date),DATE_TRUNC('year',hire_date);
Example 2: Query to group by month and year is as follows:
select DATE_FORMAT(DueDate, '%m-%Y') from GroupMonthAndYearDemo
GROUP BY MONTH(DueDate) , YEAR(DueDate)DESC;
Output:
+-------------------------------+
| DATE_FORMAT(DueDate, '%m-%Y') |
+-------------------------------+
| 12-2020 |
| 12-2019 |
| 12-2018 |
| 12-2017 |
| 12-2016 |
+-------------------------------+
From the output displaying month and year grouped by using GROUP BY.
Example 3:
SELECT CAST(MONTH(date) AS VARCHAR(2)) + '-' + CAST(YEAR(date) AS VARCHAR(4)) AS Mjesec,
SUM(marketingExpense) AS SumaMarketing, SUM(revenue) AS SumaZarada
FROM [Order]
WHERE (idCustomer = 1) AND (date BETWEEN '2001-11-3' AND '2011-11-3')
GROUP BY CAST(MONTH(date) AS VARCHAR(2)) + '-' + CAST(YEAR(date) AS VARCHAR(4))
SQL Groupby Multiple Columns
GROUP BY several columns in SQL is a technique that allows you to group records depending on a criterion. Another benefit of this strategy is that we can group more than one table column at a time. SUM, MAX, MIN, AVG, COUNT, and other aggregate functions are frequently employed with it.
When a single column is used for grouping, all records with the same value for that column are combined into a single record for the resultset. When grouping criteria are established on several columns, all of the values of those columns must be the same as the values of other columns in order for them to be considered for grouping into a single record.
The purpose of grouping by is to bring together records that have the same values for the grouping criteria.
Syntax:
SELECT
column1, column2,..., columnm, aggregate_function(columni)
FROM
target_table
WHERE
conditions_or_constraints
GROUP BY criteriacolumn1 , criteriacolumn2,...,criteriacolumnj;
The GROUP BY clause has the following syntax. When we need to summarise and reduce the resultset, we use this optional clause in the select clause. In the SELECT clause, it should always come after the FROM and WHERE clauses. Some of the concepts used in the preceding syntax are defined further below :
- column1, column2,…, column – These are the names of the columns of the target_table table that need to retrieved and fetched in the resultset.
- aggregate_function(column) – These are the aggregate functions defined on the columns of target_table that needs to be retrieved from the SELECT query.
- target_table – Name of the table from where the result is to be fetched.
- conditions_or_constraints – If you want to apply certain conditions on certain columns they can be mentioned in the optional WHERE clause.
- criteriacolumn1 , criteriacolumn2,…,criteriacolumnj – These are the columns that will be used as criteria in the MYSQL query to generate the groups. The criterion can be applied to a single column name or several column names. We can even use phrases as a criterion for grouping. The alias cannot be used as a grouping criteria in the GROUP BY clause in SQL.
Note: multiple criteria of grouping should be mentioned in a comma-separated format.
Example 1: GROUP BY usage, let's say student might be able to assign the same subject to his name more than once (as shown in table Students_Subjects). In this scenario we might be able to count number of times each subject was assigned to a student by GROUPing by more than one column:
SELECT Students.FullName, Subjects.Subject,
COUNT(Students_subjects.Subject_id) AS NumberOfOrders
FROM ((Students_Subjects
INNER JOIN Students
ON Students_Subjcets.Student_id=Students.Id)
INNER JOIN Subjects
ON Students_Subjects.Subject_id=Subjects.Subject_id)
GROUP BY Fullname,Subject
Output:
FullName Subject SubjectNumber
Matt Jones Maths 2
Matt Jones P.E 1
Frank Blue P.E 1
Frank Blue Physics 1
Anthony Angel Maths 1
Example 2: GROUP BY using multiple columns:
The data will be grouped by customer and year of order date, and then all rows of a specific customer will be further subdivided by year of order date. As a result, there will be one row for each combination of customer and order year.
SELECT customer_name,
YEAR(order_date) order_year,
SUM(total_orders) total_orders
FROM orders
GROUP BY customer_name, YEAR(order_date)
ORDER BY customer_name, order_year
Output:
customer_name order_year total_orders
----------------- ----------- ------------
Jack 2019 2
Jack 2020 14
John 2018 45
John 2019 70
John 2020 120
Rose 2019 8
Rose 2020 32
There is one row for each unique combination of customer and year of order date.
Example 3: we will group the resultset of the educba_learnning table contents based on sessions and expert_name columns so that the retrieved records will only a single record for the rows having the same values for sessions and expert_name collectively. Our query statement will be as follows:
SELEC sessions,
expert_name
FROM
educba_learning
GROUP BY sessions,
expert_name ;
Example 4: let us try to group students’ first and last names based on the sum of their grades as the aggregate function. We can perform this operation with the following code:
SELECT SUM(stu_grade), stu_lastName, stu_firstName
FROM student_details
GROUP BY stu_lastName, stu_firstName;
Output:
SUM(stu_grade) stu_lastName stu_firstName
40 Sanghavi Preet
50 John Rich
60 Brow Veron
70 Jos Geo
80 Shah Hash
90 Parker Sachin
25 Miller David
50 Joh Richa
60 Brow Verona
70 Josh Geoa
80 Ash Hasha
90 Parker Allen
The output, as can be seen, groups the columns stu_firstName and stu_lastName. In MySQL, we can also group several columns. As a result, using the approaches stated above, the GROUP BY statement can be used effectively with one or several columns.
SQL Groupby Multiple Tables
Example 1: Three different tables and I need to count how many rows has a specific value (COUNT(track)) then group by tracks.id and sort by only one result per user (tracks.uid).
My three tables:
`users`
+--------+-------------------+
| `idu` | `username` |
+--------+-------------------+
| 567 | 'TrainingPuppy' |
| 687 | 'BathroomMakeover'|
| 45 | 'PoshNachos' |
| 15 | 'SewingButtons' |
+--------+-------------------+
`views`
+--------+------+---------+
| `id` | `by` | `track` |
+--------+------+---------+
| 1 | 5 | 55 |
| 2 | 5 | 55 |
| 3 | 67 | 55 |
| 4 | 6 | 29 |
| 5 | 125 | 2 |
| 6 | 5 | 698 |
| 7 | 5 | 698 |
+--------+------+---------+
`tracks`
+--------+-------+-----------------------+---------------------+
| `id` | `uid` | `title` | `time` |
+--------+-------+-----------------------+---------------------+
| 2 | 15 | 'Worf is in the air' | 2016-02-11 22:57:35 |
| 29 | 567 | 'Stargold' | 2016-08-11 22:57:28 |
| 55 | 567 | 'No love liers' | 2016-10-11 22:57:51 |
| 698 | 567 | 'Lofe' | 2016-11-11 22:57:44 |
+--------+-------+-----------------------+---------------------+
Query:
SELECT `views`.`track`, `tracks`.*, `users`.*, COUNT(`track`) as `count`
FROM `views`,`tracks`,`users`
WHERE `views`.`track` = `tracks`.`id`
AND `tracks`.`uid` = `users`.`idu`
GROUP BY `tracks`.`uid`
ORDER BY `count`
DESC LIMIT 0, 20
Output:
+--------+-------+---------+---------------------+----------+-----------------------+---------------------+
| `id` | `uid` | `count` | `username` | `track` | `title` | `time` |
+--------+-------+---------+---------------------+----------+-----------------------+---------------------+
| 29 | 567 | 6 | 'TrainingPuppy' | 29 | 'Stargold' | 2016-10-11 22:57:51 |
| 2 | 15 | 1 | 'SewingButtons' | 2 | 'Worf is in the air' | 2016-02-11 22:57:35 |
+--------+-------+---------+---------------------+----------+-----------------------+---------------------+
Example 2:
SELECT t1.TimeStamp, t2.State, t1.OtherData
FROM Table1 t1
inner join Table2 t2
on t1.SpecialNumber = t2.SpecialNumber
inner join (SELECT MAX(time stamp) maxts, state
FROM table1 inner join table2
ON table1.specialnumber = table2.specialnumber
GROUP BY state) t3
on t2.State = t3.State and t1.TimeStamp = t3.maxts
SQL Groupby Null
SUM(), AVG(), MAX(), MIN(), COUNT(), and other aggregation methods can be used with the GROUP BY clause (). In this post, we'll show you how to use the GROUP BY clause with NULL values.
If the grouping column is null, that row is treated as a separate group in the results. The null values in the grouping column constitute a single group if there are more than one.
When two NULL values are compared, the outcome is NULL (not TRUE). The two NULL values are not treated as equivalent. If the same constraint were applied to the GROUP BY clause, SQL would require each entry with a NULL grouping column to be placed in its own group.
Simply put, if a grouping column contains multiple null values, the null values are grouped together.
Example 1: This example uses group by and the advance column, which contains some null values:
select advance, avg(price * 2)
from titles
group by advance
Output:
advance
------------------ -----------------
NULL NULL
0.00 39.98
2000.00 39.98
2275.00 21.90
4000.00 19.94
5000.00 34.62
6000.00 14.00
7000.00 43.66
8000.00 34.99
10125.00 5.98
15000.00 5.98
Example 2: Grouping by NULL Values:
Our base table had no null values in the instances we've seen so far in the course. Let's have a look at how to group a table with null values.
select grp_a, grp_b, count(*)
from t2
group by grp_a, grp_b
order by grp_a, grp_b ;
Output:
GRP_A GRP_B COUNT(*)
---------- ---------- ----------
A1 X1 1
A1 X2 1
A1 (null) 3
A2 (null) 1
Example 3: Creating a new group for each entry with a NULL in a grouping column is inconvenient and useless, so SQL designers made it such that NULL values are treated the same in GROUP BY clauses. As a result, if two rows have NULL values in the same grouping field and matching values in the non-NULL grouping columns, the DBMS will group the rows together :
SELECT A, B, SUM(amount_purchased) AS 'C'
FROM customers
GROUP BY A, B
ORDER BY A, B
Output:
A B C
NULL NULL 61438.0000
NULL 101 196156.0000
AZ NULL 75815.0000
AZ 103 36958.0000
CA 101 78252.0000
LA NULL 181632.0000
for CUSTOMERS that contain the following rows.
A B amount_purchased
NULL NULL 45612.00000
NULL NULL 15826.00000
NULL 101 45852.0000
NULL 101 74815.0000
NULL 101 75489.0000
AZ NULL 75815.0000
AZ 103 36958.0000
CA 101 78252.0000
LA NULL 96385.0000
LA NULL 85247.0000
Example 4: The GROUP BY Clause and NULL Values:
Let's start by executing a simple SQL query with both the GROUP BY clause and NULL values:
SELECT department
FROM employee
GROUP BY department;
Output:
department
1.
2. IT
3. FINANCES
Note: For clarification, I've included a numbered list here; normally, the results would be displayed as an unnumbered list.
The first result value is a NULL, which is represented by an empty string (the empty line before the IT department). This empty area reflects all of the NULL values returned by the GROUP BY clause, indicating that NULLs are treated as valid values by GROUP BY.
SQL Groupby and Orderby
GROUP BY and ORDER BY are both clauses (or statements) that perform the same task of sorting query results. However, each of these serves a distinct purpose, so distinct that they can be used individually or in tandem.
In SQL, group by is used to organize related data into groups.
In SQL, the command Order By is used to sort data in ascending or descending order.
The ORDER BY clause is used to sort the results of a query by one or more columns. With the use of aggregate functions like COUNT(), AVG, MIN(), and MAX(), the GROUP BY clause is used to group data into groups. It works like this: if a column has the same values in different rows, it will combine those rows into a group.
Example 1: In the following example records are ordered by the actor_id field, which is what results are grouped on. If we wanted to order results using different - i.e. non-grouped - fields, we would have to add an ORDER BY clause. Here's the same query, but ordered by the number of films which each actor has appeared in, from most to least:
SELECT count(film_actor.actor_id) AS num_of_films_appeared_in, actor.actor_id,
actor.first_name,
actor.last_name
FROM actor INNER JOIN film_actor ON actor.actor_id = film_actor.actor_id
GROUP BY film_actor.actor_id
ORDER BY film_actor.actor_id, last_name, first_name DESC;
Example 2: To get 'agent_code' and 'agent_name' columns from the table 'agents' and sum of 'advance_amount' column from the table 'orders' after a joining, with following conditions:
'agent_code' of 'agents' and 'orders' must be same, the same combination of 'agent_code' and 'agent_name' of 'agents' table must be within a group, 'agent_code' of 'agents' table should arrange in an order, default is ascending order, the following SQL statement can be used:
SELECT agents.agent_code,agents.agent_name,
SUM(orders.advance_amount)
FROM agents,orders
WHERE agents.agent_code=orders.agent_code
GROUP BY agents.agent_code,agents.agent_name
ORDER BY agents.agent_code;
SQL Groupby Rules
With the help of various functions, the SQL GROUP BY Statement is used to sort identical data into groups.
For example, if a column has the same values in different rows, it will group these rows together.
Syntax:
SELECT column1, function_name(column2)
FROM table_name
WHERE condition
GROUP BY column1, column2
ORDER BY column1, column2;
- function_name: Name of the function used for example, SUM() , AVG().
- table_name: Name of the table.
- condition: Condition used.
Important Points:
- With the SELECT statement, the GROUP BY clause is used.
- The GROUP BY clause comes after the WHERE clause in the query.
- If any ORDER BY clauses are used in the query, GROUP BY is placed before them.
RULES:-
Rule 1: As a result, the following statement is incorrect: the TOWN column exists in the SELECT clause, but it is not an aggregation function parameter and does not appear in the list of columns used to aggregate the results.
SELECT TOWN, COUNT(*)
FROM PLAYERS
GROUP BY PLAYERNO
Rule 2: In most cases, the expressions used to build groups appear in the SELECT clause as well. However, this isn't required. The SELECT clause can contain expressions that exist in the GROUP BY clause.
Rule 3: Inside a compound expression, an expression that is used to generate groups can also appear in the SELECT clause.
Rule 4: If an expression occurs twice or more in a GROUP BY clause, double expressions are simply removed. The GROUP BY clause GROUP BY TOWN, TOWN is converted to GROUP BY TOWN. Also, GROUP BY SUBSTR(TOWN,1,1), SEX, SUBSTR(TOWN,1,1) is converted to GROUP BY SUBSTR(TOWN,1,1), SEX. Therefore, it has no use for double expressions.
Rule 5: The rules for SELECT statements without a GROUP BY clause are the same as the rules for SELECT statements with a GROUP BY clause. We add a rule for SELECT statements with a GROUP BY clause: DISTINCT is redundant when the SELECT clause includes all the columns indicated in the GROUP BY clause (if used outside an aggregation function). The GROUP BY phrase organises the rows so that the columns in which they're grouped don't have any duplicate values.
Example: The most significant is that any data that isn't specified as a parameter to GROUP BY requires an aggregation function. Consider the following question:
SELECT genre_id, unit_price FROM tracks GROUP BY genre_id;
Because the database is unsure what to do with unit_price, it throws an error. While each group has only one genre_id, there are numerous unit_prices. Without any sort of aggregation method, they can't all be output as a single value.
SQL Groupby Sum
When a GROUP BY clause is used, SUM is applied.
The aggregate functions provide a summary of the data in the table. After the rows have been separated into groups, the aggregate functions are used to return only one value for each group. It is preferable to utilise the GROUP BY clause in the query result to identify each summary row. An aggregate function must be performed to all columns other than those indicated in the GROUP BY clause.
Example 1: SQL Server SUM() function with GROUP BY example:
The following statement finds total stocks by store id:
SELECT store_id,
SUM(quantity) store_stocks
FROM
production.stocks
GROUP BY
store_id;
- To begin, the GROUP BY clause separated the stocks into groups based on the store id.
- Second, the total stocks for each group are calculated using the SUM() method.
Example 2: sql group by sum with condition:
SELECT CityID,
COUNT(*) NumberOfDogsInThisCity,
SUM(CASE WHEN DogName = 'Tedy' THEN Wheight ELSE 0 END) WheightOfAllDogWithNameTedy
FROM Dogs
GROUP BY CityID
Example 3: In a single column, use the Sum function. The total income of each department will be calculated using the aggregate sum transact query:
SELECT [Occupation],
SUM(YearlyIncome) AS [Total Income]
FROM [Customer]
GROUP BY Occupation
Example 4: SQL Group by Sum function and Multiple Columns example.
SELECT [Occupation],
Education,
SUM(YearlyIncome) AS [Total Income]
FROM [Customer]
GROUP BY Occupation, Education
Example 5: To get data of 'agent_code' and the sum of 'advance_amount' for each individual 'agent_code' from the 'orders' table with the following condition:
Same 'agent_code' should not come twice, the following SQL statement can be used :
SELECT agent_code,
SUM (advance_amount)
FROM orders
GROUP BY agent_code;
SQL Groupby Timestamp
Example 1: Group datetime fields by year/month by using a simple ‘format’. Of course, there are other possibilities for doing this. See example below:
declare @format varchar(100) = 'yyyy/MM'
SELECT format(timestamp,@format), count(*)
FROM [MY_DATABASE].[dbo].[MY_TABLE]
group by format(timestamp,@format)
order by format(timestamp,@format) desc;
Example 2:
SELECT created_at::date, COUNT(*)
FROM table_name
WHERE product_uuid = '586d8e21b9529d14801b91bd' AND
created_at > now() - interval '30 days'
GROUP BY created_at::date
ORDER BY created_at::date ASC;
Your version is aggregating by each date/time value but only showing the date component.
Example 3: A group of 30 second periods that state the form of a specific time. '2012-01-01 00:00:00' is the time. DATEDIFF calculates the difference in seconds between the time stamp value and the current time. The total is then divided by 30 to obtain the grouping column.
SELECT MIN(TimeColumn) AS TimeGroup, SUM(Quantity) AS TotalQuantity FROM YourTable
GROUP BY (DATEDIFF(ss, TimeColumn, '2012-01-01') / 30)
Each group's minimum time stamp will be output as TimeGroup. However, maximum or even grouping column values can be converted to time and shown again.
Example 4: Group by minute, hour, day, week, etc., it's tempting to just group by your timestamp column, however, then you'll get one group per second, which is likely not what you want. Instead, you need to "truncate" your timestamp to the granularity you want, like minute, hour, day, week, etc. The function you need here is DATE_FORMAT:
SELECT
date_format(created_at,'%Y-%m-%d %H-%i'), -- leave out -%i if you want to group by hour
count(1)
FROM users
GROUP BY 1;
SQL Groupby Union and Unionall
To improve load performance, very large tables are sometimes segmented into several small tables and accessed using a UNION ALL in a view. For certain very specific queries using such a view with a GROUP BY, the Sybase IQ optimizer is able to enhance performance by copying some GROUP BY operations into each arm of such a UNION ALL, performing the operations in parallel, then combining the results. This strategy, known as split GROUP BY, decreases the quantity of data handled by the top level GROUP BY, resulting in faster query processing.
Example 1: Union first, then aggregate. The following example uses a subquery, but you can use a temp table instead if you prefer:
Select StartTime, Row_Date, sum(acdcalls+abncalls)...[other sums here]
From (
select * from db1
union all
select * from db2
) a
group by StartTime, RowDate
You can still have your where clauses and your specific columns in the subquery if necessary (the example above will only work if db1 and db2 have the same columns in the same order - otherwise you will need to specify your columns). I am not sure why you want to group by Row_Date if you are limiting both of your selects to Row_Date = GetDate(), though.
Example 2: Query and UNION these queries (this is the same as the following query).
SELECT EmployeeCode, DepartmentCode, LocationCode, SUM(salary) TotalCost
FROM #EmployeeMaster
Group BY EmployeeCode, DepartmentCode, LocationCode
UNION
SELECT NULL AS EmployeeCode, DepartmentCode, NULL AS LocationCode, SUM(salary) TotalCost
FROM #EmployeeMaster
Group BY DepartmentCode
UNION
SELECT NULL AS EmployeeCode, NULL AS DepartmentCode, LocationCode, SUM(salary) TotalCost
FROM #EmployeeMaster
Group BY LocationCode
UNION
SELECT NULL AS EmployeeCode, NULL AS DepartmentCode, NULL AS LocationCode, SUM(salary) TotalCost
FROM #EmployeeMaster
Example 3: Only certain queries with a GROUP BY over a UNION ALL show a performance improvement. The following simple query, for example, benefits from the split GROUP BY:
CREATE VIEW vtable (v1 int, v2 char(4)) AS
SELECT a1, a2 FROM tableA
UNION ALL
SELECT b1, b2 FROM tableB;
SELECT COUNT(*), SUM(v1) FROM vtable GROUP BY v2;
SQL Groupby Where
GROUP BY is the command that is used to aggregate the results.
The GROUP BY clause divides the members of the rows into groups (It is like a Pivot to grouping columns member to see aggregated value). The columns that you specify in the GROUP BY clause determine the groups.
Syntax:
SELECT
select_list
FROM
table_name
WHERE Condition
GROUP BY
column_name1,
column_name2, ..
Example 1: The WHERE clause goes before the GROUP BY:
select cu.CustomerID,cu.FirstName,cu.LastName, COUNT(si.InvoiceID)as inv
from Customer as cu
inner join SalesInvoice as si
on cu.CustomerID = si.CustomerID
where cu.FirstName = 'mark'
group by cu.CustomerID,cu.FirstName,cu.LastName
Example 2: Below Query Product_Category is appear only For individual Product_Container:
SELECT
Product_Category, Product_Container
FROM
[dbo].[Customers_Tbl]
WHERE
([Product_Category] = 'Furniture')
OR
([Product_Category] = 'Office supplies')
GROUP BY
Product_Category, Product_Container
SQL Groupby with Join
We can use the Group by clause with aggregate functions in SQL Inner Join to group the result set by one or more columns. The traditional Group by method uses an Inner Join on the final result produced after connecting two or more tables. If you're unfamiliar with the Group by clause in SQL, I recommend reading here to get a quick grasp of the concept.
Example 1: Below is the code that makes use of Group By clause with the Inner Join.
SELECT pz.CompanyCity, pz.CompanyName, SUM(f.UnitsSold) AS TotalQuantitySold
FROM PizzaCompany pz
INNER JOIN Foods f ON pz.CompanyId = f.CompanyId
GROUP BY pz.CompanyCity, pz.CompanyName
ORDER BY pz.CompanyCity
Here, we intend to obtain total items sold by each Pizza company present in the City. As you can see below, aggregated result in ‘totalquantitysold’ column as 18 (7+11) and 9 (6+3) for Los Angeles and San Diego respectively is computed.
Example 2: Query returns order number, order status, and total sales from the orders and orderdetails tables using the INNER JOIN clause with the GROUP BYclause:
SELECT t1.orderNumber,
t1.status,
SUM(quantityOrdered * priceEach) total
FROM
orders t1
INNER JOIN orderdetails t2
ON t1.orderNumber = t2.orderNumber
GROUP BY orderNumber;
Example 3: The Left Join can also be used with the GROUP BY clause. The following statement returns customer id, customer name, qualification, price, and date using the Left Join clause with the GROUP BY clause.
SELECT customers.customer_id, cust_name, qualification, price, date
FROM customers
LEFT JOIN orders ON customers.customer_id = orders.customer_id
GROUP BY price;
Example 4:
ID name salary
----------- ---------- -----------
1 Jason 1234
2 Robert 4321
3 Celia 5432
4 Linda 3456
5 David 7654
6 James 4567
7 Alison 8744
8 Chris 9875
9 Mary 2345
select * from job;
GO
Output:
ID title averageSalary
----------- ---------- -------------
1 Developer 3000
2 Tester 4000
3 Designer 5000
4 Programmer 6000
SELECT count(e.id)
FROM Employee e LEFT JOIN job j
ON e.ID = j.ID
group by j.ID
ORDER BY j.ID
GO
SQL Groupby without Aggregate Function
If you use the GROUP BY clause in a SELECT statement without employing aggregate functions, the result will be the same as if you used the DISTINCT clause.
Consider utilising the DISTINCT option rather than the GROUP BY clause without the aggregate function, because the DISTINCT option is faster and produces the same results.
If you don't have an aggregate function, don't utilise the GROUP BY clause.
To retrieve different rows, the GROUP BY clause was utilised without an aggregate function.
SELECT ColumnA ,
ColumnB
FROM T
GROUP BY ColumnA ,
ColumnB
Example 1: For example, we have the following table :
Select * from Student_info;
+------+---------+------------+------------+
| id | Name | Address | Subject |
+------+---------+------------+------------+
| 101 | YashPal | Amritsar | History |
| 105 | Gaurav | Chandigarh | Literature |
| 125 | Raman | Shimla | Computers |
| 130 | Ram | Jhansi | Computers |
| 132 | Shyam | Chandigarh | Economics |
| 133 | Mohan | Delhi | Computers |
| 150 | Saurabh | NULL | Literature |
+------+---------+------------+------------+
Using GROUP BY clause as follows, we can get the same result set as we got by using DISTINCT :
Select ADDRESS from Student_info GROUP BY Address;
Output:
+------------+
| ADDRESS |
+------------+
| NULL |
| Amritsar |
| Chandigarh |
| Delhi |
| Jhansi |
| Shimla |
+------------+
6 rows in set (0.00 sec)
There is a distinction between the two result sets returned by MySQL: the result set returned by MySQL query using the GROUP BY clause is sorted, whereas the result set returned by MySQL query using the DISTICT clause is not.
Example 2: GROUP BY Without Aggregate Functions Not Valid in PostgreSQL:
There are two uses of the query SELECT * FROM {weather_config} WHERE uid < 0 GROUP BY uid ORDER BY uid DESC.
Using GROUP BY in this manner is MySQL-specific, violates ANSI requirements, and causes a PostgreSQL error.
Following my study and testing of the code, I believe that SELECT DISTINCT uid FROM {weather config} WHERE uid < 0 ORDER BY uid DESC does what the code requires. With both setup and use, my testing in PostgreSQL (8.3) provides what I presume is the expected behaviour of the module.
I haven't tried it in MySQL yet, but it's very typical SQL, so I expect it to work. The actual modifications can be found in the attached patch file.