SQL GROUP BY Vs PARTITION BY Statement
When building a complex report in SQL, the PARTITION BY and GROUP BY clauses are frequently used.
Returning the raw data is often useful (and sometimes required), but reports usually need calculations over groups of rows as well.
This is where GROUP BY and PARTITION BY come in. Both of them group rows, but there are important differences between them. In this article, we'll look at those differences.
SQL GROUP BY Statement
In SQL queries, the GROUP BY clause is used to define groups based on a set of criteria.
These criteria commonly appear in reports as categories. The following are some examples of grouping criteria:
- Employees grouped by their yearly salary level.
- Trains grouped by their first stop.
- Incomes and costs grouped by month.
- Students grouped by the class they are enrolled in.
The GROUP BY clause turns the data into a new result set in which the original rows are divided into groups based on our criteria.
More information on the GROUP BY clause may be found in this article.
On these groups, we can perform further operations, most of which involve aggregate functions.
In short, an aggregate function collapses each group into a single summary row, so the original row-level details are lost from the query result.
There are several aggregate functions, but COUNT, SUM, AVG, MIN, and MAX are the most widely used.
Example: We use the following BookAdvance table to demonstrate GROUP BY:
ID | Name | Price | Year | Domain |
---|---|---|---|---|
1 | Sql Server Concurrency | 99.99 | 2020 | Database |
2 | Red Gate Guide to SQL Server | 71.87 | 2018 | Optimization |
3 | Simply SQL Server | 71.87 | 2021 | Security |
4 | SQL in a Nutshell | 168.27 | 2016 | Optimization |
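Run the following query, which uses the GROUP BY clause. Name is listed in GROUP BY as well, because SQL Server (and standard SQL) rejects a selected column that is neither aggregated nor part of the grouping:
MySQL, SQLite, SQL Server, MS Access
SELECT Year, Name
FROM BookAdvance
GROUP BY Year, Name
ORDER BY Year;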
Output: The result will be:
year | name |
---|---|
2016 | SQL in a Nutshell |
2018 | Red Gate Guide to SQL Server |
2020 | Sql Server Concurrency |
2021 | Simply SQL Server |
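Every year occurs only once in the sample data, so the query above does not actually merge any rows. The sketch below groups by Domain instead and applies several of the aggregate functions mentioned earlier. It is only an illustration: it assumes the sample rows are loaded into an in-memory SQLite database through Python's built-in sqlite3 module, and the column types and the aliases Books, AvgPrice, MinPrice, and MaxPrice are chosen here, not taken from the article.

```python
import sqlite3

# Sample rows copied from the table above; column types are an assumption.
BOOKS = [
    (1, "Sql Server Concurrency", 99.99, 2020, "Database"),
    (2, "Red Gate Guide to SQL Server", 71.87, 2018, "Optimization"),
    (3, "Simply SQL Server", 71.87, 2021, "Security"),
    (4, "SQL in a Nutshell", 168.27, 2016, "Optimization"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BookAdvance "
    "(ID INTEGER, Name TEXT, Price REAL, Year INTEGER, Domain TEXT)"
)
conn.executemany("INSERT INTO BookAdvance VALUES (?, ?, ?, ?, ?)", BOOKS)

# One output row per Domain; individual book names are no longer available.
grouped = conn.execute(
    """
    SELECT Domain,
           COUNT(*)             AS Books,
           ROUND(AVG(Price), 2) AS AvgPrice,
           MIN(Price)           AS MinPrice,
           MAX(Price)           AS MaxPrice
    FROM BookAdvance
    GROUP BY Domain
    ORDER BY Domain
    """
).fetchall()

for row in grouped:
    print(row)
```

The four input rows collapse into three groups. Optimization is the only domain with two books, so it is the only group where the average, minimum, and maximum prices differ.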
SQL PARTITION BY Statement
Depending on what you need, you can use PARTITION BY in your queries to compute aggregated values over defined groups.
PARTITION BY is a clause written inside OVER(), and it is used together with window functions to calculate those values.
This is similar to GROUP BY with aggregate functions, with one key difference: when you use PARTITION BY, the row-level details are kept rather than collapsed.
That is, you still have access to both the original row-level details and the aggregated values. Any aggregate function can be used as a window function.
Example: We use the same BookAdvance table to demonstrate PARTITION BY:
ID | Name | Price | Year | Domain |
---|---|---|---|---|
1 | Sql Server Concurrency | 99.99 | 2020 | Database |
2 | Red Gate Guide to SQL Server | 71.87 | 2018 | Optimization |
3 | Simply SQL Server | 71.87 | 2021 | Security |
4 | SQL in a Nutshell | 168.27 | 2016 | Optimization |
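Run the following query, which uses the PARTITION BY clause. The second argument of ROUND is written out because SQL Server requires it:
MySQL, SQLite, SQL Server
SELECT ID, Name,
       ROUND(AVG(Price) OVER (PARTITION BY ID), 0) AS Rate
FROM BookAdvance;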
Output: The result will be
id | name | rate |
---|---|---|
1 | Sql Server Concurrency | 100 |
2 | Red Gate Guide to SQL Server | 72 |
3 | Simply SQL Server | 72 |
4 | SQL in a Nutshell | 168 |
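Because ID is unique, each partition in the query above contains a single row, so the "average" is simply that book's own price, rounded. The sketch below partitions by Domain instead, which makes the behaviour easier to see: every row is kept, and books in the same domain share one average. It assumes an in-memory SQLite database (version 3.25 or later, which is when window functions were added) accessed through Python's sqlite3 module; the alias DomainAvg is made up for this illustration.

```python
import sqlite3

# Sample rows copied from the table above; column types are an assumption.
BOOKS = [
    (1, "Sql Server Concurrency", 99.99, 2020, "Database"),
    (2, "Red Gate Guide to SQL Server", 71.87, 2018, "Optimization"),
    (3, "Simply SQL Server", 71.87, 2021, "Security"),
    (4, "SQL in a Nutshell", 168.27, 2016, "Optimization"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BookAdvance "
    "(ID INTEGER, Name TEXT, Price REAL, Year INTEGER, Domain TEXT)"
)
conn.executemany("INSERT INTO BookAdvance VALUES (?, ?, ?, ?, ?)", BOOKS)

# Each book keeps its own row and price; DomainAvg is computed per partition.
windowed = conn.execute(
    """
    SELECT ID, Name, Domain, Price,
           ROUND(AVG(Price) OVER (PARTITION BY Domain), 2) AS DomainAvg
    FROM BookAdvance
    ORDER BY ID
    """
).fetchall()

for row in windowed:
    print(row)
```

All four rows come back. The two Optimization books (IDs 2 and 4) both carry the same domain average of 120.07 next to their individual prices, while the single-book domains show an average equal to the book's own price.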
SQL Difference Between GROUP BY And PARTITION BY Clause
The table below summarises four differences between GROUP BY and PARTITION BY:
GROUP BY | PARTITION BY |
---|---|
The GROUP BY clause reduces the number of rows in the query result by merging all rows that share the same grouping values into a single row. | The PARTITION BY clause divides the rows into partitions using the specified criteria. It does not change the number of rows returned in the query result, but it does change how the result of a window function is calculated. |
Using a GROUP BY clause collapses the original rows; as a result, the original values cannot be accessed later in the query. | The PARTITION BY clause preserves the original values while allowing us to generate aggregated values. |
GROUP BY divides the result set into groups and returns one row per group. | Window (analytic) functions with PARTITION BY compute the aggregate without collapsing the result set, so the group's value is repeated on every row of that group. |
To filter on an aggregated value, we use the HAVING clause instead of the WHERE clause. | The WHERE clause can filter ordinary columns as usual, but it cannot reference the result of a window function; filtering on that result requires a subquery or CTE. |
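The filtering difference can be sketched on the same sample data. The example below is an illustration under assumptions, not a definitive pattern: it uses an in-memory SQLite database (3.25 or later) through Python's sqlite3 module, and the aliases Books, DomainBooks, and counted are invented here. The first query filters groups with HAVING; the second computes a window count in a subquery and filters on it in the outer WHERE.

```python
import sqlite3

# Sample rows copied from the article's table; column types are an assumption.
BOOKS = [
    (1, "Sql Server Concurrency", 99.99, 2020, "Database"),
    (2, "Red Gate Guide to SQL Server", 71.87, 2018, "Optimization"),
    (3, "Simply SQL Server", 71.87, 2021, "Security"),
    (4, "SQL in a Nutshell", 168.27, 2016, "Optimization"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BookAdvance "
    "(ID INTEGER, Name TEXT, Price REAL, Year INTEGER, Domain TEXT)"
)
conn.executemany("INSERT INTO BookAdvance VALUES (?, ?, ?, ?, ?)", BOOKS)

# GROUP BY: HAVING filters on the aggregate, leaving one row per kept group.
having_rows = conn.execute(
    """
    SELECT Domain, COUNT(*) AS Books
    FROM BookAdvance
    GROUP BY Domain
    HAVING COUNT(*) > 1
    """
).fetchall()

# PARTITION BY: the window value is computed in a subquery first, then the
# outer WHERE filters on it, keeping the individual rows.
window_rows = conn.execute(
    """
    SELECT ID, Name, DomainBooks
    FROM (
        SELECT ID, Name,
               COUNT(*) OVER (PARTITION BY Domain) AS DomainBooks
        FROM BookAdvance
    ) AS counted
    WHERE DomainBooks > 1
    ORDER BY ID
    """
).fetchall()

print(having_rows)
print(window_rows)
```

Both queries keep only domains with more than one book. The GROUP BY version returns a single summary row for Optimization, whereas the PARTITION BY version returns the two individual Optimization books, each tagged with the count 2.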