SQL GROUP BY vs PARTITION BY Statement

The GROUP BY and PARTITION BY clauses are frequently used when building complex reports in SQL.

Returning the raw data is often valuable (and sometimes required), but reports frequently need more complex calculations on top of it.

GROUP BY and PARTITION BY are useful in this situation. Both of them group rows, but there are some important differences between them. In this article, we'll look at those differences.


SQL GROUP BY Statement

In SQL queries, the GROUP BY clause is used to define groups based on a set of criteria.

These criteria are commonly found in reports as categories. The following are some examples of grouping criteria:

Employees can be grouped by their yearly salary level, trains by their first stop, incomes and costs by month, and students by the class they are enrolled in.

The GROUP BY clause turns data into a new result set in which the original entries are divided into groups based on our criteria.

On these groups, we can perform further operations, most of which involve aggregate functions.

To summarise, an aggregate function collapses each group into a single value, so the individual row-level details are no longer available in the query result.

There are several aggregate functions, but COUNT, SUM, AVG, MIN, and MAX are the most widely used.
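As a minimal sketch of these aggregate functions, the snippet below loads the four-book sample data used in this article into an in-memory SQLite database through Python's standard sqlite3 module and groups it by Domain. The column types and the aliases (Books, TotalPrice, AvgPrice, Cheapest, Priciest) are assumptions made for illustration; they are not part of the original example.

```python
import sqlite3

# In-memory copy of the article's sample data (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BookAdvance "
    "(ID INTEGER, Name TEXT, Price REAL, Year INTEGER, Domain TEXT)"
)
conn.executemany(
    "INSERT INTO BookAdvance VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Sql Server Concurrency", 99.99, 2020, "Database"),
        (2, "Red Gate Guide to SQL Server", 71.87, 2018, "Optimization"),
        (3, "Simply SQL Server", 71.87, 2021, "Security"),
        (4, "SQL in a Nutshell", 168.27, 2016, "Optimization"),
    ],
)

# One output row per Domain; the individual books are collapsed away.
domain_summary = conn.execute(
    """
    SELECT Domain,
           COUNT(*)             AS Books,
           ROUND(SUM(Price), 2) AS TotalPrice,
           ROUND(AVG(Price), 2) AS AvgPrice,
           MIN(Price)           AS Cheapest,
           MAX(Price)           AS Priciest
    FROM BookAdvance
    GROUP BY Domain
    ORDER BY Domain
    """
).fetchall()

for row in domain_summary:
    print(row)
```

Four input rows become three output rows here, because the two Optimization books are merged into one summary row.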

Example: The following BookAdvance table is used to demonstrate GROUP BY:

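| ID | Name                         | Price  | Year | Domain       |
|----|------------------------------|--------|------|--------------|
| 1  | Sql Server Concurrency       | 99.99  | 2020 | Database     |
| 2  | Red Gate Guide to SQL Server | 71.87  | 2018 | Optimization |
| 3  | Simply SQL Server            | 71.87  | 2021 | Security     |
| 4  | SQL in a Nutshell            | 168.27 | 2016 | Optimization |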

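The following query uses the GROUP BY clause. Name is included in the GROUP BY list because most databases reject a selected column that is neither grouped nor wrapped in an aggregate function:

MySQL, SQLite, SQL Server, MS Access

SELECT Year, Name
FROM BookAdvance
GROUP BY Year, Name
ORDER BY Year;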

Output: The result will be:

| Year | Name                         |
|------|------------------------------|
| 2016 | SQL in a Nutshell            |
| 2018 | Red Gate Guide to SQL Server |
| 2020 | Sql Server Concurrency       |
| 2021 | Simply SQL Server            |

SQL PARTITION BY Statement

Depending on what you need, you can also use PARTITION BY in your queries to produce aggregated data for the defined groups.

To calculate aggregated values, the PARTITION BY clause is written inside OVER() and used together with window functions.

This is similar to GROUP BY and aggregate functions, with one key difference: when you use PARTITION BY, the row-level details are kept rather than collapsed.

That is, you still have access to both the original row-level details and the aggregated values. Any aggregate function can be used as a window function.
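To make the "rows are kept" point concrete, here is a small sketch that runs a window function over the article's four-book sample data using Python's standard sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions); the column types and the DomainAvg alias are illustrative choices, not part of the original example.

```python
import sqlite3

# In-memory copy of the article's sample data (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BookAdvance "
    "(ID INTEGER, Name TEXT, Price REAL, Year INTEGER, Domain TEXT)"
)
conn.executemany(
    "INSERT INTO BookAdvance VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Sql Server Concurrency", 99.99, 2020, "Database"),
        (2, "Red Gate Guide to SQL Server", 71.87, 2018, "Optimization"),
        (3, "Simply SQL Server", 71.87, 2021, "Security"),
        (4, "SQL in a Nutshell", 168.27, 2016, "Optimization"),
    ],
)

# Every book keeps its own row; the per-domain average is attached to each one.
rows = conn.execute(
    """
    SELECT ID, Name, Domain, Price,
           ROUND(AVG(Price) OVER (PARTITION BY Domain), 2) AS DomainAvg
    FROM BookAdvance
    ORDER BY ID
    """
).fetchall()

for row in rows:
    print(row)
```

All four rows come back, and the two Optimization books (IDs 2 and 4) carry the same DomainAvg value next to their individual prices.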

Example: The same BookAdvance table is used to demonstrate PARTITION BY:

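| ID | Name                         | Price  | Year | Domain       |
|----|------------------------------|--------|------|--------------|
| 1  | Sql Server Concurrency       | 99.99  | 2020 | Database     |
| 2  | Red Gate Guide to SQL Server | 71.87  | 2018 | Optimization |
| 3  | Simply SQL Server            | 71.87  | 2021 | Security     |
| 4  | SQL in a Nutshell            | 168.27 | 2016 | Optimization |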

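The following query uses the PARTITION BY clause. ROUND is given an explicit precision of 0 because SQL Server requires the second argument:

MySQL, SQLite, SQL Server

SELECT ID, Name,
       ROUND(AVG(Price) OVER (PARTITION BY ID), 0) AS Rate
FROM BookAdvance;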

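Output: Because every ID is unique, each partition contains exactly one row, so Rate is simply that row's own Price rounded to a whole number. The result will be: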

| ID | Name                         | Rate |
|----|------------------------------|------|
| 1  | Sql Server Concurrency       | 100  |
| 2  | Red Gate Guide to SQL Server | 72   |
| 3  | Simply SQL Server            | 72   |
| 4  | SQL in a Nutshell            | 168  |

SQL Difference Between GROUP BY and PARTITION BY Clause

The table below summarises four differences between GROUP BY and PARTITION BY:

| GROUP BY | PARTITION BY |
|----------|--------------|
| The GROUP BY clause reduces the number of rows in the query result by merging all rows that share the supplied criteria into a single row. | The PARTITION BY clause divides the rows into partitions using the specified criteria. It does not change the number of rows returned, only how the result of the window function is calculated. |
| GROUP BY collapses the original rows; as a result, the original values cannot be accessed later in the query. | PARTITION BY preserves the original values while still allowing us to generate aggregated values. |
| GROUP BY divides the result set into groups and returns one row per group. | Analytic (window) functions return the aggregate alongside every row instead of collapsing the result set, so the same group value is repeated for each record in the partition. |
| To filter on aggregated values, use the HAVING clause rather than the WHERE clause. | The WHERE clause can filter on ordinary columns, but not on the window function result itself; filtering on that result requires an outer query. |
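The filtering difference in the last row can be sketched as follows, again using Python's standard sqlite3 module on the article's sample data. This is an illustration under stated assumptions: SQLite 3.25 or newer for window functions, assumed column types, and hypothetical names (Books, DomainAvg, the subquery alias priced) that do not appear in the original article.

```python
import sqlite3

# In-memory copy of the article's sample data (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BookAdvance "
    "(ID INTEGER, Name TEXT, Price REAL, Year INTEGER, Domain TEXT)"
)
conn.executemany(
    "INSERT INTO BookAdvance VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Sql Server Concurrency", 99.99, 2020, "Database"),
        (2, "Red Gate Guide to SQL Server", 71.87, 2018, "Optimization"),
        (3, "Simply SQL Server", 71.87, 2021, "Security"),
        (4, "SQL in a Nutshell", 168.27, 2016, "Optimization"),
    ],
)

# GROUP BY: HAVING filters whole groups after they have been aggregated.
busy_domains = conn.execute(
    """
    SELECT Domain, COUNT(*) AS Books
    FROM BookAdvance
    GROUP BY Domain
    HAVING COUNT(*) > 1
    """
).fetchall()
print(busy_domains)  # [('Optimization', 2)]

# PARTITION BY: the window value is computed after WHERE has run,
# so it is filtered in an outer query that wraps the windowed SELECT.
above_domain_avg = conn.execute(
    """
    SELECT ID, Name, Price
    FROM (
        SELECT ID, Name, Price,
               AVG(Price) OVER (PARTITION BY Domain) AS DomainAvg
        FROM BookAdvance
    ) AS priced
    WHERE Price > DomainAvg
    ORDER BY ID
    """
).fetchall()
print(above_domain_avg)  # only book 4 is priced above its domain average
```

The HAVING query returns one summary row per qualifying group, while the wrapped window query returns original book rows that pass a comparison against their partition's aggregate.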