Wednesday, August 29, 2012

SQL 2005 - RANK and DENCE_RANK and NTILE in sql server



1. Introduction



"Ranking functions" are useful to provide ranking on the table columns based on the value stored in it. Say for example, let us say a table is having students id and their total marks as  separate columns. This allows you to apply the ranking function on the total marks column in a meaningful way. With ranking function, you can easily find 2nd maximum or 4th maximum marks easily. The ranking functions are available in SQL 2005 and later versions.

Some useful ranking functions are displayed below:

  1. ROW_NUMBER()
  2. RANK()
  3. DENSE_RANK()
  4. NTILE()

In this article, we will look how we can use those ranking functions by making use of NorthWnd Sample database.



2. Demo View



First we will create a view from the NorthWnd Db and that will help the explanation of ranking functions. Below is the view:

Create View NetSales as
Select Ord.OrderId, Cat.CategoryName, Prod.ProductName,
            Ord.UnitPrice * Ord.Quantity as NetSales
from (Categories cat INNER Join Products prod
            ON cat.CategoryId = Prod.CategoryId)
      INNER Join [Order Details] Ord
            ON ord.ProductId = Prod.ProductId;

In the above view, we joined category, products and Order Details table together to get the required data. Also note that the NetSales computed column is nothing but the total sales of particular product for a given order id. We are going to perform ranking functions on this Netsales view. The result of the view is shown below (Note that view does not have the order by clause as it is shown in the select query)


View used for the article
Fig 1. View used for the article





3. ROW_NUMBER function



"Row_number" applies the unique sequence of numbers for each row. This is useful when you want to delete duplicate rows in a table. Below example shows how the row_number is applied on the ascending ordered NetSales:

Row_Number Example

Fig 2. Row_Number Example


Here, Rows marked under A, B and C has same NetSales values. And note that the Row Number is applied sequentially without any repeats even the NetSales values are same.



4. RANK function



In the below example, we used a "RANK function" to provide the Rank for each product based on the generated NetSales. Note that we applied Rank on NetSales column based on the values it holds in a descending order. So the First Rank is given to the highest net sales and Second Rank is given to Next Highest and so on…

Applying Rank Function
Fig 3. Applying Rank Function

In the above example, Set of Rows marked in A, B and C have same NetSales. As these groups have same netsales value, they have the same rank. Also note the Gaps in the Rank say we do not have Rank 2, as there are two products in Rank 1.

The example below shows having different sorting order for Rank and Select Query’s statement. Note that the ranking is still applied on the descending order of Netsales and the output is ordered based on the Product Name.

Rank Order and Record Order
Fig 4. Rank Order and Record Order



5. DENSE_RANK function



The "DENSE_RANK" function works same as rank function and the only difference is that it avoids the gaps in the rank. The below example shows the dense rank in effect:

Dense Rank usage
Fig 5. Dense Rank usage

Note that even though the Rank 1 is shared between two rows, the next rank given to the NetSales of 10540.00 is 2 not 3. And this is how dense rank differs from the normal rank function.



6. NTILE function



"NTILE function" is normal distribution function. Say for example NTILE (100) means percentile of 100 and when we apply that on the Netsales in Descending orders, the sales is distributed in 100 groups. The top sales in NetSales value are placed on percentile 1 and least sales in value is placed in percentile 100. Say for example if you call NTILE(5) in place of NTILE(100), then you are placing the Netsales in 5 groups stating TOP 5, meaning that RANK 1 group, Rank 2 Group, Rank3 Group… Rank 5 Groups.

In the below example the Netsales is distributed on a 100 percentile:

NTILE Function
Fig 6. NTILE Function



7. Applying Rank Functions within a Group



All the above examples used rank functions for full table of data. But we can use the rank functions for a group and rank will be reset when the group changes. In our example, we can try applying rank function for Netsales within each category. Doing so will rank the NetSales of product within each category. Below is the example for it:

Fig 7. Applying Rand within Group by clause


 For all the Examples, the tables from NorthWnd Db is used. You can download it from here: Download

No comments:

Post a Comment

Leave your comment(s) here.

Like this site? Tell it to your Firend :)