Wednesday, August 6, 2014

Except, Dense_rank() and rank() and NTILE()

Except

EXCEPT returns any distinct values from the left query that are not also found on the right query.
INTERSECT returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.
The basic rules for combining the result sets of two queries that use EXCEPT or INTERSECT are the following:
·         The number and the order of the columns must be the same in all queries.
·         The data types must be compatible.
·         USE AdventureWorks2008R2;
·         GO
·         SELECT ProductID
·         FROM Production.WorkOrder
·         EXCEPT
·         SELECT ProductID
·         FROM Production.Product ;
·         --Result: 0 Rows (work orders without products)

 

dense_rank() and rank()

 

SQL SERVER – 2005 – Sample Example of RANKING Functions – ROW_NUMBER, RANK, DENSE_RANK, NTILE

I have not written about this subject for long time, as I strongly believe that Book On Line explains this concept very well. SQL Server 2005 has total of 4 ranking function. Ranking functions return a ranking value for each row in a partition. All the ranking functions are non-deterministic.
ROW_NUMBER () OVER ([<partition_by_clause>] <order_by_clause>)
Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
RANK () OVER ([<partition_by_clause>] <order_by_clause>)
Returns the rank of each row within the partition of a result set.
DENSE_RANK () OVER ([<partition_by_clause>] <order_by_clause>)
Returns the rank of rows within the partition of a result set, without any gaps in the ranking.
NTILE (integer_expression) OVER ([<partition_by_clause>] <order_by_clause>)
Distributes the rows in an ordered partition into a specified number of groups.
All the above definition and syntax are taken from BOL. It is difficult to explain above function anything else than what they are explained in BOL. Following example is excellent example from BOL again. This function explains usage of all the four function together in one query.
USE AdventureWorks;
GO
SELECT c.FirstName, c.LastName
,ROW_NUMBER() OVER (
ORDER BY a.PostalCode) AS 'Row Number'
,RANK() OVER (
ORDER BY a.PostalCode) AS 'Rank'
,DENSE_RANK() OVER (
ORDER BY a.PostalCode) AS 'Dense Rank'
,NTILE(4) OVER (
ORDER BY a.PostalCode) AS 'Quartile'
,s.SalesYTD, a.PostalCode
FROM Sales.SalesPerson s
INNER JOIN Person.Contact c
ON s.SalesPersonID = c.ContactID
INNER JOIN Person.Address a
ON a.AddressID = c.ContactID
WHERE TerritoryID IS NOT NULL
AND
SalesYTD <> 0;
Resultset:


Most

 

The RANK()function in SQL Server returns the position of a value within the partition of a result set, with gaps in the ranking where there are ties.

The DENSE_RANK() function in SQL Server returns the position of a value within the partition of a result set, leaving no gaps in the ranking where there are ties.

Let us understand this difference with an example and then observe the results while using these two functions:

We will run two queries, one using RANK() and the other using DENSE_RANK() and observe the difference in the results. We will be using the ORDERS table of the NORTHWIND database to demonstrate the difference. The query will fetch the list of Customers ordered by the highest number of orders each has placed.

Using the RANK() function

SELECT RANK() OVER (ORDER BY TotCnt DESC) AS TopCustomers, CustomerID, TotCnt

FROM (SELECT CustomerID, COUNT(*) AS TotCnt

FROM Orders Group BY CustomerID) AS Cust



OUTPUT
[RANK.JPG]

As shown in the results above, while using the RANK() function, if two or more rows tie for a rank, each tied rows receives the same rank, however with gaps in the ranking where there are ties. For example, Customers 'FOLKO' and 'HUNGO' have the same number of orders(i.e. 19), so they are both ranked 4. The Customers with the next highest order(HILAA, BERGS, RATTC) are ranked number 6 instead of 5, because there are two rows that are ranked higher at 4.




SQL Functions – NTILE()

http://sqlandme.com/2011/06/30/sql-functions-ntile/

NTILE() distributes the result set into specified number of ordered partitions. For each row in result set NTILE() will returns a group number to which the row is associated.
This is very useful while distributing the result set into multiple groups in case you need to distribute the results into multiple tables.
For example,
– © 2011 – Vishal (http://SqlAndMe.com)

SELECT      Products.ProductID, Products.Name, Products.Price,
            NTILE(3) OVER (ORDER BY Products.Price) AS [Partition]
FROM
(
      SELECT      804 AS [ProductID],
                  'HL Fork' AS [Name], 229.49 AS [Price] UNION ALL
      SELECT      871, 'Mountain Bottle Cage',  9.99     UNION ALL
      SELECT      845, 'Mountain Pump',         24.99    UNION ALL
      SELECT      921, 'Mountain Tire Tube',    4.99     UNION ALL
      SELECT      980, 'Mountain-400-W Silver', 769.49   UNION ALL
      SELECT      988, 'Mountain-500 Silver',   564.99   UNION ALL
      SELECT      793, 'Road-250 Black',        2443.35  UNION ALL
      SELECT      762, 'Road-650 Red',          782.99   UNION ALL
      SELECT      923, 'Touring Tire Tube',     4.99     UNION ALL
      SELECT      966, 'Touring-1000 Blue',     2384.07
) Products
The above query will distribute the result set in three (3) different groups by distributing the rows. In this case the rows cannot be distributed evenly, so it will be divided as 4 + 3 + 3. If the number of rows were 11, it will be divided as 4 + 4 + 3.
Result Set:
ProductID     Name                    Price         Partition
———–   ———————   ——-       ———–
921           Mountain Tire Tube      4.99          1
923           Touring Tire Tube       4.99          1
871           Mountain Bottle Cage    9.99          1
845           Mountain Pump           24.99         1
804           HL Fork                 229.49        2
988           Mountain-500 Silver     564.99        2
980           Mountain-400-W Silver   769.49        2
762           Road-650 Red            782.99        3
966           Touring-1000 Blue       2384.07       3
793           Road-250 Black          2443.35       3

(10 row(s) affected)

Using <PARTITION BY> with NTILE():
PARTITION BY clause adds one more level of partitioning to the result set. In the above example, no PARTITION BY clause is specified, hence the rows are distributed as:
10 Rows –>  Partition 1 = 4 Rows,
            Partition 2 = 3 Rows,
            Partition 3 = 3 Rows
Now, let’s say we add another column to Result Set – Product Category. Our sample data will be:
ProductID   Name                  Category    Price
———– ——————— ———– ——–
804         HL Fork               Accessories 229.49
871         Mountain Bottle Cage  Accessories 9.99
845         Mountain Pump         Parts       24.99
921         Mountain Tire Tube    Parts       4.99
980         Mountain-400-W Silver Accessories 769.49
988         Mountain-500 Silver   Parts       564.99
793         Road-250 Black        Accessories 2443.35
762         Road-650 Red          Parts       782.99
923         Touring Tire Tube     Accessories 4.99
966         Touring-1000 Blue     Parts       2384.07
Now to divide the result set based on Category we can use PARTITION BY clause, the partitions will be as follows:
10 Rows –>  Group 1, Category = Accessories = 5 Rows -> Partition 1 = 2 Rows,
                                                        Partition 2 = 2 Rows,
                                                        Partition 3 = 1 Row,
            Group 2, Category = Parts = 5 Rows       -> Partition 1 = 2 Rows,
                                                        Partition 2 = 2 Rows,
                                                        Partition 3 = 1 Row
SELECT      Products.ProductID, Products.Name,
            Products.Category, Products.Price,
            NTILE(3) OVER
            (PARTITION BY Products.Category ORDER BY Products.Price)
            AS [Partition]
FROM
(
      SELECT 804 AS [ProductID],
      'HL Fork' AS [Name], 'Accessories' AS [Category],
      229.49 AS [Price] UNION ALL
      SELECT 871, 'Mountain Bottle Cage', 'Accessories'9.99    UNION ALL
      SELECT 845, 'Mountain Pump',        'Parts',       24.99   UNION ALL
      SELECT 921, 'Mountain Tire Tube',   'Parts',       4.99    UNION ALL
      SELECT 980, 'Mountain-400-W Silver','Accessories'769.49  UNION ALL
      SELECT 988, 'Mountain-500 Silver',  'Parts',       564.99  UNION ALL
      SELECT 793, 'Road-250 Black',       'Accessories'2443.35 UNION ALL
      SELECT 762, 'Road-650 Red',         'Parts',       782.99  UNION ALL
      SELECT 923, 'Touring Tire Tube',    'Accessories'4.99    UNION ALL
      SELECT 966, 'Touring-1000 Blue',    'Parts',       2384.07
) Products
Result Set:
ProductID   Name                  Category    Price        Partition
———– ——————— ———– ———— ————–
923         Touring Tire Tube     Accessories 4.99         1
871         Mountain Bottle Cage  Accessories 9.99         1
804         HL Fork               Accessories 229.49       2
980         Mountain-400-W Silver Accessories 769.49       2
793         Road-250 Black        Accessories 2443.35      3
921         Mountain Tire Tube    Parts       4.99         1
845         Mountain Pump         Parts       24.99        1
988         Mountain-500 Silver   Parts       564.99       2
762         Road-650 Red          Parts       782.99       2
966         Touring-1000 Blue     Parts       2384.07      3

(10 row(s) affected)

No comments:

Post a Comment