Author Topic: How to Perform SQL Server Row-by-Row Operations Without Cursors  (Read 6781 times)

Offline admin

  • Administrator
  • Sr. Member
  • *****
  • Posts: 296
    • View Profile
How to Perform SQL Server Row-by-Row Operations Without Cursors
« on: January 03, 2009, 09:51:23 PM »
SQL cursors have been a curse to database programming for many years because of their poor performance. On the other hand, they are extremely useful because of their flexibility in allowing very detailed data manipulations at the row level. Using cursors against SQL Server tables can often be avoided by employing other methods, such as using derived tables, set-based queries, and temp tables. A discussion of all these methods is beyond the scope of this article, and there are already many well-written articles discussing these techniques.

The focus of this article is directed at using non-cursor-based techniques for situations in which row-by-row operations are the only, or the best method available, to solve a problem. Here, I will demonstrate a few programming methods that provide a majority of the cursor’s flexibility, but without the dramatic performance hit.

Let’s begin by reviewing a simple cursor procedure that loops through a table. Then we’ll examine a non-cursor procedure that performs the same task.

Code: [Select]
if exists (select * from sysobjects where name = N'prcCursorExample')

   drop procedure prcCursorExample

go

CREATE PROCEDURE  prcCursorExample

AS

/*

**  Cursor method to cycle through the Customer table and get Customer Info for each iRowId.

**

** Revision History:

** ---------------------------------------------------

**  Date       Name       Description      Project

** ---------------------------------------------------

**  08/12/03   DVDS       Create            ----     

**

*/

SET NOCOUNT ON

 

-- declare all variables!

DECLARE  @iRowId  int,

         @vchCustomerName     nvarchar(255),

         @vchCustomerNmbr     nvarchar(10)     

 

-- declare the cursor

DECLARE Customer CURSOR FOR

SELECT    iRowId,

          vchCustomerNmbr,

          vchCustomerName

FROM      CustomerTable

 

OPEN Customer
 

FETCH Customer INTO @iRowId,
                    @vchCustomerNmbr,
                    @vchCustomerName

 

-- start the main processing loop.

WHILE @@Fetch_Status = 0

   BEGIN

   -- This is where you perform your detailed row-by-row

   -- processing.

   -- Get the next row.

   FETCH Customer INTO @iRowId,
                       @vchCustomerNmbr,
                       @vchCustomerName             

   END

CLOSE Customer

DEALLOCATE Customer

RETURN


As you can see, this is a very straight-forward cursor procedure that loops through a table called CustomerTable and retrieves iRowId, vchCustomerNmbr and vchCustomerName for every row. Now we will examine a non-cursor version that does the exact same thing:

Code: [Select]
if exists (select * from sysobjects where name = N'prcLoopExample')

   drop procedure prcLoopExample

go

CREATE PROCEDURE  prcLoopExample

AS

/*

**  Non-cursor method to cycle through the Customer table and get Customer Info for each iRowId.

**

** Revision History:

** ------------------------------------------------------

**  Date       Name      Description           Project

** ------------------------------------------------------

**  08/12/03   DVDS      Create                -----

**

*/

 

SET NOCOUNT ON

-- declare all variables!

DECLARE     @iReturnCode       int,

            @iNextRowId        int,

            @iCurrentRowId     int,

            @iLoopControl      int,

            @vchCustomerName   nvarchar(255),

            @vchCustomerNmbr   nvarchar(10)

            @chProductNumber   nchar(30)

           

-- Initialize variables!

SELECT @iLoopControl = 1

SELECT @iNextRowId = MIN(iRowId)

FROM   CustomerTable

 

-- Make sure the table has data.

IF ISNULL(@iNextRowId,0) = 0

   BEGIN

            SELECT 'No data in found in table!'

            RETURN

   END

-- Retrieve the first row

SELECT           @iCurrentRowId   = iRowId,

                 @vchCustomerNmbr = vchCustomerNmbr,

                 @vchCustomerName = vchCustomerName

FROM             CustomerTable

WHERE            iRowId = @iNextRowId

 

-- start the main processing loop.

WHILE @iLoopControl = 1

   BEGIN

     -- This is where you perform your detailed row-by-row

     -- processing.     

     -- Reset looping variables.           

            SELECT   @iNextRowId = NULL           

            -- get the next iRowId

            SELECT   @iNextRowId = MIN(iRowId)

            FROM     CustomerTable

            WHERE    iRowId > @iCurrentRowId

            -- did we get a valid next row id?

            IF ISNULL(@iNextRowId,0) = 0

               BEGIN

                        BREAK

               END

            -- get the next row.

            SELECT  @iCurrentRowId =   iRowId,

                    @vchCustomerNmbr = vchCustomerNmbr,

                    @vchCustomerName = vchCustomerName

            FROM    CustomerTable

            WHERE   iRowId = @iNextRowId           

   END

RETURN



There are several things to note about the above procedure.

For performance reasons, you will generally want to use a column like "iRowId" as your basis for looping and row retrieval. It should be an auto-incrementing integer data type, along with being the primary key column with a clustered index.

There may be times in which the column containing the primary key and/or clustered index is not the appropriate choice for looping and row retrieval. For example, the primary key and/or clustered index may have already been built on a column using uniqueindentifier as the data type. In such a case, you can usually add an auto-increment integer data column to the table and build a unique index or constraint on it.

The MIN function is used in conjunction with greater than “>” to retrieve the next available iRowId. You could also use the MAX function in conjunction with less than “<” to achieve the same result:



Code: [Select]
SELECT    @iNextRowId = MAX(iRowId)

FROM      CustomerTable

WHERE     iRowId < @iCurrentRowId

Be sure to reset your looping variable(s) to NULL before retrieving the next @iNextRowId value. This is critical because the SELECT statement used to get the next iRowId will not set the @iNextRowId variable to NULL when it reaches the end of the table. Instead, it will fail to return any new values and @iNextRowId will keep the last valid, non-NULL, value it received, throwing your procedure into an endless loop.  This brings us to the next point, exiting the loop.

When @iNextRowId is NULL, meaning the loop has reached the end of the table, you can use the BREAK command to exit the WHILE loop.  There are other ways of exiting from a WHILE loop, but the BREAK command is sufficient for this example.

You will notice that in both procedures I have included the comments listed below in order to illustrate the area in which you would perform your detailed, row-level processing.

Code: [Select]
-- This is where you perform your detailed row-by-row

-- processing.   
 

Quite obviously, your row level processing will vary greatly, depending upon what you need to accomplish. This variance will have the most profound impact on performance.

For example, suppose you have a more complex task which requires a nested loop. This is equivalent to using nested cursors; the inner cursor, being dependent upon values retrieved from the outer one, is declared, opened, closed and deallocated for every row in the outer cursor. (Please reference the DECLARE CURSOR section in SQL Server Books Online for an example of this.)  In such a case, you will achieve much better performance by using the non-cursor looping method because SQL is not burdened by the cursor activity.


Here is an example procedure with a nested loop and no cursors:

Code: [Select]
if exists (select * from sysobjects where name = N'prcNestedLoopExample')

   drop procedure prcNestedLoopExample

go

CREATE PROCEDURE  prcNestedLoopExample

AS

/*

**  Non-cursor method to cycle through the Customer table **  and get Customer Name for each iCustId. Get all
**  products for each iCustid.

**

** Revision History:

** -----------------------------------------------------

**  Date        Name       Description       Project

** -----------------------------------------------------

**  08/12/03    DVDS       Create            -----

**

*/

 

SET NOCOUNT ON

-- declare all variables!

DECLARE     @iReturnCode          int,

            @iNextCustRowId       int,

            @iCurrentCustRowId    int,

            @iCustLoopControl     int,

            @iNextProdRowId       int,

            @iCurrentProdRowId    int,

            @vchCustomerName      nvarchar(255),

            @chProductNumber      nchar(30),

            @vchProductName       nvarchar(255)

           

-- Initialize variables!

SELECT     @iCustLoopControl = 1

SELECT     @iNextCustRowId = MIN(iCustId)

FROM  Customer

 

-- Make sure the table has data.

IF ISNULL(@iNextCustRowId,0) = 0

   BEGIN

            SELECT 'No data in found in table!'

            RETURN

   END

 

-- Retrieve the first row

SELECT      @iCurrentCustRowId = iCustId,

            @vchCustomerName = vchCustomerName

FROM  Customer

WHERE       iCustId = @iNextCustRowId

 

-- Start the main processing loop.

WHILE @iCustLoopControl = 1

   BEGIN

        -- Begin the nested(inner) loop.

        -- Get the first product id for current customer.

            SELECT  @iNextProdRowId = MIN(iProductId)

            FROM    CustomerProduct

            WHERE   iCustId = @iCurrentCustRowId

           

         -- Make sure the product table has data for
         -- current customer.

         IF ISNULL(@iNextProdRowId,0) = 0

            BEGIN

              SELECT 'No products found for this customer.'

            END

         ELSE

            BEGIN

           -- retrieve the first full product row for
           -- current customer.

              SELECT  @iCurrentProdRowId = iProductId,

                       @chProductNumber = chProductNumber,

                       @vchProductName = vchProductName

              FROM  CustomerProduct

              WHERE           iProductId = @iNextProdRowId

             END

 

            WHILE ISNULL(@iNextProdRowId,0) <> 0

           BEGIN

        -- Do the inner loop row-level processing here.

        -- Reset the product next row id.

                SELECT  @iNextProdRowId = NULL

                                   

       -- Get the next Product id for the current customer

                SELECT @iNextProdRowId = MIN(iProductId)

                FROM  CustomerProduct

                WHERE iCustId = @iCurrentCustRowId

                AND  iProductId > @iCurrentProdRowId

 

     -- Get the next full product row for current customer.

             SELECT     @iCurrentProdRowId = iProductId,

                        @chProductNumber = chProductNumber,

                        @vchProductName = vchProductName

             FROM  CustomerProduct

             WHERE           iProductId = @iNextProdRowId

           END

 

      -- Reset inner loop variables.

            SELECT       @chProductNumber = NULL

            SELECT       @vchProductName = NULL

            SELECT       @iCurrentProdRowId = NULL

 

       -- Reset outer looping variables.   

            SELECT           @iNextCustRowId = NULL

           

       -- Get the next iRowId.

            SELECT           @iNextCustRowId = MIN(iCustId)

            FROM  Customer

            WHERE           iCustId > @iCurrentCustRowId

 

       -- Did we get a valid next row id?

            IF ISNULL(@iNextCustRowId,0) = 0

               BEGIN

                        BREAK

               END

 

        -- Get the next row.

            SELECT    @iCurrentCustRowId = iCustId,

                      @vchCustomerName = vchCustomerName

            FROM  Customer

            WHERE     iCustId = @iNextCustRowId

           

   END

 

RETURN

In the above example we are looping through a customer table and, for each customer id, we are then looping through a customer product table, retrieving all existing product records for that customer. Notice that a different technique is used to exit from the inner loop. Instead of using a BREAK statement, the WHILE loop depends directly on the value of @iNextProdRowId. When it becomes NULL, having no value, the WHILE loop ends.

Conclusion

SQL Cursors are very useful and powerful because they offer a high degree of row-level data manipulation, but this power comes at a price: negative performance. In this article I have demonstrated an alternative that offers much of the cursor’s flexibility, but without the negative impact to performance. I have used this alternative looping method several times in my professional career to the benefit of cutting many hours of processing time on production SQL Servers.