Cursor bad, set-based good

I just had another reminder, as if one was needed, that a cursor should be your last resort to get something done. I was looking to shave some time off my data warehouse build job and noticed a step that looked like it was taking a lot longer than I would expect. I found it was doing 12 cursors in a row on the same table. For each row in the source table it would try and insert into the target table and the insert was wrapped in a TRY…Catch to suppress duplicate row insert errors and keep going. Why? Because a few rogue cases had two rows instead of the one row they should have had. (For the record, I didn’t write this, I just inherited it.)

I changed each cursor to a straight INSERT target-table SELECT … FROM source-table using a row_number function to pick out just the first row for each case. I thought it would go a fair bit faster. In fact, it went from 7 minutes to 15 seconds. I had to check there was still the right number of rows in the target table!

I wish I had time to see how much of the overhead was the cursor and how much was the TRY…CATCH. I suspect each TRY…CATCH only makes a tiny difference but thousands of them will add up.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s