I’ve been working on an archiving solution for one of our products but was suffering from a disastrous performance problem with a relatively simple SQL statement. Here’s the problematic SQL :
delete from c
from #CodeToArchive cta
join Code c on c.oid = cta.oid
#CodeToArchive is a temporary table containing one column of primary keys to be deleted from the table Code. With the temporary table only containing about 100 rows this statement still took over 5 minutes to complete!
Running an execution plan highlighted that the delete from Code wasn’t the problem :
…but lower down the problem became apparent. The Code primary key is a foreign key in a number of other tables and one in particular, the Reward table, was causing the problem :
Whilst all other foreign keys were all using Nested Loops and Index Seeks, Reward ended up a Merge Join with an Index Scan despite the fact there was an index on the foreign key. That fat arrow is highlighting a transfer of 24 million rows totalling about 600Mb!! The estimated and actual row counts both matched which suggested the statistics were all up-to-date.
Lots of head-scratching and an epiphany later and things were sorted but we had to fool the query optimiser! The problem was the Code foreign key on Reward was full of nulls (and before you ask, this particular piece of application functionality wasn’t yet being used in this installation so we couldn’t just remove the foreign key) and this was fooling the query optimiser into thinking it had to search the entire index.
The solution was to give the optimiser a hand into realising there’s not quite so much data that it needs to look at, so a join to reward with a where clause did the trick :
delete from c
from #CodeToArchive cta
join Code c on c.oid = cta.oid
join Reward r on r.code = c.oid
where r.code is not null