Today was really a new one. I got back from a week off and found our main production server’s tempdb data file had gone from its usual 200MB to 36GB. Ironically I spent Friday at the most excellent SQLBits VI and one of the sessions I attended was Christian Bolton talking about tempdb issues – including runaway tempdb databases. How just-in-time was that?!
I looked into the file growth history and it looks like the problem started when my index maintenance job was chosen as the deadlock victim. (Funny how they almost make it sound like you’ve won something.) That left tempdb pretty big but for some reason it grew several more times. And since I’d left the file growth at the default 10% (aaargh!) the worse it got the worse it got. The last regrowth event was 2.6GB. Good job I’ve got Instant Initialization on. Since the Disk Usage report showed it was 99% unallocated I went into the Shrink Files dialogue which helpfully informed me the data file was 250MB.
I’m afraid I’ve got a life (allegedly) so I restarted the SQL Server service and then immediately ran a script to make the initial size bigger and change the file growth to a number of MB. The script complained that the size was smaller than the current size. Within seconds! WTF? Now I had to find out what was using so much of it. By using the DMV sys.dm_db_file_space_usage I found the problem was in the version store, and using the DMV sys.dm_db_task_space_usage and the Top Transactions by Age report I found that the culprit was a 3rd party database where I had turned on read_committed_snapshot and then not bothered to monitor things properly.
Just because something has always worked before doesn’t mean it will work in every future case. This application had an implicit transaction that had been running for over 2 hours.