In my course of work, I monitor several replications regularly. When everything goes well, it just takes literally a couple of minutes. But when a problem strikes, it may eat up half my day.
Just this morning, when I thought everything was looking dandy, there was one replication that has its status turned to ‘Performance Critical’. That’s a step below ‘Error’ but over ‘Running’ on the severity error ladder. What does it mean? And how did I solve it?
It all began with me troubleshooting a ‘transaction log is full’ error. Our company policy states that no database in UAT and DEV should have a FULL recovery. This precaution is meant to save some spaces that normally would be used by the transaction logs. Somehow this particular database is in full recovery and had been actively logging until it ran out a space. So I set the recovery to SIMPLE and shrank the log file.
And it appeared. Although shrinking the log files should remove the problem as it removed the multiple virtual log files, the latency increased. My solution to this is to reinitialize the subscription and that works well. Or to the extreme, recreating the publication (remember that I do this thing in UAT!). You may also adjust the threshold setting, however I am not awfully keen as it means to reset the performance standard.
Reference: SQL Server 2005 BOL “Monitoring Performance with Replication Monitor”