DPM 2012 Console Crash and RollbackSnapshotTempDB
I started the week with a series of alerts that the Access Service on my System Center 2012 Data Protection Manager (DPM) server was down. A quick look at the Application log on the server showed an error from MSDPM with Event ID 999. According to the details:
An unexpected error caused a failure for process ‘DPMAMService’. Restart the DPM process ‘DPMAMService’.
and…
Paths that begin with \\?\GlobalRoot are internal to the kernel and should not be opened by managed applications
along with “SQL” mentioned in several places in the exception details.
Restarting the service worked, but after a while it would crash again. Whenever I tried to open the console, MMC crashed. According to the logs, backups were running normally; I just could not access DPM. Next step? Consult the all-knowing oracle… the Internet.
A TechNet forum post (MMC crashing when opening DPM 2012 management console) matched my issue, and Vincent K offered a solution. He had experienced the same problem and opened a ticket with Microsoft, and he shared their fix in a post on his blog, IT Hassle. The solution was to correct the references to GlobalRoot by running this SQL query against the DPM database, after taking a backup of it first, of course.
UPDATE tbl_IM_ProtectedObject
SET PhysicalPath = REPLACE(CONVERT(nvarchar(MAX), PhysicalPath), '\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy', 'c:\')
WHERE PhysicalPath LIKE '%globalroot%'
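For the backup mentioned above, something like the following would do. This is only a sketch: the database name DPMDB and the backup path are assumptions (your DPM database may be named differently), so adjust both before running it.

```sql
-- Assumption: the DPM database is named DPMDB; check the actual name on your instance.
-- Assumption: D:\Backups exists and the SQL Server service account can write to it.
BACKUP DATABASE [DPMDB]
TO DISK = N'D:\Backups\DPMDB_before_globalroot_fix.bak'
WITH COPY_ONLY,  -- do not disturb any existing backup chain
     INIT;       -- overwrite an earlier file of the same name
```

COPY_ONLY keeps this one-off backup from interfering with whatever regular backup schedule already covers the database.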
I did that, updating the one record found, and as promised, the problem went away. MMC opened and DPMAccessManager started and continued to run… for one day. Then the problem came back.
I queried the table for GlobalRoot again and found that the problem record was back. The details showed that this object was a database on a SQL Server instance used by SharePoint. The database name wasn’t one I recognized: it started with “RollbackSnapshotTempDB” and was followed by a GUID. I learned that this is a database that SQL Server creates when VSS initiates a backup. I won’t go into the details here because I’m not a SQL guy, but that’s OK because the CSS SQL Server Engineers have a post on their blog that explains it well.
I opened Management Studio on that SQL server, and there was the culprit, still listed among the databases.
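The check for the offending record can be as simple as the query below. It is a minimal sketch that reuses the table, column, and filter from the UPDATE statement above; I use SELECT * because I am not assuming anything about the other column names in the table. Run it in the context of the DPM database.

```sql
-- List protected objects whose path still points at a GlobalRoot device.
-- Same WHERE clause as the UPDATE fix, but read-only.
SELECT *
FROM tbl_IM_ProtectedObject
WHERE PhysicalPath LIKE '%globalroot%';
```

If this returns no rows, the GlobalRoot reference is gone; if a row comes back after a backup cycle, something is recreating it.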
For whatever reason, this database was left behind. The CSS blog post makes it clear that it is safe to delete this database, so I did. You need to take the database offline before you delete it; otherwise SQL Server complains because it can’t find the database’s files in the file system. Now I wait and see whether the problem recurs. Hopefully this was a one-off issue.
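If you prefer a query window to the Management Studio GUI, the cleanup might look like the sketch below. The database name is a placeholder: replace RollbackSnapshotTempDB{GUID} with the exact name returned by the first query. Run it on the SQL server that hosts the leftover database, not on the DPM server.

```sql
-- Step 1: find leftover snapshot databases and see their current state.
SELECT name, state_desc
FROM sys.databases
WHERE name LIKE 'RollbackSnapshotTempDB%';

-- Step 2: take the leftover database offline.
-- Placeholder name: substitute the exact name from step 1.
-- ROLLBACK IMMEDIATE disconnects any open sessions instead of waiting for them.
ALTER DATABASE [RollbackSnapshotTempDB{GUID}] SET OFFLINE WITH ROLLBACK IMMEDIATE;

-- Step 3: drop it. Dropping an offline database removes it from the instance
-- without touching files on disk, so the missing snapshot files are not a problem.
DROP DATABASE [RollbackSnapshotTempDB{GUID}];
```

Double-check the name before step 3. The LIKE filter in step 1 is only for finding candidates, and you should drop only a database you have confirmed is an orphan.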