Recently, we came across many incidents related to large content databases. When we examined the site collection's Storage Metrics report, the cause turned out to be too many versions of large documents. This post gives pointers for identifying the problem and fixing it.
About Versioning:
Versioning is available in SharePoint lists and document libraries. See Plan Versioning on TechNet (https://technet.microsoft.com/en-in/library/cc262378.aspx). This article covers the changes to document library versioning since SharePoint 2016 RTM.
- In SharePoint 2016, versioning is enabled by default when a document library is created, with the major version limit set to 500. The "Require documents to be checked out before they can be edited" setting is also disabled by default. So any edit to the properties or contents of an existing file in a library silently creates a new major version.
- Because administrators and users can leave this unnoticed for a long time, the library can grow very large from too many versions per document. Each version of a document occupies roughly the size of the document again in the content database.
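To see which libraries are affected by these defaults, a minimal sketch along the following lines could list the versioning settings of every document library in a web. It assumes the SharePoint server object model properties on SPList (EnableVersioning, EnableMinorVersions, MajorVersionLimit, MajorWithMinorVersionsLimit, ForceCheckout); http://WebURL/ is a placeholder.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$SPweb = Get-SPWeb "http://WebURL/"   # placeholder URL, replace with your web

# Show the versioning-related settings of each document library in the web.
$SPweb.Lists |
    Where-Object { $_.BaseType -eq "DocumentLibrary" } |
    Select-Object Title, ItemCount, EnableVersioning, EnableMinorVersions,
                  MajorVersionLimit, MajorWithMinorVersionsLimit, ForceCheckout |
    Format-Table -AutoSize

$SPweb.Dispose()
```

Libraries that show a high MajorVersionLimit together with a large ItemCount are reasonable first candidates for a closer look.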
If a document is edited by opening it and changing the content, a major version is usually published. If the document is not opened or checked out but its properties are updated (Edit Properties), a version is also created, usually a minor version. Either way, every such version multiplies the space the document takes up in the content database.
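A rough back-of-the-envelope sketch shows how quickly this adds up. It assumes each version stores about a full copy of the file, as described above; the 20 MB file size is a made-up value for illustration, and 500 is the default major version limit.

```powershell
# Hypothetical values, for illustration only.
$documentSizeMB = 20     # size of one copy of the document
$versionCount   = 500    # default major version limit in SharePoint 2016

# Assumption: every version occupies roughly the full document size.
$totalMB = $documentSizeMB * $versionCount
$totalGB = [math]::Round($totalMB / 1024, 2)

"Approximate storage for one document: $totalMB MB ($totalGB GB)"
```

Under these assumptions a single 20 MB file that reaches the version limit accounts for about 10,000 MB in the content database.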
Once versioning is enabled, we can control the number of major and minor versions per library. However, reducing the version limit from 500 to a small value does not immediately delete any existing file versions. The following conditions apply:
- Items added after the version limit is set follow the new limit. Previously added items do not see an immediate reduction in their number of versions.
- For items that already have more versions than the new limit, versions start being deleted only after the item is published to its next major version.
Note: If the version limit is dropped to a much lower value on a library whose items have a very large number of versions, publishing such an item to its next major version starts large SQL transactions. These may take hours to complete and will eventually time out when performed from the GUI. So we should first clean up the excess versions with PowerShell, and only then set a lower version limit on the affected libraries.
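Once the excess versions have been cleaned up, the lower limit can also be applied from PowerShell instead of the GUI. The following is a sketch only: it assumes the SPList properties EnableVersioning and MajorVersionLimit, and the URL, library title, and limit of 50 are placeholders to replace with your own values.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$SPweb  = Get-SPWeb "http://WebURL/"      # placeholder URL
$SPlist = $SPweb.Lists["LibraryName"]     # placeholder library title

# Illustrative limit; choose a value that suits the library.
$SPlist.EnableVersioning  = $true
$SPlist.MajorVersionLimit = 50
$SPlist.Update()

"Major version limit on '" + $SPlist.Title + "' is now " + $SPlist.MajorVersionLimit

$SPweb.Dispose()
```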
Suggestions:
- Check the Storage Metrics page on each site collection to identify large document libraries, and make sure only the required number of versions is allowed.
- Check the current number of versions using the SQL query below. Run it against a restored backup of the production content database, not directly against the production database.
SQL query to get the top 100 most-versioned documents (run against the restored copy of the content database):
    SELECT TOP 100
        Webs.FullUrl AS SiteUrl,
        Webs.Title AS [Document/List Library Title],
        Docs.DirName + '/' + Docs.LeafName AS [Document Name],
        COUNT(AllDocVersions.UIVersion) AS [Total Version],
        SUM(CAST((CAST(CAST(AllDocVersions.Size AS decimal(38,2)) / 1024 AS decimal(38,2)) / 1024) AS decimal(38,2))) AS [Total Document Size (MB)],
        -- below is the issue
        AVG(CAST((CAST(CAST(AllDocVersions.Size AS decimal(38,2)) / 1024 AS decimal(38,2)) / 1024) AS decimal(38,2))) AS [Avg Document Size (MB)]
    FROM Docs WITH (NOLOCK)
        INNER JOIN AllDocVersions ON Docs.Id = AllDocVersions.Id
        INNER JOIN Webs ON Docs.WebId = Webs.Id
        INNER JOIN Sites ON Webs.SiteId = Sites.Id
    WHERE Docs.Type <> 1
        -- Skip templates, pages, and other system file types
        AND (Docs.LeafName NOT LIKE '%.stp')
        AND (Docs.LeafName NOT LIKE '%.aspx')
        AND (Docs.LeafName NOT LIKE '%.xfp')
        AND (Docs.LeafName NOT LIKE '%.dwp')
        AND (Docs.LeafName NOT LIKE '%template%')
        AND (Docs.LeafName NOT LIKE '%.inf')
        AND (Docs.LeafName NOT LIKE '%.css')
    GROUP BY Webs.FullUrl, Webs.Title, Docs.DirName + '/' + Docs.LeafName
    ORDER BY [Total Version] DESC, [Total Document Size (MB)] DESC
Use the script below to list the version count of each document in a specific library.

    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

    $SPweb = Get-SPWeb "http://WebURL/"

    # Run the line below to find the library name:
    # $SPweb.Lists | Select Title
    $SPlist = $SPweb.Lists["Documents"]   # Use the library title here.

    "Web is : " + $SPweb.Title
    "List is : " + $SPlist.Title + " with item count " + $SPlist.ItemCount

    # Print the number of versions held for each item in the library.
    foreach ($SPitem in $SPlist.Items) {
        $currentVersionsCount = $SPitem.Versions.Count
        $itemName = $SPitem.Name
        $itemName + " version count is : " + $currentVersionsCount
    }

    $SPweb.Dispose()
We can use a script like the example below to clean up the excess versions found in a specific library.

    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

    $SPweb = Get-SPWeb "http://WebURL/"

    # Base this on the highest value obtained from the script/SQL query results above.
    $versionsToKeep = 75

    $SPlist = $SPweb.Lists["LibraryName"]

    "Web is : " + $SPweb.Title
    "List is : " + $SPlist.Title + " with item count " + $SPlist.ItemCount

    foreach ($SPitem in $SPlist.Items) {
        $currentVersionsCount = $SPitem.Versions.Count
        if ($currentVersionsCount -gt $versionsToKeep) {
            # Index 0 is the current version, so delete from the oldest (highest index) down.
            # -ge keeps exactly $versionsToKeep versions (indexes 0 to $versionsToKeep - 1).
            for ($i = $currentVersionsCount - 1; $i -ge $versionsToKeep; $i--) {
                $SPitem.Versions[$i].Delete()
            }
        }
    }

    # Printed once, after all items have been processed.
    "Completed. Use SQL Query to check versions status"

    $SPweb.Dispose()
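Because the cleanup permanently deletes versions, it can help to do a dry run first. The sketch below only reports how many versions exceed a chosen threshold and deletes nothing. It reuses the same placeholder URL, library title, and threshold as the cleanup example, and the report wording is an illustrative choice.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$SPweb = Get-SPWeb "http://WebURL/"       # placeholder URL
$versionsToKeep = 75                      # same threshold as the cleanup example
$SPlist = $SPweb.Lists["LibraryName"]     # placeholder library title

$totalExcess = 0
foreach ($SPitem in $SPlist.Items) {
    # Number of versions above the threshold for this item.
    $excess = $SPitem.Versions.Count - $versionsToKeep
    if ($excess -gt 0) {
        $SPitem.Name + " : " + $excess + " version(s) above the threshold"
        $totalExcess += $excess
    }
}

"Total versions above the threshold: " + $totalExcess

$SPweb.Dispose()
```

If the totals look reasonable, the cleanup can then be run against the same library.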
Applies to: SharePoint 2016.
Posted by: Binu K Raj [MSFT]