A daily Workflow Auto Cleanup job runs to permanently delete workflow instances and related task entries that still exist 60 days after a workflow is completed or canceled. Workflow history items themselves are not deleted, but the entry point to view them on the status page for a particular instance of a workflow will no longer be available. As the workflow history and task lists grow in size, site performance may be compromised.
While perusing storage metrics I noticed a workflow history list that was taking up 4 GB of space. Upon further investigation, I found the workflow history list contained more than 8.2 MILLION items. This prompted me to develop PowerShell scripts to 1) locate and provide information regarding our workflow history lists and 2) remove old workflow history items older than 4 months. During discovery I found several other workflow history lists with millions of items.
The following PowerShell script will locate and display all workflow history lists. The script provides insight regarding site title, URL, list name and item count for each discovered workflow history list.
# Title: Locate-WorkflowHistory.ps1 # Version: 1.0, 28FEB18 # Author: James Sanders # Purpose: Find all workflow history lists Add-PSSnapin Microsoft.SharePoint.PowerShell -EA SilentlyContinue $WebApp = Get-SPWebApplication ForEach ($SPSite in $WebApp.Sites) { ForEach ($Web in $SPSite.AllWebs) { For ($i = 0;$i -ne $Web.lists.count;$i++) { $List = $Web.Lists[$i] If ($List.BaseTemplate -eq "WorkflowHistory") { $O = New-Object PSObject $O | Add-Member NoteProperty Site $($List.ParentWeb.Title) $O | Add-Member NoteProperty URL $($List.RootFolder.ServerRelativeUrl) $O | Add-Member NoteProperty Title $($List.Title) $O | Add-Member NoteProperty Count $($List.ItemCount) $O | FL } } $Web.Dispose() } }
The first thing to determine is how much workflow history you want to keep. The next step is actually purging the workflow history list. Information Management Retention Policies can be configured on the affected list(s), but they don't work very well when there are loads of items in the list(s). I recommend purging the list(s) with PowerShell and THEN creating the policies.
The following script will purge a workflow history list. The list is purged in batches of 100 items at a time to minimize performance impact.
# Title: Cleanup-WorkflowHistory.ps1 # Version: 1.0, 27FEB18 # Author: James Sanders # Purpose: Clean up large workflow history lists # Configure environment # The $webURL variable should point to the site with the workflow history list that you want to purge # The $cutoffDate determines which items to keep. As configured, the cutoff date will be # current date - 120 days. Any items older than 120 days will be removed $webURL = "https://portal/site/subsite" $cutoffDate = (Get-Date).AddDays(-120) # Add the PowerShell Snap-In Add-PSSnapin Microsoft.SharePoint.PowerShell -EA SilentlyContinue # Open the web site Write-Host "`n- Opening web site $webURL" $web = Get-SPWeb $webURL -ErrorAction SilentlyContinue if (!($web)) { Write-Host -ForegroundColor Red "- Unable to open web site $webURL" Exit } # Open Source List Write-Host "- Opening workflow history list" $list = $web.lists["Workflow History"] if (!($list)) { Write-Host -ForegroundColor Red "- Unable to open Workflow History list" $web.Dispose() Exit } DO { # Execute query # Using an SPQuery so that results can be limited ($rowlimit variable). $cutoffDateF = "{0:MM/dd/yyyy}" -f $cutoffDate Write-Host "- Looking for items where 'Date Occurred' <= $cutoffDateF" Write-Host $rowlimit = 100 $caml="<OrderBy><FieldRef Name='ID' Ascending='TRUE' /></OrderBy><OrderBy><FieldRef Name='ID' /></OrderBy>" $camlQuery = New-Object Microsoft.SharePoint.SPQuery $camlQuery.RowLimit = $rowLimit $camlQuery.Query = $caml $items = $list.GetItems($camlQuery) If (!($items)) { Write-Host -ForegroundColor Red "- No items to process" Break } # Delete workflow history items that are before the cutoff date # As we are directly deleting and NOT recycling, we process the items in reverse so as to not throw a "Collection was modified; enumeration operation may not execute..." error! # If we were recycling, we could use a ForEach loop. $itemTotal = $items.Count $itemCurrent = 1 $itemsDeleted = 0 Write-Host "- Items to process: $($itemTotal)" $timeStarted = Get-Date For ($i=$items.count-1; $i -ge 0; $i--) { $item = $items[$i] $itemCurrentF = $itemCurrent.ToString().PadLeft($itemTotal.ToString().Length,"0") $dateOccurredF = "{0:MM/dd/yyyy}" -f $item["Date Occurred"] Write-Host "- Processing item $itemCurrentF of $($itemTotal) ... " -NoNewline If ($item['Date Occurred'] -lt $cutoffDate) { $item.Delete() Write-Host "Deleted - $dateOccurredF <= $cutoffDateF" $itemsDeleted++ } Else { Write-Host "Ignored - $dateOccurredF => $cutoffDateF" } $itemCurrent++ } # Statistics $timeFinished = Get-Date $timeDuration = ("{0:hh\:mm\:ss}" -f ($timeFinished - $timeStarted)) If ($timeDuration.Length -gt 8) { $timeDuration = $timeDuration.Substring(0,12) } Else { $timeDuration += ".000" } $timeTotalSeconds = [Math]::Round(($timeFinished - $timeStarted).TotalSeconds) $deletedPerSecond = [Math]::Round($itemsDeleted / $timeTotalSeconds) Write-Host Write-Host "Batch Started: $timeStarted" Write-Host "Batch Finished: $timeFinished" Write-Host "Batch Duration: $timeDuration" Write-Host Write-Host "Total Seconds: $timeTotalSeconds" Write-Host "Items Deleted: $itemsDeleted" Write-Host "Deleted/Second: $deletedPerSecond" Write-Host } UNTIL ($itemsDeleted -le 0) # Clean up $web.dispose()
After the workflow lists have been purged, create Information Management Retention Policies to keep them cleaned up.