If you’re like me and you don’t read directions or search the web before using an application, then you’ve probably messed up your iTunes library once or twice.  I recently did this when I tried to use a central iTunes library on a second computer.  I had selected the checkbox in the Preferences-Advanced tab that copies music to your iTunes folder when adding to your iTunes library.  As soon as I added the shared iTunes library folder, I created duplicate copies of over 3,000 songs.

If you’re on Windows XP or Windows Vista, you can download PowerShell 1.0 from Microsoft and use the script below to automagically remove the duplicates.  The script compares a hash of each file’s contents to decide whether two files are the same, so file names do not matter.  That is important because iTunes appends “ 1” to the name of every file it copies, so you cannot simply match the copies to the originals by file name.  Using the iTunes duplicate feature would have taken me forever, so I resorted to PowerShell.
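As a quick illustration of why hashing sidesteps the naming problem, here is a minimal sketch that is separate from the script in this post.  It assumes PowerShell 2.0 or later (for try/finally), and the Get-ContentHash helper, the temporary folder and the sample file names are all made up for the example.  The third file has the same length as the other two but different contents, which shows that matching sizes alone do not prove two files are duplicates.

```powershell
# Hypothetical helper for this illustration only; not part of the script in this post.
function Get-ContentHash([string] $path)
{
    $md5 = [System.Security.Cryptography.MD5]::Create()
    $stream = [System.IO.File]::OpenRead($path)
    try
    {
        # Hash the bytes of the file; the file name plays no part in the result.
        return [System.BitConverter]::ToString($md5.ComputeHash($stream))
    }
    finally
    {
        $stream.Close()
    }
}

# Sample data in a throwaway temp folder (assumed names, mimicking an iTunes " 1" copy).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
$original = Join-Path $dir 'song.mp3'
$copy     = Join-Path $dir 'song 1.mp3'
$other    = Join-Path $dir 'other.mp3'
[System.IO.File]::WriteAllText($original, 'same bytes')
[System.IO.File]::WriteAllText($copy, 'same bytes')
[System.IO.File]::WriteAllText($other, 'diff bytes')   # same length, different contents

$sameAsCopy  = (Get-ContentHash $original) -eq (Get-ContentHash $copy)
$sameAsOther = (Get-ContentHash $original) -eq (Get-ContentHash $other)
Write-Output "song.mp3 matches 'song 1.mp3': $sameAsCopy"
Write-Output "song.mp3 matches other.mp3:    $sameAsOther"

# Clean up the sample folder.
Remove-Item -Path $dir -Recurse -Force
```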

Credit goes to The PowerShell Team Blog for the MD5 algorithm and to Jason Stangroome’s Blog for the duplicate file script.  I modified the script to produce immediate output, fail silently and keep a running count so I could figure out when the script would finish.  I find the PowerGUI tools easier to use than the bare command line, so you might want to download them as well before you get started.  Anyway, on to the script:

	function Get-MD5([System.IO.FileInfo] $file = $(throw 'Usage: Get-MD5 [System.IO.FileInfo]'))
	{
		# This Get-MD5 function sourced from:
		# http://blogs.msdn.com/powershell/archive/2006/04/25/583225.aspx
		$stream = $null;
		$cryptoServiceProvider = [System.Security.Cryptography.MD5CryptoServiceProvider];
		$hashAlgorithm = New-Object $cryptoServiceProvider;
		$stream = $file.OpenRead();
		$hashByteArray = $hashAlgorithm.ComputeHash($stream);
		$stream.Close();

		## We have to be sure that we close the file stream if any exceptions are thrown.
		trap
		{
			if ($stream -ne $null) { $stream.Close(); }
			break;
		}

		return [string]$hashByteArray;
	}

	# Files can only be identical if they have the same length, so group by
	# length first and hash only the groups that contain more than one file.
	# The @(...) wrapper keeps the result an array even if only one group is found.
	$fileGroups = @(Get-ChildItem 'x:' -Recurse `
	| Sort-Object Name -Descending:$true `
	| Where-Object { -not $_.PSIsContainer -and $_.Length -gt 0 } `
	| Group-Object Length `
	| Where-Object { $_.Count -gt 1 });

	$currentcount = 0;
	$totalcount = $fileGroups.Count;

	foreach ($fileGroup in $fileGroups)
	{
		foreach ($file in $fileGroup.Group)
		{
			Add-Member NoteProperty ContentHash (Get-MD5 $file) -InputObject $file;
		}

		$currentcount = $currentcount + 1;
		Write-Output "Current: $currentcount of $totalcount";

		# Same length is not enough: only files whose hashes also match are duplicates.
		$duplicateSets = @($fileGroup.Group `
		| Group-Object ContentHash `
		| Where-Object { $_.Name -and $_.Count -gt 1 });

		foreach ($duplicateSet in $duplicateSets)
		{
			# Remove only the last file of each set of duplicates.
			$duplicate = $duplicateSet.Group[$duplicateSet.Count - 1];
			Write-Output "Removing: [$duplicate]";
			$duplicate | Remove-Item -Confirm:$false -ErrorAction:SilentlyContinue;
		}
	}

You’ll notice that each run deletes only the last file in a set of duplicates, so if you have more than one extra copy of a song, you might have to run the script more than once.  Before you run it, replace the ‘x:’ in the Get-ChildItem call with the path of your iTunes library.
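Running the script repeatedly is one way to deal with songs that have several extra copies.  As an alternative, here is a minimal sketch (not the script above) that collects every extra copy in a single pass and only previews the deletion with -WhatIf.  It assumes PowerShell 2.0 or later for Select-Object -Skip, and the sample folder and file names are invented.  Keeping the file with the shortest name is a heuristic I am assuming here, based on iTunes adding a suffix to the copies.

```powershell
# Sample data in a throwaway temp folder: one song with two extra copies, plus an unrelated file.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
'song.mp3', 'song 1.mp3', 'song 2.mp3' | ForEach-Object {
    [System.IO.File]::WriteAllText((Join-Path $dir $_), 'same bytes')
}
[System.IO.File]::WriteAllText((Join-Path $dir 'other.mp3'), 'diff bytes')

$md5 = [System.Security.Cryptography.MD5]::Create()

# Group files by a hash of their contents, then keep the first file of each
# group (shortest name first) and collect all the others as extras.
# ReadAllBytes loads each file fully into memory, which is fine for a sketch
# but worth replacing with a stream for a large music library.
$extras = @(Get-ChildItem $dir -Recurse |
    Where-Object { -not $_.PSIsContainer } |
    Group-Object { [System.BitConverter]::ToString($md5.ComputeHash([System.IO.File]::ReadAllBytes($_.FullName))) } |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group | Sort-Object { $_.Name.Length }, Name | Select-Object -Skip 1 })

# Preview only: -WhatIf reports what would be removed without deleting anything.
$extras | Remove-Item -WhatIf

# Confirm the preview left the files in place, then clean up the sample folder.
$stillThere = Test-Path (Join-Path $dir 'song 1.mp3')
Remove-Item -Path $dir -Recurse -Force
```

Once the preview lists only files you are happy to lose, dropping -WhatIf from the Remove-Item call would perform the deletion.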

Lastly, this script will work for any directory with duplicate files, not just an iTunes library.  Put it in your tool belt for when you need to find duplicate files, or pass it around to others who might be able to use it for their own needs.