Oops, sorry about that. Here’s the relevant bit:
What Danny says is true: it’s impossible to briefly sum up the whole of our clumping and identification algorithms. There is a lot to it! I can answer some of your specific questions, though:
- Directory structure is very important. See the other post for details.
- Track number and media number tags are very important.
- We use both ALBUMARTIST and ARTIST tags. Ideally for compilations you’d have the overall artist name (or perhaps Various Artists) in the former and the specific track artist in the latter; however, this data is typically overridden by richer data from our metadata services, so I wouldn’t go to the effort of grooming these tags, aside from ensuring they are consistent amongst the files of an album.
- We do use the album’s TOC.
- The tags we use for clumping and identification are: album, artist, albumartist, track name, track number, and media number.
Identification by TOC is the most reliable. It is possible when your files are properly organized into directories, none are missing or extra, and all have proper media and track numbers. When we get a TOC match, the other tags (artist/album/title) aren’t needed, so if you get the organization and numbering correct, the other tags don’t matter as much.
Note that TOC matching is unreliable in the case of albums with very few (fewer than four) tracks. For these, it’s crucial to get the album and artist tags correct to facilitate identification.
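
If you want to sanity-check a folder against the points above before importing it, something like the following rough sketch might help. To be clear, this is not the actual clumping or identification code, just an illustration of the checks described in this post. `TrackTags` and `check_album_folder` are made-up names, and it assumes you have already read the tag values from your files with whatever tagging tool you use.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TrackTags:
    """Tag values already read from one file (hypothetical container)."""
    album: str
    albumartist: str
    media_number: int
    track_number: int


def check_album_folder(tracks):
    """Return a list of human-readable problems for one album folder.

    Illustrative only: it mirrors the advice in the post, not the real
    identification algorithm.
    """
    problems = []

    # Album and album-artist tags should be consistent across the album.
    if len({t.album for t in tracks}) > 1:
        problems.append("inconsistent album tags")
    if len({t.albumartist for t in tracks}) > 1:
        problems.append("inconsistent albumartist tags")

    # Group track numbers by media (disc) number.
    by_media = defaultdict(list)
    for t in tracks:
        by_media[t.media_number].append(t.track_number)

    # Each media should have tracks numbered 1..n with no gaps or repeats.
    # A track missing from the very end cannot be detected from tags alone.
    for media, numbers in sorted(by_media.items()):
        if sorted(numbers) != list(range(1, len(numbers) + 1)):
            problems.append(
                f"media {media}: track numbers are missing or duplicated"
            )

    # With very few tracks, the album and artist tags carry more weight.
    if len(tracks) < 4:
        problems.append(
            "fewer than four tracks: make sure album and artist tags are correct"
        )

    return problems
```

An empty list means the folder passes these basic checks; it says nothing about whether a match will actually be found.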