What can I do to improve my metadata matches?

I’ll continue where Danny left off. There are two phases involved in getting metadata for your files. The files are first grouped into clumps that have a high likelihood of being a complete album. Then we use the filenames, tags, and track lengths to identify each of the clumps.

Your question pertains mostly to the clumping phase, since identification results depend on how the files are clumped. For example, if you have the Pink Floyd Discovery box set, which consists entirely of re-releases of studio albums, the way the files are organized will dictate whether the resulting identification pulls info from the box set or from each of the original albums. In particular, if you don’t have media numbers, we’ll assume each disc is a separate entity, whereas if you have correct media numbers, we’ll be able to identify the entire box set.

As Danny alluded to, the single most important factor for getting a good identification is making sure that the files for a given album (or album set) are together in a directory, that they have proper media numbers and track numbers, that no files are missing, and that no files are duplicated.
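If you want to sanity-check a folder yourself before importing it, something like the following could help. This is only an illustrative sketch, not the actual identification code: the function name `check_numbering` and the idea of feeding it (media number, track number) pairs are assumptions made for the example. It reports gaps and duplicates in the numbering of a single album folder.

```python
from collections import Counter


def check_numbering(pairs):
    """Report gaps and duplicates in one album folder's numbering.

    `pairs` is an iterable of (media_number, track_number) tuples, one per
    file. Hypothetical helper for illustration only.
    """
    counts = Counter(pairs)

    # Any (media, track) pair seen more than once is a duplicate file.
    duplicates = sorted(pair for pair, n in counts.items() if n > 1)

    # For each disc, every track from 1 up to the highest one seen should exist.
    missing = []
    for media in sorted({m for m, _ in counts}):
        tracks = {t for m, t in counts if m == media}
        missing.extend(
            (media, t) for t in range(1, max(tracks) + 1) if t not in tracks
        )
    return missing, duplicates


missing, duplicates = check_numbering([(1, 1), (1, 3), (1, 3), (2, 1), (2, 2)])
print("missing:", missing)
print("duplicates:", duplicates)
```

Note that a check like this can only see gaps below the highest track number present; it cannot tell that the last tracks of a disc are missing.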

When something doesn’t identify, the first things I check are: Are there any files missing? Are there any extra files that don’t belong with this album? Then I look at the media and track numbers. These might occur in the filenames (e.g. “05-03 Summer '68.flac”) or in the files’ tags. We check both, but in general the tags are more authoritative. If things aren’t identifying, look for a mistake in the tags or a disagreement between the tags and the filenames.
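To make the filename side of this concrete, here is a minimal sketch of pulling media and track numbers out of names like the one above. The exact rules the identifier applies aren’t spelled out in this post, so the pattern below is an assumption that only covers the “MM-TT Title” and “TT Title” forms used in these examples; `parse_filename` is a hypothetical helper.

```python
import os
import re

# Assumed pattern: optional "MM-" media prefix, then a track number, then a title.
FILENAME_RE = re.compile(r"^(?:(\d{1,2})-)?(\d{1,3})[ ._-]+(.+)$")


def parse_filename(name):
    """Return (media_number_or_None, track_number, title), or None if no match."""
    stem, _ext = os.path.splitext(name)
    match = FILENAME_RE.match(stem)
    if match is None:
        return None
    media, track, title = match.groups()
    return (int(media) if media else None, int(track), title)


print(parse_filename("05-03 Summer '68.flac"))
print(parse_filename("01 Track.flac"))
```

Comparing the numbers parsed this way against the values in the tags is a quick way to spot the kind of tag/filename disagreement described above.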

Whether the files of a multi-disc set are together in one directory or separated into subdirectories is a matter of preference. If you separate the discs, make sure each subdirectory name includes a parseable indication of the disc number (“Disc 5”, “Atom Heart Mother [disc 5]”, “Atom Heart Mother CD5”, etc.) and that it agrees with the media number in the tags and/or filenames.

So this scheme will work:

Music/
    Miles Davis - The Complete Columbia Album Collection/
        01-01 Track.flac
        01-02 Track.flac
        ...        
        02-01 Track.flac
        02-02 Track.flac
        ...        
    Pink Floyd: Discovery/
        01-01 Track.flac
        01-02 Track.flac
        ...        
        02-01 Track.flac
        02-02 Track.flac
        ... 

And this will work just as well:

Music/
    Miles Davis - The Complete Columbia Album Collection/
        CD1/
            01 Track.flac
            02 Track.flac
            ...        
        CD2/
            01 Track.flac
            02 Track.flac
            ...        
        ...
    Pink Floyd: Discovery/
        CD1/
            01 Track.flac
            02 Track.flac
            ...        
        CD2/
            01 Track.flac
            02 Track.flac
            ...        
        ...  
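With the subdirectory layout, the disc number has to be recoverable from the directory name. As a rough sketch of what “parseable” means here, the snippet below extracts a disc number from names like “Disc 5”, “Atom Heart Mother [disc 5]”, or “CD1”. The regular expression is an assumption for illustration (the real parser may accept more or fewer forms), and `disc_number_from_dirname` is a hypothetical helper.

```python
import re

# Assumed pattern: the word "disc" or "cd" followed by a number, any case.
DISC_RE = re.compile(r"\b(?:disc|cd)\s*(\d+)\b", re.IGNORECASE)


def disc_number_from_dirname(dirname):
    """Return the disc number found in a directory name, or None."""
    match = DISC_RE.search(dirname)
    return int(match.group(1)) if match else None


for name in ["Disc 5", "Atom Heart Mother [disc 5]", "Atom Heart Mother CD5", "CD1"]:
    print(name, "->", disc_number_from_dirname(name))
```

Whatever number comes out of the directory name should match the media number in the tags (and in the filenames, if they carry one).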

One thing that likely won’t work is having all the discs of a box set separated into album-level directories, like this:

Music/
    Miles Davis - The Complete Columbia Album Collection CD1/
        01-01 Track.flac
        01-02 Track.flac
        ...        
    Miles Davis - The Complete Columbia Album Collection CD2/   
        02-01 Track.flac
        02-02 Track.flac
        ...        
    ...
    Pink Floyd: Discovery CD1/
        01-01 Track.flac
        01-02 Track.flac
        ...        
    Pink Floyd: Discovery CD2/
        02-01 Track.flac
        02-02 Track.flac
        ...        
    ...

Don’t do that!
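If you have a large library and suspect some box sets are laid out this way, a small script can flag candidates. The sketch below is an assumption-laden illustration, not part of the product: it groups sibling folder names that differ only by a trailing disc indicator, using a hypothetical helper `find_split_sets` and a made-up suffix pattern.

```python
import re
from collections import defaultdict

# Assumed pattern: a trailing "CD1", "Disc 2", "[disc 3]" and similar.
DISC_SUFFIX_RE = re.compile(
    r"[\s\[(_-]*\b(?:disc|cd)\s*\d+\b[\])]*\s*$", re.IGNORECASE
)


def find_split_sets(folder_names):
    """Map a shared base name to the sibling folders that look like its discs."""
    groups = defaultdict(list)
    for name in folder_names:
        base = DISC_SUFFIX_RE.sub("", name)
        # Skip names with no disc suffix, and bare names such as "CD1".
        if base and base != name:
            groups[base].append(name)
    return {base: sorted(names) for base, names in groups.items() if len(names) > 1}


siblings = [
    "Pink Floyd: Discovery CD1",
    "Pink Floyd: Discovery CD2",
    "Some Other Album",
]
print(find_split_sets(siblings))
```

Any group it reports is a candidate for moving into a single parent directory, using one of the two layouts that work.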

Whether you have box sets or not, you want to get the directory organization and the track/media numbers of your files correct. The other tags usually don’t matter as much, except in the case of albums with very few (fewer than four) tracks. For these, it’s crucial to get the album and artist tags correct to facilitate identification.
