laurence dougal myers

I have a reasonable collection of music in a variety of compressed formats. I like to arrange my music in a tree structure on the filesystem, sorted by genre. I also like to name the album directories in a particular format. I had acquired a backlog of albums to rename & sort, so, one day when I was procrastinating, I wrote a music sorting script.

Win32 Binaries

/sw/musicsort_bin_20110619.zip

Python Source

/sw/musicsort_src_20110619.zip

The script and the binaries are licensed under the MIT License.

An explanation follows.

My current tree structure looks something like this:

  • Music
    • Albums
      • Genre1
      • Genre2
      • Genre3
    • Misc
      • Genre1
      • Genre2
      • Genre3

My naming convention for album directories is: Artist - Year - Album [Aux] (Music Type & Bitrate)

e.g.

Slum Village - 2010 - Villa Manifesto (MP3 192)

The "Aux" label is optional, and can store information like whether the album comes from vinyl, or the label/pressing info, or the CD number in a multi-disc set.
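
As a rough illustration of that naming convention (not the script's actual code; the function name `format_album_dir` and its parameters are made up for this example), the folder name could be assembled like this:

```python
def format_album_dir(artist, year, album, music_type, bitrate, aux=None):
    """Build 'Artist - Year - Album [Aux] (Type Bitrate)'.

    Hypothetical helper: 'aux' is optional and is left out when empty.
    """
    name = "{} - {} - {}".format(artist, year, album)
    if aux:
        name += " [{}]".format(aux)
    return "{} ({} {})".format(name, music_type, bitrate)


print(format_album_dir("Slum Village", 2010, "Villa Manifesto", "MP3", "192"))
# prints: Slum Village - 2010 - Villa Manifesto (MP3 192)
```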

While most tagging software can handle the tag-based information okay, it doesn't satisfy my requirements for the Music Type & Bitrate. My logic is like this:
Music Type is roughly the file extension: "MP3", "OGG", "MP4" or *shudder* "WMA". (MP4 gets a little complicated, because there are a few different file formats/extensions for what is basically the same compression technology, e.g. AAC/MP4/M4A.)
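
A minimal sketch of "music type is roughly the file extension", assuming nothing cleverer than upper-casing the extension. The name `music_type` is hypothetical, and this does nothing to unify the AAC/MP4/M4A variants:

```python
import os


def music_type(filename):
    """Return the upper-cased extension without the dot, e.g. 'MP3'."""
    extension = os.path.splitext(filename)[1]
    return extension.lstrip(".").upper()


print(music_type("01 - Intro.mp3"))  # prints: MP3
```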

Bitrate goes like this:

  • If all files have the same bitrate, assume it's a single constant bitrate (CBR).
    • e.g. (MP3 192)
  • If there is a mixture of different bitrates, and they're all divisible by 8000, assume it's a mixture of CBR files. The bitrate value should note that it's mixed and give the minimum and maximum bitrates.
    • e.g. (MP3 mixed, 128-320)
  • If there is a mixture of different bitrates, and none of them are perfectly divisible by 8000, assume a variable bitrate (VBR). Also output the average bitrate across all files.
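
The three rules above can be sketched roughly as follows. This is an illustration rather than the script's real code: the function name `describe_bitrate` is made up, bitrates are assumed to be integers in bits per second (one per file), and the "VBR" label format is my own guess since the exact output isn't shown here. Anything that isn't a single bitrate or an all-divisible-by-8000 mix is treated as VBR, which glosses over albums that mix VBR and CBR files.

```python
def describe_bitrate(bitrates):
    """Turn a list of per-file bitrates (bits per second) into a label."""
    if not bitrates:
        raise ValueError("no bitrates given")
    distinct = sorted(set(bitrates))
    if len(distinct) == 1:
        # Every file has the same bitrate: assume CBR.
        return str(distinct[0] // 1000)
    if all(rate % 8000 == 0 for rate in distinct):
        # Several "round" bitrates: assume a mixture of CBR files.
        return "mixed, {}-{}".format(distinct[0] // 1000, distinct[-1] // 1000)
    # Otherwise assume VBR and report the average across all files.
    average = sum(bitrates) / len(bitrates)
    return "VBR {}".format(int(round(average / 1000)))


print(describe_bitrate([192000, 192000]))          # prints: 192
print(describe_bitrate([128000, 320000, 128000]))  # prints: mixed, 128-320
```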

If any value is missing (i.e. artist, year, or album), the script will prompt the user to enter it.

If multiple values are found for a field (such as year), the script will ask the user to choose one of the found values, or to enter their own. The exception is the "artist" field: multiple values will instead default to "VA" (for Various Artists).
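
Here is one way the missing/multiple value handling could look. It's a sketch under assumptions: `resolve_field` and its `ask` parameter are hypothetical names, and the numbered-menu prompt is just one possible way to let the user choose.

```python
def resolve_field(field, values, ask=input):
    """Pick a single value for a field from the values found in the tags.

    'ask' is any function that takes a prompt and returns the user's reply;
    it defaults to input() and can be swapped out for testing.
    """
    distinct = sorted(set(value for value in values if value))
    if not distinct:
        # Nothing found in the tags: the user has to supply the value.
        return ask("No {} found, enter a value: ".format(field))
    if len(distinct) == 1:
        return distinct[0]
    if field == "artist":
        # Several artists on one album: treat it as Various Artists.
        return "VA"
    for number, value in enumerate(distinct, 1):
        print("{}. {}".format(number, value))
    answer = ask("Choose a number for {}, or type your own value: ".format(field))
    if answer.isdigit() and 1 <= int(answer) <= len(distinct):
        return distinct[int(answer) - 1]
    return answer
```

For example, `resolve_field("artist", ["Artist A", "Artist B"])` returns "VA" without prompting at all.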

Once the script has generated an output folder name, the user is prompted to confirm if it's okay, or to enter "override" values for any of the information fields.

Because "auxiliary" information is not necessarily available from the tags, it can be specified at runtime, separately from other "override" values.

If the user enters any override values, the script will again prompt the user to confirm the output folder name. This will loop as long as the user keeps entering override values or auxiliary information. If the user does not enter "Y" or "Yes" in the confirmation prompt, the script will not attempt to move the folder/music files.
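
The confirm/override loop could be sketched like this. The `field=value` override syntax, the function name `confirm_name` and the `build_name` callback are all invented for the example; the real prompts may look quite different.

```python
def confirm_name(fields, build_name, ask=input):
    """Loop until the user accepts or declines the generated folder name.

    'fields' is a dict such as {"artist": ..., "year": ..., "album": ...}.
    'build_name' turns that dict into a folder name.
    Returns the accepted name, or None if the user declined.
    """
    while True:
        name = build_name(fields)
        prompt = "Use '{}'? [Y/N, or field=value to override]: ".format(name)
        answer = ask(prompt).strip()
        if answer.lower() in ("y", "yes"):
            return name
        if "=" in answer:
            # An override (or aux info) was entered: apply it and ask again.
            key, value = answer.split("=", 1)
            fields[key.strip().lower()] = value.strip()
            continue
        # Anything else counts as "no": nothing gets moved.
        return None
```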

The script can be run across a single input folder, or across all sub-folders in the input folder. If the latter, you can skip processing folders by declining the prompt to confirm the output folder name.

If looking in a folder with no files and only one sub-folder, the script will automatically try to process that sub-folder.
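
A small sketch of that rule, assuming it only steps down one level (the helper name `effective_album_dir` is hypothetical):

```python
import os


def effective_album_dir(path):
    """Return the lone sub-folder if 'path' has no files and one sub-folder."""
    entries = [os.path.join(path, name) for name in os.listdir(path)]
    files = [entry for entry in entries if os.path.isfile(entry)]
    folders = [entry for entry in entries if os.path.isdir(entry)]
    if not files and len(folders) == 1:
        return folders[0]
    return path
```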

Once the user confirms the output folder name, the script will prompt the user to choose a destination directory. The list of destinations is determined by all sub-folders in a previously defined destination root directory (specified in an external configuration file). For my purposes, I have a root "Album" directory, with a list of sub-folders named by genre. I also ignore any directories beginning with an underscore character, e.g. "_to_sort". This lets me have an "incoming" directory of albums that require sorting.
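
Building the list of destinations might look something like this. It's a sketch only: `list_destinations` is a made-up name, and the root path would come from the configuration file.

```python
import os


def list_destinations(root):
    """List sub-folders of 'root', skipping names that start with '_'."""
    return sorted(
        name
        for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name)) and not name.startswith("_")
    )
```

With a root containing "Jazz", "Hip-Hop" and "_to_sort", this would offer only "Hip-Hop" and "Jazz" as destinations.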

The destination directory, and any other information that should not change at runtime, is specified in a separate configuration file.

This was my first attempt at working with Unicode strings properly. I'm using a Windows (NTFS) system, which stores filenames as Unicode, and the album data could contain Unicode strings. The script will convert any input from the command line, such as initial arguments or any override values, to Unicode, based on the current system encoding.
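
The conversion idea can be sketched as below. This is written for Python 3, where command-line arguments already arrive as Unicode strings, so it only matters for raw bytes; the helper name `to_text` is hypothetical and the script's own approach may differ.

```python
import locale


def to_text(value, encoding=None):
    """Decode bytes to a Unicode string using the system's preferred encoding.

    Values that are already text are returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding or locale.getpreferredencoding(False))
    return value
```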

Future work:

  • Support analysis of files without any tags, trying to guess values based on the filename.
  • Support for lossless compression (should be pretty simple really, just need to add the right calls to Mutagen).
  • Make all MP4 files use the same music type (rather than AAC/MP4/M4A).
  • Correct "bitrate" naming of albums that use a mixture of VBR and CBR.
  • Support for cancelling the script - currently the only way to cancel is to press CTRL-C, which quits the whole thing.