Managing Your Collections
on Your Computer

3.  Managing Your Artists

Perl is a pearl

Before doing any programming, you have to decide what language to use.  With enough sweat, almost any language can do almost any task, but choosing the wrong one for the task will make you work way harder than you have to.  You wouldn't want to write a database program in PostScript, or a Pascal compiler in COBOL (I did the latter, and it took me a solid year!).

These days I use Perl for most of my programming.  I don't even need to keep a reference book on my desk, because answers are easy to find in the tons of books and web pages about it, and because it's a permissive language that doesn't constantly nag me about types, but just goes ahead and does what I tell it.  Perl is native to Unix and comes with nearly every distribution of Linux, but it's available for Windows and Macs too.  So anything I write in Perl ought to work on just about any computer.

Making the ARTIST.DAT files

When we look up musicians and groups, we do it by their names, ignoring certain words like "the", "a", "an", ignoring special characters such as the apostrophe, paying no attention to capitalization, and treating foreign characters as ordinary English letters.  If we want our programs to do the same, we have to tell them how to do that. 

Rather than trying to set up rules, and then endlessly tinkering with them to get them right, it's easier to record the name two different ways.  We want a key that reduces the name to something the computer will put in the right order automatically, for instance "sinead_oconnor", while we store her actual name, "Sinéad O'Connor", somewhere else where it can be fetched when needed.
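To make the idea of a key concrete, here is a minimal Perl sketch of how a name could be boiled down to one.  It is only an illustration: in this project the keys are simply the directory names, and the sub name artist_key and its exact rules are assumptions made for this example, not part of any program in this series.

```perl
use strict;
use warnings;
use utf8;                          # this source file contains UTF-8 literals
use Unicode::Normalize qw(NFD);    # core module: splits "é" into "e" + accent mark

# Hypothetical helper: reduce a display name to a sortable key.
sub artist_key {
    my ($name) = @_;
    my $key = NFD($name);          # decompose accented letters
    $key =~ s/\p{Mn}//g;           # drop the combining accent marks
    $key = lc $key;                # ignore capitalization
    $key =~ s/^(?:the|an|a)\s+//;  # ignore a leading article
    $key =~ s/'//g;                # apostrophes vanish: o'connor -> oconnor
    $key =~ s/[^a-z0-9]+/_/g;      # anything else becomes an underscore
    $key =~ s/^_+|_+$//g;          # no underscores at either end
    return $key;
}

binmode STDOUT, ':encoding(UTF-8)';
for my $name ("Sinéad O'Connor", "The Alan Parsons Project", "Olé Olé") {
    print artist_key($name), "\t", $name, "\n";
}
```

Sorting on the first column puts the artists in the order a person would expect, while the second column keeps the name exactly as it should be displayed.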

In our music project, the name of the directory where all of an artist's music is stored is the key, and we'll store the artist's actual name in a file inside that directory, called ARTIST.DAT.  Thus all the albums of a certain Spanish group will be in the directory named "ole_ole", along with a file called ARTIST.DAT.  Inside that file is the text "Ol&eacute; Ol&eacute;", which our browser will display as "Olé Olé".
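Looking a name up from its key then takes only a few lines.  The following is a minimal sketch, assuming the file is called artist.dat in lower case and holds the name on its first line; the sub name display_name is hypothetical, and falling back to the key when the file is missing is a choice made for this example.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: return the display name stored in an artist's
# directory, or the directory key itself if artist.dat is missing or empty.
sub display_name {
    my ($artist_dir) = @_;
    my $key = basename($artist_dir);            # e.g. "ole_ole"
    open my $fh, '<', "$artist_dir/artist.dat" or return $key;
    my $name = <$fh>;                           # first line of the file
    close $fh;
    return $key unless defined $name;
    $name =~ s/\s+\z//;                         # trim the trailing newline
    return length $name ? $name : $key;
}

# Print the display name for every artist directory given on the command line.
print display_name($_), "\n" for @ARGV;
```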

The Perl program MAKEARTISTDAT will do most of the work for us; you can click on its name to look at it, then save it to your computer.  Don't forget to make it executable (in Linux) with the command "chmod +x makeartistdat", or (in Windows) by renaming it to MAKEARTISTDAT.PL.  If you want to rename it to something shorter, such as MAD or MAD.PL, that's entirely up to you as well.

MAKEARTISTDAT is a very short and simple program.  All it does is go through all the artist directories, and create a file ARTIST.DAT if it doesn't exist already.  When it creates the file, it takes the name of the directory, replaces underscores with spaces, capitalizes the first letter of every word, UNcapitalizes "A", "An", "And", "Of", and "The", and stores that in the DAT file.  That takes care of most of your artists right there.  When I ran it with my present collection, I had 265 artists, and only 58 of them needed further editing.  So 78% of my artists were done as soon as I'd run this program.  But the fact that 22% still needed work shows why we need to separate the name key from the name data.
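The naming rules described above can be sketched in a few lines of Perl.  This is not the downloadable MAKEARTISTDAT itself, just a minimal sketch under some assumptions: the sub names name_from_key and make_artist_dat are made up for illustration, the files are called artist.dat in lower case, the artist directories sit two levels below the music directory (as in a/abba), and the first word of a name stays capitalized even if it is one of the small words.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);    # core module; copes with spaces in paths

my %small = map { $_ => 1 } qw(a an and of the);

# Turn a directory key such as "ace_of_base" into a first-guess display name.
sub name_from_key {
    my ($key) = @_;
    my @words = split /_+/, $key;
    for my $i (0 .. $#words) {
        # Assumption: the first word is always capitalized.
        next if $i > 0 && $small{ $words[$i] };
        $words[$i] = ucfirst $words[$i];
    }
    return join ' ', @words;
}

# Create artist.dat in every <root>/<section>/<artist> directory that
# lacks one.  Returns the number of files created.
sub make_artist_dat {
    my ($root) = @_;
    my @dirs = grep { -d $_ } bsd_glob("$root/*/*");
    my $created = 0;
    for my $dir (sort @dirs) {
        my $file = "$dir/artist.dat";
        next if -e $file;                     # never overwrite hand edits
        my ($key) = $dir =~ m{([^/]+)\z};     # last path component
        open my $fh, '>', $file or die "Cannot write $file: $!";
        print {$fh} name_from_key($key), "\n";
        close $fh or die "Cannot close $file: $!";
        $created++;
    }
    return $created;
}

# Usage: pass the music directory as the only argument.
make_artist_dat($ARGV[0]) if @ARGV;
```

Because existing files are skipped, it is safe to run something like this again after adding new artists; names you have already corrected by hand are left alone.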

Editing the ARTIST.DAT files

In Linux, we can go through all the ARTIST.DAT files at once, changing the ones that have special characters, apostrophes, or just need "The" put before the group name.  We don't even need a program to do this; we can simply call up a terminal and type:

find -name artist.dat | sort > temp;chmod +x temp;kwrite temp;./temp

(For your convenience, I've saved this string of commands here so that you can download it and use it.  Or you could select it from this page, paste it into your command line, and run it on your own computer without bothering with files.)

Assuming you're in the music directory when you run this, this string of commands will find all the ARTIST.DAT files, sort them in alphabetical order, and write them into a file called TEMP.  Then it makes TEMP executable, and opens TEMP with an editor.  When kwrite comes up, you'll see lines like:

./0/4_non_blondes/artist.dat
./a/abba/artist.dat
./a/abc/artist.dat
./a/ace_of_base/artist.dat
./a/aimee_mann/artist.dat
./a/alan_parsons_project/artist.dat
./a/alejandra_guzman/artist.dat
./a/amanda_marshall/artist.dat

Use the search-and-replace command to change "./" to "kwrite ", giving:

kwrite 0/4_non_blondes/artist.dat
kwrite a/abba/artist.dat
kwrite a/abc/artist.dat
kwrite a/ace_of_base/artist.dat
kwrite a/aimee_mann/artist.dat
kwrite a/alan_parsons_project/artist.dat
kwrite a/alejandra_guzman/artist.dat
kwrite a/amanda_marshall/artist.dat

When you're done, quit kwrite, and TEMP will execute, opening each file with kwrite, one at a time.  For most of them (78% in my case), you can tell at a glance that they're OK (4 Non Blondes, Ace of Base, Aimee Mann, Alejandra Guzman, Amanda Marshall, etc.).  After you change "Abba" to "ABBA", hit Control-S to save the change and Control-Q to exit.  Likewise change "Abc" to "ABC", "Alan Parsons Project" to "The Alan Parsons Project", and so forth.  The whole thing took me about 20 minutes, and since they're in alphabetical order, you can tell you're making progress as you go.
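If you'd rather not do the search-and-replace by hand, the same list of commands can be generated by a short Perl script.  This is a minimal sketch, assuming the files are named artist.dat in lower case; the sub name command_lines is hypothetical, and File::Find is a standard module that ships with Perl.

```perl
use strict;
use warnings;
use File::Find;

# Find every file called $filename below $root and return one
# "<command> <path>" line per file, sorted, with paths relative to $root.
sub command_lines {
    my ($command, $root, $filename) = @_;
    my @paths;
    find(
        sub {
            # Inside this callback $_ is the bare file name and
            # $File::Find::name is the path starting at $root.
            push @paths, $File::Find::name if $_ eq $filename && -f $_;
        },
        $root
    );
    s{^\Q$root\E/}{} for @paths;          # "./a/abba/..." becomes "a/abba/..."
    return map { "$command $_\n" } sort @paths;
}

# Usage: pass the music directory (for instance ".") as the only argument.
print command_lines('kwrite', $ARGV[0], 'artist.dat') if @ARGV;
```

Run from the music directory with "." as its argument, a script like this prints the same lines you would otherwise build in the editor; redirect the output into TEMP and run that as before.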

Keep paper and pen handy, and write down any mistakes you make, and any oddities you notice, as you go along.  That way you can fix them afterwards.  In my case, I forgot to change "Linda Mccartney" to "Linda McCartney" the first time through, so I went back and changed that one file after doing all the others.  Also, I noticed that I had both a "tom_petty" and a "tom_petty_and_the_heartbreakers" directory, so I moved the one album in the former to the latter, and deleted the extra directory.  I noticed also that I had sound-effects clips under "sounds".  I don't want to lose them, but they aren't music!  So I took them out of "Music" and put them in their own directory, "Sounds".

If you're curious, you can see the whole list of ARTIST.DAT files here.

Cleaning up

Kwrite is a nice little editor, and one of its nice features is that it automatically keeps a backup of the previous version when you save a change to a file.  When you changed "Monkees" to "The Monkees" in the file ARTIST.DAT, the old version was saved as ARTIST.DAT~.  ARTIST.DAT contains "The Monkees", and ARTIST.DAT~ contains "Monkees".  Occasionally this lets you recover from stupid mistakes; just erase the new file and rename the old one, and you're right back where you were.

Once you're finished editing the ARTIST.DAT files with "weird" characters (as far as the computer's concerned), you'll want to delete all the backup files.  You can do this with one command:

find -name artist.dat~ | sort > temp;chmod +x temp;kwrite temp;./temp

This is the same as our previous command, except we're looking for ARTIST.DAT~ instead of ARTIST.DAT.  When kwrite opens TEMP, search for "./" and replace it with "rm -f ".  Here "rm" is the UNIX/Linux delete (remove) command, and "-f" tells it never to ask for confirmation.  When you leave kwrite, TEMP will be executed and all the backup files will be erased.
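The same cleanup can be done without the intermediate TEMP file.  Here is a minimal Perl sketch, assuming the backups are named artist.dat~ in lower case; the sub name remove_backups is hypothetical.  Be careful with anything like this: unlink deletes the files for good, with no confirmation.

```perl
use strict;
use warnings;
use File::Find;

# Delete every artist.dat~ backup below $root.
# Returns the sorted list of files that were removed.
sub remove_backups {
    my ($root) = @_;
    my @removed;
    find(
        sub {
            return unless $_ eq 'artist.dat~' && -f $_;
            if (unlink $_) {
                push @removed, $File::Find::name;
            }
            else {
                warn "Could not remove $File::Find::name: $!\n";
            }
        },
        $root
    );
    @removed = sort @removed;
    return @removed;
}

# Usage: pass the music directory as the only argument.
print "removed $_\n" for (@ARGV ? remove_backups($ARGV[0]) : ());
```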

As before, I've saved the above command here and the resulting TEMP file here.

Making the Artist page

In our finished system, we want to be able to look up songs by artist, album title, or song title.  But we don't want to create these lists by hand.  We want some kind of program to create these lists for us, and bring them up to date or replace them when we get a new CD.  With 450 CDs, let alone 3000, anything else is just too much work.

Let's start with the page listing all the artists.  To begin with, we'll ignore artists on compilation albums, though we'll get them later, I promise.  What we want is an HTML page listing, in alphabetical order, the artists represented by the directories under "Music".  When we click on a name on that page, it should bring up another page that lists all the albums and singles by that artist, and eventually, appearances on compilation albums too.  What we want first is the Artist page, and we'll get to the ones inside each artist's directory later.

The first rule of programming is to know what you want to do.  Before writing a line of Perl, I created by hand a small HTML page listing three or four artists, and made sure it worked right and looked the way I wanted.  Then I printed it out.  This way I knew exactly what my Perl program should produce.

The program Makeartisthtml is only a little longer than the one we used to create all the ARTIST.DAT files, and we don't need to do any manual editing afterwards, I promise you.  It creates ARTIST.HTML in the music directory, replacing any previous ones, and writes the HTML header stuff and a navigation bar for jumping directly to a particular section.  Then it opens each section in turn and prints a section header ("A", "B", "C", etc.) if the section isn't "0" and if the section isn't empty (we don't need a section header if there are no artists in the section, right?).  Then, for each artist, it reads the ARTIST.DAT file and creates a link to the INDEX.HTML file that will be in each artist's directory, so that later, when we reach that stage, you can click on "Carly Simon" and jump to the page listing all her albums and songs, with pictures of her and pictures of the album covers, and click on the name of an album or a song to play it.  After writing all the section headers and links, the program writes another navigation bar at the bottom of the page and the ending HTML stuff.  You can see the page the program produces for the CDs I have down here, by clicking here.
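The page-building logic can be sketched roughly as follows.  This is not Makeartisthtml itself: the sub name artist_page, the data layout (section letter mapped to a list of key and name pairs), and the particular HTML tags are all assumptions made for illustration, and the real page's markup may differ.  The names are taken to be HTML-ready text as stored in ARTIST.DAT, so they are written out unchanged.

```perl
use strict;
use warnings;

# Build the artist page from: section => [ [key, display name], ... ].
sub artist_page {
    my (%sections) = @_;

    # Empty sections get neither a header nor a navigation entry.
    my @letters = grep { @{ $sections{$_} } } sort keys %sections;

    # Navigation bar: one jump link per section that has a header.
    my $nav = '<p>'
        . join(' ', map { qq{<a href="#$_">} . uc($_) . '</a>' }
                    grep { $_ ne '0' } @letters)
        . "</p>\n";

    my $html = "<html>\n<head><title>Artists</title></head>\n<body>\n$nav";
    for my $letter (@letters) {
        # Section "0" (names starting with a digit) gets no header.
        $html .= qq{<h2 id="$letter">} . uc($letter) . "</h2>\n"
            unless $letter eq '0';
        $html .= "<ul>\n";
        for my $artist (sort { $a->[0] cmp $b->[0] } @{ $sections{$letter} }) {
            my ($key, $name) = @$artist;
            $html .= qq{<li><a href="$letter/$key/index.html">$name</a></li>\n};
        }
        $html .= "</ul>\n";
    }
    return $html . $nav . "</body>\n</html>\n";
}

# A tiny example with three artists and one empty section.
print artist_page(
    0 => [ [ '4_non_blondes', '4 Non Blondes' ] ],
    a => [ [ 'ace_of_base', 'Ace of Base' ], [ 'abba', 'ABBA' ] ],
    b => [],
);
```

Sorting on the key rather than the display name is what keeps "The Alan Parsons Project" under A instead of T, which is the whole reason for keeping the two apart.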

Next

Next time we'll talk about scanning CD covers and other pictures for your music collection.

Copyright © 2005-2007 by Green Sky Press.  All rights reserved.  Backgrounds and images are copyright by their respective authors, who retain all rights.

This page has been validated against XHTML Strict and viewed under Konqueror, Firefox, Opera, and Internet Explorer at a screen resolution of 1024 × 768.  If you find any bugs, please contact me at the e-mail address on the home page.