Controlled Vocabularies

Part I

Years ago, I wrote some relational database applications for corporate use in a sales and marketing environment and learned very quickly that users enter data inconsistently, even into the same field, and even when it is the same person on separate occasions. It is amazing how many different ways there are to enter a name such as Lytham St. Anne’s, for example: the various spellings of Ann and Anne, both with and without an apostrophe; Saint, St and St., with and without spaces and full stops; not to mention misspellings of Lytham! Then, of course, there can be various capitalisation combinations of one, two or all three words. When it comes to organising, searching for and ordering data, this quickly becomes a nightmare, as you need to include every possible variation and anticipate misspellings of each term, otherwise any statistics and search results will be inaccurate.
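To make the problem concrete, here is a minimal Python sketch of how free-text variants of one place name might be reduced to a single approved spelling. It is only an illustration, not code from the applications I wrote: the names `ALIASES`, `comparison_key` and `canonical_place` are made up for this example, and the alias table is a small hypothetical one.

```python
import re

CANONICAL = "Lytham St. Anne's"

# Hypothetical alias table: comparison key -> approved spelling.
ALIASES = {
    "lytham st annes": CANONICAL,
    "lytham st anns": CANONICAL,
    "lytham st anne": CANONICAL,
    "lytham st ann": CANONICAL,
}


def comparison_key(text):
    """Strip case, punctuation and spacing differences from an entry."""
    key = text.lower()
    key = re.sub(r"['’]", "", key)         # drop straight and curly apostrophes
    key = key.replace(".", " ")            # a full stop may stand in for a space
    key = re.sub(r"\bsaint\b", "st", key)  # Saint -> st
    return " ".join(key.split())           # collapse runs of spaces


def canonical_place(text):
    """Return the approved spelling, or None if the entry is not recognised."""
    return ALIASES.get(comparison_key(text))
```

An entry such as `LYTHAM ST ANNES` or `lytham  st.ann's` comes back as the approved spelling, while a misspelling such as `Lythm St Annes` returns `None` and can be flagged for correction rather than silently stored.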

So when I set about creating my first digital asset management system (DAM as it became fashionably known) I tried to ensure I would always enter keywords, location names, copyright details etc in exactly the same way and in the same format. That would ensure that everything would be neat and tidy, would look more professional but most importantly would also speed up cataloguing, indexing, ordering, retrieval and searching.

I adopted a system of entering location details in the format Town, County, Country in a top-down way so that I could find images by country, then by county and then by town, which would make it easy to find all images taken in any given place. In that scenario, it is obviously essential that the names are always entered in exactly the same way. 
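A rough Python sketch of that idea follows. It assumes every location is stored as a single "Town, County, Country" string; the file names and the helper names `parse_location` and `build_index` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def parse_location(location):
    """Split 'Town, County, Country' and return it top-down."""
    town, county, country = [part.strip() for part in location.split(",")]
    return country, county, town


def build_index(images):
    """Group file names by country, then county, then town."""
    index = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for filename, location in images.items():
        country, county, town = parse_location(location)
        index[country][county][town].append(filename)
    return index


# Hypothetical sample data.
images = {
    "img_001.tif": "Lytham St. Anne's, Lancashire, England",
    "img_002.tif": "Blackpool, Lancashire, England",
    "img_003.tif": "Keswick, Cumbria, England",
}

index = build_index(images)
lancashire_towns = sorted(index["England"]["Lancashire"])
```

An entry that does not have exactly three comma-separated parts raises a `ValueError` in `parse_location`, which is one simple way of catching locations that were not entered in the agreed format.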

Unfortunately, none of the software available at the time offered drop-down selection boxes, so it was necessary to periodically index and check that everything was in the right place. I could have written my own database but it seemed pointless re-inventing the wheel and besides, I was sure a better solution would soon become available as digital photography became more widely accepted and image management became more of a problem for more people. I devised a system where original images were kept in one folder and, once processed, the files were sent to another. I then had separate folders for those which had been sharpened, from which prints would be made, and those which had been watermarked, from which web galleries would be built. It worked quite well but the number of different versions needed for each image was stupid and if I needed to go back to the original, all the cleaning, keywording and so on had to be repeated. I tried a number of available software solutions and found each had its strengths and weaknesses. Some were good at building web galleries, others at sorting images, another at creating a viewer for offline use and so on, but none of them seemed to understand the concept of the digital workflow and of what was required by anyone taking photography seriously. Fewer still seemed to appreciate how important correct captioning and efficient keywording were.

My take on it was that I didn’t mind spending the time entering all the relevant information initially as I would gain in the long term – I certainly did not expect to need to re-enter all of that information in the future, so I tried to plan ahead. 

As a nature and wildlife photographer, I was starting to supply images to specialist libraries such as the Science Photo Library, so it was essential that every image was correctly captioned with both English and Latin or scientific names and that the keywords included species as well as genus, common names and other appropriate details. I soon got fed up with constantly having to look up species and scientific names and enter them every time I took a picture of a specific bird, flower or mammal, so I devised my own system of nested keywords. Cumulus by Canto was the only software at the time which allowed nested keywords; neither iView Media nor Portfolio, nor any of the many other applications I tried, had this facility. I set up a complex nested structure, starting top-down, and adopted a convention of always capitalising species names but using lower case for generic terms. For example, ‘House Sparrow’ would be capitalised but ‘sparrow’ would not, so that it was easy to see which keywords needed further clarification or naming. When quickly keywording I might enter sparrow, but later on find all the images of sparrows and then separate them into species such as House Sparrow, Field Sparrow, Tree Sparrow etc. A separate keyword branch was used for male, female, juvenile, adult, summer, winter plumage etc. Those I wasn’t sure of were keyworded ‘unknown bird’ or ‘unknown mammal’, so I could easily retrieve them later for proper identification.
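The capitalisation convention also lends itself to a simple automated check. The Python sketch below is only an illustration of the idea and not something Cumulus did for me: it assumes that an image still needs attention if none of its keywords starts with a capital letter, and the catalogue contents and function names are hypothetical.

```python
def is_species_keyword(keyword):
    """Under the convention above, species names start with a capital."""
    return keyword[:1].isupper()


def images_to_refine(catalogue):
    """Return images that have only generic (lower-case) keywords."""
    return sorted(
        filename
        for filename, keywords in catalogue.items()
        if not any(is_species_keyword(k) for k in keywords)
    )


# Hypothetical catalogue: file name -> keywords entered so far.
catalogue = {
    "img_101.tif": ["sparrow", "male"],
    "img_102.tif": ["sparrow", "House Sparrow", "female"],
    "img_103.tif": ["unknown bird"],
}

pending = images_to_refine(catalogue)
```

Here `pending` contains `img_101.tif` and `img_103.tif`, the two images that still carry only generic terms and are waiting for a species name.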

After trying several different solutions for processing RAW images, a subject I will cover separately, I bought the Adobe Camera Raw Plug-in for Photoshop when it first became available. I liked the colours that ACR produced and so when Canto decided to cease development of the single user version of Cumulus in favour of enterprise solutions, I started to use Adobe’s Bridge in conjunction with ACR.

Things started to change as the market matured and software companies started to realise that there was in fact a gap in this lucrative market. When Apple launched Aperture, at first look it seemed ideal and I had a chance to try it out for a while but I was disappointed – I really wanted to like it but I did not take to it at all. It seemed rather awkward, with lots of unnecessary eye-candy, and I didn’t get on with the keyword facility which, for me, is such an important part. My opinion was that this software seemed more suited to wedding, press or event photographers who need to process lots of images quickly, rather than fewer images with more accurate details and precise colours. Then within a week or so, Adobe announced the Lightroom project and made the software available as a free public beta. I jumped at the chance and immediately downloaded it. I fell in love with the interface as soon as I started using it and began importing a few hundred images just to try it out. As newer beta versions were released, it wasn’t long before I had imported my entire library and was using it on a daily basis, despite Adobe’s cautions, as it promised to be (almost) everything I had been hoping for. I was able to provide some ideas and feedback about the concept of a hierarchical keyword system and found others were facing similar issues. As the later betas arrived I was amazed to find that, on re-importing my images, it recognised my old Cumulus keyword structure almost intact, with just a little tidying up required. I think I must have been able to offer some useful feedback to Adobe during the beta testing as I was then invited to join the Pre-Release program. It seemed that at last, I had found a software solution to suit my particular needs.

After much time and effort, my keyword structure evolved and became much more sophisticated, so that when entering ‘sparrow’ I could select the type of sparrow from a drop-down list and Lightroom would automatically enter the scientific name, genus, species, family, order and so on, allowing me to quickly, easily and accurately keyword images without constant repetition. More importantly, I knew that the keywords I selected would always be entered in exactly the same way, so that searching for sparrow would offer all the sparrows in my library, but searching for House Sparrow would only show images of that species, and none would be missed due to being spelled incorrectly.
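The principle can be sketched in a few lines of Python. This is not how Lightroom stores or applies keywords internally; it is a simplified illustration in which each keyword has a single parent, and the layout of the `PARENT` table, the file names and the function names are all assumptions made for the example.

```python
# Hypothetical keyword hierarchy: each keyword points to its parent.
PARENT = {
    "House Sparrow": "Passer domesticus",
    "Tree Sparrow": "Passer montanus",
    "Passer domesticus": "Passer",
    "Passer montanus": "Passer",
    "Passer": "sparrow",
    "sparrow": "Passeridae",
    "Passeridae": "Passeriformes",
    "Passeriformes": "bird",
}


def expand(keyword):
    """Return the keyword followed by every ancestor up to the root."""
    chain = [keyword]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def search(catalogue, term):
    """Find images whose keyword, or any of its ancestors, matches term."""
    return sorted(
        filename
        for filename, keyword in catalogue.items()
        if term in expand(keyword)
    )


# Hypothetical catalogue: one selected keyword per image.
catalogue = {
    "img_010.tif": "House Sparrow",
    "img_011.tif": "Tree Sparrow",
}

all_sparrows = search(catalogue, "sparrow")
house_only = search(catalogue, "House Sparrow")
```

Because only the most specific keyword is ever picked from the list, the scientific name, genus, family and order are filled in from the hierarchy rather than typed, so `all_sparrows` finds both images while `house_only` finds just the one.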

Part II