Controlled Vocabularies

Years ago, I wrote some relational database applications for corporate use in a sales and marketing environment and learned very quickly that users always enter data differently, even into the same field, and even the same person will enter it differently on separate occasions. It is amazing how many different ways there are to enter a name such as Lytham St. Anne's, for example: the various spellings of Ann and Anne, both with and without an apostrophe; Saint, St and St., with and without spaces and full stops; not to mention misspellings of Lytham! Then, of course, there are the various capitalisation combinations of one, two or all three words. When it comes to organising, searching for and ordering data, this quickly becomes a nightmare, as you need to include every possible variation and anticipate misspellings of each term, otherwise any statistics and search results will always be inaccurate.
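
To make the problem concrete, here is a minimal sketch in Python of mapping free-text entries onto a single canonical spelling. It is an illustration only, not the software described in this article: the list `CANONICAL_PLACES`, the helper names and the similarity cutoff of 0.85 are all assumptions chosen for the example. Case, spacing, apostrophes, full stops and "Saint" versus "St" are handled by building a comparison key, and near-miss spellings fall back to the standard library's `difflib`.

```python
import difflib
import re

# Hypothetical controlled vocabulary: exactly one approved spelling per place.
CANONICAL_PLACES = ["Lytham St. Anne's", "Blackpool"]


def comparison_key(text):
    """Reduce an entry to a key that ignores case, punctuation and spacing."""
    key = text.lower()
    key = re.sub("[.'\u2019]", "", key)      # drop full stops and apostrophes
    key = re.sub(r"\bsaint\b", "st", key)    # treat "Saint" and "St" alike
    return re.sub(r"\s+", "", key)           # ignore all spacing


# Look-up table from comparison key to the approved spelling.
KEY_TO_CANONICAL = {comparison_key(p): p for p in CANONICAL_PLACES}


def canonical_place(entry, cutoff=0.85):
    """Return the approved spelling for an entry, or None if nothing is close."""
    key = comparison_key(entry)
    if key in KEY_TO_CANONICAL:
        return KEY_TO_CANONICAL[key]
    # Fall back to fuzzy matching to catch misspellings (cutoff is an assumption).
    close = difflib.get_close_matches(key, list(KEY_TO_CANONICAL), n=1, cutoff=cutoff)
    return KEY_TO_CANONICAL[close[0]] if close else None


for raw in ["lytham saint annes", "LYTHAM ST.ANNE'S", "Lytham St Anns", "Preston"]:
    print(repr(raw), "->", canonical_place(raw))
```

Entries that return `None` would be flagged for a person to review rather than silently added, which is the whole point of a controlled vocabulary.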

So when I set about creating my first digital asset management system (DAM as it is now fashionably called ;) ), I tried to make sure that I would always enter keywords, location names, copyright details etc. in exactly the same way and in the same format. That would ensure that everything would be neat and tidy and would look more professional but, most importantly, it would also speed up cataloguing, indexing, ordering, retrieval and searching.

I adopted a system of entering location details in the format Town, County, Country so that I could search top-down, finding images by country, then by county and then by town, which would make it easy to find all images taken in any given place. In that scenario, it is obviously essential that the names are always entered in exactly the same way. Unfortunately, none of the software available at the time offered drop-down selection boxes, so it was necessary to periodically index the entries and check that everything was in the right place. I could have written my own database but it seemed pointless re-inventing the wheel and, besides, I was sure a better solution would soon become available as digital photography became more widely accepted and image management became more of a problem for more people.
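
The Town, County, Country convention can be sketched in a few lines of Python. This is a hedged illustration rather than any particular product's behaviour: the `Location` type, the function names, the file names and the sample catalogue are all made up for the example. It shows why consistent entry matters, since the top-down search relies on exact matches at each level.

```python
from typing import NamedTuple


class Location(NamedTuple):
    town: str
    county: str
    country: str


def parse_location(field):
    """Split a 'Town, County, Country' field into its three parts."""
    parts = [p.strip() for p in field.split(",")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'Town, County, Country', got {field!r}")
    return Location(*parts)


def find_images(catalogue, country, county=None, town=None):
    """Search top-down: country first, then optionally county, then town."""
    matches = []
    for filename, field in catalogue.items():
        loc = parse_location(field)
        if loc.country != country:
            continue
        if county is not None and loc.county != county:
            continue
        if town is not None and loc.town != town:
            continue
        matches.append(filename)
    return sorted(matches)


# Hypothetical catalogue: file name -> location field as typed.
CATALOGUE = {
    "img_001.tif": "Lytham St. Anne's, Lancashire, England",
    "img_002.tif": "Blackpool, Lancashire, England",
    "img_003.tif": "Keswick, Cumbria, England",
}

print(find_images(CATALOGUE, "England", "Lancashire"))
# ['img_001.tif', 'img_002.tif']
```

A malformed field raises `ValueError` instead of being filed in the wrong place, which stands in for the periodic manual check described above.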

I devised a system where original images were kept in one folder and, once processed, the files were sent to another. There were then separate folders for those which had been sharpened, from which prints would be made, and those which had been watermarked, from which web galleries would be built. It worked quite well, but the number of different versions needed for each image was ridiculous, and if I needed to go back to the original, all the cleaning, keywording etc. had to be repeated. I tried a number of available software solutions and found each had its strengths and weaknesses. Some were good at building web galleries, others at sorting images, another at creating a viewer for offline use and so on, but none of them seemed to understand the concept of the digital workflow or what was required by anyone taking photography seriously. Fewer still seemed to appreciate how important correct captioning and efficient keywording were. My take on it was that I didn't mind spending the time entering all the relevant information initially, as I would gain in the long term. I certainly did not expect to need to re-enter all of that information in the future, so I tried to plan ahead.

As a nature and wildlife photographer I was starting to supply images to specialist libraries such as the Science Photo Library, so it was essential that every image was correctly captioned with both English and Latin or scientific names and that the keywords included species as well as genus and other appropriate details. I soon got fed up with constantly having to look up species and scientific names and enter them every time I took a picture of a specific bird, flower or mammal, so I devised my own system of nested keywords. Cumulus by Canto was the only software at the time which allowed nested keywords; neither iView Media nor Portfolio, nor any of the many other applications I tried, had this facility.

I set up a complex nested structure which again worked top-down, and I adopted a convention of always capitalising species names but using lower case for generic terms. For example, 'House Sparrow' would be capitalised but 'sparrow' would not, so that it was easy to see which keywords needed further clarification or naming. When quickly keywording I might enter 'sparrow', then later find all the images of sparrows and separate them into species such as House Sparrow, Field Sparrow, Tree Sparrow etc. A separate keyword branch was used for male, female, juvenile, adult, summer plumage, winter plumage etc. Those I wasn't sure of were keyworded 'unknown bird' or 'unknown mammal', so I could easily retrieve them later for proper identification.
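
As a rough sketch of how nested keywords and the capitalisation convention can work together, consider the Python below. It is not how Cumulus stores keywords; the `PARENT` table, the function names and the sample images are assumptions for illustration. Each keyword points at its parent, so a species keyword automatically brings its generic terms with it, and any image carrying only lower-case keywords is flagged for later identification.

```python
# Hypothetical fragment of a nested keyword tree: keyword -> parent keyword.
PARENT = {
    "bird": None,
    "sparrow": "bird",
    "House Sparrow": "sparrow",
    "Tree Sparrow": "sparrow",
    "unknown bird": "bird",
}


def expand(keyword):
    """Return the keyword followed by every ancestor up to the top of the tree."""
    chain = []
    current = keyword
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def is_generic(keyword):
    """Generic terms are all lower case; species names are capitalised."""
    return keyword.islower()


def needs_identification(keywords):
    """True when an image has no capitalised (species-level) keyword yet."""
    return all(is_generic(k) for k in keywords)


# Hypothetical images and the keywords typed for them.
IMAGES = {
    "img_010.tif": ["sparrow", "male"],
    "img_011.tif": ["House Sparrow", "female"],
    "img_012.tif": ["unknown bird"],
}

print(expand("House Sparrow"))
# ['House Sparrow', 'sparrow', 'bird']

to_review = sorted(name for name, kws in IMAGES.items() if needs_identification(kws))
print(to_review)
# ['img_010.tif', 'img_012.tif']
```

Because the check relies purely on capitalisation, it only works if the convention is followed every time, which is again the argument for a controlled vocabulary.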

Continue to Part 2
