Monday, October 4, 2010

The Mathematics of f/stop Aperture Numbers

A sequence is an ordered list of numbers. Many sequences can be constructed using a formula known as a recurrence relation, which gives each term from the ones before it. The most obvious example is the sequence of integers. How? Start with any integer (positive or negative). To get the next integer, simply add 1!

Some sequences are very obvious to decipher, while others require more mathematical manipulation, such as the Fibonacci sequence. Sequences often show up in pure mathematics, number theory, and computer science. [More about sequences].

One of the most widely used sequences is the f-number or f-stop (f/stop) series of numbers in photography. No matter where you stand as a photographer, you will be faced with these numbers. Often, you will have to work out how many stops there are between two stop numbers. You will have to memorize them or just rely on your camera, unless you know the mathematics behind these numbers.

In this article, I will explain the method I use to remember the f-stop sequence. All that is needed is to remember the first two numbers. I will first quickly present the method so that you don't have to read this entire article. I will then present the mathematical formalism for the way f-stop numbers are constructed.

How to Remember the f/stop Numbers

I will start with the most common f-stop numbers. These are given by the following set of numbers

1, 1.4, 2, 2.8, 4, 5.6, 8, 11, 16, 22, 32

Let us write these in the following way

1, 2, 4, 8, 16, 32
1.4, 2.8, 5.6, 11, 22

Looking at each row separately, you will quickly notice that these form what is called a geometric sequence. A geometric sequence is a set of ordered numbers, in which any number is obtained by multiplying the previous number by a constant value. This constant value is known as the common ratio. Guess what the common ratio (in our case) is? It is 2 (see proof below). Here is the same idea written as chains of multiplications

1 -> 2 -> 4 -> 8 -> 16 -> 32 (x2 at each step)
1.4 -> 2.8 -> 5.6 -> 11 -> 22 (x2 at each step)

As you can see, you only need to remember the first two f-numbers, i.e. 1 and 1.4. Separate them into two sets: the even set (first row) and the odd set (bottom row). Then construct the entire f-stop range just by multiplying by 2.

Remarks:
  1. Note that each set presents jumps of two stops, not one stop. f/1.4 lets in four times as much light as f/2.8. Being multiplied by two should emphasize that fact: for f/1.4, the aperture diameter is twice that for f/2.8, so the aperture area is four times larger. (As will be explained below, to go one full stop at a time, you'd have to multiply by Sqrt(2) ~ 1.4, e.g. 1.4 x 1.4 ~ 2 and so on...)
  2. Looking at the odd set (bottom row), you can notice that 5.6 x 2 = 11.2, not 11. So why do we choose 11? To the best of my knowledge, it is just a convention to keep the numbers easy to remember. The actual f/stop used by the lens is about 11.31.
As will be shown below, if we start with f/1.0 as the smallest possible f/stop, the next full stop is 1.0xSqrt(2) = 1.414. This is the first f/stop that corresponds to the bottom row. Now
1.414x2 = 2.828
2.828x2 = 5.656
5.656x2 = 11.312
11.312x2 = 22.624
22.624x2 = 45.248

and so on...
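
The two-row trick above can be sketched in a few lines of Python. This is only an illustration of the mnemonic; the function name doubling_series is made up for this example, and the odd row starts from the rounded value 1.4 rather than the exact square root of 2.

```python
def doubling_series(seed, count):
    """Return `count` values starting at `seed`, each twice the previous one."""
    return [seed * 2 ** k for k in range(count)]

# Even row starts at 1, odd row starts at 1.4 (the two numbers to remember).
even_row = doubling_series(1.0, 6)
odd_row = doubling_series(1.4, 5)

# Sorting the merged rows recovers the usual full-stop order.
full_stops = sorted(even_row + odd_row)
print([round(s, 1) for s in full_stops])
```

Note that the odd row comes out as 11.2 and 22.4 here; the lens markings round these to 11 and 22.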

Mathematical Analysis

For those of you who are mathematically inclined,  the analysis that follows provides the rationale behind the construction of the stop number.

In photography, the lens aperture is the opening in the lens (or on the camera body) that determines the amount of light admitted to the light sensitive medium (film or CCD ...). The surface area of this opening can be adjusted by the use of a diaphragm. Closing the diaphragm is called stopping down the lens, while opening it is called opening up (whether by a full or a half stop). We define the f-stop number as

$$S = \frac{f}{D}$$

where f is the focal length of the lens and D is the diameter of the aperture.

In science & engineering, S is referred to as a dimensionless number, meaning that it does not have any units associated with it. In our case, since the focal length and the diameter both describe a length (m, cm, mm...), their ratio is dimensionless because the units cancel (just like common factors in a fraction).

The advantage of using a dimensionless quantity is that any results drawn from an experiment on a specific device (lens in this case) will equally apply to any other device with the same dimensionless number. For example, you've probably heard of the Mach number "M". The Mach number is a dimensionless quantity used in aerodynamics and describes how fast an object is moving in a medium (air) compared to the speed of sound in that medium. It is probably the most popular dimensionless number on the planet! (I think the f-stop number should be added to the list of dimensionless numbers). Now any experiment carried out on a model jet with M = 3 for example, will illustrate what happens when the real jet is flying at M = 3 in the atmosphere (shockwave structure, pressure and temperature distributions ...), provided the other relevant dimensionless numbers are matched as well.

Here's an example in photography. The amount of light that reaches the sensor (per unit area) using a 28 mm lens with S = 2 is exactly the same as that of an 80 mm lens with S = 2, although the diameters of the two apertures are different (if all other factors that affect image brightness are held constant). (If you can't prove this for yourself, let me know and I'll write up my proof).
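
A rough sketch of one way to see this, under assumptions that are mine and not part of the original argument: a distant, uniformly lit subject, an idealized thin lens, and the usual definition S = f/D. The light collected, written here as \Phi, scales with the aperture area, while the image of a given subject has a linear size proportional to f, so its area scales with f^2. The brightness per unit sensor area, written here as E, then depends only on S:

```latex
\Phi \propto \frac{\pi D^2}{4},
\qquad
A_{\text{image}} \propto f^2
\quad\Longrightarrow\quad
E = \frac{\Phi}{A_{\text{image}}} \propto \frac{D^2}{f^2} = \frac{1}{S^2}
```

For the example above, the 28 mm lens at S = 2 has D = 14 mm and the 80 mm lens at S = 2 has D = 40 mm, yet both give E proportional to 1/4.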

This dimensionless number is a very useful tool for determining properties of lenses (and therefore the light coming through) without referring to diameters or any other data.

In cameras, when we set the f/stop number, we are essentially setting the diameter of the lens aperture. This can be computed from the focal length. Therefore, for a given stop S, the diameter of the opening of your lens is

$$D = \frac{f}{S}$$

For example, a lens set at a focal length of 70mm and a stop number of 5.6 has an aperture diameter of

$$D = \frac{70\ \text{mm}}{5.6} = 12.5\ \text{mm}$$

Note that S is inversely proportional to D which explains why as the stop number increases, less light enters the camera since D decreases (f/32 lets in less light than f/16).
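
As a small numerical check of the relation between stop number and diameter, here is a hedged Python sketch. The function names are invented for this illustration, and the light ratio assumes that light gathered scales with aperture area at a fixed focal length.

```python
def aperture_diameter_mm(focal_length_mm, f_number):
    """Aperture diameter D = f / S, in the same units as the focal length."""
    return focal_length_mm / f_number

def relative_light(s_wide, s_narrow):
    """How many times more light f/s_wide admits than f/s_narrow
    at the same focal length (ratio of aperture areas)."""
    return (s_narrow / s_wide) ** 2

print(aperture_diameter_mm(70, 5.6))  # about 12.5 mm, as in the example above
print(relative_light(16, 32))         # f/16 admits 4 times the light of f/32
```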

Let us now compute the stop number (S) required to let in twice as much light, for a given lens set at a fixed focal length.

We start by computing the diameter that will let in twice the amount of light. This is equivalent to saying that the aperture surface area has to be twice as much to let in twice as much light. For example, at the same focal length, an aperture with a surface area of 10 mm^2 will let in twice as much light as an aperture with a surface area of 5 mm^2. This is how it looks mathematically. The surface area of a circular aperture of diameter D is

$$A = \frac{\pi D^2}{4}$$

We now set

$$A_2 = 2 A_1$$

or

$$\frac{\pi D_2^2}{4} = 2\,\frac{\pi D_1^2}{4}$$

finally

$$D_2 = \sqrt{2}\,D_1 \approx 1.414\,D_1$$

In other words, for an aperture to let in twice as much light, its diameter must increase by approximately 41%.

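Now that we have a relation between the diameters of both apertures, we can use the f/stop equation to deduce the recurrence relation between the stop numbers. With D_2 = \sqrt{2} D_1 at the same focal length f, this is done as follows:

$$S_2 = \frac{f}{D_2} = \frac{f}{\sqrt{2}\,D_1} = \frac{S_1}{\sqrt{2}}$$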

Thus, to let in half as much light, we multiply the previous stop number by the square root of two ~ 1.4. Alternatively, to let in twice as much light, we multiply by the reciprocal of the square root of two ~ 0.7. In general, for a stop number Sn, we have

$$S_{n+1} = \sqrt{2}\,S_n, \qquad S_{n-1} = \frac{S_n}{\sqrt{2}}$$

where S{n+1} is the stop number that lets in HALF as much light as Sn, while S{n-1} is the stop number that lets in TWICE as much light as Sn. For example, if we are at a stop number of Sn = 5.6, we have

$$S_{n+1} = 5.6 \times \sqrt{2} \approx 7.9 \approx 8, \qquad S_{n-1} = \frac{5.6}{\sqrt{2}} \approx 3.96 \approx 4$$

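In its present form, our recurrence formula depends on the previous stop number. It would be useful if we could write our recurrence formula based on some initial reference stop number S0. For this, we do the following

$$S_1 = \sqrt{2}\,S_0, \quad S_2 = \sqrt{2}\,S_1 = (\sqrt{2})^2 S_0, \quad \ldots, \quad S_n = (\sqrt{2})^n S_0$$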

With this general recurrence formula, we can calculate the stop number at any given number of stops from a starting number S0. For example, if our lens is set at S0 = 5.6 and we want to calculate the stop number corresponding to 3 stops down (i.e. letting in 2^3 = 8 times LESS light), we have

$$S_3 = (\sqrt{2})^3 \times 5.6 \approx 2.83 \times 5.6 \approx 15.8 \approx 16$$

This means that n is in fact the number of stops from S0. It is simply a counter of stops.

Smallest Stop Number


The question now arises as to what is the smallest stop number that a lens can achieve and how difficult it is to manufacture such lenses. I do not have any experience with lens manufacturing (although I have been contemplating learning that skill lately), but the mathematics could give us a hint. Looking at the equation for the aperture diameter

$$D = \frac{f}{S}$$

one could argue that the longer the focal length of a lens, the more difficult it is to achieve a small stop number. Here is why. Let's say that you have a 50mm lens. For a stop number S=1, the aperture diameter is equal to the focal length, i.e. 50 mm. Therefore, the actual diameter of your lens MUST be at least 50mm! If you add the barrel and the internal mechanisms, the actual diameter will be even larger than 50mm.

Look at the Nikon 50mm F1.4D lens for example. Its maximum aperture diameter is D = 50mm/1.4 = 35.7mm while the lens' actual diameter is a whopping 64.5mm!

Let's take another extreme. For a lens with a focal length of 500mm, a stop number of 1 means that the aperture diameter is 500mm! Imagine carrying a lens that's half a meter in diameter (or 1.64 ft!). These become impractical (unless you're dealing with a telescope). For example, the maximum aperture diameter for the Nikon 600mm f4.0 is D = 600/4 = 150mm. The lens has a diameter of 165mm and a weight of about 5kg!

Bottom line is that the longer the focal length of a lens, the more difficult it is to manufacture it with wide apertures. That is also why fast lenses are very expensive! And that's why fast TELEPHOTO lenses are even more expensive! There are important design considerations to take into account in that case...

It is however convenient to choose S = 1 as the smallest stop number and start from there. In this case, our recurrence formula becomes

$$S_n = (\sqrt{2})^n$$

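and here's how the famous f/stop numbers are generated:

$$S_0 = 1,\quad S_1 = \sqrt{2} \approx 1.4,\quad S_2 = 2,\quad S_3 = 2\sqrt{2} \approx 2.8,\quad S_4 = 4,\quad S_5 = 4\sqrt{2} \approx 5.6,\quad S_6 = 8,\quad S_7 = 8\sqrt{2} \approx 11.3,\quad S_8 = 16,\quad S_9 = 16\sqrt{2} \approx 22.6,\quad S_{10} = 32$$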

Remember, each full stop lets in twice as much or half as much light. The above equation is for reducing the aperture size, i.e. letting in less light. (The converse recurrence relation can be easily derived.)
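
For readers who prefer to see the numbers, here is a minimal Python sketch of the full-stop formula, i.e. a starting stop multiplied by the square root of two raised to the number of stops. The function name full_stop and the list of nominal markings are assumptions made for this illustration.

```python
import math

def full_stop(n, s0=1.0):
    """Stop number n full stops down from s0: s0 * sqrt(2)**n."""
    return s0 * math.sqrt(2) ** n

# Conventional lens markings, listed here only for comparison.
nominal = [1, 1.4, 2, 2.8, 4, 5.6, 8, 11, 16, 22, 32]
exact = [full_stop(n) for n in range(len(nominal))]

for n, (value, marking) in enumerate(zip(exact, nominal)):
    print(f"n = {n:2d}   exact = {value:7.3f}   marked as f/{marking}")

# Three stops down from f/5.6 lands close to f/16.
print(round(full_stop(3, s0=5.6), 2))
```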

Intermediate Stops

Now what about half stops, one-third stops or one-fourth stops? How are these numbers constructed? If a full stop lets in half as much light, does a half stop let in 75% light? Let's look at that.

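Keeping in mind that the f-stop sequence is a geometric sequence (multiplicative), any value sought within an interval has to obey the rules of a geometric sequence. Let us insert a "partial stop" in the middle of a full stop interval, and call P the common ratio between consecutive half stops

$$S_{n+1/2} = P\,S_n, \qquad S_{n+1} = P\,S_{n+1/2}$$

or

$$S_{n+1} = P^2 S_n$$

but, we know that

$$S_{n+1} = \sqrt{2}\,S_n$$

from which we can compute P, the common ratio for the half-stop sequence

$$P = \sqrt{\sqrt{2}} = 2^{1/4} \approx 1.189$$

A half stop therefore lets in 1/P^2 = 1/\sqrt{2}, or about 71% of the light, not 75%.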

The same principle applies for deriving one-third, one-fourth,... one-mth stops (divide the interval into m sub intervals). In general, for a one-mth stop increment, we will have

$$S_{n+1/m} = 2^{1/(2m)}\,S_n$$

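For example, for a 1/3 stop increment from S0=1, we have

$$S_{1/3} = 2^{1/6} \approx 1.12, \qquad S_{2/3} = 2^{2/6} \approx 1.26, \qquad S_{1} = 2^{3/6} \approx 1.41$$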

There is an even easier way to derive non-integer stop formulas. Using

$$S_n = (\sqrt{2})^n S_0 = 2^{n/2} S_0$$

we notice that there is no restriction on "n" being an integer. As discussed previously, n is simply a stop counter. So, for example, if you want 1/3 stops, you simply substitute n = 1/3. If you want 2.4 stops, use n = 2.4 etc...
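
A short Python sketch of that observation, assuming the formula is the starting stop times 2 raised to n/2 with n allowed to be any real number. The function name stop_number is hypothetical.

```python
def stop_number(n, s0=1.0):
    """Stop number n stops down from s0; n may be fractional."""
    return s0 * 2 ** (n / 2)

# One-third stops between f/1 and f/1.4.
third_stops = [stop_number(k / 3) for k in range(4)]
print([round(s, 2) for s in third_stops])

# A half stop and an arbitrary 2.4 stops from f/1.
half_stop = stop_number(0.5)
odd_stop = stop_number(2.4)
print(round(half_stop, 3), round(odd_stop, 3))
```

Stepping by 1/3 gives the common ratio 2^(1/6), and stepping by 1/2 gives 2^(1/4), matching the half-stop ratio P derived above.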

Number of Stops Between Two Stop Numbers

One can derive an equation for the number of stops between two stop numbers using the formula

$$S_n = 2^{n/2}\,S_0$$

By taking the natural logarithm of both sides of the equation, we get

$$n = \frac{2 \ln(S_n / S_0)}{\ln 2}$$

For example, to calculate the number of stops between f/22 and f/1.4, we set S0 = 1.4 and Sn = 22

$$n = \frac{2 \ln(22 / 1.4)}{\ln 2} \approx 7.95 \approx 8$$

so that f/1.4 is 8 stops away from f/22 and lets in ~ 2^8 = 256 times more light than f/22! (This is the ratio of aperture areas at the same focal length. For lenses of different focal lengths, the amount of light that reaches the sensor still differs by ~ 256 times.)
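
The stop count can also be checked with a few lines of Python. This is a sketch only; the function name stops_between is made up, and it simply evaluates twice the logarithm of the ratio of the two stop numbers divided by the logarithm of 2.

```python
import math

def stops_between(s_from, s_to):
    """Number of stops from s_from down to s_to: 2 * ln(s_to / s_from) / ln 2."""
    return 2 * math.log(s_to / s_from) / math.log(2)

n = stops_between(1.4, 22)
whole_stops = round(n)
light_ratio = 2 ** whole_stops

print(round(n, 2), whole_stops, light_ratio)
```

The result is slightly below 8 because 1.4 and 22 are themselves rounded markings of the exact stop values.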

Voila!

Wednesday, August 18, 2010

How to Enable PHP Inside HTML Pages

PHP code can be embedded in HTML pages between the tags:
<?php echo "my php code"; ?>

However, depending on the server that you are using, this must be enabled. On Apache 2 (with the PHP module installed), you can edit the .htaccess file to include:
AddType application/x-httpd-php .htm .html

I added this line to the .htaccess on my website's root directory.

Voila!

Monday, August 16, 2010

Picasa Installation Error on Ubuntu

So Picasa has failed to install on my Ubuntu box. The catch is that it also prevented me from installing any other packages and gave the following error whenever I tried to install something:
ubuntu previous installation hasn't been completed
I suspected it was Picasa, but to verify, I opened up a terminal and typed the following:
sudo apt-get -f install
and I got the following error message:
The package Picasa needs to be reinstalled, but I can’t find an archive for it.
The solution to this is to clean up the incorrectly installed package (Picasa or otherwise) via the following (run as root, hence sudo):
sudo dpkg --remove --force-remove-reinstreq picasa
That solved the problem for me. I haven't tried reinstalling Picasa though.

Friday, August 13, 2010

How to Change Computer Name or Hostname in Linux

The hostname is stored in the file
/etc/hostname
You can edit that with your favorite editor. I use emacs:
sudo emacs /etc/hostname
You will also need to match that in
/etc/hosts

Wednesday, August 11, 2010

Extract Multipart rar Files in Linux

Open up a terminal and browse to the directory where the multipart rar files are and type
unrar x thefile.part1.rar

Tuesday, August 10, 2010

How to Kill a Process in Linux

First, identify the PID for the process that refuses to end. Open up a terminal and type
top
To close top, type
Ctrl + c
Now you're back in the terminal. To kill the process, replace PID with the number you found and type
kill -9 PID

Hope that works!
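
The same idea can be sketched from a script. The Python example below is only an illustration and not part of the original recipe: it starts a throwaway child process, asks it to stop with SIGTERM (what a plain kill PID sends), and falls back to SIGKILL (what kill -9 PID sends) only if that fails.

```python
import os
import signal
import subprocess
import sys

# Start a throwaway child process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Polite request first: SIGTERM is what a plain `kill PID` sends.
os.kill(child.pid, signal.SIGTERM)
try:
    child.wait(timeout=5)
except subprocess.TimeoutExpired:
    # Last resort, equivalent to `kill -9 PID`.
    child.kill()
    child.wait()

print("child exit status:", child.returncode)
```

Trying SIGTERM first gives the process a chance to clean up; SIGKILL cannot be caught and ends it immediately.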

Saturday, July 31, 2010

Avoid Computer Injury - Repetitive Strain Injury (RSI) Software

I've recently succumbed to the unforgiving punishment of continued computer use. I purchased my first computer in 1995. It had an Intel Pentium 133 MHz processor. I don't recall the memory size. It had a 4GB hard disk drive (called Quantum BigFoot). And so began my computing journey. I was inseparable from this machine. I can hardly remember a day passing without me using a computer. Of course, I had no hint that muscle injuries could occur due to extended and improper use of computers. I don't think that everybody agrees on the proper computing practices, but no one disputes that regular breaks and stretches must be taken during computer usage sessions.

Looking back at things, I averaged  about 10 hours a day using a computer. So, for the past 15 years, that's: 5,479 days or 54,790 hours!!! To be a bit conservative, I will take an average of 7 hours a day (to account for occasional breaks). That would only take it down to 38,353 hours... that's a lot of hours.

After looking at these numbers, and the stiffness in my shoulders, I decided that it was time to look for options. Of course, forcing myself to take breaks never worked, so the ideal candidate was a program that lurks in the background and tracks my computing levels to suggest micro and macro breaks.

Micro breaks seem to be the most important. These are very short breaks (10 to 30 seconds) that are to be taken between intervals of intense keyboard activity (~50 words per minute). In general, taking a micro break every 2 to 4 minutes is a good option. I've looked at a few programs and here's what I liked so far:

MacBreakz - I use that on my Mac. It tracks your activity and suggests breaks with stretching exercises. I like it a lot. It is very cheap ($25 for a single license) and it is worth every penny.

Wellnomics Workpace - Windows only. I used it for a while. The user interface is a bit ugly, but the software has great features. It can perform statistics and does real time keyboard tracking. It is expensive: $69 for a single license.

Workrave - Windows/Linux, and most importantly open source! It may not have all the advanced features of Workpace, but for the price, I'm taking it.

So here you go. My advice, don't underestimate this problem. Take the proper measures to reduce the impact of computer usage on your muscles and ultimately your career.

Friday, July 30, 2010

emacs, vim, svn, doxygen, and friends - Stocking your Unix Toolbox

Charles Reid, a doctoral candidate at the University of Utah and a very good colleague of mine, has kindly agreed to post some lectures from his summer "Scientific Computing Workshop" on PMAN. Charles is not only knowledgeable in all Unix related stuff (Unix, Linux, OSX), but also a very rigorous researcher. You can find his original workshop series on his website: http://charles.endoftheinternet.org/.


[Embedded lectures: part 1 and part 2]

Thursday, July 29, 2010

LaTeX Symbols - Detexify

Say you're looking for some weird LaTeX symbol but you don't want to go through an exhausting list of possible candidates. You just want to draw that symbol and have someone tell you what the syntax for it is.

Well, Detexify does this exact thing!!! Go ahead and try it (and train it, and by all means donate to these guys).

This is by far one of the most useful things I found on the net (thanks to this article on the fellow blog Walking Randomly).

Wednesday, July 28, 2010

How to Add a Contact Me Page to your Blog or Website

Turns out to be quite simple using Google docs! Here are the steps:
  1. Go to Google docs
  2. Create a new form

  3. Customize the form to your liking by filling it with required info (Name, email, website, message...)
  4. To add new items use the "Add item" button on the upper left. Use this to add more entries. For the message, add a paragraph text.
  5. Finally, here's what it could look like. Don't forget to name the form and put in a short description
  6. Now, you can get the code to embed that form in your website or blog. Just go to the "More actions" button on the upper right and choose embed.
  7. Copy the embed code to your website or blog. In Blogger, you can create a new page for that form and embed the code in it. See what it looks like on mine.
This form will automatically create an associated spreadsheet in your Google docs where it can store all the submissions going through the form. You will want to be notified of those submissions via email for example. To set this up, here's what you do
  1. While still editing the form, choose "See responses -> Spreadsheet"
  2. In the spreadsheet, go to: "Share -> Set notification rules" and specify how you wish to be notified
Voila!

Tuesday, July 27, 2010

How to Place Two Figures Side by Side in Latex Multicolumn Class

If you are using a multicolumn LaTeX class template, then placing two figures side by side so that they span the entire width of the page is done by simply using the {figure*} environment
\begin{figure*}
    \subfigure{...}
    \subfigure{...}
\end{figure*}
Voila!
You may want to move the code for your figure around so that it doesn't show up on the last page. Also, try using [ht] for figure placement
\begin{figure*}[ht]
    \subfigure{...}
    \subfigure{...}
\end{figure*}
This is a typical case with the IEEE Transactions LaTeX template which was the reason for searching the web for this. I learned it from here.

Monday, July 26, 2010

Inexact Differentials

In a previous post, I discussed the proper techniques to integrate an exact total differential. The major point to be drawn from exact differentials is that their parent function is independent of the path of integration. For example, the work done by gravity is independent of the path taken. It only depends on the end points of the path. This has to do with the fact that the gravitational force can be expressed as the gradient of a scalar. We call this type of force a conservative force field.

In general, many physical processes cannot be represented by conservative fields and therefore, their total differentials are inexact. One can think of the total differential as a small increment taken along an arbitrary path. A very popular example of an inexact differential is the work (and subsequently heat) in thermodynamics.

The work done by or on a system is in general dependent on the path taken. It is a summation of infinitesimal steps along the path. In contrast, the internal energy of the system is independent of the path taken. This has to do with the macrostates of a system. A macrostate of a system is a state where external parameters are specified. These include volume, temperature, pressure, mean total energy.

Then, for the mean energy U, the total differential is simply the difference between two known macrostates (remember, that the energy is specified for a macrostate). In contrast, the work done cannot, in general, be written as the difference between two well defined quantities. You can find more details on this in Prof. Richard Fitzpatrick's online textbook on thermodynamics.

So how do we integrate inexact differentials? Simple. If the path is known then the integration can be carried out along that path!

However, we will now show that if the inexact differential is multiplied by some function of the independent variables, one can construct an exact differential. To show this, I will follow the exposition given by Prof. Richard Fitzpatrick (http://farside.ph.utexas.edu/teaching/sm1/lectures/node36.html).

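Consider the inexact differential
$$\delta F = G(x,y)\,dx + H(x,y)\,dy$$
where I have used the symbol \delta to denote an inexact differential. An immediate consequence is that
$$\frac{\partial G}{\partial y} \neq \frac{\partial H}{\partial x}$$
Furthermore, the integral of \delta F over a closed path is, in general, not equal to zero
$$\oint \delta F \neq 0$$
To make further headway, let us consider the solution of
$$\delta F = 0$$
or
$$G\,dx + H\,dy = 0$$
Dividing by H dx, we get
$$\frac{dy}{dx} = -\frac{G}{H}$$
This equation describes the slope of some set of curves at every point in the x-y plane. These curves can be written as
$$\Gamma(x,y) = c$$
where c is a constant labeling parameter. Think of this as a set of contour lines for \Gamma. Note that \Gamma is a function of (x,y); the constant on the RHS merely says that \Gamma is constant on a given contour line. We now form the total differential of \Gamma, which vanishes along a contour line
$$d\Gamma = \frac{\partial \Gamma}{\partial x}\,dx + \frac{\partial \Gamma}{\partial y}\,dy = 0$$
Now we want to connect the total differential of \Gamma to the ratio dy/dx. To achieve this, we divide the previous equation by dx
$$\frac{\partial \Gamma}{\partial x} + \frac{\partial \Gamma}{\partial y}\,\frac{dy}{dx} = 0$$
upon substitution of dy/dx, we get
$$\frac{\partial \Gamma}{\partial x} - \frac{G}{H}\,\frac{\partial \Gamma}{\partial y} = 0$$
or
$$H\,\frac{\partial \Gamma}{\partial x} = G\,\frac{\partial \Gamma}{\partial y}$$
then
$$\frac{1}{G}\,\frac{\partial \Gamma}{\partial x} = \frac{1}{H}\,\frac{\partial \Gamma}{\partial y} = \frac{1}{\sigma(x,y)}$$
where \sigma(x,y) is some function of the independent variables. Then
$$G = \sigma\,\frac{\partial \Gamma}{\partial x}, \qquad H = \sigma\,\frac{\partial \Gamma}{\partial y}$$
Upon substitution into the original inexact differential, we have
$$\delta F = \sigma \left( \frac{\partial \Gamma}{\partial x}\,dx + \frac{\partial \Gamma}{\partial y}\,dy \right) = \sigma\,d\Gamma$$
therefore
$$\frac{\delta F}{\sigma} = d\Gamma$$
and thus, by multiplying the inexact differential by a proper factor, one arrives at an exact differential. If this factor exists, it is called an integrating factor (here \sigma is its reciprocal, so the integrating factor is 1/\sigma). Such a factor may not exist in higher dimensions however.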
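
To see the effect numerically, here is a small Python sketch with an example of my own choosing, not taken from the derivation above: the inexact differential y dx - x dy, for which dividing by sigma = y^2 gives the exact differential d(x/y). The helper line_integral and the two paths are hypothetical names introduced only for this illustration.

```python
def line_integral(coeff_dx, coeff_dy, path, steps=2000):
    """Midpoint-rule integral of coeff_dx*dx + coeff_dy*dy along straight segments."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx = (x1 - x0) / steps
        dy = (y1 - y0) / steps
        for k in range(steps):
            t = (k + 0.5) / steps
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            total += coeff_dx(x, y) * dx + coeff_dy(x, y) * dy
    return total

# Two different paths between the same end points (1, 1) and (2, 2).
path_a = [(1.0, 1.0), (2.0, 1.0), (2.0, 2.0)]
path_b = [(1.0, 1.0), (1.0, 2.0), (2.0, 2.0)]

# Inexact differential: y dx - x dy. The result depends on the path.
inexact_a = line_integral(lambda x, y: y, lambda x, y: -x, path_a)
inexact_b = line_integral(lambda x, y: y, lambda x, y: -x, path_b)

# Divided by sigma = y**2 it becomes d(x/y). The result is path independent.
exact_a = line_integral(lambda x, y: 1 / y, lambda x, y: -x / y**2, path_a)
exact_b = line_integral(lambda x, y: 1 / y, lambda x, y: -x / y**2, path_b)

print(inexact_a, inexact_b)
print(exact_a, exact_b)
```

The first pair of integrals differs between the two paths, while the second pair agrees and equals the change in x/y between the end points, which is zero here.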

In thermodynamics, for a reversible process, the entropy is written as
$$dS = \frac{\delta Q}{T}$$
Note that the total differential of Q is inexact. But when dividing it by the temperature, one arrives at an exact differential. In this case, 1/T is the integrating factor and the total differential of entropy is exact.

Voila!

Saturday, July 24, 2010

Fallacies in Scientific Research: Appeal to Popularity

This is my second post on logical fallacies in scientific research. Today's subject discusses how the "Appeal to Popularity" fallacy can hinder the research environment. This one in particular is a bit tricky because, on the face of it, an individual may use it as evidence.

Definition: Appeal to popularity is a logically fallacious argument in which an individual is led to believe that something is true (valid, moral...) because it is widely accepted or used. The person arrives at this belief without any reference to evidence supporting the validity of the claim.

Examples:
  • The majority of people use brand X car. Then it must be the safest car.
  • Laptop Y is very popular among university students. Therefore, it must be the best laptop.
  • The majority has opposed this law. It means that the law is bad.
This fallacy is a very delicate one as I mentioned previously. There are two points in every one of the above statements: the "factual" part and the illogical inference. It may be true that the majority favors brand X or Laptop Y, but inferring that it is a good product is wrong. There is no immediate link between these two points. 

It may also be true that car X is one of the safest cars, but it is not because everybody owns one. Such a statement should be validated by data, experimental tests between a variety of cars and so on. Interestingly, for the most part, one can reverse the above statements and obtain a more plausible argument. For instance, because car X is one of the safest cars, it has a wide customer base.

On the face of it, it seems that by appealing to popularity, one is using statistical data. This becomes a problem in scientific research. As usual, examples from personal experience:
  • Fluent is the most popular CFD code used. Then it is the best CFD software out there.
  • The Finite Volume Method is the most popular discretization technique. Then it must be the best.
  • Everybody is getting funding from the industry. Then, this is the best source of funding. 

Again, these are all invalid arguments for making decisions especially in scientific research. To stretch things a bit, these arguments may be massaged a bit to lend them some credibility by isolating the statistical component of each argument and using it as data input for making decisions. Here's how I think these should be amended:
  • Fluent is the most popular CFD code used. We should list it as one of the software to consider for purchase. But first, we must compare its performance to the other software we are considering for this particular problem and then make an informed decision.
  • Fluent is the most popular CFD software. We should consider it in our modeling efforts to reach a wider audience. (I'm not too fond of this particular way of putting it as this borders on the marketing side).
  • The Finite volume method is a very popular discretization method. Based on the literature we reviewed, the method was successfully used to simulate a wide range of physical phenomena. There's also a large amount of evidence that the method is particularly suited for transport phenomena. We should consider it as a viable method for solving our hypersonic design problem.
  • I don't have any comments on the last one.
When it comes to science, our conclusions should be entirely based on the data. But when it comes to decision making, data is only a part of the process. There are existing and expected experiences that come into play and those may not be entirely rational. The problem is also not in the statistics. If the statistics point to the fact that 80% of the simulation science is done using the finite volume method, then, in the context of science, this should only mean that we should consider the finite volume method as an option and test its performance for our problem. Appeal to the number by itself is meaningless. What percentage have reported positive results in this case? If the argument was: 75% of the scientists have reported positive results for using the finite volume method for compressible flow problems, then things are quite different. This is no longer an appeal to popularity, it is an appeal to evidence.

There are many other details about this logical fallacy. For an excellent discussion, please visit the wikipedia entry for this fallacy.

References:

http://en.wikipedia.org/wiki/Argumentum_ad_populum
http://www.nizkor.org/features/fallacies/appeal-to-popularity.html

Friday, July 23, 2010

Numeric Limits in C++

You can use the "numeric_limits" class template (from the <limits> header) to obtain machine specific numeric limits. Here's a sample code:
#include <iostream>
#include <limits>

using namespace std;
int main()
{
    //print maximum of various types
    cout << "Maximum values :\n";
    cout << "------------------\n";
    cout << "short : " << numeric_limits<short>::max() << endl;
    cout << "int : " << numeric_limits<int>::max() << endl;
    cout << "long : " << numeric_limits<long>::max() << endl;
    cout << "float : " << numeric_limits<float>::max() << endl;
    cout << "double : " << numeric_limits<double>::max() << endl;

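    //print minimum of various types
    //(note: for float and double, min() is the smallest positive normalized value, not the most negative one)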
    cout << "\n";
    cout << "Minimum Values: \n";
    cout << "------------------\n";
    cout << "short : " << numeric_limits<short>::min() << endl;
    cout << "int : " << numeric_limits<int>::min() << endl;
    cout << "long : " << numeric_limits<long>::min() << endl;
    cout << "float : " << numeric_limits<float>::min() << endl;
    cout << "double : " << numeric_limits<double>::min() << endl;
    //
    return 0;
}

Thursday, July 22, 2010

Remote Desktop Through a Router

If you're trying to remote desktop to a computer connected to a wireless router, you'll most likely need to "forward" certain ports to allow this connection to go through. For Windows, and if you haven't changed that setting, the default port is 3389. So here's how you can do it in general:
  • Go to your router's configuration page
  • Look for "Port Forwarding"
  • Add port number: 3389
You will be asked to specify which IP address to open up the port for. You can get that from the ipconfig command on the command prompt
  • Open up a command prompt
  • Type: ipconfig (and press enter!)
  • Write down the IPv4 address that shows up and use it in the port forwarding setting
Voila!

Wednesday, July 21, 2010

How to Get CPU Info on Linux

Try:
cat /proc/cpuinfo
There's a lot more to this. You can learn about the /proc directory from this post: http://www.linux.com/archive/articles/126718
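
If you want to use that information in a script, here is a minimal Python sketch. It assumes the common layout where each entry is a block of "key : value" lines separated by a blank line; the exact field names vary by architecture, and the function name parse_cpuinfo and the SAMPLE text are made up for this illustration.

```python
import os

# Tiny stand-in used when /proc/cpuinfo is not available (e.g. not on Linux).
SAMPLE = (
    "processor\t: 0\nmodel name\t: Example CPU\n\n"
    "processor\t: 1\nmodel name\t: Example CPU\n"
)

def parse_cpuinfo(text):
    """Split cpuinfo-style text into one dict per blank-line-separated block."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries

path = "/proc/cpuinfo"
if os.path.exists(path):
    with open(path) as f:
        text = f.read()
else:
    text = SAMPLE

cpus = parse_cpuinfo(text)
print(len(cpus), "entries found")
```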

Tuesday, July 20, 2010

2D Arrays in C++ using New

In a previous post I discussed how to create 2D arrays in C. Here's the version for C++. Pointers can be easily used to create a 2D array in C++ using the operator "new". The idea is to first create a one dimensional array of pointers, and then, for each array entry, create another one dimensional array. Here's a sample code:
double** theArray = new double* [arraySizeX];
for (int i = 0; i < arraySizeX; i++)
   // allocate arraySizeY elements for row i
   theArray[i] = new double [arraySizeY];
Voila!

To follow my previous post on 2D arrays in C, create a function called Make2DDoubleArray that returns a (double**) and then use it in the code to declare 2D arrays here and there
double** Make2DDoubleArray(int arraySizeX, int arraySizeY) {
    // allocate the array of row pointers
    double** theArray = new double* [arraySizeX];
    // allocate arraySizeY elements for each row
    for (int i = 0; i < arraySizeX; i++)
        theArray[i] = new double [arraySizeY];
    return theArray;
}
Then, inside the code, I would use something like
double** myArray = Make2DDoubleArray(nx, ny);
Voila!

Of course, do not forget to remove your arrays from memory once you're done using them. To do this
// first delete the inner arrays (one per row)
for (int i = 0; i < nx; i++) delete[] myArray[i];
// then delete the array of row pointers
delete[] myArray;

Monday, July 19, 2010

How to Organize your Digital Photos

With the advent of digital photography and the widespread use of digital cameras, it has become very exciting and easy to take photos... that is, lots of photos! Soon enough, your hard disk will be swamped with JPEGs. Whether you are scanning your old prints or importing photos from your digital camera(s), you will notice a dramatic increase in the number of photos on your disk in a short period of time. Within a year, I accumulated around 7,000 photos!

However, with the wealth of photos you have on your disk, it will become a difficult task to keep track of what's in there... even your computer will have a hard time processing and displaying thumbnails and photo information (assuming you put all your photos in one directory!).

I thought of a lot of ways to organize my photo library and have gone through several paradigms. What I describe in this article is the technique I use to organize my photos. I think it is very efficient and so far it has served me well. For example, I would want to quickly find photos sharing a certain theme (sunset, fruit, cloud...) or photos about a friend (Matt for example). Sometimes I would want to find photos relating to a certain event (concert, birthday...). All of these can be done with little effort, provided you are consistent with your workflow and take the time to organize your photos.

The fundamental concept of my approach is to separate the file storage from the user. This is accomplished by dividing the organizational task between myself and a photo organizer (such as Picasa). In this plan, photos are kept intact, and stored in systematically named folders. It is the software interface that will take care of all the organization. This also makes use of the property tags in jpg files.

Please understand that in order to obtain the results you expect, you have to put in enough effort on your part (such as adding keywords and captions). There is no software (yet) that automatically knows what your pictures are about, or that figures out the names of the people in a certain group photo!

Step 1: Create a Master Directory

First and foremost, I create a master folder called "My Pictures". I place this folder on an external hard disk to save space on my local disk.

Step 2: Create Sub Directories

Since I use several cameras (and you probably will too as new models come out), I create a directory for each camera (Nikon N80, Nikon D50, Canon PowerShot 540 ...). I do this because I like to have the photos from each camera in a separate location for comparison purposes.

In each "camera" directory, I create subdirectories named by year. For example, photos taken in 2005 are placed in the "2005" directory. I chose to categorize by year to keep track of the evolution of my skills in photography. This is not strictly necessary, since your photos already contain the date taken in their metadata. However, categorizing your photos by year proves to be good practice as the years go by!

For photos that come from third parties, I also categorize them by year but place them in a directory called "other".
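
As a small illustration of this layout, here is a sketch that builds the directory path for a given camera and year. The function names and the '/' path separator are assumptions of mine, not part of any tool; the folder names come straight from the steps above.

```cpp
#include <string>

// Hypothetical helper: build "<master>/<camera>/<year>",
// e.g. "My Pictures/Nikon D50/2005".
// The '/' separator is an assumption; adjust it for your system.
std::string cameraYearDir(const std::string& master,
                          const std::string& camera,
                          int year) {
    return master + "/" + camera + "/" + std::to_string(year);
}

// Hypothetical helper: third-party photos use the same per-year
// scheme, but under a directory called "other".
std::string thirdPartyYearDir(const std::string& master, int year) {
    return cameraYearDir(master, "other", year);
}
```

For example, cameraYearDir("My Pictures", "Nikon D50", 2005) gives "My Pictures/Nikon D50/2005".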

Step 3: Importing Photos

So far we have only scratched the surface. The question is where to import the photos from your digital camera. For example, you can import photos related to a certain event into a directory with the name of that event. That works fine, but it creates a set of incoherent, and sometimes ambiguous, directory names. Just today [Jan 28, 07], I settled on where to import my photos. The idea is simple and is inspired by film photography. When using film cameras, I usually use films with 25 exposures (they're the only ones left out there!). Therefore, it would seem logical to put each film in a separate directory. Carrying this over to the digital world, I simply split the photos on my digital camera into 25-file sets and put each set of 25 files in a separate directory. I name each directory according to the following convention:
[camera name]-[year] [roll number]
For example, "D50-06 0012" corresponds to roll 12 taken with the D50 in the year 2006.
I make sure I keep enough digits for roll 9999! Sometimes my camera holds 450 photos. I take 30 files at a time and put them in their respective rolls, quite simply!
For photos that come from third parties, I just put them in directories with the following naming convention:
[year] [occasion] (Example: "05 Graduation")
You can choose to place any reasonable number of photos in a roll (25-100). Just don't put 1000 photos in a single directory. Your system will become sluggish when you are manually browsing your photo directories and displaying thumbnails (this is less of a problem with newer hyper-threading and dual-core processors). How many photos go in each directory is a matter of personal preference, but the objective is to have the same number of photos in all directories.
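
The naming conventions above are easy to get wrong by hand, so here is a minimal sketch of how they could be generated. All four function names are hypothetical, and I am assuming a two-digit year and a four-digit, zero-padded roll number, as in the "D50-06 0012" example.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// "[camera name]-[year] [roll number]", e.g. "D50-06 0012".
// Two-digit year and four-digit roll number, both zero-padded.
std::string rollName(const std::string& camera, int year, int roll) {
    std::ostringstream out;
    out << camera << '-'
        << std::setw(2) << std::setfill('0') << (year % 100) << ' '
        << std::setw(4) << std::setfill('0') << roll;
    return out.str();
}

// "[year] [occasion]" for third-party photos, e.g. "05 Graduation".
std::string occasionName(int year, const std::string& occasion) {
    std::ostringstream out;
    out << std::setw(2) << std::setfill('0') << (year % 100)
        << ' ' << occasion;
    return out.str();
}

// Number of rolls needed for photoCount photos at perRoll photos each
// (integer ceiling; the last roll may be only partly filled).
int rollsNeeded(int photoCount, int perRoll) {
    return (photoCount + perRoll - 1) / perRoll;
}

// Roll number for the photo at a zero-based index, when numbering
// starts at firstRoll and every roll holds perRoll photos.
int rollOfPhoto(int index, int perRoll, int firstRoll) {
    return firstRoll + index / perRoll;
}
```

With these helpers, rollName("D50", 2006, 12) gives "D50-06 0012", and a card with 450 photos split 30 at a time needs rollsNeeded(450, 30) = 15 rolls. The roll size is left as a parameter because it is a personal choice; the helpers only keep it consistent.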

At this point, we still haven't attached any useful information to the photos. This should be the task of the photo organizer software. I use Google Picasa to organize all my photos. Picasa is a free photo organizer. It is super easy to use, super fast and provides you with all the functionality you need to edit, search and organize your photo library. In summary, what we are doing here is organizing photos in rolls (on the hard disk) and then creating albums and collections in Picasa.

Step 4: Picasa (or any other photo organization software) [NOTE: THIS SECTION WAS WRITTEN BASED ON PICASA 2. SOME FEATURES MAY HAVE BEEN UPDATED IN NEWER VERSIONS]

Now comes the fun stuff. After you install Picasa, it will give you the option to scan some directories on your hard disk. Select whichever one you want (Desktop for example). Once Picasa starts, we want to change that. Go to "Tools/Folder Manager" and point Picasa to the directory in which you have placed all your photos (i.e. the "My Pictures" master directory). Make sure you select "Scan Always" on your right so that Picasa scans your photos as you add them. Also, remove any other folder that Picasa is currently scanning (select that folder and then select "Remove from Picasa").

Once your photos are in place, Picasa will import them automatically. It will place them in a master collection called "Folders". Under Folders, you will be able to see all those Rolls you have created.

Step 5: Creating Collections (or whatever your software calls these)

You can create new collections and move these folders into them (note that these actions do not alter the actual location of the photos; that's what's nice about Picasa). For example, I have a collection for each of my cameras. Every time I import a new roll, I move it to its rightful collection. Select the folder you wish to move, right-click, select "Move to Collection" and choose "New Collection" (assuming the collection doesn't exist yet). I hope future versions of Picasa will have an easier way to create a new collection!

Type in the name of the new collection, and it will be created. Then you can move folders around. Of course, once you create a new collection, it will be added to the list of existing collections. Here's an example of how things should look.



Step 6: Albums

The best thing to do when you import new photos is to sort them into albums right away. This is especially true when you are importing photos that pertain to a certain event. Just create a new album (File/New Album), select the photos that belong in that album, and drag and drop them onto it. For example, here's how I add some photos to the "Sunsets" album.



Step 7: Keywords

Keywords are very important when you want to search for pictures with a specific theme. You can create any keyword you want (animals, plants, sunsets, people, etc.) and attach it to any photo (or set of photos). To do so, select a set of photos that share a similar theme, click the "View" menu and select "Keywords". Type the keyword you like and click "Add" to confirm.


You can also select a set of photos and use "Ctrl + K" to invoke the Keywords dialog. It is good practice to use shortcuts.


Step 8: Captions

Captions are like titles, but are embedded in the photo's metadata. Adding captions is time-consuming, but it gives you the most accurate description of your photos in the future.
In Picasa, double-click on a photo to enter preview mode. Just below the photo you will see a message inviting you to type a caption. Simply click there and type the caption you desire. To speed up the process, notice that when you are in preview mode you can simply start typing and Picasa will automatically interpret what you type as a caption. When you finish typing, press "Enter" and then move to the next photo by pressing the right or left arrow on your keyboard. Type again, press Enter, and so on...

I prefer taking the time to add captions to every set of photos I import rather than having to do that one year later with 2034 photos (although that is exactly what I have to do right now with all my old photos!). I add captions to 50-100 old photos every day, going over all of them and adding keywords and captions to every single one. I enjoy doing this because I get a chance to review old work and, most importantly, to be able to find the photo I need at any time I want!

Once you start adding keywords, captions and albums, it will be very easy to locate specific pictures. Picasa will search keyword and caption content as well as album names whenever you search for something. Digital photography is such a thrill and organizing your digital photos is your own personal project. Happy sorting!