Why You Should Expose to the Right (And What that Means)

Histogram

Lately I've been exposed (pun emphasized) to a lot of technical articles.  I've noticed a lot of unacceptable noise in the shadow regions, which I've come to attribute to two factors: dynamic contrast features of the camera, combined with non-linear distribution of data in the histogram.  The problem with dynamic contrast actually led me to discover the Expose to the Right technique.

1)  D-lighting / i-Contrast  (ie dynamic contrast)

It goes by different names, but it is a method for balancing the highlights and shadows in a scene to achieve a better dynamic range.  The camera dials down the exposure a bit in order not to blow the highlights, then brings up the shadows.  It results in a pseudo-HDR effect, but note that it's only an approximation.

This is a great option for JPG shooters, because JPG is only 8-bit and would be discarding some of the (12-bit) data from the sensor anyway.  Dynamic contrast effects can give you some pretty nice JPGs without having to mess around in Lightroom, Aperture, or Camera RAW for an hour.  The hardware manufacturers do have some good processing algorithms.

But note what I said earlier: this accomplished by underexposing the image.  In-camera image review will show a properly exposed image (because this is a JPG preview processed as above) but the RAW file doesn't actually have any of this processing applied.  When you open it in a RAW converter like Lightroom, of course it won't look as vibrant as the preview (which is what you expect with RAW).  But with dynamic contrast you'll also notice that it's darker than you were expecting... a LOT darker depending on how much the camera had to dial back the highlights; maybe equivalent of -1 to -2EV.

The problem is that when you turn the exposure back up you'll see a lot of noise in the darker midtones... someplace you wouldn't normally expect.  As captured by the camera, more of the image lives in the lower portions of the histogram.  This is a problem because...

2) Image data is distributed non-linearly across the dynamic range.

Yes, it's a mouthful.  Let me explain.

As Michael Reichmann explains in his wonderful article, "Optimizing Exposure" a 12-bit DSLR sensor can record to 4096 tonal values (bits) per pixel. Let's just think in black and white for now, and say that is 4096 luminance/brightness levels.  That's actually pretty simple math (2^16 = 4096).

But what I didn't realize was how a camera allocates bits across the spectrum of tone values.  If we assume the sensor has 10-stops of dynamic range, then each stop has half the tonal values of the last.  I'll let Michael speak:

If we assume a 10 stop dynamic range this is how this data is distributed...

  • The brightest stop = 2048 tonal values
  • The next brightest stop = 1024 tonal values
  • The next brightest stop = 512 tonal values
  • The next brightest stop = 256 tonal values
  • The next brightest stop = 128 tonal values
  • The next brightest stop = 64 tonal values
  • The darkest stop = 32 tonal values

So you see that if you use dynamic contrast, you are moving the histogram down towards the lower end, where there is less detail to record the tonal values.  This will mean rougher lighting transitions, not to mention worse signal-to-noise (S/N) just because you're working more in the darks (this doesn't have to do with the tonal values, just the underexposure issue itself).

So how do we solve both problems?  We try to get an exposure which uses more of that upper range, shifting the exposure to the right end of the histogram.  This is what we mean by ETTR (Expose To The Right).  It's tricky to do without blowing out the highlights, but just keep in mind that a small amount of clipping won't kill you.  Since buying my SB-700 speedlight I've also become a strobe enthusiast, and can say that this might help you balance the highlights and shadows better.

Where ETTR will really help is on shady, overcast, or dark shots where you don't have to contend with direct sunlight-induced clipping.  You'll be presented with a tighter histogram distribution, which you can bump up +1 or +2 EV during shooting, so you get it in the upper third of the histogram.  Then create a RAW processing preset with the inverse exposure compensation.

This is an effective way to reduce the noise in your shots, but I'd still like to know why tonal data is non-linear.  What if the data allocation scaled according to the histogram, so that if you were only using the lower third, it would allocate 3072 tonal values to that, and 1024 values across the rest of the range?

What do you think?  How could manufacturers better manage the way tonal values are recorded?

Leave your thoughts in the comments.