One of our readers, Mike Baker, sent the below email to me today. I thought it was a great and interesting analysis of why downsampling an image reduces noise, so I decided to share it with you (with his permission, of course). Trying to digest this stuff makes my head spin, but it is a great read. You might need to read it several times to understand what he means, especially with all the mathematical formulas (I had to):
You recently commented about downsizing a high-resolution image to a lower resolution in order to reduce the apparent noise. While I knew that this is an effective way to reduce noise visible in the images, I had not thought in much detail about the technical reasons why this works.
After a long evening’s thought on the subject, and running a few questions past my friend and fellow engineer, I believe I have a (reasonable, though perhaps not perfect!) handle on the subject…
If the image signal and the image noise had similar properties, averaging neighboring pixels in order to reduce the resolution would not improve the signal-to-noise ratio. However, signal and noise have different properties.
There is (in general) no relationship between the noise in neighboring pixels. Technical junkies call this “no correlation”.
Correlation is the long-term average of the product of two signals N1 x N2. If two signals have no correlation, then the mean of their product is zero.
The signal in neighboring pixels, on the other hand, has a high degree of correlation. If you add uncorrelated signals (such as the noise in neighboring pixels), their powers add, meaning the combined signal is the square root of the combined power:
N_comb = sqrt(N1^2+N2^2) and for N1 = N2 = N we get N_comb = sqrt(2)*N, where N1, N2 are root-mean-square (RMS) values of the noise.
However, if signals are highly correlated, then their sum is effectively the sum of their magnitudes:
S_comb = S1+S2 and for S1=S2=S we get S_comb = 2*S
So, if we add the content of two neighboring pixels, we get:
SNR_comb = S_comb/N_comb = 2*S/(sqrt(2)*N) = sqrt(2)*(S/N)
So, the signal-to-noise ratio increases by a factor of the square root of two, which is an improvement of about 40%.
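To see the square-root-of-two result numerically, here is a quick Python sketch (my own illustration, not part of Mike's email). It gives two neighboring pixels the same signal but independent noise, averages each pair, and measures the SNR before and after:

```python
import numpy as np

rng = np.random.default_rng(0)

signal = 100.0      # identical "signal" in two neighboring pixels
noise_rms = 10.0    # RMS of the uncorrelated noise in each pixel
n = 1_000_000       # number of pixel pairs to simulate

# Two neighboring pixels: same signal, independent noise
p1 = signal + rng.normal(0.0, noise_rms, n)
p2 = signal + rng.normal(0.0, noise_rms, n)

avg = (p1 + p2) / 2.0   # "downsampling" one pair into one pixel

snr_single = signal / p1.std()
snr_avg = signal / avg.std()

print(f"SNR of a single pixel : {snr_single:.2f}")
print(f"SNR after averaging   : {snr_avg:.2f}")
print(f"Improvement factor    : {snr_avg / snr_single:.3f} (sqrt(2) = {np.sqrt(2):.3f})")
```

The improvement factor comes out at roughly 1.41, i.e. the roughly 40% gain derived above.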
Now, you may say that the signal in neighboring pixels is not always 100% correlated. The correlation between the signals depends on the image content. If the image content is very smooth, the correlation is high. If the image content varies very fast, the correlation is low. Of course, noise will be more noticeable in smooth areas and the effect of resampling the image will be stronger.
Adaptive noise filters take into account the absolute signal-to-noise and the image content. They reduce the resolution more in areas that are smooth and have poor signal-to-noise and keep the original resolution in areas that have strongly varying image content and high signal-to-noise. You can think of it as a joint optimization of SNR and resolution.
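To illustrate that idea in code, here is a deliberately crude sketch (my own, not Mike's, in the spirit of the classic Lee filter). It estimates the local variance around each pixel and smooths strongly where the variance is close to an assumed noise level, while leaving detailed areas mostly untouched:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_denoise(img, size=5, noise_var=25.0):
    """Crude variance-based adaptive smoothing (a sketch, not a production filter)."""
    img = img.astype(np.float64)
    local_mean = uniform_filter(img, size)
    local_var = uniform_filter(img**2, size) - local_mean**2
    # Fraction of the original pixel kept: ~0 in smooth areas (full smoothing),
    # ~1 where the local variance is far above the assumed noise variance (detail).
    keep = np.clip((local_var - noise_var) / np.maximum(local_var, 1e-9), 0.0, 1.0)
    return local_mean + keep * (img - local_mean)
```

The blend factor is what makes it "adaptive": flat sky gets averaged heavily, while edges and texture keep their original resolution.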
Now, we also need to look into the different sources of noise:
- The first source of noise is dark current, which is caused by electrons that accumulate in the individual pixel well even if no photons are entering (lens cover on). Dark current becomes dominant for very long exposures. For normal exposures, the error from these dark-current electrons is negligible.
- The second source of noise is the read-out noise. This is essentially generated by two sources: A) Noise added by the amplifier and B) Noise generated by the analog-to-digital converter. It is a fixed amount of noise that is added to each image during read-out. When you choose the ISO setting on your camera, you essentially set the read-out gain and therefore the read-out noise. The higher the ISO, the higher the read-out gain and the lower the read-out noise. Of course, if you pick an ISO which is too high, you will get signal saturation. So for low-light situations, always pick an ISO that is no higher than needed to capture the image you want.
- The third source of noise is called “quantization noise” and is a bit harder to understand. It has to do with the fact that (in low-light conditions) we don’t sample a smooth, continuous flow of photons but rather discrete bunches of photons. The problem is that a source of light does not produce a stream of photons that are spaced equally in time. So, if you image a low-light source that sends out (on average) 100 photons per second, you may receive 90 photons in the first second, 105 in the second, and so on. The average error will be on the order of the square root of the number of photons (or electrons in the pixel sensor well). A typical sensor well contains between 20,000 and 60,000 electrons when fully charged; the maximum amount depends on the pixel size. A sensor well with 20,000 electrons has an error of approx +/-141 electrons when fully charged, or +/-0.7%. A well with 60,000 electrons has an error of approx +/-245 electrons when fully charged, or +/-0.4%. While we may be able to reduce dark current and read-out noise by cooling the sensor, there is essentially nothing we can do about this kind of noise. If we keep on shrinking the pixels, we will have smaller and smaller electron wells and fewer and fewer electrons trapped.
The above errors of 0.7% or 0.4% appear rather small and we would not be able to notice them. However, in low-light situations, sensor wells will be only partially filled. If we only manage to trap 1000 electrons, the error becomes 3%. If we only trap 100 electrons, the error becomes 10%.
Notice that the term “quantization noise” has nothing to do with the signal quantization by the analog-to-digital converter. It has to do with the fact that your signal actually arrives in quanta of energy.
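Mike's 0.7% and 0.4% figures follow directly from the square-root rule; this little snippet (my addition, not part of his email) tabulates the relative photon shot noise for a few well fills and shows how quickly it grows as the well empties:

```python
import math

# Electrons collected in a pixel well vs. relative photon shot noise (sqrt(N)/N)
for electrons in (60_000, 20_000, 1_000, 100):
    shot_noise = math.sqrt(electrons)  # RMS error in electrons
    print(f"{electrons:>6} e-  ->  +/-{shot_noise:6.0f} e-  ({100 * shot_noise / electrons:4.1f}%)")
```

It reproduces the +/-245 and +/-141 electron figures above, and shows the error climbing to about 3% at 1,000 electrons and 10% at 100 electrons.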
What do you guys think? Anyone want to challenge Mike’s analysis? :)
Hi Nasim,
(un)Fortunately, I am about to buy a new first body besides my current D800, which needs repairs (malfunctioning silent shutter and damaged top LCD). I’m a pro user in the theater photography business, where silent operation and low ISO noise are key specs for me. I was waiting for the Sony A7S II, which is now here, to compete with the D810 in terms of low ISO noise and silent operation. There is another strong competitor as well, the A7R II. I’d like to make a decision based on a comprehensive review like the ones you write… but of course this is not available yet. To put it another way, I prefer your real-world reviews/comparisons over formulas. What you see is what you can get. So, as for ISO noise, I am very interested in a comparison of image quality between the mentioned cameras, including downsampling to match the A7S II’s lower pixel count. Or is it already possible at this stage to predict ISO quality based on theory, which could result in a recommendation? Most crucial for me is the ISO range between 800 and 6400. PS: I’m counting on a dedicated AF Nikon G to Sony E adapter in the near future, but don’t know if that is realistic.
Regards!
Hi Nasim,
I just want to ask: do all cameras do “downsampling”? What I mean is, how do cameras produce “medium” and “small” images from their native “large” size? For example, a DSLR has a 24 MP sensor, so when set to “large”, the camera utilizes all of the pixels. What I want to know is how the “medium” and “small” images are produced. Does the camera use only 12 MP when set to “medium”, with some mechanism that blocks the rest of the pixels on the sensor? Or does it use all 24 MP and downsample the result to the user’s setting of 12 MP (“medium”) or 6 MP (“small”)?
Hoping for your response.
Regards.
Hi Nasim,
I got to this post after reading your article on the new D800, trying to find out more about how ‘downsampling’ reduces noise. I’m afraid that, like most people before me, I find the technical explanation way too complicated to understand.
What I am more interested in is how to ‘downsample’, if possible using Lightroom 3.5 or PSE10. Perhaps you have written on this subject already, but I have not found it yet on your site. Can you advise?
Regards.
Mike’s analysis is correct. However his use of terminology may be a little confusing. What he calls “quantization noise”, caused by the quantum nature of light, is usually called “photon shot noise”. The term “quantization noise” is usually used to refer to the noise introduced by the quantization process in the A-D converter. Since an A-D converter converts infinitely variable analog signals to a discrete number of steps, it has to approximate the value of the analog signal. This approximation introduces an error, and these errors are the source of quantization noise in the common use of the term.
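For anyone curious what that A-D quantization error looks like in practice, here is a small sketch (my illustration, not part of the comment) that quantizes random analog values to a given bit depth and measures the error the rounding leaves behind; the RMS error of an ideal quantizer is about one LSB divided by the square root of 12:

```python
import numpy as np

def quantize(x, bits, full_scale=1.0):
    """Round x (0..full_scale) to the nearest of 2**bits levels, like an idealized ADC."""
    levels = 2**bits - 1
    return np.round(x / full_scale * levels) / levels * full_scale

rng = np.random.default_rng(1)
analog = rng.uniform(0.0, 1.0, 1_000_000)  # "infinitely variable" analog values

for bits in (8, 12, 14):
    err = quantize(analog, bits) - analog
    lsb = 1.0 / (2**bits - 1)
    print(f"{bits:2d}-bit ADC: RMS quantization error = {err.std():.2e} "
          f"(theory ~ LSB/sqrt(12) = {lsb / np.sqrt(12):.2e})")
```

At the 12- and 14-bit depths used by raw converters this error is tiny compared with photon shot noise, which is why the two kinds of "quantization" are easy to mix up but very different in importance.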
I would say it is mostly correct. Regarding “It is a fixed amount of noise”: I don’t know what the “it” refers to. Also, there is no single “the amplifier”. In CMOS sensors, each photosite/pixel has its own amplifier, and there is variance between them. There are also multiple A/D converters (ADCs). It is also risky to refer to added noise as “a fixed amount of noise”, first because it isn’t fixed (one could say its order of magnitude is, but not its value), and second because it could then be confused with fixed-pattern noise. Read-out noise is not reduced at higher ISO speeds/values; it is only reduced *relative* to the amplified signal. (If we could reduce noise by amplification, we could dispense with shot noise.) This effect is further enhanced because downstream noise (often mostly noise from the ADC) is added after in-camera amplification, so downstream noise is not amplified by the ISO gain.
Also, in some sense the analog signal from electron charge is not infinitely variable, as electron charge is also quantized. Perhaps it cannot be determined exactly, or represented exactly in base two, however.
I ask: “Ansel, where are you when we need you? How did you manage your genius without knowing about this?”
Ansel replies: “Take heart, my son. Do not get lost in the forest, for the beauty of the trees requires no technical equations. Your vision is worth more than the N1 or N2. Beauty has no formula.”
Me: Wheww! Thanks, Ansel. I was beginning to worry. I feel OK, now.
Peter,
From what I have read about him, he used the best equipment available at the time and a Pentax light meter. Not to mention he was an expert in the Photoshop of that time, the darkroom. He was a gifted photographer, no doubt, but he also used the best tools available to him. In the end, yes, it’s the vision that matters.
Actually, Adams had 3 exposure meters: one SEI and two Westons.
I see. I got that info from here and he is a well known photographer as well…
kenrockwell.com/tech/meters.htm
Here’s where I got my info, near the bottom of the opening page:
I read Rockwell every day, but he does make mistakes now and then. His camera recommendations need to be triple checked based on personal experience. He has clear biases.
Peter,
You are right about him. And I don’t want to talk about him here. Let’s leave it at that. :)
Nasim, I have a question from you.
It is a common beleif among the photographers that the too many pixels are bad because of the increased noise. If the noise can be reduced by downsampling the images then is there _any_ harm in having more megapizels? If you want details then don’t downsample the image but if you want less noise then simply downsample the image. This way you will get the same noise if the sensor had less number of megapixels.
Let’s have a scenario where there are two similar-sized sensors with similar sensor technology.
sensor 1: 12 MP, Noise = N1 at a particular setting
sensor 2: 24 MP: image taken at a similar setting and then down-sampled to 12 MP. Let’s say the noise is now N2 (for the 12 MP image)
My question is: do you expect N1 to be similar to N2 or do you expect N1 to be lower than N2?
‘megapizels’, ‘beleif’, ‘the too many’
too many mistakes in a couple of lines :)
That depends on which source of noise dominates.
When photon shot noise (noise resulting from the quantum nature of light, which Mike calls “quantization noise”) is the dominant noise source, then downsampling a higher resolution sensor should result in the same image-scale signal-to-noise ratio. This is because increasing the number of photosites does not increase the amount of noise (which is a fundamental property of the light coming to the sensor) but simply redistributes it amongst more buckets; and down-sampling combines these buckets to get the same signal and noise as a lower resolution sensor would have.
When read-out noise dominates, then a down-sampled image from a higher resolution sensor will have more noise than the image from a lower resolution sensor, since the noise is generated by each photosite (and its amplifier and ADC conversion), and more photosites means more noise.
A comparison of the NEX-5N (16 MP) and NEX-7 (24 MP) SNR measurements on DxOMark shows no degradation in image-scale SNR despite the reduction in photosite size, which suggests that the present state of the art allows photosites as small as 4 µm to be dominated by photon shot noise.
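To make the two regimes concrete, here is a toy Monte Carlo sketch (my own illustration with made-up numbers, not a measurement of any real sensor). It compares one large photosite against two half-size photosites that are binned back together, once with negligible read noise and once with read noise dominating:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500_000  # number of output pixels simulated

def measured_snr(signal_e, read_noise_e, photosites_per_output_pixel):
    """SNR of one output pixel built from several photosites sharing the same light.

    The available light (signal_e electrons) is split across the photosites; each
    adds photon shot noise (Poisson) plus its own read noise (Gaussian).
    """
    per_site = signal_e / photosites_per_output_pixel
    total = np.zeros(n)
    for _ in range(photosites_per_output_pixel):
        total += rng.poisson(per_site, n) + rng.normal(0.0, read_noise_e, n)
    return total.mean() / total.std()

signal = 1_000  # electrons of light available per output pixel (an arbitrary choice)

for read_noise, label in [(0.0, "shot-noise dominated"), (30.0, "read-noise dominated")]:
    low_res = measured_snr(signal, read_noise, photosites_per_output_pixel=1)
    binned = measured_snr(signal, read_noise, photosites_per_output_pixel=2)
    print(f"{label:21s}: low-res pixel SNR = {low_res:5.1f}, binned high-res SNR = {binned:5.1f}")
```

With no read noise the two come out essentially identical; with heavy read noise the binned high-resolution image is noticeably worse, exactly as described above.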
Actually, there’s another problem with having more megapixels, and that’s that the full well capacity is reduced, which also reduces the dynamic range. [Plus, downsampling’s not quite as good as having cleaner data to begin with. Ask the astrophotographers…] There are probably also problems with readout speed as well, so the maximum burst rate would be reduced. File sizes would be larger too.
wow, this is way too much for me.. I need a drink now!!
Have one for me, too. In fact, have 3-4 for me.
Hey thanks for posting this – it is a good read for people who like engineering.
For the normal people I’ll try to make an easier explanation (Mike, please forgive any inaccuracies this simplification introduces):
If you downsample an image to lower resolution you are essentially averaging neighbouring pixels.
The value of each pixel consists of two parts: signal (the good part) and noise (the bad part).
* The signal is kind of smooth, and changes slowly across pixels. So the average of two neighbouring signal values is more or less the signal value itself (e.g. the average of 50 and 52 = 51).
* The noise is basically random for every pixel. It can be positive or negative, and on average it is zero. Therefore the noise of neighbouring pixels tends to cancel out if you average them (e.g. the average of -10 and +8 = -1).
Here is the example to make it easier to understand:
You start with two pixels that have values of 40 and 60. They should actually be 50 and 52 if there were no noise, but you and the camera have no way of knowing that. But because the noise is random and zero-mean, you know that it will tend to cancel out. So if you downsample the image by 50% you average these two pixels to get (40 + 60)/2 = 100/2 = 50. Bingo – this is very close to the ideal average value you would have gotten if there was absolutely no noise, (50+52)/2 = 102/2 = 51.
So in this example you almost eliminated the noise at the cost of halving your resolution. Instead of two pixels with noise values of 10 and 8, you now have one pixel with a noise value of 1.
PS: This is the same reason you get really nice noise-free images if you average a lot of noisy photographs (as long as nothing in the scene moved, and your camera stayed perfectly still). The signal does not change over time, but the random noise tends to cancel out, and what you are left with is a smooth noise-free image. People use this technique for astronomy photographs. In practice this means that you have to let your camera track the stars (they move overhead), so of course it’s easier said than done.
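The same trick, averaging across frames instead of across neighbouring pixels, fits in a few lines of Python; this is just a sketch of mine that assumes the frames are already perfectly aligned:

```python
import numpy as np

rng = np.random.default_rng(7)

true_image = np.full((100, 100), 120.0)  # a flat, noise-free "scene"
noise_rms = 15.0
num_frames = 16

# Simulate 16 noisy exposures of the same, perfectly still scene
frames = true_image + rng.normal(0.0, noise_rms, (num_frames, *true_image.shape))

stacked = frames.mean(axis=0)

print(f"Noise in a single frame  : {(frames[0] - true_image).std():.2f}")
print(f"Noise after stacking 16  : {(stacked - true_image).std():.2f}  "
      f"(expected ~ {noise_rms / np.sqrt(num_frames):.2f})")
```

The noise drops by roughly the square root of the number of frames, so 16 frames cut it to about a quarter.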
Nice simpler explanation!
Great explanation. Thanks, though I mis-read the date and thought this was 2021 not 2012!!
Uuuugh! Uuuugh! Is it April 1?
Buy Nik Dfine 2 and skip this article.
Nasim, you were batting 1000, but this article dropped your average down to .987.
This article is more than “getting into the weeds”; it’s getting into the molecules.
Everyone learns this on day one of any intro to photography class… :)