Not long ago, I spent a fair amount of time on photography forums. I joined quite a few debates about artificial intelligence, and in these debates, some people compared new AI tools with other automated camera features. “How is AI any different from autofocus?” is a question I saw in various forms.
It’s a fair question to ask. For most of the history of photography, focusing had to be done manually. When autofocus first appeared, some photographers decried it as a technological shortcut that took away from the craft of photography. Today, many people say the same thing about AI – isn’t AI just another tool, like autofocus, that helps us make better pictures? The answer may not be obvious at first, but I believe it is a firm no.
Is Autofocus Like AI?
In one respect, autofocus is similar to some generative AI algorithms. After all, autofocus is an automation that moves the photographer one step away from the mechanical process of making a photo. And perhaps there is some merit to using manual focus at times to understand the physical movement of the compound lens, at least when you’re not using a lens that is focus-by-wire.
As an aside, removing automation is an interesting journey that can certainly give one a new perspective on the entire photographic process. At least, I’ve experienced this personally, as for a very long time I used only manual focus on my first camera.
But there is a crucial difference between autofocus and AI-generated images that goes beyond mere results. Yes, these days you could generate an image of a bird using AI algorithms, producing a thing that, at first glance, looks like art. But this thing, rather than being art in the sense of a representation of a human soul, is instead a reflection of the machine that compiled it. It is a step – not unlike others taken in recent years – to replace human connection with machine-mediated interaction.
AI tools were not created by big tech companies to help you with your photography or to take some of the tedium out of creating local masks. They were created to replace you, to sell you things, and ultimately to replace the need for human uniqueness – to produce media that is the psychological equivalent of refined sugar, rendering humanity powerless to resist the rise of high-tech corporate control of society.
Of course, that’s not unique to AI. To some extent, social media and other algorithms have had this effect for some time. But AI (of all kinds, not just in the photography world) is a force multiplier that increases the efficacy of this dehumanization. It is intended to replace people’s jobs, sources of enjoyment, free time, and so on to the point where it can no longer be easily countered by human resistance. The same is certainly not true of autofocus.
Noise Reduction
There is another reason why AI is different from autofocus, and I’d like to illustrate this point with AI noise reduction. Of course, I’ve seen the results, and AI noise reduction does a pretty good job. But I never use it in my own photography. Why? One reason is that I don’t want to support AI tools in general, for the reasons I’ve already mentioned.
But there’s another reason that, to me, is perhaps more important. When I use autofocus, I know basically what autofocus is doing. I know what it means for a lens to be focused on a particular subject. Even if the autofocus has been trained using machine learning algorithms, the end result is something I could replicate myself, at least if my reaction time were a bit faster. It places the plane of focus a particular distance away from the camera.
The same is true of demosaicing algorithms (the process of reconstructing full-color pixels from the single-color photosites of the sensor). Tone curves, black and white conversion, and even traditional denoising are all processes that are basically understandable. Moreover, such processes can be replicated by writing a computer program by hand, with no input beyond the single photo you are planning to edit.
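To make that concrete, here is a minimal hand-written denoiser – a simple median filter in Python with NumPy. It is only a sketch (the function name and parameters are illustrative, not taken from any particular editor), but it shows the key property: it reads nothing but the image it is given.

```python
import numpy as np

def median_denoise(image: np.ndarray, radius: int = 1) -> np.ndarray:
    """Replace each pixel of a grayscale image with the median of its
    (2 * radius + 1) x (2 * radius + 1) neighborhood.

    Operates only on the pixels of the image being edited --
    no training data, no other photos involved.
    """
    padded = np.pad(image, radius, mode="edge")
    height, width = image.shape
    out = np.empty_like(image)
    for y in range(height):
        for x in range(width):
            window = padded[y : y + 2 * radius + 1, x : x + 2 * radius + 1]
            out[y, x] = np.median(window)
    return out
```

Every output pixel here is fully explained by the input pixels around it – exactly the transparency that AI noise reduction gives up.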
AI noise reduction is different, because its algorithm cannot be reconstructed without knowing the millions of images used as input. The final result is no longer equivalent to one obtained using a basic photographic process. Rather than operating solely on the pixels at hand, these algorithms depend upon recognizing higher-level content based on their large databases of starting images.
AI noise reduction recognizes higher-level concepts such as feather detail, eyes, hair, and shapes, and may even replicate patterns from other images on the small scale of dozens of pixels. Some photographers may criticize the accuracy of some of these reconstructions (such as upscaling an image with low-resolution text, resulting in gibberish letters). But as algorithms become more powerful – and require more energy – it is inevitable that they will bring higher and higher-level reconstructions, so that missing patterns such as feather detail will be interpolated. Hints of such interpolation are already present in today’s software. Eventually, it may be possible to transform an ISO 20,000 image into one that looks like it was taken at ISO 100 with very advanced reconstruction.
This goes beyond what I consider photography. Moreover, the process of higher-level construction and interpolation encourages an approach to photography in which we are no longer in control of the final result – instead, we provide some starting point, and AI adds most of the artistic touches to reach the final product. This process is relatively primitive now, but as time goes on, it will only become more apparent. On the surface, AI noise reduction algorithms are not the same as generating AI images from scratch, but given my outlook on the future, I hope you can see why I avoid them, too.
Conclusion
Although there are many different AI algorithms, and some are benign at first glance, for me, the most crucial questions are: What is the ultimate spirit and aim behind the algorithm? And does the algorithm utilize higher-level content-aware interpolation, even if only at a very local pixel level? If yes, then I’m not interested.
As for autofocus, even if it uses a large database of images to focus more quickly and accurately, it does not change the pixels of my image in a way that I couldn’t do myself. Furthermore, it does not come with the same risk of broad societal change associated with general-purpose AI. While I still think it is useful at times to turn autofocus off and not neglect your manual focus skills, I simply don’t see autofocus and AI as having the same capacity for dehumanizing photography.
When it comes to my own photography, I find it interesting that my favorite shots are those that wouldn’t have needed AI noise reduction in the first place, let alone AI-generated “fixes.” Shoot in conditions that demand excessive noise reduction, and you’re probably capturing worse light in the first place. Yes, probably about 10% of my shots could be improved with AI noise reduction algorithms, but even then, I wouldn’t consider any of them “5 star shots.”
Would AI noise reduction save me some time when I have to resort to local denoising with masks? Probably. But I’d rather spend that time doing it myself. Spending time working on my favorite images is a rather peaceful and enjoyable process anyhow, and in this modern world, I think we could all use a little less efficiency.
Much of the debate and controversy in the media about “AI” refers to artificial intelligence software that can generate novel images informed by prior information – artificial, “machine-generated” imagery.
Then there are the AI algorithms in postprocessing software. These are trained to distinguish, with a high probability of success, the real patterns comprising the composition and colours in an existing image, and to emphasize these while minimizing random noise.
The software using pattern-recognition algorithms to significantly improve the performance of autofocus in modern cameras is very different from the above applications of AI. Some modern AF modes use deep-learning algorithms to detect categories of patterns representative of the shapes of real-world objects – the eyes of vertebrates, the headlamps of vehicles, and so on. The compiled software comprises algorithms – neural networks – trained on supercomputers using huge image libraries to recognize categories of objects. The net benefit is to improve the efficiency of focusing a lens, using its motor-driven mechanism to align some of the optical elements precisely.
To elaborate… For decades, Nikon has been using pioneering precursors of deep-learning software in its matrix metering algorithms. This began with their keystone invention launched in the FA in 1983 – the culmination of R&D they started in 1977. More recent SLRs and DSLRs have had more advanced technology to recognize and correctly expose different categories of illuminated scenes, based on trained software.
Tone and colour data have increasingly informed some autofocus software. The Nikon D5 AF system (also in the D500 and D850) exemplifies one recent successful stage in this progress, where colour information informs the 3D Tracking AF mode together with shape information. The D6 advances this technology significantly. The design of these AF modes represents fundamental steps toward the more comprehensive AF in current mirrorless cameras.
Autofocus is a more recent camera feature to benefit from this R&D, which began improving exposure decisions much earlier. It’s chalk and cheese to try to equate AI software used in real time to generate artificial images, on the one hand, with the deep-learning software used in an electromechanical device that automates control of the optic framing a photographic subject, on the other. Deep-learning algorithms are merely one part of this integrated AF technology, which improves the efficiency of the device at detecting patterns to focus on. The advantages are obvious for moving subjects.
Modern digital cameras represent effective progress in automating exposure decisions using advanced software. Latterly, the software has advanced to work together with optical innovations to control the mechanical components efficiently – communicating with one or more electronic motors to adjust the optical elements in a camera lens to produce a sharp image. Thanks to high-speed compact motors, modern AF modes can align the optics in real time with unprecedented speed and precision. Subject Detection modes are extremely useful for discerning the subject amid the noise, using software algorithms trained on similar pattern categories.
In summary, advanced AF systems are impressive in how quickly and efficiently they find, recognize, and follow the photographer’s subject.
The most advanced autofocus system in a modern ILC obviously cannot create an image using its deep-learning algorithms, so trying to describe any causal relationship of modern AF to mainstream AI is nebulous at best.
I agree, modern AF is quite different than generative AI, and I never attempted to argue otherwise, except in the sense that ideas from machine learning AF might inspire techniques in the generative space.
Indeed, one thing that is true is that advanced machine-learning AF can never cross the blurry line between pure denoising and reconstruction; denoising algorithms can. AF systems can only be maximized along one single variable: hit rate. The most advanced AF system is still just an AF system that locks on perfectly and tracks perfectly every time. Advanced denoising algorithms, by contrast, could easily start to manufacture detail and create artifacts that would not have been in the original had it been shot in perfect light.
Is AI autofocus still photography? I think so, to some extent, when you shoot a bird in flight, since you need to follow the bird while it flies, and it can move out of the frame. But I also think that today many people can take excellent pictures of a swan on a lake, and certainly of a bird on a tree branch: point the camera in the direction of the bird, tell the camera you want to take a picture of a bird, and the autofocus will recognise the eye and deliver a sharp picture. In the past, people had to manage to keep the focus point on the eye. Things are no longer as they were, indeed.
I have a question about noise reduction!
The Nikon Z6, as one of the best low-light still cameras on the market, has a full-frame sensor that handles noise better than most 45 MP sensors. If I reduce the RAW image size to Medium on the Nikon Z7, from 45 MP to 25 MP, do I get the same result?
Or, in the case of the Nikon Z7, is it better to shoot at 45 MP and later improve the photos with AI tools like DxO PureRAW or others?
You shouldn’t see much difference downscaling 45 MP to 24 MP, though it’s better to do that in post instead of using RAW Medium.
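As a sketch of what downscaling in post looks like (using the Pillow library in Python; the file names are hypothetical, and the pixel dimensions are approximately those of the Z7):

```python
from PIL import Image

# Open a full-resolution export and downscale it. Averaging roughly two
# sensor pixels into each output pixel lowers the visible noise per pixel.
img = Image.open("z7_full_res.tif")              # 8256 x 5504 (~45 MP)
small = img.resize((6048, 4032), Image.LANCZOS)  # ~24 MP
small.save("z7_downscaled.tif")
```

The Lanczos filter averages neighboring pixels during the resize, which is why per-pixel noise drops when you downscale.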
To be honest though, for most types of photography, I really think DxO is overkill when you’re shooting with a full-frame sensor anyway. It seems a bit over the top that we’re even talking about such advanced noise reduction when shooting with a modern camera like the Z7.
Maybe this adds something to the question of AI versus classic algorithms – this is a screenshot from a Topaz advertisement that I just saw.
Look closely, and you’ll see it’s not even the same mountain any more! It has added lots of little peaks and rocky areas that didn’t exist at all in the original. Personally, I don’t want my upsampling tool to invent a different subject than the one I captured, but that is what AI algorithms do at some level.
(To be fair to Topaz, it seems like the “redefine” algorithm used for this image is meant to give looser realism in exchange for more detail.)
I think this is a great example. It makes you wonder whose mountain(s) the algorithm uses as template(s). And even though this is more extreme than most cases, I think people need to be aware that all of these AI algorithms, even de-noising, will make stuff up (to varying degrees). Steve Perry has a really good video on this, where he shows how Topaz, DxO, and even Adobe denoise invent detail that was not there in order to “improve” the image during noise reduction.
Based on the photos in the article, I can see the point that not requiring AI is both ideal and attainable. In such cases, I’d agree that it’s the preferable way to go. For myself, I take a lot of professional photos in towns and cities; they are used for editorial purposes and are often meant to convey a sense of ‘I want to visit this place’.
Using generative AI to get rid of the distracting ‘visual pollution’ that is inevitable in such locations hasn’t really changed the outcome of the editing process, but it has dramatically reduced the time needed to get there.
I do try to keep the edits to a minimum, and won’t take out things that are unique to the location, like perhaps an odd-looking tree or a store sign. But a crooked street sign or a bit of litter? They’re out.
I’d like to add some more thoughts on AI.
One of the problems of AI is that the outcome is often spot-on, but sometimes completely off.
There is an example of AI recognising a car in a photo, but with only one pixel changed, it can no longer see that car (while we humans naturally still see the same car).
So AI can be helpful for achieving a goal in a shorter time, but one always has to realise the outcome may be completely wrong, and it should always be double-checked by a skilled person.
At this moment, AI is used by the Israeli government to put their bombs where they think the Hamas military will be … a strong example of wrong use with a deadly outcome.
Some good points. What I’d like to ask is this: even if AI can help achieve a goal in a shorter period of time, is that actually helpful? Human life is about activity, and we should be careful about automating that activity, lest we automate away the good parts of life.
I have read that AI can be used to speed up making diagnoses of illnesses, and is often even better at it.
The above also applies here: the outcome should be double-checked by specialists.
But in this case it could save lives.
I think we need to distinguish creative pursuits, like writing or photography, from primarily technical applications, such as protein structure prediction (Nobel Prize this year) or image analysis for, say, cancer diagnosis (though the latter should obviously always be supporting, not replacing, a human expert).
I sense that you’re seeking to impose a bright line (something like, “never use a generative AI tool, because it is dishonest/dehumanizing/etc.”) because you mistrust machines or people to correctly navigate the uncanny valley between the extremes of unpoisoned photographic naturalism and machine-hatched artifice.
I do similar things sometimes. But with AI, I feel comfortable with the fuzzy borders of right/honest vs. wrong/artificial. Part of getting good at something is making such judgments about what is professional or not professional. I distrust ideology and absolute rules of censorship more. (Even if it’s just self-censorship.)
An analogy that doesn’t involve AI illustrates this. When I got interested in photography, at first I wanted to use just natural light, concentrate on technique, and avoid making local adjustments of exposure, contrast and color in post. Verisimilitude. But I found instead that photography taught me a different lesson about human perception and verisimilitude.
For example, when I took pictures at a favorite scenic outlook, I couldn’t use a camera to capture the scene as I actually perceived it.
The camera image either over-exposed the bright sky or under-exposed the dark foreground. In real life, my eyes took in the sky first, and my irises closed down; my eyes then took in the foreground, with two oddly-dressed girls on a bench, and my irises opened. The two components merged perceptually within my mind’s eye. With a single image, I could capture that scene correctly only by using either gradient filters or local software adjustments to reduce the contrast between sky and foreground.
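In software terms, that local adjustment amounts to blending exposures through a gradient mask. Here is a toy sketch in Python with NumPy, assuming two aligned exposures of the same scene (the function and its parameters are illustrative; real editors use far more sophisticated masks):

```python
import numpy as np

def gradient_blend(dark_exposure: np.ndarray, bright_exposure: np.ndarray,
                   horizon: float = 0.4, softness: float = 0.15) -> np.ndarray:
    """Blend a sky-preserving dark exposure into a foreground-preserving
    bright exposure using a soft vertical gradient mask."""
    height = dark_exposure.shape[0]
    rows = np.arange(height)
    # Mask is 1.0 well above the horizon line (keep the darker sky exposure)
    # and 0.0 well below it (keep the brighter foreground exposure).
    mask = np.clip((horizon * height - rows) / (softness * height) + 0.5, 0.0, 1.0)
    mask = mask.reshape(height, *([1] * (dark_exposure.ndim - 1)))
    return mask * dark_exposure + (1.0 - mask) * bright_exposure
```

This mimics what the irises did: each region of the frame receives the exposure the eye would have chosen for it.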
Last summer, the smoky forest fires in Canada also imposed a haze on the vista that I remembered with clarity. I felt OK about using software to reduce the haze … closer to the way I perceived it: some haze plus some remembered clarity seemed both true to life and less distracting.
Perception is fungible and reality is only as we perceive it (both in the moment and in memory over seconds or days or months or years). No bright line here, only artistic interpretation. Verisimilitude is a pretty loosey-goosey target, and I now seek accuracy over precision; that is, I will try to stay true to my perception, but accept that there are multiple possible artistic manipulations to achieve it. I’m going to use AI as an honest tool, and have no plans to unleash an artistic SkyNet in my tiny corner of this avocation.
These examples are simplistic, but I’m sure any thoughtful photographer can supply many nuanced alternative examples. I feel like I’m in control, not being manipulated by technology.
First, I was never debating whether photography is objective or subjective. Subjectiveness resides in every manner of expression, including photography. Therefore, I never believed that AI is on one side of some imaginary line between pure photography and something else (whatever that something else may be). Photography, like all arts, has always had subjective elements, and thus a photograph represents not only a scene but the subjective experience of the photographer – that is straightforward and was never up for debate.
Secondly, I never argued that people using AI noise reduction will unleash an “artistic SkyNet” as you put it.
Rather, my argument is that there _are_ some qualitative differences between AI and other automations in photography, and that some people may wish to consider them when forming their personal definition of what photography is. It’s disingenuous to call it self-censorship, because the word censorship connotes a forced restriction. It’s just a natural choice that everyone has to make, just like some people use sky replacement and others don’t. Yes, I do think there is a line to be drawn, but I don’t think the line is clear-cut at all, and it is absolutely not a line along the subjective-objective dimension!
Beyond individual choices, however, I do think that AI technology as a whole cannot really be used wisely by people as a group. In fact, my main dislike of AI has nothing to do with individual choices. Certainly, some photographers like yourself can responsibly use AI. Nevertheless, I am categorically against what AI represents and how it grows through repeated instances of the prisoner’s dilemma: some use it, so others feel pressured to use it to keep up – a sort of arms race. Therefore, the progress of AI is rather deterministic and outside of group human control. It’s a specific, observable phenomenon that I don’t want to support because the end result is detrimental to humanity.
Of course, you may feel like you are in control. The debate is not over whether individuals have some control, it is whether the range of control of individuals is shaped and molded by technology over time so that the larger community has no control.
You commented, “Therefore, the progress of AI is rather deterministic and outside of group human control. It’s a specific, observable phenomenon that I don’t want to support because the end result is detrimental to humanity…. The debate is not over whether individuals have some control, it is whether the range of control of individuals is shaped and molded by technology over time so that the larger community has no control.”
My first reaction is that you DO fear an “artistic SkyNet” and seem to long for a collective response (if I am wrong on that, sorry; but if not, then what should happen instead of just an individual boycott of AI?)
My second reaction is sympathetic. My analogy is that, at age 71, I see a level of disinformation and public manipulation in our media that is beyond my wildest imagination, both due to the latest developments in free speech and in the free press (social media and Fox News … I do a lot of my own fact-checking, down to the source-document level, and the reality is dismal). So what do I do? Rely only on my vote? What collective response could address it without worse alternative consequences?
If there’s a collective problem beyond individual control, what then do you do?
Well, what I fear is not exactly a SkyNet, but more of a technological dystopia that is not so overt as SkyNet.
You asked: “but if not then what should happen instead of just an individual boycott of AI?”
I think the danger of AI warrants an international ban on it. Not likely, I think.
You asked, “If there’s a collective problem beyond individual control, what then do you do?”
Therein lies the rub, as they say. We have no mechanisms in society to handle collective problems whose solutions do not exist within an economic framework. There’s no oversight, and it is indeed out of control.
I don’t agree with you. I don’t feel threatened by AI. It’s the natural evolution of our technology. I am very pleased with the AI noise reduction in Lightroom. I routinely use the AI-powered ‘Subject Detection’ autofocus, which is a real game changer in my opinion. I don’t have a problem with using AI to remove a Coke can from my photo. I did it with the rubber stamp tool in the past, and it’s easier and faster with the AI tool for the same outcome. I am not afraid of new technology and the future. AI will/does have risks as well as benefits, like all technology.
Thanks for sharing your perspective. I think everyone must define acceptable parameters for their art, and yours and mine certainly differ.
In terms of your last sentence and second sentence, ultimately the difference between your viewpoint and mine reduces to this: you think that the natural evolution of our technology brings a balance of risks and benefits, whereas I think technology has reached a point of diminishing returns and is more detrimental than not. In short, I do not think a future with endless new advanced technology is a good future at all. I don’t think it will impact me personally a great deal, but I think it will be worse for the world. Of course, that depends on how you weight the costs and benefits, and I suspect your weightings are quite a lot different from mine.
Excellent article and very sharp thoughts, Jason. Thank you very much. I was probably very naive about this; I thought these algorithms were mainly analysing the noise patterns in the image itself and removing those. I guess this is the way the first ones worked. Do you think these AI tools now really recognize, e.g., the feather pattern of a sparrow and replace noise with what a sparrow should look like (taken from a database)? But then I wonder: if I unplug my PC from the net, is AI denoising going to be less „effective“ because the database cannot be accessed?
I agree with you that generative AI is not for me, although I would be ready to accept some machine learning-based denoising if, indeed, it is denoising and not image reconstruction. But it may be hard to make the distinction in practice.
The algorithms are still mostly just analyzing noise patterns, but they do so by reconstructing patterns learned from other images, so how noise is removed along a feather line will differ from how it’s removed in a smooth area – and that distinction itself depends on knowledge from other images. It’s a little more complicated than a direct database lookup, but basically, it does use knowledge from previous images to replace textures depending on context.
If you unplug your PC from the net, the algorithm will still work the same. That is because the “database” – more properly, the numerical weights of a very large trained model – has already been derived from the training images. The training images (e.g., of birds and other noisy/less-noisy images) were already processed on a much larger computer, and the final model (which is like a compressed database of noise-reduction strategies) is entirely self-contained in the program.
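To illustrate, here is a toy PyTorch sketch. The architecture and the weights file are made up for illustration, and a commercial denoiser is vastly larger, but the principle is the same: once trained, inference runs entirely on the local machine.

```python
import torch
import torch.nn as nn

# A toy denoising network. After training, everything learned from the
# millions of training images lives in the fixed weight tensors below.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 3, kernel_size=3, padding=1),
)

# Hypothetical weights file shipped inside the application.
# Loading and running it requires no network connection at all.
model.load_state_dict(torch.load("denoiser_weights.pt"))
model.eval()

noisy = torch.rand(1, 3, 256, 256)  # stand-in for a noisy photo tensor
with torch.no_grad():
    clean = model(noisy)  # runs entirely offline
```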
AI noise reduction is not exactly the same as generative AI, and it uses a very different sort of model than text-to-image generators. That being said, it is like reconstruction on a very fine scale, though it doesn’t seem (at least at this point) to interpolate anything from other images the way text-to-image programs do. But of course, it recognizes structures in the image based on data from other images.
So yeah, it’s a fine line and I still think AI noise reduction isn’t outside the realm of photography, only outside the realm of my personal photography.
I’m pretty much okay with text-generative AI, since it basically works as a more profound search engine that can analyze and gather data from various sources, so you don’t need to spend so much time googling. But as for the AI they put even in modern smartphone cameras, I’m so pissed. Like, one day I wanted to make a shot of a vintage silk blazer; it was a bit wrinkled, and it was probably the first time I noticed that the camera tries to ‘smooth’ out those wrinkles (doing a shitty job, to put it lightly). And I can’t even turn this effect off! (Like, there is a toggle in the settings, but it won’t save my ‘choice’ for some reason.)
The interesting thing is that one reason you have to spend so much time Googling is that the very existence of Google’s methodology caused massive growth in endless websites, often with duplicate and low-quality information. So, one could say that the introduction of mass automation with advertising at Google was one of the main causes of people having difficulty finding things – and now AI is the solution to a problem created by a previous technology.
ChatGPT (or whatever generative AI you might be using) is not a “more profound” search engine. It will make stuff up.