Over the last few months, there’s been an explosion of AI-powered tools for photographers. While there were already a few AI features in Photoshop, the most recent release introduced Generative Fill, which is on another level. Today, I’ll be taking a look at Generative Fill – from its potential to its limitations.
What Is Generative Fill?
Generative Fill is a quintessential example of AI (artificial intelligence) imagery. It uses machine learning to generate entirely new image content, drawing on an expansive training dataset built around Adobe Stock images and openly licensed work.
There are essentially two different implementations of Generative Fill in Photoshop’s beta at the moment. One of them (which I expect to be more useful to photographers) basically functions as “Content-Aware Fill on steroids.” It’s capable of filling in huge areas, all without falling into the trap of repeating patterns that typify other fill or healing processes. Note how smoothly it removed this car despite the complex background:
That’s a really impressive result, considering the complex patterns, challenging lighting, and areas without any source material. Photoshop essentially dreamed up what the most likely background would be like, and it filled in the missing area really well.
But Generative Fill can do much more than that – and this is where it starts to be less useful for pure photography and perhaps more useful for commercial/advertising work. Namely, Generative Fill can add entirely new elements to your scene based on a short text prompt that you write – functionality commonly referred to as “text-to-image.”
You can see below that it can create a white pickup truck in the same area of the image, if I type “white truck” after making my selection. The truck that Photoshop generated definitely has some of the weirdness you’ll see in AI-generated photos, but it’s close enough to a real-looking truck that it would only take a minute or two of manual fixes to make it look pretty seamless. The crazy part is that Photoshop even generated the shadow of the truck correctly, as well as the wide-angle (slightly stretched) perspective of my lens! The truck has some weird-looking parts, but it doesn’t feel like a cardboard cutout stuck onto the image. It feels like part of the scene.
How is Photoshop doing this? In short, it uses context from the other areas of your image to generate its best approximation of the subject. That includes all sorts of complex things: perspective, reflections, lighting, and the appearance of shadows.
Even so, the tool is not without its limitations. Generative Fill is still in beta, and it’s also inherently a bit random. One limitation right now is that the generated area is limited to a maximum of about 1,000 pixels on its longest side. If your selected area is bigger than that, the generated material is upsampled and stretched to cover the gap, which can make it look blurry and out of place. You can work around this by making multiple, smaller selections, as sketched below.
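To put rough numbers on that workaround, here’s a minimal sketch (plain Python, not any Photoshop API – the 1,000-pixel cap and 64-pixel overlap are illustrative values, not official Adobe specifications) of how you might plan a grid of smaller selections so that each Generative Fill pass stays under the size limit:

```python
# Minimal sketch: divide an oversized selection into overlapping tiles so
# each Generative Fill pass stays under the (approximate) size limit.
# These numbers are illustrative, not official Adobe specifications.
MAX_EDGE = 1000   # approximate longest edge before upsampling kicks in
OVERLAP = 64      # shared context between neighboring passes, in pixels

def plan_tiles(width: int, height: int) -> list[tuple[int, int, int, int]]:
    """Return (x, y, w, h) rectangles covering a width x height region,
    each no larger than MAX_EDGE on a side."""
    step = MAX_EDGE - OVERLAP
    tiles = []
    for y in range(0, max(height - OVERLAP, 1), step):
        for x in range(0, max(width - OVERLAP, 1), step):
            w = min(MAX_EDGE, width - x)
            h = min(MAX_EDGE, height - y)
            tiles.append((x, y, w, h))
    return tiles

# Example: a 2400 x 1600 px area works out to a 3 x 2 grid of passes.
print(plan_tiles(2400, 1600))
```

The overlap matters in practice: giving each pass some shared context with its neighbors helps the generated textures blend together rather than showing seams.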
Another limitation of Generative Fill is its reliance on cloud processing. Because the tool runs on Adobe’s servers, you need an active internet connection to use it, and there’s a noticeable processing delay. It’s still fast compared to some AI image generators – and it even provides three variations each time – but it isn’t the near-instant feedback Photoshop users might expect from other tools.
While Generative Fill’s text-to-image functionality might be the more eye-catching aspect that captures headlines, for many photographers, the content-aware filling side of things will probably be more useful. Many times before, I’ve faced a daunting amount of cloning and spot healing to remove unwanted distractions from a difficult photo. When I revisited those pictures with this tool, I was amazed at how it eliminated the distractions with one click and a sloppy selection.
Replacing Content-Aware Fill and Healing Brushes
In combination with the new Remove Tool that Photoshop also added (which effectively covers smaller areas), Generative Fill is now high on my list of Photoshop tools when I need to remove large or complex distractions.
It’s honestly so effective that I think photographers will need to reckon with their personal ethics and artistic responsibility when it comes to creating images for contests, client work, and even just personal photos. The ease and degree to which you can alter a scene is really impressive, and a bit spooky.
The ethical aspects would make a good article in themselves. But to put things into context, let’s look at some more examples of how dramatically this tool can change a photo. Here’s one that amazed me:
What would you do if you wanted to remove all the cars from this photo? Using Photoshop’s previous tools, it would be a very time-consuming task. Just look at the wide variety of textures around the cars – you would have to rebuild plants, trees, the road, walls, and so on. While it’s totally doable, the effort would be pretty massive. Yet it took just five selections and a single run of Generative Fill per car – maybe two minutes of work in total.
Here’s the result – pretty incredible at web resolution, and even when zooming into the full-res photo, essentially flawless:
A lot of photographers will find themselves removing telephone poles or streetlights against complex backgrounds, so I wanted to try that next. Here’s an example of a photo that would take a lot of manual editing to remove the light pole, considering the fine patterns in the glass:
Again, this would have been doable with Photoshop’s previous tools, but it would have taken some time. Generative Fill got it very close to “optimal” right away, though, even with a loose selection that I made:
Canvas Extensions
Another thing that works surprisingly well is Generative Fill’s ability to expand beyond the frame. Maybe you’ve already seen people experiment with this feature by expanding classical paintings or album covers to give humorous “context” to the original pieces. For photographers, the more useful application is to fill in some extra canvas on any side of the photo.
Again, this raises ethical questions that will leave many photographers refusing to use the tool outright. But if you’ve ever dealt with a picky client who suddenly changes their mind about something way too late to fix it, this could be a lifesaver. What if you shot a horizontal photo, but they suddenly request a vertical one instead? Generative Fill can have your back:
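To show what that rescue actually involves, here’s a quick back-of-the-envelope sketch (plain Python; the 6000×4000 frame and 4:5 target ratio are just example values) of how much new canvas a horizontal-to-vertical conversion requires:

```python
# Minimal sketch: how much canvas must be generated to turn a horizontal
# frame into a vertical one without cropping the original. The frame size
# and 4:5 target ratio below are illustrative examples only.
def vertical_padding(width: int, height: int,
                     target_ratio: float = 4 / 5) -> tuple[int, int]:
    """Return (top, bottom) pixels of new canvas needed so the full
    original fits a vertical frame with width:height = target_ratio."""
    target_height = round(width / target_ratio)
    extra = max(target_height - height, 0)
    return extra // 2, extra - extra // 2

# Example: a 6000 x 4000 horizontal becomes a 6000 x 7500 vertical (4:5),
# so Generative Fill has to invent 3500 px of image (1750 top and bottom).
print(vertical_padding(6000, 4000))   # -> (1750, 1750)
```

In other words, nearly half of the final vertical image would be generated content – worth keeping in mind when weighing the ethical questions above.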
This application of Generative Fill works best, at least so far, for web-resolution images. Zooming in extensively will usually reveal a few areas that don’t look quite right. It also depends on the image – sometimes the canvas extensions look great, while other times Photoshop will hallucinate unwanted new objects in the generated areas. And that brings me to the next point I want to make: this type of technology isn’t always perfect.
Challenging Situations for Generative Fill
If you play around with Photoshop’s beta for Generative Fill, you’ll soon realize that there are some things that it just doesn’t get right. Most of these limitations involve generating new content from scratch, but some of them also apply to spot-healing type applications. The biggest area where it struggles? Text.
Generating text is a major limitation of AI imagery in general. The software knows how to create text-looking gibberish, and sometimes it generates real letters of the alphabet in an unintelligible order – but if you want the AI content to actually say something, you’re almost entirely out of luck.
A related area where Generative Fill struggles is if you need to generate a logo or anything similar. I’d imagine this is a tough issue from both a copyright perspective and a training data perspective. The more obscure your request, the worse your results will be. For example, you can get Generative Fill to make a reasonable facsimile of an American flag on a flagpole, but flags of other countries range from decent replicas to complete misses.
Remember the truck that it generated in my earlier example? That’s another limitation of Generative Fill – it was definitely a “truck-shaped object,” and it looks fine out of the corner of your eye. But close inspection reveals a lot of issues, like differently-sized tires, strange door handle positioning, and no license plate.
That’s often the case when generating objects from scratch in Photoshop. More complex subjects tend to have more issues, but even something simple like a bench doesn’t look quite right upon close inspection:
The software also has trouble combining multiple concepts at once. For example, if I ask Generative Fill to generate “dog sitting on a park bench,” it gives me the following illustration:
Pretty awful!
Granted, you can improve matters by splitting this prompt into two concepts – first generating a bench, then a dog on that bench. It still looks iffy, but at least it’s better:
However, that goes to show it will still take some time before the “generative” side of things works as well as the “fill” part. As photographers, that’s probably how we’d want it anyway – but for commercial applications, it may still take a while before you can get totally convincing results across a wide variety of complex prompts.
The Future
It’s still very early days for this type of tool. Adobe specifically limits it to the beta branch of Photoshop and has disallowed commercial use for now. But a lot of questions remain about the future of Generative Fill (and of other content-generation tools like it).
One open question is just how copyright is handled when you use this tool in an image. Will you be able to copyright the work if 5% of it is AI-generated? What about 80%? Different jurisdictions are also considering regulation and disclosure requirements around AI content, which could affect commercial photographers heavily. I suspect that Adobe will navigate copyright-related issues better than most AI companies, thanks to their large stock library, but it’s hard to know the full implications just yet.
For artistic purposes, there’s inevitably going to be a massive divide in the art community as a whole, and in photography specifically. If you can generate an amazing sunset on demand, does that devalue the work of a photographer who returns to a location multiple times to get the perfect light? What if the photo is mostly real, but a tool like Generative Fill was used to extend the canvas on one side by an inch or two?
Then there’s the question of how Adobe is going to price this. Server time isn’t free, particularly for the high-powered GPU compute needed to run these models, and Adobe hasn’t shied away from pricing models unpopular with the community. It remains to be seen whether this tool will one day be hidden behind a paywall. The tech still has a ways to go before I’d pay for “credits,” but it may get there before long.
I think Pandora’s box has already been opened, and AI-generative tools like this one will only get better. We can expect it to work at higher resolutions in the future, with better handling of details and more natural image generation. Even in its current form, however, this is a very powerful tool for photographers – especially if your type of work leaves you spending a lot of time spot healing. It’s remarkably effective at eliminating distractions that would otherwise be too time-consuming to clone out.
Have you tried out Generative Fill yet? Do you see it playing a significant role in your editing workflow in the future, or are you anti-AI? I’m curious to hear your thoughts in the comments below.
It used to be that a photo could be trusted – a photo doesn’t lie. But not anymore, and it hasn’t been for a long time.
Drab, low contrast day? Enhance the color and fabricate a dramatic sky.
Troublesome telephone poles, unwanted people, cars, signs, trash, etc. cluttering your composition? Remove them artificially, the lazy way.
I already assume that most photos on the web are fabricated to varying degrees. When a fake photo competes with a real one it wins more eyeballs, increasing the pressure to fake them even more.
The simulated pickup truck and the dog on the bench look ridiculous by the way.
When this text-to-image business comes to cell phones, that’s when you’ll see it explode. Instagram filters already smear a person’s face so much that they’re unrecognizable from their actual appearance, so there’s a lot of appetite for this kind of fake junk.
Just call it “fake photography”, that’s the proper term. “Photo has been faked”.
I took a large group photo at the beach. I used it to remove beach chairs that were lying in front of the group and blocking two people’s legs. It did a great job of removing the chairs and adding legs to the people. However, it changed the clothing the woman was wearing and put red shoes on her instead of leaving her barefoot. I didn’t pick up on this, but she was very confused when she saw the photo.
The problem is not the tool. It’s the humans using it. I talked to a friend who is a 25-year veteran matte painter in London. He’s not scared at all. He says the clients are so specific that AI will never cope. What he does not understand is that his industry went from documenting reality with an artistic twist to full-blown comic-style pictures. Just look at the blockbusters. They’re all 90% CG. Even in Cast Away with Tom Hanks, the sea was 100% digital… The current development speed of AI will replace his job in the next 2–3 years, if not earlier. Period. Because it’s about speed and time-to-market.
But what scares the living daylights out of me is that history can literally be re-photographed at this point. The same way Wiki pages are edited by activists nowadays, and not historians. Let’s be honest, the truth is being actively hidden in the world by activist journalists, who are paid by very few oligarchs, who own all the newspapers. With AI, there’s no re-education of journalists necessary. The propaganda can be written into the algorithm. If we consider that the past has always been told by the victors, I have to assume that this has great potential to be a big playground for AI (aside from porn). Literally replacing the activist with AI very, very soon…
I somewhat disagree. I think the problem is the tool itself, as well as the humans using it. The problem is that the tool is too alluring, and gives certain people advantages. But if some people use it, other people feel pressured to use it. It’s human nature. Advanced tools work that way: they play upon human instincts and grow themselves into society.
I truly do believe tools have inherent natures just like we do, and that some tools like AI just should never have been invented. However, I DO agree with the rest of your comment, namely the negative effects that will result from AI.
All these tools that are made to easily change photos only work at the level of web images (at the moment).
At the pixel level, it is mostly a waste of time, but in some cases it can help you get a pixel-perfect result sooner.
Nothing new in manipulation – it has been there since the early days of photography, in all sorts of forms, and AI is just another step that makes it easier.
Journalistic photos are the ones that may not be manipulated. Still, even taking a photo one second earlier or later can have a huge impact on the content and the message the photo sends out.
It’s hard to find photos of laughing people in Ukraine in the press, but of course people there laugh. Only the press thinks an image of people looking sad is more appropriate.
Images that are supposed to tell a truth are the ones you have to be careful with.
Brave New World, predicted by Aldous Huxley in 1931, has just arrived!! So be it!!!
I hate AI applied to photography. This new technology is simply a nefarious form of cheating, made to fake and deceive. We will never know if what we see is real or a lie.
I am glad that I sold almost all of my photo equipment. I kept my trusty Fuji and 2 lenses and will not be buying anything new for a long, long time.
Photography as a commercial art form is as dead as the dinosaurs. Best of luck to those of you trying to stay in business, but my advice is to look for other ways of paying your bills in the near future 😉
It’s honestly part of why I’ve gone back to film for my personal work. No digital steps in the process, 100% tangible. I spend too much time in front of screens and see enough fake photos as it is, even before AI started appearing – which will make it 10x worse.
I understand the commercial reasons for using these tools, but that’s not why I got into photography in the first place.
I am not a fan of manipulated photos, but don’t confuse photography with reality. Photography is merely an image of reality, not reality itself. Shooting black and white film is already playing with reality, especially if it is not panchromatic. Or shooting with a polarizing filter.
I disagree with that. If you show a person a black-and-white photo of a scene, they’ll basically recognize it as a black-and-white rendering of that scene. Or if you show someone a mild edit of a photo and then show them the real scene, they’ll basically say, “that’s about right.”
There is a fundamental difference between that sort of reality-approximation and what is happening here. Here, if you use Generative Fill or some other tool and show the person the resulting photo along with the real scene, they’ll be confused.
The difference is qualitative. Of course photography cannot represent reality exactly, but then neither can your eyes. That is not the point. The point is that even digital photography with fairly mild edits will reflect a basic level of truth from the original image that the viewer can rely upon, using their mind to extrapolate what the scene might have looked like. On the other hand, the drastic edits displayed here simply do not do that. They cross the line from a cohesive reflection of truth into the bizarre, destroying that subtle essence of photography that makes it so special.
(Of course, other drastic techniques, such as manually and severely altering a scene, do something similar, but AI makes it so easy that it will have profound effects on photography.)
For me, what the camera captures in a RAW file is not exactly what I saw originally. Post processing for me brings the image back to what my eyes saw in the first place. A little more processing will help direct the viewer’s eye to what I think is important about a scene.