General / 18 October 2024

On Prompting, Automated Prompting, Creative Processes, and The Soul of Art

I've developed a new, more interactive process for proompting, and it has me asking questions about what all of this means.

Let's discuss.

Historically, prompting has been the least important part of my process. I’ll write something out, usually about 130 words, and warp it over a piece of art I’ve already created, usually with the intention of making something more or less controlled by bending the noise. 

I came to the 130-word average through trial and error, focusing on a keyword-based approach and trying to figure out what worked best to get the results I wanted out of the machine. For me, the process here has been mathematical, because what you’re doing when you prompt is weighing your personal priorities against what you know the machine can do, and getting each piece into the right place, where it does the right thing relative to the weight of the rest of the prompt. Easy.

It’s a lot of the same mental pieces you would use for songwriting, because you’re applying the rules of the medium, and then trying to find something that hits the emotional reaction you’re looking for, within the genre constraints you have. Here, the analogy is that the quirks of the ai are basically the genre constraints, but the metaphor isn’t quite one to one.

So when I started this freelance project I’ve been on, I realized quickly that what they wanted was something closer to a fantastic realism than the kind of stuff I usually do. And that’s fine. I’m always flexible as an artist, especially when I’m being edited. I actually really enjoy the experience of doing it, because when done well, feedback creates this really interesting creative loop where you’re not the only voice, and I welcome that. Collaborations are life affirming!

But that presents two interesting problems that need to be solved, if they’re going to fit into what I would consider my normal pipeline and workflow combination. The first is a much heavier focus on prompting, which isn’t a bad thing. The second is doing my noise painting process differently, so that I can reliably produce more predictable results.

See, the way noise painting works, I can use it as either a way to explore, or as a way to control. And the differentiator there, at least for me, is in what I do with the bend, and how. Controlled noise paintings have a different kind of texture to them. It’s more like a blanket thrown over the object I’m trying to create, smoother, blurrier in spots, and there, we use lines as cues rather than uncanny textures, because the task is different.

An uncanny texture will bend the prompt in the paintover. Do it right, and you can prompt “a bushel of flowers” into a bushel of flowers shaped like a hand or some other object. “Portrait of a woman” can twist into a sculpture against the backdrop of twisted flowing abstract objects. 

And so on.

Depending on your variance, you can get to some interesting places. 

But when the objective is control… the way lines are used is different, because they map out details and corners, and often hint at perspective. Colors and color intensities get used for depth or texture. It’s hard to explain that scientifically, because the process is different for everyone. It’s going to depend on your model, your guidance scale (cfg), which sets how closely the output sticks to the prompt, and how much of the input image you’re letting the denoising setting repaint. All of these matter a lot.
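For anyone who wants to see those knobs side by side, here is a minimal sketch, not my actual setup. The class name, the field names, and the model string are all made up for illustration. The one real assumption baked in is how many img2img implementations treat the denoising setting: they skip the early part of the schedule, so only about steps times strength of the denoising steps ever touch your input image.

```python
from dataclasses import dataclass


@dataclass
class PaintoverSettings:
    """Hypothetical bundle of the img2img knobs mentioned above."""
    model: str                 # checkpoint name, a placeholder here
    cfg_scale: float           # guidance scale: how hard the prompt is pushed
    denoising_strength: float  # 0.0 keeps the input image, 1.0 ignores it
    steps: int = 20            # total sampler steps requested

    def effective_steps(self) -> int:
        """Approximate number of denoising steps applied to the input.

        Assumes the common img2img convention of running roughly
        steps * strength of the schedule on the noised input image.
        """
        if not 0.0 <= self.denoising_strength <= 1.0:
            raise ValueError("denoising_strength must be between 0 and 1")
        return min(int(self.steps * self.denoising_strength), self.steps)


# Same model and cfg, two different denoising settings.
low = PaintoverSettings("some-model", cfg_scale=7.0, denoising_strength=0.5)
high = PaintoverSettings("some-model", cfg_scale=7.0, denoising_strength=0.75)
print(low.effective_steps(), high.effective_steps())
```

The lower setting leaves more of the painted lines and colors intact, and the higher one hands more of the image back to the model. Where the sweet spot sits is still going to depend on your model and your painting.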

As random as noise painting looks, there is absolutely a rhyme and a reason to it. At least when you’re doing it to subvert and repurpose the denoiser. There are other ways, types, and reasons to use adversarial inputs.

Nothing is right or wrong. At least insofar as nothing is always right.

And that brings us back to the first rule of creative ai. “Automate everything but what you love.”

This is why experimentation is important.

So, when I get a project that shifts my focus as an artist, obviously, my first thought is to ask… “What kind of batshit crazy new process can I imagine to get closer to the objectives we’re trying to hit with this project?” Part of that is that new processes are fertile ground for creative insight. And part of it is that art gets to be too easy without them. Boredom is the enemy of every artist, and complacency breeds boredom.

If you’re ever sure about your skills at any given moment, you’re either doing the wrong thing… or you’re doing the thing wrong.

And that brings us back to proompting. What does a heavier focus on prompting look like for an artist like me, who usually uses conventional or “traditional” processes as a starting point? I wondered about that.

Part of what got me thinking in this direction is my friends over at @Rundiffusion. They’ve got a closed beta going for a new product that’s more than innovative, and they asked me to test it, find bugs, do some general QA. And it’s really cool. I’m enjoying the hell out of that process too.

But yeah, so I’m thinking about prompting, and iterative workflows, and apps that would help me get to this place where I’m resculpting realism using words against the machine. And it occurs to me that I could use ChatGPT as a voice of reason here.

And, as I got into it, I started using it to generate and review prompts. Here's how the process works.

  1. Have a discussion about the objectives of the process with the machine.

  2. Generate some very specific prompts. 

  3. Use the prompts to generate images off site in my model of choice (I use several), and add additional manual steps like noise painting and inpainting as needed.

  4. Feed the images generated back into the machine, and further discuss.

  5. Rinse, and repeat with every round of client feedback.
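If it helps to see the loop written down, here is a rough sketch of the five steps above as plain Python. None of this is a real API. The three callables (`write_prompts`, `render`, `review`) are hypothetical stand-ins for the chat model, the image model plus any manual paintover work, and the follow-up discussion, and the stub versions at the bottom exist only so the sketch runs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Round:
    """One pass through the loop, kept so each round can be compared later."""
    feedback: str
    prompts: List[str]
    images: List[str]
    notes: str


def run_rounds(
    objectives: str,
    feedback_rounds: List[str],
    write_prompts: Callable[[str, str], List[str]],
    render: Callable[[str], str],
    review: Callable[[List[str], str], str],
) -> List[Round]:
    history: List[Round] = []
    notes = objectives                            # step 1: agree on objectives
    for feedback in feedback_rounds:              # step 5: one pass per round of client feedback
        prompts = write_prompts(notes, feedback)  # step 2: very specific prompts
        images = [render(p) for p in prompts]     # step 3: generate off site, plus manual steps
        notes = review(images, notes)             # step 4: feed images back and discuss
        history.append(Round(feedback, prompts, images, notes))
    return history


# Stub stand-ins so the sketch runs; real versions would involve a person and a model.
def stub_write_prompts(notes: str, feedback: str) -> List[str]:
    return [f"{notes} | {feedback} | variant {i}" for i in range(2)]


def stub_render(prompt: str) -> str:
    return f"image<{prompt}>"


def stub_review(images: List[str], notes: str) -> str:
    return f"{notes} (reviewed {len(images)} images)"


history = run_rounds(
    "fantastic realism",
    ["more contrast", "warmer light"],
    stub_write_prompts,
    stub_render,
    stub_review,
)
print(len(history), history[-1].notes)
```

The point of keeping `notes` flowing from one round into the next is that the discussion accumulates, so each new batch of prompts starts from everything already said.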

But here’s the problem with the process.

It doesn’t feel like art to me. At least, not by itself, omitting the manual steps that get tacked onto step 3. 

Normally, when you prompt, the relationship you have with the renderer is a lot more direct. You give it instructions, like code. And you see its output, which is whatever you told it to make. Like code, words don’t always mean what you think they do, and the machine will frequently take you literally.

But this? It’s almost like a code generator. A very eager code generator, if that makes sense.

There’s an extra step there, and you’re not making all of the granular decisions anymore. 

It feels like editing, or art direction, rather than the sort of tactile sense you get as an artist when you’re making something on your own. Yet another layer of abstraction from the canvas itself. It’s still a creative process, and the spiritual connection is still there because there’s effort and sweat going in, but it’s not as hands-on as prompting usually is.

Does it make the process less meaningful?

I’m not so sure about that.

You remember the first rule of creative ai, right?

 “Automate everything but what you love.”

And I had to think about that. But I realized that prompting isn’t the part of ai that I really get excited about. In fact, at the first chance to automate it away, I did. For me, the thing that’s exciting about creative ai is the line correction, the drawing, the painting, warping, manipulation, the subversion process, the great and powerful bend.

The ad-hoc process at step 3 really brings it home for me, making it really tightly controlled and fidgety, which I like. That part is unquestionably art in the traditional sense.

Although, I have respect for people who prompt well. It’s a kind of poetry, a form of storytelling, and unquestionably a valid artform in and of itself, with or without the machines to process it. I know people who do things that I still consider magical. Especially at the advanced level. 

I like to push back on the “anybody can do it” critique, because it’s really obvious when someone is advanced prompting, and when they’re not. And I feel like, if ai critics had the kind of artistic eyes they seem to think they do… they would realize that too. Or maybe they do, but they’ll never let on.

It’s sad when people literally let ideology cloud their vision.

So I don’t really care what they think.

Reductive takes with no room for process or nuance aren’t really valid opinions to begin with.

But the question I’m struggling with now is, where do I stop as an artist… and where does the machine begin?

Does it matter?