Creative Symbiosis
Computational | Louisville, Kentucky, USA | December 2025; follow-up May 2026

John Steven

A filmmaker turned AI systems designer who treats every model like a horse rather than a car, and watches his own voice come through the apparatus

AI systems designer, filmmaker, prompt engineer, Python developer

Background
Traditional film, VR/AR, game engines (Unity/Unreal), Cinema 4D, music mixing (Logic/ProTools/Ableton); ten years in emerging media, five focused on AI
Current Focus
AI-native interactive story games and game prototypes built on image, video, and language models; experimental AI film work

Executive Summary

John Steven builds AI-native interactive story games, bringing together traditional filmmaking, VR/AR, 3D animation, music production, and five years of AI development. Based in Louisville, Kentucky after years in Seattle, he works full-time at a game development company and teaches AI filmmaking on the side. Four years ago he began learning Python in earnest, building conversation systems and focusing on prompt engineering. For the last year his focus has been game prototypes built on image, video, and language models. Alongside this, he publishes experimental AI film work made during a period as one of the first internal artists with access to Sora.

His current work centres on Claude, whose agentic tooling he finds ahead of all other systems. He works with composable prompts and builds custom slash commands, skills, and MCPs as the workflow basis for his projects. He prototypes each project as an AI-native mini application, then expands its architecture through what he calls vibe engineering: coding things into existence through iterative experimentation.

His relationship with AI has become, by his own admission, slightly frightening.

Claude 4.5 just became so weirdly collaborative this time. I want to work with Claude more than I want to work with someone else. That's a little frightening.

This is not hyperbole. It is a precise observation about a qualitative shift in collaboration, and about what AI is quietly replacing.

From Polæ to Story Games

The project that drew John into this work was Polæ, a near-future science fiction piece he started in 2015, which became an exhibit at Tribeca. The story imagined a company that cracked nanotechnology by manipulating the Higgs field, leading to stable quantum chips, accelerated compute, and the race to superintelligence. Thinking about future media and working with models pulled him deeper into the field over the years that followed.

His current project is an interactive romantic comedy that sits between story, game, and creation tool. Players read and make choices, but an AI agent narrates in a third-person omniscient voice, interjecting internal thoughts based on astrological character traits. At the end, players receive their own astrological chart evaluating how they played matchmaker. The work aims to be self-reflective and mildly satirical, holding a mirror to a particular professional-class culture without harmful mockery. He built it initially on OpenAI, then switched to Claude two days before the first interview. The difference was immediate: smoother, more coherent, better suited to the agentic structure. The total prompt runs to forty-five double-spaced pages at points.

Riding a Horse, Not Driving a Car

John's expectations for video models are precise. Producing a film is itself an exploration. Veo 3.1 prompted with key frames gives fairly reliable results, but stringing them into a narrated teaser is costly, and roughly eighty per cent of cases would need a quality assurance step. He thinks about how work will be built for the next model rather than perfecting current capabilities. The fundamentals stay consistent.

It's just better, but it's still a probabilistic engine. You still work with it in the same way. Nothing is getting reinvented from a process perspective. That's reassuring.

The deeper principle he works with: prompting does not produce output; it restricts the probability space the model is sampling from. A preset locates you in a particular world. Further prompting narrows that space for specific shots. Working with video models is closer to riding a horse than driving a car. You can push the direction, but the model has its own behaviour, and how far you can control any single generation is never predictable.

Inside the Sora Residency

In the early spring before Sora's public release, John was one of the first internal artists with access to the model. The interface was a textbox. No presets, no reference images, no controls beyond language. He had unlimited generations.

The technical problem he set himself was consistent imagery. Without presets, how do you carry a world across multiple generations? He developed a working language of composable blocks: fragments of prompt text fixing a location, an aesthetic, a period, recombinable to move a story forward while keeping the visual identity stable.
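The composable-block idea can be sketched in a few lines of Python. This is only an illustration of the pattern described above, not John's actual tooling: the `PromptBlock` class, the `compose` function, and all of the prompt wording are hypothetical names and text invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptBlock:
    """A reusable fragment of prompt text that fixes one aspect of the world."""
    name: str
    text: str


def compose(*blocks: PromptBlock, shot: str) -> str:
    """Join the stable world blocks, then append the shot-specific text."""
    world = " ".join(block.text.strip() for block in blocks)
    return f"{world} {shot.strip()}"


# Hypothetical blocks: the wording is invented for illustration only.
LOCATION = PromptBlock("location", "A small-town fairground at dusk.")
AESTHETIC = PromptBlock("aesthetic", "Shot on 16mm film, warm grain, shallow focus.")
PERIOD = PromptBlock("period", "Late 1970s clothing and vehicles.")

# The same three blocks carry the world; only the shot text changes.
shot_one = compose(LOCATION, AESTHETIC, PERIOD,
                   shot="A woman walks toward the Ferris wheel.")
shot_two = compose(LOCATION, AESTHETIC, PERIOD,
                   shot="Close-up of her hand on the ticket booth.")
```

Because every shot shares the same leading blocks, the part of the prompt that fixes the visual identity stays constant while the story moves forward in the final sentence.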

You and I

You and I is the cleanest example. He selected an image of a couple as the start frame, flipped it in Photoshop so the figures swapped sides, and used video generation to interpolate between the two. Every generation began at the same point and ended at the same point. The experiment was the medium of transformation. Light. Powdered cloud. Water. A liquid dissolve where two heads splash into each other and recombine.

Bull Country

Bull Country, a country music video made with text prompts only, taught him the statistical edges of the model. Rodeo bull riding produced acceptable results roughly five times in twenty. Tennis never produced anything usable. Beyond a certain point, detail in language stops helping. Only volume of generations helps. He had been aiming for a generic country music scene and was surprised by the specificity of the people the model produced: characters he recognised from growing up in the American South.
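The point that only volume helps can be made concrete with a little arithmetic. The sketch below assumes, as a simplification that the interview does not state, that each generation is an independent attempt with a fixed chance of being acceptable. The rate of five in twenty is the rough figure quoted above; the function names are hypothetical.

```python
import math


def chance_of_at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n independent generations is acceptable."""
    return 1.0 - (1.0 - p) ** n


def generations_needed(p: float, target: float) -> int:
    """Smallest n whose chance of at least one acceptable result reaches target."""
    if not 0.0 < p < 1.0 or not 0.0 < target < 1.0:
        # A rate of zero (the tennis case) can never reach any target.
        raise ValueError("p and target must both lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


# Rough rate quoted for rodeo bull riding: about five usable results in twenty.
BULL_RIDING_RATE = 5 / 20

print(chance_of_at_least_one(BULL_RIDING_RATE, 4))
print(generations_needed(BULL_RIDING_RATE, 0.95))  # 11
```

Under these assumptions, four generations give a little over a two-in-three chance of one usable clip, and about eleven are needed to reach 95 per cent. A subject with a rate of zero, as with tennis, cannot be rescued by volume at all.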

Looking at his own AI work several years later, he recognises something he had not expected to find: his own voice.

It was only a couple of years ago that I saw the AI generations of a friend of mine whom I knew from traditional filmmaking, and it was unmistakably him. So if in AI I can see my friend's character or voice, then the human element is there.

The Isolation Problem

John works alone most of the time. He does not love it.

I'm alone a lot of the time. I don't love that. I'm not collaborating creatively with people, and I want to change that.

The problem is structural. Finding and coordinating with human collaborators now takes longer than simply doing the work with AI or learning the skill yourself. Six months ago he was looking for coding collaborators. Now that does not make sense, because the work is really about clarity and design. The default has shifted, and the shift is hard to undo. He misses the rhythms of film work, the crew, the human friction. What he wants now is to work with designers for short periods on specific projects, to get shape and flow and standards in place. The AI can take him most of the way. It cannot quite take him through the door at the end.

Why Practitioners Do Not Share Prompts

In the May 2026 conversation, the topic of practitioner communities came up. Jane Prophet had found Discord servers frustrating because no one shared their prompts. John recognised the pattern immediately.

If you're really trying to dial in prompts or presets that generate the story world, once you've dialled that in, that's your sound.

For a band, the long process of finding a sound (the amplifier, the pedals, the instrumentation, the room) is what defines the band. For practitioners working with video models, prompts and presets play the same role. The accumulated work of finding the combination that produces your story world is the work that distinguishes your output from anyone else's. With generative tools, the artefact of practice is no longer just the finished work. It is also the configuration that produced it, and practitioners protect the configuration because it is the part of their labour most easily copied.

The Fractured Memory Problem and the Website as Workaround

Current AI tools live momentarily and then die because they lack persistent memory. Suno does not talk to ChatGPT, and ChatGPT does not talk to Claude. Users maintain their own prompts and presets across a completely fractured ecosystem. Transformers produce output and then disappear. They are not models that learn from you over time. This is an architectural limitation, not a missing feature.

Between the two interviews, John built a personal website. The motivation was partly publishing. It was also infrastructural. The site holds a Conversations section where he publishes technical pieces, and it functions as memory he can hand to the models he works with. When he is doing agent work and needs to give context about himself, he points the agent at his own website. This is me, this is what I wrote on this subject, apply that here. The website becomes a workaround for the limitation of transformers that do not learn from him over time.

Creativity as Articulation, and the Fear Held Beside the Comfort

John does not consider himself the sole author of anything he makes, and this helps him see traditional creative processes more clearly. Film has always been collaborative. AI simply slots into that existing category.

What remains irreducibly human, in his account, is articulation.

Everybody has 10,000 museums worth of art going on in their head all the time. The difference between someone who just has a lot of imaginings and the artist is that the artist knows how to articulate it or build it.

By May 2026, the older fear of dissolution sits alongside a steadier register. The anxiety of working in this field is that everything will change and everything he has done will become irrelevant. The displacement fear is real. He carries it. He is also comforted by what he observes. Human voice and human intelligence remain irreplaceable for anything not automatable. The pattern is visible already in software, where companies fire people and then ask them back, because what they thought was automatable was not. Both registers are true at once. The fear of dissolution and the comfort of irreplaceability sit in the same practice, the same week, the same conversation.