🎬 Ultimate Guide: Create Fully Automated AI Videos with Just One Prompt (100% Workflow, 100% Free)

Turn simple prompts into cinematic short films using ChatGPT / Gemini / Grok + Whisk AI + Meta AI / Digen + CapCut / Canva, with synchronized sound and music for every scene.

Alternative: Grok AI can stand in for ChatGPT + Whisk + Meta / Digen in a single tool.

🌟 Overview

AI filmmaking has entered a new era — you can now produce complete emotional short films without a single camera or 3D model.

This guide will show you how to fully automate video production, from story writing to scene animation, image generation, sound design, and final editing.

Everything runs on four groups of free tools:

  • 🧠 ChatGPT / Grok / Gemini – writes story and scene prompts
  • 🎨 Whisk AI – creates consistent high-quality images
  • 🎥 Meta AI / Digen – turns stills into animated shots
  • ✂️ CapCut / Canva – assembles clips, adds music, exports in 4K

💡 Example used in this guide: “Pixar-style 3D anthropomorphic cat short film.”
You can replace this with any other concept — fantasy, romance, sci-fi, or real-life drama.

⚙️ The 4-Step Automated Workflow

🧩 Step 1 — Generate Story, Scenes & Prompts (ChatGPT)

Use ChatGPT to automatically create:

  • A complete short story (1–1.5 minutes)
  • 10–15 cinematic scenes
  • Prompts for images, videos, and sound

Here are two sample prompts you can copy: a short one, followed by a customizable master prompt.

Short prompt:

You are a professional 3D animation director.
Create a short emotional film about [Kung Fu Panda] (1–1.5 minutes) with no dialogue, told through visuals, music, and movement.

Output format:
For each scene (1–15):
1. Scene Title
2. Scene Description (setting, light, emotion)
3. Image Prompt (for WhiskAI)
4. Video Prompt (for Meta Imagine / Emu / Digen)
5. Sound Prompt (for music or ambient sound)

After all 15 scenes, combine everything into one downloadable .txt file.
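If you reuse the short prompt for many topics, a small template helper saves retyping. The sketch below is only an illustration: the names SHORT_PROMPT_TEMPLATE and build_short_prompt are hypothetical, and the result is still pasted into the chat tool by hand.

```python
# Hypothetical helper: fills the bracketed topic of the short prompt.
# The template text mirrors the short prompt above; nothing here calls any AI service.
SHORT_PROMPT_TEMPLATE = """You are a professional 3D animation director.
Create a short emotional film about {topic} (1–1.5 minutes) with no dialogue, told through visuals, music, and movement.

Output format:
For each scene (1–{scene_count}):
1. Scene Title
2. Scene Description (setting, light, emotion)
3. Image Prompt (for WhiskAI)
4. Video Prompt (for Meta Imagine / Emu / Digen)
5. Sound Prompt (for music or ambient sound)

After all {scene_count} scenes, combine everything into one downloadable .txt file."""


def build_short_prompt(topic: str, scene_count: int = 15) -> str:
    """Return the short prompt with the topic and scene count filled in."""
    topic = topic.strip()
    if not topic:
        raise ValueError("topic must not be empty")
    if scene_count < 1:
        raise ValueError("scene_count must be at least 1")
    return SHORT_PROMPT_TEMPLATE.format(topic=topic, scene_count=scene_count)


if __name__ == "__main__":
    # Example topic is an assumption, chosen only for illustration.
    print(build_short_prompt("a lonely lighthouse keeper"))
```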

🧠 MASTER PROMPT (Customizable): use this prompt if you want to build a detailed background and characters.

You are a Pixar-style 3D animation director & screenwriter specializing in emotional, cinematic short films with anthropomorphic cats.

Follow these steps exactly:

🧩 STEP 1 – Create 5–7 Emotional Short Stories

Write 5–7 original short stories (1–1.5 minutes each).
Each story must be:

  • Deeply emotional, about real-life themes (poverty, loneliness, betrayal, sacrifice…).

  • No dialogue — only visuals, light, and emotion.

  • End with a surprising but logical twist.

  • Cinematic Pixar 3D tone, compact, evocative.

After finishing, ask:

“Which story would you like me to expand into a 15-scene film?”


🐾 STEP 2 – Fix Main Character (After User Chooses Story)

Create one anthropomorphic cat character used in all scenes.

Template:

  • Name: [Choose fitting name, e.g. Neko, Mimi, Shadow]

  • Species: Anthropomorphic cat

  • Appearance: Human-like body, cat head, expressive eyes, soft fur reflections.

    • Example A: Neko – gray fur, slim body, denim jacket, warm cinematic light.

    • Example B: Mimi – white fur, round body, pink hoodie, pastel light.

  • Style: Pixar 3D, ultra detailed, cinematic lighting, emotional realism.

Then say:

“Character locked. Expanding story into 15 cinematic scenes.”


🎞 STEP 3 – Expand Story into 15 Cinematic Scenes

Generate 15 continuous scenes (≈1 min total).
Each scene must include:

  1. Scene Title

  2. Scene Description: place, time, light, mood, action (repeat full character description).

  3. Emotion / message — visual storytelling, no dialogue.


🎨 STEP 4 – Generate Prompts for Each Scene

For every scene (1–15), create:

🖼 Image Prompt (WhiskAI)

  • Format: 9:16 vertical

  • Style: Pixar 3D anthropomorphic cat – ultra detailed – cinematic lighting – soft shadows

  • Include background, depth, emotion, and repeat full character description.

🎞 Video Prompt (Meta / Digen)

  • 9:16, 3–5 sec cinematic motion

  • Pixar 3D anthropomorphic cat, same character

  • Include:

    • Camera: slow pan / dolly / zoom

    • Character motion: blink, head turn, soft movement, tail or hand motion

    • Lighting: soft reflections, emotional realism

🔊 Sound Prompt (Audio)

  • Describe ambient sound or background music (piano, rain, wind, city hum, etc.)

  • Match scene emotion.


📄 STEP 5 – Final Output Format

Output all 15 scenes in this format:

🎬 Scene 1 – [Title]
Description: [setting, emotion, action]
🖼 Image Prompt (WhiskAI): [full prompt 9:16 Pixar 3D anthropomorphic cat...]
🎞 Video Prompt (Meta / Digen): [camera + motion + lighting]
🔊 Sound Prompt: [music / ambient sound / tone]

Repeat for all 15 scenes, then say:

“All story, image, video, and sound prompts are ready. Compiling into one downloadable .txt file.”


GLOBAL RULES:

  • Language: English

  • No dialogue

  • Style: Pixar 3D – anthropomorphic cat – ultra detailed – cinematic lighting – soft shadows – emotional realism

  • Character and mood must remain 100% consistent across all scenes.
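The master prompt ends here. Because it asks the model to repeat the full character description in every scene, it can help to check the returned prompts before generating images. The sketch below is not part of the prompt and rests on assumptions: the function name missing_character_description is hypothetical, and it assumes you have already collected the image prompts into a dictionary keyed by scene number (Python 3.9+).

```python
def missing_character_description(prompts: dict[int, str], character: str) -> list[int]:
    """Return the scene numbers whose prompt lacks the locked character text.

    Matching is case-insensitive and ignores differences in whitespace,
    but it is a plain substring check, so reworded descriptions are reported.
    """
    needle = " ".join(character.lower().split())
    missing = []
    for scene_no in sorted(prompts):
        haystack = " ".join(prompts[scene_no].lower().split())
        if needle not in haystack:
            missing.append(scene_no)
    return missing


if __name__ == "__main__":
    character = "Neko, gray fur, slim body, denim jacket"
    prompts = {
        1: "Pixar 3D, Neko, gray fur, slim body, denim jacket, rainy alley, 9:16",
        2: "Pixar 3D cat staring at a puddle, 9:16",
    }
    print(missing_character_description(prompts, character))
```

Any scene number the check reports is a candidate for regenerating or hand-editing its prompt before it goes into the image tool.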


✅ RESULT

This prompt makes ChatGPT auto-generate:

  • 5–7 story ideas → user chooses

  • Locked character → consistent

  • 15 cinematic scenes → each with image, video, and sound prompts

  • Ready-to-use .txt file for WhiskAI + Meta/Digen + CapCut

✅ ChatGPT will:

  • Write several short stories
  • Ask you which story to expand
  • Break the chosen story into 15 short scenes
  • Generate detailed prompts for visuals, motion, and sound
  • Export everything in one structured .txt file for easy copy/paste.

🎨 Step 2 — Generate Consistent Images (WhiskAI)

Go to Whisk AI, click the Visit Website button, and log in. Then paste the Image Prompt for each scene.

Tips:

  • Enable Exact Reference (Refine mode) to keep characters consistent across all images.
  • Use the 9:16 vertical format (ideal for Shorts/Reels).
  • Save each generated image as scene01.png, scene02.png, etc.
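The zero-padded naming in the last tip keeps files in scene order when sorted alphabetically. A minimal sketch of that convention follows; scene_filename is a hypothetical helper name, not part of any tool mentioned here.

```python
def scene_filename(scene_no: int, ext: str = "png") -> str:
    """Build a zero-padded file name such as scene01.png."""
    if scene_no < 1:
        raise ValueError("scene numbers start at 1")
    return f"scene{scene_no:02d}.{ext}"


if __name__ == "__main__":
    names = [scene_filename(n) for n in range(1, 16)]
    print(names[0], names[-1])
    # With two-digit padding, alphabetical order equals scene order.
    print(sorted(names) == names)
```

Without the padding, scene10.png would sort before scene2.png in most file browsers.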

✅ WhiskAI is 100% free and produces ultra-detailed Pixar-like 3D images.


🎥 Step 3 — Animate Images into Videos (Meta Imagine / Digen)

Upload each still image to Meta AI (100% Free) or Digen AI (Freemium).
Then paste the Video Prompt from ChatGPT.

Recommended camera & motion effects:

  • Slow dolly-in, soft pan, or handheld drift
  • Micro animations: blinking, tail movement, subtle breathing, slow gestures
  • Lighting: cinematic reflections, warm tones, emotional realism

Each output should be a 3–5 second clip.
Export all clips in 9:16 format.
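As a quick pacing check, you can work out how long the finished film will run from the clip count and clip length. The sketch below is plain arithmetic; the function names runtime_range and scenes_needed are hypothetical, and it ignores any time lost to transitions.

```python
import math


def runtime_range(scene_count: int, min_clip_s: float = 3.0,
                  max_clip_s: float = 5.0) -> tuple[float, float]:
    """Shortest and longest total runtime in seconds for the given clip count."""
    return scene_count * min_clip_s, scene_count * max_clip_s


def scenes_needed(target_s: float, clip_s: float) -> int:
    """Smallest number of clips of length clip_s that reaches target_s seconds."""
    return math.ceil(target_s / clip_s)


if __name__ == "__main__":
    low, high = runtime_range(15)
    print(f"15 clips run {low:.0f} to {high:.0f} seconds")
    print("5-second clips needed for 90 seconds:", scenes_needed(90, 5))
```

Fifteen clips of 3 to 5 seconds give 45 to 75 seconds in total, so a 1.5-minute film needs either longer clips or about 18 five-second scenes.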

🔊 Step 4 — Add Sound, Music & Final Edit (CapCut / Canva)

Now, bring everything together.
Import all your animated clips into CapCut or Canva Video Editor.

For each scene:

  • Use the Sound Prompt from ChatGPT to search a free audio library (e.g. Pixabay), or paste it into a music or sound-generation tool (e.g. Soundful, Mubert, or ElevenLabs Sound Effects).
  • Add:
    • 🎶 Background music (tone matches emotion)
    • 🌦 Ambient sounds (rain, city, wind, etc.)
    • 💗 Emotional accent sounds (soft piano, violin, etc.)

Arrange scenes in order (1–15), apply crossfades or smooth transitions, then export the final video in 4K vertical (9:16).
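Before importing, it can be worth confirming that every clip exists and is in scene order. The sketch below assumes the clips follow the same naming as the images (scene01.mp4, scene02.mp4, and so on), which this guide does not prescribe; order_clips is a hypothetical helper name.

```python
import re

# Assumed naming: "scene" + number + extension, e.g. scene01.mp4 or scene7.mov.
SCENE_FILE = re.compile(r"^scene(\d+)\.\w+$", re.IGNORECASE)


def order_clips(filenames: list[str], scene_count: int = 15) -> tuple[list[str], list[int]]:
    """Return (clips sorted by scene number, scene numbers with no clip)."""
    by_scene: dict[int, str] = {}
    for name in filenames:
        match = SCENE_FILE.match(name)
        if match:  # files that do not follow the naming are ignored
            by_scene[int(match.group(1))] = name
    ordered = [by_scene[n] for n in sorted(by_scene)]
    missing = [n for n in range(1, scene_count + 1) if n not in by_scene]
    return ordered, missing


if __name__ == "__main__":
    files = ["scene10.mp4", "scene2.mp4", "scene01.mp4", "notes.txt"]
    ordered, missing = order_clips(files, scene_count=3)
    print(ordered)
    print(missing)
```

Sorting on the parsed number rather than the raw name means the order stays correct even if some files were saved without zero padding.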

✅ Bonus tip:
Add captions, intro text, or your logo for branding.

📁 Example Output (Structure from ChatGPT)

Here’s what the .txt file will look like:

🎬 AI SHORT FILM – 15 SCENES
STYLE: Pixar 3D – anthropomorphic cat – ultra detailed – cinematic lighting – 9:16

=== SCENE 1 – The Lost Umbrella ===
Description: Neko, a gray anthropomorphic cat, walks through a rainy alley...
IMAGE PROMPT: Pixar 3D ultra detailed anthropomorphic cat, soft lighting...
VIDEO PROMPT: Slow dolly-in, blinking softly, raindrops flicker...
SOUND PROMPT: Soft piano + ambient rain + distant traffic hum...

=== SCENE 2 – Reflection ===
Description: Neko stares at his reflection in a puddle...
IMAGE PROMPT: ...
VIDEO PROMPT: ...
SOUND PROMPT: Light drizzle, low cello, soft heartbeats...

...

=== SCENE 15 – The Reunion ===
Description: The umbrella hangs by a warm light in front of Mimi’s small shop...
SOUND PROMPT: Gentle piano, wind chime, emotional fade-out...

📘 HOW TO USE:
1. Paste IMAGE prompts into WhiskAI (Exact Reference ON)
2. Paste VIDEO prompts into Meta Imagine / Digen
3. Use SOUND prompts to find matching audio in Pixabay or AI tools
4. Edit and export in CapCut or Canva → 4K vertical
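If the file really follows the layout above, a short script can split it into one record per scene so each prompt is easy to copy. This is a sketch under that assumption: the actual output may differ from run to run, multi-line prompts are not handled, and parse_scenes is a hypothetical name.

```python
import re

# Matches headers like "=== SCENE 1 – The Lost Umbrella ===" (en dash or hyphen).
HEADER = re.compile(r"^=== SCENE (\d+)\s*[–-]\s*(.+?)\s*===$")
# Matches the four labelled lines that follow each header.
FIELD = re.compile(r"^(Description|IMAGE PROMPT|VIDEO PROMPT|SOUND PROMPT):\s*(.*)$")


def parse_scenes(text: str) -> list[dict]:
    """Split the prompt file into one dict per scene.

    Each dict has "number" and "title", plus whichever of "description",
    "image_prompt", "video_prompt", and "sound_prompt" appear for that scene.
    """
    scenes: list[dict] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = {"number": int(header.group(1)), "title": header.group(2)}
            scenes.append(current)
            continue
        field = FIELD.match(line)
        if field and current is not None:
            key = field.group(1).lower().replace(" ", "_")
            current[key] = field.group(2)
    return scenes


if __name__ == "__main__":
    sample = (
        "=== SCENE 1 – The Lost Umbrella ===\n"
        "Description: Neko walks through a rainy alley\n"
        "IMAGE PROMPT: Pixar 3D cat, soft lighting\n"
        "SOUND PROMPT: Soft piano + ambient rain\n"
    )
    for scene in parse_scenes(sample):
        print(scene["number"], scene["title"], scene.get("image_prompt"))
```

Lines outside a scene block, such as the style header and the HOW TO USE list, are skipped because they match neither pattern.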

🎧 Step 5 — Optional: Automate Sound Generation

If you want to auto-generate sound or music files from ChatGPT's sound prompts:

  1. Use Mubert, Aiva, or Soundraw.io — paste the “Sound Prompt” directly.
  2. Download ambient background loops matching each scene.
  3. Layer them into your CapCut/Canva timeline.

👉 Example Sound Prompts:

  • “Soft piano with light rain ambience, key of D minor, emotional tone.”
  • “Gentle urban night soundscape — distant cars, soft neon hum.”
  • “Dreamy orchestral swell as character looks up to the light.”

🧱 Full Automation Summary

Step | Tool | Input | Output
1️⃣ Script + Prompts | ChatGPT / Grok / Gemini | Master prompt | Full story + image/video/sound prompts
2️⃣ Images | Whisk AI | Image prompts | Consistent 3D scenes
3️⃣ Video | Meta AI / Digen | Video prompts + stills | Animated cinematic shots
4️⃣ Sound | Mubert / Pixabay / Aiva / Suno | Sound prompts | Background music & ambient sound
5️⃣ Edit | CapCut / Canva | All assets | Final 4K video
6️⃣ Publish | YouTube Shorts / Reels / TikTok | Final MP4 | Viral-ready AI film

⚡ Why This Workflow Works

✅ 100% free and cloud-based
✅ Fully prompt-driven (no manual modeling or animation)
✅ Consistent characters via WhiskAI Exact Reference
✅ Emotional realism through layered light, sound, and pacing
✅ Compatible with Meta, Digen, Emu, or Runway

💡 Example Project Ideas

You can adapt this workflow for any theme:

  • 🐱 Pixar-style anthropomorphic cats (emotional short films)
  • 💔 Human love stories with cinematic realism
  • ⚔️ Fantasy or sci-fi adventures
  • 🎭 Psychological or moral twist shorts
  • 🎶 Music-driven AI montages

🏁 Final Thoughts

This 100% Automated AI Video Pipeline lets you produce high-quality, cinematic shorts that look and feel handcrafted — all from text prompts.

Whether you want to tell emotional stories, build a YouTube Shorts channel, or experiment with creative AI filmmaking, this workflow gives you the precision of a studio pipeline with the speed of automation.