How to Create AI Video Short Films (100% Workflow, 100% Free)

🎬 Ultimate Guide: Create Fully Automated AI Videos with Just One Prompt (100% Workflow, 100% Free)

Turn simple prompts into cinematic short films using: ChatGPT / Gemini / Grok + Whisk AI + Meta / Digen + CapCut / Canva — with synchronized sound and music for every scene.

Alternative: Grok AI can cover the ChatGPT, Whisk, and Meta / Digen steps in a single tool.

🌟 Overview

AI filmmaking has entered a new era — you can now produce complete emotional short films without a single camera or 3D model.

This guide will show you how to fully automate video production, from story writing to scene animation, image generation, sound design, and final editing.

Everything runs on four groups of tools, each with a free option:

🧠 ChatGPT / Grok / Gemini – writes story and scene prompts
🎨 Whisk AI – creates consistent high-quality images
🎥 Meta AI / Digen – turns stills into animated shots
✂️ CapCut / Canva – assembles clips, adds music, exports in 4K

💡 Example used in this guide: “Pixar-style 3D anthropomorphic cat short film.”
You can replace this with any other concept — fantasy, romance, sci-fi, or real-life drama.

⚙️ The 4-Step Automated Workflow

🧩 Step 1 — Generate Story, Scenes & Prompts (ChatGPT)

Use ChatGPT to automatically create:

A complete short story (1–1.5 minutes)
10–15 cinematic scenes
Prompts for images, videos, and sound

Here’s a sample master prompt you can copy:

Short prompt:

You are a professional 3D animation director.
Create a short emotional film about [Kung Fu Panda] (1–1.5 minutes) with no dialogue, told through visuals, music, and movement.
Output format:
For each scene (1–15):
1. Scene Title
2. Scene Description (setting, light, emotion)
3. Image Prompt (for WhiskAI)
4. Video Prompt (for Meta Imagine / Emu / Digen)
5. Sound Prompt (for music or ambient sound)
After all 15 scenes, combine everything into one downloadable .txt file.
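
The bracketed concept is the only part of the short prompt that changes between projects, so it can be filled in programmatically. Below is a minimal Python sketch, assuming you keep the prompt as a template string. `build_short_prompt` and its parameters are hypothetical names for illustration, not part of any tool mentioned in this guide.

```python
# Template copied from the short prompt above; {concept} and {scene_count}
# are placeholders introduced for this sketch.
SHORT_PROMPT = """You are a professional 3D animation director.
Create a short emotional film about {concept} (1–1.5 minutes) with no dialogue, told through visuals, music, and movement.
Output format:
For each scene (1–{scene_count}):
1. Scene Title
2. Scene Description (setting, light, emotion)
3. Image Prompt (for WhiskAI)
4. Video Prompt (for Meta Imagine / Emu / Digen)
5. Sound Prompt (for music or ambient sound)
After all {scene_count} scenes, combine everything into one downloadable .txt file."""


def build_short_prompt(concept: str, scene_count: int = 15) -> str:
    """Return the short prompt with the concept and scene count filled in."""
    concept = concept.strip()
    if not concept:
        raise ValueError("concept must not be empty")
    if scene_count < 1:
        raise ValueError("scene_count must be at least 1")
    return SHORT_PROMPT.format(concept=concept, scene_count=scene_count)


if __name__ == "__main__":
    print(build_short_prompt("a Pixar-style 3D anthropomorphic cat"))
```

Copy the printed text into ChatGPT, Grok, or Gemini as usual. Keeping the template in one place makes it easy to try several concepts without retyping the output format.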

🧠 MASTER PROMPT (Customizable) (If you want to build detailed background and characters, use this prompt.)

You are a Pixar-style 3D animation director & screenwriter specializing in emotional, cinematic short films with anthropomorphic cats.
Follow these steps exactly:
🧩 STEP 1 – Create 5–7 Emotional Short Stories
Write 5–7 original short stories (1–1.5 minutes each).
Each story must be:
Deeply emotional, about real-life themes (poverty, loneliness, betrayal, sacrifice…).
No dialogue — only visuals, light, and emotion.
End with a surprising but logical twist.
Cinematic Pixar 3D tone, compact, evocative.
After finishing, ask:
“Which story would you like me to expand into a 15-scene film?”
🐾 STEP 2 – Fix Main Character (After User Chooses Story)
Create one anthropomorphic cat character used in all scenes.
Template:
Name: [Choose fitting name, e.g. Neko, Mimi, Shadow]
Species: Anthropomorphic cat
Appearance: Human-like body, cat head, expressive eyes, soft fur reflections.
Example A: Neko – gray fur, slim body, denim jacket, warm cinematic light.
Example B: Mimi – white fur, round body, pink hoodie, pastel light.
Style: Pixar 3D, ultra detailed, cinematic lighting, emotional realism.
Then say:
“Character locked. Expanding story into 15 cinematic scenes.”
🎞 STEP 3 – Expand Story into 15 Cinematic Scenes
Generate 15 continuous scenes (≈1 min total).
Each scene must include:
Scene Title
Scene Description: place, time, light, mood, action (repeat full character description).
Emotion / message — visual storytelling, no dialogue.
🎨 STEP 4 – Generate Prompts for Each Scene
For every scene (1–15), create:
🖼 Image Prompt (WhiskAI)
Format: 9:16 vertical
Style: Pixar 3D anthropomorphic cat – ultra detailed – cinematic lighting – soft shadows
Include background, depth, emotion, and repeat full character description.
🎞 Video Prompt (Meta / Digen)
9:16, 3–5 sec cinematic motion
Pixar 3D anthropomorphic cat, same character
Include:
Camera: slow pan / dolly / zoom
Character motion: blink, head turn, soft movement, tail or hand motion
Lighting: soft reflections, emotional realism
🔊 Sound Prompt (Audio)
Describe ambient sound or background music (piano, rain, wind, city hum, etc.)
Match scene emotion.
📄 STEP 5 – Final Output Format
Output all 15 scenes in this format:
🎬 Scene 1 – [Title]
Description: [setting, emotion, action]
🖼 Image Prompt (WhiskAI): [full prompt 9:16 Pixar 3D anthropomorphic cat...]
🎞 Video Prompt (Meta / Digen): [camera + motion + lighting]
🔊 Sound Prompt: [music / ambient sound / tone]
Repeat for all 15 scenes, then say:
“All story, image, video, and sound prompts are ready. Compiling into one downloadable .txt file.”
GLOBAL RULES:
Language: English
No dialogue
Style: Pixar 3D – anthropomorphic cat – ultra detailed – cinematic lighting – soft shadows – emotional realism
Character and mood must remain 100% consistent across all scenes.
✅ RESULT
This prompt makes ChatGPT auto-generate:
5–7 story ideas → user chooses
Locked character → consistent
15 cinematic scenes → each with image, video, and sound prompts
Ready-to-use .txt file for WhiskAI + Meta/Digen + CapCut

✅ ChatGPT will:

Write several short story ideas
Ask which story you want to expand, then wait for your choice
Break it into 15 short scenes
Generate detailed prompts for visuals, motion, and sound
Export everything in one structured .txt file for easy copy/paste.
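
Character consistency depends on repeating the same full description in every prompt, which is what the master prompt asks for. One way to guarantee this when you write or edit prompts yourself is to compose them from a single locked description. The sketch below is illustrative only: the `Character` class and `image_prompt` helper are assumed names, and the appearance text borrows Example A from the master prompt.

```python
from dataclasses import dataclass

# Style line taken from the GLOBAL RULES of the master prompt.
STYLE = "Pixar 3D anthropomorphic cat, ultra detailed, cinematic lighting, soft shadows"


@dataclass(frozen=True)  # frozen: the character cannot change after it is "locked"
class Character:
    name: str
    appearance: str

    def description(self) -> str:
        return f"{self.name}, {self.appearance}"


def image_prompt(character: Character, scene: str, aspect: str = "9:16 vertical") -> str:
    """Build one image prompt that always repeats the full character description."""
    return f"{STYLE}. {character.description()}. {scene}. Format: {aspect}."


neko = Character("Neko", "an anthropomorphic cat with gray fur, slim body, denim jacket")
scenes = [
    "walking through a rainy alley at night",
    "staring at his reflection in a puddle",
]
prompts = [image_prompt(neko, scene) for scene in scenes]

if __name__ == "__main__":
    for prompt in prompts:
        print(prompt)
```

Because every prompt is built from the same `Character` object, a change to the description propagates to all scenes at once instead of drifting between them.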

🎨 Step 2 — Generate Consistent Images (WhiskAI)

Open Whisk AI, log in, and paste the Image Prompt for each scene.

Tips:

Enable Exact Reference (Refine mode) to keep characters consistent across all images.
Use the 9:16 vertical format (ideal for Shorts/Reels).
Save each generated image as scene01.png, scene02.png, etc.

✅ WhiskAI is 100% free and produces ultra-detailed Pixar-like 3D images.
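
The zero-padded names matter: without the leading zero, scene10.png sorts before scene2.png in most file pickers. Here is a small sketch of the naming scheme, with `scene_filename` as a hypothetical helper name:

```python
def scene_filename(index: int, ext: str = "png") -> str:
    """Return a zero-padded scene file name such as scene01.png."""
    if index < 1:
        raise ValueError("scene index starts at 1")
    return f"scene{index:02d}.{ext}"


names = [scene_filename(i) for i in range(1, 16)]

if __name__ == "__main__":
    print(names[0], names[-1])
    # Alphabetical order matches scene order thanks to the padding.
    print(sorted(names) == names)
```

The same helper works for the animated clips in the next step if you pass a different extension, for example `scene_filename(3, "mp4")`.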

🎥 Step 3 — Animate Images into Videos (Meta Imagine / Digen)

Upload each still image to Meta AI (100% Free) or Digen AI (Freemium).
Then paste the Video Prompt from ChatGPT.

Recommended camera & motion effects:

Slow dolly-in, soft pan, or handheld drift
Micro animations: blinking, tail movement, subtle breathing, slow gestures
Lighting: cinematic reflections, warm tones, emotional realism

Each output should be a 3–5 second clip.
Export all clips in 9:16 format.
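
It helps to check the clip length against the target runtime before animating everything. The arithmetic below is a rough sketch with assumed function names; it ignores any time lost to transitions in the final edit.

```python
import math


def runtime_range(scene_count: int, min_clip_s: float = 3.0, max_clip_s: float = 5.0):
    """Return the (shortest, longest) total runtime in seconds."""
    return scene_count * min_clip_s, scene_count * max_clip_s


def clips_needed(target_s: float, clip_s: float) -> int:
    """Return how many clips of a given length are needed to reach a target runtime."""
    return math.ceil(target_s / clip_s)


if __name__ == "__main__":
    low, high = runtime_range(15)
    print(f"15 clips run for {low:.0f} to {high:.0f} seconds")
    print(f"Clips needed for 90 s at 5 s each: {clips_needed(90, 5)}")
```

With 15 clips of 3–5 seconds each, the film runs 45–75 seconds. Reaching the upper end of the 1–1.5 minute target (90 seconds) would take 18 clips at 5 seconds each, or longer clips.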

🔊 Step 4 — Add Sound, Music & Final Edit (CapCut / Canva)

Now, bring everything together.
Import all your animated clips into CapCut or Canva Video Editor.

For each scene:

Use the Sound Prompt from ChatGPT to search a stock library (e.g. Pixabay) or paste it into a music or sound-generation tool (e.g. Soundful, Mubert, or ElevenLabs Sound Effects).
Add:
- 🎶 Background music (tone matches emotion)
- 🌦 Ambient sounds (rain, city, wind, etc.)
- 💗 Emotional accent sounds (soft piano, violin, etc.)

Arrange scenes in order (1–15), apply crossfades or smooth transitions, then export the final video in 4K vertical (9:16).
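
Before importing, it is worth confirming that every scene has a clip and that the clips are in scene order. The sketch below assumes the clips are named like the images from Step 2 but with an .mp4 extension (scene01.mp4, scene02.mp4, and so on); that naming and the `order_clips` helper are assumptions for illustration, not something CapCut or Canva requires.

```python
import re

# Matches names such as scene01.mp4 or scene7.mp4 (assumed naming convention).
CLIP_PATTERN = re.compile(r"^scene(\d+)\.mp4$", re.IGNORECASE)


def order_clips(filenames, scene_count=15):
    """Return (clips in scene order, list of missing scene numbers)."""
    found = {}
    for name in filenames:
        match = CLIP_PATTERN.match(name)
        if match:
            found[int(match.group(1))] = name
    expected = range(1, scene_count + 1)
    ordered = [found[i] for i in expected if i in found]
    missing = [i for i in expected if i not in found]
    return ordered, missing


if __name__ == "__main__":
    import os

    ordered, missing = order_clips(os.listdir("."))
    print("Timeline order:", ordered)
    print("Missing scenes:", missing)
```

Run it in the folder that holds your exported clips. An empty "Missing scenes" list means all 15 scenes are ready for the timeline.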

✅ Bonus tip:
Add captions, intro text, or your logo for branding.

📁 Example Output (Structure from ChatGPT)

Here’s what the .txt file will look like:

🎬 AI SHORT FILM – 15 SCENES
STYLE: Pixar 3D – anthropomorphic cat – ultra detailed – cinematic lighting – 9:16

=== SCENE 1 – The Lost Umbrella ===
Description: Neko, a gray anthropomorphic cat, walks through a rainy alley...
IMAGE PROMPT: Pixar 3D ultra detailed anthropomorphic cat, soft lighting...
VIDEO PROMPT: Slow dolly-in, blinking softly, raindrops flicker...
SOUND PROMPT: Soft piano + ambient rain + distant traffic hum...

=== SCENE 2 – Reflection ===
Description: Neko stares at his reflection in a puddle...
IMAGE PROMPT: ...
VIDEO PROMPT: ...
SOUND PROMPT: Light drizzle, low cello, soft heartbeats...

...

=== SCENE 15 – The Reunion ===
Description: The umbrella hangs by a warm light in front of Mimi’s small shop...
SOUND PROMPT: Gentle piano, wind chime, emotional fade-out...

📘 HOW TO USE:
1. Paste IMAGE prompts into WhiskAI (Exact Reference ON)
2. Paste VIDEO prompts into Meta Imagine / Digen
3. Use SOUND prompts to find matching audio in Pixabay or AI tools
4. Edit and export in CapCut or Canva → 4K vertical
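
If the file follows the structure above, a short script can split it into per-scene prompts so you do not have to scroll and copy by hand. This is a minimal sketch that assumes the exact header and label spelling shown in the example; ChatGPT's real output may vary, and `parse_scenes` is a hypothetical helper name.

```python
import re

# Header lines look like: === SCENE 1 – The Lost Umbrella ===
HEADER = re.compile(r"^=== SCENE (\d+) [–-] (.+?) ===$")

# Labels from the example file mapped to short dictionary keys.
FIELDS = {
    "Description": "description",
    "IMAGE PROMPT": "image",
    "VIDEO PROMPT": "video",
    "SOUND PROMPT": "sound",
}


def parse_scenes(text: str) -> list:
    """Split the .txt content into one dictionary per scene."""
    scenes = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = {"number": int(header.group(1)), "title": header.group(2)}
            scenes.append(current)
            continue
        if current is None or ":" not in line:
            continue  # skip the file preamble and lines without a label
        label, _, value = line.partition(":")
        key = FIELDS.get(label.strip())
        if key:
            current[key] = value.strip()
    return scenes


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], encoding="utf-8") as handle:
        for scene in parse_scenes(handle.read()):
            print(scene["number"], scene["title"], "->", scene.get("image", "(no image prompt)"))
```

Pass the path of the .txt file as the first argument. Scenes that lack a prompt simply omit that key, so missing fields are easy to spot.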

🎧 Step 5 — Optional: Automate Sound Generation

If you want to generate sound or music files from the prompts instead of searching for them:

Use Mubert, Aiva, or Soundraw.io — paste the “Sound Prompt” directly.
Download ambient background loops matching each scene.
Layer them into your CapCut/Canva timeline.

👉 Example Sound Prompts:

“Soft piano with light rain ambience, key of D minor, emotional tone.”
“Gentle urban night soundscape — distant cars, soft neon hum.”
“Dreamy orchestral swell as character looks up to the light.”

🧱 Full Automation Summary

Step	Tool	Input	Output
1️⃣ Script + Prompts	ChatGPT / Grok / Gemini	Master prompt	Full story + image/video/sound prompts
2️⃣ Images	Whisk AI	Image prompts	Consistent 3D scenes
3️⃣ Video	Meta AI / Digen	Video prompts + stills	Animated cinematic shots
4️⃣ Sound	Mubert / Pixabay / Aiva / Suno	Sound prompts	Background music & ambient sound
5️⃣ Edit	CapCut / Canva	All assets	Final 4K video
6️⃣ Publish	YouTube Shorts / Reels / TikTok	Final MP4	Viral-ready AI film

⚡ Why This Workflow Works

✅ 100% free and cloud-based
✅ Fully prompt-driven (no manual modeling or animation; editing is limited to assembling clips and audio)
✅ Consistent characters via WhiskAI Exact Reference
✅ Emotional realism through layered light, sound, and pacing
✅ Compatible with Meta, Digen, Emu, or Runway

💡 Example Project Ideas

You can adapt this workflow for any theme:

🐱 Pixar-style anthropomorphic cats (emotional short films)
💔 Human love stories with cinematic realism
⚔️ Fantasy or sci-fi adventures
🎭 Psychological or moral twist shorts
🎶 Music-driven AI montages

🏁 Final Thoughts

This 100% Automated AI Video Pipeline lets you produce high-quality, cinematic shorts that look and feel handcrafted — all from text prompts.

Whether you want to tell emotional stories, build a YouTube Shorts channel, or experiment with creative AI filmmaking, this workflow gives you the precision of a studio pipeline with the speed of automation.