Make a Short Video From Scratch

From a one-line idea to a finished short: concept, visuals, motion, edit and subtitles.

This workflow turns a one-line idea into a polished short video. You start with ChatGPT to refine the concept and write a script, then create the key visuals with Midjourney. Next, you animate those images with Kling AI. In Descript, you edit the video by editing its transcript, which is faster than working on a timeline. Finally, you add auto-subtitles and polish in Captions. The combination works because each tool specializes: AI writing, high-quality image generation, image-to-video animation, transcript-based editing, and automated finishing. It's for content creators, marketers, or anyone who wants to produce shareable short videos without traditional production skills.

The workflow, step by step

  1. Draft the script and storyboard

    ChatGPT

    Use ChatGPT to generate a concise script and shot list from your one-line idea. It handles iterative, contextual writing well, so you can refine tone and length over several turns. It is chosen here over other chat models for its instruction following and analysis.

    Hand-off → A polished script and storyboard outline ready for visual generation.

  2. Create key visuals from prompts

    Midjourney

    Midjourney produces high-quality, artistic images that visually represent your script's scenes. It's preferred over other image generators for its aesthetic quality and prompt adherence.

    Hand-off → A set of high-resolution images matching each storyboard frame.

  3. Animate stills into video clips

    Kling AI

    Upload your Midjourney images to Kling AI to add motion. Kling excels at creating smooth, camera-controlled animations and can sync lip movement if needed. It's better than alternatives for short, dynamic clips.

    Hand-off → Short video clips with motion for each scene.

  4. Edit video by editing the script

    Descript

    Import clips and script into Descript to cut, rearrange, and add voiceover by editing the transcript. This tool is best for rapid, text-based editing, avoiding complex timeline work.

    Hand-off → A rough cut with synchronized audio and video.

  5. Add auto-subtitles and polish

    Captions

    Captions automatically generates professional subtitles and applies color grading, sound effects, and music. It simplifies the final polish and makes the video more accessible.
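The hand-offs above are easier to keep consistent if the shot list from step 1 is kept as structured data. The following is a minimal sketch, not part of any of these tools: the `Shot` fields, the helper names, the sample shots, and the default style text are all made up for illustration. It builds one image prompt per shot to paste into Midjourney by hand (`--ar` is Midjourney's aspect-ratio parameter; the vertical 9:16 value is an assumption about the target platform) and lays the narration out as SubRip (`.srt`) cues, the kind of timed text that subtitle tools work with.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Shot:
    """One storyboard frame; all field names are illustrative."""
    narration: str  # line spoken or shown as a subtitle
    visual: str     # description of the image to generate
    seconds: float  # planned clip length


def image_prompt(shot: Shot, style: str = "cinematic lighting") -> str:
    """Build a prompt string to paste into Midjourney manually.

    The 9:16 aspect ratio is an assumption (vertical short video).
    """
    return f"{shot.visual}, {style} --ar 9:16"


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"


def to_srt(shots: list[Shot]) -> str:
    """Place each narration line end to end as numbered SRT cues."""
    cues = []
    start = 0.0
    for index, shot in enumerate(shots, start=1):
        end = start + shot.seconds
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{shot.narration}\n"
        )
        start = end
    return "\n".join(cues)


# Hypothetical two-shot storyboard used only to exercise the helpers.
shots = [
    Shot("Meet the world's laziest cat.", "a ginger cat asleep on a sofa", 3.5),
    Shot("Until the can opener starts.", "a ginger cat sprinting across a kitchen", 2.5),
]

total = sum(shot.seconds for shot in shots)
print(f"Planned length: {total:.1f}s")
for shot in shots:
    print(image_prompt(shot))
print(to_srt(shots))
```

Keeping the planned duration next to each shot also makes it easy to see early whether the script fits the target length before any images or clips are generated.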

All tools in this stack

ChatGPT

freemium

OpenAI flagship conversational AI with code, writing, analysis, and vision capab...

Rating: 4.6
Category: AI chat
Pricing: $20/mo Plus

Midjourney

paid

Leading AI image generation tool known for artistic, high-quality outputs.

Rating: 4.7
Category: AI image
Pricing: $10/mo Basic

Kling AI

freemium

Kuaishou's text- and image-to-video model producing high-fidelity, physically co...

Rating: 4.2
Category: AI video
Pricing: Free credits; from $6.99/mo

Descript

freemium

AI video and podcast editor that lets you edit media by editing the transcript, ...

Rating: 4.4
Category: AI video
Pricing: Free tier; $24/mo Hobbyist

Captions

freemium

AI-powered creator studio for shooting, editing, and captioning talking-head vid...

Rating: 4.1
Category: AI video
Pricing: Free tier; $9.99/mo Pro

Frequently asked questions

How much does this full workflow cost?

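ChatGPT is free or $20/month for Plus. Midjourney starts at $10/month and has no free tier. Kling AI offers free credits, with paid plans from $6.99/month. Descript has a free tier and a Hobbyist plan at $24/month. Captions has a free tier and a Pro plan at $9.99/month. Relying on free tiers wherever they exist, the total can be as low as $10/month (Midjourney only); taking the entry paid plan for every tool comes to about $71/month.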
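The monthly total can be checked with a few lines of arithmetic. This sketch uses only the entry prices shown on the tool cards on this page, stored in cents to avoid floating-point rounding; the dictionary layout and variable names are illustrative, and actual prices may have changed since the page was written.

```python
# Entry-level monthly prices in US cents, copied from the tool cards above.
# "free" is None when the card lists no free tier or free credits.
PLANS = {
    "ChatGPT":    {"free": 0,    "paid": 2000},  # $20/mo Plus
    "Midjourney": {"free": None, "paid": 1000},  # $10/mo Basic, paid only
    "Kling AI":   {"free": 0,    "paid": 699},   # free credits; from $6.99/mo
    "Descript":   {"free": 0,    "paid": 2400},  # $24/mo Hobbyist
    "Captions":   {"free": 0,    "paid": 999},   # $9.99/mo Pro
}

# Cheapest case: use the free tier wherever one exists.
minimum_cents = sum(
    plan["paid"] if plan["free"] is None else plan["free"]
    for plan in PLANS.values()
)

# Entry paid plan for every tool.
all_paid_cents = sum(plan["paid"] for plan in PLANS.values())

print(f"Free tiers where available: ${minimum_cents / 100:.2f}/month")
print(f"Entry paid plan everywhere: ${all_paid_cents / 100:.2f}/month")
```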

Are there free alternatives for each tool?

Yes. Use Google Gemini instead of ChatGPT, a local Stable Diffusion install instead of Midjourney, Runway Gen-2 instead of Kling, DaVinci Resolve instead of Descript, and CapCut instead of Captions. Quality varies, but each is a viable substitute.

Where should a beginner start?

Start with ChatGPT to define your idea and script. Next, generate one image in Midjourney to check that the style works. Then proceed step by step, since each step depends on the previous output.

What's a common mistake in this workflow?

Skipping storyboarding: without a clear shot list, Midjourney images won't match the narrative. Another is over-relying on AI voiceover; without manual correction, it can sound robotic.

How long does this workflow take?

For a 30-second short, expect 2-4 hours total if you're familiar with the tools. The first run may take longer because of the learning curve.
