
I Created a Real Human Ad - No Camera, No Crew, $12 Budget!

· 14 min read

What if you could create a complete, polished advertisement — visuals, voiceover, and full video — all for under $12?


That's exactly what I set out to prove in this project. In this blog post, I'll walk you through:


  1. The tools I used and why
  2. A cost breakdown
  3. Comparisons with OpenAI Sora and WanAI
  4. How I ultimately built the ad using Vertex AI VEO


Let's get started.



🧠 Why AI-Powered Ads Matter


AI-powered ads are revolutionizing how we create and scale marketing content. Here's why:


  1. Speed: Full video ads can be created in minutes
  2. 💰 Affordability: Cost is a fraction of traditional production
  3. 🔄 Scalability: Generate multiple ad variants quickly
  4. 🎥 No actors or equipment needed

With just a text prompt, these tools handle:


  1. Script generation
  2. Voice narration
  3. Visual animation


🧪 Tools Compared: Sora, WanAI, and Vertex AI


To benchmark what works best, I tested three AI platforms:


🔷 1. OpenAI Sora


Sora delivered a high-quality 8-second cinematic video based on my prompt.


  1. ✅ Smooth transitions
  2. ✅ Excellent lighting and motion
  3. ❌ Limited access
  4. ❌ No built-in voice-over

Sora can be ideal for short cinematic storytelling, but it is not suitable for complete end-to-end ad creation unless you combine it with other tools.



🔶 2. WanAI


WanAI provided an easy-to-use interface, but I hit a limitation quickly.


  1. ❌ Video generation failed
  2. ❌ No output, even after retries
  3. ⚠️ Likely due to the free-tier restriction

If you're using WanAI's free version, it may not be reliable for actual ad production.


🌐 Accessing WanAI via Alibaba Model Studio


WanAI is part of Alibaba Cloud's generative AI offerings and can be accessed through the Model Studio interface. While not as frictionless as some Western platforms, it's still worth exploring — especially for experimentation.


🪪 Step 1: Create an Alibaba Cloud Account

  1. Visit https://www.alibabacloud.com
  2. Register for a new account (email and phone verification required)
  3. You may be prompted to complete identity verification depending on your region

🧠 Step 2: Navigate to Model Studio

  1. After logging in, search for Model Studio from the Alibaba Cloud Console or access it directly at: https://modelstudio.console.aliyun.com
  2. Agree to terms and enable the service for your account

🖼️ Step 3: Find WanAI

  1. Inside Model Studio, explore the generative AI models section
  2. Locate WanAI or its equivalent under video generation or multi-modal AI
  3. Note: the UI is partially in Chinese — use browser translation if needed

🎬 Step 4: Provide Prompt and Generate

  1. Use a descriptive text prompt to generate your video
  2. Wait for processing (this may take 1-2 minutes)
  3. Note: in the free tier, results may be throttled or not generated at all

⚠️ In my case, WanAI did not produce a video under the free plan — likely due to quota restrictions or runtime limits.


If you're interested in evaluating WanAI for business use, consider upgrading to a paid Alibaba Cloud subscription to unlock full capabilities.



✅ 3. Vertex AI VEO


This is where the magic happened. Vertex AI VEO allowed me to:


  1. 🎞️ Generate a high-quality 24-30 second video
  2. 🗣️ Add professional voice narration
  3. 🧱 Use slide-based or text-prompt-based generation
  4. ✅ Fully control the visuals, timing, and tone

Best of all, the total cost stayed under $12 for the complete video with voice.



📊 AI Platform Comparison


Below is a detailed comparison of the AI tools used during this ad creation journey:


| Tool | Platform | Modalities | Public Access | Strengths | Weaknesses |
| --- | --- | --- | --- | --- | --- |
| Vertex AI (VEO) | Google Cloud | Image, Video (VLOGGER), Sound (AudioLM) | Requires GCP Account | One-stop shop to create audio, image, and video. Transparent pricing | Requires billing setup and project configuration |
| Sora | OpenAI (Experimental) | Video-only (no sound) | No (Research preview) | Photorealism, physics-aware scenes | Not yet publicly available |
| Alibaba ModelScope | Alibaba Cloud | Image, Text2Video, Voice | Public (HuggingFace / GitHub) | Open-source, wide model support | Poor video quality, UI less polished, DIY integration |


🎬 How I Built the Ad in Vertex AI VEO


🧰 Media Studio: A Unified Interface for Generative Creativity


One of the most powerful components of Google's AI content ecosystem is the Media Studio — a clean, intuitive interface that brings together all major generative modalities.


From a single dashboard, you can:


  1. Imagen: Generate stunning images using natural language descriptions
  2. Chirp: Produce voice-over narration with lifelike clarity
  3. Lyria: Compose custom music tracks based on mood, tone, and genre
  4. Veo: Generate high-quality, dynamic video scenes

1️⃣ Start with a Prompt or Slide Structure


In my case, I used a text prompt generated from the Model Studio prompt generator to define a narrative sequence of four emotional scenes:


Generate a series of four images depicting a business user, a middle-aged Asian man with short, dark hair, showcasing a range of emotions. In the first image, he is depicted feeling frustrated, his brow furrowed, and his lips pursed in a grimace. He is wearing a crisp, dark blue suit and a white dress shirt, conveying a sense of professionalism. The office is a typical corporate setting, with a large window behind him overlooking a bustling city landscape, with a cool color scheme. In the second image, he is intrigued, his eyes widened slightly as he leans forward in his chair, studying some documents on his desk. The office is the same, but a warm, muted orange tones color scheme is implemented, with soft light filtering in. In the third image, he is captured in a moment of excited, his arms raised in a gesture of triumph and a wide grin on his face. The office is the same, with high-contrast color scheme and dramatic lighting. In the fourth and final image, the user is portrayed as relieved and relaxed, his shoulders slumped in a comfortable posture, and a gentle smile playing on his lips. He is wearing the same suit as in the first three images, but now is sitting on a comfy couch in a modern, minimalist living room with a calm, pastel color scheme and soft, warm ambient lighting.

You can either:


  1. Enter a prompt similar to the one above, or
  2. Use Create Prompt from the Model Studio prompt generator

2️⃣ 🖼️ Create or Upload an Image:


If you already have an image of your character or scene, you can upload it to enhance, animate, or extend it using AI tools inside Media Studio.


Alternatively, you can generate a new image using Imagen, Google's generative image model.


For example, try a prompt like:

Photorealistic depiction of a middle-aged Asian business man sitting in a modern office, looking frustrated, with a city skyline visible through the window behind him

This will help you visually establish the first emotional scene for your AI-generated ad.


With links to documentation and API references at the top, Media Studio is designed for both no-code users and developers — enabling fast experimentation and production across image, voice, music, and video workflows.


Whether you're creating an ad, explainer video, music-backed clip, or voice-over narration — Media Studio serves as your creative AI command center.


Sample images I created


3️⃣ Generate Video


⚙️ VEO 2 Configuration and Output Setup


To generate the final video outputs, I used VEO 2 — the latest iteration of Vertex AI's video generation model. Here's how I configured it for optimal results:


  1. Model: VEO 2
  2. Aspect Ratio: 16:9 (landscape, suitable for YouTube and web ads)
  3. Number of Results: 4 (to get multiple variations for creative flexibility)
  4. Video Length: 8 seconds per scene (ideal for short ad segments)

I also specified a Google Cloud Storage (GCS) output path for storing the generated videos:


gs://[your-bucket-name]/ads/


Storing outputs in GCS helped ensure persistence, easy access, and safe backup of all video assets during editing and review.


Finally, I enabled the Prompt Enhancement feature, which uses an LLM to automatically rewrite and enrich prompts for better video quality and fidelity. This dramatically improved the expressiveness and alignment of the output with my original creative intent.
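The settings above can be captured in a small record so they are easy to reuse across the four scenes. This is only a sketch in plain Python: the class and field names (`VeoRunSettings`, `results_per_scene`, and so on) are hypothetical labels for the options described here, not parameter names from the Vertex AI API, and `my-bucket` is a placeholder bucket name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VeoRunSettings:
    """Hypothetical record of the Media Studio options described in this post."""
    model: str = "VEO 2"
    aspect_ratio: str = "16:9"
    results_per_scene: int = 4
    seconds_per_clip: int = 8
    enhance_prompt: bool = True
    bucket: str = "my-bucket"  # placeholder; replace with your own bucket name

    def output_prefix(self) -> str:
        # Mirrors the gs://[your-bucket-name]/ads/ path used in this post.
        return f"gs://{self.bucket}/ads/"

    def generated_seconds(self, scene_count: int) -> int:
        # Every scene yields several variations, and each one is a full clip.
        return scene_count * self.results_per_scene * self.seconds_per_clip


settings = VeoRunSettings()
print(settings.output_prefix())
print(settings.generated_seconds(scene_count=4))
```

With four scenes, four results per scene, and 8 seconds per clip, a run like this produces 128 seconds of footage in total, even though presumably only one clip per scene ends up in the finished ad. That total is worth checking before you generate, since the number of outputs is one of the things that drives cost.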


🧩 Prompt-to-Image: From Concept to VEO Input


After generating the composite image showing the four emotional expressions using the earlier prompt, I downloaded it and used a basic image editor to split it into four separate images — each representing one distinct emotion:


  1. Frustrated
  2. Intrigued
  3. Excited
  4. Relieved
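If you would rather script the split than use an image editor, the crop rectangles are simple to compute. The sketch below assumes the composite is a 2x2 grid of equal panels read left to right, top to bottom, and that it is 2048 pixels square. The post does not state the actual layout or size, so adjust the arguments to match your image. The function name `panel_boxes` is made up for this example. It returns `(left, upper, right, lower)` tuples, which is the box format that Pillow's `Image.crop` accepts.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels


def panel_boxes(width: int, height: int, rows: int = 2, cols: int = 2) -> List[Box]:
    """Return one crop box per panel, row by row."""
    panel_w = width // cols
    panel_h = height // rows
    boxes = []
    for row in range(rows):
        for col in range(cols):
            left = col * panel_w
            upper = row * panel_h
            boxes.append((left, upper, left + panel_w, upper + panel_h))
    return boxes


# Assumed 2048x2048 composite; the real dimensions depend on your generated image.
emotions = ["frustrated", "intrigued", "excited", "relieved"]
for name, box in zip(emotions, panel_boxes(2048, 2048)):
    print(name, box)
```

If your composite is a single horizontal strip instead, `panel_boxes(width, height, rows=1, cols=4)` gives the four boxes in the same order.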

I uploaded these images one by one into Vertex AI VEO, using them as input slides. For each image, I crafted a specific narrative prompt to guide the video generation:


  1. Frustrated:
A small business user is frustrated at the speed, availability of his website — and that too after spending an exorbitant amount.
  2. Intrigued:
The same user is now intrigued after discovering a new cloud-based website platform that promises better speed, uptime, and affordability.
  3. Excited:
He is excited and overjoyed after seeing his website go live instantly, with blazing fast performance and zero downtime.
  4. Relieved:
Now relaxed and smiling, he's enjoying peace of mind, knowing that his online business is running smoothly and cost-effectively.

This scene-by-scene approach allowed VEO to generate a seamless, emotionally resonant ad narrative — visual storytelling powered entirely by AI.


For each image, I provided a targeted contextual prompt. For example, for the first image:


A small business user is frustrated at the speed, availability of his website — and that too after spending an exorbitant amount.

This approach helped VEO sequence the emotional narrative visually and thematically, delivering a more dynamic and relatable ad experience.


Choose:

  1. Aspect ratio: 16:9
  2. Duration: 24-30 seconds
  3. Resolution: 720p or 1080p

Within 1-2 minutes, VEO generated a fluid video matching my structure.
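The numbers in this post imply a little trimming: four 8-second scene clips add up to 32 seconds, slightly over the 24-30 second target. A quick calculation shows how much to cut. The helper name `trim_per_clip` is hypothetical, and the sketch assumes the trim is spread evenly across clips and ignores any overlap that transitions add.

```python
def trim_per_clip(clip_seconds: float, clip_count: int, target_seconds: float) -> float:
    """Seconds to cut from each clip so the total fits the target (never negative)."""
    total = clip_seconds * clip_count
    excess = max(0.0, total - target_seconds)
    return excess / clip_count


# Upper and lower ends of the 24-30 second target.
for target in (30, 24):
    print(target, trim_per_clip(clip_seconds=8, clip_count=4, target_seconds=target))
```

So each clip needs between half a second and two seconds removed, which is well within what tightening the lead-in and fade-out of each scene can absorb.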


4️⃣ Add Voice-Over


With the video created, I added narration using Vertex AI's built-in voice generation:

  1. Language: English or another language, depending on your ad
  2. Voice: Choose a male or female voice that matches the character
  3. Output: Clean, clear, and professional

You can review and re-generate multiple options, then export the audio.


5️⃣ 🎵 Background Music Generation (Optional)


To enhance the emotional tone of the ad, I also generated background music using AI.


I used the following prompt to create the audio:


A melodious soft music required for advertisement where a light music plays.

The result was a gentle, unobtrusive melody that blended perfectly with the voice-over and visuals — helping to elevate the overall ad experience without overpowering the narration.


🎞️ Editing the Final Ad in CapCut


Once all the individual video clips were generated by VEO and downloaded, I moved to the final phase of production: editing and assembling the ad in CapCut.


CapCut is a free and user-friendly video editor that offers a wide range of tools to polish raw footage into a professional-quality video.


🪜 Step-by-Step: Editing AI-Generated Videos in CapCut


1️⃣ Launch CapCut and Start a New Project


  1. Open CapCut on your desktop or mobile device
  2. Click “New Project”
  3. Drag and drop all your downloaded video clips into the timeline

2️⃣ Arrange the Video Sequence


  1. Organize the clips in the intended narrative flow: Frustrated, Intrigued, Excited, then Relieved
  2. Trim any silent lead-in or fade-out to keep the pacing tight

3️⃣ Add Transitions


  1. Insert subtle transitions (e.g., fade, slide, or zoom) between clips
  2. This creates smoother scene shifts and a more cohesive visual flow

4️⃣ Import and Sync Voice-Over


  1. Import the AI-generated voice-over audio file
  2. Drag it to the audio track in the timeline
  3. Adjust timing and sync it with the visuals so that emotional cues align

5️⃣ Add Background Music


  1. Import the AI-generated background music file, or pick a track from the Canva or CapCut asset libraries
  2. Lower the background music volume (e.g., 30-40%) so it complements the voice-over
  3. Apply fade-in and fade-out to ensure the music blends smoothly

6️⃣ Overlay Text or Branding (Optional)


  1. Add on-screen text, a brand logo, or a call-to-action (CTA) at the end, for example “Visit now”, “Start your website in minutes”, or “Powered by AI”

7️⃣ Export the Final Video


  1. Set export resolution to 1080p for best quality
  2. Click Export
  3. Save your finished ad for distribution or upload to your preferred platform


Using CapCut gave me complete control over how the scenes, voice, and music came together — transforming AI-generated assets into a polished, emotionally compelling advertisement ready for publishing.



💵 Cost Estimation: How I Calculated the $12 Budget


One of the key takeaways from this experiment was demonstrating that a professional-quality ad can be produced for under $12, using only AI-powered tools — no actors, no cameras, no editing studio.


Here's how I calculated the total cost:


🔍 Reference Sources


To estimate costs accurately, I referred to:


  1. Yahoo Finance coverage of Google's VEO pricing
  2. Google Cloud's official pricing calculator

📊 Breakdown of Costs


| Item | Estimated Cost |
| --- | --- |
| VEO video generation (24-30s at 720p-1080p) | ~$7.50-$8.00 |
| Voice-over generation (via Chirp or custom TTS) | ~$2.00-$2.50 |
| Background music (via Lyria) | ~$0.50 |
| GCS storage (temporary) | ~$0.10 |
| Editing (CapCut - Free Tier) | $0.00 |
| Total | ~$10-12 USD |

🧠 Note: Costs are based on prompt complexity, video duration, resolution, and number of generated outputs. For most short-form ads under 30 seconds, this range is a realistic budget using cloud-based generative AI.
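To double-check the total, the line items can be summed directly. The sketch below copies the low and high estimates from the breakdown above and stores them as whole cents to avoid floating-point rounding. The names `COST_ITEMS` and `total_range` are made up for this example.

```python
# (item, low, high) in US cents, copied from the cost breakdown in this post.
COST_ITEMS = [
    ("VEO video generation", 750, 800),
    ("Voice-over generation", 200, 250),
    ("Background music", 50, 50),
    ("GCS storage", 10, 10),
    ("Editing in CapCut", 0, 0),
]


def total_range(items):
    """Sum the low and high estimates separately; returns cents."""
    low = sum(item[1] for item in items)
    high = sum(item[2] for item in items)
    return low, high


low, high = total_range(COST_ITEMS)
print(f"${low / 100:.2f} to ${high / 100:.2f}")
```

The itemized estimates add up to $10.10 to $11.10, which sits inside the rounded $10-12 total.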



📈 Final Thoughts


Creating professional ads used to take:

  1. Studios
  2. Actors
  3. Editing software
  4. Weeks of effort

Now, with tools like Vertex AI VEO, you can create production-ready ads in minutes and for under $12.


Ready to create your own AI ad? Here's what to do:


  1. Sign up for Vertex AI

  2. Use my prompt templates above

  3. Experiment with different voices and music

  4. Share your results!


📎 Key Takeaways


  1. 🎯 Use Sora if you have access and need cinematic shots
  2. ⚠️ Avoid relying on WanAI's free tier for production work
  3. Vertex AI VEO is the best choice for full, affordable ad creation with voice-over and visuals

Ready to create your own ad? Let me know, and I'll share a downloadable template and workflow to get you started!




Call to Action


Choosing the right platform depends on your organization's needs. For more, subscribe to our newsletter for cloud computing tips and the latest technology trends, or follow our video series on cloud comparisons.


Interested in creating ads without the high agency cost? Contact us and we'll gladly help you build professional ads and market them across platforms.