
    Built an AI Real Estate Video Generator from Zillow Photos — Here’s What I Learned

    Ray
    ·December 25, 2025

    Hey Folks,

    I’m Ray: former software engineer, ex-TikToker/ByteDancer, and EZsite founder. I spent over 10 years in the online used-car space. One of the most popular features we built back then was auto-generated intro videos from car photos, highlighting trims, mileage, and selling points.

    When Veo 3.1 and Sora dropped, a realtor friend asked if I could do the same for her Zillow listings: “Can you turn my property photos into YouTube/TikTok videos for lead gen?” Challenge accepted.

    I hacked together a prototype that automatically turns property photos into clean showcase videos, and I ran into some fun issues along the way.

    Here’s the breakdown.

    The Idea

    • Automatically convert property photos into professional-looking videos.

    • Pull listing data and images from Zillow.

    • Use AI to generate a storyboard and voiceover.

    • Create clip-by-clip videos and stitch them together with music, subtitles, and transitions.

    Overall Approach

    The flow looked like this:

    • Scrape Zillow data

    • Clean image lists

    • User selects images/adds custom prompts

    • Gemini detects and removes watermarks

    • Gemini generates storyboard and voiceover scripts

    • Concurrent Veo 3.1 video generation tasks

    • Poll task status periodically

    • FFmpeg stitching, subtitles, background music

    • Final output

    The main logic runs in JS and was mostly vibe-coded. Gemini models handle analysis, scripts, and watermark handling. Veo 3.1 generates the image-to-video clips.
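
    To make the "concurrent generation, then poll" steps concrete, here is a minimal sketch. It assumes two hypothetical helpers, startClip(image) and getStatus(taskId), that wrap whatever video API client you use; they are placeholders, not real Veo SDK calls, and the status shape ({ state, url, error }) is invented for illustration.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll one task until it finishes, fails, or runs out of attempts.
// getStatus is a hypothetical callback returning { state, url, error }.
async function pollUntilDone(taskId, getStatus, options = {}) {
  const { intervalMs = 10000, maxAttempts = 60, sleep = defaultSleep } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const { state, url, error } = await getStatus(taskId);
    if (state === "done") return url;
    if (state === "failed") {
      throw new Error(`task ${taskId} failed: ${error ?? "unknown error"}`);
    }
    await sleep(intervalMs);
  }
  throw new Error(`task ${taskId} timed out after ${maxAttempts} polls`);
}

// Start one clip per image, then wait on all of them together.
// A failed clip is reported in the result instead of aborting the batch.
async function generateClips(images, startClip, getStatus, options) {
  const taskIds = await Promise.all(images.map((image) => startClip(image)));
  const settled = await Promise.allSettled(
    taskIds.map((id) => pollUntilDone(id, getStatus, options))
  );
  return settled.map((result, i) =>
    result.status === "fulfilled"
      ? { image: images[i], ok: true, url: result.value }
      : { image: images[i], ok: false, error: result.reason.message }
  );
}
```

    Using Promise.allSettled here means one failed clip does not throw away the clips that did finish, so a retry only needs to cover the failures.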

    Tools I Used (and Recommend)

    • Gemini 2.5 Image: Watermark detection and removal; image understanding for room types/features.

    • Veo 3.1: Image-to-video clip generation for consistent visual fidelity.

    • FFmpeg: Post-processing for stitching, transitions, subtitles, and audio mixing.

    • Crawlbase: Handling Zillow’s anti-scraping reliably during prototyping.

    • EZsite.ai (AI Webcoding Tool similar to Lovable): Handy for spinning up quick landing pages and demo sites without wrestling with boilerplate. I used it to throw together a simple showcase page and submission form for agents—great for testing funnels and collecting feedback fast.

    Challenges I Ran Into:

    Zillow Data Scraping
    • Zillow’s anti-scrape is no joke. My DIY crawler kept getting IPs rate-limited/blocked.

    • Used Crawlbase for testing — they tossed me 1000 free credits, which was enough to get things moving.

    • Discovery: Zillow uses Next.js with server-side rendering. Most data is in the HTML.

    • Parse HTML → extract price, beds/baths, sqft, features, image URLs.
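
    A rough sketch of the "parse HTML" step: server-rendered Next.js pages (pages router) typically embed their page data as JSON in a script tag with id __NEXT_DATA__. Whether a given Zillow page exposes its data this way, and every field path used in toListing below (props.pageProps.property, bedrooms, livingArea, photos), are assumptions for illustration; inspect the real payload before relying on any of them.

```javascript
// Pull the JSON blob out of the __NEXT_DATA__ script tag, if present.
function extractNextData(html) {
  const match = html.match(
    /<script[^>]*id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/
  );
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null; // malformed or truncated JSON
  }
}

// Map the payload to the fields the pipeline needs.
// NOTE: these paths are hypothetical, not Zillow's actual schema.
function toListing(nextData) {
  const p = nextData?.props?.pageProps?.property ?? {};
  return {
    price: p.price ?? null,
    beds: p.bedrooms ?? null,
    baths: p.bathrooms ?? null,
    sqft: p.livingArea ?? null,
    imageUrls: (p.photos ?? []).map((photo) => photo.url).filter(Boolean),
  };
}
```

    Returning null for missing fields, rather than throwing, keeps one odd listing from stopping a batch run.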

    Video Generation Model Choice
    • Text-only approaches couldn’t maintain visual consistency.

    • Switched to Veo 3.1’s image-to-video: one clip per image with pan/zoom/parallax; control transitions later in FFmpeg.

    Async Task Handling
    • Built a simple JS DAG scheduler:

    • watermark check → removal → storyboard → parallel clip generation → completion watcher → stitching.

    • Lightweight DB + scheduled job to progress ready tasks.
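
    For a feel of what a "simple JS DAG scheduler" can look like, here is a minimal in-memory sketch. It is not the actual implementation (which used a lightweight DB plus a scheduled job); the task names and the runDag function are made up for illustration. Each task lists its dependencies, and every task whose dependencies are finished runs in parallel with the other ready tasks.

```javascript
// tasks: { [name]: { deps: string[], run: async (inputs) => result } }
// Runs tasks in waves: everything whose deps are done runs in parallel.
async function runDag(tasks) {
  const results = new Map();
  const pending = new Map(Object.entries(tasks));

  while (pending.size > 0) {
    // A task is ready once all of its dependencies have results.
    const ready = [...pending].filter(([, task]) =>
      task.deps.every((dep) => results.has(dep))
    );
    if (ready.length === 0) {
      const stuck = [...pending.keys()].join(", ");
      throw new Error(`cycle or missing dependency among: ${stuck}`);
    }
    await Promise.all(
      ready.map(async ([name, task]) => {
        // Hand each task the results of the tasks it depends on.
        const inputs = Object.fromEntries(
          task.deps.map((dep) => [dep, results.get(dep)])
        );
        results.set(name, await task.run(inputs));
        pending.delete(name);
      })
    );
  }
  return Object.fromEntries(results);
}
```

    In this shape, "parallel clip generation" is just several clip tasks that all depend on the storyboard task, and stitching is one task that depends on every clip.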

    Watermark Handling
    • Preflight watermark detection; auto-removal if present.

    • Gemini 2.5 Image worked well, with occasional false positives.

    Video Post-Processing
    • FFmpeg: crossfades, subtitles, background music with ducking.

    • Voiceover: Gemini narrates room features in a calm, professional tone.
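
    The crossfade part is mostly arithmetic, so here is a sketch of building an FFmpeg filter graph that chains the xfade filter across N clips. The function name buildXfadeGraph and the [vout] label are my own; it covers video only and leaves subtitles and music ducking out. Each transition starts at the combined length of the clips so far, minus the overlap already consumed by earlier fades.

```javascript
// durations: clip lengths in seconds, in playback order.
// fade: crossfade length in seconds (must be shorter than every clip).
function buildXfadeGraph(durations, fade = 0.5) {
  if (durations.length < 2) throw new Error("need at least two clips");
  if (durations.some((d) => d <= fade)) {
    throw new Error("every clip must be longer than the fade");
  }
  const parts = [];
  let prev = "[0:v]";
  let elapsed = 0;
  for (let i = 1; i < durations.length; i++) {
    elapsed += durations[i - 1];
    // Each earlier fade shortened the running output by `fade` seconds.
    const offset = elapsed - i * fade;
    const out = i === durations.length - 1 ? "[vout]" : `[v${i}]`;
    parts.push(
      `${prev}[${i}:v]xfade=transition=fade:duration=${fade}:offset=${offset.toFixed(3)}${out}`
    );
    prev = out;
  }
  return parts.join(";");
}
```

    The resulting string goes to ffmpeg via -filter_complex, with -map "[vout]" selecting the final video. Note that xfade expects its inputs to share resolution, pixel format, and frame rate, so clips may need normalizing first.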

    Some Reflections
    • MVP built in ~2 hours; prompt tuning and pacing took most of the time, about 2 days in the end.

    • Cost: a few dollars per full video from model/API usage.

    • Next: smarter storyboards, richer property details in VO, optional virtual presenter.

    How It Feels
    The end-to-end pipeline is solid:

    scrape → analyze → generate → polish → publish. The “glue” matters more than any single model: reliable orchestration, visual anchoring, and thoughtful post-production.
