Clips, rendered and edited by AI. Built for an editing agency.

An editing agency came to Bedstone drowning in footage. They had tried the clipping tools on the market and none of them fit how the agency actually edits. We built a custom AI pipeline that ingests long-form source video, finds the moments worth cutting, renders vertical clips, and hands the team edited clips ready for review. It runs in a fraction of the source footage's runtime and takes hours of manual editing out of every day. The client name is withheld; we can arrange a reference call under appropriate confidentiality.

The problem

Editing agencies live and die on turnaround. A long-form source (a stream, a podcast, a webinar, a long video) has to become dozens of short, platform-ready clips, fast. Done by hand, that is hours a day of scrubbing for moments, cutting, reframing to vertical, captioning, and exporting. It does not scale, and it burns the agency's most expensive people on the most repetitive part of the job.

The off-the-shelf auto-clipping tools promise to solve exactly this. The agency had tried them and hit the same wall every time. The tools are generic by design: they cut where their model thinks is interesting, in a format they choose, with captions in their style. They do not bend to how a specific agency edits. The clips came back needing re-editing, the format was wrong, the volume was capped, and there was a per-clip fee on top. The tool created review work instead of removing it.

The brief

The brief was specific: build us a pipeline that fits our editing process, not a tool we have to fit ourselves to. Take our source footage, find the moments we would actually clip, render them in our format, and give them to us edited rather than as raw candidates we have to redo. Run it on our own infrastructure, at our volume, without a per-clip SaaS meter.

What we built

A custom AI clipping pipeline, end to end and bespoke to the agency's stack:

  • Ingestion of long-form source video as it lands, queued for processing.
  • Moment detection that combines the audio transcript with visual analysis, tuned to the kinds of moments this agency actually clips rather than a generic virality score.
  • Automated rendering with FFmpeg: cuts, vertical reframing, and formatting to the agency's spec, not a tool's default.
  • Generated captions, platform-tuned and styled the way the agency styles them.
  • Delivery of edited, review-ready clips into the agency's workflow, so the team reviews and ships rather than re-edits.
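As a rough illustration of the moment detection step, the sketch below blends a transcript score and a visual score per candidate window, then keeps the best non-overlapping windows. Everything in it is an assumption made for illustration: the `Window` fields, the weights, the threshold, and the function names are hypothetical and are not taken from the agency's pipeline, whose tuned scoring layer is not public.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """A candidate clip window. All fields are hypothetical."""
    start: float             # seconds into the source
    end: float               # seconds into the source
    transcript_score: float  # 0..1, assumed output of a transcript model
    visual_score: float      # 0..1, assumed output of a visual model


def combined_score(w: Window, transcript_weight: float = 0.6,
                   visual_weight: float = 0.4) -> float:
    # Weighted blend of the two signals. The weights are placeholders;
    # a tuned system would fit them to the editors' past decisions.
    return transcript_weight * w.transcript_score + visual_weight * w.visual_score


def pick_moments(windows: list[Window], max_clips: int,
                 min_score: float = 0.5) -> list[Window]:
    """Greedily keep the highest-scoring windows that do not overlap."""
    ranked = sorted(windows, key=combined_score, reverse=True)
    chosen: list[Window] = []
    for w in ranked:
        if len(chosen) >= max_clips or combined_score(w) < min_score:
            break
        # Skip any window that overlaps one we already kept.
        if all(w.end <= c.start or w.start >= c.end for c in chosen):
            chosen.append(w)
    # Return in timeline order so downstream rendering is sequential.
    return sorted(chosen, key=lambda w: w.start)
```

The greedy selection is one simple choice among many. It favours the strongest moment when two candidates overlap, which avoids shipping near-duplicate clips to review.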

What it delivers

What the agency is comfortable making public about the result:

  • Clips are produced in a fraction of the source footage's runtime. A human editor has to watch the footage in real time before cutting it; the pipeline does not.
  • Daily manual editing dropped from hours to a short review pass, freeing the team's senior editors for the work that actually needs a human eye.
  • Output scaled past what manual editing could ever reach, into the hundreds of clips per day from continuous source video.
  • Marginal cost per clip is effectively zero, because the pipeline runs on the agency's own infrastructure instead of a per-clip SaaS meter.

The approach

  1. Built around the agency's taste, not a generic model. The moment detection was tuned to what this agency clips. That is the whole reason the off-the-shelf tools failed and a custom pipeline worked.
  2. Smallest useful slice first. The first version cut and rendered clips end to end. Caption styling, format variants, and volume scaling landed in waves once the core was in production.
  3. Owned, not rented. The pipeline runs on the agency's infrastructure with no per-clip fees, so volume is bounded by hardware, not a vendor's pricing tier.
  4. Production, not a demo. It runs every day against real footage and real deadlines.

The stack

  • Rendering: FFmpeg for cuts, vertical reframing, and export. Boring, fast, and fully controllable.
  • Moment detection: audio transcription plus visual analysis, with a scoring layer tuned to the agency's editing decisions.
  • Captions: model-agnostic LLM captioning, styled to the agency's format.
  • Orchestration: a queue-based pipeline that ingests, processes, and delivers without a human in the loop until review.
  • Infrastructure: the agency's own, hosted in an AU region, so marginal cost per clip stays near zero.
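To make the rendering layer concrete, here is a minimal sketch of how a cut plus vertical reframe could be expressed as a single FFmpeg invocation. It assumes a landscape source, a centered crop, and a 1080x1920 target; the function name, file names, and codec choices are illustrative assumptions, not the agency's actual spec. The sketch only builds the argument list and does not run FFmpeg.

```python
def build_clip_command(source: str, output: str, start: float, end: float,
                       width: int = 1080, height: int = 1920) -> list[str]:
    """Build an FFmpeg argument list that cuts [start, end] and reframes to vertical.

    Hypothetical helper: the centered crop and the H.264/AAC codecs are assumptions.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    duration = end - start
    # crop=w:h with no x/y offsets crops around the center of the frame;
    # the crop keeps full height and narrows the width to the target aspect ratio.
    video_filter = f"crop=ih*{width}/{height}:ih,scale={width}:{height}"
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}",    # seek in the input before decoding
        "-i", source,
        "-t", f"{duration:.3f}",  # length of the clip to write
        "-vf", video_filter,
        "-c:v", "libx264",
        "-c:a", "aac",
        output,
    ]
```

A caller could pass the result to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file names. A production reframe would likely track the subject rather than crop the center, which is outside this sketch.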

What this engagement says about how Bedstone works

  • We build to the operator's actual process. The agency had tried the generic tools. The win came from modelling how they edit, not how a tool thinks they should.
  • We replace SaaS meters with owned pipelines. Where the volume is high and the work is repeatable, owning the pipeline beats renting it per unit.
  • We ship production systems. This runs daily against real deadlines, not as a proof of concept.

Want a reference call?

If you are evaluating Bedstone for a similar engagement, we can arrange a direct conversation with the relevant client under appropriate confidentiality. Start a brief and we will scope the right reference for your situation.
