
Music Video (MV) API

The MV API turns a song (Suno clip or any audio file) into a complete music video. One endpoint, one task lifecycle, two creation modes — selected via mode:

| mode | Best for |
| --- | --- |
| "fast" | Quick finished MV, lip-sync, social-ready output |
| "studio" | Editable B2B production: storyboard review, per-scene control, cost gates |

Both modes share the same input contract, the same output shape (MVView), the same task envelope, and the same webhook events. The capability flags on MVView.capabilities tell your UI which controls to render — you don’t branch on mode beyond the initial create call.

Authentication uses the same x-api-key header as every other OmnAPI surface.


| Method | Path | Description |
| --- | --- | --- |
| POST | /api/v1/mv/quote | Unified pricing preview (Fast + Studio variants) |
| POST | /api/v1/mv | Create MV (returns generation taskId + canonical id) |
| GET | /api/v1/mv/:id | Read unified MVView |
| PATCH | /api/v1/mv/:id | Edit top-level metadata + Studio stage outputs |
| DELETE | /api/v1/mv/:id | Archive |
| POST | /api/v1/mv/:id/scenes/:idx/edit | Edit a single scene (Fast: re-render; Studio: in-place patch) |
| POST | /api/v1/mv/:id/scenes/:idx/render | Trigger i2v for one scene (Studio only) |
| POST | /api/v1/mv/:id/scenes/:idx/regenerate-image | Re-roll a scene’s reference image (Studio only) |
| PATCH | /api/v1/mv/:id/scenes/:idx/select-rendering | Pin canonical rendering (Studio only) |
| POST | /api/v1/mv/:id/finalize | Stitch final MP4 |
| POST | /api/v1/mv/:id/auto-pilot | Run all Studio stages inline (Studio only) |
| POST | /api/v1/mv/:id/{stage} | Trigger a single Studio stage by name |
| GET | /api/v1/mv/:id/final | Get a 15-minute presigned URL for the final MP4 |
| GET | /api/v1/mv/templates | Studio style templates catalog |
| GET | /api/v1/mv/models | Render-model catalog |
| GET | /api/v1/mv/pricing | Live MV render pricing rules by provider / model / resolution |

Studio stages ({stage} above): analyze-emotion, draft-concept, draft-narrative, plan-scenes, lock-character, evaluate-scenes.
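
The `{stage}` path segment only accepts the names listed above, so a client can reject typos before sending a request. The following is a minimal sketch: `stageUrl` and `isStudioStage` are hypothetical helper names, and the base URL is taken from the curl examples on this page.

```typescript
// Stage names exactly as listed in this page.
const STUDIO_STAGES = [
  "analyze-emotion",
  "draft-concept",
  "draft-narrative",
  "plan-scenes",
  "lock-character",
  "evaluate-scenes",
] as const;

type StudioStage = (typeof STUDIO_STAGES)[number];

// Base URL as used by the curl examples in this page.
const MV_BASE_URL = "https://api.omnapi.com/api/v1/mv";

// Hypothetical helper: type guard for a stage name.
function isStudioStage(name: string): name is StudioStage {
  return (STUDIO_STAGES as readonly string[]).indexOf(name) !== -1;
}

// Hypothetical helper: builds the POST URL for a single Studio stage.
function stageUrl(mvId: string, stage: string): string {
  if (!isStudioStage(stage)) {
    throw new Error(`Unknown Studio stage: ${stage}`);
  }
  return `${MV_BASE_URL}/${encodeURIComponent(mvId)}/${stage}`;
}
```

Note that `finalize` and `auto-pilot` are separate endpoints, not stages, so this helper rejects them by design.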


curl -X POST https://api.omnapi.com/api/v1/mv \
-H "x-api-key: om_live_..." \
-H "Content-Type: application/json" \
-d '{
"mode": "fast",
"source": { "type": "suno", "clipId": "484a67d4-..." },
"title": "Sunny Morning",
"aspectRatio": "9:16",
"resolution": "540p",
"lipSync": false,
"subtitles": true,
"language": "auto"
}'

Response:

{
"id": "01H...",
"mode": "fast",
"taskId": "task_01H...",
"status": "PENDING",
"creditsReserved": 1724,
"warningCodes": []
}

Poll the task until it reaches a terminal state, then call GET /api/v1/mv/:id to read the unified MVView.
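
A polling loop for this step might look like the sketch below. It is written against an injected `read` function because the task-status endpoint is not described on this page. `pollUntil`, `backoffDelay`, and `generationSettled` are hypothetical names, and treating every status other than PENDING and GENERATING as "settled" is an assumption, not documented behavior.

```typescript
// Exponential backoff with a cap: base, 2x base, 4x base, ... up to maxMs.
function backoffDelay(attempt: number, baseMs = 1000, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Generic poller: calls read() until isDone(value) is true or attempts run out.
async function pollUntil<T>(
  read: () => Promise<T>,
  isDone: (value: T) => boolean,
  maxAttempts = 30,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const value = await read();
    if (isDone(value)) return value;
    await sleep(backoffDelay(attempt));
  }
  throw new Error(`Gave up after ${maxAttempts} polls`);
}

// Assumption: these two statuses mean generation is still in progress.
const IN_PROGRESS_STATUSES = ["PENDING", "GENERATING"];

function generationSettled(view: { status: string }): boolean {
  return IN_PROGRESS_STATUSES.indexOf(view.status) === -1;
}
```

In practice `read` would wrap an authenticated GET; if you pass a `callbackUrl`, webhooks can replace polling entirely.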

| Field | Type | Notes |
| --- | --- | --- |
| mode | "fast" \| "studio" | Required. Picks the engine. |
| source | object | Required. See Source variants. |
| referenceImages | string[] | 0–7 images (URL, data: base64, or R2 key). Fast: passed to the managed renderer; if omitted and the source is a Suno clip, the clip’s cover image is auto-injected and MV_REFERENCE_FROM_SUNO_COVER is emitted. Studio: [0] is the character anchor candidate when characterImage is unset. |
| characterImage | string | Studio-only. Single anchor portrait. 400 MV_CHARACTER_IMAGE_FAST_NOT_ALLOWED if used with mode=fast. |
| prompt | string | ≤3000 chars. Style hint + scene direction. |
| aspectRatio | enum | 16:9, 9:16 (default), 1:1, 4:3, 3:4. |
| resolution | enum | 540p (default), 720p, 1080p. |
| lipSync | bool | Fast-only. 400 MV_LIPSYNC_UNSUPPORTED_IN_STUDIO if used with mode=studio. Adds a dynamic Vidu surcharge, roughly 24 OmnAPI credits/sec at the default rate. |
| subtitles | bool | Burn subtitles into the final MP4. See Subtitle behavior. |
| subtitleColor | string | Hex, default #FFFFFF. |
| language | "auto" \| "en" \| "zh" | Default "auto". |
| srtUrl | string | Explicit SRT override (URL or data: base64). |
| title | string | ≤200 chars. Studio: rendered as 3-second title card; Fast: stored on project. |
| config | object | Mode-specific. See Mode configs. |
| callbackUrl | string | Per-task webhook URL. |
| metadata | object | Free-form, echoed on read. |
| tags | string[] | Free-form labels for filtering. |
| priority | int | 1–10 task queue priority, default 5. |
| expectedVersion | int | Reserved for idempotency on retries. |

Source variants

// Suno clip: backend resolves audio URL, duration, word-level timeline
{ "type": "suno", "clipId": "<suno-clip-id>" }
// External audio URL — caller-provided
{
"type": "audio",
"audioUrl": "https://example.com/song.mp3",
"durationSec": 60,
"lyrics": "optional plain text"
}
// Inline base64 upload (≤20MB decoded)
{
"type": "audio-upload",
"contentBase64": "...",
"contentType": "audio/mpeg",
"durationSec": 60
}

Allowed contentType for audio-upload: audio/mpeg, audio/wav, audio/wave, audio/x-wav, audio/aac, audio/mp4, audio/x-m4a. Duration must be 10–180 seconds regardless of variant.
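
The source rules above can be checked client-side before spending a request. This is a minimal sketch: `precheckSource` is a hypothetical helper, the `MVSource` type is inferred from the three JSON examples, and mapping a disallowed `contentType` to MV_AUDIO_CODEC_UNSUPPORTED is an assumption about which error the server would return.

```typescript
// Shapes inferred from the three source examples in this page.
type MVSource =
  | { type: "suno"; clipId: string }
  | { type: "audio"; audioUrl: string; durationSec: number; lyrics?: string }
  | {
      type: "audio-upload";
      contentBase64: string;
      contentType: string;
      durationSec: number;
    };

const ALLOWED_UPLOAD_TYPES = [
  "audio/mpeg",
  "audio/wav",
  "audio/wave",
  "audio/x-wav",
  "audio/aac",
  "audio/mp4",
  "audio/x-m4a",
];

// Returns the error code the request would likely hit, or null if it looks fine.
function precheckSource(source: MVSource): string | null {
  // For Suno clips the backend resolves the duration, so there is nothing to check here.
  if (source.type === "suno") return null;
  if (source.durationSec < 10 || source.durationSec > 180) {
    return "MV_AUDIO_DURATION_INVALID";
  }
  if (
    source.type === "audio-upload" &&
    ALLOWED_UPLOAD_TYPES.indexOf(source.contentType) === -1
  ) {
    // Assumption: a disallowed contentType surfaces as this code.
    return "MV_AUDIO_CODEC_UNSUPPORTED";
  }
  return null;
}
```

The server remains the authority; a pre-check like this only saves a round trip for obviously invalid input.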

Mode configs

For mode: "studio":

{
"draft": false, // skip scene-images; 100 credits instead of 250
"preview": false, // render scenes at 720p/24fps/≤4s for cost control
"templateId": "tmpl_...", // optional MVTemplate id
"characterPrompt": "...", // text-only anchor (mutex with characterImage)
"videoProvider": "<model-id-from-/models>",
"maxLipsync": 1 // max scenes marked lipsync framing
}

In mode=fast, templateId is ignored rather than rejected: the request still succeeds and the warning code MV_TEMPLATE_IGNORED_IN_FAST is added to warningCodes.
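
The mode-specific rules for the create call can be collected into one pre-flight check. The sketch below is illustrative only: `precheckCreate` and the trimmed `CreateRequest` type are hypothetical, and the codes are the ones named in this page for each condition.

```typescript
// Trimmed request shape: only the fields this check looks at.
type CreateRequest = {
  mode: "fast" | "studio";
  prompt?: string;
  referenceImages?: string[];
  characterImage?: string;
  lipSync?: boolean;
  config?: { templateId?: string };
};

type Precheck = { errors: string[]; warnings: string[] };

// Collects the documented error and warning codes a request would trigger.
function precheckCreate(req: CreateRequest): Precheck {
  const errors: string[] = [];
  const warnings: string[] = [];

  if (req.mode === "fast" && req.characterImage !== undefined) {
    errors.push("MV_CHARACTER_IMAGE_FAST_NOT_ALLOWED");
  }
  if (req.mode === "studio" && req.lipSync === true) {
    errors.push("MV_LIPSYNC_UNSUPPORTED_IN_STUDIO");
  }
  if ((req.prompt ?? "").length > 3000) {
    errors.push("MV_PROMPT_TOO_LONG");
  }
  if ((req.referenceImages ?? []).length > 7) {
    errors.push("MV_TOO_MANY_REFERENCES");
  }
  // Not an error: the template is ignored and a warning code is attached.
  if (req.mode === "fast" && req.config?.templateId !== undefined) {
    warnings.push("MV_TEMPLATE_IGNORED_IN_FAST");
  }
  return { errors, warnings };
}
```

Returning all codes at once, rather than stopping at the first, lets a form highlight every offending field in one pass.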


curl -H "x-api-key: om_live_..." \
https://api.omnapi.com/api/v1/mv/<id>

Returns the unified MVView:

type MVView = {
id: string;
mode: "fast" | "studio";
status: "PENDING" | "GENERATING" | "READY" | "RENDERING"
| "FINALIZING" | "COMPLETE" | "EDITING" | "FAILED" | "ARCHIVED";
version: number;
source: { type, clipId?, audioUrl?, durationSec, lyrics? };
prompt: string | null;
title: string | null;
config: { aspectRatio, resolution, lipSync, subtitles, ... };
characterAnchor: { url, r2Key } | null;
referenceImages: { url: string }[];
scenes: MVSceneView[];
finalMv: MVFinalView | null;
stages: { emotionMap, creativeConcept, narrativeArc };
warningCodes: string[];
lastErrorCode: string | null;
lastErrorMessage: string | null;
failedAt: string | null;
generationTaskId: string | null;
createdAt: string;
updatedAt: string;
capabilities: {
canEditScenePrompt: boolean; // both modes
canEditSceneImage: boolean; // studio only
canEditSceneFraming: boolean; // studio only
canTriggerRender: boolean; // studio only
canFinalize: boolean;
canPatchStageOutput: boolean; // studio only
};
};
type MVSceneView = {
index: number;
startSec: number;
endSec: number;
lyricsWindow: string | null;
framing: string | null;
prompt: string;
imageUrl: string | null; // studio: scene still; fast: always null
videoUrl: string | null; // presigned
providerJobId: string | null; // fast only, opaque id
status: "PLANNED" | "IMAGE_READY" | "RENDERING" | "READY" | "FAILED" | "STALE";
renderingHistory: Array<{
id, videoUrl, durationSec, isSelected, createdAt
}>;
};

Use MVView.capabilities to drive your UI. Don’t branch on mode directly — capability flags may unlock controls that today are studio-only, and vice versa.
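
One way to follow this advice is to derive the visible controls purely from the flags. In this sketch the flag names come from `MVView.capabilities`, while `enabledControls` and the label strings are hypothetical.

```typescript
type Capabilities = {
  canEditScenePrompt: boolean;
  canEditSceneImage: boolean;
  canEditSceneFraming: boolean;
  canTriggerRender: boolean;
  canFinalize: boolean;
  canPatchStageOutput: boolean;
};

// Hypothetical UI labels, one per capability flag.
const CONTROL_LABELS: Record<keyof Capabilities, string> = {
  canEditScenePrompt: "Edit scene prompt",
  canEditSceneImage: "Edit scene image",
  canEditSceneFraming: "Edit scene framing",
  canTriggerRender: "Render scene",
  canFinalize: "Finalize",
  canPatchStageOutput: "Edit stage output",
};

// Returns the labels of every control whose flag is true, without looking at mode.
function enabledControls(caps: Capabilities): string[] {
  const flags = Object.keys(CONTROL_LABELS) as Array<keyof Capabilities>;
  return flags.filter((flag) => caps[flag]).map((flag) => CONTROL_LABELS[flag]);
}
```

Because the function never reads `mode`, a control that becomes available in another mode later shows up without a client change.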

All presigned URLs (scene images, scene videos, character anchor, final MP4) have a 15-minute TTL. Refresh by reading MVView again, or call GET /api/v1/mv/:id/final for just the final URL.
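
A client that caches an MVView can decide when to re-read it with a small freshness check. This sketch assumes the 15-minute TTL starts when the MVView was read, which this page does not state explicitly; `needsRefresh` and the 60-second safety margin are hypothetical.

```typescript
// 15-minute TTL from this page, in seconds.
const PRESIGNED_TTL_SEC = 900;

// True when URLs read at fetchedAtMs should be refreshed before use at nowMs.
// Assumption: the TTL clock starts when the MVView was read.
function needsRefresh(
  fetchedAtMs: number,
  nowMs: number,
  safetyMarginSec = 60,
): boolean {
  const ageSec = (nowMs - fetchedAtMs) / 1000;
  return ageSec >= PRESIGNED_TTL_SEC - safetyMarginSec;
}
```

The margin leaves time for a download or playback session to start before the signature expires.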


curl -X POST https://api.omnapi.com/api/v1/mv/<id>/scenes/0/edit \
-H "x-api-key: om_live_..." \
-H "Content-Type: application/json" \
-d '{ "prompt": "tighter close-up on the rapper'\''s face, golden hour" }'

Fast: spawns an edit-scene task; the new MP4 replaces the scene’s current videoUrl. Returns { taskId, sceneIndex, version }.

Studio: synchronous in-place update of the scene’s imagePrompt, videoPrompt, framing, or referenceImageUrl (call render separately to actually re-render i2v). Returns { taskId: null, sceneIndex, version }.

| Body field | Fast | Studio |
| --- | --- | --- |
| prompt | required | optional; fans out to both imagePrompt + videoPrompt when alone |
| imagePrompt | 400 MV_NOT_SUPPORTED_IN_FAST | optional |
| videoPrompt | 400 MV_NOT_SUPPORTED_IN_FAST | optional |
| framing | 400 MV_NOT_SUPPORTED_IN_FAST | optional |
| referenceImageUrl | 400 MV_NOT_SUPPORTED_IN_FAST | optional |
| expectedVersion | optional | recommended |
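
The per-mode rules for the edit body can be expressed as a small check. This is a sketch under stated assumptions: `precheckSceneEdit`, `SceneEdit`, and `EditCheck` are hypothetical names, and the page does not name an error code for a missing `prompt` in Fast, so a plain message is used there.

```typescript
type SceneEdit = {
  prompt?: string;
  imagePrompt?: string;
  videoPrompt?: string;
  framing?: string;
  referenceImageUrl?: string;
  expectedVersion?: number;
};

type EditCheck = { kind: "ok" } | { kind: "rejected"; reason: string };

// Fields that a Fast MV rejects with MV_NOT_SUPPORTED_IN_FAST.
const STUDIO_ONLY_FIELDS: Array<keyof SceneEdit> = [
  "imagePrompt",
  "videoPrompt",
  "framing",
  "referenceImageUrl",
];

function precheckSceneEdit(mode: "fast" | "studio", edit: SceneEdit): EditCheck {
  // Every body field is optional in Studio.
  if (mode === "studio") return { kind: "ok" };
  for (const field of STUDIO_ONLY_FIELDS) {
    if (edit[field] !== undefined) {
      return { kind: "rejected", reason: "MV_NOT_SUPPORTED_IN_FAST" };
    }
  }
  if (edit.prompt === undefined || edit.prompt.length === 0) {
    // No documented code for this case; the message is illustrative.
    return { kind: "rejected", reason: "prompt is required for Fast edits" };
  }
  return { kind: "ok" };
}
```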

curl -X POST https://api.omnapi.com/api/v1/mv/<id>/scenes/0/render \
-d '{
"videoProvider": "<model-id-from-/models>",
"resolution": "720p",
"durationSec": 4,
"draft": false
}'

Returns { taskId, creditsReserved }. Body fields are all optional — defaults inherit from the storyboard’s config.

curl -X PATCH https://api.omnapi.com/api/v1/mv/<id>/scenes/0/select-rendering \
-d '{ "renderingId": "01H..." }'

Synchronous; no task, no credits. Required before finalize if you’ve re-rolled scenes.

curl -X POST https://api.omnapi.com/api/v1/mv/<id>/scenes/0/regenerate-image \
-d '{ "imagePromptOverride": "more dramatic lighting" }'

Returns { taskId, creditsReserved } (15 credits). After completion the scene’s imageUrl updates.


Each LLM stage is a separately callable task. Useful for B2B power-users who want to inspect or PATCH intermediate outputs.

| Endpoint | Stage | Output column |
| --- | --- | --- |
| POST /api/v1/mv/:id/analyze-emotion | Analyze song structure and mood | emotionMap |
| POST /api/v1/mv/:id/draft-concept | Draft the visual concept | creativeConcept |
| POST /api/v1/mv/:id/lock-character | Lock a character anchor | character anchor |
| POST /api/v1/mv/:id/draft-narrative | Draft the narrative arc | narrativeArc |
| POST /api/v1/mv/:id/plan-scenes | Plan scene prompts and timing | scenes[] |
| POST /api/v1/mv/:id/evaluate-scenes | Evaluate scene consistency | (read-only) |
| POST /api/v1/mv/:id/auto-pilot | All inline | all stages |

Body for all of them:

{
"maxLipsync": 1,
"expectedVersion": 7
}

lock-character accepts additional { refImageR2Key, description }.

Between stages, use PATCH /api/v1/mv/:id to overwrite a stage’s output:

curl -X PATCH https://api.omnapi.com/api/v1/mv/<id> \
-d '{
"expectedVersion": 8,
"emotionMap": { "...": "..." }
}'

Patchable fields: title, prompt, characterImage, emotionMap, creativeConcept, narrativeArc, scenesArray. A 409 MV_VERSION_CONFLICT means another writer bumped the row — re-fetch and retry.
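
The re-fetch-and-retry advice for MV_VERSION_CONFLICT can be sketched as a bounded loop. The transport is injected, so `readVersion` and `patch` are hypothetical callbacks standing in for the GET and PATCH calls, and `patchWithRetry` and `PatchResult` are illustrative names rather than part of the API.

```typescript
type PatchResult =
  | { kind: "ok"; version: number }
  | { kind: "error"; code: string };

// Re-reads the current version and retries the PATCH while the server reports a conflict.
async function patchWithRetry(
  readVersion: () => Promise<number>,
  patch: (expectedVersion: number) => Promise<PatchResult>,
  maxAttempts = 3,
): Promise<number> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const version = await readVersion();
    const result = await patch(version);
    if (result.kind === "ok") return result.version;
    // Only version conflicts are worth retrying; everything else is surfaced.
    if (result.code !== "MV_VERSION_CONFLICT") {
      throw new Error(`Patch failed: ${result.code}`);
    }
  }
  throw new Error(`Still conflicting after ${maxAttempts} attempts`);
}
```

If the patch body depends on the previous stage output, rebuild it from the freshly read MVView on each attempt so another writer's change is not overwritten blindly.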


curl -X POST https://api.omnapi.com/api/v1/mv/<id>/finalize -d '{}'

Returns { taskId, creditsReserved }.

Finalize concatenates selected scene MP4s, overlays the source audio, burns subtitles + title card when requested, and returns a final MP4 URL.

When the task completes:

curl https://api.omnapi.com/api/v1/mv/<id>/final

Returns:

{
"id": "<mvId>",
"videoUrl": "https://r2.../signed?...",
"expiresInSec": 900
}

The presigned URL is valid for 15 minutes. The final MV file itself stays in R2 for 30 days before lifecycle expiry.


# Fast quote
curl -X POST https://api.omnapi.com/api/v1/mv/quote \
-d '{ "mode": "fast", "durationSec": 60, "lipSync": true, "resolution": "540p" }'
# Studio storyboard quote
curl -X POST https://api.omnapi.com/api/v1/mv/quote \
-d '{ "mode": "studio", "step": "storyboard", "draft": false }'
# Studio per-scene render quote
curl -X POST https://api.omnapi.com/api/v1/mv/quote \
-d '{
"mode": "studio", "step": "render-scene",
"videoProvider": "<model-id-from-/models>", "resolution": "720p",
"durationSec": 4, "draft": false
}'
# Studio total estimate
curl -X POST https://api.omnapi.com/api/v1/mv/quote \
-d '{
"mode": "studio", "step": "total",
"videoProvider": "<model-id-from-/models>", "resolution": "720p",
"perSceneDurationSec": 4, "estimatedSceneCount": 8
}'

All return { credits, breakdown, warningCodes? }.

Pricing summary:

| Mode | Item | Credits |
| --- | --- | --- |
| Fast | quote | 0 |
| Fast | Vidu Q2-pro-equivalent 540p | Dynamic; 60s ≈ 1,724 |
| Fast | Vidu Q2-pro-equivalent 720p | Dynamic; 60s ≈ 3,476 |
| Fast | Vidu Q2-pro-equivalent 1080p | Dynamic; 60s ≈ 5,400 |
| Fast | lip-sync | Dynamic, about +24 credits/sec |
| Fast | compose/final copy | 100 |
| Studio | storyboard draft / full | 100 / 250 |
| Studio | analyze emotion / draft concept / draft narrative | 10 each |
| Studio | plan scenes | 20 |
| Studio | lock character / evaluate scenes | 0 |
| Studio | scene image regenerate | 15 |
| Studio | select rendering / patch / reads | 0 |
| Studio | finalize | 50 |

Studio render pricing defaults are shown below. Active production rows may come from mv_provider_pricing_rules; use GET /api/v1/pricing/catalog or POST /api/v1/mv/quote for the live value.

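| Provider / model | Resolution | FPS | Draft | Credits / billable second |
| --- | --- | --- | --- | --- |
| Replicate prunaai/p-video | 720p | 24 | off | 30 |
| Replicate prunaai/p-video | 720p | 24 | on | 8 |
| Replicate prunaai/p-video | 1080p | 24 | off | 60 |
| Replicate prunaai/p-video | 1080p | 24 | on | 15 |
| Replicate prunaai/p-video | 720p | 48 | off | 45 |
| Replicate prunaai/p-video | 720p | 48 | on | 12 |
| Replicate prunaai/p-video | 1080p | 48 | off | 90 |
| Replicate prunaai/p-video | 1080p | 48 | on | 23 |
| MiniMax MiniMax-Hailuo-02 | 512P | 24 | off | 21 |
| MiniMax MiniMax-Hailuo-02 | 768P | 24 | off | 81 |
| MiniMax MiniMax-Hailuo-02 | 1080P | 24 | off | 134 |
| Vidu vidu2.0 | 360p | 24 | off | 38 |
| Vidu vidu2.0 | 540p | 24 | off | 60 |
| Vidu vidu2.0 | 720p | 24 | off | 90 |
| Vidu vidu2.0 | 1080p | 24 | off | 150 |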
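
As a rough cross-check of a Studio total, the default rates can be combined by hand. The sketch below assumes that billable seconds equal the requested scene duration and that the total is simply storyboard + scene renders + finalize. Both are assumptions, and the quote endpoint remains the authoritative source. `estimateStudioTotal` and the rate-key format are hypothetical, and only a few rows of the default table are included.

```typescript
// Subset of the default render rates (credits per billable second).
// Key format "model|resolution|fps|draft" is made up for this sketch.
const DEFAULT_RATES: Record<string, number> = {
  "prunaai/p-video|720p|24|off": 30,
  "prunaai/p-video|720p|24|on": 8,
  "prunaai/p-video|1080p|24|off": 60,
  "vidu2.0|720p|24|off": 90,
};

const STORYBOARD_CREDITS = { draft: 100, full: 250 };
const FINALIZE_CREDITS = 50;

type EstimateInput = {
  model: string;
  resolution: string;
  fps: number;
  renderDraft: boolean;
  storyboardDraft: boolean;
  perSceneDurationSec: number;
  sceneCount: number;
};

// Assumption: total = storyboard + (rate x seconds x scenes) + finalize.
function estimateStudioTotal(input: EstimateInput): number {
  const key = [
    input.model,
    input.resolution,
    input.fps,
    input.renderDraft ? "on" : "off",
  ].join("|");
  const rate = DEFAULT_RATES[key];
  if (rate === undefined) throw new Error(`No default rate for ${key}`);
  const render = rate * input.perSceneDurationSec * input.sceneCount;
  const storyboard = input.storyboardDraft
    ? STORYBOARD_CREDITS.draft
    : STORYBOARD_CREDITS.full;
  return storyboard + render + FINALIZE_CREDITS;
}
```

For eight 4-second scenes on prunaai/p-video at 720p/24fps with draft off, this gives 250 + 30 × 4 × 8 + 50 = 1,260 credits under those assumptions.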

Subtitle behavior

| Source | subtitles=true | srtUrl | Mode | Behavior |
| --- | --- | --- | --- | --- |
| suno | yes | | fast | OmnAPI converts the Suno timeline to subtitles. |
| suno | yes | yes | fast | Caller-supplied SRT wins. |
| suno | yes | | studio | OmnAPI burns word-level subtitles from the Suno timeline. |
| audio / audio-upload | yes | yes | fast | Caller-supplied SRT is used. |
| audio / audio-upload | yes | yes | studio | OmnAPI burns subtitles from the parsed SRT. |
| audio / audio-upload | yes | | fast | Subtitle timing is inferred when available. |
| audio / audio-upload | yes | | studio | Silently disabled. warningCodes: ["MV_SUBTITLE_DISABLED_NO_TIMELINE"] attached. |
| any | false | any | any | No subtitles. |
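
The subtitle rules can be read as a decision function. In this sketch an empty srtUrl cell is interpreted as "no SRT supplied", which is an assumption. The one combination not listed (Suno source, Studio mode, SRT supplied) is returned as "unspecified" rather than guessed. `subtitleOutcome` and the outcome labels are hypothetical.

```typescript
type SubtitleOutcome =
  | "none"
  | "suno-timeline"
  | "caller-srt"
  | "inferred"
  | "disabled-no-timeline"
  | "unspecified";

function subtitleOutcome(
  sourceType: "suno" | "audio" | "audio-upload",
  mode: "fast" | "studio",
  subtitles: boolean,
  hasSrt: boolean,
): SubtitleOutcome {
  if (!subtitles) return "none";
  if (sourceType === "suno") {
    if (!hasSrt) return "suno-timeline";
    // Fast: caller-supplied SRT wins. Studio + SRT is not covered by the rules above.
    return mode === "fast" ? "caller-srt" : "unspecified";
  }
  // audio and audio-upload behave the same way.
  if (hasSrt) return "caller-srt";
  // Studio without a timeline disables subtitles and attaches
  // MV_SUBTITLE_DISABLED_NO_TIMELINE to warningCodes.
  return mode === "fast" ? "inferred" : "disabled-no-timeline";
}
```

A UI could use the "disabled-no-timeline" outcome to prompt for an SRT upload before the Studio project is created.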

| MVView.status | Meaning |
| --- | --- |
| PENDING | Generation task queued |
| GENERATING | MV generation is running |
| READY | Ready to render scenes (Studio) or compose (Fast) |
| RENDERING | At least one scene render in flight (Studio) |
| FINALIZING | Final stitch in progress |
| COMPLETE | finalMv.videoUrl is populated |
| EDITING | User has edits since last finalize |
| FAILED | Pipeline hard-failed |
| ARCHIVED | User deleted via DELETE /api/v1/mv/:id |

| MVSceneView.status | Meaning |
| --- | --- |
| PLANNED | Studio: scene plan exists, image not yet generated (draft mode) |
| IMAGE_READY | Studio: image generated, no video yet |
| RENDERING | Scene rendering is in flight |
| READY | Scene has a selected (Studio) / latest (Fast) video |
| FAILED | Generation failed |
| STALE | A newer edit invalidated this scene |

| Code | HTTP | Meaning |
| --- | --- | --- |
| MV_MODE_REQUIRED | 400 | mode field absent |
| MV_MODE_INVALID | 400 | Unknown mode value |
| MV_SOURCE_REQUIRED | 400 | source field absent |
| MV_SOURCE_INVALID | 400 | Unknown source variant |
| MV_SUNO_CLIP_NOT_READY | 409 | Suno clip exists but not COMPLETED |
| MV_AUDIO_DURATION_INVALID | 400 | Outside 10–180s |
| MV_AUDIO_CODEC_UNSUPPORTED | 400 | Not MP3/WAV/AAC/M4A |
| MV_TOO_MANY_REFERENCES | 400 | More than 7 reference images |
| MV_REFERENCE_IMAGE_INVALID | 400 | Malformed image input |
| MV_IMAGE_PAYLOAD_TOO_LARGE | 413 | Image exceeds 50MB single or 20MB base64 total |
| MV_CHARACTER_IMAGE_FAST_NOT_ALLOWED | 400 | characterImage with mode=fast |
| MV_LIPSYNC_UNSUPPORTED_IN_STUDIO | 400 | lipSync=true with mode=studio |
| MV_ASPECT_RATIO_INVALID | 400 | Unknown aspect ratio |
| MV_RESOLUTION_INVALID | 400 | Unknown resolution |
| MV_PROMPT_TOO_LONG | 400 | prompt.length > 3000 |
| MV_NOT_SUPPORTED_IN_FAST | 400 | Studio-only operation called on Fast MV |
| MV_NOT_SUPPORTED_IN_STUDIO | 400 | Reserved |
| MV_NOT_FOUND | 404 | id unknown |
| MV_VERSION_CONFLICT | 409 | expectedVersion stale |
| INSUFFICIENT_CREDITS | 402 | Quote exceeds balance |
| RATE_LIMITED | 429 | Per-account concurrency cap hit |

Warning codes (non-fatal, surfaced on MVView.warningCodes)

| Code | Trigger |
| --- | --- |
| MV_SUBTITLE_DISABLED_NO_TIMELINE | Studio + audio-source + subtitles=true + no SRT |
| MV_LIBASS_UNAVAILABLE | Worker lacks libass; subtitles skipped |
| MV_CJK_FONT_UNAVAILABLE | Worker lacks CJK font; subtitles skipped for zh |
| MV_PHOTOMAKER_BYPASSED_FRAMING | Studio: ≥1 scene skipped face-lock due to framing |
| MV_TEMPLATE_IGNORED_IN_FAST | Fast: templateId provided but has no effect |
| MV_REFERENCE_FROM_SUNO_COVER | Fast: referenceImages was empty; Suno clip cover auto-injected |

{
"event": "mv.created" | "mv.generation.completed" | "mv.generation.failed"
| "mv.scene.rendering.completed" | "mv.scene.rendering.failed"
| "mv.scene.edited" | "mv.finalize.completed" | "mv.finalize.failed"
| "mv.stage.<name>.completed" | "mv.archived",
"mode": "fast" | "studio",
"mvId": "<id>",
"taskId": "task_...",
"sceneIndex": 0, // when applicable
"timestamp": "2026-06-01T...",
"data": { /* MVView snapshot */ }
}
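
A webhook receiver typically switches on `event`. The sketch below shows one way to route the payload above. `describeEvent` and the returned strings are hypothetical, `sceneIndex` is treated as optional because it is only present "when applicable", and matching `<name>` against lowercase, hyphenated stage names is an assumption based on the Studio stage list.

```typescript
type MVWebhookEvent = {
  event: string;
  mode: "fast" | "studio";
  mvId: string;
  taskId: string;
  sceneIndex?: number;
  timestamp: string;
  data: unknown; // MVView snapshot
};

// Turns an event into a short description of what a handler would do with it.
function describeEvent(e: MVWebhookEvent): string {
  // mv.stage.<name>.completed carries the stage name inside the event string.
  const stage = /^mv\.stage\.([a-z-]+)\.completed$/.exec(e.event);
  if (stage) return `stage ${stage[1]} completed for ${e.mvId}`;

  switch (e.event) {
    case "mv.finalize.completed":
      return `final MP4 ready for ${e.mvId}`;
    case "mv.scene.rendering.completed":
    case "mv.scene.edited":
      return `scene ${e.sceneIndex ?? "?"} updated for ${e.mvId}`;
    case "mv.generation.failed":
    case "mv.scene.rendering.failed":
    case "mv.finalize.failed":
      return `failure (${e.event}) for ${e.mvId}`;
    default:
      return `unhandled event ${e.event}`;
  }
}
```

Keeping a default branch means new event names are logged rather than dropped silently.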

See Webhook Events for delivery semantics.