Generate AD Audio

Generate an audio-only track with audio descriptions, without rendering video. Use this when you need only the audio track and want to skip video processing.
If you have already uploaded media, or you are using the signed upload flow, generate with media_id instead. See Large Local File Upload or the API Reference.

1. Using a Video from a Public URL

Upload from a public URL and generate in one step.
curl -X POST https://api.viddyscribe.com/enterprise/api/generate_ad_audio \
  -H "X-API-Key: vsk_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "type": "url",
      "url": "https://example.com/video.mp4"
    },
    "generation_config": {
      "language": "en-US",
      "voice": "Achernar",
      "format": "vtt",
      "custom_instructions": "Keep descriptions concise and focus on on-screen text."
    }
  }'
Response:
{
  "job_id": "task_abc123xyz",
  "status": "queued",
  "media_id": "550e8400-e29b-41d4-a716-446655440000"
}
Use the job_id to poll get_results for completion.
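The polling step can be sketched as a small shell loop. This is a minimal sketch under stated assumptions, not an official client: the function names (job_status, fetch_results, wait_for_job), the VIDDYSCRIBE_API_KEY environment variable, and the retry count and delay are all hypothetical. It relies only on the queued and done status values that appear on this page; any other status (for example a failure value) is not documented here, so the loop simply gives up after the maximum number of attempts. The status is extracted with sed to avoid extra dependencies, and a JSON-aware tool would be more robust.

```shell
# Print the value of the "status" field from a get_results JSON body on stdin.
job_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Fetch results for one job. VIDDYSCRIBE_API_KEY is an assumed variable name.
fetch_results() {
  curl -sS -X GET "https://api.viddyscribe.com/enterprise/api/get_results?job_id=$1" \
    -H "X-API-Key: ${VIDDYSCRIBE_API_KEY}"
}

# Poll until status is "done", then print the final response body.
# Args: job_id [max_attempts=60] [delay_seconds=10] [fetch_command=fetch_results]
wait_for_job() {
  local job_id="$1" max_attempts="${2:-60}" delay="${3:-10}" fetch="${4:-fetch_results}"
  local attempt=1 body="" status=""
  while [ "$attempt" -le "$max_attempts" ]; do
    body="$("$fetch" "$job_id")" || return 1
    status="$(printf '%s\n' "$body" | job_status)"
    if [ "$status" = "done" ]; then
      printf '%s\n' "$body"
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "job $job_id not done after $max_attempts attempts (last status: $status)" >&2
  return 1
}
```

For example, wait_for_job task_abc123xyz > results.json would poll every 10 seconds for up to 60 attempts and save the final response. The fetch command is a parameter so the loop can be exercised with a stub instead of the live API.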

2. Using a Video from a Local File

Upload a local file and generate in one step.
Direct multipart upload supports local files up to 32 MB. For larger local files, see Large Local File Upload.
curl -X POST https://api.viddyscribe.com/enterprise/api/generate_ad_audio \
  -H "X-API-Key: vsk_your_api_key_here" \
  -F 'input={"type": "file"}' \
  -F "file=@video.mp4" \
  -F 'generation_config={"language": "en-US", "voice": "Achernar", "format": "vtt", "custom_instructions": "Keep descriptions concise and focus on on-screen text."}'
Response:
{
  "job_id": "task_abc123xyz",
  "status": "queued",
  "media_id": "660f9511-f30c-52e5-b827-557766551111"
}
Use the job_id to poll get_results for completion. On success, audio_signed_url contains a signed URL to download the audio track (WAV format).
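Downloading the finished track might look like the following sketch. It assumes the get_results response body has been saved to a local file. The function names and the default output filename are hypothetical. It also assumes the URL appears unescaped in the JSON (some encoders write & as \u0026, which this sed-based extraction would not decode) and that the signature in the URL is sufficient, so no X-API-Key header is sent.

```shell
# Print the "audio_signed_url" value from a get_results JSON body on stdin.
audio_url_from_results() {
  sed -n 's/.*"audio_signed_url"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Download the WAV to a local path.
# Args: results_json_file [output_path=ad_audio.wav]
download_ad_audio() {
  local results_file="$1" out="${2:-ad_audio.wav}" url
  url="$(audio_url_from_results < "$results_file")"
  if [ -z "$url" ]; then
    echo "no audio_signed_url in $results_file (job may not be done yet)" >&2
    return 1
  fi
  # --fail makes curl exit non-zero on HTTP errors such as an expired signature.
  curl -sS --fail -o "$out" "$url"
}
```

Signed URLs typically expire, so it is safer to download soon after the job completes than to store the URL for later.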

Tips

  • Audio generation supports only the standard_ad type (concise descriptions placed in dialogue pauses).
  • The default text output format is json. Set format to "vtt" to include a WebVTT string in the response output.
  • Audio output is priced at 0.75x the base workflow cost.
  • Use custom_instructions to guide the AI’s description style (e.g. tone, focus, or length).
  • Set audio_track_type to "ad_only" to receive just the AD narration WAV; the default "mixed" returns the source dialogue plus AD narration.
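As an illustration of the audio_track_type tip, the sketch below builds a request body that asks for the narration-only track. This page does not say where audio_track_type belongs in the request; placing it inside generation_config is an assumption, so check the API Reference before relying on it. The function names and the VIDDYSCRIBE_API_KEY variable are also hypothetical.

```shell
# Print a JSON request body for a URL input.
# ASSUMPTION: audio_track_type is placed inside generation_config.
# Args: video_url [audio_track_type=mixed]
# Values are inserted verbatim, so they must not contain quotes or backslashes.
build_ad_audio_payload() {
  local video_url="$1" track_type="${2:-mixed}"
  cat <<EOF
{
  "input": {"type": "url", "url": "${video_url}"},
  "generation_config": {
    "language": "en-US",
    "voice": "Achernar",
    "format": "vtt",
    "audio_track_type": "${track_type}"
  }
}
EOF
}

# Submit the payload read from stdin. VIDDYSCRIBE_API_KEY is an assumed variable name.
submit_ad_audio() {
  build_ad_audio_payload "$@" | curl -sS -X POST \
    https://api.viddyscribe.com/enterprise/api/generate_ad_audio \
    -H "X-API-Key: ${VIDDYSCRIBE_API_KEY}" \
    -H "Content-Type: application/json" \
    --data-binary @-
}
```

Calling submit_ad_audio https://example.com/video.mp4 ad_only would request only the AD narration; omitting the second argument sends "mixed", which the tips describe as the default.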

Retrieve Results

Use the job_id from the previous step to fetch results, substituting it for TASK_ID in the URL:
curl -X GET "https://api.viddyscribe.com/enterprise/api/get_results?job_id=TASK_ID" \
  -H "X-API-Key: vsk_your_api_key_here"
Example successful response for audio jobs (with format: "vtt"):
{
  "job_id": "task_abc123xyz",
  "status": "done",
  "media_id": "550e8400-e29b-41d4-a716-446655440000",
  "audio_signed_url": "https://storage.googleapis.com/bucket/path/to/audio.wav?X-Goog-Signature=...",
  "output": {
    "format": "vtt",
    "content": "WEBVTT\n\n1\n00:00:00.500 --> 00:00:03.100\nA woman in a yellow top sits at a desk with a laptop.\n\n2\n00:00:03.200 --> 00:00:05.900\nShe looks at the camera and smiles.\n"
  },
  "created_at": "2025-09-30T08:00:00Z",
  "updated_at": "2025-09-30T08:10:00Z"
}
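The content field is a JSON-escaped string, so it needs decoding before it is usable as a .vtt file. A minimal sketch follows, assuming jq is installed and the response body has been saved to a file. The function name is hypothetical.

```shell
# Print the decoded WebVTT text from a saved get_results response.
# Requires jq. Args: results_json_file
vtt_from_results() {
  # -r prints the raw string (real newlines instead of \n escapes).
  # -e makes jq exit non-zero when output.format is not "vtt" or content is null.
  jq -er 'select(.output.format == "vtt") | .output.content' "$1"
}
```

For example, vtt_from_results results.json > descriptions.vtt writes a file that players accepting WebVTT should be able to load alongside the audio.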