Core

Audio Generation

Provider-neutral text-to-speech model contracts and request builders.

Import from @anvia/core or @anvia/core/audio-generation.

AudioGenerationModel

interface AudioGenerationModel<RawResponse = unknown> {
  readonly provider?: string;
  readonly defaultModel?: string;
  audioGeneration(request: AudioGenerationRequest): Promise<AudioGenerationResponse<RawResponse>>;
}

Purpose: provider-neutral audio generation and text-to-speech contract.

Return behavior: providers return generated audio bytes, optional mediaType, and the provider response as rawResponse.

Request Builder

type AudioGenerationRequest = {
  text: string;
  voice: string;
  speed: number;
  additionalParams?: JsonValue;
};

type AudioGenerationResponse<RawResponse = unknown> = {
  audio: Uint8Array;
  mediaType?: string;
  rawResponse: RawResponse;
};

class AudioGenerationRequestBuilder<Model extends AudioGenerationModel = AudioGenerationModel> {
  text(text: string): this;
  voice(voice: string): this;
  speed(speed: number): this;
  additionalParams(additionalParams: JsonValue): this;
  build(): AudioGenerationRequest;
  send(): Promise<AudioGenerationResponse>;
}

const response = await audioGenerationRequest(model)
  .text("Welcome to Anvia.")
  .voice("alloy")
  .speed(1)
  .additionalParams({ response_format: "mp3" })
  .send();

Purpose: Chainable builder for audio generation requests.

Defaults: text: "", voice: "", and speed: 1.

Notable errors: provider adapters reject on SDK or provider errors.