AICQuailVADAnalyzer

Overview

AICQuailVADAnalyzer is a standalone voice activity detection (VAD) analyzer powered by ai-coustics’ Quail VAD 2.0 model. Unlike the deprecated AICVADAnalyzer, which relies on AICFilter’s internal VAD, this analyzer owns a dedicated processor, so it can be placed anywhere in the pipeline and works independently of audio enhancement. It uses a specialized VAD-only Quail model for noise-robust speech detection, which suits challenging acoustic environments.

To use AIC, you need a license key. Get started at ai-coustics.com.

Installation

The AIC Quail VAD analyzer requires additional dependencies:

uv add "pipecat-ai[aic]"

Constructor Parameters

license_key

str

required

ai-coustics SDK license key for authentication. Get your key at developers.ai-coustics.io.

model_id

str | None

default:"quail-vad-2.0-xxs-16khz"

Quail VAD model identifier. Defaults to the published standalone VAD model "quail-vad-2.0-xxs-16khz". See artifacts.ai-coustics.io for the catalogue. Ignored if model_path is provided.

model_path

Path | None

default:"None"

Optional path to a local .aicmodel file. Overrides model_id when set. Useful for offline deployments or custom models.

model_download_dir

Path | None

default:"None"

Directory for downloaded models. Defaults to ~/.cache/pipecat/aic-models.

speech_hold_duration

float | None

default:"None"

Seconds the VAD continues reporting speech after the signal stops containing speech. Range: 0.0 to 300x the model window length. Default (SDK): 0.03s

minimum_speech_duration

float | None

default:"None"

Seconds of speech required before the VAD reports speech detected. Range: 0.0 to 1.0. Default (SDK): 0.0s

sensitivity

float | None

default:"None"

Speech-probability threshold for dedicated Quail VAD models. Range: 0.0 to 1.0. Values above this threshold are considered speech. Default is model-specific. Note: This differs from the deprecated AICVADAnalyzer, which used an energy-based threshold in the range 1.0 to 15.0.

sample_rate

int | None

default:"None"

Initial sample rate; the pipeline will set this via set_sample_rate once the transport rate is known.

params

VADParams | None

default:"None"

Optional VADParams for the base VAD state machine configuration.
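
The documented ranges can be checked before the analyzer is constructed. The sketch below is a hypothetical helper (validate_quail_vad_kwargs is not part of Pipecat or the ai-coustics SDK) that only encodes the ranges listed above. It leaves the upper bound of speech_hold_duration unchecked, because that bound depends on the model window length, which is not given here.

```python
from typing import Dict, Optional


def validate_quail_vad_kwargs(
    speech_hold_duration: Optional[float] = None,
    minimum_speech_duration: Optional[float] = None,
    sensitivity: Optional[float] = None,
) -> Dict[str, float]:
    """Range-check the tuning values and return only those that were set.

    Hypothetical helper: the ranges are copied from the parameter list.
    None means "use the SDK or model default" and is dropped from the result.
    """
    if speech_hold_duration is not None and speech_hold_duration < 0.0:
        raise ValueError("speech_hold_duration must be >= 0.0")
    if minimum_speech_duration is not None and not 0.0 <= minimum_speech_duration <= 1.0:
        raise ValueError("minimum_speech_duration must be within 0.0 to 1.0")
    if sensitivity is not None and not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be within 0.0 to 1.0")

    candidates = {
        "speech_hold_duration": speech_hold_duration,
        "minimum_speech_duration": minimum_speech_duration,
        "sensitivity": sensitivity,
    }
    return {name: value for name, value in candidates.items() if value is not None}
```

The returned dictionary can be unpacked into the constructor call alongside license_key, so unset values still fall back to the SDK defaults.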

Usage Examples

Basic Usage

The recommended approach for AIC-powered voice detection:

import os
from pipecat.audio.filters.aic_filter import AICFilter
from pipecat.audio.vad.aic_quail_vad import AICQuailVADAnalyzer
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.transports.services.daily import DailyTransport, DailyParams

# Create the AIC filter for audio enhancement
aic_filter = AICFilter(
    license_key=os.environ["AIC_SDK_LICENSE"],
    model_id="quail-vf-2.0-l-16khz",
)

# Create standalone Quail VAD 2.0 analyzer
aic_vad = AICQuailVADAnalyzer(
    license_key=os.environ["AIC_SDK_LICENSE"],
)

transport = DailyTransport(
    room_url,
    token,
    "Bot",
    DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
        audio_in_filter=aic_filter,
    ),
)

user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        vad_analyzer=aic_vad,
    ),
)
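
The example reads the key with os.environ["AIC_SDK_LICENSE"], which raises a bare KeyError when the variable is missing. A minimal sketch of a fail-fast alternative follows; load_aic_license_key is a hypothetical name, not part of Pipecat or the ai-coustics SDK.

```python
import os


def load_aic_license_key(env_var: str = "AIC_SDK_LICENSE") -> str:
    """Return the license key from the environment, or raise a clear error.

    Hypothetical helper for illustration only.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export your ai-coustics license key first."
        )
    return key
```

The returned value can be passed as license_key to both AICFilter and AICQuailVADAnalyzer.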

With Custom VAD Parameters

Fine-tune the VAD behavior for your specific use case:

import os
from pipecat.audio.vad.aic_quail_vad import AICQuailVADAnalyzer

aic_vad = AICQuailVADAnalyzer(
    license_key=os.environ["AIC_SDK_LICENSE"],
    speech_hold_duration=0.05,  # Hold speech detection for 50ms after speech ends
    minimum_speech_duration=0.1,  # Require 100ms of speech before triggering
    sensitivity=0.5,  # Speech probability threshold (0.0-1.0)
)
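
To build intuition for how these three values interact, the following pure-Python sketch applies a threshold, a minimum-duration gate, and a hold period to a list of per-frame speech probabilities. It is not the SDK's implementation: the function name gate_speech, the 10 ms frame length, and the exact frame-counting rules are assumptions made for illustration.

```python
from typing import Iterable, List


def gate_speech(
    probabilities: Iterable[float],
    sensitivity: float = 0.5,
    minimum_speech_duration: float = 0.0,
    speech_hold_duration: float = 0.03,
    frame_seconds: float = 0.01,  # assumed frame length, illustration only
) -> List[bool]:
    """Apply threshold, minimum-duration, and hold logic to frame probabilities."""
    min_frames = round(minimum_speech_duration / frame_seconds)
    hold_frames = round(speech_hold_duration / frame_seconds)

    decisions: List[bool] = []
    speech_run = 0   # consecutive frames above the threshold
    silence_run = 0  # consecutive frames at or below it while still active
    active = False

    for probability in probabilities:
        if probability > sensitivity:
            speech_run += 1
            silence_run = 0
            # Report speech only once enough consecutive speech has been seen.
            if speech_run >= min_frames:
                active = True
        else:
            speech_run = 0
            if active:
                silence_run += 1
                # Keep reporting speech until the hold period is exceeded.
                if silence_run > hold_frames:
                    active = False
                    silence_run = 0
        decisions.append(active)
    return decisions
```

In this sketch, a larger minimum_speech_duration suppresses short bursts such as clicks, while a larger speech_hold_duration bridges brief pauses between words at the cost of reporting the end of speech later.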

VAD-Only (Without Enhancement)

Use Quail VAD without audio enhancement:

import os
from pipecat.audio.vad.aic_quail_vad import AICQuailVADAnalyzer
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.transports.services.daily import DailyTransport, DailyParams

# Just VAD, no enhancement filter
aic_vad = AICQuailVADAnalyzer(
    license_key=os.environ["AIC_SDK_LICENSE"],
)

transport = DailyTransport(
    room_url,
    token,
    "Bot",
    DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
        # No audio_in_filter - raw audio goes directly to VAD
    ),
)

user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        vad_analyzer=aic_vad,
    ),
)

Using a Local Model

For offline deployments or custom Quail VAD models:

import os
from pathlib import Path
from pipecat.audio.vad.aic_quail_vad import AICQuailVADAnalyzer

aic_vad = AICQuailVADAnalyzer(
    license_key=os.environ["AIC_SDK_LICENSE"],
    model_path=Path("/path/to/your/quail-vad-model.aicmodel"),
)
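
The precedence between model_path, model_id, and model_download_dir described in the constructor parameters can be summarized as a small decision function. This is a sketch of the documented behavior only; resolve_model_source and its return shape are hypothetical and do not mirror Pipecat's internal code.

```python
from pathlib import Path
from typing import Optional, Tuple

# Defaults as stated in the constructor parameter descriptions.
DEFAULT_MODEL_ID = "quail-vad-2.0-xxs-16khz"
DEFAULT_DOWNLOAD_DIR = Path.home() / ".cache" / "pipecat" / "aic-models"


def resolve_model_source(
    model_id: Optional[str] = None,
    model_path: Optional[Path] = None,
    model_download_dir: Optional[Path] = None,
) -> Tuple[str, str, Optional[Path]]:
    """Return (kind, model, download_dir) following the documented precedence.

    Hypothetical helper: a local model_path wins and model_id is ignored;
    otherwise the model id is downloaded into the cache directory.
    """
    if model_path is not None:
        return ("local", str(model_path), None)
    return (
        "download",
        model_id or DEFAULT_MODEL_ID,
        model_download_dir or DEFAULT_DOWNLOAD_DIR,
    )
```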

See the AIC Quail VAD example for a complete working example with detailed logging.

Comparison to Deprecated AICVADAnalyzer

Feature	AICQuailVADAnalyzer (Recommended)	AICVADAnalyzer (Deprecated)
Model	Standalone Quail VAD 2.0	Enhancement model’s internal VAD
Independence	Owns its own processor	Bound to `AICFilter` instance
Audio path	Processes whatever the pipeline feeds it	Reads post-enhancement VAD state
Sensitivity	Probability threshold (0.0-1.0)	Energy threshold (1.0-15.0)
Placement	Can be placed anywhere in pipeline	Must follow `AICFilter`
Use case	Noise-robust VAD as primary differentiator	Legacy coupling to enhancement pipeline
Removal timeline	N/A (current recommended approach)	Will be removed in Pipecat 1.6.0

Audio Flow

The Quail VAD analyzer can work with or without the AIC enhancement filter, providing flexibility in your pipeline architecture.

Notes

Requires ai-coustics license key (get one at developers.ai-coustics.io)
Environment variable: Use AIC_SDK_LICENSE for authentication
Default model is quail-vad-2.0-xxs-16khz, optimized for 16kHz audio
Model is downloaded and cached on first use
Works independently of AICFilter - can be used with or without audio enhancement
Provides noise-robust speech detection in challenging acoustic environments
Handles PCM_16 audio format (int16 samples)
Thread-safe for pipeline processing
For available models, visit artifacts.ai-coustics.io
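
For readers unfamiliar with the PCM_16 format mentioned above, this sketch decodes a raw buffer of int16 samples into floats, which can help when inspecting or logging audio while debugging. The analyzer consumes the raw audio itself, so this step is not required. The helper name pcm16_bytes_to_floats and the little-endian, single-channel layout are assumptions.

```python
import struct
from typing import List


def pcm16_bytes_to_floats(audio: bytes) -> List[float]:
    """Decode little-endian int16 PCM bytes into floats in [-1.0, 1.0).

    Hypothetical debugging helper; assumes mono, little-endian samples.
    """
    if len(audio) % 2:
        raise ValueError("PCM_16 buffers must contain whole 2-byte samples")
    count = len(audio) // 2
    samples = struct.unpack(f"<{count}h", audio)
    # int16 spans -32768..32767, so dividing by 32768 maps into [-1.0, 1.0).
    return [sample / 32768.0 for sample in samples]
```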
