Resources

How to Build Voice Agents with Spitch & LiveKit

Learn how to integrate Spitch STT and TTS models into Livekit in less than 5 minutes

Ifeoluwa Oduwaiye

Jul 11, 2025

Building voice agents has become significantly easier with our new LiveKit integration. What does this mean for developers? You can now develop and deploy voice agents in Yoruba, Igbo, Hausa, Amharic, or English in less than 5 minutes.

You can now bring natural-sounding African voice capabilities to your LiveKit applications in minutes. In this blog, we’ll guide you through integrating Spitch STT (Speech-to-Text) and TTS (Text-to-Speech) models into LiveKit. But before we dive in, here are some prerequisites you’ll need to follow along.

Prerequisites

  • A Spitch API key

  • A LiveKit API key, API secret, and server URL

  • A Python IDE

  • Optional: An OpenAI LLM API Key


Setting up your environment variables and modules

Install the following Python packages on your local computer:

pip install spitch \
  "livekit-agents[spitch,openai,silero,turn-detector]~=1.0" \
  "livekit-plugins-noise-cancellation~=0.2" \
  "python-dotenv"

Set up your environment variables with the following API secrets:

SPITCH_API_KEY=<Your Spitch API Key>
LIVEKIT_API_KEY=<your API Key>
LIVEKIT_API_SECRET=<your API Secret>
LIVEKIT_URL=<your URL>
OPENAI_API_KEY=<Your OpenAI API Key>
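Since every sample below depends on these variables, a quick sanity check at startup can save debugging time. Here is a minimal sketch using only the standard library; the variable names match the list above, and we assume they are exported in your shell or loaded from a .env file via python-dotenv:

```python
import os

# Keys the samples below rely on; OPENAI_API_KEY is only needed
# for the STT-LLM-TTS pipeline at the end of this post
REQUIRED_KEYS = ["SPITCH_API_KEY", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "LIVEKIT_URL"]

def missing_env(keys=REQUIRED_KEYS):
    """Return the names of required variables that are not set."""
    return [name for name in keys if not os.getenv(name)]

if missing := missing_env():
    print("Missing environment variables:", ", ".join(missing))
```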

Speech‑to‑Text (STT) Integration

Spitch STT currently supports 5 African languages, with more coming soon. To get started with our STT API in LiveKit, use the code sample below, and feel free to check out our docs for more information.

from dotenv import load_dotenv
from livekit import agents
from livekit.agents import AgentSession, Agent, RoomInputOptions
from livekit.plugins import (
    spitch,
    noise_cancellation,
)

load_dotenv()

class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")

async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt=spitch.STT(language="en")
        # , llm, tts, ...
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_input_options=RoomInputOptions(
            # LiveKit Cloud enhanced noise cancellation
            # - If self-hosting, omit this parameter
            # - For telephony applications, use `BVCTelephony` for best results
            noise_cancellation=noise_cancellation.BVC(), 
        ),
    )

    await ctx.connect()

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )

if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
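If you save the sample above as agent.py (a filename we're assuming here), the CLI that `agents.cli.run_app` provides gives you a few ways to launch it:

```shell
# Download model files some plugins need (e.g. the turn detector),
# then run the agent with hot reload against your LiveKit project
python agent.py download-files
python agent.py dev

# Or test it directly in your terminal without connecting to a room
python agent.py console
```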

Text‑to‑Speech (TTS) Integration

Spitch TTS currently supports 5 African languages and 22 voices, with more languages and voices coming soon. To get started with our TTS API in LiveKit, use the code sample below, and feel free to check out our docs for more information.

from dotenv import load_dotenv
from livekit import agents
from livekit.agents import AgentSession, Agent, RoomInputOptions
from livekit.plugins import (
    spitch,
    noise_cancellation,
)

load_dotenv()

class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")

async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        tts=spitch.TTS(language="en", voice="kani")
        # , stt, llm, ...
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_input_options=RoomInputOptions(
            # LiveKit Cloud enhanced noise cancellation
            # - If self-hosting, omit this parameter
            # - For telephony applications, use `BVCTelephony` for best results
            noise_cancellation=noise_cancellation.BVC(), 
        ),
    )

    await ctx.connect()

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )

if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))

Integrating an STT-LLM-TTS pipeline with Spitch, OpenAI, and LiveKit

Spitch Livekit Integration Pipeline

This pipeline takes voice as input and uses Spitch STT to transcribe the audio. The generated transcript is then passed to an LLM for processing through predefined prompts, and the output is converted back to audio using Spitch TTS.

This pipeline can be used to develop multilingual voice agents for many use cases simply by changing the LLM prompt and language. To implement it, use the code below.

import asyncio
import sys

# On Windows, force the SelectorEventLoopPolicy; the default
# Proactor event loop can cause issues with some audio libraries
if sys.platform == "win32":
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())


from dotenv import load_dotenv

from livekit import agents
from livekit.agents import AgentSession, Agent, RoomInputOptions
from livekit.plugins import (
    spitch,
    openai,
    noise_cancellation,
    silero,
)
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv()

class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")


async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt=spitch.STT(language="en"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=spitch.TTS(language="en", voice="kani"),
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_input_options=RoomInputOptions(
            # LiveKit Cloud enhanced noise cancellation
            # - If self-hosting, omit this parameter
            # - For telephony applications, use `BVCTelephony` for best results
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )

    await ctx.connect()

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))

Concluding Remarks

Now that you are armed with all the resources needed to build voice agents with Spitch, it's time to put that knowledge into production! Feel free to experiment with the different voices and languages on our platform, and don't be a stranger to our docs. We can’t wait to see what you build!

Speak to the Future Africa

Our AI voice technology is built to understand, speak, and connect with Africa like never before.

© 2025 Spitch. All rights reserved.
