Research

Introducing Mansa.v1: An Automatic Speech Recognition Model Built for African-Accented Users

Meet Mansa, our new ASR model for African-accented users, which excels at named entity recognition, timestamp generation, and custom spelling.

Odunola Jenrola

Aug 13, 2025

Today, we're excited to unveil Mansa.v1, a 1.4B-parameter preview model and the first in our series of automatic speech recognition models designed with a singular focus: delivering exceptional performance on African content. While global ASR giants have made impressive strides, they often stumble when encountering the rich linguistic diversity of African names, places, and cultural references. Mansa.v1 is one step toward changing this.

What Makes Mansa.v1 Different

Mansa.v1 excels at recognising African named entities within English speech. It captures every name, every place, verbatim with impressive accuracy.

Key Features at a Glance:
  • African named entity recognition in English contexts

  • Custom spelling guidance for names and specialised terms

  • Sentence- or word-level timestamps for audio up to 30 minutes (25MB)


Performance Comparison

Here’s how our model stacks up against other speech-to-text (STT) solutions.

Comparative Transcription Results

Original Audio Content: "After the handshake formalities, the Bayelsa governor takes the kickoff at the Samson Siaisia Stadium, where the All Stars International Football Club of Yenagoa took off to a flying start with Charles Gabriel getting the curtain raiser."

| Model | Output | Named Entity Accuracy |
| --- | --- | --- |
| Whisper Large-v3 | after the handshake formalities the bayota governor takes the kickoff at the samson cscs stadium where the all-stars international football club of yanagawa took off to a flying start with charles gabriel getting the curtain raiser few minutes later | 0/4 (0%) |
| ElevenLabs Scribe | After the handshake formalities, the Bayelsa governor takes the kickoff at the Sampson Siyasiya Stadium, where the All Stars International Football Club of Yenagoa took off to a flying start with Charles Gabriel getting the curtain raiser. | 3/4 (75%) |
| GPT-4o Transcribe | After the handshake formalities, the Bayelsa governor takes the kickoff at the Samson Siasia Stadium where the All Stars International Football Club of Yenagoa took off to a flying start with Charles Gabriel getting the curtain raiser. | 4/4 (100%) |
| Mansa.v1 | After the handshake formalities, the Bayelsa governor takes the kickoff at the Samson Siaisia Stadium, where the All Stars International Football Club of Yenagoa took off to a flying start with Charles Gabriel getting the curtain raiser. | 4/4 (100%) |
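
If you want to run a similar check on your own transcripts, here is a minimal sketch of one plausible way to compute a named entity score. It assumes exact, case-sensitive matching against a hand-picked entity list; both the matching rule and the `ENTITIES` list are our assumptions for illustration, not a description of how the scores above were produced.

```python
def entity_accuracy(transcript: str, entities: list[str]) -> tuple[int, int]:
    """Return (hits, total): how many entities appear verbatim in the transcript."""
    hits = sum(1 for entity in entities if entity in transcript)
    return hits, len(entities)


def format_score(hits: int, total: int) -> str:
    """Format a score in the 'hits/total (percent)' style."""
    percent = hits / total if total else 0.0
    return f"{hits}/{total} ({percent:.0%})"


# Assumed entity list for the sample sentence (illustrative only).
ENTITIES = ["Bayelsa", "Samson Siaisia", "Yenagoa", "Charles Gabriel"]

mansa_output = (
    "After the handshake formalities, the Bayelsa governor takes the kickoff "
    "at the Samson Siaisia Stadium, where the All Stars International Football "
    "Club of Yenagoa took off to a flying start with Charles Gabriel getting "
    "the curtain raiser."
)
scribe_output = (
    "After the handshake formalities, the Bayelsa governor takes the kickoff "
    "at the Sampson Siyasiya Stadium, where the All Stars International Football "
    "Club of Yenagoa took off to a flying start with Charles Gabriel getting "
    "the curtain raiser."
)

print(format_score(*entity_accuracy(mansa_output, ENTITIES)))
print(format_score(*entity_accuracy(scribe_output, ENTITIES)))
```

Exact matching is deliberately strict: a single misspelt character in a name counts as a miss, which is the behaviour you usually want when names must be captured verbatim.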


Advanced Features

Named Entity Specification: You're in Control

Mansa.v1 lets you guide the model's spelling for specific terms. This is useful for content creators, journalists, and businesses working with nuanced African words, such as place names, personal names, and cultural concepts, that language models often mistranscribe.

How It Works

To use this feature, simply provide a list of special words before transcription. It is available through both the playground and our SDKs. Head over to our docs for more details on how to integrate with our SDKs.

Spitch Playground


Timestamps: Precision Down to the Second

Mansa.v1 provides word-level timestamps for audio files up to 30 minutes, making it perfect for:

  • Subtitle generation with precise synchronisation

  • Content navigation for podcasts and interviews

  • Quote extraction for journalists and researchers

  • Accessibility compliance with accurate closed captions
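
As an illustration of the subtitle use case, here is a minimal sketch that turns word-level timestamps into SRT cues. The input shape (a list of dicts with `text`, `start`, and `end` in seconds) is an assumption made for this example and may not match the actual response format, so check the docs and adapt the field names accordingly.

```python
def to_srt_time(seconds: float) -> str:
    """Convert seconds to the SRT timestamp format HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group words into cues of at most max_words and render them as SRT."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = to_srt_time(chunk[0]["start"])  # cue starts with its first word
        end = to_srt_time(chunk[-1]["end"])     # and ends with its last word
        text = " ".join(word["text"] for word in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Hypothetical word timestamps, for illustration only.
sample_words = [
    {"text": "After", "start": 0.0, "end": 0.4},
    {"text": "the", "start": 0.4, "end": 0.5},
    {"text": "handshake", "start": 0.5, "end": 1.1},
]
srt = words_to_srt(sample_words, max_words=2)
print(srt)
```

Grouping by a fixed word count keeps the sketch short; in practice you may prefer to break cues at sentence boundaries or pauses.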

Spitch Playground

Timestamp Accuracy Comparison

Our timestamp alignment matches or beats competitors, with an average deviation of just ±0.1 seconds compared to ±0.3-0.5 seconds for other models.
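
To measure deviation on your own audio, a simple approach is sketched below. It assumes "average deviation" means the mean absolute difference between predicted and hand-labelled word start times; that definition is our assumption for this example, and the numbers in it are made up rather than measured.

```python
def mean_abs_deviation(predicted: list[float], reference: list[float]) -> float:
    """Mean absolute difference, in seconds, between paired timestamps."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("Need two non-empty lists of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Illustrative values only: hand-labelled vs. model word start times (seconds).
reference_starts = [0.0, 0.5, 1.0, 1.5]
predicted_starts = [0.1, 0.4, 1.1, 1.4]

deviation = mean_abs_deviation(predicted_starts, reference_starts)
print(f"Average deviation: ±{deviation:.2f} s")
```

This only works when both lists contain the same words in the same order, so align the transcripts first if the model dropped or added words.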


Real-World Applications

Here are some of the real-world applications of Mansa.v1.

Film & Media Production

Creating accurate subtitles for Nollywood films or African documentaries becomes effortless with Mansa. Simply provide cast member names and location spellings upfront, and watch Mansa.v1 ensure consistency throughout transcription.

Corporate Communications

Transcribing earnings calls or board meetings with African executives? Pre-load company names, executive names, and technical terms for flawless transcripts.

News & Journalism

Covering African elections, business news, or cultural events? Mansa.v1 ensures every politician's name, every constituency, every cultural reference is captured correctly.


Getting Started

Visit our API Documentation to get your API key and start transcribing. All new users get $1 worth of Spitch credits on sign-up to try out our services.

For questions or partnership inquiries: info@spitch.app

Current Limitations

In our internal tests, we noticed some scenarios where the model's performance degrades:

  • Extreme code-switching, which can make the model enter a repetition loop

  • Extremely spontaneous audio and overlapping speakers

We plan to address these in our upcoming multilingual release.
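
Until then, you may want to flag transcripts that look like they contain a repetition loop. The sketch below is a simple heuristic of our own, not part of Mansa.v1 or the SDKs: it looks for the same short phrase repeated back-to-back several times. The `ngram` and `min_repeats` defaults are arbitrary starting points.

```python
def has_repetition_loop(text: str, ngram: int = 3, min_repeats: int = 4) -> bool:
    """Return True if any ngram-word phrase repeats consecutively min_repeats times."""
    words = text.lower().split()
    for start in range(len(words) - ngram * min_repeats + 1):
        pattern = words[start:start + ngram]
        repeats = 1
        pos = start + ngram
        # Count how many times the same phrase follows itself immediately.
        while words[pos:pos + ngram] == pattern:
            repeats += 1
            pos += ngram
        if repeats >= min_repeats:
            return True
    return False


looping = "I said " + "come here now " * 5
normal = "After the handshake formalities, the Bayelsa governor takes the kickoff"

print(has_repetition_loop(looping))
print(has_repetition_loop(normal))
```

Flagged transcripts can then be reviewed manually or re-run on shorter audio segments.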

What's Next

Here are some exciting features to look out for in the next version of our model.

  • Multi-language support: Native transcription for Yoruba, Swahili, Amharic, and more

  • Code-switching: Multiple languages in one sequence

  • Speaker diarization: Identifying who said what in multi-speaker scenarios

  • Real-time streaming: Live transcription for broadcasts and events

Speak to the Future Africa

Our AI voice technology is built to understand, speak, and connect with Africa like never before.

© 2025 Spitch. All rights reserved.
