Intron, a Nigerian AI startup that provides speech-to-text and text-to-speech tools for African languages, has expanded its speech recognition platform, Sahara, to support 57 languages, adding 24 new ones as it deepens its push into healthcare, legal, financial services, and telecom.
Sahara v2 covers 23 African languages within that total and supports more than 500 distinct African accents. The 24 newly added languages include Hausa, Swahili, isiZulu, Yoruba, Kinyarwanda, Twi, Igbo, isiXhosa, African French, Amharic, Bemba, Luganda, Oromo, Pidgin, Shona, and Wolof. Tobi Olatunji, Intron's founder and CEO, said commercial demand guided the selection.
The upgrade also introduces what the company describes as the world’s first bilingual Swahili-English automatic speech recognition model built to handle code-switching, alongside new text-to-speech capabilities and enterprise offline deployment options.
In Africa, where most of its roughly 2,000 languages exist primarily as spoken tongues with little written form, voice recognition tools are beginning to play a key role in how people interact with technology.
The global speech and voice recognition market is projected to hit $81.59 billion by 2032, and startups like Intron are building the infrastructure layer the continent needs, rather than inheriting systems that were not designed for how Africans speak.
Intron said Sahara uses locally sourced voice data to understand the diverse accents, dialects, and contextual nuances of African speech.
According to benchmarks the company ran on African voice datasets that it curated and made publicly testable, Sahara v2 outperformed systems including Gemini, GPT-4, Whisper, ElevenLabs, and Azure by up to 64% on African names, organisations, and locations.
The company also reports 35% better performance with numbers, 20% stronger robustness in noisy and multi-speaker environments, and 25% higher cross-domain accuracy across healthcare, finance, legal services, and telecommunications.
“We curated datasets of African voices, combining publicly available datasets with our in-house collection, and made them available so anyone can test global models on African speech,” Tobi Olatunji, founder and CEO of Intron, told TechCabal.
Founded in 2020 by Olatunji and Olakunle Asekun, Intron began by building clinical documentation tools before expanding into broader voice infrastructure. Sahara now powers speech-to-text, text-to-speech, and voice authentication systems used by enterprise and government clients.
Intron says it sees consistent usage in at least six African countries, including Nigeria, Kenya, South Africa, Ghana, Rwanda, and Uganda. Enterprise clients include the Ogun State Judiciary and Audere, a South African company using Sahara to transcribe WhatsApp voice messages across multiple local accents.
“We are a for-profit company, so we prioritised languages where there is enterprise demand and willingness to pay,” Olatunji said. “There is still a massive gap across the continent, but we have to start where there is both population coverage and clear use cases.”
Sahara v2 was built using more than 14 million audio clips totaling over 50,000 hours from more than 40,000 speakers across 30 African countries. Olatunji said much of the early medical speech data did not previously exist and had to be created from scratch.
“When we started, there was no African medical speech data available,” he said. “We recruited contributors ourselves across multiple countries, compensated them, and ensured they understood how their data would be used.”
He added that newer datasets supported by organisations such as the Gates Foundation, Lacuna Fund, and Google have supplemented the company’s work, though most of Sahara’s training data remains in-house.
The new bilingual Swahili-English model was developed in partnership with Penda Health, an outpatient healthcare provider in Kenya, to support rapid switching between languages during real conversations. Code-switching is common in many African countries, particularly in clinical and service settings.
“A doctor may ask questions in English, a patient replies in Swahili, then switches back to English, all within the same exchange,” Olatunji said. “Many monolingual systems struggle in those scenarios.”
Intron says additional bilingual models for languages such as Yoruba, Hausa, Zulu, and Kinyarwanda are in development and expected in subsequent releases.
The company also introduced its first local-language text-to-speech model in Hausa, designed to power multilingual voice bots for call centres, health assistants, and financial services, enabling voice interactions in local languages.
Sahara v2 is primarily deployed via the cloud, but Intron now supports fully offline enterprise deployments through a partnership with Nvidia. The models can run on Nvidia Jetson Edge devices for organisations operating in low-connectivity environments.
The company says the entry-level device costs about $250 and can support several users connected locally. Mobile phones are not currently supported for full offline processing, though the platform can operate in caching mode for intermittent connectivity.
Intron says it complies with in-country data protection requirements and allows enterprise clients to determine whether data is stored locally or in the cloud.
Alongside the product release, the company is publishing its inaugural 2026 Africa Voice AI Report, which examines benchmarking standards, data quality, and the commercial readiness of voice AI systems across the continent.
Intron plans to raise $3 million later this year to expand language coverage and continue developing bilingual and domain-specific models.
“A lot of the investment that we make is upfront,” Olatunji said. “We spend a lot of money on creating models, running experiments, and producing data to train some of the best models.”
He added that the company is already seeing demand for similar code-switching models beyond Swahili-English, with languages such as Yoruba, Hausa, Zulu, and Kinyarwanda on its roadmap.
