Google and African institutions are launching WAXAL, a massive, open-source voice dataset covering 21 indigenous languages, to democratize AI in sub-Saharan Africa.
The Birth of WAXAL: An Answer to the African Language Divide
The result of three years of collaboration funded by Google Research Africa, the WAXAL project (“to speak” in Wolof) provides, for the first time, high-quality voice data for languages spoken by more than 100 million people.
While voice assistants and real-time translators are commonplace in the West, Africa and its more than 2,000 languages have lagged behind for lack of suitable datasets. WAXAL changes this with 1,250 hours of transcribed natural speech and more than 20 hours of studio recordings for building synthetic voices.
Indigenous languages at the heart of the project
African partners, namely Makerere University (Uganda), the University of Ghana, Digital Umuganda (Rwanda) and the African Institute for Mathematical Sciences, led the collection and retain full ownership of the data, which is released under an open license on Hugging Face. Here are the 21 languages covered by WAXAL:
- West Africa: Akan, Ewe, Fante, Fulani, Hausa, Igbo, Yoruba
- East Africa: Luganda, Swahili, Kikuyu, Dholuo, Acholi
- Southern/Central Africa: Shona, Lingala, Malagasy
- Others: Dagaare, Dagbani, Ikposo, Masaaba, Nyankole, Rukiga, Soga
Practical Applications for Africa
This open-source dataset paves the way for local innovations in speech AI, particularly crucial in areas with low literacy rates:
- Voice assistants for agricultural and healthcare services
- Medical and educational transcriptions in local languages
- Customer automation for pan-African SMEs
- Accessibility tools for people with disabilities
African leadership and digital sovereignty
Unlike the top-down approaches of tech giants, WAXAL adopts a community-first model in which African universities lead data collection and quality control, with Google acting as a technical facilitator. “The ultimate impact is the empowerment of Africans to build their tech in their own languages,” emphasizes Aisha Walcott-Bryant, Head of Google Research Africa. Joyce Nakatumba-Nabende of Makerere adds, “AI must speak our contexts to transform our communities.”
Economic Impact and Enhanced Research
WAXAL is already boosting university research in Uganda and Ghana by training students and researchers in linguistic AI. In the long term, it could catalyze startups specializing in AI for health, education, and agriculture, sectors where 80% of Africans prefer to interact orally. The dataset builds on the momentum created by N-ATLAS (Nigeria, 2025) toward a truly African AI.
Perspectives: Towards an Inclusive Pan-African AI
Available today on Hugging Face, WAXAL invites researchers, students, and entrepreneurs to create scalable tools that reflect the continent’s linguistic diversity. As generative AI explodes in popularity, this strategic project positions Africa as a key player, not just a consumer, in global voice technology.






