AppTek’s 4D for HLT methodology, represented by language/dialect, demographic, domain and channel, is designed to capture every voice around the world through advancements in machine learning and the high-quality data science research intended to fuel AI-enabled automatic speech recognition (ASR) and neural machine translation (NMT) models.
However, capturing every voice is not an easy task, and bias in speech recognition remains a problem: studies have found large performance gaps across age groups, genders, and other demographic groups in speech recognition systems in many countries and regions.
To better mitigate bias and capture the distinct voices and languages heard around the world, including for deaf and hard-of-hearing users and for text-to-speech for the blind, AppTek has developed a specific methodology that focuses on the ABCD’s of HLT: AI Bias-Correction Data Science.
AppTek’s ABCD’s methodology is designed specifically to enrich supervised and unsupervised machine learning models with demographically diverse acoustic features, with the goal of increasing inclusivity and representation across all languages and, more importantly, all people around the world.
To implement this ABCD approach, the AppTek Science Team plans the process with demographic-specific questions such as:
(a.) Gender: Is there an even 50/50 gender mix to ensure balance?
(b.) Accents/Dialects: Does the research represent all accents and dialects relevant to the language model we are building? Today, in US English ASR for example, AppTek has built equity-balanced models that augment and counterweigh underrepresented dialects and accents.
(c.) Education: Are the various levels of education represented by domain?
(d.) Age: Do we have all appropriate age groups represented for a given domain? We collect all age ranges from children to elderly voices.
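As an illustration only, the questions above could be turned into a simple automated check on a corpus manifest. The sketch below is not AppTek's tooling: the record fields, the function names `demographic_shares` and `underrepresented`, the 50/50 target and the 5% tolerance are all assumptions made for the example.

```python
from collections import Counter


def demographic_shares(records, attribute):
    """Return the fraction of records that fall in each value of `attribute`."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def underrepresented(shares, targets, tolerance=0.05):
    """List groups whose observed share is below target by more than `tolerance`."""
    return sorted(
        group
        for group, target in targets.items()
        if shares.get(group, 0.0) < target - tolerance
    )


# Hypothetical manifest: one entry per recorded speaker (assumed fields).
records = [
    {"gender": "female", "age": "adult"},
    {"gender": "male", "age": "adult"},
    {"gender": "male", "age": "child"},
    {"gender": "male", "age": "elderly"},
]

gender_shares = demographic_shares(records, "gender")
# Assumed target: an even 50/50 gender mix, as in question (a.).
gaps = underrepresented(gender_shares, {"female": 0.5, "male": 0.5})
print(gender_shares, gaps)
```

The same two functions can be reused for accent, education or age by changing the attribute name and the target shares.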
Once the demographic analysis is complete, the team builds acoustic models, a process that involves either securing existing audio (such as broadcast news) or creating simulated audio (such as telephony and microphone conversations) based on the pre-assigned demographic mix.
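To make the idea of a pre-assigned demographic mix concrete, here is a minimal sketch of how a collection target might be split across groups. It is an assumption for illustration, not AppTek's actual process: the function name `collection_plan`, the group labels and the weights are hypothetical. It uses integer weights and largest-remainder rounding so the per-group counts always add up to the requested total.

```python
def collection_plan(total, weights):
    """Split `total` recordings across groups in proportion to integer `weights`.

    Uses largest-remainder rounding so the counts sum exactly to `total`.
    """
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")

    # Floor of each group's proportional share, using integer arithmetic only.
    plan = {group: total * w // weight_sum for group, w in weights.items()}
    remainders = {group: total * w % weight_sum for group, w in weights.items()}

    # Hand out the recordings lost to flooring, largest remainder first.
    leftover = total - sum(plan.values())
    by_remainder = sorted(weights, key=lambda g: remainders[g], reverse=True)
    for group in by_remainder[:leftover]:
        plan[group] += 1
    return plan


# Hypothetical age mix for a 10-recording pilot (weights are percentages).
pilot = collection_plan(10, {"child": 20, "adult": 50, "elderly": 30})
print(pilot)
```

Integer weights are used here to avoid floating-point rounding surprises; a mix expressed as fractions could be converted to percentages first.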
By taking into consideration age, accents, dialects, education levels, gender, and more, AppTek's ABCD methodology is a step toward more responsible, diverse and inclusive AI that ensures every voice across the world can be heard.
AppTek is a global leader in artificial intelligence (AI) and machine learning (ML) technologies for automatic speech recognition (ASR), neural machine translation (NMT), natural language processing/understanding (NLP/U) and text-to-speech (TTS). The AppTek platform delivers industry-leading solutions for organizations across a breadth of global markets such as media and entertainment, call centers, government, enterprise business, and more. Built by scientists and research engineers who are recognized among the best in the world, AppTek’s solutions cover a wide array of languages/dialects, channels, domains and demographics.