By Mike Veronis
Media and entertainment takes many forms these days, delivered through both traditional and leading-edge platforms, and there’s an opportunity for the M&E industry to enrich access and consumption for all. M&E companies can leverage advances in automatic speech recognition and machine translation to streamline their workflows and drastically reduce operating costs.
This is the first in a series of posts where we’ll explore several ways the content experience can be enhanced for audiences and where M&E companies can increase their content monetization. Let’s start with a common example.
Have you ever been in a place, perhaps your local gym, an airport or even a cocktail bar, where you noticed something engaging on a screen but no sound was available with it? You missed the details and couldn’t follow what was going on. Frustrating, wasn’t it? Well, imagine that being your regular experience. That’s what it’s like for those who are hard of hearing, and it’s what makes closed captioning so critical.
Whether an audience is hard of hearing or just viewing content without the benefit of audio, many people read closed captions. These viewers have every right, and should have every expectation, of gaining the same level of understanding as anyone else. Currently, audiences who are hard of hearing are limited to what closed captioning makes available, and that often contains errors, omissions and inaccuracies, especially in manual captioning situations where a live captioner can’t discern or type words quickly enough.
In fact, we analyzed many closed captioning samples and found an average error rate of 40%, including omissions and deletions. Captioning errors not only affect viewer comprehension; they can alter meaning, damage a brand, or even result in regulatory compliance violations.
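Error rates like the 40% figure above are typically measured as word error rate (WER), the standard ASR metric: the number of substituted, inserted and deleted words, divided by the length of a reference transcript. As an illustrative sketch (the caption text below is invented, not from our study), WER can be computed with a simple word-level edit distance:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Levenshtein distance over words, via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i          # all deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j          # all insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# A caption that garbles one word and drops two scores 3 errors over 10 words:
ref = "severe weather warning issued for the tri state area tonight"
hyp = "severe weather morning for the state area tonight"
print(word_error_rate(ref, hyp))  # → 0.3, i.e. 30% WER
```

Note how even a 30% WER caption, as in this toy example, can invert the meaning of a safety-critical message.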
If we look beyond native language, many international organizations need to provide real-time closed captioning to members and stakeholders across borders, languages and accents. Accuracy is especially critical here, as misunderstood or misinterpreted information could have serious consequences: think of volatile international relations, crisis situations, or bad publicity for a business with global reach.
At AppTek, we took on the problems with commonly used closed captioning, achieving a language technology breakthrough that revolutionizes the closed captioning process. Artificial neural networks, natural language processing and densely connected Long Short-Term Memory models are core to our Automatic Speech Recognition (ASR) platform. These advances not only increase the speed and accuracy of transcription and translation; they capture nuances like intonation, apply correct punctuation (think about the power of a comma!), and separate speakers by discerning unique voices. Those capabilities are essential to a thorough understanding of any content, yet they are lacking in other speech recognition systems.
Our Live Captioning Appliance is a fully functional server installed with our ASR media software. It delivers fully automated, same-language captions for live content in all of the standard captioning formats, with up to 95% accuracy and speed that matches or exceeds human captioning. Machine translation is available as an additional option to automatically produce high-accuracy subtitles in a different language for foreign-language-speaking audiences.
These rich capabilities provide a distinct advantage for stations with a “Digital First” strategy. Along with automating expensive, time-consuming manual tasks, they improve content management, accelerate time to market, and create rich metadata that helps with SEO for archives, websites, program libraries and more. Perhaps most importantly, they enable the full meaning of ALL content to be available to ALL audiences.
There are so many opportunities for M&E companies to modernize their delivery, maximize their content investments and bring more informational, educational and entertainment value to customers. We’ll explore more of them in the next post, along with how AppTek is bridging the gap between current delivery methods and what’s truly possible. The state of the art in automatic speech recognition and machine translation has taken qualitative leaps with neural networks and deep learning.