Point your index finger forward, then sweep your thumb up – you’ve just signed “hello”, according to Alibaba Cloud’s sign language translator named Xiaomo. She says hi back.
But Xiaomo isn’t human. She’s a digital avatar who will provide two-way translation between Mandarin and Chinese sign language during the upcoming Asian Para Games taking place in Hangzhou over the last week of October.
Hearing-impaired Asiad participants will be able to access her cloud- and AI-powered services via payment platform Alipay to ask people for directions, view event schedules and more.
Xiaomo took nearly two years to develop, according to algorithm engineer Zhang Bang.
“I’m very proud to see our research being applied to solving real-life problems and having a real impact on people,” he told Alizila in an interview.
Watch the video to meet Xiaomo.
Below is a transcript of this video, edited for clarity and brevity.
Zhang Bang: My name’s Bang Zhang, an engineer at Alibaba Cloud.
My daily work is mostly fundamental research in computer vision, multimodality and natural language processing. I’m very proud to see our research being applied to solving real-life problems and having a real impact on people.
Xiaomo is a digital avatar who can perform two-way translation between sign language and spoken Chinese in order to support communications between people with hearing difficulties and the wider community.
There are two key components. The first covers the technologies related to digital avatars. The second is the translation module, which is powered by a large language model.
Building the dataset for training the machine learning model is one of the biggest challenges that we have faced. Building a very large sign language dataset, with video collections and annotations, is very time-consuming and costly for sure.
It took us almost two years to build such a dataset, thanks to the help of volunteers and sign language practitioners.
The other challenge on the technology side is that, in sign language, one word can have many different meanings. To tackle such a challenge, we leverage a technique called machine translation. It can perform the task of word selection and translation at the same time. It helps us to reduce the ambiguity and enhance the accuracy of the translation.
During the Asian Para Games, participants with hearing impairments can simply open the Alipay app and use Xiaomo for various kinds of assistance, including asking for directions, seeking medical help and viewing the Games.
It would definitely make the Games more inclusive and greatly help communication between individuals with hearing difficulties and other people.
We have already pioneered the use of Xiaomo in museums, tourist spots, cafes and news broadcasts.
In the future, we hope to see Xiaomo assist individuals with hearing impairments to address their needs in public services.