New Dataset for AI-Enabled Sign Language Translation


The dataset will enable more automated sign language understanding and translation. These technologies could be used in areas such as virtual assistants and robotics.

Artificial intelligence (AI) is helping humans save, parse, and learn language. With a new dataset, researchers and developers could get a massive boost in developing technologies for the deaf community.

The How2Sign Dataset

The dataset includes over 80 hours of videos showing sign language interpreters translating a variety of tutorials. Amanda Duarte, a researcher in the Emerging Technologies for Artificial Intelligence group at the Barcelona Supercomputing Center (BSC), spent two years recording these videos and preparing the data.

Duarte also made use of Carnegie Mellon University’s Panoptic Studio, a state-of-the-art dome-shaped studio that allowed researchers to film the interpreters and reconstruct their movements in 3D.

(Source: Barcelona Supercomputing Center)

Thanks to Duarte, How2Sign provides a public resource for researchers in natural language processing and computer vision, helping usher in a new era of products and services built for deaf and hard of hearing users. Making the internet more accessible is a huge goal, and one of the first applications is software that transfers signs from one user to another.

The dataset provides a valuable resource for researchers and developers to design quality technology that considers the needs of the deaf community. Artificial intelligence requires computational power and capable algorithms, but it also requires data.

Future accessibility projects

Duarte, an INPhiNIT doctoral student of the “la Caixa” Foundation, received funding from Facebook AI and the “la Caixa” Foundation, and collaborated with the Image Processing Group of the Universitat Politècnica de Catalunya (UPC), Carnegie Mellon University, and Gallaudet University to make this dataset happen.

The dataset will allow more automatic sign language understanding and translation. These technologies could expand to application areas such as virtual assistants, robotics, and other emerging technologies.

Duarte will present the new resource at the CVPR 2021 conference later this summer. Work on the dataset is ongoing, and the data repository continues to expand and improve. The more the dataset expands, the more accessible technology will become.

Elizabeth Wallace

About Elizabeth Wallace

Elizabeth Wallace is a Nashville-based freelance writer with a soft spot for data science and AI and a background in linguistics. She spent 13 years teaching language in higher ed and now helps startups and other organizations explain - clearly - what it is they do.
