My final project for Stanford CS 231N was on automatically identifying sign languages from publicly licensed YouTube clips. Through this project, I learned from scratch how to work with neural networks, computer vision, and video data.
Only recently has automatic processing of sign languages become able to advance beyond the toy problem of fingerspelling recognition. In just the last few years, we have seen leaps forward in sign language theory, in effective computer vision practices, and in the large-scale availability of data. This project achieves better-than-human performance on sign language identification, and it releases a dataset and benchmark for future work on the topic. It is intended as a precursor to sign language machine translation.