My final project for Stanford CS 224N was on hybrid word-character machine translation for Arabic.
Traditional neural machine translation models assume that words are essentially equivalent to units of meaning, an assumption that roughly holds for English but fails for morphologically rich languages such as Arabic. We implement, in TensorFlow, a hybrid translation model that backs off from unknown words to a representation built by modeling their constituent characters. We apply the model to Arabic translation and approach state-of-the-art performance for Arabic within the few weeks allotted for a class project.
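To make the back-off idea concrete, here is a minimal sketch in plain Python rather than TensorFlow. In-vocabulary words get a word-level vector, and any other word gets a vector composed from its characters. Everything in it is an assumption for illustration: the names `make_table`, `compose_from_chars`, and `embed`, the tiny dimension, the random vectors, and the romanized example words. Averaging character vectors is only a stand-in for a learned character-level model and is not the project's actual architecture.

```python
import random
import string

EMBED_DIM = 4  # hypothetical; tiny so the example stays readable


def make_table(symbols, dim=EMBED_DIM, seed=0):
    """Build a toy embedding table with random vectors (a trained model would learn these)."""
    rng = random.Random(seed)
    return {s: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for s in symbols}


def compose_from_chars(word, char_table, dim=EMBED_DIM):
    """Stand-in character model: average the vectors of the word's known characters."""
    vectors = [char_table[c] for c in word if c in char_table]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def embed(word, word_table, char_table):
    """Return (vector, source): word-level if in vocabulary, else character-level back-off."""
    if word in word_table:
        return word_table[word], "word"
    return compose_from_chars(word, char_table), "char"


# Romanized Arabic, for illustration only: "kitab" (book), "fi" (in).
word_table = make_table(["kitab", "fi"], seed=1)
char_table = make_table(string.ascii_lowercase, seed=2)

vec, source = embed("kitab", word_table, char_table)
# "wakitabuhum" (and their book) is out of vocabulary, so it backs off to characters.
oov_vec, oov_source = embed("wakitabuhum", word_table, char_table)
print(source, oov_source)
```

The point of the design is that a single inflected form such as "wakitabuhum" never needs its own vocabulary entry: the model can still produce a representation for it from pieces it has already seen.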