End-to-end neural networks for subvocal speech recognition

My final project for Stanford CS 224S was on subvocal speech recognition. This was my last paper at Stanford; it draws on everything I learned in a whirlwind of CS grad school without a CS undergraduate major. Pol Rosello provided the topic; he and I contributed equally to the paper.

We describe the first approach toward end-to-end, session-independent subvocal automatic speech recognition from involuntary facial and laryngeal muscle movements detected by surface electromyography. We leverage character-level recurrent neural networks and the connectionist temporal classification (CTC) loss. We attempt to address the challenges posed by a lack of data, including poor generalization, through data augmentation of the electromyographic signals, a specialized multi-modal architecture, and regularization. We show results indicating reasonable qualitative performance on test-set utterances and describe promising avenues for future work in this direction.
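
To make the setup concrete, here is a minimal sketch of the core idea: a recurrent network maps a sequence of EMG feature frames to per-frame character probabilities, and the CTC loss handles the alignment between frames and the target character sequence. This is an illustrative PyTorch sketch, not code from the paper; the layer sizes, feature dimension, and names like EMGCharCTC are invented for the example.

```python
import torch
import torch.nn as nn

class EMGCharCTC(nn.Module):
    """Bidirectional LSTM over EMG feature frames -> per-frame character logits."""
    def __init__(self, n_features, n_chars, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden, n_chars + 1)  # +1 for the CTC blank symbol

    def forward(self, x):                 # x: (batch, frames, n_features)
        out, _ = self.rnn(x)              # (batch, frames, 2 * hidden)
        return self.proj(out).log_softmax(dim=-1)

# Toy shapes: 4 utterances, 200 EMG frames each, 10 features per frame.
model = EMGCharCTC(n_features=10, n_chars=26)
x = torch.randn(4, 200, 10)
log_probs = model(x).transpose(0, 1)      # CTCLoss expects (frames, batch, classes)
targets = torch.randint(1, 27, (4, 30))   # character indices; 0 is reserved for blank
input_lengths = torch.full((4,), 200, dtype=torch.long)
target_lengths = torch.full((4,), 30, dtype=torch.long)
loss = nn.CTCLoss(blank=0)(log_probs, targets, input_lengths, target_lengths)
loss.backward()                           # gradients flow end to end, frames to characters
```

Because CTC sums over all alignments between the frame sequence and the character sequence, no frame-level transcription of the EMG signal is needed; that is what makes the end-to-end formulation possible.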

A brief introduction to reinforcement learning

We spend a lot of time on supervised learning when we discuss ML with students, but I find reinforcement learning just as interesting and useful.

I developed a talk on reinforcement learning for the high school participants in SAIL ON, the Stanford AI Lab's year-round diversity and high school outreach program, which I initiated and led as a follow-on to SAILORS, the lab's intensive two-week AI summer camp.

We discuss how reinforcement learning works, how to make decisions given Bayesian bounds, touchstone RL problems and recent applications, and where RL tends to succeed and fail.
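
As a supplement (not part of the original talk), the core loop of value-based RL fits in a few lines. The sketch below is my own illustration: tabular Q-learning with epsilon-greedy exploration on a five-state corridor, where the agent earns a reward of 1 for reaching the rightmost state. The environment, hyperparameters, and variable names are all invented for the example.

```python
import random

# A tiny corridor world: states 0..4, start at 0, reward 1 for reaching state 4.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # step left, step right

def step(state, action):
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

Q = [[0.0, 0.0] for _ in range(N_STATES)]      # Q[state][action]
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda i: Q[state][i])
        nxt, reward, done = step(state, ACTIONS[a])
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        Q[state][a] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][a])
        state = nxt

print([[round(q, 2) for q in row] for row in Q])  # "step right" wins in every state
```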

A brief introduction to geographic analysis

Making mistakes in geographic analysis is disturbingly easy. The “Intro to Geographic Analysis” materials briefly discuss computational representations of geographic data, then delve into potential gotchas: spatial databases, hexagonal partitioning, avoiding analysis directly on lat-longs, choosing appropriate graphical formats, and more.
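
To illustrate the lat-long point: latitude and longitude are angles on a sphere, not planar coordinates, so Euclidean distance computed directly on degrees drifts further from the truth the farther you get from the equator. The sketch below is my own illustration (not from the original materials), contrasting the naive calculation with a haversine great-circle distance; the example cities and the per-degree constant are just for demonstration.

```python
import math

def naive_euclidean_km(lat1, lon1, lat2, lon2):
    # Treats degrees as if they were planar coordinates -- a common gotcha.
    # A degree of longitude shrinks toward the poles, so east-west distances
    # away from the equator come out badly overstated.
    return math.hypot(lat2 - lat1, lon2 - lon1) * 111.32  # ~km per degree at the equator

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance on a spherical Earth (radius ~6371 km).
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

# Helsinki to Stockholm: a roughly east-west hop at about 60 degrees north.
print(naive_euclidean_km(60.17, 24.94, 59.33, 18.07))  # ~770 km, badly overstated
print(haversine_km(60.17, 24.94, 59.33, 18.07))        # ~400 km, about right
```

For real work, project to an appropriate planar coordinate system (or use a geodesic library) before measuring distances or areas; the point of the sketch is only how large the error can get.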

Hybrid Word-Character Neural Machine Translation for Arabic

My final project for Stanford CS 224N was on hybrid word-character machine translation for Arabic.

Traditional models of neural machine translation make the false-but-true-in-English assumption that words are essentially equivalent to units of meaning. Morphologically rich languages violate this assumption. We implement, in TensorFlow, a hybrid translation model that backs off from unknown words to a representation built from their constituent characters; we apply the model to Arabic translation and approach state-of-the-art performance for Arabic within the weeks allotted for a class project.
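
To sketch the backoff idea: the model keeps a word-level embedding table for frequent words, and when it encounters the unknown-word token it instead composes an embedding from the word's characters with a character-level LSTM. The snippet below is an illustration in PyTorch for brevity (the project itself was implemented in TensorFlow); the class name CharBackoffEmbedding, the dimensions, and the toy ids are invented for the example.

```python
import torch
import torch.nn as nn

class CharBackoffEmbedding(nn.Module):
    """Look up frequent words in a word embedding table; compose embeddings
    for unknown words from their characters with a character-level LSTM."""
    def __init__(self, word_vocab, char_vocab, dim, unk_id):
        super().__init__()
        self.unk_id = unk_id
        self.word_emb = nn.Embedding(word_vocab, dim)
        self.char_emb = nn.Embedding(char_vocab, dim)
        self.char_rnn = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch,)   char_ids: (batch, max_word_length)
        from_words = self.word_emb(word_ids)
        _, (h, _) = self.char_rnn(self.char_emb(char_ids))
        from_chars = h[-1]                        # final hidden state per word
        is_unk = (word_ids == self.unk_id).unsqueeze(-1)
        return torch.where(is_unk, from_chars, from_words)

# Toy usage: word id 0 is the unknown token, so the second word falls back to characters.
emb = CharBackoffEmbedding(word_vocab=10000, char_vocab=40, dim=64, unk_id=0)
word_ids = torch.tensor([17, 0])                  # one known word, one unknown
char_ids = torch.randint(1, 40, (2, 8))           # character ids for each word
print(emb(word_ids, char_ids).shape)              # torch.Size([2, 64])
```

For a morphologically rich language like Arabic, the payoff is that inflected or rare surface forms still get a meaningful representation instead of a single shared unknown-word vector.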

Microservices for a non-technical audience

In the wake of the 2015/2016 microservice hype, my tech-adjacent leadership struggled to understand “what a microservice is”, and whether they should push their organizations to transition to microservices (or allow their engineers to push them toward that end). Upon request, I gave this talk on microservices for non-technical audiences to distill the hype into tangible wisdom and practical advice.