Technical and non-technical professionals, academic audiences, K-12 students, hobbyist groups — knowledge sharing benefits everyone.

Professionals

A brief introduction to convolutional neural networks for computer vision
Convolutional neural networks transformed computer vision from “extremely hard” to “trivially achievable after a few weeks of coursework” between 2012 and 2016. Prepared for technical professional audiences, this talk describes how neural networks extend linear classification, the intuitions behind why convolutional neural networks work well for vision, and when they’re worth considering. [April 2016, repeated August 2016]

A brief introduction to geographic analysis
Making mistakes in geographic analysis is disturbingly easy. We briefly discuss computational representations of geographic data, and then we delve into potential gotchas — from spatial databases to hexagonal partitioning, from avoiding analysis on lat-longs to choosing appropriate graphical formats, and more. [April 2017]
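One of the gotchas above, avoiding analysis on raw lat-longs, can be made concrete with a sketch: a degree of longitude shrinks toward the poles, so treating latitude/longitude as planar x/y coordinates overstates east-west distances. A minimal illustration (the standard haversine formula; the coordinates are arbitrary examples):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# One degree of longitude spans ~111 km at the equator but only
# ~56 km at 60°N — a naive planar treatment misses this entirely.
equator = haversine_km(0.0, 0.0, 0.0, 1.0)
arctic = haversine_km(60.0, 0.0, 60.0, 1.0)
```

This is why the talk recommends projecting to an appropriate coordinate reference system (or using a spatial database's geography types) before measuring anything.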

Microservices for a non-technical audience
In the wake of 2015/2016 microservice hype, tech-adjacent leadership struggled to understand “what a microservice is” and whether they should push their organizations to transition to microservices (or allow their engineers to push them toward that end). This talk, prepared upon request, distills the hype into tangible wisdom and practical advice. [January 2017]

Comfort, distress and dominance: Reading body language
Body language can indicate state of mind. Being familiar with body language tells can help people read a room, avoid selling past the close in a negotiation, and become more self-aware. This short orientation was written & delivered as part of a non-technical skills development series within an established team; it was framed as three 2-3 minute topic introductions followed by 5-10 minutes of small group moderated discussion. [May 2022]

Researchers

End-to-end neural networks for subvocal speech recognition
We describe the first approach toward end-to-end, session-independent subvocal automatic speech recognition from involuntary facial and laryngeal muscle movements detected by surface electromyography. We leverage character-level recurrent neural networks and the connectionist temporal classification loss (CTC). We attempt to address challenges posed by a lack of data, including poor generalization, through data augmentation of electromyographic signals, a specialized multi-modal architecture, and regularization. We show results indicating reasonable qualitative performance on test set utterances, and describe promising avenues for future work in this direction. [June 2017; Stanford CS 224S]
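The CTC setup mentioned above can be summarized by its decoding step: the network emits one label (or a blank) per signal frame, and decoding collapses repeated labels and drops blanks to recover the utterance. A minimal sketch of that collapse rule (greedy decoding only; the frame labels shown are invented for illustration):

```python
def ctc_greedy_decode(frame_labels, blank="_"):
    """Collapse a per-frame label sequence into an output string:
    merge consecutive repeats, then drop blanks (the CTC collapse rule)."""
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return "".join(out)

# Per-frame argmax labels from a hypothetical model:
ctc_greedy_decode(list("__hh_e_lll_llo__"))  # -> "hello"
```

The blank symbol is what lets CTC represent doubled letters: "ll" survives only because a blank separates the two runs of "l".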

Hybrid Word-Character Neural Machine Translation for Arabic (poster)
Traditional models of neural machine translation make the false-but-true-in-English assumption that words are essentially equivalent to units of meaning. Morphologically rich languages disobey this assumption. We implement a hybrid translation model that backs off unknown words to a representation created by modeling their constituent characters in TensorFlow, we apply the model to Arabic translation, and approach state-of-the-art performance for Arabic over the weeks allotted for a class project. [March 2017; Stanford CS 224N]

Automatically Assessing Integrative Complexity
Integrative complexity is a construct from political psychology that measures semantic complexity in discourse. Although this metric has been shown useful in predicting violence and understanding elections, it is very time-consuming for analysts to assess. We describe a theory-driven automated system that improves the state-of-the-art for this task from Pearson’s r = 0.57 to r = 0.73 through framing the task as ordinal regression, leveraging dense vector representations of words, and developing syntactic and semantic features that go beyond lexical phrase matching. Our approach is less labor-intensive and more transferable than the previous state-of-the-art for this task. [June 2016; Stanford CS 224U; ICGauge on github]

Automatic Sign Language Identification (poster)
Only recently has automatic processing of sign languages become able to advance beyond the toy problem of fingerspelling recognition. In just the last few years, we have leaped forward in our understanding of sign language theory, effective computer vision practices, and large-scale availability of data. This project achieves better-than-human performance on sign language identification, and it [releases a dataset][10] and benchmark for future work on the topic. It is intended as a precursor to sign language machine translation. [March 2016; Stanford CS 231N]

K-12 students

A brief introduction to reinforcement learning (high school AI audience)
Usually we talk about supervised learning, but reinforcement learning is just as interesting and useful. We discuss how reinforcement learning works, how to make decisions given Bayesian bounds, touchstone RL problems and recent applications, and where RL tends to succeed and fail. This material was developed for high school participants in SAIL ON, the year-round diversity and high school outreach program of the Stanford AI Lab that I initiated and led, which follows the SAILORS intensive two-week AI summer camp. [May 2017]
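Among the touchstone RL problems the talk covers, the multi-armed bandit is the simplest place to see the explore/exploit trade-off. A minimal sketch of epsilon-greedy action selection (the arm means and hyperparameters are invented for illustration):

```python
import random

def epsilon_greedy_bandit(true_means, steps=10000, epsilon=0.1, seed=0):
    """Multi-armed bandit: with probability epsilon pull a random arm
    (explore); otherwise pull the arm with the highest estimated value
    (exploit). Returns the running-mean reward estimate per arm."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    estimates = [0.0] * len(true_means)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_means))  # explore
        else:
            arm = max(range(len(true_means)), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean
    return estimates

estimates = epsilon_greedy_bandit([0.1, 0.5, 0.9])
# After enough pulls, the highest estimate belongs to the best arm.
```

Even this toy captures why pure exploitation fails: without the epsilon of random exploration, an unlucky first pull can lock the agent onto a worse arm forever.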

My First AI (or: Decision Trees & Language Modeling for Middle Schoolers) (middle school audience)
This keynote address at Byte Sized, a workshop for middle school girls spearheaded by SAIL ON students, solidified the basics of artificial intelligence and the if/else statements taught the previous day. The talk introduces the language identification problem within AI, teaches about decision trees, and then asks students to write decision trees in small groups to distinguish between Hmong, Balinese, Zulu, and other languages. After a debrief on why computers might be more effective than human-written rules, it briefly ties in themes of feature extraction and gradient descent via GBMs. [November 2016]
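The kind of hand-written decision tree the students build can be sketched as a chain of if/else checks on surface features of the text. This toy version uses languages with easy orthographic cues rather than the talk's Hmong/Balinese/Zulu, and the rules are illustrative, not a serious language identifier:

```python
def guess_language(sentence):
    """A toy decision tree for language identification:
    each branch checks one surface feature of the text."""
    text = sentence.lower()
    if "¿" in text or "ñ" in text:  # inverted question mark / eñe
        return "Spanish"
    elif "ß" in text:  # eszett
        return "German"
    elif "the" in text.split():  # very common English function word
        return "English"
    else:
        return "not sure"

guess_language("¿Dónde está la biblioteca?")  # -> "Spanish"
```

The debrief writes itself from here: hand-picked rules are brittle ("the" appears in other languages too), which motivates learning features and thresholds from data instead.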

Modeling with Naive Bayes (high school AI audience)
Like other machine learning methods, Naive Bayes is a generic approach to learning from specific data such that we can make predictions in the future. Whether the application is predicting cancer, deciding whether you’ll care about an email, or forecasting who will win an election, we can use the mathematics of Naive Bayes to connect a set of input variables to a target output variable. (Of course, some problems are harder than others!) This material was developed for high school participants in SAIL ON, the year-round diversity and high school outreach program of the Stanford AI Lab that I initiated and led, which follows the SAILORS intensive two-week AI summer camp. [October 2016]

American Sign Language Basics and Theory (high school linguistics audience)
For theorists, American Sign Language is fascinating — the phonology is not centered in the vocal tract, nonmanual markers and other features present a sort of simultaneity not seen in English, and formal study of ASL is so new that its grammar and theoretical linguistics are still debated. With regard to linguistic creativity, ASL culture contains a variety of art forms. This material was developed for the linguistics club at Montgomery Blair High School outside Washington, DC. The first meeting focused on basic ASL linguistics and being an educated world citizen, whereas the second focused on ASL’s grammatical use of space and theoretical linguistics. [April and December 2013]

Hobbyists

Linguistics and Amateur Radio
How do subconscious rules govern oral language use and perception? How can that knowledge improve emergency communication? The linguistic frameworks discussed are universally relevant, and the take-aways are targeted to an amateur radio operator audience. The intent is to demystify universal aspects of radio communication through discussion of the science of phonetics and phonology. [September 2012]