Tag: talks-to-professionals

NVIDIA GTC 2024

Some takeaways from GTC 2024:

Vendors

Vendors with noticeably busy booths:

  • Weights and Biases (busy despite having multiple booths): experimentation platform, focused on managing experiments wrt hyperparameter optimization; their product and go-to-market teams seem top-notch
  • CoreWeave: a cloud vendor specializing in GPU compute
  • Twelve Labs: multimodal AI (video)
  • Unstructured: tooling to prep terrible enterprise data into JSON

Some legacy vendors are holding on and trying to pivot.

The glamor of the expo area is real (GTC this year is what everyone else’s conference aspires to be).

There were a lot of dog-sized robots in varying form factors for varying purposes – pizza delivery, lawn mowing, dancing, etc. I didn’t see applications of those robots that made much sense to me.

Best practice software

  • Zephyr: an excellent current smallish open source LLM assistant; it will probably be surpassed in a month, but it’s worth keeping an eye on the creators and the papers they note. HuggingFaceH4, the group that put together Zephyr (and others), is a good place to check for papers worth discussing
  • Deita: open source project for instruction tuning; it includes a scoring estimator that we might want to investigate for quality evaluations at scale
  • Eleuther LM Eval Harness: quantifies the performance of LLMs; it’s the evaluation backend behind the HuggingFace Open LLM Leaderboard, and you can define your own custom prompts and evaluation metrics (a minimal usage sketch follows this list)
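
To make the eval harness bullet concrete, here is a minimal sketch of a programmatic run. I’m assuming the newer v0.4-style simple_evaluate entry point and using Zephyr as the example model; the exact API and task names have shifted across releases, so treat this as illustrative rather than canonical.

    # Minimal sketch of running the Eleuther LM Eval Harness programmatically.
    # Assumes a recent (v0.4-style) release of lm-evaluation-harness ("pip install lm-eval");
    # older releases used a different entry point, so adjust as needed.
    import lm_eval

    results = lm_eval.simple_evaluate(
        model="hf",                                    # HuggingFace transformers backend
        model_args="pretrained=HuggingFaceH4/zephyr-7b-beta",
        tasks=["hellaswag", "arc_easy"],               # built-in tasks; custom tasks/prompts live in per-task configs
        num_fewshot=0,
        batch_size=8,
    )

    print(results["results"])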

Our talk

The “GPU Power Play” talk with PJ went well. We filled about half the seats in a Salon (maybe on the order of 50 people); people filed into the room as we talked, and once they arrived they didn’t leave. They took lots of pictures of the slides. About half a dozen people queued up to talk to us after we finished.

We heard from someone interested in putting security defenses onto the compute hardware and into the storage layer. This seems extremely enticing, though I’m not sure how you get the speed while also getting the security. Perhaps the play is to satisfice on quality, rely on capability improvements over time, and expect that people are willing enough (or regulated enough) to pay twice over for security? Having not heard the pitch, I’m not sure how you’d avoid paying a direct fee for the hardware/storage and then paying again in reduced specs.

Giving talks

I went to one talk that had an extremely effective title & talk (which I won’t name because although I was impressed, my analysis could be construed as critical). I came away satisfied even though the content turned out halfway through to be marketing fluff. In essence, the speaker framed a ridiculous problem with “wouldn’t it be amazing if it worked” – and then they showed that it doesn’t work, and they got a laugh – and then they showed that it kinda does work, and they got knowing nods – and then they shifted to sales. It was brilliantly executed.

It started out looking like a practically-minded, semi-academic presentation, but it gracefully turned into a product demo. The room was standing room only, and the crowd didn’t thin out, so I wasn’t the only one who was drawn in, recognized the bait-and-switch, and still appreciated it. The speaker seemed to have minimal experience with deep learning (e.g., they expressed surprise that keeping the first few layers of a DL model and dropping the rest works better than keeping only the top few layers and dropping the layers that actually connect to the embeddings). Even without deep experience, though, they turned a genuinely silly technical idea into great insights for product and for marketing. They’re clearly very good in their role.

What worked well at attracting me in this and other talks:

  • Select a title that suggests a technical capability that a lot of people would love to achieve (without mentioning the solution you’re marketing).
  • Gradually (very gradually) ease into your solution. Prepare a technical quasi-academic talk, with data selection and approach and challenges well motivated. Name-drop your solution about 25-33% through, but then go right back to the “how to” technical talk. In the final third, transition to a product demo that builds on the tangible, potentially-nonsensical-but-wonderful-if-true example you’ve been fleshing out.
  • Refer to many projects and people, and do so authoritatively.
  • Orient the audience to an entire topic that might otherwise be unfamiliar, so that they do benefit technically from the talk. Perhaps the topic is your overall framing, or perhaps it’s a cross-cutting challenge that your problem lets you talk about (performing simulations can fit here), or perhaps it’s particularly useful datasets and tooling – or perhaps it’s multiple topics.
  • Pre-record the screencast of the demo – but talk over the screencast at the event. This gives you flexibility in the talk track, and it keeps the energy in the room under your live control.

LLMs for text-to-nontext

The only “cool” application I happened to see, for which I could immediately see the value unfold like the glory of Jensen Huang before me, was in autonomous vehicle simulation. NVIDIA is using an LLM to generate non-language synthetic data such as autonomous vehicle scenes (they have a programming language that represents the layers in a scene, and distinct AI models for each layer: one can produce maps of intersections, another can add objects like cars to those maps, and so on). With this text-to-scene platform, their users can chat with an LLM to generate synthetic scenes that focus on edge cases that rarely occur in real data, and they can also produce synthetic data from historic accident reports that focus on the tricky situations. Currently about 20% of their data is synthetic. NVIDIA gave a lot of talks about this idea at GTC, so they’re selling it hard – but it’s also pretty interesting. (Much more interesting than a robot with a leaf blower strapped on it, I’d say.)
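
To make the layered idea concrete, here is a toy sketch of chaining per-layer generators behind a text prompt. None of these class names or interfaces come from NVIDIA’s platform; it’s purely an illustration of the “one model per scene layer” structure described above.

    from dataclasses import dataclass, field

    # Toy illustration only: a hypothetical layered scene representation where each
    # layer is produced by its own model, conditioned on the layers beneath it.
    # These classes do not reflect NVIDIA's actual scene language or APIs.

    @dataclass
    class Scene:
        prompt: str
        layers: dict = field(default_factory=dict)

    class MapModel:
        def generate(self, prompt: str) -> dict:
            # Placeholder for a model that lays out road geometry (e.g., an intersection).
            return {"roads": ["4-way intersection"], "lanes": 4}

    class ObjectModel:
        def generate(self, prompt: str, road_map: dict) -> list:
            # Placeholder for a model that places agents onto the generated map.
            return [{"type": "car", "lane": 2}, {"type": "cyclist", "lane": 1}]

    def text_to_scene(prompt: str) -> Scene:
        """Chain layer-specific generators: map first, then objects conditioned on it."""
        scene = Scene(prompt=prompt)
        scene.layers["map"] = MapModel().generate(prompt)
        scene.layers["objects"] = ObjectModel().generate(prompt, scene.layers["map"])
        return scene

    if __name__ == "__main__":
        # An edge-case prompt of the kind the LLM front end would translate into scene layers.
        scene = text_to_scene("unprotected left turn at dusk with an occluded cyclist")
        print(scene.layers)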

Other thoughts

Our interest in foundation models for everything – language, images, autonomous vehicles, etc. – is in direct conflict with Rich Hickey’s idea of simple vs. complected systems. I suspect we as an industry may find ourselves tick-tocking back out to interpretability again in the near-to-medium future.

“Digital twin” is part of the marketing zeitgeist/cliche verbiage; we haven’t been using this phrase so often at Palo Alto Networks. I’m not recommending we start.

There were a lot of vendors focused on GPU compute, on fully utilizing that compute, and on enabling AI researchers to train highly effective models – but there didn’t seem to be many people actually doing the work of training those low-loss models with massive hyperparameter experimentation. There was a lot of interest in anyone who indicated they might be successfully training interesting/useful models (but no one actually was). The companies with large research wings using a ton of compute that did attend (e.g., Google) were represented by their B2B product arms, not their AI researchers.

It seemed like there were two kinds of talks and participants: (1) interesting technical work on very specific bespoke problems whose overall necessity is questionable – e.g., optimizing GPU memory so you can run huge contexts, which was a nice talk from the US Navy, though it’s not clear to me that anyone actually needs huge LLM contexts, and there were a number of these technically-interesting-but-questionable-value talks – and (2) basic application work in a variety of domains – e.g., models that try to detect something like floods in imagery, copilot work, or use of generative AI within a general stack. This division between “highly applied but very basic” and “technically interesting but limited usefulness” is a tension I just cannot shake.

I would be comfortable giving a talk again; I think that was a reasonable trade-off of time for value (especially given material re-use). I wouldn’t attend GTC for more than that same day unless we were connecting deliberately with people across the industry for some purpose and had done a lot of pre-work toward that end.

So overall, it was a useful conference this year to me primarily in terms of clocking AI imposter syndrome, figuring out some new stealth marketing patterns for talks, and getting a feel for where the zeitgeist is pointing in terms of vendors and open source projects.

My stealth recruiting pitch

I have a stealth recruiting talk that I give for machine learning at Xpanse. It goes like this:

  • If you want a real mission in your work, cybersecurity can deliver.
  • My realm of cybersecurity is impossible without AI.
  • Doing this job means solving cool, hard problems.

I pretend the talk is all very objective and about teaching you stuff (which hopefully it also does). I also hint at a lot of the problems that I’ve worked on and solved over the past few years. Technical people really like being shown problems and getting to chew on them (which is convenient, because I’m not comfortable publicly sharing how I solved these problems!). And I’m sure it helps that I’m earnest – the slides really do show what I love most about my job. The talk works, too: we get solid applicants from it.

I was inspired by a talk I saw from Stitch Fix a number of years ago. I have really minimal interest in clothing… but after hearing them give a technical talk about the problems they were solving, I became convinced they were doing really neat modeling and would be worth considering in a job hunt. Pretty effective. So I tried to channel that insight.

The slides I’m linking here conclude with a harder sell than I usually give, as well as some cross-team Palo Alto projects, because I revised this version for an explicitly recruiting context. (One of our senior recruiters had seen me give this talk at the Lesbians Who Tech conference, and he asked me to give it again in a different context.)

Comfort, distress and dominance: Reading body language

Body language can indicate state of mind. Being familiar with body language tells can help people read a room, avoid selling past the close in a negotiation, and become more self-aware. I wrote and delivered a short orientation, Comfort, distress and dominance: Reading body language, as part of a non-technical skills development series within an established team. It is framed as three 2-3 minute topic introductions followed by 5-10 minutes of moderated small-group discussion.

Introducing microservices to students in Stanford CS 110

Ryan Eberhardt invited Xpanse to give a guest lecture on the last day of a summer session of Stanford CS 110, in that gap between real coursework and the final.

CS 110 is the second course in Stanford’s systems programming sequence. I loved taking it as a student. I loved CS 110 so much that I TAed it twice, even though it’s a really tough course to TA (the students are zillions of new undergrads, there are a lot of assignments to give feedback on, and the material is hard enough for them that office hours consist of a never-ending queue of students with questions). My professor Jerry Cain gave me an award for my TA work, so hopefully I did okay by them.

For this re-visit to CS 110, I introduced microservices, containerization, and orchestration. I gave the orientation: why they should care and who we were. Then two sharp coworkers talked about their daily tech of port scanning and functional programming. I concluded the lecture by hinting at the problems solved by Docker and Kubernetes (and the problems they create), and I asked leading questions that extended some of the core ideas in CS 110: decoupling of concerns, each worker does one thing, pools of workers share a single point of entry, and request/response models.
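
As a toy illustration of the “pool of workers behind a single point of entry” idea I was nudging the students toward (my sketch here, not anything from the lecture):

    from concurrent.futures import ThreadPoolExecutor

    # Toy sketch: a single dispatch function is the one point of entry, and a pool
    # of identical workers handles each request. It is the same decoupling CS 110
    # teaches with processes and threads, lifted one level up the stack.

    def handle_request(request: dict) -> dict:
        # Each worker does one thing: turn a request into a response.
        return {"id": request["id"], "status": "ok", "echo": request["payload"]}

    def serve(requests: list[dict], pool_size: int = 4) -> list[dict]:
        # Callers hit the pool through this single entry point; they never pick a worker.
        with ThreadPoolExecutor(max_workers=pool_size) as pool:
            return list(pool.map(handle_request, requests))

    if __name__ == "__main__":
        responses = serve([{"id": i, "payload": f"ping {i}"} for i in range(8)])
        print(responses)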

A brief introduction to geographic analysis

Making mistakes in geographic analysis is disturbingly easy. The “Intro to Geographic Analysis” materials briefly discuss computational representations of geographic data. Then I delve into potential gotchas — from spatial databases to hexagonal partitioning, from avoiding analysis on lat-longs to choosing appropriate graphical formats, and more.
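
As one concrete example of the lat-long gotcha (my illustration here, not an excerpt from the materials): a degree of longitude covers less ground the farther you get from the equator, so doing planar distance math directly on lat-long coordinates silently distorts results, while great-circle distance does not.

    import math

    # Illustrative only: why distance math directly on lat-longs is a gotcha.
    # One degree of longitude shrinks with latitude, so "Euclidean distance in degrees"
    # has no consistent physical meaning; great-circle (haversine) distance does.

    def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
        """Great-circle distance between two (lat, lon) points in kilometers."""
        phi1, phi2 = math.radians(lat1), math.radians(lat2)
        dphi = math.radians(lat2 - lat1)
        dlmb = math.radians(lon2 - lon1)
        a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
        return 2 * radius_km * math.asin(math.sqrt(a))

    if __name__ == "__main__":
        # Two pairs of points, each separated by "1.0 degree" of longitude in raw coordinates:
        print(haversine_km(0.0, 0.0, 0.0, 1.0))    # ~111 km at the equator
        print(haversine_km(60.0, 0.0, 60.0, 1.0))  # ~56 km at 60 N: same degree gap, half the ground distance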