At NeurIPS, what’s old is new again

May 30, 2024

The current excitement around large language models is just the latest aftershock of the deep-learning revolution that started in 2012 (or maybe 2010), but Columbia professor and Amazon Scholar Richard Zemel was there before the beginning. As a PhD student at the University of Toronto in the late ’80s and early ’90s, Zemel wrote his dissertation on representation learning in unsupervised machine learning systems for Geoffrey Hinton, one of the three “godfathers of deep learning”.

Richard Zemel, a professor of computer science at Columbia University, an Amazon Scholar, and a member of the advisory board of the Conference on Neural Information Processing Systems (NeurIPS).

Zemel is also on the advisory board of the main conference in the field of deep learning, the Conference on Neural Information Processing Systems (NeurIPS), which takes place this week. His breadth of experience gives him a rare perspective on the field of deep learning: both how far it’s come and where it’s going.

“It’s come a very long way in some sense, in terms of the scope of problems that are relevant and the whole real-world applicability of it,” Zemel says. “But a lot of the same problems still exist. There are just many more facets than there used to be.”

For example, Zemel says, take the concept of robustness, the ability of a machine learning model to maintain performance when the data it sees at inference time differs from the data it was trained on, because of noise, drift in the data distribution, or the like.

“One of the original neural-net applications was ALVINN, the autonomous land vehicle in a neural network, in the late ’80s,” Zemel says. “It was a neural net that had 29 hidden units, and it was an answer to DARPA’s self-driving challenge. It was a big success for neural nets at the time.

“Robustness came up there because they were worried about the car going off the road, and they didn’t have any training examples of that. They worked out how to augment the data with those kinds of training examples. So thirty years ago, robustness was seen as an important question, and some ideas came up.”

Today, data augmentation remains one of the main ways to ensure robustness. But as Zemel says, the problem of robustness has new facets.

“For instance, we can consider algorithmic fairness as a form of robustness,” he says. “It’s robustness with respect to particular groups. A lot of the methods that are used for that are methods that have also been developed for robustness, and vice versa. For example, they’re formulated as trying to develop a prediction that has some invariance properties. And it could be that you’re not just developing a prediction: in the deep-learning world, you’re trying to develop a representation that has these properties. The final layer of representation should be invariant. Think of multiclass object recognition: anything that has a label of class K should have a very similar kind of distribution over representations, no matter what environment it comes from.”
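One way to write down the invariance property described in the quote is sketched below. The notation is an assumption introduced here for illustration and does not come from Zemel: $x$ is an input, $y$ its label, $e$ and $e'$ are two environments (or groups), and $\phi$ is the network's final-layer representation.

```latex
% Illustrative notation only: \phi, e, e' and K are labels chosen for this sketch.
% The class-conditional distribution of the representation should not depend
% on which environment the example was drawn from.
\[
  p\bigl(\phi(x) \mid y = K,\; e\bigr)
  \;=\;
  p\bigl(\phi(x) \mid y = K,\; e'\bigr)
  \qquad \text{for all environments } e, e' \text{ and every class } K .
\]
```

Read this way, a fairness constraint is the special case in which the environments are the groups across which predictions should behave alike.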

With generative-AI models, Zemel says, evaluating robustness becomes even more difficult. In practice, the most common machine learning model has, until recently, been the classifier, which outputs the probabilities that a given input belongs to each of several classes. One way to gauge a classifier’s robustness is to determine whether its predicted probabilities (its confidence in its classifications) accurately reflect its performance on data. If the model is overconfident, it probably won’t generalize well to new settings.
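The check described above, comparing a classifier's confidence with how often it is actually right, is often summarized as a calibration error. The following is a minimal sketch of one common variant, expected calibration error with equal-width confidence bins. It is not a method attributed to Zemel, and the function name, parameter names, and toy data are all assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: predicted probability of the chosen class, each in [0, 1]
    correct:     booleans, True where the chosen class was the right one
    n_bins:      number of equal-width confidence bins (illustrative default)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    # Group predictions by confidence; a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        # Each bin contributes in proportion to the share of predictions in it.
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Toy data (made up): a model that claims 90% confidence but is right half the time.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                           [True, True, False, False])

# Toy data (made up): a model that claims 75% confidence and is right 75% of the time.
calibrated = expected_calibration_error([0.75, 0.75, 0.75, 0.75],
                                        [True, True, True, False])

print(round(overconfident, 3), round(calibrated, 3))
```

A score near zero means confidence tracks accuracy; a large score, as in the first toy case, is the kind of overconfidence the paragraph above warns about.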

But with generative AI models, there’s no such confidence metric to appeal to.

“If now the system is busy writing sentences, what does the uncertainty mean?” Zemel asks. “How do you talk about uncertainty? The whole question about building robust, properly confident, responsible systems becomes that much harder in the era where generative models are actually working well.”