Lihong Li wins 2023 Seoul Test of Time Award

rockstaryreviews

June 15, 2024

Lihong Li, a senior principal scientist in Amazon Ads, has won the 2023 Seoul Test of Time award for the 2010 paper “A Contextual-Bandit Approach to Personalized News Article Recommendation.” The paper, coauthored by Wei Chu, John Langford, and Robert E. Schapire, introduced an innovative approach to personalized recommendation engines.

The Seoul Test of Time Award “is awarded annually to the author or authors of a paper presented at a previous World Wide Web conference that has, as the name suggests, stood the test of time.”

“The paper tackles an important problem from a novel angle that turned out to be one of the fundamental techniques in the years to come after publication,” said Li. “The paper considers recommendation as a reinforcement learning problem, which was not a popular view at that time.”

Li and his colleagues, who worked at Yahoo! Labs in 2010, introduced a new way of thinking about personalized recommendation engines. The team addressed the challenge of building a personalized recommendation engine that directly maximizes a utility function measuring user satisfaction.

Recommender systems at the time relied on past user activities to provide meaningful recommendations at an individual level. However, the paper notes, “in many web-based scenarios, the content universe undergoes frequent changes, with content popularity changing over time as well. Furthermore, there are new visitors to a website with no historical consumption record.”

“These issues make traditional recommender-system approaches difficult to apply,” the paper states. “It thus becomes indispensable to learn the goodness of match between user interests and content from user interactions, when one or both of them are new.”

Contextual bandits

The paper proposed a contextual-bandit approach to driving personalized recommendations in news content “in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total user clicks.”

“News content changes every hour within the day,” said Li. “That’s why we need a solution to quickly adapt to changing content, and recommend the best content to users.” In doing so, the solution has to balance two competing goals: maximizing user satisfaction now and gathering information about the “goodness of match” between user interest and content. Contextual bandits are a special class of reinforcement learning problems that are well suited to this scenario.
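The balance between serving the best-known article and gathering information can be made concrete with a small example. The following is a minimal sketch in the spirit of the paper's LinUCB algorithm, with one linear model per article. The class and function names, the exploration weight `alpha`, and the simulated click probabilities are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


class LinUCBArm:
    """Ridge-regression statistics for one article (one linear model per article)."""

    def __init__(self, dim):
        self.A = np.eye(dim)    # identity plus the sum of x x^T over shown contexts
        self.b = np.zeros(dim)  # sum of reward * x over shown contexts

    def ucb(self, x, alpha):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b                  # estimated click model
        bonus = alpha * np.sqrt(x @ A_inv @ x)  # larger when the context is unfamiliar
        return float(theta @ x + bonus)

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x


def choose_article(arms, x, alpha=1.0):
    """Pick the article whose upper confidence bound on the click rate is highest."""
    return int(np.argmax([arm.ucb(x, alpha) for arm in arms]))


def simulate(rounds=2000, dim=3, n_articles=4, alpha=1.0, seed=0):
    """Run the bandit against made-up click probabilities; return the click rate."""
    rng = np.random.default_rng(seed)
    true_theta = rng.uniform(0.0, 1.0, size=(n_articles, dim))  # hidden, hypothetical
    arms = [LinUCBArm(dim) for _ in range(n_articles)]
    clicks = 0.0
    for _ in range(rounds):
        x = rng.dirichlet(np.ones(dim))  # toy user context; entries sum to 1
        a = choose_article(arms, x, alpha)
        reward = float(rng.random() < true_theta[a] @ x)  # simulated click
        arms[a].update(x, reward)  # only the article that was shown learns
        clicks += reward
    return clicks / rounds
```

An article that has rarely been shown in a given context receives a large bonus, so it is tried occasionally even when its estimated click rate is low. As feedback accumulates the bonus shrinks and the estimate dominates the choice.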

The paper develops practical contextual-bandit algorithms that optimize user-engagement metrics such as click-through rates, downstream revenue, or other measures of business impact. Li later worked on extending the approach to scenarios in which utility is measured in terms of long-term user engagement.

“In reality, decisions change the behavior of the user and, in turn, change the future way they interact with the website and the future utility,” said Li. “So a system should be able to take these long-term impacts into account and make a decision to maximize long-term utility instead of short-term.”

The authors reported that their “computationally efficient contextual bandit algorithm” not only drove higher click-through rates but also addressed the scaling challenge, because it could be “reliably evaluated offline using previously recorded random traffic.” The evaluation technique itself has also found uses in other web-based scenarios.
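The idea behind offline evaluation on recorded random traffic can be sketched as a replay loop: walk through the log, keep only the events in which the candidate policy would have shown the same article that was actually shown, and average the clicks on those events. The sketch below assumes the logged articles were chosen uniformly at random. The `choose`/`update` policy interface and the `AlwaysShow` baseline are hypothetical names introduced for illustration.

```python
def replay_evaluate(policy, logged_events):
    """Estimate a policy's click-through rate from randomly served logged traffic.

    Each event is a (context, shown_article, clicked) tuple. Events where the
    policy disagrees with the log are skipped, because the log holds no
    feedback for the article the policy wanted to show.
    """
    matched = 0
    clicks = 0
    for context, shown_article, clicked in logged_events:
        if policy.choose(context) == shown_article:
            policy.update(context, shown_article, clicked)  # learn as if live
            matched += 1
            clicks += clicked
    ctr = clicks / matched if matched else 0.0
    return ctr, matched


class AlwaysShow:
    """Hypothetical baseline policy that shows one fixed article to everyone."""

    def __init__(self, article):
        self.article = article

    def choose(self, context):
        return self.article

    def update(self, context, article, clicked):
        pass  # a fixed policy has nothing to learn


# Toy log: (context, article that was shown, 1 if clicked else 0).
toy_log = [("u1", 0, 1), ("u2", 1, 0), ("u3", 0, 0), ("u4", 1, 1)]
ctr, used = replay_evaluate(AlwaysShow(0), toy_log)
```

Because unmatched events are discarded, a log with K equally likely articles yields roughly one usable event in K, so such an evaluation needs a correspondingly large log.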

The path to the prize

Li received a bachelor of engineering in computer science and technology at Tsinghua University in Beijing, then went on to earn a master of science in computing science at the University of Alberta. He earned his PhD in computer science from Rutgers University, working in the area of reinforcement learning.

During his time at Rutgers, Li met two mentors who would later become coauthors on the award-winning paper. Schapire was a Princeton professor on Li’s thesis defense committee, and Langford was Li’s internship mentor at Yahoo! in 2007. In October 2020, Li joined Amazon as a senior principal scientist.

“One thing that attracted me is the customer obsession culture of Amazon that uses solid science technologies and solutions to tackle deep customer questions,” Li said. “Contextual bandits and, more generally, reinforcement learning techniques can help Amazon fulfill customer needs in shopping, entertainment, and beyond, as well as play a key role in improving large language models.”

Li and his colleagues received the Seoul Test of Time Award at the Web Conference 2023 in Austin, Texas.

“I was thrilled, and winning was totally unexpected,” said Li.

The Web Conference (formerly known as the International World Wide Web Conference, abbreviated as WWW) is a yearly international academic conference on the future directions of the World Wide Web, which Tim Berners-Lee first conceived in 1989 at CERN in Geneva.

“Scientists often publish innovation in papers. When the invention stays on paper and doesn’t reach the real world, it doesn’t feel like the story is complete,” Li said. “This award is a recognition that the invention has had a long-lasting impact, not just on the problem we worked on, but also in the field and in other parts of the industry. I’m grateful to be a recipient of the award and am gratified to see that this 13-year-old work continues to be useful.”
