The Machine Learning and Vision Lab (advisor: Prof. Hyunwoo Kim (김현우), Chair of the Department of Computer Science) is hosting an "MLV TechTalk".

Prof. Kwang-Sung Jun of the University of Arizona will introduce Boltzmann exploration, a technique used in reinforcement learning.

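"Boltzmann exploration, which is often used for exploration in reinforcement learning, is in fact known to be far from optimal even in the simplified setting of bandit problems. The talk focuses on correcting it so that it achieves optimal performance, and on its connection to existing algorithms. It is a new algorithm that is expected to play an active role in reinforcement learning."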

All interested students are welcome to attend.



1. Date and time: 2022.07.04 (Mon), 11:00~12:00


2. Speaker: Prof. Kwang-Sung Jun / University of Arizona


3. Title: Maillard Sampling: Boltzmann Exploration Done Optimally for Interactive Machine Learning


4. Abstract

At the heart of interactive machine learning is the ability to determine which actions to take next to maximize information under given constraints. For example, recommender systems like to suggest products to users that will not only result in high click-through rates but also inform the system about the user's preferences, so that it can better serve the user in the long run. For these problems (commonly referred to as 'bandit' problems), the PhD dissertation of Maillard (2013) proposed a lesser-known algorithm that we call Maillard sampling (MS), which can be viewed as a correction to a popular heuristic called Boltzmann exploration. In this talk, we claim that MS is a strong competitor to Thompson sampling, the industry-standard algorithm. We will show that the performance guarantee of MS matches that of Thompson sampling and showcase practical benefits of MS, such as enabling computationally efficient offline evaluation, which has the potential to dethrone Thompson sampling in industry.
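For attendees who have not seen the two sampling rules named in the abstract, the following is a minimal sketch and not the speaker's implementation. It assumes Gaussian rewards with a known standard deviation `sigma`, and it uses the commonly stated form of Maillard sampling, in which an arm is chosen with probability proportional to exp(-N * gap^2 / (2 * sigma^2)), where N is the number of times the arm was pulled and gap is its empirical gap to the best empirical mean. The function names `maillard_probs` and `boltzmann_probs` and the temperature parameter `eta` are hypothetical, chosen only for illustration.

```python
import numpy as np


def maillard_probs(means, counts, sigma=1.0):
    """Arm-selection probabilities under a sketch of Maillard sampling.

    Assumption: Gaussian rewards with known standard deviation `sigma`.
    means  -- empirical mean reward of each arm
    counts -- number of times each arm has been pulled so far
    """
    means = np.asarray(means, dtype=float)
    counts = np.asarray(counts, dtype=float)
    gaps = means.max() - means  # empirical gap to the best arm (0 for the best)
    weights = np.exp(-counts * gaps**2 / (2.0 * sigma**2))
    return weights / weights.sum()


def boltzmann_probs(means, eta=1.0):
    """Arm-selection probabilities under plain Boltzmann (softmax) exploration.

    `eta` is an inverse temperature; pull counts are ignored entirely.
    """
    means = np.asarray(means, dtype=float)
    logits = eta * (means - means.max())  # shift by the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


if __name__ == "__main__":
    means = [1.0, 0.0]
    for n in (1, 10, 100):
        counts = [n, 2 * n]
        print(n, maillard_probs(means, counts), boltzmann_probs(means))
```

In this sketch the Boltzmann probabilities depend only on the empirical means, whereas the Maillard probabilities for a suboptimal-looking arm shrink as that arm accumulates pulls, which is one way to read the "correction" the abstract refers to.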


5. Biography

Kwang-Sung Jun has been an assistant professor in the CS department at the University of Arizona since 2019. His research interest is interactive machine learning, with a focus on bandit problems, online learning, and confidence bounds. Before joining UA, he was a postdoc at Boston University with Dr. Francesco Orabona. Before that, he spent 9 years at the University of Wisconsin-Madison, completing a PhD with Dr. Xiaojin (Jerry) Zhu and a postdoc with Drs. Robert Nowak, Rebecca Willett, and Stephen Wright.