The Machine Learning and Vision Lab (advisor: Prof. Hyunwoo Kim, Chair of the Department of Computer Science) is hosting an "MLV TechTalk".

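Dr. Jin-Hwa Kim of Naver AI Lab will present "a study of a metric that can evaluate text-to-image generation models and image captioning models in a unified way, a new approach that computes and uses cross-mutual information under a Gaussian assumption."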

All interested students are welcome to attend.


1. Date and time: 2022.06.16 (Thu), 16:30~17:30

2. Speaker: Dr. Jin-Hwa Kim / Naver AI Lab

3. Title: Mutual Information Divergence: A Unified Metric for Multimodal Generative Models

4. Abstract:
Text-to-image generation and image captioning have recently emerged as a new experimental paradigm to assess machine intelligence. They predict continuous quantities accompanied by their sampling techniques in the generation, making evaluation complicated and making it intractable to obtain marginal distributions. Based on a recent trend in which multimodal generative evaluations exploit a vision-and-language pre-trained model, we propose the negative Gaussian cross-mutual information using the CLIP features as a unified metric, which we call Mutual Information Divergence (MID). To validate it, we extensively compare it with competing metrics using carefully generated or human-annotated judgments in text-to-image generation and image captioning tasks. The proposed MID significantly outperforms the competing methods in consistency across benchmarks, sample parsimony, and robustness toward the exploited CLIP model. We look forward to seeing the underrepresented implications of the Gaussian cross-mutual information in multimodal representation learning and future work based on this novel proposition.

5. Biography:
Jin-Hwa Kim has been a Research Scientist at NAVER AI Lab since August 2021. He has been studying multimodal deep learning (e.g., visual question answering), multimodal generation, ethical AI, and other related topics. In 2018, he received a Ph.D. degree from Seoul National University under the supervision of Professor Byoung-Tak Zhang for his work on "Multimodal Deep Learning for Visually-grounded Reasoning." In September 2017, he received a 2017 Google Ph.D. Fellowship in Machine Learning. He also received a Ph.D. Completion Scholarship from Seoul National University and was a runner-up in the VQA Challenge 2018 at the CVPR 2018 VQA Challenge and Visual Dialog Workshop. He was a Research Intern at Facebook AI Research (Menlo Park, CA, USA), mentored by Yuandong Tian, Devi Parikh, and Dhruv Batra, from January to May 2017. He worked for SK Telecom (August 2018 to July 2021) and SK Communications (January 2011 to October 2012).

6. Join via Zoom: