
Korea University, Artificial Intelligence Applications Major


Notice

Title: [MLV TechTalk] Dr. Jin-Hwa Kim, Naver AI Lab (Thursday, June 16, 2022)
Content: The Machine Learning and Vision Lab (advisor: Prof. Hyunwoo Kim, Chair of the Department of Computer Science) is hosting an "MLV TechTalk." Dr. Jin-Hwa Kim of Naver AI Lab will present research on a metric that can jointly evaluate text-to-image generation models and image captioning models, a new approach that computes and uses cross-mutual information under a Gaussian assumption. All interested students are welcome to attend.

1. Date and time: Thursday, June 16, 2022, 16:30-17:30

2. Speaker: Dr. Jin-Hwa Kim, Naver AI Lab

3. Title: Mutual Information Divergence: A Unified Metric for Multimodal Generative Models

4. Abstract: Text-to-image generation and image captioning have recently emerged as a new experimental paradigm to assess machine intelligence. They predict continuous quantities accompanied by sampling techniques during generation, which complicates evaluation and makes the marginal distributions intractable. Based on a recent trend in which multimodal generative evaluations exploit a vision-and-language pre-trained model, we propose the negative Gaussian cross-mutual information using CLIP features as a unified metric, which we call Mutual Information Divergence (MID). To validate it, we extensively compare it with competing metrics using carefully generated or human-annotated judgments in text-to-image generation and image captioning tasks. The proposed MID significantly outperforms the competing methods in consistency across benchmarks, sample parsimony, and robustness to the choice of CLIP model. We look forward to seeing the underrepresented implications of the Gaussian cross-mutual information in multimodal representation learning and future work based on this novel proposition.

5. Biography: Jin-Hwa Kim has been a Research Scientist at NAVER AI Lab since August 2021. He studies multimodal deep learning (e.g., visual question answering), multimodal generation, ethical AI, and related topics. In 2018, he received a Ph.D. from Seoul National University under the supervision of Professor Byoung-Tak Zhang for his work on "Multimodal Deep Learning for Visually-grounded Reasoning." He received a 2017 Google Ph.D. Fellowship in Machine Learning in September 2017, as well as a Ph.D. Completion Scholarship from Seoul National University and runner-up honors in the VQA Challenge 2018 at the CVPR 2018 VQA Challenge and Visual Dialog Workshop. He was a Research Intern at Facebook AI Research (Menlo Park, CA, USA) from January to May 2017, mentored by Yuandong Tian, Devi Parikh, and Dhruv Batra. He previously worked for SK Telecom (August 2018 to July 2021) and SK Communications (January 2011 to October 2012).

6. Zoom: https://korea-ac-kr.zoom.us/j/88253749184?pwd=M1BQQ3duVUtrVmcrOE5wKy9QOWM2Zz09
   Meeting ID: 882 5374 9184
   Passcode: x1pyaJk02R
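
For attendees who want some background before the talk, the abstract's central quantity builds on mutual information under a Gaussian assumption. The sketch below is not the speaker's MID metric. It only illustrates the standard closed form for jointly Gaussian variables, I(X; Y) = 0.5 * (log det Σx + log det Σy - log det Σ), estimated from paired feature samples. The function name `gaussian_mutual_information` and the small ridge term `eps` are assumptions made for this illustration.

```python
import numpy as np


def gaussian_mutual_information(x, y, eps=1e-6):
    """Estimate I(X; Y) in nats, assuming (X, Y) is jointly Gaussian.

    x: array of shape (n_samples, dim_x)
    y: array of shape (n_samples, dim_y)
    eps: small ridge added to the covariance for numerical stability
         (an illustrative choice, not taken from the talk).
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    dim_x = x.shape[1]

    # Sample covariance of the concatenated features [x, y].
    joint = np.hstack([x, y])
    cov = np.cov(joint, rowvar=False)
    cov = cov + eps * np.eye(cov.shape[0])

    # Log-determinants of the two marginal blocks and the joint covariance.
    _, logdet_x = np.linalg.slogdet(cov[:dim_x, :dim_x])
    _, logdet_y = np.linalg.slogdet(cov[dim_x:, dim_x:])
    _, logdet_joint = np.linalg.slogdet(cov)

    return 0.5 * (logdet_x + logdet_y - logdet_joint)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.standard_normal((20000, 2))
    x = z[:, :1]
    y = 0.8 * x + 0.6 * z[:, 1:]  # correlation with x is 0.8 by construction
    print(gaussian_mutual_information(x, y))
```

For two scalar Gaussians with correlation ρ, the exact value is -0.5 * log(1 - ρ²), so the synthetic example should print a value close to 0.51 nats. Applying the same estimator to image and text embeddings (for example, CLIP features, as the abstract mentions) would require extracting those features first, which is not shown here.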