Awesome-Visual-Dialog

Problem Statement

Visual Dialog: teaching machines to hold natural-language conversations with humans about images.

Dataset

-- Dialog Dataset: MS-COCO and QA Dataset: Visual Dialog

Visual Dialog Papers

2019

  • Patro, Badri N., Anupriy, and Vinay P. Namboodiri. "Probabilistic framework for solving Visual Dialog." Pattern Recognition (2020). paper

  • Niu, Yulei, Hanwang Zhang, Manli Zhang, Jianhong Zhang, Zhiwu Lu, and Ji-Rong Wen. "Recursive visual attention in visual dialog." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6679-6688. 2019. paper

  • Kang, Gi-Cheon, Jaeseo Lim, and Byoung-Tak Zhang. "Dual Attention Networks for Visual Reference Resolution in Visual Dialog." arXiv preprint arXiv:1902.09368 (2019). paper


2018

  • Jain, Unnat, Svetlana Lazebnik, and Alexander G. Schwing. "Two can play this game: visual dialog with discriminative question generation and answering." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5754-5763. 2018. paper

  • Kodra, Lorena, and Elinda Kajo Meçe. "Multimodal Attention Agents in Visual Conversation." In International Conference on Emerging Internetworking, Data & Web Technologies, pp. 584-596. Springer, Cham, 2018. paper

  • Massiceti, Daniela, N. Siddharth, Puneet K. Dokania, and Philip HS Torr. "FlipDial: A generative model for two-way visual dialogue." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6097-6105. 2018. paper

  • Zhang, Jiaping, Tiancheng Zhao, and Zhou Yu. "Multimodal Hierarchical Reinforcement Learning Policy for Task-Oriented Visual Dialog." arXiv preprint arXiv:1805.03257 (2018). paper

  • Zhang, Haichao, Haonan Yu, and Wei Xu. "Listen, Interact and Talk: Learning to Speak via Interaction." arXiv preprint arXiv:1705.09906 (2017). paper

  • Shekhar, Ravi, Tim Baumgartner, Aashish Venkatesh, Elia Bruni, Raffaella Bernardi, and Raquel Fernandez. "Ask No More: Deciding when to guess in referential visual dialogue." arXiv preprint arXiv:1805.06960 (2018). paper

2017

  • Das, Abhishek, Satwik Kottur, José MF Moura, Stefan Lee, and Dhruv Batra. "Learning cooperative visual dialog agents with deep reinforcement learning." In Proceedings of the IEEE International Conference on Computer Vision, pp. 2951-2960. 2017. paper

  • Das, Abhishek, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José MF Moura, Devi Parikh, and Dhruv Batra. "Visual dialog." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2. 2017. paper

  • Lu, Jiasen, Anitha Kannan, Jianwei Yang, Devi Parikh, and Dhruv Batra. "Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model." In Advances in Neural Information Processing Systems, pp. 313-323. 2017. paper

  • Chattopadhyay, Prithvijit, Deshraj Yadav, Viraj Prabhu, Arjun Chandrasekaran, Abhishek Das, Stefan Lee, Dhruv Batra, and Devi Parikh. "Evaluating visual conversational agents via cooperative human-AI games." In Fifth AAAI Conference on Human Computation and Crowdsourcing. 2017. paper

  • Strub, Florian, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, and Olivier Pietquin. "End-to-end optimization of goal-driven and visually grounded dialogue systems." 2017. paper

  • Seo, Paul Hongsuck, Andreas Lehrmann, Bohyung Han, and Leonid Sigal. "Visual reference resolution using attention memory for visual dialog." In Advances in Neural Information Processing Systems, pp. 3722-3732. 2017. paper

  • Zhang, Junjie, Qi Wu, Chunhua Shen, Jian Zhang, Jianfeng Lu, and Anton van den Hengel. "Asking the Difficult Questions: Goal-Oriented Visual Question Generation via Intermediate Rewards." arXiv preprint arXiv:1711.07614 (2017). paper

  • Huber, Bernd, Daniel McDuff, Chris Brockett, Michel Galley, and Bill Dolan. "Emotional Dialogue Generation using Image-Grounded Language Models." In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, p. 277. ACM, 2018. paper