SIGVID
The reading list for the Special Interest Group on Visual Information Description
Image Captioning
Level 0
- Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)[ICLR 2015 Oral]
- Sequence to Sequence -- Video to Text[ICCV 2015]
- What value do explicit high level concepts have in vision to language problems?[CVPR 2016]
Level 1
- Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images[ICCV 2015]
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention[ICML 2015]
- DenseCap: Fully Convolutional Localization Networks for Dense Captioning[CVPR 2016 Oral]
- Image Captioning with Deep Bidirectional LSTMs[ACMMM 2016 Oral]
Video Captioning
- Early Embedding and Late Reranking for Video Captioning[ACMMM 2016 Grand Challenge Award]
- Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks[CVPR 2016 Oral]
- Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation[best in MSR Video to Language Challenge]
Visual Question Answering
- VQA: Visual Question Answering[ICCV 2015]
Miscellaneous
- Spatial Transformer Networks[NIPS 2015]
Theories of DNN
- Identifying and attacking the saddle point problem in high-dimensional non-convex optimization[NIPS 2014]
- The loss surfaces of multilayer networks[JMLR 2015]
- On the expressive power of deep neural networks[ML/AI arXiv 2016]
Appendix
Other Reading Lists
- Deep Learning Papers Reading Roadmap
- Awesome Deep Learning
- Awesome Deep Vision
- Awesome - Most Cited Deep Learning Papers