Scott E. Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, and Honglak Lee. Learning what and where to draw. CoRR, abs/1610.02454, 2016. URL http://arxiv.org/abs/1610.02454.
Related Work / Contribution
A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. In ICLR, 2016.
Dataset
Caltech-UCSD Birds (CUB) dataset
● Over 11k images of 200 species
● 10 single-sentence descriptions per image
● Bounding box and up to 15 bird part (x, y) coordinates
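To make the annotation layout concrete, one CUB example could be held in a record like the sketch below. This is only an illustration: the class name `BirdSample`, its field names, the (x, y, width, height) box convention, and the use of `None` for a part that is not annotated are all assumptions, not the dataset's own file format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

NUM_SPECIES = 200   # "200 species"
NUM_CAPTIONS = 10   # "10 single sentence descriptions per image"
MAX_PARTS = 15      # "up to 15 bird part x, y coordinates"

Point = Tuple[float, float]


@dataclass
class BirdSample:
    """Hypothetical in-memory record for one annotated bird image."""
    image_path: str
    species_id: int                          # assumed to run from 0 to 199
    captions: List[str]                      # one entry per text description
    bbox: Tuple[float, float, float, float]  # assumed (x, y, width, height) in pixels
    parts: List[Optional[Point]]             # None marks a part with no annotation

    def __post_init__(self) -> None:
        # Reject records that contradict the dataset summary above.
        if not 0 <= self.species_id < NUM_SPECIES:
            raise ValueError("species_id out of range")
        if len(self.captions) != NUM_CAPTIONS:
            raise ValueError("expected 10 captions per image")
        if len(self.parts) > MAX_PARTS:
            raise ValueError("at most 15 part locations per image")

    def visible_parts(self) -> List[Point]:
        """Return only the part locations that are actually annotated."""
        return [p for p in self.parts if p is not None]
```

A record with fewer than 15 annotated parts is still valid, which matches the "up to 15" wording; only the caption count is treated as fixed.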
Dataset
MPII Human Pose (MPH) dataset
● Used 19k of the 25k images (filtered out images with multiple people)
● 410 common activities
● Keypoint coordinates for 16 joint types
● Crowdsourced to get 3 single-sentence descriptions for each image
Results