Sunday, May 1, 2016
Tues, May 3: Unsupervised Representation Learning
Wednesday, April 27, 2016
Thurs, Apr 28: Transient Attributes
Monday, April 25, 2016
Tues, Apr 26: Quizz: Targeted crowdsourcing with a billion (potential) users
Wednesday, April 20, 2016
Thurs, April 21: How do humans sketch objects?
Sunday, April 17, 2016
Tues, Apr 19: Exploring Nearest Neighbor Approaches for Image Captioning
Wednesday, April 13, 2016
Thurs, April 14: Visual Question Answering
Tuesday, April 5, 2016
Thurs, Apr 7: Deep Neural Decision Forests.
Tuesday, March 22, 2016
Thurs, Mar 24: Learning Visual Biases from Human Imagination
Monday, March 21, 2016
Tues, Mar 22: Special Presentation by Zhile Ren
Tuesday, March 15, 2016
Thurs, Mar 17: What makes Paris look like Paris?
Sunday, March 13, 2016
Tues, Mar 15: Learning Visual Similarity
Also read:
Learning Deep Representations for Ground-to-Aerial Geolocalization. Tsung-Yi Lin, Yin Cui, Serge Belongie, James Hays. CVPR 2015.
Wednesday, March 9, 2016
Thurs, Mar 10: Semantic Segmentation
Thursday, March 3, 2016
Tues, Mar 8: Fast and Faster R-CNN
(additionally, the faster version)
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. NIPS 2015.
Friday, February 26, 2016
Thurs, Mar 3: DeepBox: Learning Objectness
Sample Code for Pretrained Network Feature Extraction
Saturday, February 20, 2016
Thurs, Feb 25: Diagnosing error in object detectors
Matlab code for generating Hoiem-style ROC curves: https://github.com/pdollar/coco/blob/master/MatlabAPI/CocoEval.m
Tuesday, February 16, 2016
Thurs. Feb 18: Understanding Deep Image Representations by Inverting Them.
Tues, Feb 23: Object Detectors Emerge in Deep Scene CNNs
Supplemental: Learning Deep Features for Scene Recognition using Places Database. B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. NIPS 2014.
Wednesday, February 10, 2016
Thurs, Feb 11: SUN Attributes
Monday, February 8, 2016
Tues, Feb. 9: AlexNet
Thursday, January 28, 2016
Thurs, Feb 4: Deep learning Tutorial
CVPR 2014 Tutorial on Deep Learning. Graham Taylor, Marc'Aurelio Ranzato, and Honglak Lee. Read only the first two sets, labeled Introduction and Supervised Learning.
CVPR 2014 tutorial
Tues, Feb 2: Crowdsourcing Detectors with Minimal Training
Tues, Feb 2: MS COCO
COCO website
This is the first paper for which you'll post reading summaries. Here is the description of these summaries from the class website: Students will be expected to read one paper for each class. For each assigned paper, students must write a two or three sentence summary and identify at least one question or topic of interest for class discussion. Interesting topics for discussion could relate to strengths and weaknesses of the paper, possible future directions, connections to other research, uncertainty about the conclusions of the experiments, etc. Reading summaries must be posted to the class blog http://cs7476.blogspot.com/ by 11:59pm the day before each class. Feel free to reply to other comments on the blog and help each other understand confusing aspects of the papers. The blog discussion will be the starting point for the class discussion. If you are presenting you don't need to post a summary to the blog.
Simply click on the comment link below this to post your short summary and one or more questions / discussion topics.
Partner Search
Hi Class. I forgot to mention that you can work on your semester project with a partner. If you don't know who you want to work with, feel free to reply to this thread and perhaps say a bit about what project topics you had in mind, if any. E.g. "I'm James and I'm very interested in object proposals or crowdsourcing strategies. Let me know if you want to chat about working together on a project".
Saturday, January 16, 2016
Example discussion -- Rich Intrinsic Image Decomposition of Outdoor Scenes from Multiple Views
Rich Intrinsic Image Decomposition of Outdoor Scenes from Multiple Views. Pierre-Yves Laffont, Adrien Bousseau, George Drettakis. TVCG 2013.
Project page.
Example Student Summary #1
This paper presents a new method for decomposing outdoor scenes into intrinsic images. It differs from previous work by further decomposing the illumination component into separate components for illumination from the sun, the sky, and other scene objects (indirect illumination). Using multiple images of the scene, a sparse 3D point cloud is constructed. The reflectance and sun illumination for each point are learned using mean shift iterations which optimize the energies of regions of influence over candidate reflectance curves of the points. The sky and indirect illumination components are estimated by sending rays out from the 3D points and seeing which rays hit the sky and which hit other objects (and thus contribute radiance to those illumination components). The algorithm is effective for estimating the intrinsic images of rich outdoor scenes and allowing for their manipulation while keeping the scene consistent. It is limited by the need for a reasonably accurate estimate of the direction to the sun and by its need for a reflective sphere to capture an environment map of the sky.
Discussion:
How consistent do the images used in these multi-image techniques have to be for them to work correctly? Do the objects need to be static? Does the photographer need to be more or less revolving around some scene center where the camera is pointed? Or is it robust enough to handle more general movements? Can the reverse be used where the camera remains in a fixed position but rotates, creating a panorama that is stitched together (and decomposing it into intrinsic images)?
Example Student Summary #2
This paper presents a pipeline for decomposing intrinsic images using multiple photos of the same scene captured at the same time. The pipeline is able to decompose the illumination layer of the image further into sun, sky and indirect lighting layers. As inputs to the pipeline, the user needs to capture and provide a set of LDR photos from different viewpoints, two HDR images of the front and side of a reflective sphere, and HDR images of the viewpoints that need to be decomposed. The pipeline starts by generating a point cloud reconstruction of the scene, an approximate geometric proxy of the scene, the direction and radiance of the sun, and an HDR environment map containing the sky and distant indirect radiance. The geometric proxy is then used to compute sky illumination, indirect illumination, and approximate sun visibility for each point. A more refined estimate of sun visibility is then computed by forming curves of candidate reflectances in color space and finding their intersections; this estimation algorithm is a key contribution of the work. Once illuminations are available for the initial set of points, they are propagated to the entire scene using a method similar to Bousseau et al.'s method for propagating user-specified constraints. Finally, the three illumination layers are separated using two successive matting procedures.
Discussion:
In section 6 it is mentioned that the sun visibility estimation algorithm assumes that the scene is composed of a sparse set of reflectances shared by multiple points. Does this mean that the algorithm might not be ideal for scenes that have a more diverse set of reflectances (though I guess only indoor scenes tend to have a richer set of reflectances, which is not the focus of this paper anyway)? Also, is there a parameter that can be tuned to adjust the desired degree of sparsity?
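For readers new to intrinsic images, here is a minimal sketch of the kind of layered model both summaries describe: a pixel is treated as reflectance multiplied by the sum of sun, sky, and indirect illumination. The multiplicative form, the function names, and the sample values are assumptions made for illustration; this is not the paper's estimation pipeline, which has to infer these layers from photographs.

```python
# Illustrative sketch only. The per-pixel model
#   pixel = reflectance * (sun + sky + indirect)
# and every name below are assumptions, not code from the paper.

def compose_pixel(reflectance, sun, sky, indirect):
    """Return reflectance * (sun + sky + indirect), channel by channel."""
    return tuple(
        r * (a + b + c)
        for r, a, b, c in zip(reflectance, sun, sky, indirect)
    )


def recover_reflectance(pixel, sun, sky, indirect, eps=1e-8):
    """Invert the model by dividing the pixel by the total illumination."""
    return tuple(
        p / max(a + b + c, eps)
        for p, a, b, c in zip(pixel, sun, sky, indirect)
    )


def relight_pixel(reflectance, sun, sky, indirect, sun_scale):
    """Re-render the pixel with the sun layer scaled by sun_scale.

    A sun_scale of 0.0 leaves only sky and indirect light, which
    corresponds to placing the point in shadow.
    """
    scaled_sun = tuple(sun_scale * a for a in sun)
    return compose_pixel(reflectance, scaled_sun, sky, indirect)


if __name__ == "__main__":
    # Made-up RGB values for a single pixel.
    reflectance = (0.5, 0.4, 0.2)
    sun = (1.0, 0.9, 0.8)
    sky = (0.1, 0.2, 0.3)
    indirect = (0.05, 0.05, 0.05)

    pixel = compose_pixel(reflectance, sun, sky, indirect)
    print("observed pixel:", pixel)
    print("recovered reflectance:", recover_reflectance(pixel, sun, sky, indirect))
    print("pixel in shadow:", relight_pixel(reflectance, sun, sky, indirect, 0.0))
```

Keeping the three illumination layers separate is what makes edits such as removing or moving a cast shadow possible without touching the reflectance layer.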