groupmeeting-winter2015

Differences

This shows you the differences between two versions of the page.

groupmeeting-winter2015 [2015/02/09 16:28] by jsupanci
groupmeeting-winter2015 [2015/03/10 17:13] (current) by bhkong
Line 35:
  
 **Abstract**:
-==== Week 4 - Greg - Jan 29th ====
+==== Week 4 - Sam - Jan 29th ====
  
-**Paper**:
+**Topic**:
  
 **Abstract**:
 ==== Week 5 - Shu - Feb 5th ====
  
-**Paper**:
+**Paper**: Beyond R-CNN detection: Learning to Merge Contextual Attribute
  
-**Abstract**:
+**Abstract**: We will briefly review R-CNN [1], which performs classification over thousands of objectness regions extracted from the image. We will see what it misses: interactions between objects and context within the image. When people make use of contextual information in addition to the CNN, performance improves [2]. This is also supported by a recent, interesting study [3], which compares action classification performance between state-of-the-art CV methods and a linear SVM over fMRI data. The conclusions of that paper are very interesting, but we emphasize the most "trivial" yet convincing one: the human brain exploits semantic inference for action classification, which is absent in CV methods for action classification. So, exploiting contextual information is a reasonable step toward improving detection. But how can we represent, extract, and utilize contextual information? To answer these questions, I will present two other papers which are seemingly unrelated to them. The first is [4], which presents how to represent, learn, and use texture attributes to improve texture and material classification; the second is [5], which uses patch-match techniques for chair detection in a finer way. Based on these two papers, we will try to answer the questions: how can we represent, learn, and use contextual information to boost detection?
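The following is a rough, illustrative Python sketch (not code from the talk or the cited papers) of the pipeline the abstract describes: score a large pool of objectness proposals with per-region CNN features, optionally concatenated with a contextual feature before applying a linear classifier. The helpers ''propose_regions'', ''cnn_features'', and ''context_features'' are hypothetical placeholders.

<code python>
import numpy as np

def propose_regions(image, n=2000):
    """Placeholder for an objectness proposal method (e.g. selective search)."""
    h, w = image.shape[:2]
    rng = np.random.default_rng(0)
    # Random (x, y, width, height) boxes stand in for real proposals.
    return rng.integers(0, min(h, w) // 2, size=(n, 4))

def cnn_features(image, box, dim=4096):
    """Placeholder for a per-region CNN appearance feature."""
    return np.random.default_rng(int(box.sum())).standard_normal(dim)

def context_features(image, box, dim=256):
    """Placeholder for a contextual/attribute feature around the region."""
    return np.random.default_rng(int(box.sum()) + 1).standard_normal(dim)

def score_regions(image, weights, use_context=False):
    """Score every proposal with a single linear classifier."""
    scores = []
    for box in propose_regions(image):
        feat = cnn_features(image, box)
        if use_context:
            # The idea discussed above: augment the per-region appearance
            # feature with contextual information before classification.
            feat = np.concatenate([feat, context_features(image, box)])
        scores.append(float(weights[:feat.size] @ feat))
    return scores

# Example: scoring proposals with and without the contextual feature.
image = np.zeros((480, 640, 3))
weights = np.random.default_rng(1).standard_normal(4096 + 256)
plain_scores = score_regions(image, weights)
context_scores = score_regions(image, weights, use_context=True)
</code>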
 ==== Week 6 - Minhaeng - Feb 12th ====
  
-**Paper**:
+**Paper**: Knowing a good HOG filter when you see it: Efficient selection of filters for detection
 
-**Abstract**:
+**Abstract**: [[http://ttic.uchicago.edu/~smaji/papers/goodParts-eccv14.pdf|http://ttic.uchicago.edu/~smaji/papers/goodParts-eccv14.pdf]]
 ==== Week 7 - Phuc - Feb 19th @ 10AM ====
  
Line 55: Line 56:
  
 **Abstract**:
-==== Week 7 - Yi - Feb 19th @ TBD ====
+==== Week 7 - Yi - Feb 19th @ 5PM in DBH 4013 ====
  
-**Paper**:
+**Paper**: Deep learning!
  
 **Abstract**:
 ==== Week 8 - Peiyun - Feb 26th ====
  
-**Paper**:
+**Paper**: Long-term Recurrent Convolutional Networks for Visual Recognition and Description
  
 **Abstract**:
+
+Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent, or “temporally deep”, are effective for tasks involving sequences, visual and otherwise. We develop a novel recurrent convolutional architecture suitable for large-scale visual learning which is end-to-end trainable, and demonstrate the value of these models on benchmark video recognition tasks, image description and retrieval problems, and video narration challenges. In contrast to current models which assume a fixed spatio-temporal receptive field or simple temporal averaging for sequential processing, recurrent convolutional models are “doubly deep” in that they can be compositional in spatial and temporal “layers”. Such models may have advantages when target concepts are complex and/or training data are limited. Learning long-term dependencies is possible when nonlinearities are incorporated into the network state updates. Long-term RNN models are appealing in that they can directly map variable-length inputs (e.g., video frames) to variable-length outputs (e.g., natural language text) and can model complex temporal dynamics; yet they can be optimized with backpropagation. Our recurrent long-term models are directly connected to modern visual convnet models and can be jointly trained to simultaneously learn temporal dynamics and convolutional perceptual representations. Our results show such models have distinct advantages over state-of-the-art models for recognition or generation which are separately defined and/or optimized.
+
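As a rough illustration of the idea in the abstract above (a per-frame convnet feeding a recurrent layer, so variable-length frame sequences map to per-step outputs, trained end-to-end with backpropagation), here is a minimal PyTorch sketch; the tiny CNN and the layer sizes are illustrative assumptions, not the configuration from the paper.

<code python>
import torch
import torch.nn as nn

class LRCNSketch(nn.Module):
    def __init__(self, feat_dim=256, hidden_dim=256, num_classes=10):
        super().__init__()
        # Stand-in for the convolutional "perceptual" network, applied per frame.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, feat_dim),
        )
        # "Temporally deep" part: an LSTM over the sequence of per-frame features.
        self.rnn = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, frames):
        # frames: (batch, time, channels, height, width)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        hidden, _ = self.rnn(feats)       # one hidden state per time step
        return self.classifier(hidden)    # per-step class scores

# Example: a batch of 2 clips with 8 frames each; the whole model is
# differentiable, so it can be trained jointly with backpropagation.
model = LRCNSketch()
clips = torch.randn(2, 8, 3, 64, 64)
scores = model(clips)                     # shape: (2, 8, 10)
</code>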
 ==== Week 9 - Raul - Mar 5th ====
  
Line 70: Line 74:
  
 **Abstract**:
-==== Week 10 - Sam - Mar 12th ====
+==== Week 10 - Greg - Mar 12th ====
  
 **Paper**: