Computer Vision and Machine Learning for Automatic Recognition
This course is an up-to-date, in-depth treatment of the most important solutions for visual recognition, including recent achievements of deep learning architectures.
During the course, students will gain a detailed understanding of cutting-edge research in computer vision and will have the opportunity to take part in hands-on sessions in the lab.
2. Richard Szeliski, Computer Vision: Algorithms and Applications, Springer, 2010
3. Y. Bengio, Practical recommendations for gradient-based training of deep architectures, Lecture Notes in Computer Science, 2012
4. X. Li, T. Uricchio, L. Ballan, M. Bertini, C.G.M. Snoek, A. Del Bimbo, Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement, and Retrieval, ACM Computing Surveys (CSUR), Volume 49, Issue 1, July 2016
Learning Objectives
1. Learn the state of the art of computer vision research for automatic recognition
2. Learn how to implement such techniques and apply them to real problems.
Prerequisites
Fundamentals of image and video analysis
Teaching Methods
Classroom and labs
Further information
Students are tutored throughout the course and during the exercises
Type of Assessment
Labs and Homework
Course program
VISUAL AND MULTIMEDIA RECOGNITION 2016-17
Week 1-2 Section 1. Section 2.
• Introduction to visual and multimedia recognition
• Global image features: Color; Texture; Edges and Lines
• Dimensionality reduction: PCA
• MPEG7 holistic descriptors
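A minimal sketch of PCA-based dimensionality reduction, restricted to the first principal component and written in plain Python for readability. The function names, the toy data, and the use of power iteration (instead of the full eigendecomposition or SVD that a numerical library would provide) are assumptions made for illustration and are not taken from the course material.

```python
import math
import random


def first_principal_component(points, iters=200, seed=0):
    """Return (mean, unit direction) of the leading principal component."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by the covariance and renormalize.
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(points, mean, direction):
    """Reduce each point to one coordinate along `direction`."""
    return [sum((p[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for p in points]


# Hypothetical 2-D feature vectors lying close to the line y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
mean, pc1 = first_principal_component(data)
coords = project(data, mean, pc1)  # one number per original 2-D point
```

In practice a global descriptor (for example a color histogram) has many more dimensions and several components are kept; the one-component case only shows the centering, covariance, and projection steps.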
Week 3-4 Section 4. Local image features
• Rotation invariant Harris corner detector
• Scale invariant keypoint detectors:
- Harris-Laplacian
- SIFT and SURF features
• Affine invariant region detectors:
- MSER (Maximally Stable Extremal Regions)
• Local descriptors:
- SIFT, Color SIFT
- SURF
Week 5-6 Section 5. Visual words and Bag of Words representation
• Visual Words and Bag of Words model:
- vocabulary formation by K-means
- Radius-based clustering
• Evolution of BoW model by Coding/Reconstruction-based approaches:
- Sparse Coding
- Locality-constrained Linear Coding (LLC)
- Soft Assignment
- Fisher Vectors and VLAD
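A minimal sketch of the basic Bag of Words pipeline listed above: a vocabulary is formed by K-means over local descriptors, and an image is then represented by the normalized histogram of its hard-assigned visual words. The helper names and the 2-D toy descriptors are assumptions for illustration; real local descriptors such as SIFT are 128-dimensional, and vocabularies typically contain thousands of words.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, x):
    """Index of the center closest to descriptor x (hard assignment)."""
    return min(range(len(centers)), key=lambda k: sq_dist(centers[k], x))


def kmeans(descriptors, k, iters=20, seed=0):
    """Plain Lloyd iterations; returns k cluster centers (the visual words)."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(descriptors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in descriptors:
            clusters[nearest(centers, x)].append(x)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bow_histogram(descriptors, vocabulary):
    """L1-normalized count of visual words found in one image."""
    hist = [0.0] * len(vocabulary)
    for x in descriptors:
        hist[nearest(vocabulary, x)] += 1.0
    total = sum(hist)
    return [h / total for h in hist]


# Hypothetical training descriptors forming two well-separated groups.
training = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
vocabulary = kmeans(training, k=2)

# Descriptors extracted from one (hypothetical) image.
image_descriptors = [(0.2, 0.1), (0.9, 0.8), (0.5, 0.5), (10.5, 10.4)]
histogram = bow_histogram(image_descriptors, vocabulary)
```

The coding-based variants in the list (soft assignment, sparse coding, LLC) replace the hard `nearest` step with a weighted assignment over several words; the histogram pooling step stays conceptually the same.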
Week 8-9 Section 7. Object detection and categorization
• Bayes classification, Expectation Maximization (review of statistical principles)
• Support Vector Machines classifier
• Boosting classifier, AdaBoost
• Probabilistic Latent Semantic Analysis classifier
• HOG (Histogram of Oriented Gradients) people detector
• Viola and Jones face detector
• Partial matching of sets of features
Week 9 Laboratory 1. Bag of Visual Words
Week 10 Section 8. Deep Learning
• Review of multilayer networks
• Convolutional Neural Networks
• CNN for Visual Recognition
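A minimal sketch of the core operation of a convolutional layer: a small kernel is slid over the image, followed by a ReLU non-linearity. The function names, the hand-set edge kernel, and the toy image are assumptions for illustration; in a trained CNN the kernel weights are learned, and, as in most deep learning frameworks, the operation shown is technically cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1), summing products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(feature_map):
    """Element-wise max(0, v), the usual CNN non-linearity."""
    return [[max(0, v) for v in row] for row in feature_map]


# Toy image: dark on the left, bright on the right.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# Hand-set kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

response = relu(conv2d_valid(image, kernel))
# response == [[0, 2, 0], [0, 2, 0]]: the filter fires only at the edge.
```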
Week 13 Section 9. With image sequences
• Spatio-temporal features and Detectors:
- Dollár’s spatio-temporal detector;
- Improved dense trajectories;
- Advanced solutions with deep nets
• Descriptors for Spatio-temporal features:
- HOG3D (Histogram of Oriented 3D Gradients),
- HOF (Histogram of Optical Flow),
- MBH (Motion Boundary Histogram),
- Dense Trajectory Descriptors
• Action and Event recognition
• Principles of Tracking
Week 14 Section 10. Matching at large scale
• Vocabulary Tree
• Multidimensional hashing:
- Locality-Sensitive Hashing
- Pyramid Match Hashing
- Semantic Hashing
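A minimal sketch of one hashing scheme of the kind listed above, here Locality-Sensitive Hashing with random hyperplanes, where each bit records on which side of a random hyperplane a descriptor falls. The choice of this particular LSH family (suited to cosine similarity), the function names, and the toy vectors are assumptions for illustration; the syllabus does not specify which variant is covered.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals with Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return sum(x != y for x, y in zip(a, b))


planes = make_hyperplanes(dim=4, n_bits=16)

query = [1.0, 0.5, -0.2, 0.3]
scaled = [2.0 * x for x in query]    # same direction, different length
opposite = [-x for x in query]       # opposite direction

sig_query = lsh_signature(query, planes)
sig_scaled = lsh_signature(scaled, planes)
sig_opposite = lsh_signature(opposite, planes)

# Build a hash table: descriptors with equal signatures share a bucket.
table = {}
for name, vec in [("query", query), ("scaled", scaled), ("opposite", opposite)]:
    table.setdefault(lsh_signature(vec, planes), []).append(name)
```

Vectors pointing in the same direction always collide, while vectors at a larger angle differ in more bits on average, so at large scale only the descriptors in the matching bucket (or in buckets at small Hamming distance) need to be compared exactly.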
Week 15 Section 11. Exploiting human and social knowledge
• ImageNet
• Exploiting data from Social Networks