Time      | Wed, July 5                                           | Thurs, July 6                                | Fri, July 7                                  | Sat, July 8
----------|-------------------------------------------------------|----------------------------------------------|----------------------------------------------|---------------------------------------------
0900      | Registration / Breakfast                              |                                              |                                              |
0930-1100 | Abdenour HADID: Biometrics Spoofing and Anti-spoofing | Edward J. DELP: Mobile and Embedded Imaging  | Sinno PAN: Transfer Learning                 | Lingyu DUAN: Compact Descriptors
1100-1130 | Break                                                 |                                              |                                              |
1130-1300 | Abdenour HADID: Biometrics Spoofing and Anti-spoofing | Edward J. DELP: Mobile and Embedded Imaging  | Bernd GIROD: Recent Advances in Visual Search | Christine GUILLEMOT: Light Field Image Processing
1300-1400 | Lunch                                                 |                                              |                                              |
1400-1530 | Koichi SHINODA: Video Information Retrieval           | Antonio ORTEGA: Graph Signal Processing      | Xilin CHEN: Face Recognition in the Real World | IEEE Networking Session @ Gardens by the Bay
1530-1600 | Break                                                 |                                              |                                              |
1600-1730 | Koichi SHINODA: Video Information Retrieval           | Junjie LAI: Intelligent Video Analytics      | Xilin CHEN: Understanding Image Categorization | IEEE Networking Session @ Gardens by the Bay

Synopsis: In this talk I will describe some recent work in my laboratory in mobile imaging. In particular, I will talk about our work in food image analysis, which is used to estimate the diet of an individual. Since dietary assessment is important in many health care applications, I will discuss this new area of computer vision and image processing.

I will then discuss another project in my laboratory that involves crime investigation. In particular, I will describe our work in the analysis of gang graffiti and gang tattoos.

Homepage: https://engineering.purdue.edu/~ace/

Synopsis: With intelligent processing, cameras have great potential to link the real world and the virtual world. We review advances and opportunities for algorithms and applications that retrieve information from large databases using images as queries. For rate-constrained applications, remarkable improvements have been achieved over the course of the MPEG-CDVS (Compact Descriptors for Visual Search) standardization effort. Beyond CDVS lie applications that query video databases with images, while others continually match video frames against image databases. Exploiting the temporal coherence of video in either case can yield large additional gains. We will look at implementations of example applications ranging from text recognition to augmented reality, in order to understand the challenges of building databases for rapid search and scalability, as well as the trade-offs between processing on a mobile device and in the cloud.
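
Compact descriptors of the kind standardized in CDVS are typically binary, so candidate matches can be scored cheaply with Hamming distance. The sketch below illustrates only that matching step, assuming packed `uint8` descriptors and an illustrative acceptance threshold; it is not the actual CDVS pipeline.

```python
import numpy as np

def hamming_distance(a, b):
    # Number of differing bits between two packed binary descriptors.
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def match(query, database, max_distance=20):
    """Return the index of the closest database descriptor, or None if
    nothing lies within max_distance (a simple acceptance threshold)."""
    distances = [hamming_distance(query, d) for d in database]
    best = int(np.argmin(distances))
    return best if distances[best] <= max_distance else None
```

In practice, linear scans like this are replaced by indexing structures (e.g. inverted files or hashing) so that large databases can be searched quickly; the distance computation itself stays the same.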

Homepage: http://web.stanford.edu/~bgirod/

Synopsis: Light field (LF) imaging has emerged as a promising technology in the field of computational photography. Many acquisition devices have recently been designed to capture light fields, ranging from arrays of cameras capturing the scene from slightly different viewpoints, to single cameras mounted on moving gantries, to plenoptic cameras. Plenoptic cameras, which are becoming commercially available, use arrays of micro-lenses placed in front of the photosensor to obtain angular information about the captured scene.

This talk will review recent progress in light field imaging. It will focus on two challenging processing problems: the representation and compression of the very large volume of captured visual data, and light field editing, which is now routine for 2D images but still challenging for dense multi-view data.

Homepage: https://people.rennes.inria.fr/Christine.Guillemot/

Synopsis: The goal of a biometric system is to determine the identity of an individual from his/her physical or behavioral characteristics. Biometric systems have many applications, such as criminal identification, airport security checks, computer or mobile device log-in, building access control, digital multimedia access, transaction authentication, voice mail, and secure teleworking. Various characteristics (or modalities) can be used, from the most conventional biometric modalities such as face, voice, fingerprint, iris, hand geometry, and signature, to so-called emerging biometric modalities such as gait, hand-grip, ear, body odour, body salinity, electroencephalogram, and DNA. Each modality has its strengths and drawbacks.

Despite significant progress in the field in recent decades, most proposed systems do not yet appear to meet all the security and robustness requirements needed for deployment in practical situations. Among the tangible threats and vulnerabilities facing current biometric systems are spoofing attacks. A spoofing attack occurs when a person tries to masquerade as someone else by falsifying data, thereby gaining illegitimate access and advantages. For instance, one can spoof a face recognition system by presenting to the camera a photograph, a video, or a 3D mask of a targeted person. While make-up or plastic surgery can also serve as means of spoofing, photographs and videos are probably the most common sources of spoofing attacks in face recognition because they are so easy to download and capture. Recently, increasing attention has been given to the spoofing problem in biometrics.

Within the European project TABULA RASA (http://www.tabularasa-euproject.org, 2010-2014), which was recently selected by the European Commission as a success story, we extensively studied possible spoofing attacks, evaluated the vulnerability of biometric systems to such attacks, and developed countermeasures to improve the security of biometric systems. The TABULA RASA project brought together 12 research and industry partners from 5 European Member States, Switzerland, and China. The local binary patterns (LBP) methodology developed at the Center for Machine Vision Research (CMV) at the University of Oulu has played a key role in the developed anti-spoofing solutions.
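
For readers unfamiliar with LBP: the basic operator thresholds each pixel's 3x3 neighborhood against its center and histograms the resulting 8-bit codes, and such histograms are a typical input to a spoofing classifier. The following is a minimal sketch of that basic operator only, not the project's exact (e.g. multi-scale or uniform-pattern) pipeline.

```python
import numpy as np

def lbp_code(patch):
    """Basic 3x3 LBP: threshold the 8 neighbors against the center pixel
    and pack the comparison results into an 8-bit code."""
    center = patch[1, 1]
    # Clockwise neighbor order starting at the top-left pixel.
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, value in enumerate(neighbors):
        if value >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Normalized 256-bin histogram of LBP codes over all interior pixels;
    this histogram is the texture feature fed to a classifier."""
    h, w = image.shape
    codes = [lbp_code(image[i - 1:i + 2, j - 1:j + 2])
             for i in range(1, h - 1) for j in range(1, w - 1)]
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / hist.sum()
```

The intuition behind its use in anti-spoofing is that recaptured media (printed photos, screens) alter fine surface texture, which shifts the LBP histogram relative to a live face.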

This presentation explains the problem of spoofing attacks against biometric systems by presenting the threats, reviewing some proposed solutions, highlighting open issues, and discussing future directions. Face analysis will be used as a case study.

Homepage: http://www.ee.oulu.fi/~hadid/

Synopsis: One of the grand challenges of AI is to understand video content. The applications are endless: self-driving cars, interactive robots, content filtering, ad placement. The problem is that AI is hard, and it is even harder on live streaming video, where there is little time to compute.

That is why we are building the NVIDIA DeepStream SDK. The DeepStream SDK simplifies the development of high-performance video analytics applications powered by deep learning. It is built on top of the NVIDIA Video SDK, which leverages the GPU's hardware encoding and decoding capabilities, and on top of NVIDIA TensorRT, which accelerates deep neural network inference.

Bio-data: Dr. Junjie Lai received his bachelor's and master's degrees from Tsinghua University and his PhD from INRIA, France. His PhD research focused on GPU architecture, performance analysis, and optimization.

Dr. Lai is currently the director of the APAC devtech team at NVIDIA. His primary focus areas are deep learning/machine learning, computer vision, and finance. Besides leading the team, he collaborates with developers from many well-known internet companies to better accelerate their deep learning applications on NVIDIA GPUs.

Synopsis: Unlike information retrieval from static images, video information retrieval relies heavily on time-series analysis. In this tutorial, we explain several approaches for modeling dynamic features in video data, including hidden Markov models, connectionist temporal classification, statistical language modeling, and attention models. We then show examples of their application to the TRECVID multimedia event detection (MED) task, and introduce software tools for developing and evaluating these models.
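
As a toy illustration of the first of these temporal models, the likelihood of an observation sequence under a discrete hidden Markov model can be computed with the forward algorithm. The sketch below uses per-step rescaling to avoid underflow; all parameter values in the usage example are hypothetical and not from the tutorial.

```python
import numpy as np

def forward_loglik(pi, A, B, observations):
    """Log-likelihood of a discrete observation sequence under an HMM.
    pi: initial state distribution, shape (S,)
    A:  state transition matrix,    shape (S, S)
    B:  emission probabilities,     shape (S, num_symbols)
    """
    alpha = pi * B[:, observations[0]]   # forward variables at t = 0
    log_lik = np.log(alpha.sum())
    alpha = alpha / alpha.sum()          # rescale to avoid underflow
    for obs in observations[1:]:
        alpha = (alpha @ A) * B[:, obs]  # predict with A, update with B
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = alpha / scale
    return log_lik
```

A handy sanity check: for a self-loop-only model that always emits its observed symbol, the sequence probability is 1, so the log-likelihood is exactly 0.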

Presenters: Koichi SHINODA and Mengxi LIN

Homepage: http://www.ks.cs.titech.ac.jp/~shinoda/index.html