University of Mons, Belgium
University of Catania, Italy
Idiap Research Institute, Switzerland
University of Genoa, Italy
University of Essex, and University of Milano-Bicocca
King's College London
Keynote I by Prof. François Brémond
Title: Action Recognition for People Monitoring
In this talk, we will discuss how video analytics can be applied to human monitoring using a video stream as input. Existing work has focused either on simple activities in real-life scenarios or on the recognition of more complex (in terms of visual variability) activities in hand-clipped videos with well-defined temporal boundaries. We still lack methods that can retrieve multiple instances of complex human activities in a continuous (untrimmed) video stream in real-world settings.
Therefore, we will first review a few existing activity recognition/detection algorithms. Then, we will present several novel techniques for the recognition of ADLs (Activities of Daily Living) from 2D video cameras. We will illustrate the proposed activity monitoring approaches through several home care application datasets: Toyota SmartHome, NTU-RGB+D, Charades and Northwestern UCLA. We will end the talk by presenting some results on home care applications.
François Brémond biography:
François Brémond is a Research Director at Inria Sophia Antipolis-Méditerranée, where he created the STARS team in 2012. He has pioneered the combination of Artificial Intelligence, Machine Learning and Computer Vision for Video Understanding since 1993, both at Sophia-Antipolis and at USC (University of Southern California), LA. In 1997 he obtained his PhD degree in video understanding and pursued this work at USC on the interpretation of videos taken from UAVs (Unmanned Aerial Vehicles). In 2000, recruited as a researcher at Inria, he modeled human behavior for Scene Understanding: perception, multi-sensor fusion, spatio-temporal reasoning and activity recognition. He is a co-founder of Keeneo, Ekinnox and Neosensys, three companies in intelligent video monitoring and business intelligence. He also co-founded the CoBTek team from Nice University in January 2012 with Prof. P. Robert from Nice Hospital on the study of behavioral disorders for older adults suffering from dementia. He is the author or co-author of more than 250 scientific papers published in international journals or conferences in video understanding. He has (co-)supervised 20 PhD theses.
More information is available at: http://www-sop.inria.fr/members/Francois.Bremond/
Keynote II by Prof. Fabio Solari
Title: Computational Models for Ecological perception and interaction in Virtual and Augmented Reality
A bio-inspired computational model of visual perception for action tasks is proposed to provide clues to better design virtual and augmented reality (VR and AR) systems. The proposed neural model is based on space-variant mapping, disparity, and optic flow computation by implementing paradigms of the dorsal visual processing stream. The cortical representation of the visual information is directly exploited to infer features related to the real world without devising ad-hoc computer vision algorithms.
Besides artificial vision applications, the proposed model can mimic and describe human behavioral data of both motion and depth perception. By leveraging previous outcomes, we can employ the modeled perception to improve the experience in VR and AR environments: as a case study, to implement a foveated depth-of-field blur that mitigates cybersickness.
Biography of Fabio Solari:
Fabio Solari is Associate Professor of Computer Science at the Department of Informatics, Bioengineering, Robotics, and Systems Engineering of the University of Genoa. His research interests are related to computational models of motion and depth estimation, space-variant visual processing and scene interpretation. Such models are able to replicate relevant aspects of human experimental data. This can help to improve virtual and augmented reality systems in order to provide a natural perception and human-computer interaction. He is principal investigator of three international projects: Interreg Alcotra CLIP “E-Santé/Silver Economy”, PROSOL “Jeune” and PROSOL “Senior”. He has participated in five European projects: FP7-ICT, EYESHOTS and SEARISE; FP6-IST-FET, DRIVSCO; FP6-NEST, MCCOOP; FP5-IST-FET, ECOVISION. He has a pending International Patent Application (WO2013088390) on augmented reality, and two Italian Patent Applications on virtual (No. 0001423036) and augmented (No. 0001409382) reality. More information is available at www.dibris.unige.it/en/solari-fabio
Keynote III by Prof. Dimitri Ognibene
Title: Adaptive Vision for Human Robot Collaboration
Unstructured social environments, e.g. building sites, release an overwhelming amount of information, yet behaviorally relevant variables may not be directly accessible.
Currently proposed solutions for specific tasks, e.g. autonomous cars, usually employ overly redundant, expensive, and computationally demanding sensory systems that attempt to cover the wide set of sensing conditions which the system may have to deal with.
Adaptive control of the sensors and of the perception process input is a key solution found by nature to cope with such problems, as shown by the foveal anatomy of the eye and its high mobility and control accuracy. The design principles of systems that adaptively find and select relevant information are important for both Robotics and Cognitive Neuroscience.
At the same time, collaborative robotics has recently progressed to human-robot interaction in real manufacturing. Measuring and modeling task-specific gaze behaviours is mandatory to support smooth human-robot interaction. Indeed, anticipatory control for human-in-the-loop architectures, which can enable robots to proactively collaborate with humans, heavily relies on observed gaze and action patterns of their human partners.
The talk will describe several systems employing adaptive vision to support robot behavior and their collaboration with humans.
Biography of Dimitri Ognibene:
Dimitri Ognibene is Associate Professor of Human Technology Interaction at the University of Milano-Bicocca, Italy. His main interest lies in understanding how social agents with limited sensory and computational resources adapt to complex and uncertain environments, how this can induce suboptimal behavior such as addiction or antisocial behaviors, and how this understanding may be applied to real life problems. To this end he develops both neural and Bayesian models and applies them both in physical, e.g. robots, and virtual, e.g. social media, settings. Before joining the University of Milano-Bicocca, he was at the University of Essex as Lecturer in Computer Science and Artificial Intelligence from October 2017, having moved from University Pompeu Fabra (Barcelona, Spain) where he was a Marie Curie Actions COFUND fellow. Previously he developed algorithms for active vision in industrial robotic tasks as a Research Associate (RA) at the Centre for Robotics Research, King's College London. He developed Bayesian methods and robotic models for attention in social and dynamic environments as an RA at the Personal Robotics Laboratory in Imperial College London. He studied the interaction between active vision and autonomous learning in neuro-robotic models as an RA at the Institute of Cognitive Science and Technologies of the Italian Research Council (ISTC CNR). He also collaborated with the Wellcome Trust Centre for Neuroimaging (UCL) to study how to model exploration in the active inference modelling paradigm. He has been Visiting Researcher at the Bounded Resource Reasoning Laboratory in UMass and at the University of Reykjavik (Iceland), exploring the symmetries between active sensor control and active computation or metareasoning.
He obtained his PhD in Robotics in 2009 from the University of Genoa with a thesis titled “Ecological Adaptive Perception from a Neuro-Robotic perspective: theory, architecture and experiments” and graduated in Information Engineering at the University of Palermo in 2004. He is handling editor of Cognitive Processing, review editor for Paladyn – The journal of Behavioral Robotics, Frontiers Bionics and Biomimetics, and Frontiers Computational Intelligence in Robotics, and guest associate editor for Frontiers in Neurorobotics and Frontiers in Cognitive Neuroscience. He has been chair of the robotics area of several conferences and workshops.
Keynote IV by Prof. Jean-Marc Odobez
Title: Towards gaze analysis in the wild
As a display of attention and interest, gaze is a fundamental cue in understanding people’s activities, behaviors, and state of mind, and plays an important role in many applications and research fields. In psychology and sociology, gaze information helps to infer inner states of people or their intention, and to better understand the interaction between individuals. In particular, gaze plays a major role in human communication, for example in showing attention to the speaker or indicating who is addressed, which makes the automatic extraction of gaze highly relevant for the design of intuitive human-computer or robot interfaces, or for medical diagnosis, for example for children with Autism Spectrum Disorders (ASD).
Gaze (estimating the 3D line of sight) and Visual Focus of Attention (VFOA) estimation, however, are challenging, even for humans. They often require not only analysing the person’s face and eyes, but also the scene content including the 3D scene structure and the person’s situation (What is in the field of view of the person? How many people are around? Who is talking? manipulating objects? interacting or observing others?) to detect obstructions in the line of sight or apply attention priors that humans typically have when observing others. In this presentation, we will present three methods that address these challenges: first, a method that leverages standard activity-related priors about gaze to perform online calibration; second, an approach for VFOA inference which casts the scene in the 3D field of view of a person, enabling the use of audio-visual information as well as dealing with an arbitrary number of targets, and providing better cross-scene generalization; and third, moving towards gaze estimation in the wild, an approach for the gaze-following task explicitly leveraging derived multimodal cues like depth and pose. Finally, we will briefly describe a gaze-interactive scene demonstration developed for the ‘musée de la main’.
Biography of Jean-Marc Odobez:
Jean-Marc Odobez leads the Perception and Activity Understanding group at the Idiap Research Institute. His main research interests are in the analysis of human activities from multi-modal data. This entails the investigation of fundamental tasks like the detection and tracking of people, the estimation of their pose or the detection of non-verbal behaviors, and the temporal interpretation of this information in the form of gestures, activities, behavior or social relationships. These tasks are addressed through the design of principled algorithms extending models from computer vision, multi-modal signal processing, and machine learning, in particular probabilistic graphical models and deep learning techniques, with applications in surveillance, traffic and human behavior analysis.
Keynote V by Prof. Riadh Abdelfattah
Title: Contribution of Interferometric SAR to water resource management
Water resources are essential for the development of life on Earth. The majority of water is stored in the oceans (97.19%); the rest, fresh water, is distributed between frozen surfaces and glaciers (2.20%), groundwater (0.60%) and surface water, which includes streams, lakes and soil moisture (about 0.01%). While water needs are generally associated with access to drinking water for the population, water is also essential for many industrial and agri-food sectors. The use of water has thus continuously intensified and diversified since the beginning of the 20th century, leading to increased volumes of water used. Agriculture, which uses water for crop irrigation, represents 3/4 of the current demand in the world. Mapping the distribution and persistence of surface water as well as groundwater in a timely fashion has broad value for tracking dynamic events like flooding, and for monitoring the effects of climate and human activities on natural resource values and biodiversity.
Synthetic Aperture Radar (SAR) sensors are adequate for monitoring surface water extent. However, this is not the case with groundwater. In this talk, the potential of using interferometric SAR (InSAR) for detecting intensive exploitation of aquifers (groundwater) will be addressed. We show that this could be possible using SAR interferometry for mapping landslides caused by intensive pumping. Moreover, we present the added value of the InSAR coherence in addition to optical data for accurately mapping surface water extent.
This talk will first provide an overview of the state of the art in SAR, Interferometric SAR and their environmental applications. Then the use of InSAR coherence for change detection will be detailed.
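As background for the coherence-based change detection mentioned above, here is a minimal sketch of the standard sample coherence estimate between two co-registered complex SAR acquisitions: the magnitude of the summed cross-product divided by the geometric mean of the two powers over a local window. The function name `coherence` and the flat-list window representation are illustrative assumptions, not the speaker's processing chain.

```python
import math


def coherence(window_1, window_2):
    """Sample coherence magnitude of two co-registered complex SAR windows.

    window_1, window_2: equal-length sequences of complex pixel values taken
    from the same local window in each acquisition (hypothetical layout).
    Returns a value in [0, 1]; values near 0 indicate decorrelation.
    """
    cross = sum(a * b.conjugate() for a, b in zip(window_1, window_2))
    power_1 = sum(abs(a) ** 2 for a in window_1)
    power_2 = sum(abs(b) ** 2 for b in window_2)
    if power_1 == 0 or power_2 == 0:
        return 0.0  # assumption: report zero coherence for empty windows
    return abs(cross) / math.sqrt(power_1 * power_2)


# Toy usage: the second window equals the first up to a constant phase shift,
# so the two windows are fully coherent.
first = [1 + 1j, 2 - 1j, -1 + 0.5j]
second = [value * 1j for value in first]
print(coherence(first, second))
```

Open water typically decorrelates between acquisitions, so low coherence is one cue that can complement optical indices when delineating water extent.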
The second part of this talk describes an analysis of the evolution of the water surface of the Lebna dam in the north of Tunisia, based on optical and SAR data from Sentinel-2 and Sentinel-1. InSAR coherence maps over the Lebna watershed, Tunisia (~210 km²) were generated using Sentinel-1 data acquired from 2017 to 2022. These maps, together with the NDWI (Normalized Difference Water Index) extracted from the Sentinel-2 data, were then employed to assess the separability of different wetland types and their trends over time. Furthermore, the landslide maps generated using the Differential Interferometric SAR processing chain P-SBAS (Parallel Small BAseline Subset) will be analyzed with respect to intensive exploitation of aquifers.
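For readers unfamiliar with the NDWI, the sketch below computes it per pixel as (Green - NIR) / (Green + NIR), which for Sentinel-2 is commonly evaluated with band 3 (green) and band 8 (near infrared). The function names, the nested-list image layout and the zero threshold are illustrative assumptions; the abstract does not state which threshold or preprocessing the study used.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index for one pixel.

    green, nir: surface reflectance in the green and near-infrared bands.
    """
    total = green + nir
    if total == 0:
        return 0.0  # assumption: treat no-data pixels as neutral
    return (green - nir) / total


def water_mask(green_band, nir_band, threshold=0.0):
    """Boolean mask of pixels whose NDWI exceeds a threshold.

    green_band, nir_band: equally sized lists of rows (hypothetical layout).
    threshold: 0.0 is a common starting point, not a value from the talk.
    """
    return [
        [ndwi(g, n) > threshold for g, n in zip(green_row, nir_row)]
        for green_row, nir_row in zip(green_band, nir_band)
    ]


# Toy usage: one water-like pixel (green > NIR) and one vegetated pixel.
print(water_mask([[0.75, 0.1]], [[0.25, 0.4]]))
```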
Biography of Prof. Riadh Abdelfattah:
Dr. Riadh Abdelfattah is Professor at the Higher School of Engineering in Communications (SUP’COM) at the University of Carthage in Tunisia. He was the President, and a Vice-President of the University of Carthage (2017-2020), in charge of research activities, technological development and environmental partnership. He is also Associate Researcher at the Department ITI (Image Traitement de l’Information) at IMT-Atlantique, the “Institut de Télécom”, Brest, France. He was member of the scientific council of AUF (Agence Universitaire de la Francophonie) and member of the Expert Regional Committee (2016-
He received the engineer degree from the Telecommunication Engineering School of Tunis, Tunisia in 1995, the Master Degree (DEA) and the Ph.D. degree in Electrical Engineering from the “Ecole Nationale d’Ingénieurs de Tunis”, in 1995 and 2000 respectively, and “le Diplôme de l’Habilitation Universitaire” from the Higher School of Communications (SUP’COM) at the University of Carthage in Tunisia (2008).
Between 2000 and 2002 he was a postdoctoral researcher at the “Ecole Nationale des Télécommunications”, Paris, France consecutively at the department TSI and then at the
He is a senior member of the IEEE and he served as a member of the Executive Committee of the IEEE Tunisia Section (2013-2015). He has authored and co-authored more than 80 journal papers, conference papers and book chapters. His main research interests include interferometric radar imaging, multitemporal and multiscale image analysis, desertification, flooding and soil salinity mapping from remotely sensed data, applied AI for water resource management and SAR-nanosatellite development.
Keynote VI by Prof. Nicolas Gillis
Title: Nonnegative Matrix Factorization: Sparsity and Linear-Quadratic Model for Hyperspectral Image Unmixing
Given a nonnegative matrix X and a factorization rank r, nonnegative matrix factorization (NMF) approximates the matrix X as the product of a nonnegative matrix W with r columns and a nonnegative matrix H with r rows such that WH approximates X as well as possible. NMF has become a standard linear dimensionality reduction technique. Although it has been extensively studied in the last 20 years, many questions remain open. In this talk, we address two such questions. The first one is about the uniqueness of NMF decompositions, also known as the identifiability, which is crucial in many applications. We provide a new model and algorithm based on sparsity assumptions that guarantee the uniqueness of the NMF decomposition. The second problem is the generalization of NMF to non-linear models. We consider the linear-quadratic NMF (LQ-NMF) model that adds as basis elements the component-wise product of the columns of W, that is, W(:,j).*W(:,k) for all j,k where .* is the component-wise product. We show that LQ-NMF can be solved in polynomial time, even in the presence of noise, under the separability assumption which requires the presence of the columns of W as columns of X. We illustrate these new results on the blind unmixing of hyperspectral images.
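To make the linear-quadratic model concrete, here is a minimal sketch that builds the extended basis described above: the columns of W followed by their component-wise products W(:,j).*W(:,k). It only illustrates the model, not the algorithms presented in the talk. The names `lq_basis` and `mix`, the column-list representation, and the choice to enumerate each unordered pair j ≤ k once (the product is symmetric) are assumptions made for illustration.

```python
from itertools import combinations_with_replacement


def lq_basis(columns):
    """Extended linear-quadratic basis for a list of columns of W.

    columns: list of r columns, each a list of m nonnegative numbers.
    Returns the r original columns followed by the component-wise products
    of every unordered pair of columns (j <= k), i.e. r + r(r+1)/2 columns.
    """
    products = [
        [a * b for a, b in zip(columns[j], columns[k])]
        for j, k in combinations_with_replacement(range(len(columns)), 2)
    ]
    return [list(column) for column in columns] + products


def mix(basis, weights):
    """Nonnegative combination of basis columns: one modeled column of X."""
    length = len(basis[0])
    return [
        sum(weight * column[i] for column, weight in zip(basis, weights))
        for i in range(length)
    ]


# Toy usage with r = 2 hypothetical endmember spectra of length m = 3.
W = [[1, 0, 2], [0, 3, 1]]
basis = lq_basis(W)          # w1, w2, w1.*w1, w1.*w2, w2.*w2
pixel = mix(basis, [1, 2, 0, 3, 0])
print(len(basis), pixel)
```

In hyperspectral unmixing, the linear weights play the role of abundances and the weights on the product columns model second-order interactions between materials.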
Biography of Nicolas Gillis:
Nicolas Gillis is Professor with the Department of Mathematics and Operational Research, University of Mons, Belgium. He is the recipient of the Householder award 2014 and an ERC starting grant in 2016. He is senior area editor for IEEE Transactions on Signal Processing, and associate editor for the SIAM Journals on Matrix Analysis and Applications and on Mathematics of Data Science.