Next planned opening: Fall 2020
IVADO excellence scholarship program for PhD
IVADO’s commitment to equity, diversity and inclusion and note to applicants
To ensure all members of society draw equal benefit from the advancement of knowledge and opportunities in data science, IVADO promotes equity, diversity and inclusion through each of its programs. IVADO aims to provide a recruitment process and research setting that are inclusive, non-discriminatory, open and transparent.
Overview
Description
FAQ
Application
Results - 2018 Contest
Results - 2019 Contest
Results - 2020 Contest
Program description
- Field of study: The IVADO scholarship funding program supports research on the issues raised in the Canada First funding competition: data science in a broad sense, including methodological research in data science (machine learning, operations research, statistics) and its application in a range of sectors, from IVADO's priority sectors (health, transportation, logistics, energy, business and finance) to any other sector of application (sociology, physics, linguistics, engineering, etc.).
- Amount of award and grant period: $25,000 per year for a maximum of 12 sessions (4 years)
- Opening of the application process: December 4th, 2019 9 a.m. EST
- Application deadline: January 8th, 2020 9 a.m. EST
- Expected application notification date: February / March 2020
- Criteria: See the description tab
- Submission: See the submission tab
- Information: programmes-excellence@ivado.ca
Program objectives
The goal of the excellence scholarship program is to support promising students in their training as future highly qualified personnel (researchers, professors, professionals) and, more generally, as future actors in the field of data science, mainly in IVADO members' areas of excellence: operations research, machine learning and decision sciences.
Eligibility
- Scholarship applicants must:
- have earned their master's degree before the application date or be enrolled in the final session of that program. IVADO will be flexible with applicants who provide an adequate explanation for a career interruption or particular circumstances (e.g. pregnancy/maternity or sick leave); this explanation must be included in the application;
- intend to attend HEC Montréal, Polytechnique Montréal, Université de Montréal, McGill University or University of Alberta;
- have a first-class minimum average grade (3.7/4.3 or 3.5/4.0) over the previous years of study;
- not have already received an IVADO PhD scholarship.
- Professor (supervisor) applicants must:
- hold a faculty position as a professor at HEC Montréal, Polytechnique Montréal or Université de Montréal (professors at the University of Alberta and McGill University may act as supervisors provided they are full members of one of these research groups: Mila, CIRRELT, GERAD, CERC Data Science for real-time decision making, CRM, Tech3Lab);
- submit only one application to the competition.
- There are no constraints on co-supervisors.
Funding period
The funding period will start April 1st, 2020 or September 1st, 2020.
Amounts and terms
The funds shall be transferred to the office of research of the supervisor’s university, and the university shall pay the student according to its own compensation rules. For projects that require ethics approval, the funds shall only be paid out once the approval is granted. Some projects may require specific agreements (e.g. pertaining to intellectual property).
Funding may be cut, withheld, delayed or rescinded under the circumstances outlined in the letter of award.
Competitive process
Review and criteria
Applications shall first be screened for compliance with program rules; applications that are incomplete, exceed the page limit or list an ineligible applicant or supervisor will be rejected. Only applications that meet all criteria will be forwarded to the review committee.
The parity-based review committee shall be made up of university professors, none of whom should be listed as a supervisor by any applicant. However, given the small size of the communities in certain areas, it may prove difficult to select expert reviewers who are not included in an application submitted to the competition. In such cases, a reviewer may be required to assess an application despite being listed in another application as a supervisor. An external reviewer may also join the committee. The committee shall ensure by all possible means that such a reviewer does not influence the ranking of the application in which he/she is included.
The review committee will first check the alignment of the research project with IVADO's scientific direction, then rank the applications based on excellence and on each project's fit with IVADO's overarching framework, which aims to promote multidisciplinary collaboration and diversity in data science.
In terms of excellence, the committee will specifically assess:
- Research ability
- Depth and breadth of experience: multidisciplinary and professional experiences, extra-academic activities, collaborations, contributions to the scientific community and society as a whole, etc.
- Fit between the applicant's profile and the proposed project
Final step and commitments
The student shall:
- be physically present at his/her supervisor’s university;
- contribute to IVADO’s community and activities by, for example, taking part in:
- presentations on his/her research;
- training and knowledge dissemination activities;
- consultations;
- activities generally undertaken by career researchers (mentorship, assessment, co-organization of events, etc.);
- recognize that he/she is a member of an academic community to which he/she shall contribute;
- comply with the Tri-Agency Open Access Policy on Publications. Students are encouraged to publish their research findings (papers, recordings of presentations, source code, databases, etc.), in compliance with the intellectual property rules that apply to their own specific case;
- acknowledge the financial support granted by IVADO and the CFREF or FRQ when disseminating research results and, more broadly, in all the activities in which he/she takes part.
The supervisor shall:
- provide a work environment that is conducive to the completion of the project
- oversee the work of the student
FAQ
- Is there a particular format for preparing a CV?
- No, there is no particular format to follow. However, each element of the application must help the reviewer form an opinion of the applicant; a CV that is too long or confusing may make the evaluation more difficult.
- Are there any specific rules for the recommendation letter?
- No, there are no specific rules for the recommendation letter.
- Can candidates send recommendation letters themselves?
- No, recommendation letters can only be uploaded to the platform directly by their authors.
- Can I send my unofficial transcript?
- No, you must upload your official transcript, including all your current results. Originals or certified copies must be scanned and uploaded with the application; for non-Canadian universities, you must also specify the grading scale.
- Can this scholarship be held concurrently with other scholarships?
- This scholarship cannot be held concurrently with other NSERC, SSHRC, CIHR or IVADO scholarships. We do not place restrictions on other sources of funding (as they may be justified in some cases), but we do not encourage them.
- I have already started my PhD; can I apply for a PhD scholarship?
- Yes. If you have already started your doctorate, you must provide both your master's and PhD transcripts.
- My grades from last year make me eligible, but my more recent grades are not good enough (or vice versa). Am I eligible?
- No; however, if you can justify a drop in your grades (with a medical certificate, for example), we may accept your application.
- When can I start using my award at the latest?
- September 1st, 2020.
Didn’t find what you were looking for? Send us an e-mail.
Please apply through: https://ivado.smapply.io/
All applications sent by email will be rejected.
All applications must contain:
- a questionnaire to be completed on the platform, WITH a description of the project (maximum length of one page);
- a sample application form and self-declaration form are available;
- the student's CV (free format), to be uploaded;
- official transcripts up to your master's degree, plus your PhD marks if you have already started your PhD (as well as information on the grading scale when a transcript is issued by a non-Canadian university);
- recommendation letters (a minimum of 2 and a maximum of 3), including a letter uploaded directly by your PhD supervisor (or potential future PhD supervisor).
- Chun Cheng (Polytechnique Montréal, Louis-Martin Rousseau)
- Our project addresses uncertainty in drone routing for disaster response and relief operations. To tackle the uncertainties arising from disaster scenarios, such as uncertain locations and quantities of demand for relief supplies, we use data-driven robust optimization (RO) methods. This technique protects decision makers against parameter ambiguity and stochastic uncertainty by using uncertainty sets. It is therefore critical to set proper parameters for the uncertainty set: a set that is too small cannot accurately capture possible risks, while one that is too large may lead to overly conservative solutions. To address this problem, we use machine learning (ML) techniques to extract information from historical data and real-time observations, and let ML algorithms set the parameters. After calibrating the uncertainty set, we will determine appropriate models for the problem by considering various theories in RO, such as static RO and multi-stage adjustable RO. These approaches will be measured against other applicable approaches such as stochastic programming.
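For illustration only, here is a minimal Python sketch of the calibration idea described above, not the project's actual method: a box uncertainty set for relief demand is fitted from (synthetic) historical data, and a static robust decision covers the worst case inside the set. The data, the 0.95 quantile and the covering rule are all assumptions made for the example.

```python
# Minimal sketch: calibrate a box uncertainty set from historical demand
# samples, then size supply against the worst case inside the set.
import numpy as np

rng = np.random.default_rng(0)
historical_demand = rng.normal(loc=100.0, scale=15.0, size=(500, 3))  # 3 sites

# Data-driven calibration: center the box at the sample mean and set its
# half-width from an empirical quantile of the absolute deviations.
center = historical_demand.mean(axis=0)
half_width = np.quantile(np.abs(historical_demand - center), 0.95, axis=0)

# Static robust decision for a simple covering problem: the worst case of a
# box uncertainty set is its upper face, so supply that much at each site.
worst_case_demand = center + half_width
supply = np.ceil(worst_case_demand)
print("supply per site:", supply)
```

Shrinking the quantile (e.g. to 0.80) yields a smaller set and a less conservative decision, which is exactly the trade-off the abstract describes.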
- Dominique Godin (Université de Montréal, Jean-François Arguin)
- This research project aims to develop and apply machine learning techniques to greatly improve the identification of electrons by the ATLAS detector at the LHC, the largest particle accelerator ever built and one of the most ambitious scientific projects of all time. To carry out the ATLAS program, it is necessary to identify and measure every particle; particles are created there at a rate of 40 billion per second and generate an astronomical flood of data. Among these particles, electrons are of very great importance, but they are also exceedingly rare, representing only a tiny fraction of what is produced. Given the size and complexity of the available data, identifying particles as rare as electrons is an ideal application ground for machine learning methods. Current electron identification algorithms are very simple and make no use of these methods, so a breakthrough in this area would be a world first and could eventually pave the way for major discoveries in particle physics.
- Charley Gros (Polytechnique Montréal, Julien Cohen-Adad)
- Multiple sclerosis (MS) is a disease, highly prevalent in Canada, that leads to major sensory and motor problems. It affects neuronal signal transmission in both the brain and the spinal cord, creating lesions that are observable on images acquired with an MRI scanner. The count and volume of lesions on a patient's MRI scan are crucial indicators of disease status, commonly used by doctors for diagnosis, prognosis and therapeutic drug trials. However, lesion detection is very challenging and time-consuming for radiologists, owing to the high variability of lesion size and shape. This project aims at developing a new, automatic and fast method for MS lesion detection on spinal cord MRI data, based on newly developed machine learning algorithms. The new algorithm's performance will be tested on a large dataset involving patients from hospitals around the world. Once the algorithm is optimized, it will be made freely available as part of an open-source software package already widely used for spinal cord MRI processing and analysis. A fundamental goal of this project is the integration of this algorithm into hospitals to help radiologists in their daily work.
- Thomas Thiery (Université de Montréal, Karim Jerbi)
- When we are walking through a crowd or playing a sport, our brain continuously makes decisions about directions to go, obstacles to avoid and information to pay attention to. Fuelled by the successful combination of quantitative modeling and neural recordings in nonhuman primates, research into the temporal dynamics of decision-making has brought the study of decision-making to the fore within neuroscience and psychology, and has exemplified the benefits of convergent mathematical and biological approaches to understanding brain function. However, studies have yet to uncover the complex dynamics of the large-scale neural networks involved in dynamic decision-making in humans. The present research aims to use advanced data analytics to identify the neural features involved in tracking the state of sensory evidence and confirming the commitment to a choice during a dynamic decision-making task. To this end, we will use cutting-edge electrophysiological brain imaging (magnetoencephalography, MEG) combined with multivariate machine learning algorithms. This project will, for the first time, shed light on the whole-brain, large-scale dynamics involved in dynamic decision-making, thus providing empirical evidence that can be generalized across subjects to test and refine computational models and neuroscientific accounts of decision-making. By providing a quantitative link between the behavioral and neural dynamics subserving how decisions are continuously formed in the brain, this project will help expose mechanisms that are likely to figure prominently in human cognition, in health and disease. Moreover, this research may provide neurobiologically inspired contributions to machine learning algorithms that implement computationally efficient gating functions capable of making decisions in a dynamically changing environment. In addition to advancing our knowledge of the way human brains come to a decision, we also foresee long-term health implications for disorders such as Parkinson's disease.
- Francis Banville (Université de Montréal, Timothée Poisot)
- Ecological interaction networks and climate change: inference and modelling with machine learning techniques
- Avishek Bose (McGill University, William Hamilton)
- Domain Agnostic Adversarial Attacks for Security and Privacy
- Lluis E. Castrejon Subira (Université de Montréal, Aaron Courville)
- Self-Supervised Learning of Visual Representations from Videos
- Elodie Deschaintres (Polytechnique Montréal, Catherine Morency)
- Modelling interactions between transportation modes by integrating different data sources
- Laura Gagliano (Polytechnique Montréal, Mohamad Sawan)
- Artificial Neural Networks and Bispectrum for Epileptic Seizure Prediction
- Ellen Jackson (Université de Montréal, Hélène Carabin)
- Evaluation of a Directed Acyclic Graph for Cysticercosis using Multiple Methods
- Mengying Lei (McGill University, Lijun Sun)
- Spatial-Temporal Traffic Pattern Analysis and Urban Computation Applications based on Tensor Decomposition and Multi-scale Neural Networks
- Tegan Maharaj (Polytechnique Montréal, Christopher Pal)
- Deep ecology: Bringing together theoretical ecology and deep learning
- Antoine Prouvost (Polytechnique Montréal, Andrea Lodi)
- Learning to Select Cutting Planes in Integer Programming
- Matthew Schlegel (University of Alberta, Martha White)
- Representing the World Through Predictions in Intelligent Machines
- Md Rifat Arefin (Université de Montréal, Irina Rish)
- Developing Biologically inspired Deep Neural Network for Continual Lifelong Learning
We humans are able to learn continually throughout our lifetimes, a capability called lifelong learning. This capability is also crucial for computational systems that interact with the real world and process continuous streams of data. However, current deep learning systems struggle to continually acquire incremental information from non-stationary data distributions: they tend to forget previously acquired knowledge upon learning new information, a problem known as catastrophic forgetting. In this project, we will study the biological factors of lifelong learning and their implications for the design of biologically motivated neural network architectures that improve the lifelong learning capability of computational systems by mitigating catastrophic forgetting.
- Sumana Basu (McGill University, Doina Precup)
- Off Policy Batch Reinforcement Learning for Healthcare
Artificial Intelligence (AI) has an increasing impact on our everyday life, including in health care. Today, most successful applications of AI in healthcare are for diagnosis or prediction, not for treatment. But AI agents also have potential for sequential decision making, such as assisting doctors in reassessing treatment options or in surgery. The branch of AI that is a natural fit for such sequential decision-making problems is Reinforcement Learning (RL). So far, most successful applications of RL have been in video game environments; there are relatively few applications of RL in healthcare. One reason is that, unlike in games, RL agents in healthcare cannot interact with the environment to explore new possibilities and learn the optimal treatment policy: trying new treatment options on patients without knowing their consequences is not only unethical but potentially fatal. The agent therefore has to learn retrospectively from previously collected batches of data; in the RL literature, this is called off-policy learning. Challenges in off-policy evaluation, sparse rewards, non-stationary data and sample inefficiency are some of the roadblocks to using RL safely and successfully in healthcare. During my PhD, I aim to tackle some of these challenges in the context of healthcare.
- Christopher Beckham (Polytechnique Montréal, Christopher Pal)
- Unsupervised representation learning
Unsupervised representation learning is concerned with using deep learning algorithms to extract 'useful' features (latent variables) from data without any external labels or supervision. This addresses one of the issues with supervised learning, namely the cost and lack of scalability of obtaining labeled data. The techniques developed in this field have broad applicability, especially for training smart 'AI agents' and in domains where obtaining labeled data is difficult. 'Mixup' (Zhang et al.) is a recently proposed class of data augmentation techniques that augment a training set with extra 'virtual' examples, constructed by 'mixing' random pairs of examples from the training set and optimizing some objective on those mixed examples. While the original mixup algorithm simply performed these mixes in input space (which comes with a few limitations) for supervised classification, recent work (Verma et al., Yaguchi et al.) proposed performing these mixes in the latent space of the classifier instead, achieving superior results. One intuitive way to think about 'latent space mixing' is to imagine that the original data is generated by many latent variables, whose possible configurations grow exponentially with the number of latent variables; we therefore see only a very small subset of those configurations in our training set. Mixup can thus be seen as allowing the network to explore novel combinations of the latent variables it has inferred (combinations which may not already be present in the training set), making the network more robust to novel configurations of latent states (i.e. novel examples) at test time. Empirical results from the works cited corroborate this hypothesis.
The first stage of my PhD explored mixup in the context of unsupervised representation learning (building on the work of Verma et al., which I also co-authored), in which the goal is to learn useful latent variables from unlabeled data. This was done by leveraging ideas from adversarial learning and devising an algorithm that is able to mix between encoded states of real inputs and decode them into realistic-looking inputs indistinguishable from the real data. We showed promising results both qualitatively and quantitatively, and recently published our findings at the NeurIPS 2019 conference.
Some preliminary experiments suggest that one of our proposed variants of 'unsupervised mixup' has a connection to 'disentangled learning', which explores the inference of latent variables that are conceptually 'atomic' but can be arbitrarily composed together to produce more abstract concepts (similar to how we as humans structure information in the brain). This lays the groundwork for more exciting research to pursue during my PhD.
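The mixing operation described above is simple enough to sketch. Below is a minimal input-space mixup in Python with NumPy on a synthetic batch (the function name and data are illustrative, not the authors' code); the latent-space variants discussed above apply the same interpolation to encoder activations instead of raw inputs.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, seed=0):
    """Return convex combinations of random example pairs and their labels."""
    rng = np.random.default_rng(seed)
    lam = rng.beta(alpha, alpha)             # mixing coefficient in (0, 1)
    perm = rng.permutation(len(x))           # random pairing of examples
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]    # soft labels for mixed inputs
    return x_mix, y_mix

x = np.random.rand(8, 32 * 32 * 3)                 # toy batch of flattened images
y = np.eye(10)[np.random.randint(0, 10, size=8)]   # one-hot labels
x_mix, y_mix = mixup_batch(x, y)                   # train on (x_mix, y_mix)
```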
- Antoine Boudreau LeBlanc (Université de Montréal, Bryn Williams-Jones)
- Ecosystemic bioethics and big data: health, agriculture and ecology
Today's problems are global, linking society, the economy and the environment to health. Antibiotic resistance, for example, stems from the misuse of antibiotics in healthcare and agriculture, which reduces their effectiveness. Attacking this problem requires broad collaborations between physicians, farmers and ecologists, but these remain limited by a good number of technical challenges (e.g. data sharing) and ethical ones (consent, security) that appear as soon as data and knowledge are integrated for concerted action. The objective of this thesis is to study the issues affecting the circulation of data between health, agriculture and ecology, in order to propose a data governance model that maximizes data access and protection in support of research, surveillance and intervention, while maintaining the trust of data providers.
This project will base its ethical analysis on a mapping of the relationships between the key stakeholders who could support a data-sharing network across health, agriculture and ecology. Four case studies are under way, describing the process of building this network at the interministerial, intersectoral, interprofessional and interpersonal levels (ethics certification obtained). The ethnographic design, carried out in close collaboration with these four host settings, will support the writing of a governance framework through grounded theory. It will then be compared with international initiatives (Denmark, England, United States). This thesis will support the implementation of structuring intersectoral data-sharing networks in veterinary medicine in Quebec and will lay the foundations of a governance framework for interconnecting databases across organizations and sectors.
- Chloé Bourquin (Polytechnique Montréal, Jean Provost)
- Measuring cerebral pulsatility and its impact on cognition in vascularly compromised mice using ultrasound imaging
Cardiovascular diseases can cause accelerated brain aging. Arteries such as the aorta and the carotids are rich in elastic fibres, which smooth out fluctuations in blood pressure (pulsatility) over the cardiac cycle in the downstream cerebral vessels. With age and disease, arteries stiffen, increasing downstream pulsatility and leading to microvascular damage. Mapping pulsatility throughout the cerebral vascular network could therefore become a biomarker for diagnosing neurodegenerative diseases. Until recently, following the pulse through the cerebral vascular network was not possible: optical microscopy can only measure microvessels at the surface of the brain, while high-field MRI can image the whole brain but lacks the spatiotemporal resolution and sensitivity to measure small vessels. A new ultrasound technique could meet this challenge: Ultrasound Localization Microscopy (ULM). Based on localizing and tracking microbubbles injected as contrast agents, it can map vessels across the whole brain with a resolution on the order of 5 µm. However, this method requires tracking microbubbles for 10 minutes to obtain a single image of the cerebral vasculature. Our objective is to make this method dynamic by synchronizing it with the ECG and respiration, so as to obtain not a single image but a movie of at least one cardiac pulsation, in order to observe variations in blood flow velocity over the cycle and derive the pulsatility. This new method will make it possible to demonstrate, for the first time, the variation of pulsatility across the whole brain; to establish a correlation between increased pulsatility, cognitive loss and brain damage; and to establish pulsatility measurement as a biomarker for tracking the progression of cardiovascular and/or neurodegenerative diseases.
- Xinyu Chen (Polytechnique Montréal, Nicolas Saunier)
- City-Scale Traffic Data Imputation and Forecasting with Tensor Learning
With recent advances in sensing technologies, large-scale and multidimensional urban traffic data are collected on a continuous basis, from both traditional fixed traffic sensing systems (e.g., loop detectors and video cameras) and emerging crowdsourced/floating sensing systems (e.g., GPS trajectories from taxis/buses and Google Waze). These data sets provide unprecedented opportunities for sensing and understanding urban traffic dynamics and for developing efficient and reliable smart transportation solutions. For example, forecasting the demand and states (e.g., speed, volume) of urban traffic is essential to a wide range of intelligent transportation system (ITS) applications such as trip planning, travel time estimation, route planning and traffic signal control, to name just a few. However, two critical issues undermine the use of these data sets in real-world applications: (1) missing data and noise make it difficult to recover the true signal, and (2) processing large-scale data sets for online applications (e.g., traffic prediction) is computationally expensive. The goal of this project is to develop a new framework to better model local consistencies in spatiotemporal traffic data, such as the sensor dependencies and temporal dependencies resulting from traffic flow dynamics. The scientific objectives are to: (1) develop nonconvex low-rank matrix/tensor completion models that incorporate spatiotemporal dependencies/correlations (e.g., graph Laplacian [spatial] and time series [temporal]) and traffic domain knowledge (e.g., fundamental diagram, traffic equilibrium, and network flow conservation); (2) incorporate Gaussian process kernels and neural network structures (e.g., recurrent neural networks (RNN), attention mechanisms) to better characterize the complex correlation structure.
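As an illustration of the low-rank completion family this project builds on, here is a minimal NumPy sketch on synthetic data (not the project's models): missing entries of a sensor-by-time matrix are imputed by repeatedly soft-thresholding singular values while keeping the observed entries fixed. The rank, threshold and observation rate are assumptions made for the example; the project's contribution lies in the spatiotemporal regularizers and domain knowledge listed above.

```python
import numpy as np

rng = np.random.default_rng(1)
true = rng.random((30, 4)) @ rng.random((4, 200))   # rank-4 "traffic" matrix
mask = rng.random(true.shape) < 0.7                 # 70% of entries observed
X = np.where(mask, true, 0.0)

Z = X.copy()
for _ in range(100):
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    Z = (U * np.maximum(s - 1.0, 0.0)) @ Vt         # shrink singular values
    Z[mask] = X[mask]                               # keep observed entries

err = np.abs(Z[~mask] - true[~mask]).mean()
print(f"mean absolute imputation error: {err:.3f}")
```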
- Abhilash Chenreddy (HEC Montréal, Erick Delage)
- Inverse Reinforcement Learning with Robust Risk Preference
RL/IRL methods provide powerful tools for solving a wide class of sequential decision-making problems under uncertainty. However, the practical use of these techniques has historically been limited by multiple factors: the presence of high-dimensional continuous state and action spaces in many real-world decision problems, the stochastic and noisy nature of real-world systems compared to simulated environments, and the indifference of traditional reward and utility functions to the risk preference of the agent. One typical modeling premise in RL/IRL is to optimize the expected utility (i.e., to assume that humans are risk-neutral), which deviates from actual human behavior under ambiguity. I am excited about the possibility of directing my future research towards building risk-aware MDP models, as they would provide stronger reliability guarantees than their risk-neutral counterparts; recent work suggests such an effort can provide stable solutions for high-dimensional state space problems, making them more applicable to practical use cases. As an effort in this direction, under the guidance of Prof. Erick Delage, I am working on developing risk-aware IRL/RL algorithms for portfolio selection problems. Applications I am interested in include, but are not limited to, (i) learning the agent's risk profile using inverse learning methods and (ii) risk-sensitive exploration in the RL setting. Our work formulates the inverse learning model from a distributionally robust optimization (DRO) point of view, in which the agent performs at least as well as the expert in terms of the risk-sensitive objective. We plan to achieve this by building an ambiguity set for the expert's risk preference and training the agent with a worst-case approach, thus shielding the agent from ambiguity in the underlying risk distribution.
- Theophile Demazure (HEC Montréal, Pierre-Majorique Léger)
- Deep learning and cognitive-state classification for real-time modulation of human-machine interactions in automated environments
The world of work is being profoundly transformed. Technologies such as robotics and applications of artificial intelligence are increasingly integrated into work tasks. The objective of this research is to take the human into account in an environment made up of machines. Such machines cannot perceive that the employee they are collaborating with is tired, mentally absent or simply distracted. A colleague in that situation would adjust, or would alert the employee so that he or she could regain focus; the machine, by contrast, would carry on without adjusting, increasing the risk of accident or error. To address this problem, this project focuses on developing a system that adapts to its user's cognitive state, such as fatigue and mental workload. Brain-computer interfaces use neurophysiological measurements of the human user to monitor, adapt or be controlled. Internally, machine learning algorithms classify the user's cognitive state from data captured in real time. Using the electrical signals emitted by the brain and the dilation of the pupil, it is possible to discriminate between several operator states over time.
The prototype developed will thus be able to instruct other machines to slow their pace, or to issue a warning, when the employee with whom they are collaborating appears tired or insufficiently alert. The prototype will be developed and evaluated in the laboratory, in a controlled environment, as a proof of concept for industry. Brain-computer interfaces are currently used mainly in medicine, for prostheses, speech assistance systems and wheelchairs. The expected benefits lie mainly in workplace safety (transportation, manufacturing) and in the optimization of human-machine interaction (human-machine collaboration).
- Sébastien Henwood (Polytechnique Montréal, François Leduc-Primeau)
- Coded Neural Network
Deep neural networks are enjoying widespread enthusiasm at the start of this decade. However, progress in this field comes with computational requirements that are growing faster than Moore's law. In this context, we seek to propose a set of methods for optimizing the energy requirements of deep neural networks, taking into account the physical characteristics (memory, processor, etc.) of the system hosting the network for its end use.
The objective is a method general enough to adapt to the varied tasks and networks that designers may wish to deploy in their applications, while reducing the energy cost according to a controllable trade-off between network capacity and energy.
This work would, on the one hand, save energy on user devices (for example, smartphones), thereby enabling offline use; on the other hand, it targets data-centre workloads, which are notoriously energy-hungry.
In the long run, this research project will make it possible to get the most out of the resources allocated to machine learning in its deployment phase, helping to ensure its social acceptability as well as its technical and economic viability.
- Jad Kabbara (McGill University, Jackie Cheung)
- Computational Investigations of Pragmatic Effects in Language
This thesis focuses on natural language processing (NLP), specifically computational pragmatics, using deep learning methods. While most NLP research today focuses on semantics (the literal meaning of words and sentences), my research takes a different approach: I focus on pragmatics, which deals with the intended, context-dependent meaning of sentences. Correctly performing pragmatic reasoning is at the core of many NLP tasks, including information extraction, summarization, machine translation and sentiment/stance analysis. My goal is to develop computational models where pragmatics is a first-class citizen, both in natural language understanding and in generation. I have already made strong progress toward this goal: I developed a neural model for definiteness prediction [COLING 2016], the task of determining whether a noun phrase should be definite or indefinite, in contrast to prior work relying on heavily engineered linguistic features. This has applications in summarization, machine translation and grammatical error correction. I also introduced the new task of presupposition triggering detection [ACL 2018, best paper award], which focuses on detecting contexts where adverbs (e.g. "again") trigger presuppositions (e.g. "John came again" presupposes "he came before"). This work is important because it is a first step towards language technology systems capable of understanding and using presuppositions, and because it constitutes an interesting testbed for pragmatic reasoning. Moving forward, I propose to examine the role of pragmatics, particularly presuppositions, in language understanding and generation. I will develop computational models and corpora that incorporate this understanding to improve: (1) summarization systems, e.g. in a text rewriting step that learns how to appropriately place adverbs in generated sentences to make them more coherent; and (2) reading comprehension systems, where pragmatic effects are crucial for the proper understanding of texts and where systems should answer questions of a pragmatic nature whose answers are not found explicitly in the text. The thesis will thus present the first study of presuppositional effects in language aimed at enabling pragmatically empowered natural language understanding and generation systems.
- Computational Investigations of Pragmatic Effects in Language
- Caroline Labelle (Université de Montréal, Sébastien Lemieux)
- Enhancing the Drug Discovery Process: Bayesian Inference to Evaluate Efficacy Characteristics of Potential Drugs Under Uncertainty
During the multi-phase drug-discovery process, many compounds are tested in various assays, which generates a great deal of data from which efficacy metrics (EM) can be estimated. Compounds are selected with the aim of identifying at least one that is sufficiently potent and efficient to go into preclinical testing. This selection is based on the EM meeting a specific threshold or on comparison with other compounds.
Current analysis methods offer point estimates of the EM and hardly consider the inevitable noise in experimental observations, thus failing to report the uncertainty on the EM and precluding its use during compound selection. We propose to extend our previously introduced statistical methods (EM inference and pairwise comparison) to the ranking of a panel of compounds and to combinatorial analysis (multiple compounds tested simultaneously). Given an EM threshold, we aim to identify the compounds with the highest probability of meeting that criterion.
We use a hierarchical Bayesian model to infer EM from dose-response assays (single- and multi-dose), yielding empirical distributions for the EM of interest rather than single point estimates. The assay's uncertainty can thus be propagated to the EM inference and to compound selection. We are thereby able to identify all compounds in an experimental dose-response dataset with at least a 1% chance of being among the best for various given EM, and to characterize the effects of each compound in a combinatorial assay.
This novel methodology is developed and applied to the identification of novel compounds able to inhibit the cellular growth of leukemic cells.
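As a toy illustration of the general idea, not the authors' hierarchical model: the sketch below computes a grid posterior over the IC50 of a fixed-slope Hill curve, turning noisy (synthetic) dose-response points into a distribution over an efficacy metric from which selection probabilities can be read off. The data, noise level and 1 µM threshold are assumptions made for the example.

```python
import numpy as np

doses = np.array([0.01, 0.1, 1.0, 10.0, 100.0])        # µM, synthetic assay
responses = np.array([0.97, 0.90, 0.55, 0.12, 0.05])   # fraction of growth
sigma = 0.08                                           # assumed assay noise

def hill(dose, ic50, slope=1.0):
    """Simple descending Hill dose-response curve."""
    return 1.0 / (1.0 + (dose / ic50) ** slope)

ic50_grid = np.logspace(-2, 2, 400)
# Gaussian likelihood evaluated on a log-spaced IC50 grid, normalized
# to a posterior (implicitly a log-uniform prior over the grid).
log_lik = np.array([-0.5 * np.sum((responses - hill(doses, c)) ** 2) / sigma**2
                    for c in ic50_grid])
post = np.exp(log_lik - log_lik.max())
post /= post.sum()

# Probability that the compound's IC50 beats a 1 µM potency threshold.
print("P(IC50 < 1 µM) =", post[ic50_grid < 1.0].sum())
```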
- Sébastien Lachapelle (Université de Montréal, Simon Lacoste-Julien)
- Uncertainty in Operations Research, Causality and Out-of-Distribution Generalization
My research focuses on two main directions: widening the operations research toolbox using recent advances in deep learning, and learning causal structures. Both have the potential to be useful in various applications, for example the optimization of railway operations, gene expression studies and the understanding of different protein interactions in human cells. Together with Emma Frejinger and her team at the CN chair, we developed a methodology that predicts tactical solutions given only partial knowledge of the problem, using deep neural networks. We demonstrated the efficiency of the approach on the problem of booking intermodal containers on double-stack trains. Moreover, we are currently working on applying machine learning techniques to standard operations research problems such as the knapsack and travelling salesman problems, in the hope of gaining insight into the classical algorithms used to solve them.
More recently, I have been interested in the nature of causal reasoning and how machines could acquire it. Typical machine learning systems are good at finding statistical dependencies in data, but often lack the causal understanding necessary to predict the effect of an intervention (e.g. the effect of a drug on the human body). Together with my co-authors, we developed "Gradient-Based Neural DAG Learning", a causal discovery algorithm that aims at going beyond simple statistical dependencies. We showed the algorithm was capable of finding known causal relationships between multiple proteins in human cells.
In the future, I will work to make machine learning more adaptive and able to reuse past knowledge in order to learn new patterns faster. This is something humans do all the time, but it is hard for current algorithms. I believe causality is part of the answer, but other frameworks like meta-learning, transfer learning and reinforcement learning will also be necessary. Apart from bringing us closer to human-level intelligence, progress in this direction would benefit many applications. For instance, if a machine learning system is used to predict tactical solutions to a railway optimization problem, the distribution of problems it faces might shift due to changes in trade legislation, rendering the predicted solutions far from optimal. We should aim to build systems that can adapt to a changing world quickly.
- Maude Lizaire (Université de Montréal, Guillaume Rabusseau)
- Connections between recurrent networks, weighted automata and tensor networks for learning with sequential data
Several times in history, discoveries have been made in parallel by multiple scientists. One need only think of infinitesimal calculus, developed independently by Newton, influenced by his work on the universal laws of motion, and by Leibniz, inspired by the philosophical principle of the infinitely small. At the intersection of several disciplines, such discoveries reach their full potential only through the contribution of different kinds of expertise. In the same vein, many equivalences can be drawn between the formalisms developed in physics and in artificial intelligence. In particular, a pillar of the modern formulation of quantum physics, tensor networks, can be related to recurrent networks, one of the main families of deep learning models suited to structured data. The latter are in turn connected to weighted automata, models at the heart of formal methods and verification in theoretical computer science. Exploring the links between these three methods (tensor networks, recurrent networks and weighted automata) makes it possible to take advantage of the theoretical guarantees offered by formal methods and of the expressiveness and numerous applications of recurrent networks, while building a bridge to the applications of tensor networks in quantum materials and quantum computing. The project thus aims to create bridges between these disciplines and to exploit progress made in one for the benefit of the others.
- Elena Massai (Université de Montréal, Marina Martinez)
- Neuroprosthesis development to recover the gait after spinal cord injury in rats
Spinal cord injury (SCI) interrupts the communication between the brain and the spinal locomotor networks, causing leg paralysis. When SCI is incomplete (iSCI), some nerve fibers survive the lesion, and patients with iSCI can eventually regain some motor abilities. The goal of this study is to assess, in the rat model, whether combined brain and spinal stimulation can lead to superior locomotion recovery after spinal cord injury. Artificial intelligence (AI) techniques will be employed to track motor activity, drive the stimulation and optimize the strategy in real time. By refining the spatiotemporal stimulation parameters, the intelligent algorithm will help the rat's brain generate leg trajectories that feature better clearance of the ground during swing, stronger leg extension and a higher posture during stance. We expect that optimized neuroprosthetic stimulation will result in locomotor patterns that are more similar to those of intact rats and will facilitate the recovery of voluntary control of locomotion. The results will provide a framework for the future development of efficient neuromodulation interfaces and prosthetic approaches for rehabilitation.
- Antoine Moevus (Polytechnique Montréal, Benjamin De Leener)
- Quantitative susceptibility mapping framework for assessing cortical development in neonates after severe deoxygenation at birth
Hypoxic ischemic encephalopathy (HIE) is a newborn brain pathology that is common but, unfortunately, not well understood. HIE affects 1.5 per 1000 live births in developed countries and is the leading cause of death and of devastating sequelae in terms of neonates' cognitive, behavioural and physical disabilities. The most effective clinical treatment, therapeutic hypothermia, improves the survival rate; however, the repercussions of HIE remain unclear for survivors. As of today, the understanding of altered cortical growth mechanisms after HIE is incomplete, but a promising non-invasive magnetic resonance imaging (MRI) technique called quantitative susceptibility mapping (QSM) provides new brain biomarkers that can help us understand how HIE affects brain development. Yet, because cortical development in neonates is rapid and sophisticated, standard clinical neuroimaging tools, such as MRI templates, are not suited for neurodevelopmental analysis in neonates.
We therefore propose to implement new methods for solving the QSM reconstruction problem and to improve on the common MRI template by developing adaptive, age-based longitudinal templates. We will adopt a data-driven strategy based on deep learning in order to create a new framework for the pediatric and neurology communities.
- Alexis Montoison (Polytechnique Montréal, Dominique Orban)
- Multi-precision methods for optimization and linear algebra
This research project aims to develop methods capable of switching from one machine precision to another during the solution of large-scale optimization problems, performing the bulk of the operations in low precision, where they are cheap and require little energy. Our preliminary results indicate energy savings of up to 90% on certain problems.
These methods apply notably to systems biology, which requires solutions in quadruple precision, and to machine learning, where half precision is increasingly popular. On emerging specialized platforms that natively support these new precisions, such as Nvidia's Turing graphics cards, which implement half precision, or the IBM Power9 processor, which implements quadruple precision, these methods will be able to extract the full benefit of multi-precision computing.
In the era of big data and the information explosion, algorithms that deliver significant energy savings on suitable platforms are an investment in Canada's future, in terms of both the volume of data that can be processed and the environment.
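A classical instance of this precision-switching idea is iterative refinement, sketched below in NumPy on a synthetic system (illustrative only, not the project's code): the system is solved cheaply in float32, and accuracy is recovered with residual corrections accumulated in float64.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((200, 200)) + 200 * np.eye(200)   # well-conditioned system
b = rng.random(200)

A32 = A.astype(np.float32)                       # cheap, low-precision copy
x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)

for _ in range(3):
    r = b - A @ x                                # residual in float64
    # Correction solved in low precision, accumulated in high precision.
    x += np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)

print("residual norm:", np.linalg.norm(b - A @ x))
```

A production implementation would factorize the low-precision matrix once (e.g. an LU decomposition) and reuse the factors for every correction, which is where the energy savings come from; np.linalg.solve refactorizes each call and is used here only to keep the sketch short.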
- Amine Natik (Université de Montréal, Guillaume Lajoie)
- Decomposition of information encoded in learned representations of recurrent neural networks
The human brain contains billions of neurons that communicate with each other through trillions of synapses, enabling us to learn new skills, solve complex tasks and understand intricate concepts. Everything we do, such as walking, eating, communicating and learning, is a function of these neurons firing in certain patterns, in specific locations. This sophisticated biological neural network is the outcome of millions of years of evolution. Recent advances in deep learning have produced several artificial neural network architectures for solving complex learning tasks, taking simplified inspiration from the neural circuits in our brains. Examples include convolutional neural networks for image and audio processing, recurrent neural networks for sequence learning, and autoencoders for dimensionality reduction. Both biological and artificial networks rely on the efficient calibration of synapses (or connection weights) to match desired behaviours. This adjustment is how a network "learns", but it is a complicated process that is not well understood. An important feature of networks after learning is the internal low-dimensional representation found in the joint activity of neural populations that emerges upon performing a learned task. The present research aims to explore these internal representations and to address the question of how structural properties of network connectivity impact the geometry, dimensionality and learning mechanisms encoded by these internal features. We plan to answer this question by leveraging multidisciplinary data exploration tools from graph signal processing, dimensionality reduction, representation learning and dynamical systems. We expect this project to yield a better understanding of how natural and artificial neural networks solve complicated tasks, which in turn will help us improve existing architectures and build new models from deeper understanding rather than trial and error.
- Cédric Poutré (Université de Montréal, Manuel Morales)
- Statistical Arbitrage of Internationally Interlisted Stocks
In this project, we will investigate a novel form of statistical arbitrage that combines artificially created financial instruments in a high-frequency setting, meaning that we operate on the millisecond timescale. These instruments will be constructed in such a way that they offer very interesting statistical properties, enabling us to exploit violations of the law of one price between the Canadian and American markets. This arbitraging activity is essential, since it makes the markets more efficient by eliminating mispricing in equities quoted on both. The novel strategy will be tested on a large basket of equities across three trading venues in North America; given that we are working in high frequency, millions of market observations are ingested and analyzed daily by our trading algorithms. In order to be proactive in the markets, to make extremely fast and accurate predictions, and because of the complex nature and abundance of financial data, we will rely on machine learning algorithms to guide our trading decisions.
- Carter Rhea (Université de Montréal, Julie Hlavacek-Larrondo)
- A Novel Deep Learning Approach to High-Energy Astrophysics
- Despite machine learning's recent rise to stardom in the applied sciences, the astronomy community has been reluctant to accept it. We propose to gently introduce several forms of machine learning to the community through the study of the hot gas pervading galaxy clusters. Currently, emission spectra from galaxy clusters are studied by fitting physical models to them and using those models to extract relevant physical parameters. Unfortunately, there are several inherent pitfalls with this method. We plan to train different algorithms, from a random forest classifier to a convolutional neural network, to parse the necessary thermodynamic variables from the emission spectra. The fundamental goal of this project is to create an open-source pipeline and suite of tutorials that integrate machine learning into the study of galaxy clusters.
- Charly Robinson La Rocca (Université de Montréal, Emma Frejinger)
- Learning solutions to the locomotive scheduling problem
Given a set of demands on a railway network, how should one assign locomotives to trains in order to minimize total costs while satisfying operational constraints? This question is critical for Canada's largest railway company, Canadian National Railways. Given the size of their network, even a small relative gain in efficiency would produce significant savings. The goal of this research is to explore recent advances in machine learning in order to efficiently solve the locomotive assignment problem. The idea is to train a neural network on precomputed solutions of the problem, with the aim of learning the correct configuration of locomotives for a given train. By combining integer programming and deep learning, the computational time can be reduced by at least an order of magnitude compared to integer programming alone, a solution that is significantly more efficient and practical for train operators. A toy sketch of the learn-from-precomputed-solutions idea follows.
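The sketch below is illustrative only (not CN's problem or the project's code): it substitutes a small 0/1 knapsack for the locomotive assignment problem, solves many instances exactly by dynamic programming, and trains a classifier to predict which items the optimal solution selects from simple per-item features. All instance sizes and features are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def solve_knapsack(values, weights, capacity):
    """Exact 0/1 knapsack by dynamic programming; returns a selection mask."""
    n = len(values)
    best = np.zeros((n + 1, capacity + 1))
    for i in range(1, n + 1):
        for c in range(capacity + 1):
            best[i, c] = best[i - 1, c]
            if weights[i - 1] <= c:
                cand = best[i - 1, c - weights[i - 1]] + values[i - 1]
                best[i, c] = max(best[i, c], cand)
    take, c = np.zeros(n, dtype=int), capacity   # backtrack the solution
    for i in range(n, 0, -1):
        if best[i, c] != best[i - 1, c]:
            take[i - 1], c = 1, c - weights[i - 1]
    return take

X, y = [], []
for _ in range(300):                             # precomputed training solutions
    values = rng.integers(1, 100, size=10)
    weights = rng.integers(1, 50, size=10)
    take = solve_knapsack(values, weights, capacity=100)
    for v, w, t in zip(values, weights, take):
        X.append([v, w, v / w])                  # per-item features
        y.append(t)

clf = LogisticRegression(max_iter=1000).fit(X, y)
print("training accuracy:", clf.score(X, y))
```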
- Davood Wadi (HEC Montréal, Sylvain Sénécal)
- Cognition-Based Auto-Adaptive Website User Interface in Real Time
A personal message designed specifically for the needs and tastes of consumers has always been the goal of media outlets, retailers and social activists. Here at Tech3Lab, we are launching a massive study of personalization in an unprecedented way: by analyzing neurophysiological and psychophysiological signals of the body to determine the best possible look and feel of websites, so as to improve user experience and best convey the intended message.
Previously, auto-adaptive website personalization was carried out mostly by guesswork and theory, with no real evidence for the parameters used. Thanks to the equipment at Tech3Lab, such as EEG, fNIRS, physiological measurement instruments and eye-tracking measures, we are able to base our adaptive system on direct signals from the body.
This interdisciplinary study across cognitive neuroscience, marketing and data science has the potential to revolutionize the approach of designers, developers and editors to website design by studying auto-adaptive websites using direct body measures.
- Zichao Yan (McGill University, William Hamilton)
- Bridging the gap between structures and functions: learning interpretable graph neural representation of RNA secondary structures for functional characterization
Cells are the basic units of life, and their activity is regulated by many delicate subcellular processes that are crucial to their survival. It is therefore important to gain more insight into the complex control mechanisms at play, both to obtain a better fundamental understanding of biology and to help understand diseases caused by defects in these mechanisms. We are particularly interested in the regulatory roles played by RNA molecules in the post-transcriptional phase, such as subcellular localization and RNA-protein interactions. RNA secondary structure, a representation of how RNA sequences fold onto themselves, can have a significant impact on a molecule's regulatory functions through its interaction with various mediating agents such as proteins, RNAs and small molecules. Therefore, in order to fully exploit RNA secondary structures for a better understanding of their functions, we propose a novel framework of interpretable graph neural representations of RNAs. This may ultimately lead us to the design of RNA-based therapeutics for diseases such as neurodegenerative disorders and cancers, whose success would crucially depend on our ability to understand the relations between RNA structures and functions.