The Fundamental Research Funding Program provides a framework for fostering multidisciplinary research in data science and for confirming and strengthening the IVADO community as a key player in the field. The program is also strongly forward-looking: it encourages the training of future data researchers and the creation of the scientific foundations of tomorrow’s fundamental and applied research.
IVADO’s commitment to equity, diversity and inclusion and note to applicants
To ensure all members of society draw equal benefit from the advancement of knowledge and opportunities in data science, IVADO promotes equity, diversity and inclusion through each of its programs. IVADO aims to provide a recruitment process and research setting that are inclusive, non-discriminatory, open and transparent.
Program description
- Program name: IVADO funding program for fundamental research projects
- Program type: Grant for multidisciplinary team
- Type of research: Fundamental or applied research
- Strategic / priority domain: Data science, data-driven innovation
Goals
The goals of this program are to:
- Promote multidisciplinary research in data science, primarily in IVADO’s areas of excellence: operations research, machine learning and decision sciences.
- Lay the groundwork for subsequent research, fundamental or applied.
Deadlines
- Application opening: November 12th, 2019 9 a.m. EST
- Submission deadline: December 11th, 2019 9 a.m. EST
- Expected application notification date: April 2020 ***postponed to May 2020***
- Start date of funding: April 1st or September 1st 2020
- Criteria: See description tab
- Submission: See submission tab
- Information: programmes-excellence@ivado.ca
Research area supported
The fundamental research funding program supports fundamental research concerning the issues described in the awarded Apogée/CFREF grant: data science in a broad sense, including methodological research in data science (machine learning, operations research, statistics) and its applications in various fields.
To achieve the objectives described in this funding request, the scientific strategy focuses on two fundamental aspects of intelligent behaviour: knowledge and the ability to use it, or, in other words, on understanding, forecasting and decision-making processes. The emphasis is therefore on the development of tools, methods, algorithms and models, ideally suited for use in a wide variety of application fields. Applied research projects are encouraged, particularly but not exclusively in priority areas (health, transport, logistics, resources and energy, information services and trade).
The research supported by this program must be collaborative and multidisciplinary, whether through several methodological approaches (machine learning, operations research, decision sciences) or through combining a methodological approach with an application field. Whether or not a research project is applied has no impact on its eligibility. However, projects are expected to make a fundamental contribution to data science, and applications must clearly emphasize and demonstrate this aspect.
Available funds
For this second competition, funding of between $100k and $150k per year for two years will be awarded to each selected project. The maximum allowable amount is therefore $300k spread over two years. A portion of the budget has to be allocated to student scholarships (graduate and undergraduate students) and postdoctoral fellows; applicants must submit a budget in which at least two-thirds (66%) falls into this category.
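As an illustration only, the budget constraints above can be expressed as a small check. The sketch below is in Python; the function `check_budget` and its parameters are hypothetical names that are not part of any IVADO tool, and it assumes that the $100k to $150k yearly range and the two-thirds share are applied as hard limits on the requested amounts.

```python
# Hypothetical helper, not part of any IVADO tool. It assumes the yearly
# range and the two-thirds trainee share stated in the call are hard limits.
MIN_PER_YEAR = 100_000
MAX_PER_YEAR = 150_000
YEARS = 2


def check_budget(yearly_totals, trainee_total):
    """Return a list of problems; an empty list means the draft fits the assumed limits.

    yearly_totals: requested amount for each year of the project.
    trainee_total: amount, over the whole project, going to student
                   scholarships and postdoctoral fellows.
    """
    problems = []
    if len(yearly_totals) != YEARS:
        problems.append(f"expected {YEARS} yearly totals, got {len(yearly_totals)}")
    for year, amount in enumerate(yearly_totals, start=1):
        if not MIN_PER_YEAR <= amount <= MAX_PER_YEAR:
            problems.append(f"year {year}: {amount} is outside the yearly range")
    total = sum(yearly_totals)
    # trainee_total / total >= 2/3, written without division to avoid rounding.
    if trainee_total * 3 < total * 2:
        problems.append("less than two-thirds goes to students and postdoctoral fellows")
    return problems
```

For example, `check_budget([150_000, 150_000], 200_000)` returns an empty list, since $200k is exactly two-thirds of $300k.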
Eligibility of applicants and co-applicants
- The principal investigator must hold a faculty position as a professor at HEC Montréal, Polytechnique Montréal or Université de Montréal.
- Professors from the University of Alberta and McGill University can also be principal investigators, provided that they are full members of one of IVADO’s research groups (Mila, CIRRELT, GERAD, the CERC in Data Science for Real-Time Decision-Making, CRM or Tech3Lab).
- The team must be composed of at least two professors. Postdocs are allowed to be members of the team.
- Team members should not all have their primary affiliation in the same department.
- Projects that aim to enhance the value of omics data by using artificial intelligence in the field of cancer are not eligible for the IVADO competition but only for the funding competition co-organized by Génome Québec, Oncopole and IVADO.
- The principal investigator can submit only one application.
Guidelines
Generally, the Tri-Agency Financial Administration Guide (http://www.nserc-crsng.gc.ca/professors-professeurs/financialadminguide-guideadminfinancier/index_eng.asp) and the rules of the Apogée/CFREF program (http://www.cfref-apogee.gc.ca/program-programme/admin_guide-guide_administration-eng.aspx) will serve as guides for the program.
Eligible expenses
This program allows for the funding of:
- graduate and undergraduate students;
- postdoctoral researchers;
- professionals;
- travel expenses;
- hardware, software, databases and access to computing resources.
Financing requirements
- For projects requiring ethical approval, the funds will not be released until approval is obtained.
- The funds will be transferred to the Office of Research of the institution of the principal investigator, and the institution will administer it according to its own rules.
Availability of research data
- Funded teams will be subject to the Tri-Agency Open Access Policy on Publications (http://www.science.gc.ca/eic/site/063.nsf/eng/h_F6765465.html). Teams are encouraged to publish as much of their research output as possible (publications, recordings of presentations, source code, databases, etc.), in compliance with the intellectual property rules that apply to their specific situation.
- The support of IVADO and Apogée/CFREF must be acknowledged in the dissemination of research results.
Administrative screening
Proposals will go through an administrative screening. The review committee will not receive applications:
- that do not meet the format constraints (missing sections, excessive number of pages, etc.);
- that are not presented by an eligible professor;
- whose principal investigator has submitted more than one project;
- in which all team members are from the same department.
Evaluation criteria
The evaluation will give equal weight to two sets of criteria. On the one hand:
- Research project
  - Relevance to the goals and domains of the funding program:
    - multidisciplinary project in the field of data science,
    - primarily in the areas of excellence of IVADO: operations research, machine learning, decision sciences.
  - Significance of anticipated contribution to knowledge, research excellence, originality.
  - Project feasibility, appropriateness of the methodology.
  - Clarity of the proposal.
  - Existence of a reasonable and well justified budget.
On the other hand:
- Team and individuals
  - Excellence of the researchers involved in the application, according to their level of advancement in their career.
  - Justification of the team composition (match between the team and the project, multiplicity of expertise).
  - Presence of early career researchers (recently hired professors, postdoctoral researchers).
- Training opportunities
  - Involvement of students (undergraduate and graduate).
  - Involvement of postdoctoral researchers.
  - Hiring and training of research professionals.
- Consistency with the other objectives of Apogée/CFREF and IVADO
  - Presence in the team
    - of researchers outside of Montreal;
    - of postdoctoral researchers funded by IVADO;
    - of professors recruited under the IVADO program.
  - International collaborations.
  - Validity of the dissemination plan.
  - Integration in a wider research program and with longer term goals.
  - Considerations and efforts to promote an equitable and inclusive research environment.
Process
Each application will be assigned to four reviewers chosen from among professors familiar with the field of the application. Direct collaborators of applicant team members (co-supervisors or recent co-authors) will not participate in the evaluation of those proposals.
Applications will then be divided into review sub-committees by methodological area, in order to establish a ranking based on excellence. Each sub-committee will be composed of three professors from three different universities, including a chair.
The rankings of the sub-committees will be pooled at a meeting of the chairs to determine an overall ranking of the applications. The chairs will propose a list of projects to be funded according to the available budget. The results will be discussed and approved by the Scientific Committee at a special meeting.
Application process and elements
Instructions for submitting an application are available under the “Submission” tab.
Elements of the application:
- Name and affiliation of the principal investigator.
- Name and affiliation of the other team members.
- For each participant, a CV in an unconstrained format but listing at least the publications and activities of the last five years (Common CV Canada – NSERC is suggested).
- List of keywords related to the application, specifying, among others, methodological areas, potential application areas, etc.
- Justification of the team composition (max. ½ page).
- Description of the research project (max. 4 pages).
- Reference list (max. one page).
- Requested budget, detailing:
  - scholarships for undergraduate and graduate students;
  - postdoctoral researchers;
  - professionals;
  - travelling expenses;
  - hardware, software, databases and access to computing resources.
- Budget justification explaining the relationship between the research project and the funding requested (max. 2 pages).
- Dissemination Plan (max. 1 page).
  - Publication goals, participation in conferences or other research dissemination events, organization of a local research day to present the outcomes of the project, workshop organization, …
- HQP training plan (max. 1 page).
Final report
At the end of the two-year funding, the principal investigator must submit a final report including:
- project review;
- list of publications;
- list of participations in events;
- list of funded students, contact details and summary of their participation (in compliance with personal data management rules);
- financial report;
- list of the organized or co-organized knowledge dissemination activities;
- new funding obtained or applied for on the basis of work carried out during this project.
Contacts
- Any questions concerning the funding program can be addressed to programmes-excellence@ivado.ca
- Please also consult the FAQ section.
FAQ
Q: What happens if my application file does not meet the requested length?
A: Your application will automatically be disqualified.
Q: When will I get a reply on my application?
A: By April 2020 (see section “Description” for more details)
More questions? Please send them to: programmes-excellence@ivado.ca
Submission
Please apply through: https://ivado.smapply.io/
The elements of the application are:
- A form to be completed on the platform. An example of a completed form is available here.
- CVs of the principal investigator and team members.
- Justification of the team composition (max. ½ page).
- Description of the project (max. 4 pages).
- Reference list (max. 1 page).
- Budget.
- Budget justification (max. 2 pages).
- Dissemination Plan (max. 1 page).
- HQP training plan (max. 1 page).
All documents must be uploaded to our dedicated submission platform.
2017 Results
- Bram Adams (Polytechnique Montréal), Antoniol Giuliano, Jiang Zhen Ming & Sénécal Sylvain: A Real-time, Data-driven Field Decision Framework for Large-scale Software Deployments
- As large e-commerce systems need to maximize their revenue while ensuring customer quality and minimizing IT costs, they constantly face major field decisions like “Would it be cost-effective for the company to deploy additional hardware resources for our premium users?” This project will build a real-time, data-driven field decision framework exploiting customer behaviour and quality of service models, release engineering and guided optimization search. It will benefit both the Canadian software industry and society, by improving the quality of service experienced by Canadians.
- Jean-François Arguin (Université de Montréal), Tapp Alain, Golling Tobias, Ducu Otilia & Mochizuki Kazuya: Machine learning for the analysis of the Large Hadron Collider Data at CERN
- The Large Hadron Collider (LHC) is one of the most ambitious experiments ever conducted. It collides protons together near the speed of light to reproduce the conditions of the Universe right after the Big Bang. Its data possess all the features of Big Data: 1e16 collisions are produced each year, each producing 1000 particles, and each of these particles leaves a complex signature in the 100 million electronic channels of the ATLAS detector. This project will initiate a collaboration between data scientists and physicists to develop the application of machine learning to the analysis of the LHC data.
- Olivier Bahn (HEC Montréal), Caines Peter, Delage Erick, Malhamé Roland & Mousseau Normand: Valorisation des données et Optimisation Robuste pour guider la Transition Énergétique vers des réseauX intelligents à forte composante renouvelable (VORTEX) [Data valorization and robust optimization to guide the energy transition towards smart grids with a high share of renewables]
- A multiscale modelling approach, consisting of a family of hierarchical models operating at increasing time scales (day / week to month / thirty-year horizon), together with suitable mathematical tools (repeated mean-field games, machine learning, convex and robust optimization), is proposed as a basis for the reasoned management of the transition to smart electricity grids with a high share of renewables. In particular, our project will propose tools to help manage energy demand in a regional context.
- Yoshua Bengio (Université de Montréal), Cardinal Héloïse, Carvalho Margarida & Lodi Andrea: Data-driven Transplantation Science
- End-stage kidney disease is a severe condition with a rising incidence, currently affecting over 40,000 Canadians. The decision to accept or refuse an organ for transplantation is an important one, as the donor’s characteristics are strongly associated with the long-term survival of the transplanted kidney. In partnership with their health care provider, the transplant candidates need to answer two questions: (1) How long is the kidney from this specific donor expected to last for me? (2) If I refuse this specific donor, how much longer am I expected to wait before getting a better kidney? We propose to use deep learning to predict the success of a possible matching. The results will contribute to building a clinical decision support tool that answers the two questions above and helps transplant physicians and candidates make the best decision. In addition, the quality of the matching can be the input of optimization algorithms designed to improve the social welfare of organ allocations.
- Michel Bernier (Polytechnique Montréal), Kummert Michaël & Bahn Olivier: Développement d’une méthodologie pour l’utilisation des données massives issues de compteurs intelligents pour modéliser un parc de bâtiments [Development of a methodology for using big data from smart meters to model a building stock]
- The data made available by the widespread deployment of smart meters represent a great opportunity to improve building stock models and more general energy flow models, but fundamental knowledge on this subject is still limited. This project aims to address this gap by developing a methodology for using big data from smart electricity meters to characterize and calibrate, notably through inverse modelling, building archetypes that can be integrated into the TIMES model.
- Guillaume-Alexandre Bilodeau (Polytechnique Montréal), Aloise Daniel, Pesant Gilles, Saunier Nicolas & St-Aubin Paul: Road user tracking and trajectory clustering for intelligent transportation systems
- While traffic cameras are a mainstay of traffic management centers, video data is still most commonly watched by traffic operators for traffic monitoring and incident management. There are still few applications of computer vision in intelligent transportation systems (ITS), apart from integrated sensors for specific data extraction such as road user (RU) counts. Among the most useful data to extract from video are the trajectories of all RUs, including cars, trucks, bicycles and pedestrians. Since traffic videos include many RUs, finding their individual trajectories is challenging. Our first objective is therefore to track all individual RUs. The second objective is to interpret the very large number of trajectories that can be obtained. This can be done by clustering trajectories, which provides the main motions in the traffic scene corresponding to RU activities and behaviors, along with their frequency or probability. Results of this research will be applicable for traffic monitoring in ITS and for self-driving cars.
- François Bouffard (McGill University), Anjos Miguel & Waaub Jean-Philippe: The Electricity Demand Response Potential of the Montreal Metropolitan Community: Assessment of Potential Impacts and Options
- This project will develop a clear understanding of the potential benefits and trade-offs for key stakeholders of deploying significant electric power demand response (DR) in the Montreal Metropolitan Community (MMC) area. It is motivated primarily by the desire of Hydro-Québec to increase its export potential and, at the same time, by the need to assess DR deployment scenarios and their impacts on the people and businesses of the MMC. Data science is at the heart of this work, which will need to discover knowledge on electricity consumption in order to learn how to leverage and control its flexibility.
- Tolga Cenesizoglu (HEC Montréal), Grass Gunnar & Jena Sanjay: Real-time Optimal Order Placement Strategies and Limit Order Trading Activity
- Our primary objective is to identify how institutional investors can reduce their risk and trading costs by optimizing when and how to execute their trades. Limit order trading activity is an important state variable for this optimization problem in today’s financial markets where most liquidity is provided by limit orders. We thus plan to first analyze how risk and trading costs are affected by limit order trading activity using a novel, large-scale, ultra-high-frequency trading data set. We will then use our findings to guide us in modeling these effects and devising real-time optimal order placement strategies.
- Laurent Charlin (HEC Montréal) & Jena Sanjay Dominik: Exploiting ML/OR Synergies for Assortment Optimization and Recommender Systems
- We propose to exploit synergies between assortment optimization and recommender systems on the application level, and the interplay between machine learning and mathematical programming on the methodological level. Rank-based choice models, estimated in a purely data-driven manner, will introduce diversity into recommender systems, and supervised learning methods will improve the scalability and efficiency of assortment optimization in retail.
- Julien Cohen (Polytechnique Montréal), Kadoury Samuel, Pal Chris, Bengio Yoshua, Romero Soriano & Guilbert François: Transformative adversarial networks for medical imaging applications
- Following the concept of generative adversarial networks (GANs), we propose to explore transformative adversarial training techniques where our goal is to transform medical imaging data to a target reference space as a way of normalizing them for image intensity, patient anatomy and the many other parameters associated with the variability inherent to medical images. This approach will be investigated both as a data normalization and as a data augmentation strategy, and will be tested on several multi-center clinical datasets for lesion segmentation and/or classification (diagnosis).
- Patrick Cossette (Université de Montréal), Bengio Yoshua, Laviolette François & Girard Simon: Towards personalized medicine in the management of epilepsy: a machine learning approach in the interpretation of large-scale genomic data
- To date, more than 150 epilepsy genes have been identified, explaining around 35% of the cases. However, conventional genomics methods have failed to explain the full spectrum of epilepsy heritability, as well as antiepileptic drug resistance. In particular, conventional studies lack the ability to capture the full complexity of the human genome, such as interactions between genomic variations (epistasis). In this project, we will investigate how we can use machine learning algorithms in the analysis of genomic data in order to detect multivariate patterns, by taking advantage of our large dataset of individual epilepsy genomes. In this multi-disciplinary project, neurologists, geneticists, bio-informaticians and computational scientists will join forces in order to use machine learning algorithms to detect genomic variant signatures in patients with pharmaco-resistant epilepsy. Having the ability to predict pharmaco-resistance will ultimately reduce the burden of the disease.
- Benoit Coulombe (Université de Montréal), Lavallée-Adam Mathieu, Gauthier Marie-Soleil, Gaspar Vanessa, Pelletier Alexander, Wong Nora & Christian Poitras: A machine learning approach to decipher protein-protein interactions in human plasma
- Proteins circulating in the human bloodstream make very useful and accessible clinical biomarkers for disease diagnostics, prognostics and theranostics. Typically, to perform their functions, proteins will interact with other molecules, including other proteins. These protein-protein interactions provide valuable insights into a protein’s role and function in humans; they can also lead to the discovery of novel biomarkers for diseases in which the protein of interest is involved. However, the identification of such interactions in human plasma is highly challenging. The lack of proper biochemical controls, which are inherently noisy, makes the confidence assessment of these interactions very difficult. We therefore propose to develop a novel machine learning approach that will extract the relevant signal from noisy controls to confidently decipher the interactome of clinically-relevant proteins circulating in the human bloodstream with the ultimate goal of identifying novel biomarkers.
- Michel Denault (HEC Montréal), Côté Pascal & Orban Dominique: Simulation and regression approaches in hydropower optimization
- We develop optimization algorithms based on dynamic programming with simulations and regression, essentially Q-learning algorithms. Our main application area is hydropower optimization, a stochastic control problem where optimal releases of water are sought at each point in time.
- Michel Desmarais (Polytechnique Montréal), Charlin Laurent & Cheung Jackie C. K: Matching individuals to review tasks based on topical expertise level
- The task of selecting an expert to review a paper addresses the general problem of finding a match between a human and an assignment based on the quality of expertise alignment between the two. State-of-the-art approaches generally rely on modeling reviewers as a distribution of topic expertise, or as a set of keywords. Yet, two experts can have the same relative topic distribution and have wide differences in their depth of understanding. A similar argument can be made for papers. The objective of this proposal is to enhance the assignment approach to include the notions of (1) reviewer mastery of a topic, and (2) paper topic sophistication. Means to assess each aspect are proposed, along with approaches to assignments based on this additional information.
- Georges Dionne (HEC Montréal), Morales Manuel, d’Astous Philippe, Yergeau Gabriel, Rémillard Bruno & Shore Stephen H.: Asymmetric Information Tests with Dynamic Machine Learning and Panel Data
- To our knowledge, the econometric estimation of dynamic panel data models with machine learning is not very developed and tests for the presence of asymmetric information in this environment are lacking. Most often, researchers assume the presence of asymmetric information and propose models (sometimes dynamic) to reduce its effects but do not test for residual asymmetric information in final models. Potential non-optimal pricing of financial products may still be present. Moreover, it is often assumed that asymmetric information is exogenous and related to unobservable agent characteristics (adverse selection) without considering agents’ dynamic behavior over time (moral hazard). Our goal is to use machine learning models to develop new tests of asymmetric information in large panel data sets where the dynamic behavior of agents is observed. Applications in credit risk, high frequency trading, bank securitization, and insurance will be provided.
- Marc Fredette (HEC Montréal), Charlin Laurent, Léger Pierre-Majorique, Sénécal Sylvain, Courtemanche François, Labonté-Lemoyne Élise & Karran Alexander: Improving the prediction of the emotional and cognitive experience of users (UX) in interaction with technology using deep learning.
- The objective of this research project is to leverage new advances in artificial intelligence, and more specifically deep learning approaches, to improve the prediction of the emotional and cognitive experience of users (UX) in interaction with technology. What users experience emotionally and cognitively when interacting with an interface is a key determinant of the success or failure of digital products and services. Traditionally, user experience has been assessed with post hoc explicit measures such as questionnaires. However, these measures are unable to capture the states of users while they interact with technology. Researchers are turning to implicit neuroscience measures to capture the user’s states through psychophysiological inference. Deep learning has recently enabled other fields such as image recognition to make significant progress, and we expect that it will do the same for psychophysiological inference, allowing the automatic modeling of complex feature sets.
- Geneviève Gauthier (HEC Montréal), Amaya Diego, Bégin Jean-François, Cabeda Antonio & Malette-Campeau: L’utilisation des données financières à haute fréquence pour l’estimation de modèles financiers complexes [Using high-frequency financial data to estimate complex financial models]
- Market models capable of reproducing the complexity of the interactions between the underlying asset and options require a level of complexity that makes them very difficult to estimate. This research project proposes to use high-frequency financial options data to better measure and manage the various market risks.
- Michel Gendreau (Polytechnique Montréal), Potvin Jean-Yves, Aloise Daniel & Vidal Thibaut: Nouvelles approches pour la modélisation et la résolution de problèmes de livraisons à domicile [New approaches for modelling and solving home delivery problems]
- This project focuses on developing new approaches to better address home delivery problems, which have grown considerably over the past decade following the widespread advent of e-commerce. Part of the work will address the modelling of these problems itself, particularly with regard to the objectives pursued by shippers. The rest of the project will aim to develop state-of-the-art heuristics and metaheuristics for efficiently solving large-scale problems.
- Bernard Gendron (Université de Montréal), Crainic Teodor Gabriel, Jena Sanjay Dominik & Lacoste-Julien Simon: Optimization and machine learning for fleet management of autonomous electric shuttles
- Recently, a Canada-France team of 11 researchers led by Bernard Gendron (DIRO-CIRRELT, UdeM) has submitted an NSERC-ANR strategic project “Trustworthy, Safe and Smart EcoMobility-on-Demand”, supported by private and public partners on both sides of the Atlantic: in Canada, GIRO and the City of Montreal; in France, Navya and the City of Valenciennes. The objective of this project is to develop optimization models and methods for planning and managing a fleet of autonomous electric shuttle vehicles. As a significant and valuable additional contribution to this large-scale project, we plan to study the impact of combining optimization and machine learning to improve the performance of the proposed models and methods.
- Julie Hussin (Université de Montréal), Gravel Simon, Romero Adriana & Bengio Yoshua: Deep Learning Methods in Biomedical Research: from Genomics to Multi-Omics Approaches
- Deep learning approaches represent a promising avenue to make important advances in biomedical science. Here, we propose to develop, implement and use deep learning techniques to combine genomic data with multiple types of biomedical information (e.g., other omics datasets, clinical information) to obtain a more complete and actionable picture of the risk profile of a patient. In this project, we will be addressing the important problem of missing data and incomplete datasets, evaluating the potential of these approaches for prediction of relevant medical phenotypes in population and clinical samples, and developing integration strategies for large heterogeneous datasets. The efficient and integrated use of multiomic data could lead to the improvement of disease risk and treatment outcome predictions in the context of precision medicine.
- Sébastien Jacquemont (Université de Montréal), Labbe Aurélie, Bellec Pierre, Catherine Schramm, Chakravarty Mallar & Michaud Jacques: Modeling and predicting the effect of genetic variants on brain structure and function
- Neurodevelopmental disorders (NDs) represent a significant health burden. The genetic contribution to NDs is approximately 80%. Whole genome testing in pediatrics is a routine procedure and mutations contributing significantly to neurodevelopmental disorders are identified in over 400 patients every year at the Sainte Justine Hospital. However, the impact of these mutations on cognition and brain structure and function is mostly unknown. Mounting evidence nevertheless suggests that genes that share similar characteristics produce similar effects on cognitive and neural systems.
Our goal: Develop models to understand the effects of mutations, genome-wide, on cognition, brain structure and connectivity.
Models will be developed using large cohorts of individuals for whom genetic, cognitive and neuroimaging data were collected.
Deliverable: Algorithms allowing clinicians to understand the contribution of mutations to the neurodevelopmental symptoms observed in their patients.
- Karim Jerbi (Université de Montréal), Hjelm Devon, Plis Sergey, Carrier Julie, Lina Jean-Marc, Gagnon Jean-François & Dr Pierre Bellec: From data-science to brain-science: AI-powered investigation of the neuronal determinants of cognitive capacities in health, aging and dementia
- Artificial intelligence is revolutionizing science, technology and almost all aspects of our society. Learning algorithms that have shown astonishing performances in computer vision and speech recognition are also expected to lead to qualitative leaps in biological and biomedical sciences. In this multi-disciplinary research program, we propose to investigate the possibility of boosting information yield in basic and clinical neuroscience research by applying data-driven approaches, including shallow and deep learning, to electroencephalography (EEG) and magnetoencephalography (MEG) data in (a) healthy adults, and aging populations (b) with or (c) without dementia. The proposal brings together several scientists with expertise in a wide range of domains, ranging from data science, mathematics and engineering to neuroimaging, systems, cognitive and clinical neuroscience.
- Philippe Jouvet (Université de Montréal), Emeriaud Guillaume, Michel Desmarais, Farida Cheriet & Noumeir Rita: Clinical data validation processes: the example of a clinical decision support system for the management of Acute Respiratory Distress Syndrome (ARDS)
- In healthcare, data collection has been designed to document clinical activity for reporting, rather than for developing new knowledge. In this proposal, part of a research program on real-time clinical decision support systems in critical care, machine learning researchers and clinicians plan to develop algorithms that manage data corruption and data complexity, using a unique research data warehouse that collects a huge volume of data on critically ill children.
- Aurelie Labbe (HEC Montréal), Larocque Denis, Charlin Laurent & Miranda-Moreno: Data analytics methods for travel time estimation in transportation engineering
- Travel time is considered one of the most important performance measures in urban mobility. It is used by both network operators and drivers as an indicator of quality of service or as a metric influencing travel decisions. This proposal tackles the issue of travel time prediction from several angles: i) data pre-processing (map-matching), ii) short-term travel time prediction and iii) long-term travel time prediction. These tasks will require the development of new approaches in statistical and machine learning to adequately model GPS trajectory data and to quantify the prediction error.
- Frederic Leblond (Polytechnique Montréal), Trudel Dominique, Ménard Cynthia, Saad Fred, Jermyn Michael & Grosset Andrée-Anne: Machine learning technology applied to the discovery of new vibrational spectroscopy biomarkers for the prognostication of intermediate-risk prostate cancer patients
- Prostate cancer is the most frequent cancer among Canadian men, with approximately 25,000 diagnoses per year. Men with high risk and low risk disease almost always experience predictable disease evolution allowing optimal treatment selection. However, none of the existing clinical tests, imaging techniques or histopathology methods can be used to predict the fate of men with intermediate-risk disease. This is the source of a very important unmet clinical need, because while some of these patients remain free of disease for several years, in others cancer recurs rapidly after treatment. Using biopsy samples in tissue microarrays from 104 intermediate-risk prostate cancer patients with known outcome, we will use a newly developed Raman microspectroscopy technique along with machine learning technology to develop inexpensive prognostic tests to determine the risk of recurrence allowing clinicians to consider more aggressive treatments for patients with high recurrence risk.
- Pierre L’Ecuyer (Université de Montréal), Devroye Luc & Lacoste-Julien Simon: Monte Carlo and Quasi-Monte Carlo Methods for Optimization and Machine Learning
- The use of Monte Carlo methods (a.k.a. stochastic simulation) has grown tremendously in the last few decades. They are now a central ingredient in many areas, including computational statistics, machine learning, and operations research. Our aim in this project is to study Monte Carlo methods and improve their efficiency, with a focus on applications to statistical modeling with big data, machine learning, and optimization. We are particularly interested in developing methods for which the error converges at a faster rate than straightforward Monte Carlo. We plan to release free software that implements these methods.
- Eric Lecuyer (Université de Montréal), Blanchette Mathieu & Waldispühl Jérôme: Developing a machine learning framework to dissect gene expression control in subcellular space
- Our multidisciplinary team will develop and use an array of machine learning approaches to study a fundamental but poorly understood process in molecular biology, the subcellular localization of messenger RNAs, whereby the transcripts of different human genes are transported to various regions of the cell prior to translation. The project will entail the development of new learning approaches (learning from both RNA sequence and structure data, phylogenetically related training examples, batch active learning) combined with new biotechnologies (large-scale assays of both natural and synthetic RNA sequences) to yield mechanistic insights into the “localization code” and help understand its role in health and disease.
- Sébastien Lemieux (Université de Montréal), Bengio Yoshua, Sauvageau Guy & Cohen Joseph Paul: Deep learning for precision medicine by joint analysis of gene expression profiles measured through RNA-Seq and microarrays
- This project aims at developing domain adaptation techniques to enable the joint analysis of gene expression profile datasets acquired using different technologies, such as RNA-Seq and microarrays. Doing so will leverage the large number of gene expression profiles publicly available, avoiding the typical problems and limitations caused by working with small datasets. More specifically, methods developed will be continuously applied to datasets available for Acute Myeloid Leukemia, in which the team has extensive expertise.
- Andrea Lodi (Polytechnique Montréal), Bengio Yoshua, Charlin Laurent, Frejinger Emma & Lacoste-Julien Simon: Machine Learning for (Discrete) Optimization
- The interaction between Machine Learning and Mathematical Optimization is currently one of the most popular topics at the intersection of Computer Science and Applied Mathematics. While the role of Continuous Optimization within Machine Learning is well known, and, on the applied side, it is rather easy to name areas in which data-driven Optimization boosted by / paired with Machine Learning algorithms can have a game-changing impact, the relationship and the interaction between Machine Learning and Discrete Optimization is largely unexplored. This project concerns one aspect of it, namely the use of modern Machine Learning techniques within / for Discrete Optimization.
- Alejandro Murua (Université de Montréal), Quintana Fernando & Quinlan José: Gibbs-repulsion and determinantal processes for statistical learning
- Non-parametric Bayesian models are very popular for density estimation and clustering. However, they have a tendency to use too many mixture components due to their use of independent parameter priors. Repulsion process priors, such as determinantal processes, solve this issue by putting higher mass on parameter configurations for which the mixture components are well separated. We propose the use of Gibbs-like repulsion processes which are locally determinantal, or adaptive determinantal processes, as priors for density estimation, clustering, and the modeling of temporal and/or spatial data.
- Marcelo Vinhal Nepomuceno (HEC Montréal), Charlin Laurent, Dantas Danilo C., & Cenesizoglu Tolga: Using machine learning to uncover how marketer-generated post content is associated with user-generated content and revenue
- This project explores how machine learning can be used to improve a company’s communication with its customers in order to increase sales. To that end, we will identify how broadcaster-generated content is associated with user-generated content and revenue measures. In addition, we intend to automate the identification of post content, and to propose personalized recurrent neural networks to identify the writing styles of brands and companies and automate the creation of online content.
- Dang Khoa Nguyen (Université de Montréal), Sawan Mohamad, Lesage Frédéric, Zerouali Younes & Sirpal Parikshat: Real-time detection and prediction of epileptic seizures using deep learning on sparse wavelet representations
- Epilepsy is a chronic neurological condition in which about 20% of patients do not benefit from any form of treatment. In order to diminish the impact of recurring seizures on their lives, we propose to exploit the potential of artificial intelligence techniques for predicting the occurrence of seizures and detecting their early onset, so as to issue warnings to patients. The aim of this project is thus to develop an efficient algorithm based on deep neural networks for performing real-time detection and prediction of seizures. This work will pave the way for the development of intelligent implantable sensors coupled with alert systems and on-site treatment delivery.
- Jian-Yun Nie (Université de Montréal), Langlais Philippe, Tang Jian & Tapp Alain: Knowledge-based inference for question answering and information retrieval
- Question answering (QA) is a typical NLP/AI problem with wide applications. A typical approach first retrieves relevant text passages and then determines the answer from them. These steps are usually performed separately, undermining the quality of the answers. In this project, we aim at developing new methods for QA in which the two steps can benefit from each other. On one hand, inference based on a knowledge graph will be used to enhance the passage retrieval step; on the other hand, the retrieved passages will be incorporated into the second step to help infer the answer. We expect the methods to have a higher capability of determining the right answer.
- Jean-François Plante (HEC Montréal), Brown Patrick, Duschesne Thierry & Reid Nancy: Statistical modelling with distributed systems
- Statistical inference requires a large toolbox of models and algorithms that can accommodate different structures in the data. Modern datasets are often stored on distributed systems where the data are scattered across a number of nodes with limited bandwidth between them. As a consequence, many complex statistical models cannot be computed natively on those clusters. In this project, we will advance statistical modeling contributions to data science by creating solutions that are ideally suited for analysis on distributed systems.
- Doina Precup (McGill University), Bengio Yoshua & Pineau Joelle: Learning independently controllable features with application to robotics
- Learning good representations is key for intelligent systems. One intuition is that good features will disentangle distinct factors that explain variability in the data, thereby leading to the potential development of causal reasoning models. We propose to tackle this fundamental problem using deep learning and reinforcement learning. Specifically, a system will be trained to discover simultaneously features that can be controlled independently, as well as the policies that control them. We will validate the proposed methods in simulations, as well as by using a robotic wheelchair platform developed at McGill University.
- Marie-Ève Rancourt (HEC Montréal), Laporte Gilbert, Aloise Daniel, Cervone Guido, Silvestri Selene, Lang Stefan, Vedat Verter & Bélanger Valérie: Analytics and optimization in a digital humanitarian context
- When responding to humanitarian crises, the lack of information increases the overall uncertainty. This hampers the efficiency of relief efforts and can amplify the damage. In this context, technological advances such as satellite imaging and social networks can support data gathering and processing to improve situational awareness. For example, volunteer technical communities leverage ingenious crowdsourcing solutions to make sense of a vast volume of data to virtually support relief efforts in real time. This research project builds on such digital humanitarianism initiatives through the development of innovative tools that allow evidence-based decision making. The aim is to test the proposed methodological framework to show how data analytics can be combined with optimization to process multiple sources of data, and thus provide timely and reliable solutions. To this end, a multidisciplinary team will work on two different applications: a sudden-onset disaster and a slow-onset crisis.
- Louis-Martin Rousseau (Polytechnique Montréal), Adulyasak Yossiri, Charlin Laurent, Dorion Christian, Jeanneret Alexandre & Roberge David: Learning representations of uncertainty for decision making processes
- Decision support and optimization tools are playing an increasingly important role in today’s economy. The vast majority of such systems, however, assume the data are either deterministic or follow a particular theoretical probability distribution. We aim to develop data-driven representations of uncertainty, based on modern machine learning architectures such as probabilistic deep neural networks, to capture complex and nonlinear interactions. Such representations will then be used in stochastic optimization and decision processes in the fields of cancer treatment, supply chain and finance.
- Nicolas Saunier (Polytechnique Montréal), Goulet James, Morency Catherine, Patterson Zachary & Trépanier Martin: Fundamental Challenges for Big Data Fusion and Strategic Transportation Planning
- As more and more transportation data becomes continuously available, transportation engineers and planners are ill-equipped to make use of it in a systematic and integrated way. This project aims to develop new machine learning methods to combine transportation data streams of various nature, spatial and temporal definitions and pertaining to different populations. The resulting model will provide a more complete picture of the travel demand for all modes and help better evaluate transportation plans. This project will rely on several large transportation datasets.
- Yvon Savaria (Polytechnique Montréal), David Jean-Pierre, Cohen-Adad Julien & Bengio Yoshua: Optimised Hardware-Architecture Synthesis for Deep Learning
- Deep learning requires considerable computing power. Computing power can be improved significantly by designing application specific computing engines dedicated to deep learning. The proposed project consists of designing and implementing a High Level Synthesis tool that will generate an RTL design from the code of an algorithm. This tool will optimize the architecture, the number of computing units, the length and representation of the numbers and the important parameters of the various memories generated.
- Mohamad Sawan (Polytechnique Montréal), Savaria Yvon & Bengio Yoshua: Equilibrium Propagation Framework: Analog Implementation for Improved Performances (Equipe)
- The main aim of this project is to implement the Equilibrium Propagation (EP) algorithm in analog circuits, rather than digital building blocks, to take advantage of their higher computation speed and power efficiency. EP involves minimization of an energy function, which requires a long relaxation phase that is costly (in terms of time) to simulate on digital hardware, but which can be accelerated through analog circuit implementation. The two main implementation phases in this project are: (1) quick prototyping and proof of concept using an FPAA platform (RASP 3.0), and (2) high-performance custom System-on-Chip (SoC) implementation using a standard CMOS process (e.g., 65 nm) to optimize the area, speed, and power consumption.
- François Soumis (Polytechnique Montréal), Desrosiers Jacques, Desaulniers Guy, El Hallaoui Issmail, Lacoste-Julien Simon, Omer Jérémy & Mohammed Saddoune: Combining machine learning and operations research to solve large airline crew scheduling problems faster
- Our recent work focuses on the development of exact optimization algorithms that exploit a priori information about the expected solutions to reduce the number of variables and constraints that must be handled simultaneously. The objective is to develop a machine learning system that obtains the information needed to accelerate these optimization algorithms as much as possible, in order to solve larger airline crew scheduling problems. In addition to advances in operations research, this project will produce advances in constrained learning and reinforcement learning.
- An Tang (Université de Montréal), Pal Christopher, Kadoury Samuel, Bengio Yoshua, Turcotte Simon, Nguyen Bich & Anne-Marie Mes-Masson: Predictive model of colorectal cancer liver metastases response to chemotherapy
- Colon cancer is the 2nd leading cause of mortality in Canada. In patients with colorectal liver metastases, response to chemotherapy is the main determinant of patient survival. Our multidisciplinary team will develop models to predict response to chemotherapy and patient prognosis using the most recent innovations in deep learning architectures. We will train our model on data from an institutional biobank and validate it on independent provincial imaging and medico-administrative databases.
- Pierre Thibault (Université de Montréal), Lemieux Sébastien, Bengio Yoshua & Perreault Claude: Matching MHC I-associated peptide spectra to sequencing reads using deep neural networks
- Identification of MHC I-associated peptides (MAPs) unique to a patient or tumor is a key step in developing efficacious cancer immunotherapy. This project aims at developing a novel approach for exploiting Deep Neural Networks (DNN) for the identification of MAPs based on a combination of next-generation sequencing (RNA-Seq) and tandem mass spectrometry (MS/MS). The proposed developments will take advantage of a unique dataset of approximately 60,000 (MS/MS – sequence) pairs assembled by our team. The project will also bring together researchers from broad horizons: mass spectrometry, bioinformatics, machine learning and cancer immunology.
- Charles Audet (Polytechnique Montréal), Sébastien Le Digabel, Michael Kokkolaras, Miguel Diage Martinez: Combining machine learning and blackbox optimization for engineering design
- The efficiency of machine learning (ML) techniques relies on many mathematical foundations, one of which is optimization and its algorithms. Some aspects of ML can be approached using the simplex method, dynamic programming, line-search, Newton or quasi-Newton descent techniques. But there are many ML problems that do not possess the exploitable structure necessary for the application of the above methods. The objective of the present proposal is to merge, import, specialize and develop blackbox optimization (BBO) techniques in the context of ML. BBO considers problems in which the analytical expressions of the objective function and/or of the constraints defining an optimization problem are unavailable. The most frequent situation is when these functions are computed through a time-consuming simulation. These functions are often nonsmooth, contaminated by numerical noise and can fail to produce a usable output. Research in BBO has grown steadily over the last 20 years, and has seen a variety of applications in many fields. The research projects will be bidirectional. We plan to use and develop BBO techniques to improve the performance of ML algorithms. Conversely, we plan to deploy ML strategies to improve the efficiency of BBO algorithms.
- Julien Cohen-Adad (Polytechnique Montréal), Yoshua Bengio, Joseph Cohen, Nicolas Guizard, Kawin Setsompop, Anne Kerbrat, David Cadotte: Physics-informed deep learning architecture to generalize medical imaging tasks
- The field of AI has flourished in recent years; in particular, deep learning has shown unprecedented performance for image analysis tasks, such as segmentation and labeling of anatomical and pathological features. Unfortunately, while dozens of deep learning papers applied to medical imaging get published every year, most methods are tested in a single center. In the rare cases where the code is publicly available, the algorithm usually fails when applied to other centers, which is the “real-world” scenario. This happens because images from different centers have different features than the images used to train the algorithm (contrast, resolution, etc.). Another issue limiting the performance potential of deep learning in medical imaging is that little data and few manual labels are available, and the labels are themselves highly variable across experts. The main objective of this project is to push the generalization capabilities of medical imaging tasks by incorporating prior information from MRI physics and from the inter-rater variability into deep learning architectures. A secondary objective will be to disseminate the developed methods to research and hospital institutions via open-source software (www.ivadomed.org), in-situ training and workshops.
- Patricia Conrod (Université de Montréal), Irina Rish, Sean Spinney: A neurodevelopmentally-informed computational model of flexible human learning and decision making
- The adolescent period is characterized by significant neurodevelopmental changes which affect reinforcement learning and the efficiency with which such learning occurs. Our team has modelled passive-avoidance learning using a Bayesian reinforcement learning framework. Results indicated that parameters estimating individual differences in impulsivity, reward sensitivity, punishment sensitivity and working memory best predicted human behaviour on the task. The model was also sensitive to year-to-year changes in performance (cognitive development), with individual components of the learning model showing different developmental growth patterns and relationships to health risk behaviours. This project aims to expand and validate this computer model of human cognition to: 1) better measure neuropsychological age/delay; 2) understand how learning parameters contribute to human decision making processes on more complex learning tasks; 3) simulate better learning scenarios to inform development of targeted interventions that boost human learning and decision making; and 4) inform next generation artificial intelligence models of lifelong learning.
- Numa Dancause (Université de Montréal), Guillaume Lajoie, Marco Bonizzato: Novel AI driven neuroprosthetics to shape stroke recovery
- Stroke is the leading cause of disability in Western countries. After stroke, patients often have abnormally low activity in the part of the brain that controls movements, the motor cortex. However, the malfunctioning motor cortex receives connections from multiple spared brain regions. Our general hypothesis is that neuroprostheses interfacing with the brain can exploit these connections to help restore adequate motor cortex activation after stroke. In theory, brain connections can be targeted using new electrode technologies, but this problem is highly complex. It cannot be done by hand, one patient at a time. We need automated stimulation strategies to harness this potential for recovery. Our main objective is thus to develop an algorithm that efficiently finds the best residual connections to restore adequate excitation of the motor cortex after stroke. In animals, we will implant hundreds of electrodes in the diverse areas connected with the motor cortex. The algorithm will learn the pattern of stimulation that is the most effective to increase activity in the motor cortex. For the first time, machine learning will become a structural part of neuroprosthetic design. We will use these algorithms to create a new generation of neuroprostheses that act as rehabilitation catalysts.
- Michel Denault (HEC Montréal), Dominique Orban, Pierre-Olivier Pineau: Paths to a cleaner Northeast energy system through approximate dynamic programming
- Our main research question is the design of greener energy systems for the American Northeast (Canada and USA). Some of the subquestions are as follows. How can renewable energy penetrate the markets? Are supplementary power transmission lines necessary? Can energy storage improve the intermittency problems of wind and solar power? Which greenhouse gas (GHG) reductions are achievable? What is the cost of such changes? Crucially, what is the path to a better system? To support the transition to this new energy system, we propose: 1. to model the evolution of the Northeast power system as a Markov decision process (MDP), including crucial uncertainties, e.g., on technological advances and renewable energy cost; 2. to solve this decision process with dynamic programming and reinforcement learning techniques; 3. to derive energy/environmental policy intelligence from our computational results. Our methodological approach relies on two building blocks: an inter-regional energy model and a set of algorithmic tools to solve the model as an MDP.
- Vincent Grégoire (HEC Montréal), Christian Dorion, Manuel Morales, Thomas Hurtut: Learning the Dynamics of the Limit Order Book
- Modern financial markets are increasingly complex. A particular topic of interest is how this complexity affects how easily investors can buy or sell securities at a fair price. Many have also raised concerns that algorithms trading at high frequency could create excess volatility and crash risk. The central objective of our research agenda is to better understand the fundamental forces at play in those markets where trading speed is now measured in nanoseconds. Our project seeks to lay the groundwork, using big data, visualization, and machine learning, to answer some of the most fundamental questions in the literature on market structure. Ultimately, we envision an environment in which we could learn the behavior of the various types of agents in a given market. Once such an environment is obtained, it would allow us to better understand, for instance, the main drivers of major market disruptions. More importantly, it could allow us to guide regulators in the design of new regulations, by testing them in a highly realistic simulation setup, thereby avoiding the unintended consequences associated with potential flaws in the proposed regulation.
- Mehmet Gumus (McGill University), Erick Delage, Arcan Nalca, Angelos Georghiou: Data-driven Demand Learning and Sharing Strategies for Two-Sided Online Marketplaces
- The proliferation of two-sided online platforms managed by a provider is disrupting the global retail industry by enabling consumers (on one side) and sellers (on the other side) to interact in exponential ways. Evolving technologies such as artificial intelligence, big data analytics, distributed ledger technology, and machine learning are posing challenges and opportunities for the platform providers with regard to understanding the behaviors of the stakeholders: consumers and third-party sellers. In this proposed research project, we will focus on two-sided platforms for which the demand-price relationship is unknown upfront and has to be learned from accumulating purchase data, thus highlighting the importance of the information-sharing environment. In order to address this problem, we will focus on the following closely connected research objectives: 1. Identify the willingness-to-pay and purchase decisions (i.e., conversion rate) of online customers based on how they respond to the design of product listing pages, online price and promotion information posted on the page, shipping and handling prices, and stock availability information. 2. Determine how much of the consumer data is shared with the sellers and quantify the value of different information sharing configurations, given the sellers’ optimal pricing, inventory (product availability), and product assortment (variety) decisions within a setting.
- Julie Hussin (Université de Montréal), Sébastien Lemieux, Matthieu Ruiz, Yoshua Bengio, Ahmad Pesaranghader: Interpretability of Deep Learning Approaches Applied to Omics Datasets
- The high-throughput generation of molecular data (omics data) nowadays permits researchers to glance deeply into the biological variation that exists among individuals. This variation underlies the differences in risks for human diseases, as well as efficacy in their treatment. This requires combining multiple biological levels (multi-omics) through flexible computational strategies, including machine learning (ML) approaches, becoming highly popular in biology and medicine, with a particular enthusiasm for deep neural networks (DNNs). While it appears like a natural way to analyze complex multi-omics datasets, the application of such techniques to biomedical datasets poses an important challenge: the black-box problem. Once a model is trained, it can be difficult to understand why it gives a particular response to a set of data inputs. In this project, our goal is to train and apply state-of-the-art ML models to extract accurate predictive signatures from multi-omics datasets while focusing on biological interpretability. This will contribute to building the trust of the medical community in the use of these algorithms and will lead to deeper insights into the biological mechanisms underlying disease risk, pathogenesis and response to therapy.
- Jonathan Jalbert (Polytechnique Montréal), Françoise Bichai, Sarah Dorner, Christian Genest: Modelling sewer overflows caused by precipitation and developing tools tailored to the needs of the City of Montreal
- Fecal contamination of surface water is one of the leading causes of waterborne disease in both industrialized and developing countries. In urban areas, fecal contamination comes mainly from overflows of combined sewer systems. During precipitation, stormwater enters the sewer system and mixes with sanitary wastewater to be conveyed to the treatment plant. If the intensity of the precipitation exceeds the transport capacity of the system, the mixture of stormwater and sanitary wastewater is discharged directly into the receiving environment without passing through the treatment plant. These overflows constitute an environmental risk and a public health issue. At present, the characteristics of the rainfall events that cause overflows are uncertain. This research project aims to take advantage of the overflow data recently made public by the City of Montreal to characterize the precipitation events that cause overflows on its territory. This characterization will make it possible, on the one hand, to estimate the number of overflows expected under the climate projected for the coming decades. On the other hand, it will be used to size mitigation measures, such as retention basins and rain gardens.
- Nadia Lahrichi (Polytechnique Montréal), Sebastien Le Digabel, Andrea Matta, Nicolas Zufferey, Andrea Lodi, Chunlong Yu: Reactive/learning/self-adaptive metaheuristics for healthcare resource scheduling
- The goal of this research proposal is to develop state-of-the-art decision support tools to address the fundamental challenges of accessible and quality health services. The challenges to meeting this mandate are real, and efficient resource management is a key factor in achieving this goal. This proposal will specifically focus on applications related to patient flow. Analysis of the literature shows that most research focuses on single-resource scheduling and assumes that demand is known; patient and resource scheduling problems are often solved sequentially and independently. The research goal is to develop efficient metaheuristic algorithms to solve integrated patient and resource scheduling problems under uncertainty (e.g., demand, profile, and availability of resources). This research will be divided into three main themes, each of them investigating a different avenue to more efficient metaheuristics: A) learning approaches to better explore the search space; B) blackbox optimization for parameter tuning; and C) simulation-inspired approaches to control the noise induced by uncertainty.
- Eric Lécuyer (Université de Montréal), Mathieu Blanchette, Jérôme Waldispühl, William Hamilton: Deciphering RNA regulatory codes and their disease-associated alterations using machine learning.
- The human DNA genome serves as an instruction guide to allow the formation of all the cells and organs that make up our body over the course of our lives. Much of this genome is transcribed into RNA, termed the ‘transcriptome’, that serves as a key conveyor of genetic information and provides the template for the synthesis of proteins. The transcriptome is itself subject to many regulatory steps for which the basic rules are still poorly understood. Importantly, when these steps are improperly executed, this can lead to disease. This project aims to utilize machine learning approaches to decipher the complex regulatory code that controls the human transcriptome and to predict how these processes may go awry in different disease settings.
- Gregory Lodygensky (Université de Montréal), Jose Dolz, Josée Dubois, Jessica Wisnowski: Next generation neonatal brain segmentation built on HyperDense-Net, a fully automated real-world tool
- There is growing recognition that major breakthroughs in healthcare will result from the combination of databanks and artificial intelligence (AI) tools. This would be very helpful in the study of the neonatal brain and its alterations. For instance, the neonatal brain is extremely vulnerable to the biological consequences of prematurity or birth asphyxia, resulting in cognitive, motor, language and behavioural disorders. A key difference with adults is that important aspects of brain-related functions can only be tested several years later, greatly hindering the advancement of neonatal neuroprotection. Researchers and clinicians need objective tools to immediately assess the effectiveness of a therapy that is given to protect the brain without waiting five years to see if it succeeded. Neonatal brain magnetic resonance imaging can bridge this gap. However, this is a real challenge because the neonatal period is one of intense brain growth (e.g. myelination and gyrification) and brain maturation. Thus, we plan to improve our existing neonatal brain segmentation tools (i.e. HyperDense-Net) using the latest iterations of AI tools. We will also develop a validated tool to determine objective brain maturation in newborns.
- Alexandra M. Schmidt (McGill University), Jill Baumgartner, Brian Robinson, Marília Carvalho, Oswaldo Cruz, Hedibert Lopes: Flexible multivariate spatio-temporal models for health and social sciences
- Health and socio-economic variables are commonly observed at different spatial scales of a region (e.g. districts of a city or provinces of a country), over a given period of time. Commonly, multiple variables are observed at a given spatial unit, resulting in high-dimensional data. The challenge in this case is to consider models that account for the possible correlation among variables across space or space and time. This project aims at developing statistical methodology that accounts for this complex hierarchical structure of the observed data. The inference procedure follows the Bayesian paradigm, meaning that uncertainty about the unknowns in the model is naturally accounted for. The project is subdivided into four subprojects that range from the estimation of a socio-economic vulnerability index for a given city to the spatio-temporal modelling of multiple vector-borne diseases. The statistical tools proposed here will help authorities understand the dynamics of multiple diseases across space and time, and assist with the decision-making process of evaluating how urban policies and programmes will impact the urban environment and population health, through a lens of health equity.
- Adam Oberman (McGill University), Michael Rabbat, Chris Finlay, Levon Nukbekyan: Robustness and generalization guarantees for Deep Neural Networks in security and safety critical applications
- Despite impressive human-like performance on many tasks, deep neural networks are surprisingly brittle in scenarios outside their previous experience, often failing when new experiences do not closely match their previous experiences. This ‘failure to generalize’ is a major hurdle impeding the adoption of an otherwise powerful tool in security- and safety-critical applications, such as medical image classification. The issue is in part due to a lack of our theoretical understanding of why neural networks work so well. They are powerful tools but less interpretable than traditional machine learning methods which have performance guarantees but do not work as well in practice. This research program will aim to address this ‘failure to generalize’, by developing guarantees of generalization, using notions of the complexity of a regularized model, corresponding to model averaging. This approach will be tested in computer vision applications, and will have near-term applications to medical health research, through medical image classification and segmentation. More broadly, the data science methods developed under this project will be applicable to a wide variety of fields and applications, notably wherever reliability and safety are paramount.
- Liam Paull (Université de Montréal), Derek Nowrouzezahrai, James Forbes: Differentiable perception, graphics, and optimization for weakly supervised 3D perception
- An ability to perceive and understand the world is a prerequisite for almost any embodied agent to achieve almost any task in the world. Typically, world representations are hand-constructed because it is difficult to learn them directly from sensor signals. In this work, we propose to build the components so that this map-building procedure is differentiable. Specifically, we will focus on the perception (grad-SLAM) and the optimization (meta-LS) components. This will allow us to backpropagate error signals from the 3D world back to the sensor inputs. This enables us to do many things, such as regularize sensor data with 3D geometry. Finally, by also building a differentiable rendering component (grad-Sim), we can leverage self-supervision through cycle consistency to learn representations with no or sparse hand-annotated labels. Combining all of these components together gives us the first method of world representation building that is completely differentiable and self-supervised.
- Gilles Pesant (Polytechnique Montréal), Siva Reddy, Sarath Chandar Anbil Parthipan: Investigating Combinations of Neural Networks and Constraint Programming for Structured Prediction
- Artificial intelligence plays an increasingly important role in many spheres of activity and in our daily lives. In particular, neural networks can now learn and then perform tasks previously reserved for humans. However, when a task requires compliance with complex structural rules, a neural network sometimes has great difficulty learning these rules. Another area of artificial intelligence, constraint programming, was designed precisely to find solutions that respect such rules. The goal of this project is therefore to study combinations of these two approaches to artificial intelligence in order to learn more easily how to perform tasks under constraints. Within this project, we will focus on natural language processing, but our work could also apply to tasks in other domains.
- Jean-François Plante (HEC Montréal), Patrick Brown, Thierry Duchesne, Nancy Reid, Luc Villandré: Statistical inference and modelling for distributed systems
- Statistical inference requires a large toolbox of models and algorithms that can accommodate complex data structures. Modern datasets are often so large that they need to be stored on distributed systems, with the data stored across a number of nodes with limited bandwidth between them. Many complex statistical models cannot be used with such complex data, as they rely on the complete data being accessible. In this project, we will advance statistical modeling contributions to data science by creating solutions that are ideally suited for analysis on distributed systems. More specifically, we will develop spatio-temporal models as well as accurate and efficient approximations of general statistical models that are suitable for distributed data, and as such, scalable to massive data.
- Wei Qi (McGill University), Xue (Steve) Liu, Max Shen, Michelle Lu: Deals on Wheels: Advancing Joint ML/OR Methodologies for Enabling City-Wide, Personalized and Mobile Retail
- Moving forward to a smart-city future, cities in Canada and around the world are embracing the emergence of new retail paradigms. That is, retail channels can further diversify beyond the traditional online and offline boundaries, combining the best of both. In this project, we focus on an emerging mobile retail paradigm in which retailers run their stores on mobile vehicles or self-driving cars. Our mission is to develop cross-disciplinary models, algorithms and data-verified insights for enabling mobile retail. We will achieve this mission by focusing on three interrelated research themes: Theme 1 – Formulating novel optimization problems of citywide siting and inventory replenishment for mobile stores. Theme 2 – Developing novel learning models for personalized demand estimation. Theme 3 – Integrating Theme 1 and Theme 2 by proposing a holistic algorithmic framework for joint and dynamic demand learning and retail operations, and for discovering managerial insights. The long-term goal is to thereby advance the synergy of operations and machine learning methodologies in the broad contexts of new retail and smart-city analytics.
- Marie-Ève Rancourt (HEC Montréal), Gilbert Laporte, Aurélie Labbe, Daniel Aloise, Valérie Bélanger, Joann de Zegher, Burcu Balcik, Marilène Cherkesly, Jessica Rodriguez Pereira: Humanitarian Supply Chain Analytics
- Network design problems lie at the heart of the most important issues faced in the humanitarian sector. However, given their complex nature, humanitarian supply chains involve the solution of difficult analytics problems. The main research question of this project is “how to better analyze imperfect information and address uncertainty to support decision making in humanitarian supply chains?”. To this end, we propose a methodological framework combining data analysis and optimization, which will be validated through real-life applications using multiple sources of data. First, we propose to build robust relief networks under uncertainty in demand and transportation accessibility, due to weather shocks and vulnerable infrastructures. We will consider two contexts: shelter location in Haiti and food aid distribution planning in Southeastern Asia. Second, we propose to embed fair cost sharing mechanisms into a collaborative prepositioning network design problem arising in the Caribbean. Classic economics methods will be adapted to solve large-scale stochastic optimization problems, and novel models based on catastrophic insurance theory will be proposed. Finally, a simulation will be developed to disguise data collection as a serious game and gather real-time information on the behavior of decision makers during disasters to extrapolate the best management strategies.
- Saibal Ray (McGill University), Maxime Cohen, James Clark, Ajung Moon: Retail Innovation Lab: Data Science for Socially Responsible Food Choices
- In this research program, we propose to investigate the use of artificial intelligence techniques, involving data, models, behavioral analysis, and decision-making algorithms, to efficiently provide higher convenience for retail customers while being socially responsible. In particular, the research objective of the multi-disciplinary team is to study, implement, and validate systems for guiding customers to make healthy food choices in a convenience store setting, while being cognizant of privacy concerns, both online and in a brick-and-mortar store environment. The creation of the digital infrastructure and decision support systems that encourage people and organizations to make health-promoting choices should hopefully result in a healthier population and reduce the costs of chronic diseases to the healthcare system. These systems should also foster the competitiveness of organizations operating in the agri-food and digital technology sectors. A distinguishing feature of this research program is that it will make use of a unique asset – a new “living-lab”, the McGill Retail Innovation Lab (MRIL). It will house a fully functioning retail store operated by a retail partner with extensive sensing, data access, and customer monitoring. The MRIL will be an invaluable source of data to use in developing and validating our approaches as well as a perfect site for running field experiments.
- Léo Raymond-Belzile (HEC Montréal), Johanna Nešlehová, Alexis Hannart, Jennifer Wadsworth: Combining extreme value theory and causal inference for data-driven flood hazard assessment
- The IPCC reports highlight an increase in mean precipitation, but the impacts of climate change on streamflow are not as certain, and the existing methodology is ill-equipped to predict changes in flood extremes. Our project looks into climate drivers impacting flood hazard and proposes methodological advances based on extreme value theory and causal inference in order to simulate realistic streamflow extremes at high resolution. The project will also investigate how climate drivers impact the hydrological balance, using tools from machine learning for causal discovery to enhance risk assessment of flood hazard.
- Nicolas Saunier (Polytechnique Montréal), Francesco Ciari, Catherine Morency, Martin Trépanier, Lijun Sun: Bridging Data-Driven and Behavioural Models for Transportation
- Transportation data is traditionally collected through travel surveys and fixed sensors, mostly on the roadways: such data is expensive to collect and has limited spatial and temporal coverage. In recent years, more and more transportation data has become available on a continuous basis from multiple new sources, including users themselves. This has fed the rise of machine learning methods that can learn models directly from data. Yet, such models often lack robustness and may be difficult to transfer to a different region or period. This can be alleviated by taking advantage of domain knowledge stemming from the properties of the flow of people moving in transportation systems with daily activities. This project aims to develop hybrid methods relying on transportation and data-driven models to predict flows for all modes at different spatial and temporal scales using multiple sources of heterogeneous data. This results in two specific objectives: 1. to learn probabilistic flow models at the link level for several modes based on heterogeneous data; 2. to develop a method bridging the flow models (objective 1) with a dynamic multi-agent transportation model at the network level. These new models and methods will be developed and tested using real transportation data.
- Yvon Savaria (Polytechnique Montréal), François Leduc-Primeau, Elsa Dupraz, Jean-Pierre David, Mohamad Sawan: Ultra-Low-Energy Reliable DNN Inference Using Memristive Circuits for Biomedical Applications (ULERIM)
- Recent advances in machine learning based on deep neural networks (DNNs) have brought powerful new capabilities for many signal processing tasks. These advances also hold great promise for several applications in healthcare. However, state-of-the-art DNN architectures may depend on hundreds of millions of parameters that must be stored and then retrieved, resulting in large energy usage. Thus, it is essential to reduce their energy consumption to allow in-situ computations. One possible approach involves using memristor devices, a concept first proposed in 1971 but only recently put into practice. Memristors are a very promising way to implement compact and energy-efficient artificial neural networks. The aim of this research is to advance the state of the art in the energy-efficient implementation of deep neural networks by using memristive circuits and introducing DNN-specific methods to better manage the uncertainty inherent to integrated circuit fabrication. These advances will benefit a large number of medical applications for which portable devices are required to perform a complex analysis of the state of the patient, and will also benefit the field of machine learning more generally by reducing the amount of energy required to apply it. Within this project, the energy improvements will be exploited to improve the signal processing performance of an embedded biomedical device for the advanced detection of epileptic seizures.
- David Stephens (McGill University), Yu Luo, Erica Moodie, David Buckeridge, Aman Verma: Statistical modelling of health trajectories and interventions
- Large amounts of longitudinal health records are now collected in private and public healthcare systems. Data from sources such as electronic health records, healthcare administrative databases and data from mobile health applications are available to inform clinical and public health decision-making. In many situations, such data enable the dynamic monitoring of the underlying disease process that governs the observations. However, this process is not observed directly and so inferential methods are needed to ascertain progression. The objective of the project is to build a comprehensive Bayesian computational framework for performing inference for large scale health data. In particular, the project will focus on the analysis of records that arise in primary and clinical care contexts to study patient health trajectories, that is, how the health status of a patient changes across time. Having been able to infer the mechanisms that influence health trajectories, we will then be able to introduce treatment intervention policies that aim to improve patient outcomes.
- An Tang (Université de Montréal), Irina Rish, Guy Wolf, Guy Cloutier, Samuel Kadoury, Eugene Belilovsky, Michaël Chassé, Bich Nguyen: Ultrasound classification of chronic liver disease with deep learning
- Chronic liver disease is one of the top ten leading causes of death in North America. The most common form is nonalcoholic fatty liver disease, which may evolve to nonalcoholic steatohepatitis and cirrhosis if left untreated. In many cases, the liver may be damaged without any symptoms. A liver biopsy is currently required to evaluate the severity of chronic liver disease. This procedure requires the insertion of a needle inside the liver to remove a small piece of tissue for examination under a microscope. Liver biopsy is an invasive procedure with a risk of major complications such as bleeding. Ultrasound is ideal for screening patients because it is a safe and widely available technology to image the whole liver. Our multi-disciplinary team is proposing the use of novel artificial intelligence techniques to assess the severity of chronic liver disease from ultrasound images and determine the severity of liver fat, inflammation, and fibrosis without the need for liver biopsy. This study is timely because chronic liver disease is on the rise, which means that complications and mortality will continue to rise if there is no alternative technique for early detection and monitoring of disease severity.
- Guy Wolf (Université de Montréal), Will Hamilton, Jian Tang: Unified approach to graph structure utilization in data science
- While deep neural networks are at the frontier of machine learning and data science research, their most impressive results come from data with clear spatial/temporal structure (e.g., images or audio signals) that informs network architectures to capture semantic information (e.g., textures, shapes, or phonemes). Recently, multiple attempts have been made to extend such architectures to non-Euclidean structures that typically exist in data, and in particular to graphs that model data geometry or interaction between data elements. However, so far, such attempts have been conducted separately by largely independent communities, leveraging specific tools from traditional/spectral graph theory, graph signal processing, or applied harmonic analysis. We propose a multidisciplinary unified approach (combining computer science, applied mathematics, and decision science perspectives) for understanding deep graph processing. In particular, we will establish connections between spectral and traditional graph theory applied for this task, introduce rich notions of intrinsic graph regularity (e.g., equivalent to image textures), and enable continuous-depth graph processing (i.e., treating depth as time) to capture multiresolution local structures. Our computational framework will unify the multitude of existing disparate attempts and establish rigorous foundations for the emerging field of geometric deep learning, a rapidly growing field in machine learning.