The fundamental research project funding program provides a framework for fostering multidisciplinary research in data science. The program is also strongly oriented toward the future, supporting the training of the field's future practitioners and laying the scientific foundations for tomorrow's research, whether fundamental or applied.
IVADO's commitment to equity, diversity and inclusion, and note to applicants:
So that advances in knowledge and opportunities in data science benefit all members of society equitably, IVADO promotes principles of equity, diversity and inclusion across all of its programs. IVADO is committed to providing a recruitment process and research environment that are inclusive, non-discriminatory, open and transparent.
In brief
Description
FAQ
Submission
2017 results
2020 results
Program description
- Program name: IVADO funding program for fundamental research projects
- Program type: Multidisciplinary team grant.
- Research type: Fundamental or applied.
- Strategic/priority research area: Data science in the broad sense; data-driven innovation.
Objectives
The objectives of this program are to:
- Foster multidisciplinary research in data science, primarily in the areas of excellence of IVADO's members: operations research, machine learning and decision sciences.
- Lay the groundwork for subsequent research, whether fundamental or applied.
Key dates
- Competition opening date: November 12, 2019, 9:00 a.m. EST
- Submission deadline: December 11, 2019, 9:00 a.m. EST
- Expected notification date: April 2020 ***postponed to May 2020***
- Funding start date: April 1 or September 1, 2020
- Criteria: see the Description tab
- Submission: see the Submission tab
- Information: programmes-excellence@ivado.ca
Supported areas
IVADO's fundamental research project funding program supports research activities related to the challenges raised in the Apogée Canada (Canada First Research Excellence Fund) grant application: data science in the broad sense, including methodological research in data science (machine learning, operations research, statistics) and its applications in many fields.
Research supported by this program must be collaborative and multidisciplinary, whether it combines several methodological approaches (machine learning, operations research, decision sciences) or combines a methodological approach with expertise in an application domain. Whether or not a research project is applied has no bearing on its eligibility; however, projects are expected to make a fundamental contribution to data science, and applications must emphasize and demonstrate this aspect.
Program description
- Program name: IVADO funding program for fundamental research projects
- Program type: Multidisciplinary team grant.
- Research type: Fundamental or applied.
- Strategic/priority research area: Data science in the broad sense; data-driven innovation.
Objectives
The objectives of this program are to:
- Foster multidisciplinary research in data science, primarily in the areas of excellence of IVADO's members: operations research, machine learning and decision sciences.
- Lay the groundwork for subsequent research, whether fundamental or applied.
Key dates
- Announcement: End of October 2019
- Submission deadline: December 11, 2019, 9:00 a.m. EST
- Expected notification date: April 2020 ***postponed to May 2020***
- Funding start date: April 1 or September 1, 2020
Supported areas
IVADO's fundamental research project funding program supports research activities related to the challenges raised in the Apogée Canada (Canada First Research Excellence Fund) grant application (http://www.cfref-apogee.gc.ca/results-resultats/abstracts-resumes/competition_2/universite_de_montreal-fra.aspx): data science in the broad sense, including methodological research in data science and its applications in many fields.
To achieve the objectives described in that application, the scientific strategy focuses on two fundamental aspects of intelligent behaviour: knowledge and the ability to use it, in other words the processes of understanding, prediction and decision-making. The emphasis is therefore on developing tools, methods, algorithms and models that can ideally be used in a wide variety of application domains. Applied research projects are encouraged, particularly, but not exclusively, in the priority areas (health, transportation, logistics, resources and energy, information services and commerce).
Research supported by this program must be collaborative and multidisciplinary, whether it combines several methodological approaches (machine learning, operations research, decision sciences) or combines a methodological approach with expertise in an application domain. Whether or not a research project is applied has no bearing on its eligibility; however, projects are expected to make a fundamental contribution to data science, and applications must emphasize and demonstrate this aspect.
Available funding
For this second competition, selected projects will receive between $100k and $150k per year, for two years, up to a maximum of $300k over the two years. Part of the total available budget is reserved for scholarships for undergraduate, master's and doctoral students and for postdoctoral fellows; applicants must present a budget in which at least two thirds (66%) of expenses fall into this category.
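Purely as an illustration (this is not an official tool, and the function name is ours), the short sketch below checks a hypothetical two-year budget against the two rules stated above: an annual amount between $100k and $150k, and at least two thirds of total expenses devoted to student scholarships and postdoctoral fellowships.

# Illustrative sketch only, assuming the budget rules as stated above.
def budget_respects_rules(yearly_totals, yearly_stipends):
    # yearly_totals: requested amount per year; yearly_stipends: the portion per year
    # going to student scholarships and postdoctoral fellowships.
    annual_ok = all(100_000 <= y <= 150_000 for y in yearly_totals)
    stipend_ok = sum(yearly_stipends) >= (2 / 3) * sum(yearly_totals)
    return annual_ok and stipend_ok

# Hypothetical example: $140k per year, of which $100k per year in scholarships.
print(budget_respects_rules([140_000, 140_000], [100_000, 100_000]))  # True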
Eligibility of applicants and co-applicants
- The principal investigator must be a professor at one of the following institutions: HEC Montréal, Polytechnique Montréal or Université de Montréal.
- Professors at the University of Alberta and McGill University may also be principal investigators, provided they are also members of an IVADO research group (Mila, CIRRELT, GERAD, Canada Excellence Research Chair in Real-Time Decision-Making, CRM, Tech3Lab).
- The team must include at least two professors. Postdoctoral fellows may be team members.
- Team members must not all have their primary affiliation in the same department.
- Projects aiming to leverage omics data using artificial intelligence in the field of cancer are not eligible for the IVADO competition; they are eligible only for the GQ, Oncopole, IVADO competition.
- No one may be the principal investigator on more than one application.
Guidelines
In general, the financial administration guides of the three federal granting agencies (http://www.nserc-crsng.gc.ca/professors-professeurs/financialadminguide-guideadminfinancier/index_fra.asp) and the rules of the Apogée (Canada First Research Excellence Fund) program (http://www.cfref-apogee.gc.ca/program-programme/administer-administrer-fra.aspx#admissibles) will serve as guidelines for this program.
Eligible expenses
This program can fund:
- undergraduate, master's and doctoral students;
- postdoctoral fellows;
- research professionals;
- travel expenses;
- equipment, software, databases and access to computing resources.
Funding conditions
- For projects requiring ethics approval, funds will not be released until approval has been obtained.
- Grant funds will be transferred to the research office of the principal investigator's home institution, which will administer them according to its own rules.
Dissemination of research data
- Funded teams are subject to the Tri-Agency Open Access Policy on Publications (http://www.science.gc.ca/eic/site/063.nsf/fra/h_F6765465.html). Teams are encouraged to make as much of their research output as possible publicly available (publications, recordings of oral presentations, source code, databases, etc.), in compliance with intellectual property rules.
- The support of IVADO and of the Apogée program must be acknowledged when research results are disseminated.
Administrative screening
Proposals will go through an administrative screening phase that will reject applications which:
- do not comply with the format requirements (missing sections, excessive page count, etc.);
- are not submitted by an eligible professor;
- have a principal investigator submitting more than one application;
- have all team members from the same department.
Evaluation criteria
Projects will be evaluated in equal parts on, on the one hand:
- Research project
- Fit with the theme of the funding program:
- a multidisciplinary project in data science in the broad sense,
- involving the areas of excellence of IVADO's members: operations research, machine learning and decision sciences.
- Significance of the anticipated impacts, research excellence, originality.
- Feasibility of the project, appropriateness of the methodology.
- Clarity of the proposal.
- A reasonable, well-justified budget aligned with the objectives.
and, on the other hand:
- Team and individuals
- Excellence of the researchers involved in the application, relative to their career stage.
- Justification of the team's composition (fit between the team and the research project, diversity of the members' fields of expertise).
- Presence of early-career researchers (recently hired professors, postdoctoral fellows).
- Training opportunities
- Involvement of students at all three levels of study
- Involvement of postdoctoral researchers
- Hiring and training of research professionals
- Consistency with the other objectives of Apogée and IVADO
- Presence on the team
- of researchers from outside Montreal;
- of postdoctoral researchers funded by IVADO;
- of professors recruited through the IVADO program.
- International collaborations.
- Soundness of the results dissemination plan.
- Integration into a broader research program with longer time horizons.
- Efforts regarding representativeness and diversity in the composition of the team or in recruitment intentions (women, minorities, etc.). Researchers are encouraged to consider representativeness when assembling their team and when recruiting.
Process
Each application will be assigned to four reviewers chosen among people familiar with the application's field. Direct collaborators (recent co-supervisions or co-publications) of members of an applicant team will not take part in the evaluation of those applications.
Applications will then be distributed among evaluation subcommittees by methodological area, in order to rank them on the basis of excellence. Each subcommittee will consist of three professors from three different universities, including a chair.
The subcommittees' rankings will be pooled at a meeting of the chairs to establish an overall ranking of the applications. The chairs will propose a list of projects to fund within the available budget. The results will be discussed and approved by the scientific committee at a special meeting.
Submitting an application
Instructions for submitting an application are available under the “Submission” tab.
Application components
- Name and affiliation of the principal investigator
- Names and affiliations of the other team members
- For each participant, a CV in any format, listing at least the last 5 years of publications and activities (the Canadian Common CV – NSERC is suggested)
- A list of keywords related to the application, specifying among other things the methodological areas, possible application domains, etc.
- Justification of the team's composition (½ page max.)
- Description of the research project (4 pages max.)
- List of references (1 page max.)
- Requested budget, detailing:
- undergraduate, master's and doctoral scholarships;
- postdoctoral fellows;
- research professionals;
- travel expenses;
- equipment, software, databases and access to computing resources.
- Budget justification linking the research project to the requested funding (2 pages max.)
- Results dissemination plan (1 page max.)
- Publication objectives, participation in conferences, local organization of a project closing day or a workshop, etc.
- Training plan for highly qualified personnel (HQP) (1 page max.)
Final report
At the end of the two years of funding, the principal investigator must produce a final report including:
- An overview of the project's outcomes.
- The list of publications.
- The list of events attended.
- The list of funded students, their contact information and a summary of their involvement (in compliance with personal data management rules).
- A financial report.
- The list of knowledge dissemination activities organized by the team or in which it took part or collaborated.
- New funding obtained or applied for on the basis of the work carried out during this project.
Contacts
- Any question about this funding program may be sent to programmes-excellence@ivado.ca.
- Please consult the FAQ section of this page.
FAQ
- I have to submit my revised budget. What information should it contain?
The information we need is broken down by year, for the following items:
- Salaries/scholarships (postdocs, doctoral students, professionals, etc.)
- Dissemination of research results and networking (conference organization, publications, website, etc.)
- Travel and accommodation expenses (conference attendance, invited researchers, etc.)
- External contributions
- Other
- Are there different categories of team members?
We make no distinction between co-investigators, co-applicants, collaborators, etc. Any such distinctions are to be settled among the team members. Each member's rights and obligations with respect to the grant are determined by their affiliation and professional status. For example, a doctoral student on the team may receive a scholarship but may not administer funds.
- What does “international collaborations” mean?
The definition is very broad: anything that fosters working relationships with other Canadian provinces and other countries. This includes anything related to integration into an international network; the presence on the team of foreign collaborators or of visiting or sabbatical professors; planned co-supervisions, co-publications or the organization of international events; the integration of activities into existing partnership agreements between your university and others; etc.
- Which professors are eligible to participate in an application?
There are two types of constraints: affiliation constraints and status constraints. Both must be satisfied to be eligible.
For team members, there are no status or affiliation constraints.
For the principal investigator (applicant), the affiliation constraints are to be a professor:
- either at HEC Montréal, Polytechnique Montréal or Université de Montréal, with no further constraints;
- or at McGill University or the University of Alberta, in which case they must also be a member of one of the following research groups: CIRRELT, GERAD, Mila, the Canada Excellence Research Chair in Real-Time Decision-Making, CRM or Tech3Lab.
For the principal investigator, the eligible statuses (which we summarize as “professor”) are:
- Assistant professor (professeur·e adjoint·e), associate or regular professor (professeur·e agrégé·e), full professor (professeur·e titulaire)
- Grant-funded professor (professeur·e sous octroi)
- Research professor
- Visiting researcher
The following are explicitly not eligible:
- Adjunct professors (professeur·e·s associé·e·s)
A change in the principal investigator's status or affiliation during the grant could lead to its termination.
- How is intellectual property shared?
Under this program, IVADO does not intervene in this matter. Intellectual property agreements must be settled among the team members, with the help of the usual contacts at their institutions: research offices, legal services, etc.
- Are the four Apogée areas (transportation and logistics; health and biomedical research; commerce and information services; energy) exclusive, priority, or indicative?
For the 2019 competition, these areas are favoured, but any other application domain is acceptable. Note that an application domain is not required: fundamental methodological work at the intersection of machine learning, operations research and decision sciences is also eligible.
- Is an application domain outside Canada acceptable (e.g., a study of a tropical disease)?
Yes, as long as the research is carried out in Canada and otherwise falls within the funded areas.
- Is work on an advanced mathematical model that does not involve machine learning eligible?
Yes: such a model can fit perfectly well within a simulation (operations research) or decision-support approach, for example. The further the use of the model strays from these areas, the lower the project's priority will be.
- Will there be a grace period for using the funding beyond the 2 years?
Yes, but the exact conditions remain to be defined.
- How is the eligibility criterion for the multidisciplinarity of teams and projects formalized?
These are “weak” but simple and formal constraints for addressing multidisciplinarity:
- For teams: the members cannot all have their affiliation in the same department of the same university (see the sketch after this list). They may all be members of the same research group without this affecting eligibility.
- For projects: “Research supported by this program must be collaborative and multidisciplinary, whether it combines several methodological approaches (machine learning, operations research, decision sciences) or combines a methodological approach with expertise in an application domain.”
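Purely as an illustration (not an official tool; the function and department names below are examples), this sketch formalizes the team constraint above: a team is acceptable as long as its members' (department, university) affiliations are not all identical.

# Illustrative sketch of the team multidisciplinarity constraint described above.
def team_is_multidisciplinary(affiliations):
    # affiliations: one (department, university) pair per team member
    return len(set(affiliations)) > 1

print(team_is_multidisciplinary([("DIRO", "Université de Montréal"),
                                 ("DIRO", "Université de Montréal")]))  # False
print(team_is_multidisciplinary([("DIRO", "Université de Montréal"),
                                 ("MAGI", "Polytechnique Montréal")]))  # True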
- Do CVs need to be provided for foreign or industry team members?
Team members must be chosen in relation to the proposed research project, and the composition of the team is one of the evaluation criteria. To allow this evaluation, all team members must provide a CV, including foreign and industry members.
- When will I receive a response regarding my application?
You will receive a response at the end of April 2020 (see the “In brief” section for more details).
Other questions? Please send them to programmes-excellence@ivado.ca.
Submission
The entire application submission process takes place on this dedicated platform: https://ivado.smapply.io/
The application package consists of:
- A questionnaire to be completed on the platform. You can consult an example of a completed questionnaire here.
- The CVs of the principal investigator and of the team members.
- The justification of the team's composition (½ page max.)
- The project description (4 pages max.)
- The list of references (1 page max.)
- The budget.
- The budget justification (2 pages max.)
- The results dissemination plan (1 page max.)
- The HQP training plan (1 page max.)
All documents must be uploaded to our dedicated submission platform.
- Bram Adams (Polytechnique Montréal), Antoniol Giuliano, Jiang Zhen Ming & Sénécal Sylvain: A Real-time, Data-driven Field Decision Framework for Large-scale Software Deployments
- As large e-commerce systems need to maximize their revenue, while ensuring customer quality and minimizing IT costs, they are constantly facing major field decisions like “Would it be cost-effective for the company to deploy additional hardware resources for our premium users?” This project will build a real-time, data-driven field decision framework exploiting customer behaviour and quality of service models, release engineering and guided optimization search. It will benefit both Canadian software industry and society, by improving the quality of service experienced by Canadians.
- Jean-François Arguin (Université de Montréal), Tapp Alain, Golling Tobias, Ducu Otilia & Mochizuki Kazuya: Machine learning for the analysis of the Large Hadron Collider Data at CERN
- The Large Hadron Collider (LHC) is one of the most ambitious experiments ever conducted. It collides protons together near the speed of light to reproduce the conditions of the Universe right after the Big Bang. It possesses all the features of Big Data: 1e16 collisions are produced each year, each producing 1000 particles, and each of these particles leaves a complex signature in the 100 million electronic channels of the ATLAS detector. This project will initiate a collaboration between data scientists and physicists to develop the application of machine learning to the analysis of the LHC data.
- Olivier Bahn (HEC Montréal), Caines Peter, Delage Erick, Malhamé Roland & Mousseau Normand: Valorisation des données et Optimisation Robuste pour guider la Transition Énergétique vers des réseauX intelligents à forte composante renouvelable (VORTEX)
- A multi-scale modelling approach, consisting of a family of hierarchical models operating at increasing time scales (day / week-to-month / thirty-year horizon) and of suitable mathematical tools (repeated mean-field games, machine learning, convex and robust optimization), is proposed as the basis for a reasoned management of the transition to smart electricity grids with a large share of renewables. In particular, our project will propose tools to help manage energy demand in a regional context.
- Yoshua Bengio (Université de Montréal), Cardinal Héloïse, Carvalho Margarida & Lodi Andrea: Data-driven Transplantation Science
- End-stage kidney disease is a severe condition with a rising incidence, currently affecting over 40,000 Canadians.
The decision to accept or refuse an organ for transplantation is an important one, as the donor’s characteristics are strongly associated with the long-term survival of the transplanted kidney. In partnership with their health care provider, the transplant candidates need to answer two questions: (1) How long is the kidney from this specific donor expected to last for me? (2) If I refuse this specific donor, how much longer am I expected to wait before getting a better kidney?
We propose to use deep learning to predict the success of a possible matching. The results will contribute to building a clinical decision support tool that answers the two questions above and helps transplant physicians and candidates make the best decision. In addition, the quality of the matching can be the input of optimization algorithms designed to improve the social welfare of organ allocation.
- Michel Bernier (Polytechnique Montréal), Kummert Michaël & Bahn Olivier : Développement d'une méthodologie pour l'utilisation des données massives issues de compteurs intelligents pour modéliser un parc de bâtiments
- The data made available by the widespread deployment of smart meters represent a major opportunity to improve building-stock models and more general energy-flow models, but fundamental knowledge on this subject is still limited. This project aims to address this gap by developing a methodology for using massive data from communicating electricity meters to characterize and calibrate, notably through inverse modelling, building archetypes that can be integrated into the TIMES model.
- Guillaume-Alexandre Bilodeau (Polytechnique Montréal), Aloise Daniel, Pesant Gilles, Saunier Nicolas & St-Aubin Paul: Road user tracking and trajectory clustering for intelligent transportation systems
- While traffic cameras are a mainstay of traffic management centers, video data is still most commonly watched by traffic operators for traffic monitoring and incident management. There are still few applications of computer vision in ITS, apart from integrated sensors for specific data extraction such as road users (RUs) counts. One of the most useful data to extract from video is the trajectory of all RUs, including cars, trucks, bicycles and pedestrians. Since traffic videos include many RUs, finding their individual trajectory is challenging. Our first objective is therefore to track all individual RUs. The second objective is to interpret the very large number of trajectories that can be obtained. This can be done by clustering trajectories, which provides the main motions in the traffic scene corresponding to RU activities and behaviors, along with their frequency or probability. Results of this research will be applicable for traffic monitoring in ITS and for self-driving cars.
- François Bouffard (McGill University), Anjos Miguel & Waaub Jean-Philippe: The Electricity Demand Response Potential of the Montreal Metropolitan Community: Assessment of Potential Impacts and Options
- This project will develop a clear understanding of the potential benefits and trade-offs of key stakeholders for deploying significant electric power demand response (DR) in the Montreal Metropolitan Community (MMC) area. It is motivated primarily by the desire of Hydro-Québec to increase its export potential, while at the same time by the need to assess DR deployment scenarios and their impacts on the people and businesses of the MMC. Data science is at the heart of this work which will need to discover knowledge on electricity consumption in order to learn how to leverage and control its flexibility.
- Tolga Cenesizoglu (HEC Montréal), Grass Gunnar & Jena Sanjay: Real-time Optimal Order Placement Strategies and Limit Order Trading Activity
- Our primary objective is to identify how institutional investors can reduce their risk and trading costs by optimizing when and how to execute their trades. Limit order trading activity is an important state variable for this optimization problem in today’s financial markets where most liquidity is provided by limit orders. We thus plan to first analyze how risk and trading costs are affected by limit order trading activity using a novel, large-scale, ultra-high-frequency trading data set. We will then use our findings to guide us in modeling these effects and devising real-time optimal order placement strategies.
- Laurent Charlin (HEC Montréal) & Jena Sanjay Dominik: Exploiting ML/OR Synergies for Assortment Optimization and Recommender Systems
- We propose to exploit synergies between assortment optimization and recommender systems on the application level, and the interplay between machine learning and mathematical programming on the methodological level. Rank-based choice models, estimated in a purely data-driven manner will introduce diversity into recommender systems, and supervised learning methods will improve the scalability and efficiency of assortment optimization in retail.
- Julien Cohen (Polytechnique Montréal), Kadoury Samuel, Pal Chris, Bengio Yoshua, Romero Soriano & Guilbert François: Transformative adversarial networks for medical imaging applications
- Following the concept of Generative adversarial networks (GANs), we propose to explore transformative adversarial training techniques where our goal is to transform medical imaging data to a target reference space as a way of normalizing them for image intensity, patient anatomy as well as the many other parameters associated with the variability inherent to medical images. This approach will be investigated both for data normalization and data augmentation strategy, and will be tested in several multi-center clinical data for lesion segmentation and/or classification (diagnosis).
- Patrick Cossette (Université de Montréal), Bengio Yoshua, Laviolette François & Girard Simon: Towards personalized medicine in the management of epilepsy: a machine learning approach in the interpretation of large-scale genomic data
- To date, more than 150 epilepsy genes have been identified explaining around 35% of the cases. However, conventional genomics methods have failed to explain the full spectrum of epilepsy heritability, as well as antiepileptic drug resistance. In particular, conventional studies lack the ability to capture the full complexity of the human genome, such as interactions between genomic variations (epistasis). In this project, we will investigate how we can use machine learning algorithms in the analyses of genomic data in order to detect multivariate patterns, by taking advantage of our large dataset of individual epilepsy genomes. In this multi-disciplinary project, neurologists, geneticists, bio-informaticians and computational scientists will join forces in order to use machine learning algorithms to detect genomic variants signatures in patients with pharmaco-resistant epilepsy. Having the ability to predict pharmaco-resistance will ultimately reduce the burden of the disease.
- Benoit Coulombe (Université de Montréal), Lavallée-Adam Mathieu, Gauthier Marie-Soleil, Gaspar Vanessa, Pelletier Alexander, Wong Nora & Christian Poitras: A machine learning approach to decipher protein-protein interactions in human plasma
- Proteins circulating in the human bloodstream make very useful and accessible clinical biomarkers for disease diagnostics, prognostics and theranostics. Typically, to perform their functions, proteins will interact with other molecules, including other proteins. These protein-protein interactions provide valuable insights into a protein’s role and function in humans; it can also lead to the discovery of novel biomarkers for diseases in which the protein of interest is involved. However, the identification of such interactions in human plasma is highly challenging. The lack of proper biochemical controls, which are inherently noisy, makes the confidence assessment of these interactions very difficult. We therefore propose to develop a novel machine learning approach that will extract the relevant signal from noisy controls to confidently decipher the interactome of clinically-relevant proteins circulating in the human bloodstream with the ultimate goal of identifying novel biomarkers.
- Michel Denault (HEC Montréal), Côté Pascal & Orban Dominique: Simulation and regression approaches in hydropower optimization
- We develop optimization algorithms based on dynamic programming with simulations and regression, essentially Q-learning algorithms. Our main application area is hydropower optimization, a stochastic control problem where optimal releases of water are sought at each point in time.
- Michel Desmarais (Polytechnique Montréal), Charlin Laurent & Cheung Jackie C. K: Matching individuals to review tasks based on topical expertise level
- The task of selecting an expert to review a paper addresses the general problem of finding a match between a human and an assignment based on the quality of expertise alignment between the two. State-of-the-art approaches generally rely on modeling reviewers as a distribution of topic expertise, or as a set of keywords. Yet, two experts can have the same relative topic distribution and wide differences in their depth of understanding. A similar argument can be made for papers. The objective of this proposal is to enhance the assignment approach to include the notions of (1) reviewer mastery of a topic, and (2) paper topic sophistication. Means to assess each aspect are proposed, along with approaches to assignments based on this additional information.
- Georges Dionne (HEC Montréal), Morales Manuel, d’Astous Philippe, Yergeau Gabriel, Rémillard Bruno & Shore Stephen H.: Asymmetric Information Tests with Dynamic Machine Learning and Panel Data
- To our knowledge, the econometric estimation of dynamic panel data models with machine learning is not very developed and tests for the presence of asymmetric information in this environment are lacking. Most often, researchers assume the presence of asymmetric information and propose models (sometimes dynamic) to reduce its effects but do not test for residual asymmetric information in final models. Potential non-optimal pricing of financial products may still be present. Moreover, it is often assumed that asymmetric information is exogenous and related to unobservable agent characteristics (adverse selection) without considering agents’ dynamic behavior over time (moral hazard). Our goal is to use machine learning models to develop new tests of asymmetric information in large panel data sets where the dynamic behavior of agents is observed. Applications in credit risk, high frequency trading, bank securitization, and insurance will be provided.
- Marc Fredette (HEC Montréal), Charlin Laurent, Léger Pierre-Majorique, Sénécal Sylvain, Courtemanche François, Labonté-Lemoyne Élise & Karran Alexander: Improving the prediction of the emotional and cognitive experience of users (UX) in interaction with technology using deep learning.
- The objective of this research project is to leverage new advances in artificial intelligence, and more specifically deep learning approaches, to improve the prediction of the emotional and cognitive experience of users (UX) in interaction with technology. What users experience emotionally and cognitively when interacting with an interface is a key determinant of the success or failure of digital products and services. Traditionally, user experience has been assessed with post hoc explicit measures such as questionnaires. However, these measures are unable to capture the states of users while they interact with technology. Researchers are turning to implicit neuroscience measures to capture the user's states through psychophysiological inference. Deep learning has recently enabled other fields, such as image recognition, to make significant progress, and we expect that it will do the same for psychophysiological inference, allowing the automatic modeling of complex feature sets.
- Geneviève Gauthier (HEC Montréal), Amaya Diego, Bégin Jean-François, Cabeda Antonio & Malette-Campeau : L’utilisation des données financières à haute fréquence pour l’estimation de modèles financiers complexes
- Market models that reproduce the complexity of the interactions between the underlying asset and its options require a level of complexity that makes their estimation very difficult. This research project proposes to use high-frequency option data to better measure and manage the various market risks.
- Michel Gendreau (Polytechnique Montréal), Potvin Jean-Yves, Aloise Daniel & Vidal Thibaut : Nouvelles approches pour la modélisation et la résolution de problèmes de livraisons à domicile.
- This project focuses on developing new approaches for better addressing home delivery problems, which, with the widespread adoption of e-commerce, have grown considerably over the past decade. Part of the work will address the modelling of these problems themselves, notably with respect to the objectives pursued by shippers. The rest of the project will focus on developing state-of-the-art heuristics and metaheuristics for efficiently solving large-scale instances.
- Bernard Gendron (Université de Montréal), Crainic Teodor Gabriel, Jena Sanjay Dominik & Lacoste-Julien Simon: Optimization and machine learning for fleet management of autonomous electric shuttles
- Recently, a Canada-France team of 11 researchers led by Bernard Gendron (DIRO-CIRRELT, UdeM) has submitted an NSERC-ANR strategic project “Trustworthy, Safe and Smart EcoMobility-on-Demand”, supported by private and public partners on both sides of the Atlantic: in Canada, GIRO and the City of Montreal; in France, Navya and the City of Valenciennes. The objective of this project is to develop optimization models and methods for planning and managing a fleet of autonomous electric shuttle vehicles. As a significant and valuable additional contribution to this large-scale project, we plan to study the impact of combining optimization and machine learning to improve the performance of the proposed models and methods.
- Julie Hussin (Université de Montréal), Gravel Simon, Romero Adriana & Bengio Yoshua: Deep Learning Methods in Biomedical Research: from Genomics to Multi-Omics Approaches
- Deep learning approaches represent a promising avenue to make important advances in biomedical science. Here, we propose to develop, implement and use deep learning techniques to combine genomic data with multiple types of biomedical information (eg. other omics datasets, clinical information) to obtain a more complete and actionable picture of the risk profile of a patient. In this project, we will be addressing the important problem of missing data and incomplete datasets, evaluating the potential of these approaches for prediction of relevant medical phenotypes in population and clinical samples, and developing integration strategies for large heterogeneous datasets. The efficient and integrated use of multiomic data could lead to the improvement of disease risk and treatment outcome predictions in the context of precision medicine.
- Sébastien Jacquemont (Université de Montréal), Labbe Aurélie, Bellec Pierre, Catherine Schramm, Chakravarty Mallar & Michaud Jacques: Modeling and predicting the effect of genetic variants on brain structure and function
- Neurodevelopmental disorders (NDs) represent a significant health burden. The genetic contribution to NDs is approximately 80%. Whole genome testing in pediatrics is a routine procedure, and mutations contributing significantly to neurodevelopmental disorders are identified in over 400 patients every year at the Sainte Justine Hospital. However, the impact of these mutations on cognition and brain structure and function is mostly unknown. Nevertheless, mounting evidence suggests that genes that share similar characteristics produce similar effects on cognitive and neural systems.
Our goal: Develop models to understand the effects of mutations, genome-wide, on cognition, brain structure and connectivity.
Models will be developed using large cohorts of individuals for whom genetic, cognitive and neuroimaging data were collected.
Deliverable: Algorithms allowing clinicians to understand the contribution of mutations to the neurodevelopmental symptoms observed in their patients.
- Karim Jerbi (Université de Montréal), Hjelm Devon, Plis Sergey, Carrier Julie, Lina Jean-Marc, Gagnon Jean-François & Dr Pierre Bellec: From data-science to brain-science: AI-powered investigation of the neuronal determinants of cognitive capacities in health, aging and dementia
- Artificial intelligence is revolutionizing science, technology and almost all aspects of our society. Learning algorithms that have shown astonishing performances in computer vision and speech recognition are also expected to lead to qualitative leaps in biological and biomedical sciences. In this multi-disciplinary research program, we propose to investigate the possibility of boosting information yield in basic and clinical neuroscience research by applying data-driven approaches, including shallow and deep learning, to electroencephalography (EEG) and magnetoencephalography (MEG) data in (a) healthy adults, and aging populations (b) with or (c) without dementia. The proposal brings together several scientists with expertise in a wide range of domains, ranging from data science, mathematics and engineering to neuroimaging, systems, cognitive and clinical neuroscience.
- Philippe Jouvet (Université de Montréal), Emeriaud Guillaume, Michel Desmarais, Farida Cheriet & Noumeir Rita: Clinical data validation processes: the example of a clinical decision support system for the management of Acute Respiratory Distress Syndrome (ARDS)
- In healthcare, data collection has been designed to document clinical activity for reporting, rather than for developing new knowledge. In this proposal, part of a research program on real-time clinical decision support systems in critical care, machine learning researchers and clinicians plan to develop algorithms to manage data corruption and data complexity using a unique research data warehouse that collects huge amounts of data on critically ill children.
- Aurelie Labbe (HEC Montréal), Larocque Denis, Charlin Laurent & Miranda-Moreno: Data analytics methods for travel time estimation in transportation engineering
- Travel time is considered one of the most important performance measures in urban mobility. It is used by both network operators and drivers as an indicator of quality of service or as a metric influencing travel decisions. This proposal tackles the issue of travel time prediction from several angles: i) data pre-processing (map-matching), ii) short-term travel time prediction and iii) long-term travel time prediction. These tasks will require the development of new statistical and machine learning approaches to adequately model GPS trajectory data and to quantify the prediction error.
- Frederic Leblond (Polytechnique Montréal), Trudel Dominique, Ménard Cynthia, Saad Fred, Jermyn Michael & Grosset Andrée-Anne: Machine learning technology applied to the discovery of new vibrational spectroscopy biomarkers for the prognostication of intermediate-risk prostate cancer patients
- Prostate cancer is the most frequent cancer among Canadian men, with approximately 25,000 diagnoses per year. Men with high risk and low risk disease almost always experience predictable disease evolution allowing optimal treatment selection. However, none of the existing clinical tests, imaging techniques or histopathology methods can be used to predict the fate of men with intermediate-risk disease. This is the source of a very important unmet clinical need, because while some of these patients remain free of disease for several years, in others cancer recurs rapidly after treatment. Using biopsy samples in tissue microarrays from 104 intermediate-risk prostate cancer patients with known outcome, we will use a newly developed Raman microspectroscopy technique along with machine learning technology to develop inexpensive prognostic tests to determine the risk of recurrence allowing clinicians to consider more aggressive treatments for patients with high recurrence risk.
- Pierre L’Ecuyer (Université de Montréal), Devroye Luc & Lacoste-Julien Simon: Monte Carlo and Quasi-Monte Carlo Methods for Optimization and Machine Learning
- The use of Monte Carlo methods (a.k.a. stochastic simulation) has grown tremendously in the last few decades. They are now a central ingredient in many areas, including computational statistics, machine learning, and operations research. Our aim in this project is to study Monte Carlo methods and improve their efficiency, with a focus on applications to statistical modeling with big data, machine learning, and optimization. We are particularly interested in developing methods for which the error converges at a faster rate than straightforward Monte Carlo. We plan to release free software that implements these methods.
- Eric Lecuyer (Université de Montréal), Blanchette Mathieu & Waldispühl Jérôme: Developing a machine learning framework to dissect gene expression control in subcellular space
- Our multidisciplinary team will develop and use an array of machine learning approaches to study a fundamental but poorly understood process in molecular biology, the subcellular localization of messenger RNAs, whereby the transcripts of different human genes are transported to various regions of the cell prior to translation. The project will entail the development of new learning approaches (learning from both RNA sequence and structure data, phylogenetically related training examples, batch active learning) combined with new biotechnologies (large-scale assays of both natural and synthetic RNA sequences) to yield mechanistic insights into the “localization code” and help understand its role in health and disease.
- Sébastien Lemieux (Université de Montréal), Bengio Yoshua, Sauvageau Guy & Cohen Joseph Paul: Deep learning for precision medicine by joint analysis of gene expression profiles measured through RNA-Seq and microarrays
- This project aims at developing domain adaptation techniques to enable the joint analysis of gene expression profiles datasets acquired using different technologies, such as RNA-Seq and microarrays. Doing so will leverage the large number of gene expression profiles publicly available, avoiding the typical problems and limitations caused by working with small datasets. More specifically, methods developed will be continuously applied to datasets available for Acute Myeloid Leukemia in which the team has extensive expertise.
- Andrea Lodi (Polytechnique Montréal), Bengio Yoshua, Charlin Laurent, Frejinger Emma & Lacoste-Julien Simon: Machine Learning for (Discrete) Optimization
- The interaction between Machine Learning and Mathematical Optimization is currently one of the most popular topics at the intersection of Computer Science and Applied Mathematics. While the role of Continuous Optimization within Machine Learning is well known, and, on the applied side, it is rather easy to name areas in which data-driven Optimization boosted by / paired with Machine Learning algorithms can have a game-changing impact, the relationship and the interaction between Machine Learning and Discrete Optimization is largely unexplored. This project concerns one aspect of it, namely the use of modern Machine Learning techniques within / for Discrete Optimization.
- Alejandro Murua (Université de Montréal), Quintana Fernando & Quinlan José: Gibbs-repulsion and determinantal processes for statistical learning
- Non-parametric Bayesian models are very popular for density estimation and clustering. However, they have a tendency to use too many mixture components due to their use of independent parameter priors. Repulsion process priors, such as determinantal processes, solve this issue by putting higher mass on parameter configurations for which the mixture components are well separated. We propose the use of Gibbs-like repulsion processes which are locally determinantal, or adaptive determinantal processes, as priors for modeling density estimation, clustering, and temporal and/or spatial data.
- Marcelo Vinhal Nepomuceno (HEC Montréal), Charlin Laurent, Dantas Danilo C., & Cenesizoglu Tolga: Using machine learning to uncover how marketer-generated post content is associated with user-generated content and revenue
- This project proposes to use machine learning to improve a company’s communication with its customers in order to increase sales. To that end, we will identify how broadcaster-generated content is associated with user-generated content and revenue measures. In addition, we intend to automate the identification of post content, and to propose personalized recurrent neural networks to identify the writing styles of brands and companies and automate the creation of online content.
- Dang Khoa Nguyen (Université de Montréal), Sawan Mohamad, Lesage Frédéric, Zerouali Younes & Sirpal Parikshat: Real-time detection and prediction of epileptic seizures using deep learning on sparse wavelet representations
- Epilepsy is a chronic neurological condition in which about 20% of patients do not benefit from any form of treatment. In order to diminish the impact of recurring seizures on their lives, we propose to exploit the potential of artificial intelligence techniques for predicting the occurrence of seizures and detecting their early onset, so as to issue warnings to patients. The aim of this project is thus to develop an efficient algorithm based on deep neural networks for performing real-time detection and prediction of seizures. This work will pave the way for the development of intelligent implantable sensors coupled with alert systems and on-site treatment delivery.
- Jian-Yun Nie (Université de Montréal), Langlais Philippe, Tang Jian & Tapp Alain: Knowledge-based inference for question answering and information retrieval
- Question answering (QA) is a typical NLP/AI problem with wide applications. A typical approach first retrieves relevant text passages and then determines the answer from them. These steps are usually performed separately, undermining the quality of the answers. In this project, we aim at developing new methods for QA in which the two steps can benefit from each other. On one hand, inference based on a knowledge graph will be used to enhance the passage retrieval step; on the other hand, the retrieved passages will be incorporated into the second step to help infer the answer. We expect the methods to have a higher capability of determining the right answer.
- Jean-François Plante (HEC Montréal), Brown Patrick, Duschesne Thierry & Reid Nancy: Statistical modelling with distributed systems
- Statistical inference requires a large toolbox of models and algorithms that can accommodate different structures in the data. Modern datasets are often stored on distributed systems where the data are scattered across a number of nodes with limited bandwidth between them. As a consequence, many complex statistical models cannot be computed natively on those clusters. In this project, we will advance statistical modeling contributions to data science by creating solutions that are ideally suited for analysis on distributed systems.
- Doina Precup (McGill University), Bengio Yoshua & Pineau Joelle: Learning independently controllable features with application to robotics
- Learning good representations is key for intelligent systems. One intuition is that good features will disentangle distinct factors that explain variability in the data, thereby leading to the potential development of causal reasoning models. We propose to tackle this fundamental problem using deep learning and reinforcement learning. Specifically, a system will be trained to discover simultaneously features that can be controlled independently, as well as the policies that control them. We will validate the proposed methods in simulations, as well as by using a robotic wheelchair platform developed at McGill University.
- Marie-Ève Rancourt (HEC Montréal), Laporte Gilbert, Aloise Daniel, Cervone Guido, Silvestri Selene, Lang Stefan, Vedat Verter & Bélanger Valérie: Analytics and optimization in a digital humanitarian context
- When responding to humanitarian crises, the lack of information increases the overall uncertainty. This hampers relief efforts efficiency and can amplify the damages. In this context, technological advances such as satellite imaging and social networks can support data gathering and processing to improve situational awareness. For example, volunteer technical communities leverage ingenious crowdsourcing solutions to make sense of a vast volume of data to virtually support relief efforts in real time. This research project builds on such digital humanitarianism initiatives through the development of innovative tools that allow evidence-based decision making. The aim is to test the proposed methodological framework to show how data analytics can be combined with optimization to process multiple sources of data, and thus provide timely and reliable solutions. To this end, a multidisciplinary team will work on two different applications: a sudden-onset disaster and a slow-onset crisis.
- Louis-Martin Rousseau (Polytechnique Montréal), Adulyasak Yossiri, Charlin Laurent, Dorion Christian, Jeanneret Alexandre & Roberge David: Learning representations of uncertainty for decision making processes
- Decision support and optimization tools are playing an increasingly important role in today’s economy. The vast majority of such systems, however, assume the data is either deterministic or follows a certain form of theoretical probability functions. We aim to develop data driven representations of uncertainty, based on modern machine learning architectures such as probabilistic deep neural networks, to capture complex and nonlinear interactions. Such representations are then used in stochastic optimization and decision processes in the fields of cancer treatment, supply chain and finance.
- Nicolas Saunier (Polytechnique Montréal), Goulet James, Morency Catherine, Patterson Zachary & Trépanier Martin: Fundamental Challenges for Big Data Fusion and Strategic Transportation Planning
- As more and more transportation data becomes continuously available, transportation engineers and planners are ill-equipped to make use of it in a systematic and integrated way. This project aims to develop new machine learning methods to combine transportation data streams of various nature, spatial and temporal definitions and pertaining to different populations. The resulting model will provide a more complete picture of the travel demand for all modes and help better evaluate transportation plans. This project will rely on several large transportation datasets.
- Yvon Savaria (Polytechnique Montréal), David Jean-Pierre, Cohen-Adad Julien & Bengio Yoshua: Optimised Hardware-Architecture Synthesis for Deep Learning
- Deep learning requires considerable computing power. Computing power can be improved significantly by designing application specific computing engines dedicated to deep learning. The proposed project consists of designing and implementing a High Level Synthesis tool that will generate an RTL design from the code of an algorithm. This tool will optimize the architecture, the number of computing units, the length and representation of the numbers and the important parameters of the various memories generated.
- Mohamad Sawan (Polytechnique Montréal), Savaria Yvon & Bengio Yoshua: Equilibrium Propagation Framework: Analog Implementation for Improved Performances (Equipe)
- The main aim of this project is to implement the Equilibrium Propagation (EP) algorithm in analog circuits, rather than digital building blocks, to take advantage of their higher computation speed and power efficiency. EP involves minimization of an energy function, which requires a long relaxation phase that is costly (in terms of time) to simulate on digital hardware, but which can be accelerated through an analog circuit implementation. The two main implementation phases of this project are: (1) quick prototyping and proof of concept using an FPAA platform (RASP 3.0), and (2) a high-performance custom System-on-Chip (SoC) implementation using a standard CMOS process, e.g. 65 nm, to optimize area, speed, and power consumption.
- François Soumis (Polytechnique Montréal), Desrosiers Jacques, Desaulniers Guy, El Hallaoui Issmail, Lacoste-Julien Simon, Omer Jérémy & Mohammed Saddoune: Combining machine learning and operations research to solve large airline crew scheduling problems faster
- Our recent work focuses on developing exact optimization algorithms that exploit a priori information about the expected solutions to reduce the number of variables and constraints handled simultaneously. The objective is to develop a machine learning system that provides the information needed to accelerate these optimization algorithms as much as possible, in order to solve larger airline crew scheduling problems. In addition to advances in operations research, this project will produce advances in constrained and reinforcement learning.
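As a toy illustration of how learned predictions might prune an exact model (a sketch under simplifying assumptions, not the team’s algorithm), the snippet below scores candidate crew pairings with a classifier trained on past solved instances and keeps only the most promising ones for the exact optimizer.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: each crew pairing has a few features (cost, duration, ...)
# and a label saying whether it appeared in the optimal schedule of past instances.
rng = np.random.default_rng(1)
X_past = rng.normal(size=(5000, 4))
y_past = (X_past @ np.array([-1.5, -0.5, 0.8, 0.2]) + rng.normal(size=5000) > 0).astype(int)

clf = LogisticRegression(max_iter=1000).fit(X_past, y_past)

# New instance: keep only pairings predicted likely to be useful, and hand this
# reduced set of variables to the exact method (e.g. column generation).
X_new = rng.normal(size=(20_000, 4))
keep = clf.predict_proba(X_new)[:, 1] > 0.7
print(f"reduced problem keeps {keep.sum()} of {len(X_new)} candidate pairings")
```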
- An Tang (Université de Montréal), Pal Christopher, Kadoury Samuel, Bengio Yoshua, Turcotte Simon, Nguyen Bich & Anne-Marie Mes-Masson: Predictive model of colorectal cancer liver metastases response to chemotherapy
- Colorectal cancer is the second leading cause of cancer death in Canada. In patients with colorectal liver metastases, response to chemotherapy is the main determinant of patient survival. Our multidisciplinary team will develop models to predict response to chemotherapy and patient prognosis using the most recent innovations in deep learning architectures. We will train our model on data from an institutional biobank and validate it on independent provincial imaging and medico-administrative databases.
- Pierre Thibault (Université de Montréal), Lemieux Sébastien, Bengio Yoshua & Perreault Claude: Matching MHC I-associated peptide spectra to sequencing reads using deep neural networks
- Identification of MHC I-associated peptides (MAPs) unique to a patient or tumor is a key step in developing efficacious cancer immunotherapy. This project aims at developing a novel approach exploiting deep neural networks (DNNs) for the identification of MAPs based on a combination of next-generation sequencing (RNA-Seq) and tandem mass spectrometry (MS/MS). The proposed developments will take advantage of a unique dataset of approximately 60,000 (MS/MS – sequence) pairs assembled by our team. The project will also bring together researchers from broad horizons: mass spectrometry, bioinformatics, machine learning and cancer immunology.
- Charles Audet (Polytechnique Montréal), Sébastien Le Digabel, Michael Kokkolaras, Miguel Diage Martinez: Combining machine learning and blackbox optimization for engineering design
- The efficiency of machine learning (ML) techniques relies on many mathematical foundations, one of which is optimization and its algorithms. Some aspects of ML can be approached using the simplex method, dynamic programming, line search, or Newton and quasi-Newton descent techniques. But many ML problems do not possess the exploitable structure required by these methods. The objective of the present proposal is to merge, import, specialize and develop blackbox optimization (BBO) techniques in the context of ML. BBO considers problems in which the analytical expressions of the objective function and/or of the constraints defining an optimization problem are unavailable. The most frequent situation is when these functions are computed through a time-consuming simulation. These functions are often nonsmooth, contaminated by numerical noise, and can fail to produce a usable output. Research in BBO has grown constantly over the last 20 years and has seen a variety of applications in many fields. The research project will be bidirectional: we plan to use and develop BBO techniques to improve the performance of ML algorithms and, conversely, to deploy ML strategies to improve the efficiency of BBO algorithms.
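To illustrate the flavour of a derivative-free method in this setting, here is a simplified coordinate/pattern search applied to a noisy stand-in objective (illustrative only; it is neither MADS nor the NOMAD software, and the objective merely mimics a noisy hyperparameter-tuning blackbox).

```python
import numpy as np

def blackbox(x):
    """Stand-in for an expensive, noisy evaluation (e.g. cross-validated error of an
    ML model trained with hyperparameters x); no gradients are available."""
    return (x[0] - 1.0) ** 2 + 2 * (x[1] + 0.5) ** 2 + np.random.normal(0, 0.01)

def pattern_search(f, x0, step=1.0, tol=1e-3, max_eval=500):
    x, fx, evals = np.array(x0, float), f(x0), 1
    while step > tol and evals < max_eval:
        improved = False
        for d in np.vstack([np.eye(len(x)), -np.eye(len(x))]):  # poll the 2n axis directions
            trial = x + step * d
            ft = f(trial); evals += 1
            if ft < fx:
                x, fx, improved = trial, ft, True
                break
        if not improved:
            step /= 2   # no better poll point: refine the step size
    return x, fx

print(pattern_search(blackbox, [3.0, 3.0]))
```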
- Julien Cohen-Adad (Polytechnique Montréal), Yoshua Bengio, Joseph Cohen, Nicolas Guizard, Kawin Setsompop, Anne Kerbrat, David Cadotte: Physics-informed deep learning architecture to generalize medical imaging tasks
- The field of AI has flourished in recent years; in particular, deep learning has shown unprecedented performance for image analysis tasks such as segmentation and labeling of anatomical and pathological features. Unfortunately, while dozens of deep learning papers applied to medical imaging are published every year, most methods are tested in a single center; in the rare cases where the code is publicly available, the algorithm usually fails when applied to other centers, which is the “real-world” scenario. This happens because images from different centers have different characteristics (contrast, resolution, etc.) than the images used to train the algorithm. Another issue limiting the performance potential of deep learning in medical imaging is that little data and few manual labels are available, and the labels themselves are highly variable across experts. The main objective of this project is to push the generalization capabilities of medical imaging tasks by incorporating prior information from MRI physics and from inter-rater variability into deep learning architectures. A secondary objective is to disseminate the developed methods to research and hospital institutions via open-source software (www.ivadomed.org), in-situ training and workshops.
- Patricia Conrod (Université de Montréal), Irina Rish, Sean Spinney: A neurodevelopmentally-informed computational model of flexible human learning and decision making
- The adolescent period is characterized by significant neurodevelopmental changes which impact reinforcement learning and the efficiency with which such learning occurs. Our team has modelled passive-avoidance learning using a Bayesian reinforcement learning framework. Results indicated that parameters estimating individual differences in impulsivity, reward sensitivity, punishment sensitivity and working memory best predicted human behaviour on the task. The model was also sensitive to year-to-year changes in performance (cognitive development), with individual components of the learning model showing different developmental growth patterns and relationships to health risk behaviours. This project aims to expand and validate this computational model of human cognition to: 1) better measure neuropsychological age/delay; 2) understand how learning parameters contribute to human decision-making processes on more complex learning tasks; 3) simulate better learning scenarios to inform the development of targeted interventions that boost human learning and decision making; and 4) inform next-generation artificial intelligence models of lifelong learning.
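A minimal sketch of the kind of parameterized learner involved (hypothetical task and parameter names, far simpler than the team’s validated Bayesian model) is shown below: reward and punishment sensitivities scale how outcomes update the learned value of responding to each stimulus.

```python
import numpy as np

def simulate(alpha=0.2, reward_sens=1.0, punish_sens=1.0, n_trials=500, seed=0):
    """Toy go/no-go learner; returns the fraction of trials with the correct choice."""
    rng = np.random.default_rng(seed)
    q = np.zeros(4)                               # learned value of responding to 4 stimuli
    p_reward = np.array([0.8, 0.8, 0.2, 0.2])     # two "good" and two "bad" stimuli
    correct = 0
    for _ in range(n_trials):
        s = rng.integers(4)
        respond = rng.random() < 1.0 / (1.0 + np.exp(-3.0 * q[s]))   # noisy response rule
        if respond:
            outcome = 1.0 if rng.random() < p_reward[s] else -1.0
            scaled = reward_sens * outcome if outcome > 0 else punish_sens * outcome
            q[s] += alpha * (scaled - q[s])       # Rescorla-Wagner style value update
        correct += int(respond == (p_reward[s] > 0.5))
    return correct / n_trials

print("accuracy with low punishment sensitivity :", simulate(punish_sens=0.3))
print("accuracy with high punishment sensitivity:", simulate(punish_sens=2.0))
```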
- Numa Dancause (Université de Montréal), Guillaume Lajoie, Marco Bonizzato: Novel AI-driven neuroprosthetics to shape stroke recovery
- Stroke is the leading cause of disability in Western countries. After a stroke, patients often have abnormally low activity in the part of the brain that controls movements, the motor cortex. However, the malfunctioning motor cortex receives connections from multiple spared brain regions. Our general hypothesis is that neuroprostheses interfacing with the brain can exploit these connections to help restore adequate motor cortex activation after stroke. In theory, brain connections can be targeted using new electrode technologies, but this problem is highly complex and cannot be solved by hand, one patient at a time. We need automated stimulation strategies to harness this potential for recovery. Our main objective is thus to develop an algorithm that efficiently finds the best residual connections to restore adequate excitation of the motor cortex after stroke. In animals, we will implant hundreds of electrodes in the diverse areas connected with the motor cortex. The algorithm will learn the stimulation pattern that is most effective at increasing activity in the motor cortex. For the first time, machine learning will become a structural part of neuroprosthetic design. We will use these algorithms to create a new generation of neuroprostheses that act as rehabilitation catalysts.
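As a loose analogy for the automated search problem (illustrative only; the real setting involves stimulation patterns, physiology and safety constraints, not independent arms), a simple bandit-style algorithm can be sketched that learns which candidate electrode most increases motor-cortex activity.

```python
import numpy as np

# Each "arm" is a candidate stimulation electrode; the reward is a noisy measure
# of the increase in motor-cortex activity after stimulating it (synthetic here).
rng = np.random.default_rng(0)
true_effect = rng.uniform(0, 1, size=32)           # unknown efficacy of 32 electrodes

counts, means, eps = np.zeros(32), np.zeros(32), 0.1
for t in range(2000):
    arm = rng.integers(32) if rng.random() < eps else int(np.argmax(means))
    reward = true_effect[arm] + rng.normal(0, 0.2)  # noisy physiological readout
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]   # running average per electrode

print("electrode chosen by the algorithm:", int(np.argmax(means)),
      "| best true electrode:", int(np.argmax(true_effect)))
```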
- Michel Denault (HEC Montréal), Dominique Orban, Pierre-Olivier Pineau: Paths to a cleaner Northeast energy system through approximate dynamic programming
- Our main research question is the design of greener energy systems for the North American Northeast (Canada and USA). Some of the subquestions are as follows. How can renewable energy penetrate the markets? Are additional power transmission lines necessary? Can energy storage mitigate the intermittency of wind and solar power? Which greenhouse gas (GHG) reductions are achievable? What is the cost of such changes? Crucially, what is the path to a better system? To support the transition to this new energy system, we propose: 1. to model the evolution of the Northeast power system as a Markov decision process (MDP), including crucial uncertainties, e.g. on technological advances and renewable energy costs; 2. to solve this decision process with dynamic programming and reinforcement learning techniques; 3. to derive energy and environmental policy intelligence from our computational results. Our methodological approach relies on two building blocks: an inter-regional energy model and a set of algorithmic tools to solve the model as an MDP.
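For readers unfamiliar with MDP solution methods, here is a generic value-iteration sketch on a toy, randomly generated MDP; the states, actions and rewards are placeholders, not the project’s energy model.

```python
import numpy as np

# Toy MDP: a few aggregate "system states" (e.g. levels of renewable capacity),
# actions = invest / wait, random transitions, rewards = negative costs.
n_states, n_actions, gamma = 5, 2, 0.95
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))   # P[s, a, s']
R = rng.normal(0, 1, size=(n_states, n_actions))                   # expected reward

V = np.zeros(n_states)
for _ in range(1000):                       # value iteration
    Q = R + gamma * P @ V                   # Q[s, a] = R + gamma * E[V(s')]
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)
print("greedy action per state:", policy)
```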
- Vincent Grégoire (HEC Montréal), Christian Dorion, Manuel Morales, Thomas Hurtut: Learning the Dynamics of the Limit Order Book
- Modern financial markets are increasingly complex. A particular topic of interest is how this complexity affects how easily investors can buy or sell securities at a fair price. Many have also raised concerns that algorithms trading at high frequency could create excess volatility and crash risk. The central objective of our research agenda is to better understand the fundamental forces at play in those markets where trading speed is now measured in nanoseconds. Our project seeks to lay the groundwork, using big data, visualization, and machine learning, to answer some of the most fundamental questions in the literature on market structure. Ultimately, we envision an environment in which we could learn the behavior of the various types of agents in a given market. Once such an environment is obtained, it would allow us to better understand, for instance, the main drivers of major market disruptions. More importantly, it could allow us to guide regulators in the design of new regulations, by testing them in a highly realistic simulation setup, thereby avoiding the unintended consequences associated with potential flaws in the proposed regulation.
- Mehmet Gumus (McGill University), Erick Delage, Arcan Nalca, Angelos Georghiou: Data-driven Demand Learning and Sharing Strategies for Two-Sided Online Marketplaces
- The proliferation of two-sided online platforms managed by a provider is disrupting the global retail industry by enabling consumers (on one side) and sellers (on the other side) to interact in ever-expanding ways. Evolving technologies such as artificial intelligence, big data analytics, distributed ledger technology, and machine learning pose challenges and opportunities for platform providers with regard to understanding the behaviors of the stakeholders: consumers and third-party sellers. In this research project, we will focus on two-sided platforms for which the demand-price relationship is unknown upfront and has to be learned from accumulating purchase data, which highlights the importance of the information-sharing environment. To address this problem, we will focus on the following closely connected research objectives: 1. Identify the willingness-to-pay and purchase decisions (i.e., conversion rate) of online customers based on how they respond to the design of product listing pages, online price and promotion information posted on the page, shipping and handling prices, and stock availability information. 2. Determine how much of the consumer data is shared with the sellers and quantify the value of different information-sharing configurations, given the sellers’ optimal pricing, inventory (product availability), and product assortment (variety) decisions within a setting.
- Julie Hussin (Université de Montréal), Sébastien Lemieux, Matthieu Ruiz, Yoshua Bengio, Ahmad Pesaranghader: Interpretability of Deep Learning Approaches Applied to Omics Datasets
- The high-throughput generation of molecular data (omics data) now allows researchers to look deeply into the biological variation that exists among individuals. This variation underlies differences in the risk of human diseases, as well as in the efficacy of their treatment. Studying it requires combining multiple biological levels (multi-omics) through flexible computational strategies, including machine learning (ML) approaches, which are becoming highly popular in biology and medicine, with particular enthusiasm for deep neural networks (DNNs). While these appear to be a natural way to analyze complex multi-omics datasets, applying such techniques to biomedical data poses an important challenge: the black-box problem. Once a model is trained, it can be difficult to understand why it gives a particular response to a set of data inputs. In this project, our goal is to train and apply state-of-the-art ML models to extract accurate predictive signatures from multi-omics datasets while focusing on biological interpretability. This will contribute to building the trust of the medical community in the use of these algorithms and will lead to deeper insights into the biological mechanisms underlying disease risk, pathogenesis and response to therapy.
- Jonathan Jalbert (Polytechnique Montréal), Françoise Bichai, Sarah Dorner, Christian Genest: Modelling precipitation-driven sewer overflows and developing tools adapted to the needs of the Ville de Montréal
- Fecal contamination of surface water is one of the leading causes of waterborne disease in both industrialized and developing countries. In urban areas, fecal contamination comes mostly from combined sewer overflows. During rainfall, stormwater enters the sewer network and mixes with wastewater on its way to the treatment plant. If the rainfall intensity exceeds the network’s transport capacity, the mixture of stormwater and wastewater is discharged directly into the receiving environment without passing through the treatment plant. These overflows pose an environmental risk and a public health issue. At present, the characteristics of the rainfall events that trigger overflows are poorly understood. This research project aims to take advantage of the overflow data recently made public by the Ville de Montréal to characterize the precipitation events that cause overflows on its territory. This characterization will make it possible, on the one hand, to estimate the number of overflows expected under the projected climate of the coming decades and, on the other hand, to size mitigation measures such as retention basins and rain gardens.
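As a purely illustrative sketch of how rainfall characteristics might be linked to overflow occurrence (synthetic data, a simple logistic model and a crude +10% intensity scenario; not the project’s methodology):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
rain = rng.gamma(shape=0.6, scale=8.0, size=3000)            # synthetic daily rainfall (mm)
p_true = 1 / (1 + np.exp(-(rain - 20) / 4))                  # unknown "true" overflow response
overflow = rng.binomial(1, p_true)                           # 1 = overflow recorded that day

model = LogisticRegression(max_iter=1000).fit(rain.reshape(-1, 1), overflow)

future_rain = rain * 1.10                                    # hypothetical wetter climate
p_now = model.predict_proba(rain.reshape(-1, 1))[:, 1]
p_fut = model.predict_proba(future_rain.reshape(-1, 1))[:, 1]
print(f"expected overflows per year: now {365 * p_now.mean():.1f}, "
      f"future {365 * p_fut.mean():.1f}")
```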
- Nadia Lahrichi (Polytechnique Montréal), Sebastien Le Digabel, Andrea Matta, Nicolas Zufferey, Andrea Lodi, Chunlong Yu: Reactive/learning/self-adaptive metaheuristics for healthcare resource scheduling
- The goal of this research proposal is to develop state-of-the-art decision support tools to address the fundamental challenges of accessible and quality health services. The challenges to meeting this mandate are real, and efficient resource management is a key factor in achieving this goal. This proposal will specifically focus on applications related to patient flow. Analysis of the literature shows that most research focuses on single-resource scheduling and assumes that demand is known; patient and resource scheduling problems are often solved sequentially and independently. The research goal is to develop efficient metaheuristic algorithms to solve integrated patient and resource scheduling problems under uncertainty (e.g., demand, profile, and availability of resources). This research will be divided into three main themes, each investigating a different avenue toward more efficient metaheuristics: A) learning approaches to better explore the search space; B) blackbox optimization for parameter tuning; and C) simulation-inspired approaches to control the noise induced by uncertainty.
- Eric Lécuyer (Université de Montréal), Mathieu Blanchette, Jérôme Waldispühl, William Hamilton: Deciphering RNA regulatory codes and their disease-associated alterations using machine learning.
- The human DNA genome serves as an instruction guide to allow the formation of all the cells and organs that make up our body over the course of our lives. Much of this genome is transcribed into RNA, termed the ‘transcriptome’, that serves as a key conveyor of genetic information and provides the template for the synthesis of proteins. The transcriptome is itself subject to many regulatory steps for which the basic rules are still poorly understood. Importantly, when these steps are improperly executed, this can lead to disease. This project aims to utilize machine learning approaches to decipher the complex regulatory code that controls the human transcriptome and to predict how these processes may go awry in different disease settings.
- Gregory Lodygensky (Université de Montréal), Jose Dolz, Josée Dubois, Jessica Wisnowski: Next generation neonatal brain segmentation built on HyperDense-Net, a fully automated real-world tool
- There is growing recognition that major breakthroughs in healthcare will result from the combination of databanks and artificial intelligence (AI) tools. This would be very helpful in the study of the neonatal brain and its alterations. For instance, the neonatal brain is extremely vulnerable to the biological consequences of prematurity or birth asphyxia, resulting in cognitive, motor, language and behavioural disorders. A key difference from adults is that crucial aspects of brain-related functions can only be tested several years later, greatly hindering the advancement of neonatal neuroprotection. Researchers and clinicians need objective tools to immediately assess the effectiveness of a therapy given to protect the brain, without waiting five years to see if it succeeded. Neonatal brain magnetic resonance imaging can bridge this gap. However, it represents a real challenge, as this period of life is one of intense brain growth (e.g. myelination and gyrification) and brain maturation. Thus, we plan to improve our existing neonatal brain segmentation tools (i.e. HyperDense-Net) using the latest iterations of AI tools. We will also develop a validated tool to determine objective brain maturation in newborns.
- Alexandra M. Schmidt (McGill University), Jill Baumgartner, Brian Robinson, Marília Carvalho, Oswaldo Cruz, Hedibert Lopes: Flexible multivariate spatio-temporal models for health and social sciences
- Health and socio-economic variables are commonly observed at different spatial scales of a region (e.g. districts of a city or provinces of a country), over a given period of time. Commonly, multiple variables are observed at a given spatial unit, resulting in high-dimensional data. The challenge in this case is to consider models that account for the possible correlation among variables across space, or across space and time. This project aims at developing statistical methodology that accounts for this complex hierarchical structure of the observed data. The inference procedure follows the Bayesian paradigm, meaning that uncertainty about the unknowns in the model is naturally accounted for. The project is subdivided into four subprojects that range from the estimation of a socio-economic vulnerability index for a given city to the spatio-temporal modelling of multiple vector-borne diseases. The statistical tools proposed here will help authorities understand the dynamics of multiple diseases across space and time, and assist with the decision-making process of evaluating how urban policies and programmes will impact the urban environment and population health, through a lens of health equity.
- Adam Oberman (McGill University), Michael Rabbat, Chris Finlay, Levon Nukbekyan: Robustness and generalization guarantees for Deep Neural Networks in security and safety critical applications
- Despite impressive human-like performance on many tasks, deep neural networks are surprisingly brittle in scenarios outside their previous experience, often failing when new experiences do not closely match previous ones. This ‘failure to generalize’ is a major hurdle impeding the adoption of an otherwise powerful tool in security- and safety-critical applications, such as medical image classification. The issue is in part due to a lack of theoretical understanding of why neural networks work so well: they are powerful tools but less interpretable than traditional machine learning methods, which have performance guarantees but do not work as well in practice. This research program will aim to address this ‘failure to generalize’ by developing generalization guarantees based on notions of the complexity of a regularized model, corresponding to model averaging. This approach will be tested in computer vision applications and will have near-term applications to medical health research, through medical image classification and segmentation. More broadly, the data science methods developed under this project will be applicable to a wide variety of fields and applications, notably wherever reliability and safety are paramount.
- Liam Paull (Université de Montréal), Derek Nowrouzezahrai, James Forbes: Differentiable perception, graphics, and optimization for weakly supervised 3D perception
- An ability to perceive and understand the world is a prerequisite for almost any embodied agent to achieve almost any task in the world. Typically, world representations are hand-constructed because it is difficult to learn them directly from sensor signals. In this work, we propose to build the components of this map-building procedure so that it is differentiable. Specifically, we will focus on the perception (grad-SLAM) and optimization (meta-LS) components. This will allow us to backpropagate error signals from the 3D world back to the sensor inputs, enabling us, for example, to regularize sensor data with 3D geometry. Finally, by also building a differentiable rendering component (grad-Sim), we can leverage self-supervision through cycle consistency to learn representations with no or sparse hand-annotated labels. Combining all of these components gives us the first method of world representation building that is completely differentiable and self-supervised.
- Gilles Pesant (Polytechnique Montréal), Siva Reddy, Sarath Chandar Anbil Parthipan: Investigating Combinations of Neural Networks and Constraint Programming for Structured Prediction
- Artificial intelligence is playing an ever larger role in many spheres of activity and in our daily lives. In particular, neural networks can now learn and then carry out tasks previously reserved for humans. However, when a task requires complex structuring rules to be respected, a neural network often struggles to learn those rules. Another area of artificial intelligence, constraint programming, was designed precisely to find solutions that satisfy such rules. The goal of this project is therefore to study combinations of these two approaches to artificial intelligence in order to learn more easily how to perform constrained tasks. Within the project we will focus on natural language processing, but our work will also be applicable to tasks in other domains.
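One simple way to picture such a combination (an illustrative sketch only, not the project’s approach) is constrained decoding, where the probabilities proposed by a network are renormalized over the values that a constraint solver declares feasible:

```python
import numpy as np

def masked_softmax(scores, feasible):
    """Renormalize neural scores over the values the constraints allow."""
    scores = np.where(feasible, scores, -np.inf)
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Hypothetical sequence-labeling step: the network proposes scores for 4 labels,
# while a constraint (e.g. a grammar or an all-different rule) forbids labels 1 and 3.
scores = np.array([2.0, 3.5, 0.1, 1.2])
feasible = np.array([True, False, True, False])
print(masked_softmax(scores, feasible))   # probability mass only on feasible labels
```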
- Jean-François Plante (HEC Montréal), Patrick Brown, Thierry Duchesne, Nancy Reid, Luc Villandré: Statistical inference and modelling for distributed systems
- Statistical inference requires a large toolbox of models and algorithms that can accommodate complex data structures. Modern datasets are often so large that they need to be stored on distributed systems, with the data stored across a number of nodes with limited bandwidth between them. Many complex statistical models cannot be used with such complex data, as they rely on the complete data being accessible. In this project, we will advance statistical modeling contributions to data science by creating solutions that are ideally suited for analysis on distributed systems. More specifically, we will develop spatio-temporal models as well as accurate and efficient approximations of general statistical models that are suitable for distributed data, and as such, scalable to massive data.
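A minimal sketch of the distributed-inference idea, for the special case of a linear model where exact aggregation of per-node summaries is possible (the project targets far richer spatio-temporal and general statistical models):

```python
import numpy as np

# Split-and-merge sketch: each node fits its local shard and only the small
# (X'X, X'y) summaries travel over the network, never the raw data.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])

def local_summaries(n):
    X = rng.normal(size=(n, 3))
    y = X @ beta_true + rng.normal(0, 0.1, size=n)
    return X.T @ X, X.T @ y

XtX, Xty = np.zeros((3, 3)), np.zeros(3)
for shard_size in (10_000, 25_000, 15_000):        # three nodes of a distributed system
    A, b = local_summaries(shard_size)
    XtX += A
    Xty += b

beta_hat = np.linalg.solve(XtX, Xty)               # identical to the full-data OLS fit
print(beta_hat)
```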
- Wei Qi (McGill University), Xue (Steve) Liu, Max Shen, Michelle Lu: Deals on Wheels: Advancing Joint ML/OR Methodologies for Enabling City-Wide, Personalized and Mobile Retail
- Moving toward a smart-city future, cities in Canada and around the world are embracing the emergence of new retail paradigms. Retail channels can diversify further beyond the traditional online and offline boundaries, combining the best of both. In this project, we focus on an emerging mobile retail paradigm in which retailers run their stores on mobile vehicles or self-driving cars. Our mission is to develop cross-disciplinary models, algorithms and data-verified insights for enabling mobile retail. We will achieve this mission by focusing on three interrelated research themes: Theme 1 – formulating novel optimization problems of citywide siting and inventory replenishment for mobile stores. Theme 2 – developing novel learning models for personalized demand estimation. Theme 3 – integrating Themes 1 and 2 by proposing a holistic algorithmic framework for joint and dynamic demand learning and retail operations, and for discovering managerial insights. The long-term goal is to thereby advance the synergy of operations and machine learning methodologies in the broad contexts of new retail and smart-city analytics.
- Marie-Ève Rancourt (HEC Montréal), Gilbert Laporte, Aurélie Labbe, Daniel Aloise, Valérie Bélanger, Joann de Zegher, Burcu Balcik, Marilène Cherkesly, Jessica Rodriguez Pereira: Humanitarian Supply Chain Analytics
- Network design problems lie at the heart of the most important issues faced in the humanitarian sector. However, given their complex nature, humanitarian supply chains involve the solution of difficult analytics problems. The main research question of this project is “how to better analyze imperfect information and address uncertainty to support decision making in humanitarian supply chains?”. To this end, we propose a methodological framework combining data analysis and optimization, which will be validated through real-life applications using multiple sources of data. First, we propose to build robust relief networks under uncertainty in demand and transportation accessibility, due to weather shocks and vulnerable infrastructures. We will consider two contexts: shelter location in Haiti and food aid distribution planning in Southeastern Asia. Second, we propose to embed fair cost sharing mechanisms into a collaborative prepositioning network design problem arising in the Caribbean. Classic economics methods will be adapted to solve large-scale stochastic optimization problems, and novel models based on catastrophic insurance theory will be proposed. Finally, a simulation will be developed to disguise data collection as a serious game and gather real-time information on the behavior of decision makers during disasters to extrapolate the best management strategies.
- Saibal Ray (McGill University), Maxime Cohen, James Clark, Ajung Moon: Retail Innovation Lab: Data Science for Socially Responsible Food Choices
- In this research program, we propose to investigate the use of artificial intelligence techniques, involving data, models, behavioral analysis, and decision-making algorithms, to efficiently provide higher convenience for retail customers while being socially responsible. In particular, the research objective of the multi-disciplinary team is to study, implement, and validate systems for guiding customers to make healthy food choices in a convenience store setting, while being cognizant of privacy concerns, both online and in a brick-and-mortar store environment. The creation of the digital infrastructure and decision support systems that encourage people and organizations to make health-promoting choices should hopefully result in a healthier population and reduce the costs of chronic diseases to the healthcare system. These systems should also foster the competitiveness of organizations operating in the agri-food and digital technology sectors. A distinguishing feature of this research program is that it will make use of a unique asset – a new “living-lab”, the McGill Retail Innovation Lab (MRIL). It will house a fully functioning retail store operated by a retail partner with extensive sensing, data access, and customer monitoring. The MRIL will be an invaluable source of data to use in developing and validating our approaches as well as a perfect site for running field experiments.
- Léo Raymond-Belzile (HEC Montréal), Johanna Nešlehová, Alexis Hannart, Jennifer Wadsworth: Combining extreme value theory and causal inference for data-driven flood hazard assessment
- The IPCC reports highlight increases in mean precipitation, but the impact of climate change on streamflow is less certain, and existing methodology is ill-equipped to predict changes in flood extremes. Our project looks into climate drivers impacting flood hazard and proposes methodological advances based on extreme value theory and causal inference in order to simulate realistic streamflow extremes at high resolution. The project will also investigate how climate drivers impact the hydrological balance, using tools from machine learning for causal discovery to enhance flood hazard risk assessment.
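As a small illustration of the peaks-over-threshold machinery from extreme value theory (synthetic data and simplifying assumptions, not the project’s analysis), the sketch below fits a generalized Pareto distribution to threshold exceedances and reads off a 100-year level:

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
flow = rng.lognormal(mean=3.0, sigma=0.6, size=40 * 365)     # ~40 years of synthetic daily flow
u = np.quantile(flow, 0.98)                                  # high threshold
exceed = flow[flow > u] - u                                  # exceedances above the threshold

shape, _, scale = genpareto.fit(exceed, floc=0)              # fit the tail model
rate = len(exceed) / 40                                      # exceedances per year
p = 1 - 1 / (100 * rate)                                     # quantile of the 100-year event
flood_100yr = u + genpareto.ppf(p, shape, loc=0, scale=scale)
print(f"estimated 100-year daily flow: {flood_100yr:.1f}")
```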
- Nicolas Saunier (Polytechnique Montréal), Francesco Ciari, Catherine Morency, Martin Trépanier, Lijun Sun: Bridging Data-Driven and Behavioural Models for Transportation
- Transportation data is traditionally collected through travel surveys and fixed sensors, mostly on the roadways: such data is expensive to collect and has limited spatial and temporal coverage. In recent years, more and more transportation data has become available on a continuous basis from multiple new sources, including users themselves. This has fed the rise of machine learning methods that can learn models directly from data. Yet, such models often lack robustness and may be difficult to transfer to a different region or period. This can be alleviated by taking advantage of domain knowledge stemming from the properties of the flow of people moving in transportation systems with daily activities. This project aims to develop hybrid methods relying on transportation and data-driven models to predict flows for all modes at different spatial and temporal scales using multiple sources of heterogeneous data. This results in two specific objectives: 1. to learn probabilistic flow models at the link level for several modes based on heterogeneous data; 2. to develop a method bridging the flow models (objective 1) with a dynamic multi-agent transportation model at the network level. These new models and methods will be developed and tested using real transportation data.
- Yvon Savaria (Polytechnique Montréal), François Leduc-Primeau, Elsa Dupraz, Jean-Pierre David, Mohamad Sawan: Ultra-Low-Energy Reliable DNN Inference Using Memristive Circuits for Biomedical Applications (ULERIM)
- Recent advances in machine learning based on deep neural networks (DNNs) have brought powerful new capabilities for many signal processing tasks. These advances also hold great promise for several applications in healthcare. However, state-of-the-art DNN architectures may depend on hundreds of millions of parameters that must be stored and then retrieved, resulting in large energy usage. It is therefore essential to reduce their energy consumption to allow in-situ computation. One possible approach involves memristor devices, a concept first proposed in 1971 but only recently put into practice. Memristors are a very promising way to implement compact and energy-efficient artificial neural networks. The aim of this research is to advance the state of the art in the energy-efficient implementation of deep neural networks using memristive circuits, introducing DNN-specific methods to better manage the uncertainty inherent in integrated circuit fabrication. These advances will benefit many medical applications in which portable devices must perform a complex analysis of the patient’s state, and will more generally benefit the field of machine learning by reducing the energy required to apply it. Within this project, the energy improvements will be exploited to improve the signal processing performance of an embedded biomedical device for the advanced detection of epileptic seizures.
- David Stephens (McGill University), Yu Luo, Erica Moodie, David Buckeridge, Aman Verma: Statistical modelling of health trajectories and interventions
- Large amounts of longitudinal health records are now collected in private and public healthcare systems. Data from sources such as electronic health records, healthcare administrative databases and data from mobile health applications are available to inform clinical and public health decision-making. In many situations, such data enable the dynamic monitoring of the underlying disease process that governs the observations. However, this process is not observed directly and so inferential methods are needed to ascertain progression. The objective of the project is to build a comprehensive Bayesian computational framework for performing inference for large scale health data. In particular, the project will focus on the analysis of records that arise in primary and clinical care contexts to study patient health trajectories, that is, how the health status of a patient changes across time. Having been able to infer the mechanisms that influence health trajectories, we will then be able to introduce treatment intervention policies that aim to improve patient outcomes.
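A toy hidden-state sketch (hypothetical transition and emission probabilities, far simpler than the Bayesian framework envisioned) shows how an unobserved disease process can be filtered from noisy observations:

```python
import numpy as np

# A patient's unobserved health state (0 = healthy, 1 = ill) evolves over visits;
# we only see a noisy test result at each visit and filter the state forward in time.
trans = np.array([[0.95, 0.05],     # P(next state | current state)
                  [0.10, 0.90]])
emit = np.array([[0.85, 0.15],      # P(test result | state)
                 [0.20, 0.80]])
prior = np.array([0.9, 0.1])
obs = [0, 0, 1, 1, 1]               # observed test results across visits

alpha = prior * emit[:, obs[0]]
alpha /= alpha.sum()
for o in obs[1:]:                   # forward filtering (hidden Markov model recursion)
    alpha = (alpha @ trans) * emit[:, o]
    alpha /= alpha.sum()
print("P(ill at last visit | all tests) =", alpha[1])
```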
- An Tang (Université de Montréal), Irina Rish, Guy Wolf, Guy Cloutier, Samuel Kadoury, Eugene Belilovsky, Michaël Chassé, Bich Nguyen: Ultrasound classification of chronic liver disease with deep learning
- Chronic liver disease is one of the top ten leading causes of death in North America. The most common form is nonalcoholic fatty liver disease which may evolve to nonalcoholic steatohepatitis and cirrhosis if left untreated. In many cases, the liver may be damaged without any symptoms. A liver biopsy is currently required to evaluate the severity of chronic liver disease. This procedure requires the insertion of a needle inside the liver to remove a small piece of tissue for examination under microscope. Liver biopsy is an invasive procedure with a risk of major complications such as bleeding. Ultrasound is ideal for screening patients because it is a safe and widely available technology to image the whole liver. Our multi-disciplinary team is proposing the use of novel artificial intelligence techniques to assess the severity of chronic liver disease from ultrasound images and determine the severity of liver fat, inflammation, and fibrosis without the need for liver biopsy. This study is timely because chronic liver disease is on the rise which means that complications and mortality will continue to rise if there is no alternative technique for early detection and monitoring of disease severity.
- Guy Wolf (Université de Montréal), Will Hamilton, Jian Tang: Unified approach to graph structure utilization in data science
- While deep neural networks are at the frontier of machine learning and data science research, their most impressive results come from data with clear spatial/temporal structure (e.g., images or audio signals) that informs network architectures designed to capture semantic information (e.g., textures, shapes, or phonemes). Recently, multiple attempts have been made to extend such architectures to the non-Euclidean structures that typically exist in data, and in particular to graphs that model data geometry or interactions between data elements. However, so far, such attempts have been conducted separately by largely independent communities, leveraging specific tools from traditional/spectral graph theory, graph signal processing, or applied harmonic analysis. We propose a multidisciplinary unified approach (combining computer science, applied mathematics, and decision science perspectives) for understanding deep graph processing. In particular, we will establish connections between spectral and traditional graph theory applied to this task, introduce rich notions of intrinsic graph regularity (e.g., the equivalent of image textures), and enable continuous-depth graph processing (i.e., treating depth as time) to capture multiresolution local structures. Our computational framework will unify the multitude of existing disparate attempts and establish rigorous foundations for the emerging field of geometric deep learning, a rapidly growing area of machine learning.
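For concreteness, a single graph-convolution step of the kind such architectures generalize can be sketched as follows (a generic GCN-style layer with random weights on a tiny graph, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],          # adjacency matrix of a 4-node graph
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_hat = A + np.eye(4)                              # add self-loops
D_inv_sqrt = np.diag(1 / np.sqrt(A_hat.sum(axis=1)))
S = D_inv_sqrt @ A_hat @ D_inv_sqrt                # symmetrically normalized propagation operator

X = rng.normal(size=(4, 8))                        # node features
W = rng.normal(size=(8, 16))                       # layer weights
H = np.maximum(S @ X @ W, 0.0)                     # ReLU(S X W): one graph-convolution layer
print(H.shape)
```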