2024
Journal articles
- title
- On the consistency of supervised learning with missing values
- author
- Julie Josse, Jacob M. Chen, Nicolas Prost, Gaël Varoquaux, Erwan Scornet
- article
- Statistical Papers, 2024, 65 (9), pp.5447-5479. ⟨10.1007/s00362-024-01550-4⟩
- short abstract
- In many application settings, the data have missing entries which make analysis challenging. An abun …..
- title
- Causal inference methods for combining randomized trials and observational studies: a review
- author
- Bénédicte Colnet, Imke Mayer, Guanhua Chen, Awa Dieng, Ruohong Li, Gaël Varoquaux, Jean-Philippe Vert, Julie Josse, Shu Yang
- article
- Statistical Science, In press
- short abstract
- With increasing data availability, causal effects can be evaluated across different data sets, both …..
2023
Journal articles
- title
- Relational Data Embeddings for Feature Enrichment with Background Information
- author
- Alexis Cvetkov-Iliev, Alexandre Allauzen, Gaël Varoquaux
- article
- Machine Learning, 2023, 112 (2), pp.687-720. ⟨10.1007/s10994-022-06277-7⟩
- short abstract
- For many machine-learning tasks, augmenting the data table at hand with features built from external …..
Book sections
- title
- Evaluating machine learning models and their diagnostic value
- author
- Gaël Varoquaux, Olivier Colliot
- article
- Olivier Colliot. Machine Learning for Brain Disorders, Springer, 2023
- short abstract
- This chapter describes model validation, a crucial part of machine learning whether it is to select …..
2022
Journal articles
- title
- Machine learning for medical imaging: methodological failures and recommendations for the future
- author
- Gaël Varoquaux, Veronika Cheplygina
- article
- npj Digital Medicine, 2022, 5 (1), pp.48. ⟨10.1038/s41746-022-00592-y⟩
- short abstract
- Research in computer analysis of medical images bears many promises to improve patients’ health. H …..
- title
- Causal effect on a target population: a sensitivity analysis to handle missing covariates
- author
- Bénédicte Colnet, Julie Josse, Gaël Varoquaux, Erwan Scornet
- article
- Journal of Causal Inference, 2022, 10 (1), pp.372-414. ⟨10.1515/jci-2021-0059⟩
- short abstract
- Randomized Controlled Trials (RCTs) are often considered as the gold standard to conclude on the cau …..
- title
- How to remove or control confounds in predictive models, with applications to brain biomarkers
- author
- Darya Chyzhyk, Gaël Varoquaux, Michael Milham, Bertrand Thirion
- article
- GigaScience, 2022, 11, ⟨10.1093/gigascience/giac014⟩
- short abstract
- Background: With increasing data sizes and more easily available computational methods, neuroscienc …..
- title
- Analytics on Non-Normalized Data Sources: more Learning, rather than more Cleaning
- author
- Alexis Cvetkov-Iliev, Alexandre Allauzen, Gaël Varoquaux
- article
- IEEE Access, 2022, 10, pp.42420-42431. ⟨10.1109/ACCESS.2022.3168013⟩
- short abstract
- Data analysis is increasingly performed over data assembled from uncontrolled sources, facing incons …..
- title
- Benchmarking missing-values approaches for predictive models on health databases
- author
- Alexandre Perez-Lebel, Gaël Varoquaux, Marine Le Morvan, Julie Josse, Jean-Baptiste Poline
- article
- GigaScience, In press, ⟨10.1093/gigascience/giac013⟩
- short abstract
- Background: As databases grow larger, it becomes harder to fully control their collection, and they …..
2021
Journal articles
- title
- Preventing dataset shift from breaking machine-learning biomarkers
- author
- Jérôme Dockès, Gaël Varoquaux, Jean-Baptiste Poline
- article
- GigaScience, In press, ⟨10.1093/gigascience/giab055⟩
- short abstract
- Machine learning brings the hope of finding new biomarkers extracted from cohorts with rich biomedic …..
Conference papers
- title
- AI as statistical methods for imperfect theories
- author
- Gaël Varoquaux
- article
- NeurIPS 2021 – 35th Conference on Neural Information Processing Systems. Workshop: AI for Science, Dec 2021, Virtual, France
- short abstract
- Science has progressed by reasoning on what models could not predict because they were missing impor …..
- title
- What’s a good imputation to predict with missing values?
- author
- Marine Le Morvan, Julie Josse, Erwan Scornet, Gaël Varoquaux
- article
- NeurIPS 2021 – 35th Conference on Neural Information Processing Systems, Dec 2021, Virtual, France. ⟨10.48550/arXiv.2106.00311⟩
- short abstract
- How to learn a good predictor on data with missing values? Most efforts focus on first imputing as w …..
- title
- Accounting for variance in machine learning benchmarks
- author
- Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov, Brennan Nichyporuk, Justin Szeto, Naz Sepah, Edward Raff, Kanika Madan, Vikram Voleti, Samira Ebrahimi Kahou, Vincent Michalski, Dmitriy Serdyuk, Tal Arbel, Chris Pal, Gaël Varoquaux, Pascal Vincent
- article
- MLSys 2021 – 4th Conference on Machine Learning and Systems, Apr 2021, San Francisco (virtual), United States
- short abstract
- Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally ca …..
- title
- A lightweight neural model for biomedical entity linking
- author
- Lihu Chen, Gaël Varoquaux, Fabian Suchanek
- article
- AAAI 2021 – The Thirty-Fifth Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence, Feb 2021, Palo Alto (virtual), United States. pp.12657-12665
- short abstract
- Biomedical entity linking aims to map biomedical mentions, such as diseases and drugs, to standard e …..
2020
Journal articles
- title
- Tropical Cyclone Track Forecasting using Fused Deep Learning from Aligned Reanalysis Data
- author
- Sophie Giffard-Roisin, Mo Yang, Guillaume Charpiat, Christina Kumler Bonfanti, Balázs Kégl, Claire Monteleoni
- article
- Frontiers in Big Data, 2020, 3, pp.1. ⟨10.3389/fdata.2020.00001⟩
- short abstract
- The forecast of tropical cyclone trajectories is crucial for the protection of people and property. …..
- title
- An Experimental Study of State-of-the-Art Entity Alignment Approaches
- author
- Xiang Zhao, Weixin Zeng, Jiuyang Tang, Wei Wang, Fabian Suchanek
- article
- IEEE Transactions on Knowledge and Data Engineering, 2020, ⟨10.1109/TKDE.2020.3018741⟩
- short abstract
- Entity alignment (EA) finds equivalent entities that are located in different knowledge graphs (KGs) …..
- title
- Encoding high-cardinality string categorical variables
- author
- Patricio Cerda, Gaël Varoquaux
- article
- IEEE Transactions on Knowledge and Data Engineering, In press, ⟨10.1109/TKDE.2020.2992529⟩
- short abstract
- Statistical models usually require vector representations of categorical variables, using for instan …..
Conference papers
- title
- NeuMiss networks: differentiable programming for supervised learning with missing values
- author
- Marine Le Morvan, Julie Josse, Thomas Moreau, Erwan Scornet, Gaël Varoquaux
- article
- NeurIPS 2020 – 34th Conference on Neural Information Processing Systems, Dec 2020, Vancouver / Virtual, Canada
- short abstract
- The presence of missing values makes supervised learning much more challenging. Indeed, previous wor …..
- title
- Linear predictor on linearly-generated data with missing values: non consistency and solutions
- author
- Marine Le Morvan, Nicolas Prost, Julie Josse, Erwan Scornet, Gaël Varoquaux
- article
- AISTATS 2020 – International Conference on Artificial Intelligence and Statistics, Aug 2020, Online, France. pp.3165-3174
- short abstract
- We consider building predictors when the data have missing values. We study the seemingly-simple cas …..
2019
Conference papers
- title
- Comparing distributions: $\ell_1$ geometry improves kernel two-sample testing
- author
- Meyer Scetbon, Gaël Varoquaux
- article
- NeurIPS 2019 – 33rd Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada
2018
Journal articles
- title
- Atlases of cognition with large-scale human brain mapping
- author
- Gaël Varoquaux, Yannick Schwartz, Russell A Poldrack, Baptiste Gauthier, Danilo Bzdok, Jean-Baptiste Poline, Bertrand Thirion
- article
- PLoS Computational Biology, 2018, 14 (11), pp.e1006565. ⟨10.1371/journal.pcbi.1006565⟩
- short abstract
- To map the neural substrate of mental function, cognitive neuroimaging relies on controlled psycholo …..
- title
- Similarity encoding for learning with dirty categorical variables
- author
- Patricio Cerda, Gaël Varoquaux, Balázs Kégl
- article
- Machine Learning, 2018, ⟨10.1007/s10994-018-5724-2⟩
- short abstract
- For statistical learning, categorical variables in a table are usually considered as discrete entiti …..