Publications

HAL publications of the ANR project ANR-17-CE23-0018

2022

Journal articles

Title
Benchmarking missing-values approaches for predictive models on health databases
Authors
Alexandre Perez-Lebel, Gaël Varoquaux, Marine Le Morvan, Julie Josse, Jean-Baptiste Poline
Reference
GigaScience, BioMed Central, In press
Short abstract
BACKGROUND: As databases grow larger, it becomes harder to fully control their collection, and they …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-03526292/file/Benchmarking%20missing-values%20approaches%20for%20predictive%20models%20on%20health%20databases.pdf BibTeX

2021

Journal articles

Title
Preventing dataset shift from breaking machine-learning biomarkers
Authors
Jérôme Dockès, Gaël Varoquaux, Jean-Baptiste Poline
Reference
GigaScience, BioMed Central, In press
Short abstract
Machine learning brings the hope of finding new biomarkers extracted from cohorts with rich biomedic …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-03293375/file/main.pdf BibTeX

Conference papers

Title
AI as statistical methods for imperfect theories
Author
Gaël Varoquaux
Reference
NeurIPS 2021 – 35th Conference on Neural Information Processing Systems. Workshop: AI for Science, Dec 2021, Virtual, France
Short abstract
Science has progressed by reasoning on what models could not predict because they were missing impor …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-03474791/file/paper.pdf BibTeX
Title
What’s a good imputation to predict with missing values?
Authors
Marine Le Morvan, Julie Josse, Erwan Scornet, Gaël Varoquaux
Reference
NeurIPS 2021 – 35th Conference on Neural Information Processing Systems, Dec 2021, Virtual, France
Short abstract
How to learn a good predictor on data with missing values? Most efforts focus on first imputing as w …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-03243931/file/LeMorvan2021_ImputeThenRegress.pdf BibTeX
Title
Accounting for variance in machine learning benchmarks
Authors
Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov, Brennan Nichyporuk, Justin Szeto, Naz Sepah, Edward Raff, Kanika Madan, Vikram Voleti, Samira Kahou, Vincent Michalski, Dmitriy Serdyuk, Tal Arbel, Chris Pal, Gaël Varoquaux, Pascal Vincent
Reference
MLSys 2021 – 4th Conference on Machine Learning and Systems, Apr 2021, San Francisco (virtual), United States
Short abstract
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally ca …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-03177159/file/main.pdf BibTeX
Title
A Lightweight Neural Model for Biomedical Entity Linking
Authors
Lihu Chen, Gaël Varoquaux, Fabian Suchanek
Reference
The Thirty-Fifth AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence, Feb 2021, Palo Alto, United States. pp.14
Short abstract
Biomedical entity linking aims to map biomedical mentions, such as diseases and drugs, to standard e …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-03086044/file/Biomedical_Entity_Linking.pdf BibTeX

Preprints, Working Papers, …

Title
Causal effect on a target population: a sensitivity analysis to handle missing covariates
Authors
Bénédicte Colnet, Julie Josse, Erwan Scornet, Gaël Varoquaux
Reference
2021
Short abstract
Randomized Controlled Trials (RCTs) are often considered as the gold standard to conclude on the cau …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-03473691/file/missing-cov.pdf BibTeX

2020

Journal articles

Title
Encoding high-cardinality string categorical variables
Authors
Patricio Cerda, Gaël Varoquaux
Reference
IEEE Transactions on Knowledge and Data Engineering, Institute of Electrical and Electronics Engineers, In press, ⟨10.1109/TKDE.2020.2992529⟩
Short abstract
Statistical models usually require vector representations of categorical variables, using for instan …
Full text and BibTeX
https://hal.inria.fr/hal-02171256/file/article.pdf BibTeX

Conference papers

Title
NeuMiss networks: differentiable programming for supervised learning with missing values
Authors
Marine Le Morvan, Julie Josse, Thomas Moreau, Erwan Scornet, Gaël Varoquaux
Reference
NeurIPS 2020 – 34th Conference on Neural Information Processing Systems, Dec 2020, Vancouver / Virtual, Canada
Short abstract
The presence of missing values makes supervised learning much more challenging. Indeed, previous wor …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-02888867/file/main.pdf BibTeX
Title
Linear predictor on linearly-generated data with missing values: non consistency and solutions
Authors
Marine Le Morvan, Nicolas Prost, Julie Josse, Erwan Scornet, Gaël Varoquaux
Reference
AISTATS 2020 – International Conference on Artificial Intelligence and Statistics, Aug 2020, Online, France. pp.3165-3174
Short abstract
We consider building predictors when the data have missing values. We study the seemingly-simple cas …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-02464569/file/aistats.pdf BibTeX

Preprints, Working Papers, …

Title
Causal inference methods for combining randomized trials and observational studies: a review
Authors
Bénédicte Colnet, Imke Mayer, Guanhua Chen, Awa Dieng, Ruohong Li, Gaël Varoquaux, Jean-Philippe Vert, Julie Josse, Shu Yang
Reference
2020
Short abstract
With increasing data availability, treatment causal effects can be evaluated across different datase …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-03008276/file/main.pdf BibTeX
Title
On the consistency of supervised learning with missing values
Authors
Julie Josse, Nicolas Prost, Erwan Scornet, Gaël Varoquaux
Reference
2020
Short abstract
In many application settings, the data have missing entries which make analysis challenging. An abun …
Full text and BibTeX
https://hal.archives-ouvertes.fr/hal-02024202/file/main.pdf BibTeX

2019

Conference papers

Title
Comparing distributions: $\ell_1$ geometry improves kernel two-sample testing
Authors
Meyer Scetbon, Gaël Varoquaux
Reference
NeurIPS 2019 – 33rd Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada
Full text and BibTeX
https://hal.inria.fr/hal-02292545/file/NIPS_L1_test-HAL-v2%20%281%29.pdf BibTeX

2018

Journal articles

Title
Atlases of cognition with large-scale human brain mapping
Authors
Gaël Varoquaux, Yannick Schwartz, Russell Poldrack, Baptiste Gauthier, Danilo Bzdok, Jean-Baptiste Poline, Bertrand Thirion
Reference
PLoS Computational Biology, Public Library of Science, 2018, 14 (11), pp.e1006565. ⟨10.1371/journal.pcbi.1006565⟩
Short abstract
To map the neural substrate of mental function, cognitive neuroimaging relies on controlled psycholo …
Full text and BibTeX
https://www.hal.inserm.fr/inserm-02146700/file/journal.pcbi.1006565.pdf BibTeX
Title
Similarity encoding for learning with dirty categorical variables
Authors
Patricio Cerda, Gaël Varoquaux, Balázs Kégl
Reference
Machine Learning, Springer Verlag, 2018, ⟨10.1007/s10994-018-5724-2⟩
Short abstract
For statistical learning, categorical variables in a table are usually considered as discrete entiti …
Full text and BibTeX
https://hal.inria.fr/hal-01806175/file/article_hal.pdf BibTeX
