SWAGR (2018-2020)
Statistical Workforce for Advanced Genomics using RNAseq
Principal Investigators
- Boris Hejblum (Inria Project-Team SISTM, Center of Bordeaux)
- Denis Agniel (RAND Corporation)
The two principal investigators met at Harvard University during their postdoctoral fellowships. Their ongoing collaboration was strengthened when their joint project, SWAGR, was selected for the Inria Associate Teams program, and members of their respective research teams then joined the project.
Research topic
The SWAGR Associate Team aims to bring together a statistical workforce to develop advanced methods for analyzing high-dimensional genomics data in the context of vaccine studies. SWAGR combines the expertise of the SISTM team from Inria BSO and of the Statistics group at the RAND Corporation in an effort to improve RNAseq data analysis methods by developing a flexible, robust, and mathematically principled framework for detecting differential gene expression.
Gene expression (measured through RNA-seq technology) has the potential to reveal deep and complex biological mechanisms underlying human health. However, widely adopted approaches for analyzing such data currently suffer from a critical limitation that inflates the number of false positives in analysis results. This problem is exacerbated in single-cell RNA-seq data, where sample sizes are much larger due to the finer cellular resolution. False positives are an important issue in all of science, and a particularly pressing one in biomedical research, where costly studies are failing to reproduce earlier results.
SWAGR has fostered the development of several statistical methodologies for the analysis of high-dimensional genomics data. In particular, three methods have been developed within SWAGR:
- dearseq, for the analysis of RNA-seq data without introducing false positives,
- ccdf, for the analysis of single-cell RNA-seq data without assuming any distribution on the zero counts,
- crossurr, for the evaluation of high-dimensional surrogate markers such as gene expression.
All three methods are distribution-free: they do not rely on probabilistic distribution assumptions about the data-generating process for genomics data. This key idea is at the center of the research developed in SWAGR.
Outcomes
Boris Hejblum: “As two junior researchers both starting in our respective positions, SWAGR has helped us structure and advance our research agenda, giving us credibility and some financial autonomy. It materialized our previously informal collaboration and enabled us to involve interns and students in our projects, giving them the opportunity to travel and experience an international research environment.
While the impact of SWAGR has been somewhat hindered by the pandemic, cutting short our plan to hold a workshop on statistical genomics applied to vaccine research (a timely topic), it has led to several software and method developments as well as scientific articles, one already published and two still in progress.
The Associate Teams program is a very good one, especially for junior researchers. Making limited funding available for small research projects in this way was particularly helpful in stabilizing and strengthening our collaboration.”
Future of the partnership
This fruitful collaboration will continue. The teams will redefine their collaborative research objectives and then submit a new Associate Team project in 2021.
- To learn more about this Associate Team
- To find out about the other Associate Teams of Inria@SiliconValley
- And the Inria program