Logan Search
In collaboration with Téo Lemane from CEA Genoscope, our partner Rayan Chikhi from the Sequences Bioinformatics team at Pasteur Institute, and Artem Babaian’s group, we introduced “Logan Search”, which lets you search for any DNA sequence in minutes, bringing Earth’s largest genomic resource to your fingertips.
Under the hood, we built a 1-petabyte k-mer index covering all 27 million sequencing datasets deposited in the SRA up to December 2023.
Logan Search decomposes your query into its k-mers (k=31) and, in the time it takes to brew a coffee, retrieves every dataset containing those k-mers. It’s the only service working at this scale.
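To make the k-mer step concrete, here is a minimal sketch of how a query can be split into overlapping 31-mers and compared against a dataset's k-mer set. The function names `extract_kmers` and `shared_fraction` are hypothetical, and the scoring shown (fraction of distinct query k-mers found) is an assumption for illustration; it is not the Logan Search implementation, whose index and scoring are not described here.

```python
def extract_kmers(sequence, k=31):
    """Return every overlapping k-mer of `sequence`, left to right."""
    seq = sequence.upper()
    # A sequence of length n has n - k + 1 overlapping k-mers (none if n < k).
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def shared_fraction(query, dataset_kmers, k=31):
    """Fraction of the query's distinct k-mers found in a dataset's k-mer set.

    `dataset_kmers` is a plain Python set standing in for a real index
    (an assumption made for this toy example).
    """
    query_kmers = set(extract_kmers(query, k))
    if not query_kmers:
        return 0.0
    return len(query_kmers & dataset_kmers) / len(query_kmers)
```

For example, a 40-nucleotide query yields 40 - 31 + 1 = 10 k-mers, each of which would be looked up in the index.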
The matching datasets can be explored with custom plots in Logan Search, which draws on a harmonized set of query and SRA metadata, including sequencing technology, molecule type, geographic distribution, and sample origin. Learn more about your sequence.
Logan Search returns a list of SRA accessions, not alignments. To bring you closer to the data, we’ve also created a microservice that instantly retrieves the Logan contigs matching your search.
Back to sequences
We proposed a simple yet useful tool for working with large genomic datasets: “Back to sequences: Find the origin of k-mers” [1].
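The idea of tracing k-mers back to the sequences they come from can be illustrated with a small in-memory sketch. The names `find_origins`, `canonical` and `reverse_complement` are hypothetical, and treating a k-mer and its reverse complement as the same (canonical) k-mer is an assumption; this is a toy illustration, not the tool's actual implementation or interface.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Lexicographically smaller of a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def find_origins(kmers, sequences, k):
    """Map each sequence name to the number of its positions whose k-mer
    belongs to `kmers`; sequences with no hit are omitted.

    `sequences` is a dict of name -> DNA string (toy stand-in for a FASTA file).
    """
    wanted = {canonical(kmer.upper()) for kmer in kmers}
    hits = {}
    for name, seq in sequences.items():
        seq = seq.upper()
        count = sum(
            1
            for i in range(len(seq) - k + 1)
            if canonical(seq[i:i + k]) in wanted
        )
        if count:
            hits[name] = count
    return hits
```

With `k=3`, the k-mer `ACG` and the sequences `{"s1": "TTACGTT", "s2": "GGGGG"}`, only `s1` is reported, with two hits: `ACG` itself and `CGT`, its reverse complement.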
[1] Baire et al. (2024). Back to sequences: Find the origin of k-mers. Journal of Open Source Software, 9(101), 7066. https://doi.org/10.21105/joss.07066