

{"id":222,"date":"2018-08-24T15:00:14","date_gmt":"2018-08-24T13:00:14","guid":{"rendered":"https:\/\/project.inria.fr\/treerecs\/?page_id=222"},"modified":"2020-05-26T15:34:54","modified_gmt":"2020-05-26T13:34:54","slug":"tutorial","status":"publish","type":"page","link":"https:\/\/project.inria.fr\/treerecs\/tutorial\/","title":{"rendered":"Tutorial"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p><em>Treerecs<\/em> allows for the correction of one or more gene trees with respect to a reference species tree. These trees are thus required as inputs.<br \/>\nIn addition, the leaves from the gene tree(s) must be associated with their corresponding species in the species tree. This can be done in several ways, which we walk through below.<\/p>\n<p>Required input:<\/p>\n<ul>\n<li>A reference species tree<\/li>\n<li>One or more gene tree(s) to be corrected<\/li>\n<\/ul>\n<p>Output:<\/p>\n<ul>\n<li>One or more corrected gene tree(s) for each input gene tree<\/li>\n<\/ul>\n<p><!-- -------------------- Simple case -------------------- --><\/p>\n<h2 id=\"simple-example\">A simple example<\/h2>\n<p>You will find a very simple example in the <em>examples\/tutorial\/1-simple<\/em> directory.<\/p>\n<h4>Input<\/h4>\n<p>Let&#8217;s go into this directory and check out its contents:<br \/>\n<code>$ cd path\/to\/treerecs\/examples\/tutorial\/1-simple<br \/>\n$ ls<br \/>\ngene_tree  species_tree<br \/>\n$ cat species_tree<br \/>\n((a, b), c);<br \/>\n$ cat gene_tree<br \/>\n(a1:2.52, a2:2.15, b1:1.61, b2:1.93, b3:1.81, c1:3.4);<br \/>\n<\/code><br \/>\nFor those of you who are not familiar with the Newick format, here&#8217;s a graphical representation of these trees:<\/p>\n<div class=\"alignleft\">\n  <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/project.inria.fr\/treerecs\/files\/2018\/08\/species_tree-1.png\" alt=\"\" width=\"100\" height=\"130\" \/><\/p>\n<p class=\"wp-caption-text\">Species tree<\/p>\n<\/div>\n<div class=\"alignleft\">\n  <img 
loading=\"lazy\" decoding=\"async\" src=\"https:\/\/project.inria.fr\/treerecs\/files\/2018\/08\/gene_tree.png\" alt=\"\" width=\"100\" height=\"130\" \/><\/p>\n<p class=\"wp-caption-text\">Gene tree<\/p>\n<\/div>\n<div style=\"clear:both;\"><\/p>\n<h4>Run Treerecs<\/h4>\n<p>Let&#8217;s run the following command and check the result:<br \/>\n<code>$ treerecs -s species_tree -g gene_tree<br \/>\nTreerecs, Inria - team Beagle, 2017<br \/>\nSolution saved in treerecs_output\/gene_tree_recs.nwk<br \/>\nTotal elapsed time 0.011 seconds.<br \/>\n<\/code><\/p>\n<h4>Inspect output<\/h4>\n<p><code>$ cat treerecs_output\/gene_tree_recs.nwk<br \/>\n> family 1 tree 1 (total cost = 4, duplications = 2, losses = 0, contraction threshold = 0, execution time = 0.003 s.)<br \/>\n(c1:1.7,((a2:2.15,b3:1.81):1e-06,(a1:2.52,(b1:1.61,b2:1.93):1e-06):1e-06):1.7);<br \/>\n<\/code><\/p>\n<p>Graphical representation:<\/p>\n<div class=\"alignnone\">\n  <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/project.inria.fr\/treerecs\/files\/2018\/08\/corrected_gene_tree-1.png\" alt=\"\" width=\"400\" height=\"380\" \/><\/p>\n<p class=\"wp-caption-text\">Corrected gene tree<\/p>\n<\/div>\n<h4>Graphical output<\/h4>\n<p>Here, we have used an external Newick visualization tool to generate the graphical representations of our trees.<br \/>\nThere is also an option to get graphical output straight from <em>Treerecs<\/em>. 
It outputs an SVG displaying the gene trees <em>embedded in the species tree<\/em>.<\/p>\n<p>To get this output, use the -O switch with the value svg (the -o option used here sets the output directory):<br \/>\n<code>$ treerecs -s species_tree -g gene_tree -O svg -o treerecs_output_with_svg<br \/>\nTreerecs, Inria - team Beagle, 2017<br \/>\nSolution saved in treerecs_output_with_svg\/gene_tree_recs.svg<br \/>\nTotal elapsed time 0.01 seconds.<br \/>\n<\/code><\/p>\n<p>And here&#8217;s the generated SVG (the star represents the gene tree root, squares represent gene duplications):<br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/project.inria.fr\/treerecs\/files\/2018\/08\/gene_tree_reconciliation-300x226.png\" alt=\"\" width=\"300\" height=\"226\" class=\"alignnone\" \/><\/p>\n<p><!-- -------------------- Gene-Species mapping -------------------- --><\/p>\n<h2 id=\"mapping-example\">Specifying the gene-species mapping<\/h2>\n<h4>When no explicit mapping is needed<\/h4>\n<p>In the simple example above, we did not have to worry about the gene-species mapping, <em>i.e.<\/em> about specifying which species each gene corresponds to. This is because every gene in the gene tree had the corresponding species name embedded in its own name. For example, gene <em>a1<\/em> corresponds to species <em>a<\/em> (&#8220;a&#8221; is a substring of &#8220;a1&#8221;). 
Usually, this is more explicit: for example, gene <em>ENSP00000364946<\/em> belonging to species <em>Homo-sapiens<\/em> can be named Homo-sapiens_ENSP00000364946.<\/p>\n<p>In short, whatever format is used for the gene names, if each gene name contains the corresponding species name as a substring, Treerecs can build the gene-species mapping automatically from this information.<br \/>\nAlternatively, Treerecs can use the NHX &#8216;S&#8217; (species name) tag to build the mapping.<\/p>\n<h4>Explicit mapping<\/h4>\n<p>If your data does not allow for automatic mapping, you will have to provide the gene-species mapping explicitly as a separate file.<br \/>\nYou will find an example in the <em>examples\/tutorial\/2-mapping<\/em> directory.<\/p>\n<p>If you try running <em>Treerecs<\/em> as we did before, with only the gene tree and species tree as input, <em>Treerecs<\/em> will complain that it can&#8217;t map some genes to any species:<br \/>\n<code>$ cd path\/to\/treerecs\/examples\/tutorial\/2-mapping<br \/>\n$ treerecs -s species_tree -g gene_tree<br \/>\nTreerecs, Inria - team Beagle, 2017<br \/>\nError during gene <> species mapping, some gene leaves cannot be mapped:<br \/>\na1, a2.<br \/>\n<\/code><br \/>\nIndeed, in this example, the species names are <em>aaa<\/em>, <em>b<\/em> and <em>c<\/em>. Genes <em>a1<\/em> and <em>a2<\/em> therefore don&#8217;t include a species name in their own name and can&#8217;t be mapped automatically.<\/p>\n<p><em>Treerecs<\/em> can be used in such a setting by handing it a mapping file. 
This file must contain one line per gene, each line consisting of the gene name, whitespace, and the name of the corresponding species.<br \/>\nIn our example, the mapping file contains:<br \/>\n<code>a1 aaa<br \/>\na2 aaa<br \/>\nb1 b<br \/>\nb2 b<br \/>\nb3 b<br \/>\nc1 c<br \/>\n<\/code><br \/>\nYou can then run <em>Treerecs<\/em> with the -S option as follows:<br \/>\n<code>$ treerecs -s species_tree -g gene_tree -S mapping<br \/>\nTreerecs, Inria - team Beagle, 2017<br \/>\nSolution saved in treerecs_output\/gene_tree_recs.svg<br \/>\nTotal elapsed time 0.01 seconds.<\/code><\/p>\n<p><!-- -------------------- Output formats -------------------- --><br \/>\n<!--\n\n\n<h2 id=\"output-formats-example\">Output formats and multiple outputs<\/h2>\n\n\n\n<em>Treerecs<\/em> supports several different formats.\n\n\n<table dir=\"auto\">\n  \n\n<thead>\n    \n\n<tr>\n      \n\n<th>Format<\/th>\n\n\n      \n\n<th>Branch lengths<\/th>\n\n\n      \n\n<th>Duplications annotated<\/th>\n\n\n      \n\n<th>Gene losses<\/th>\n\n<\/tr>\n\n\n  <\/thead>\n\n\n  \n\n<tbody>\n    \n\n<tr>\n      \n\n<td>Newick<\/td>\n\n\n\n<td>YES<\/td>\n\n\n\n<td>NO<\/td>\n\n\n\n<td>NO<\/td>\n\n<\/tr>\n\n\n      \n\n<tr>\n\n<td>NHX<\/td>\n\n\n\n<td>YES<\/td>\n\n\n\n<td>YES<\/td>\n\n\n\n<td>NO<\/td>\n\n<\/tr>\n\n\n      \n\n<tr>\n\n<td>NHX+<\/td>\n\n\n\n<td>NO<\/td>\n\n\n\n<td>YES<\/td>\n\n\n\n<td>YES<\/td>\n\n<\/tr>\n\n\n      \n\n<tr>\n\n<td>PhyloXML<\/td>\n\n\n\n<td>YES<\/td>\n\n\n\n<td>NO<\/td>\n\n\n\n<td>NO<\/td>\n\n<\/tr>\n\n\n      \n\n<tr>\n\n<td>RecPhyloXML<\/td>\n\n\n\n<td>NO<\/td>\n\n\n\n<td>YES<\/td>\n\n\n\n<td>NO<\/td>\n\n<\/tr>\n\n\n      \n\n<tr style=\"border-top:2px solid grey;\">\n\n<td>SVG<\/td>\n\n\n\n<td>NO<\/td>\n\n\n\n<td>YES<\/td>\n\n\n\n<td>YES<\/td>\n\n<\/tr>\n\n\n  <\/tbody>\n\n\n<\/table>\n\n\n--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Treerecs allows for the correction of one or more gene trees with respect to a reference species 
tree. These trees are thus required as inputs. In addition, the leaves from the gene tree(s) must be associated to their corresponding species in the species tree. This can be achieved in\u2026<\/p>\n<p> <a class=\"continue-reading-link\" href=\"https:\/\/project.inria.fr\/treerecs\/tutorial\/\"><span>Continue reading<\/span><i class=\"crycon-right-dir\"><\/i><\/a> <\/p>\n","protected":false},"author":972,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-222","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/project.inria.fr\/treerecs\/wp-json\/wp\/v2\/pages\/222","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/project.inria.fr\/treerecs\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/project.inria.fr\/treerecs\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/treerecs\/wp-json\/wp\/v2\/users\/972"}],"replies":[{"embeddable":true,"href":"https:\/\/project.inria.fr\/treerecs\/wp-json\/wp\/v2\/comments?post=222"}],"version-history":[{"count":17,"href":"https:\/\/project.inria.fr\/treerecs\/wp-json\/wp\/v2\/pages\/222\/revisions"}],"predecessor-version":[{"id":380,"href":"https:\/\/project.inria.fr\/treerecs\/wp-json\/wp\/v2\/pages\/222\/revisions\/380"}],"wp:attachment":[{"href":"https:\/\/project.inria.fr\/treerecs\/wp-json\/wp\/v2\/media?parent=222"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}