3.4. Tree Manipulation and Restructuring

The Tree class provides both low-level and high-level methods for manipulating tree structure.

Note

In versions of DendroPy prior to 3.8.0, some of the functionality described here was available as standalone functions in the treemanip module. With version 3.8.0, this functionality has been refactored into native instance methods of the Tree class. The functions are still available in the treemanip module, but they will soon be deprecated. All new code carrying out any of the operations described below should be written using native Tree methods rather than the standalone functions in the treemanip module.

Low-level methods are associated with Node objects and allow you to restructure the relationships between nodes at a fine level: add_child, new_child, remove_child, etc.

In most cases, however, you will be using high-level methods to restructure Tree objects.

In all cases, if any part of the Tree object’s structural relations changes, and you are interested in calculating any metrics or statistics on the tree or comparing the tree to another tree, you need to call update_splits on the object to update the internal splits hash representation. This is not done for you automatically because there is a computational cost associated with the operation, and the splits hashes are not always needed. Furthermore, even when they are needed, if a number of structural changes are to be made to a Tree object before any calculations or comparisons, it makes sense to postpone the splits rehashing until all the tree manipulations are completed. Most methods that affect the tree structure, and thus require the splits hashes to be updated, take an update_splits argument. If you specify True for this argument, the Tree object will recalculate the splits hashes after the changes have been made.
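
To make the idea of a splits hash more concrete, the following is a minimal pure-Python sketch, not DendroPy's implementation. It assumes a tree written as nested tuples of tip labels and a hypothetical helper, split_bitmasks, that encodes every clade as a bitmask over the tips. Moving a single subtree changes the resulting set, which is why a cached set of splits has to be recalculated after a structural change:

```python
def split_bitmasks(tree, tip_order):
    """Return the set of clade bitmasks for a nested-tuple tree (illustrative only)."""
    masks = set()

    def visit(sub):
        if isinstance(sub, str):
            # A tip sets exactly one bit, given by its position in tip_order
            mask = 1 << tip_order.index(sub)
        else:
            # An internal node covers the union of its children's tips
            mask = 0
            for child in sub:
                mask |= visit(child)
        masks.add(mask)
        return mask

    visit(tree)
    return masks


tips = ["A", "B", "C", "D", "E"]
before = split_bitmasks(("A", ("B", ("C", ("D", "E")))), tips)
# Same tips, but B has been moved to become the sister of A
after = split_bitmasks((("A", "B"), ("C", ("D", "E"))), tips)

print(sorted(before - after))  # clades lost by the change
print(sorted(after - before))  # clades gained by the change
```

Here the clade (B, C, D, E) disappears and the clade (A, B) appears, so any comparison based on the old set of splits would be wrong.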

3.4.1. Rooting, Derooting and Rerooting

3.4.1.1. Setting the Rooting State

All Tree objects have a boolean property, is_rooted, that DendroPy uses to track whether or not the tree should be treated as rooted. The property is_unrooted is also defined, and these two properties are synchronized. Thus setting is_rooted to True will result in is_unrooted being set to False, and vice versa.
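
The synchronization between the two properties can be pictured with the following sketch. The RootingState class is a hypothetical stand-in written for illustration and is not part of DendroPy; it simply stores one flag and derives the other from it:

```python
class RootingState(object):
    """Hypothetical stand-in for the rooting flags of a tree object."""

    def __init__(self, is_rooted=False):
        self._is_rooted = bool(is_rooted)

    @property
    def is_rooted(self):
        return self._is_rooted

    @is_rooted.setter
    def is_rooted(self, value):
        self._is_rooted = bool(value)

    @property
    def is_unrooted(self):
        # Always the logical inverse of is_rooted
        return not self._is_rooted

    @is_unrooted.setter
    def is_unrooted(self, value):
        self._is_rooted = not bool(value)


state = RootingState()
state.is_rooted = True
print(state.is_unrooted)  # False

state.is_unrooted = True
print(state.is_rooted)    # False
```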

The state of a Tree object’s rootedness flag does not modify any internal structural relationship between nodes. It simply determines how its splits hashes are calculated, which in turn affects a broad range of comparison and metric operations. Thus, after modifying the is_rooted property, you need to update the splits hashes by calling update_splits before carrying out any calculations on or with the Tree object. Note that calling update_splits on an unrooted tree will force the basal split to be a trifurcation. So if the original tree was bifurcating, the end result will be a tree with a trifurcation at the root. This can be prevented by passing in the keyword argument delete_outdegree_one=False to update_splits.

#! /usr/bin/env python

import dendropy

tree_str = "[&R] (A, (B, (C, (D, E))));"

tree = dendropy.Tree.get_from_string(
        tree_str,
        "newick")

print("Original:")
print(tree.as_ascii_plot())

tree.is_rooted = False
print("After `is_rooted=False`:")
print(tree.as_ascii_plot())

tree.update_splits()
print("After `update_splits()`:")
print(tree.as_ascii_plot())

tree2 = dendropy.Tree.get_from_string(
        tree_str,
        "newick")
tree2.is_rooted = False
tree2.update_splits(delete_outdegree_one=False)
print("After `update_splits(delete_outdegree_one=False)`:")
print(tree2.as_ascii_plot())

will result in:

Original:
/---------------------------------------------------- A
+
|            /--------------------------------------- B
\------------+
             |            /-------------------------- C
             \------------+
                          |            /------------- D
                          \------------+
                                       \------------- E


After `is_rooted=False`:
/---------------------------------------------------- A
+
|            /--------------------------------------- B
\------------+
             |            /-------------------------- C
             \------------+
                          |            /------------- D
                          \------------+
                                       \------------- E


After `update_splits()`:
/---------------------------------------------------- A
|
+---------------------------------------------------- B
|
|                /----------------------------------- C
\----------------+
                 |                 /----------------- D
                 \-----------------+
                                   \----------------- E


After `update_splits(delete_outdegree_one=False)`:
/---------------------------------------------------- A
+
|            /--------------------------------------- B
\------------+
             |            /-------------------------- C
             \------------+
                          |            /------------- D
                          \------------+
                                       \------------- E

3.4.1.2. Derooting

To deroot a rooted Tree, you can also call the deroot method, which collapses the root to a trifurcation if it is a bifurcation and sets is_rooted to False. The deroot method has the same structural and semantic effect as setting is_rooted to False and then calling update_splits. You would use the former if you are not going to be doing any tree comparisons or calculating tree metrics, and thus do not want to calculate the splits hashes.
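
The structural effect of collapsing a bifurcating root can be sketched in a few lines of plain Python. This is an illustration only, not how DendroPy implements deroot: it assumes the root is given as a list of its children, with tips as strings and internal nodes as nested lists, and the choice of which internal child to dissolve is an assumption of the sketch:

```python
def deroot(children):
    """Collapse a bifurcating root into a trifurcation (illustrative sketch).

    `children` is the list of the root's children; tips are strings and
    internal nodes are lists. A new list is returned.
    """
    if len(children) != 2:
        return list(children)  # already a multifurcation: nothing to do
    for i, child in enumerate(children):
        if isinstance(child, list):
            # Dissolve this internal child so its children attach to the root
            return children[:i] + child + children[i + 1:]
    return list(children)  # two tips only: nothing to collapse


rooted = ["A", ["B", ["C", ["D", "E"]]]]
print(deroot(rooted))
```

Applied to the tree used in the example above, the root ends up with the three children A, B and (C, (D, E)), which is the same trifurcation shown in the plot after the update_splits call.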

3.4.1.3. Rerooting

To reroot a Tree along an existing edge, you can use the reroot_at_edge method. This method takes an Edge object as its first argument. This rerooting is a structural change that will require the splits hashes to be updated before performing any tree comparisons or calculating tree metrics. If needed, you can do this yourself by calling update_splits later, or you can pass in True as the second argument to the reroot_at_edge method call, which instructs DendroPy to automatically update the splits for you.

As an example, the following reroots the tree along an internal edge (note that we do not recalculate the splits hashes, as we are not carrying out any calculations or comparisons with the Tree):

#! /usr/bin/env python

import dendropy

tree_str = "[&R] (A, (B, (C, (D, E))));"

tree = dendropy.Tree.get_from_string(
        tree_str,
        "newick")

print("Before:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())
mrca = tree.mrca(taxon_labels=["D", "E"])
tree.reroot_at_edge(mrca.edge, update_splits=False)
print("After:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())

and results in:

Before:
[&R] (A,(B,(C,(D,E))));

/---------------------------------------------------- A
+
|            /--------------------------------------- B
\------------+
             |            /-------------------------- C
             \------------+
                          |            /------------- D
                          \------------+
                                       \------------- E


After:
[&R] ((D,E),(C,(B,A)));

                                   /----------------- D
/----------------------------------+
|                                  \----------------- E
+
|                /----------------------------------- C
\----------------+
                 |                 /----------------- B
                 \-----------------+
                                   \----------------- A

Another example, this time rerooting along an edge subtending a tip instead of an internal edge:

#! /usr/bin/env python

import dendropy

tree_str = "[&R] (A, (B, (C, (D, E))));"

tree = dendropy.Tree.get_from_string(
        tree_str,
        "newick")

print("Before:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())
node_D = tree.find_node_with_taxon_label("D")
tree.reroot_at_edge(node_D.edge, update_splits=False)
print("After:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())

which results in:

Before:
[&R] (A,(B,(C,(D,E))));

/---------------------------------------------------- A
+
|            /--------------------------------------- B
\------------+
             |            /-------------------------- C
             \------------+
                          |            /------------- D
                          \------------+
                                       \------------- E


After:
[&R] (D,(E,(C,(B,A))));

/---------------------------------------------------- D
+
|            /--------------------------------------- E
\------------+
             |            /-------------------------- C
             \------------+
                          |            /------------- B
                          \------------+
                                       \------------- A

To reroot a Tree at a node instead, you can use the reroot_at_node method:

#! /usr/bin/env python

import dendropy

tree_str = "[&R] (A, (B, (C, (D, E))));"

tree = dendropy.Tree.get_from_string(
        tree_str,
        "newick")

print("Before:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())
mrca = tree.mrca(taxon_labels=["D", "E"])
tree.reroot_at_node(mrca, update_splits=False)
print("After:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())

which results in:

Before:
[&R] (A,(B,(C,(D,E))));

/---------------------------------------------------- A
+
|            /--------------------------------------- B
\------------+
             |            /-------------------------- C
             \------------+
                          |            /------------- D
                          \------------+
                                       \------------- E


After:
[&R] (D,E,(C,(B,A)));

/---------------------------------------------------- D
|
+---------------------------------------------------- E
|
|                /----------------------------------- C
\----------------+
                 |                 /----------------- B
                 \-----------------+
                                   \----------------- A
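
The difference between rerooting at a node and rerooting along an edge can be illustrated without DendroPy by treating the tree as an undirected graph. The sketch below is an assumption-laden illustration, not the library's algorithm: all function names and the internal node names (n0, n1, ..., new_root) are hypothetical. Rooting at a node reads the graph outward from that node, while rooting on an edge first splits the edge with a new node. Topologies are returned as nested frozensets so that the order of children does not matter:

```python
from itertools import count


def to_graph(nested):
    """Convert nested tuples of tip labels into an undirected adjacency map."""
    graph, ids = {}, count()

    def visit(sub):
        name = sub if isinstance(sub, str) else "n%d" % next(ids)
        graph[name] = set()
        if not isinstance(sub, str):
            for child in sub:
                child_name = visit(child)
                graph[name].add(child_name)
                graph[child_name].add(name)
        return name

    visit(nested)
    return graph


def read_from(graph, node, parent=None):
    """Read the topology as nested frozensets, treating `node` as the root."""
    children = [n for n in graph[node] if n != parent]
    if not children:
        return node  # a tip
    if len(children) == 1 and parent is not None:
        # The old bifurcating root is left with one child: suppress it
        return read_from(graph, children[0], node)
    return frozenset(read_from(graph, c, node) for c in children)


def reroot_at_node(nested, tip):
    """Root at the internal node directly attached to `tip`."""
    graph = to_graph(nested)
    (node,) = graph[tip]
    return read_from(graph, node)


def reroot_at_edge(nested, tip):
    """Root on the edge subtending `tip` by splitting it with a new node."""
    graph = to_graph(nested)
    (node,) = graph[tip]
    graph[tip] = {"new_root"}
    graph[node].remove(tip)
    graph[node].add("new_root")
    graph["new_root"] = {tip, node}
    return read_from(graph, "new_root")


tree = ("A", ("B", ("C", ("D", "E"))))
at_node = reroot_at_node(tree, "D")  # root at the parent of D, i.e. the MRCA of D and E
at_edge = reroot_at_edge(tree, "D")  # root on the edge subtending D
```

Under these assumptions, at_node corresponds to (D,E,(C,(B,A))) and at_edge to (D,(E,(C,(B,A)))), matching the topologies printed in the examples above.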

You can also reroot the tree such that a particular node is moved to the outgroup position using the to_outgroup_position method, which takes a Node as its first argument. Again, you can update the splits hashes in situ by passing True as the second argument, and again, here we do not because we are not carrying out any calculations. For example:

#! /usr/bin/env python

import dendropy

tree_str = "[&R] (A, (B, (C, (D, E))));"

tree = dendropy.Tree.get_from_string(
    tree_str,
    "newick")

print("Before:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())
outgroup_node = tree.find_node_with_taxon_label("C")
tree.to_outgroup_position(outgroup_node, update_splits=False)
print("After:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())

which will result in:

Before:
[&R] (A,(B,(C,(D,E))));

/---------------------------------------------------- A
+
|            /--------------------------------------- B
\------------+
             |            /-------------------------- C
             \------------+
                          |            /------------- D
                          \------------+
                                       \------------- E


After:
[&R] (C,(D,E),(B,A));

/---------------------------------------------------- C
|
|                         /-------------------------- D
+-------------------------+
|                         \-------------------------- E
|
|                         /-------------------------- B
\-------------------------+
                          \-------------------------- A

If you have a tree with edge lengths specified, you can reroot it at the midpoint, using the reroot_at_midpoint method:

#! /usr/bin/env python

import dendropy

tree_str = "[&R] (A:0.55, (B:0.82, (C:0.74, (D:0.42, E:0.64):0.24):0.15):0.20):0.3;"

tree = dendropy.Tree.get_from_string(
        tree_str,
        "newick")

print("Before:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot(plot_metric='length'))
tree.reroot_at_midpoint(update_splits=False)
print("After:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot(plot_metric='length'))

which results in:

Before:
[&R] (A:0.55,(B:0.82,(C:0.74,(D:0.42,E:0.64):0.24):0.15):0.2):0.3;

          /------------------- A
          +
          |      /---------------------------- B
          \------+
                 |    /-------------------------- C
                 \----+
                      |        /-------------- D
                      \--------+
                               \---------------------- E


After:
[&R] ((C:0.74,(D:0.42,E:0.64):0.24):0.045,(B:0.82,A:0.75):0.105):0.3;

               /------------------------------- C
             /-+
             | |         /------------------ D
             | \---------+
             +           \---------------------------- E
             |
             |   /------------------------------------ B
             \---+
                 \-------------------------------- A
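
The edge lengths in the rerooted tree can be checked by hand. The following sketch assumes the usual definition of midpoint rooting, in which the root is placed halfway along the longest tip-to-tip path; it is not DendroPy's implementation, and the internal node names and the find_midpoint function are hypothetical. It finds that path for the example tree and reports where the midpoint falls:

```python
from itertools import combinations

# The example tree as child -> (parent, edge length). Internal node names
# ("root", "n1", "n2", "n3") are hypothetical labels used only in this sketch.
PARENT = {
    "A": ("root", 0.55),
    "n1": ("root", 0.20),
    "B": ("n1", 0.82),
    "n2": ("n1", 0.15),
    "C": ("n2", 0.74),
    "n3": ("n2", 0.24),
    "D": ("n3", 0.42),
    "E": ("n3", 0.64),
}
TIPS = ["A", "B", "C", "D", "E"]


def ancestor_distances(node):
    """Map node and each of its ancestors to the path length from node."""
    distances = {node: 0.0}
    total = 0.0
    while node in PARENT:
        node, length = PARENT[node]
        total += length
        distances[node] = total
    return distances


def find_midpoint():
    """Locate the midpoint of the longest tip-to-tip path."""
    best = None
    for a, b in combinations(TIPS, 2):
        da, db = ancestor_distances(a), ancestor_distances(b)
        # The MRCA is the shared ancestor with the smallest summed distance
        mrca = min((n for n in da if n in db), key=lambda n: da[n] + db[n])
        path = da[mrca] + db[mrca]
        if best is None or path > best[0]:
            # The midpoint lies on the longer of the two arms below the MRCA
            start = a if da[mrca] >= db[mrca] else b
            best = (path, a, b, start)
    path, a, b, node = best
    half, walked = path / 2.0, 0.0
    while True:
        parent, length = PARENT[node]
        if walked + length >= half:
            # The midpoint falls on the edge node -> parent
            return (a, b), path, (node, parent), half - walked, walked + length - half
        walked += length
        node = parent


tips, path, edge, below, above = find_midpoint()
print(tips, round(path, 3), edge, round(below, 3), round(above, 3))
```

The longest path runs between B and E (length 1.85), and its midpoint splits the edge of length 0.15 into pieces of 0.045 and 0.105, which are the two new edge lengths that appear at the root of the rerooted tree above.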

3.4.2. Pruning Subtrees and Tips

To remove a set of tips from a Tree, you can use either the prune_taxa or the prune_taxa_with_labels method. The first takes a container of Taxon objects as an argument, while the second takes a container of strings. In both cases, nodes associated with the specified taxa (as given by the Taxon objects directly in the first case, or by the Taxon objects whose labels are given in the list of strings in the second case) will be removed from the tree. For example:

#! /usr/bin/env python

import dendropy

tree_str = "[&R] ((A, (B, (C, (D, E)))),(F, (G, H)));"

tree = dendropy.Tree.get_from_string(
        tree_str,
        "newick")

print("Before:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())
tree.prune_taxa_with_labels(["A", "C", "G"])
print("After:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())

which results in:

Before:
[&R] ((A,(B,(C,(D,E)))),(F,(G,H)));

          /------------------------------------------- A
/---------+
|         |          /-------------------------------- B
|         \----------+
|                    |          /--------------------- C
|                    \----------+
+                               |          /---------- D
|                               \----------+
|                                          \---------- E
|
|                               /--------------------- F
\-------------------------------+
                                |          /---------- G
                                \----------+
                                           \---------- H


After:
[&R] ((B,(D,E)),(F,H));

                  /----------------------------------- B
/-----------------+
|                 |                 /----------------- D
|                 \-----------------+
+                                   \----------------- E
|
|                                   /----------------- F
\-----------------------------------+
                                    \----------------- H

Alternatively, the tree can be pruned based on a set of taxa that you want to keep. This can be done through the use of the counterpart “retain” methods, retain_taxa and retain_taxa_with_labels. For example:

#! /usr/bin/env python

import dendropy

tree_str = "[&R] ((A, (B, (C, (D, E)))),(F, (G, H)));"

tree = dendropy.Tree.get_from_string(
        tree_str,
        "newick")

print("Before:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())
tree.retain_taxa_with_labels(["A", "C", "G"])
print("After:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())

which results in:

Before:
[&R] ((A,(B,(C,(D,E)))),(F,(G,H)));

          /------------------------------------------- A
/---------+
|         |          /-------------------------------- B
|         \----------+
|                    |          /--------------------- C
|                    \----------+
+                               |          /---------- D
|                               \----------+
|                                          \---------- E
|
|                               /--------------------- F
\-------------------------------+
                                |          /---------- G
                                \----------+
                                           \---------- H


After:
[&R] ((A,C),G);

                           /-------------------------- A
/--------------------------+
+                          \-------------------------- C
|
\----------------------------------------------------- G

Again, it should be noted that, as these operations modify the structure of the tree, you need to call update_splits to update the internal splits hashes before carrying out any calculations or comparisons or computing any metrics.
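
To see why pruning three tips also removes internal nodes, consider the following pure-Python sketch. It is an illustration under stated assumptions rather than DendroPy's code: trees are nested tuples of tip labels, and the prune, retain and tip_labels functions are hypothetical helpers. Any internal node left with a single child after pruning is suppressed, and "retain" is expressed as pruning everything else:

```python
def tip_labels(tree):
    """Collect the set of tip labels in a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    labels = set()
    for child in tree:
        labels |= tip_labels(child)
    return labels


def prune(tree, drop):
    """Remove the tips in `drop`; return None if nothing is left."""
    if isinstance(tree, str):
        return None if tree in drop else tree
    kept = [sub for sub in (prune(child, drop) for child in tree) if sub is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # suppress a node left with only one child
    return tuple(kept)


def retain(tree, keep):
    """Keep only the tips in `keep` by pruning all the others."""
    return prune(tree, tip_labels(tree) - set(keep))


tree = (("A", ("B", ("C", ("D", "E")))), ("F", ("G", "H")))
print(prune(tree, {"A", "C", "G"}))
print(retain(tree, {"A", "C", "G"}))
```

With the example tree, the first call gives ((B,(D,E)),(F,H)) and the second gives ((A,C),G), the same topologies as in the output above.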

3.4.3. Rotating

You can ladderize trees (sort the child nodes in order of the number of their descendants) by calling the ladderize method. This method takes one argument, ascending. If ascending=True, which is the default, then the nodes are sorted in ascending order (i.e., nodes with fewer descendants sort before nodes with more descendants). If ascending=False, then the nodes are sorted in descending order (i.e., nodes with more descendants sort before nodes with fewer descendants). For example:

#! /usr/bin/env python

import dendropy

tree_str = "[&R] ((A, (B, (C, (D, E)))),(F, (G, H)));"

tree = dendropy.Tree.get_from_string(
        tree_str,
        "newick")

print("Before:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())
tree.ladderize(ascending=True)
print("Ladderize, ascending=True:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())
tree.ladderize(ascending=False)
print("Ladderize, ascending=False:")
print(tree.as_string('newick'))
print(tree.as_ascii_plot())

results in:

Before:
[&R] ((A,(B,(C,(D,E)))),(F,(G,H)));

          /------------------------------------------- A
/---------+
|         |          /-------------------------------- B
|         \----------+
|                    |          /--------------------- C
|                    \----------+
+                               |          /---------- D
|                               \----------+
|                                          \---------- E
|
|                               /--------------------- F
\-------------------------------+
                                |          /---------- G
                                \----------+
                                           \---------- H


Ladderize, ascending=True:
[&R] ((F,(G,H)),(A,(B,(C,(D,E)))));

                                /--------------------- F
/-------------------------------+
|                               |          /---------- G
|                               \----------+
+                                          \---------- H
|
|         /------------------------------------------- A
\---------+
          |          /-------------------------------- B
          \----------+
                     |          /--------------------- C
                     \----------+
                                |          /---------- D
                                \----------+
                                           \---------- E


Ladderize, ascending=False:
[&R] (((((D,E),C),B),A),((G,H),F));

                                           /---------- D
                                /----------+
                     /----------+          \---------- E
                     |          |
          /----------+          \--------------------- C
          |          |
/---------+          \-------------------------------- B
|         |
|         \------------------------------------------- A
+
|                                          /---------- G
|                               /----------+
\-------------------------------+          \---------- H
                                |
                                \--------------------- F

Tree rotation operations do not actually change the tree structure, at least insofar as splits are concerned, so it is not necessary to update the splits hashes.
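
The ordering shown in the example can be reproduced with a short sketch. It assumes that child nodes are ranked by their total number of descendant nodes and that ties keep their existing order; both are assumptions of this illustration, and the ladderize function here works on nested tuples rather than on Tree objects:

```python
def ladderize(tree, ascending=True):
    """Return (sorted_tree, descendant_count) for a nested-tuple tree."""
    if isinstance(tree, str):
        return tree, 0  # a tip has no descendants
    done = [ladderize(child, ascending) for child in tree]
    # list.sort is stable, so children with equal counts keep their order
    done.sort(key=lambda pair: pair[1], reverse=not ascending)
    total = sum(n for _, n in done) + len(done)
    return tuple(sub for sub, _ in done), total


tree = (("A", ("B", ("C", ("D", "E")))), ("F", ("G", "H")))
print(ladderize(tree, ascending=True)[0])
print(ladderize(tree, ascending=False)[0])
```

Only the order of children changes. The set of tips under each internal node is the same before and after, which is why the splits are unaffected.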
